Andriy Zakutayev1, Nick Wunder1, Marcus Schwarting1, John Perkins1, Robert White1, Kristin Munch1, William Tumas1, Caleb Phillips1

1, National Renewable Energy Laboratory, Golden, Colorado, United States

The use of advanced machine learning algorithms for the design and discovery of novel inorganic semiconductors for optoelectronic energy conversion applications requires large datasets amenable to data mining. Whereas a number of computational materials property databases exist, machine learning based on experimental data is limited by the lack of large and diverse data resources. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database. Presently, this database contains 130,000 sample entries, characterized by structural (100,000), chemical (70,000), optical (50,000) and electrical (40,000) properties of novel inorganic thin film semiconductor materials, grouped in >4,000 sample libraries across >100 materials systems. More than half of these data are publicly available. In addition to showing how the HTEM database may enable scientists to explore materials by browsing a web-based user interface, this presentation will discuss the underlying laboratory information management system (LIMS). This presentation will also illustrate how advanced machine learning algorithms can be adapted to materials science problems: predicting materials conductivity using random forest methods, and clustering unrelated samples into groups based on composition similarity.
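As an illustration of the composition-based grouping mentioned above, the following is a minimal sketch and not the method used for the HTEM database. It assumes each sample is described by atomic fractions per element, and it swaps in a simple single-linkage threshold clustering on Euclidean distance. The sample names, compositions, the `threshold` value, and the function names are all hypothetical.

```python
from math import sqrt

# Hypothetical samples: atomic fractions per element (made-up values,
# not taken from the HTEM database).
samples = {
    "s1": {"Zn": 0.50, "O": 0.50},
    "s2": {"Zn": 0.48, "O": 0.52},
    "s3": {"Cu": 0.66, "O": 0.34},
    "s4": {"Cu": 0.64, "O": 0.36},
    "s5": {"Zn": 0.25, "Sn": 0.25, "N": 0.50},
}


def composition_distance(a, b):
    """Euclidean distance between two compositions over the union of elements."""
    elements = set(a) | set(b)
    return sqrt(sum((a.get(e, 0.0) - b.get(e, 0.0)) ** 2 for e in elements))


def cluster_by_composition(samples, threshold=0.1):
    """Single-linkage clustering: samples closer than `threshold` share a group."""
    parent = {name: name for name in samples}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = sorted(samples)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if composition_distance(samples[a], samples[b]) < threshold:
                parent[find(b)] = find(a)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


clusters = cluster_by_composition(samples)
print(clusters)
```

With these made-up compositions, the two Zn-O samples and the two Cu-O samples fall into separate groups, and the Zn-Sn-N sample stays on its own. A real analysis would likely need a distance measure and threshold chosen for the materials systems at hand.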