Jacqueline Cole1,2,3

1, University of Cambridge, Cambridge, United Kingdom
2, Argonne National Laboratory, Chicago, Illinois, United States
3, Rutherford Appleton Laboratory, Harwell, Oxfordshire, United Kingdom

Large-scale data-mining workflows are increasingly able to successfully predict new materials that possess a targeted functionality [1]. The success of such materials-discovery approaches nonetheless depends on having the right database to mine. This presentation shows how to auto-generate tailor-made databases for finding functional materials that meet the needs of a given device application.

The talk presents ChemDataExtractor, a 'chemistry-aware' open-source text- and table-mining software tool that extracts large volumes of material-property data from the literature using natural language processing, optical character recognition and machine learning [2]. Machine learning is then employed to populate any missing experimental data.

The role of this tool in accelerating materials discovery is illustrated.

