Matthias Poloczek¹, Henry Herbol², Paulette Clancy²

¹ University of Arizona, Tucson, Arizona, United States
² Cornell University, Ithaca, New York, United States

The sheer number of potential combinations of component species poses a major obstacle when optimizing functional materials. The huge combinatorial search space makes an exhaustive search infeasible and forces us to narrow the selection of candidate solutions under consideration, even when using advanced methods for the design of experiments. To overcome these limitations, we employ novel ideas from machine learning-based optimization.

In this talk, we present a novel search method that uses techniques from transfer learning to drastically reduce the cost of the optimization process.
The algorithm leverages cheap, inaccurate approximations of the actual objective. These approximations, also called information sources, may be provided, for example, by Density Functional Theory (DFT) calculations at lower levels of theory or by classical semi-empirical force field simulations. Such sources are typically not only noisy but also inherently biased due to the limitations of their underlying internal models. Note that this setting goes significantly beyond the notion of multi-fidelity, since information sources cannot be expected to form a hierarchy; for example, a molecular dynamics (MD) simulation may be more accurate at the temperature for which it was calibrated than at others. Therefore, the algorithm ‘learns’ the relationship between the approximations and the objective. The decision regarding which candidate solution to explore next, and which information source to query, maximizes the expected value of information about the unknown optimum. Related theoretical results guarantee that the algorithm is consistent, i.e., it will obtain a near-optimal solution.
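The selection rule described above can be illustrated with a toy sketch. This is not the algorithm from the talk: it assumes a small discrete set of candidates, a joint Gaussian belief in which the cheap source equals the objective plus an independent bias term, and a knowledge-gradient score divided by query cost as the "value of information" criterion. All names (`build_prior`, `condition`, `knowledge_gradient`, `run`), as well as the costs, noise levels, and kernel parameters, are hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def build_prior(n, bias_var=0.5, length=2.0):
    """Joint Gaussian prior over [f(0..n-1), g(0..n-1)], where g = f + delta.

    f is the objective, g the cheap source, delta an independent bias term.
    """
    idx = np.arange(n)
    d = idx[:, None] - idx[None, :]
    k_f = np.exp(-0.5 * (d / length) ** 2)   # covariance of the objective
    k_d = bias_var * k_f                      # covariance of the bias (assumed)
    cov = np.block([[k_f, k_f], [k_f, k_f + k_d]])
    return np.zeros(2 * n), cov


def condition(mean, cov, j, y, noise_var):
    """Gaussian posterior after observing y = z_j + noise at stacked index j."""
    gain = cov[:, j] / (cov[j, j] + noise_var)
    return mean + gain * (y - mean[j]), cov - np.outer(gain, cov[j, :])


def knowledge_gradient(mean, cov, j, noise_var, n, z, w):
    """Expected gain in max posterior mean of f from one observation at j."""
    sigma_tilde = cov[:n, j] / np.sqrt(cov[j, j] + noise_var)
    future_max = np.max(mean[:n, None] + sigma_tilde[:, None] * z[None, :], axis=0)
    return float(future_max @ w - np.max(mean[:n]))


def run(n=12, budget=6.0, seed=0):
    rng = np.random.default_rng(seed)
    costs = np.array([1.0, 0.1])     # hypothetical: source 0 = objective, 1 = cheap
    noise = np.array([1e-4, 1e-2])   # hypothetical observation noise variances
    mean, cov = build_prior(n)
    # Synthetic ground truth drawn from the prior (small jitter for stability).
    truth = rng.multivariate_normal(mean, cov + 1e-6 * np.eye(2 * n))
    z, w = hermegauss(32)            # quadrature nodes for a standard normal
    w = w / w.sum()
    spent, history = 0.0, []
    while spent < budget:
        best = None
        for s in range(2):           # which information source to query
            for x in range(n):       # which candidate to evaluate
                j = s * n + x
                score = knowledge_gradient(mean, cov, j, noise[s], n, z, w) / costs[s]
                if best is None or score > best[0]:
                    best = (score, s, x)
        _, s, x = best
        j = s * n + x
        y = truth[j] + rng.normal(0.0, np.sqrt(noise[s]))
        mean, cov = condition(mean, cov, j, y, noise[s])
        spent += costs[s]
        history.append((s, x))
    return int(np.argmax(mean[:n])), history, truth[:n]


if __name__ == "__main__":
    recommended, history, f_true = run()
    print("recommended candidate:", recommended)
    print("queries (source, candidate):", history)
```

Because the bias term is part of the joint belief, a cheap query still updates the belief about the objective, and the cost-normalized score decides when the expensive source is worth its price. No hierarchy between sources is assumed in this sketch.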

To demonstrate the value of this approach, we implement such an algorithm to search for a metal halide perovskite composition that maximizes the binding energy to the solvent, computed via ab initio DFT calculations. Here a “composition” of the candidate material is determined by choosing a combination of three halides (X) and one of three cations (A), which together make up our metal halide perovskite monomer (PbX3A), and one of sixteen solvent blends. Ab initio calculations at lower levels of theory provide cheap approximations. Finding the optimal mixed-halide composition and solvent blend for metal halide perovskites is currently of great interest to the experimental community, which at present must rely largely on trial and error or on chemical intuition.
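To give a sense of the size of this search space, the following sketch enumerates candidate compositions. It rests on an assumption not stated in the abstract: that the three X sites of PbX3A are filled by an unordered selection, with repetition, from the three available halides. The labels `X1`..`X3`, `A1`..`A3`, and `S01`..`S16` are placeholders, since the abstract does not name the species or the solvent blends.

```python
from itertools import combinations_with_replacement, product

# Placeholder labels; the actual halides, cations, and blends are not named here.
HALIDES = ("X1", "X2", "X3")
CATIONS = ("A1", "A2", "A3")
SOLVENT_BLENDS = tuple(f"S{i:02d}" for i in range(1, 17))


def enumerate_candidates():
    """List every (halide multiset, cation, solvent blend) triple.

    Assumes the three X sites are an unordered pick with repetition.
    """
    halide_sets = list(combinations_with_replacement(HALIDES, 3))
    return list(product(halide_sets, CATIONS, SOLVENT_BLENDS))


candidates = enumerate_candidates()
print(len(candidates))  # 10 halide multisets * 3 cations * 16 blends = 480
```

Under this assumption the space has a few hundred candidates, each of which would require its own ab initio calculation in an exhaustive search.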