Project Details
Leveraging Materials Discovery by Data-Efficient Artificial Intelligence
Applicant
Akhil Sugathan Nair, Ph.D.
Subject Area
Theoretical Chemistry: Molecules, Materials, Surfaces
Computer-Aided Design of Materials and Simulation of Materials Behaviour from Atomic to Microscopic Scale
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 540316537
Discovering materials with improved performance is a cornerstone of progress across diverse sectors. However, the number of possible materials is practically infinite, and the materials space cannot be covered by direct high-throughput screening approaches. Artificial intelligence (AI) methods show promise in this direction because they help to identify correlations and patterns in materials data, enabling the prediction of materials with desired properties. However, the scope of AI practices for materials discovery is limited by the scarcity of high-quality materials data and the cost of acquiring it. Moreover, the limited extrapolative capabilities of commonly applied AI methods present a barrier to materials discovery. The proposed project attempts to overcome these limitations by integrating data from different sources with sequential learning (SL) to leverage the materials discovery process. Specifically, we will develop a hierarchical symbolic regression sequential learning (HSL) method that addresses the challenges posed by data limitations. HSL will decompose the task of finding materials with desired properties into two subtasks: identifying the most relevant features, and identifying the accuracy (or fidelity) of the data needed for a cost-efficient materials discovery campaign. Thus, HSL will guarantee effective utilization of multi-fidelity data by exploiting the greater availability of less accurate, cheap data to choose which more accurate, expensive experiments or calculations should be performed. Applying the symbolic regression (SR) based sure-independence screening and sparsifying operator (SISSO) approach will make the most of the small amount of accurate data. Furthermore, SISSO will enable the identification of the most important physical parameters among the many candidates offered, thus providing insights into the underlying physical processes that govern the properties of materials.
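The idea of using cheap, less accurate data to decide which expensive evaluations to perform can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the functions low_fidelity and high_fidelity are hypothetical toy stand-ins for cheap and expensive calculations, the surrogate is a plain least-squares linear correction rather than SISSO, and the selection rule is purely greedy. The toy functions are constructed so that the two fidelities differ by a linear correction, which is why the loop converges quickly here.

```python
import numpy as np

def low_fidelity(x):
    # Hypothetical cheap, biased estimate of the target property.
    return (x - 0.6) ** 2 + 0.3 * x

def high_fidelity(x):
    # Hypothetical expensive, accurate value of the target property.
    return (x - 0.7) ** 2

def sequential_learning(candidates, n_init=3, n_iter=5, seed=0):
    """Greedy multi-fidelity loop: minimize the high-fidelity property."""
    rng = np.random.default_rng(seed)
    lo = low_fidelity(candidates)  # cheap data are available for every candidate
    picks = rng.choice(len(candidates), size=n_init, replace=False)
    evaluated = [int(i) for i in picks]
    hi = {i: high_fidelity(candidates[i]) for i in evaluated}
    # Design matrix for all candidates: low-fidelity value, raw feature, intercept.
    A_all = np.column_stack([lo, candidates, np.ones(len(candidates))])
    for _ in range(n_iter):
        idx = np.array(evaluated)
        y = np.array([hi[i] for i in evaluated])
        # Surrogate (assumption): hi ~ a * lo + b * x + c, fitted by least squares.
        coef, *_ = np.linalg.lstsq(A_all[idx], y, rcond=None)
        pred = A_all @ coef
        pred[idx] = np.inf  # never re-select an already evaluated candidate
        nxt = int(np.argmin(pred))  # greedy choice: best predicted value
        evaluated.append(nxt)
        hi[nxt] = high_fidelity(candidates[nxt])  # one expensive evaluation
    best = min(hi, key=hi.get)
    return best, hi

candidates = np.linspace(0.0, 1.0, 21)
best, hi = sequential_learning(candidates)
print("best candidate:", candidates[best], "expensive evaluations:", len(hi))
```

In this sketch only 8 of the 21 candidates receive an expensive evaluation. A realistic campaign would replace the greedy rule with an acquisition criterion that also accounts for model uncertainty.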
The project also aims to establish standardized techniques for deriving reliable uncertainty estimates from symbolic regression methods, enhancing their utility for materials discovery. The potential of the developed AI approach will be demonstrated on the computational discovery of stable oxide catalysts for water splitting. For this, semi-local DFT-GGA and non-local DFT-HSE will be considered as the low- and high-fidelity methods, respectively, as the latter improves the accuracy of the description of the thermodynamic and aqueous stability of oxides. The project will use a recent implementation of the DFT-HSE approach in the FHI-aims electronic structure package with enhanced performance. The combination of SR-based SL with an efficient HSE implementation will facilitate a significant step towards large-scale AI-guided discovery of stable water splitting catalysts based on non-local DFT.
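One common way to attach an uncertainty estimate to a fitted descriptor model is bootstrap resampling, sketched below. This is an illustrative assumption, not the technique the project specifies: a simple linear model stands in for a symbolic regression expression, and the helper name bootstrap_predict and the synthetic data are hypothetical.

```python
import numpy as np

def bootstrap_predict(X, y, X_new, n_models=200, seed=0):
    """Mean and spread of predictions from models refit on resampled data."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([X, np.ones(len(X))])              # features + intercept
    A_new = np.column_stack([X_new, np.ones(len(X_new))])
    preds = np.empty((n_models, len(X_new)))
    for m in range(n_models):
        rows = rng.integers(0, len(X), size=len(X))        # sample with replacement
        coef, *_ = np.linalg.lstsq(A[rows], y[rows], rcond=None)
        preds[m] = A_new @ coef
    return preds.mean(axis=0), preds.std(axis=0)

# Synthetic training data on [0, 1] (hypothetical): y = 2 x + noise.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(30, 1))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.1, size=30)

# One query inside the training range and one far outside it.
X_new = np.array([[0.5], [5.0]])
mean, std = bootstrap_predict(X, y, X_new)
print("predictions:", mean, "uncertainties:", std)
```

The ensemble spread is larger for the query far outside the training range than for the one inside it, which is the kind of signal a sequential learning campaign can use when deciding where an expensive high-fidelity calculation is most informative.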
DFG Programme
WBP Position