A Statistical Learning Framework for Accelerated Bandgap Prediction of Inorganic Compounds
This study applies machine learning (ML) techniques to predict the electronic bandgaps of a large number of entries in the open-source Materials Project (MP) database and of inorganic perovskite compounds. Initially, a dataset of 4616 inorganic compounds with available experimental bandgap data was used to train predictive ML models: support vector machine, k-nearest neighbors, random forest, kernel ridge regression (KRR), and artificial neural networks. These models require only compositional features based on simple elemental attributes. The features most important for the bandgap were then identified, and various performance metrics were evaluated against the dimensionality of the feature space. The trained KRR model, which had the highest accuracy, was then applied to more than 22,000 entries in the MP database, and the resulting trends are elucidated. Subsequently, out-of-sample validation was performed on a couple of datasets containing several discovered halide perovskites, in conjunction with ab initio investigations of the undiscovered ones. Finally, the optimal classification and regression models were used to classify a dataset of 46,970 unknown inorganic halide perovskites into metals and nonmetals, followed by bandgap predictions for the nonmetallic entries.
Keywords: Machine learning, bandgap prediction, inorganic compounds, Materials Project, halide perovskites
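The two-stage workflow described in the abstract (classify a compound as metal or nonmetal from compositional features, then regress the bandgap of the nonmetals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the two features (fraction-weighted mean and range of Pauling electronegativity), the function names `featurize`, `nearest`, and `predict_bandgap`, the seven-entry toy training set with approximate bandgap labels, and the use of a 1-nearest-neighbor model for both stages in place of the tuned classifiers and the KRR regressor reported in the study.

```python
from math import dist

# Pauling electronegativities; the only elemental attribute used in this sketch.
ELECTRONEGATIVITY = {"Na": 0.93, "Al": 1.61, "Zn": 1.65, "Ga": 1.81,
                     "Si": 1.90, "As": 2.18, "S": 2.58, "Cl": 3.16}

def featurize(composition):
    """Map {element: amount} to [fraction-weighted mean EN, EN range]."""
    total = sum(composition.values())
    values = [ELECTRONEGATIVITY[el] for el in composition]
    mean_en = sum(ELECTRONEGATIVITY[el] * amount / total
                  for el, amount in composition.items())
    return [mean_en, max(values) - min(values)]

# Toy training set (NOT the paper's data): composition and an approximate
# bandgap in eV, with 0.0 marking a metal. Values are illustrative only.
TRAIN = [
    ({"Al": 1}, 0.0),
    ({"Zn": 1}, 0.0),
    ({"Na": 1}, 0.0),
    ({"Si": 1}, 1.12),
    ({"Ga": 1, "As": 1}, 1.42),
    ({"Zn": 1, "S": 1}, 3.6),
    ({"Na": 1, "Cl": 1}, 8.5),
]

def nearest(query, rows):
    """Return the (features, bandgap) row closest to the query features."""
    return min(rows, key=lambda row: dist(query, row[0]))

def predict_bandgap(composition):
    """Two-stage prediction: None for a predicted metal, else a bandgap in eV."""
    x = featurize(composition)
    rows = [(featurize(comp), gap) for comp, gap in TRAIN]
    # Stage 1 (classification): the label of the nearest training compound.
    if nearest(x, rows)[1] == 0.0:
        return None
    # Stage 2 (regression): nearest neighbor among nonmetals only.
    nonmetals = [row for row in rows if row[1] > 0.0]
    return nearest(x, nonmetals)[1]

print(predict_bandgap({"Al": 1, "Zn": 1}))  # classified as a metal
print(predict_bandgap({"Ga": 1, "S": 1}))   # bandgap copied from nearest nonmetal
```

With so few training points and only two features, the numbers this sketch returns carry no physical weight; it only shows how a composition-only feature vector can feed a classifier followed by a regressor restricted to the nonmetallic class.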
This research was supported by the TCS-CTO Organization under SWON No. 1009292. The authors also thank Mr. Deepak Jain (Scientist, Physical Sciences Research Area, T.R.D.D.C.) for valuable discussions.
S.C. and P.K. did the machine learning computations, and S.G.S. carried out the DFT computations in this work. S.C. wrote the manuscript. All authors jointly discussed the results and their implications and commented on the manuscript.