Abstract
Background
Computational prediction of the inhibition efficiency (IE) of inhibitor molecules is an important supplementary approach to designing novel molecules that efficiently inhibit corrosion on metallic surfaces.
Purpose
Here we develop a new machine learning-based predictor of the IE of benzimidazole derivatives.
Methods
First, the inhibitor molecules were given a comprehensive numerical representation covering energy, electronic, topological, physicochemical and spatial properties derived from their 3-D structures, which yielded 150 valid structural descriptors. These descriptors were then investigated thoroughly. A multicollinearity-based clustering analysis was performed to remove linearly correlated feature variables, producing 47 feature clusters. Within each cluster, the Gini importance from a random forest (RF) was used to measure the contribution of each descriptor, and the descriptor with the highest Gini importance score was selected, giving 47 descriptors that are not linearly correlated with one another. Finally, considering the limited number of available inhibitors, different feature subsets were constructed according to the Gini importance ranking of these 47 descriptors.
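The selection steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithm or the correlation cut-off, so the greedy grouping and the `threshold=0.9` value are hypothetical, as are the descriptor names and all numbers. The importance scores are supplied by hand here, whereas in the study they come from the Gini importance of a random forest.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)

def cluster_by_collinearity(columns, threshold=0.9):
    """Greedy grouping (an assumed stand-in for the paper's clustering):
    a descriptor joins the first cluster whose seed it correlates with
    at |r| >= threshold, otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of names; the first entry is the seed
    for name in columns:
        for cluster in clusters:
            if abs(pearson(columns[name], columns[cluster[0]])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def select_representatives(clusters, importance):
    """Keep the highest-importance descriptor of each cluster, ranked by importance."""
    picked = [max(c, key=lambda n: importance[n]) for c in clusters]
    return sorted(picked, key=lambda n: importance[n], reverse=True)

# Toy data: hypothetical descriptor names, made-up values and scores.
columns = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 4.0, 6.2, 8.1, 9.9],        # nearly collinear with d1
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "d4": [-5.0, -1.1, -4.0, -2.0, -3.1],   # nearly anti-collinear with d3
}
importance = {"d1": 0.10, "d2": 0.35, "d3": 0.40, "d4": 0.15}

clusters = cluster_by_collinearity(columns)
ranked = select_representatives(clusters, importance)
# Nested candidate subsets: top 1, top 2, ... descriptors by importance.
subsets = [ranked[:k] for k in range(1, len(ranked) + 1)]
```

Building nested top-k subsets from a single ranking keeps the number of candidate models small, which matters when only a limited number of inhibitors is available for training.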
Results
Finally, support vector machine (SVM) models built on the different feature subsets were evaluated by leave-one-out cross validation. Comparison of these models showed that the optimal SVM model uses the top 11 descriptors with a polynomial kernel. This model achieves a correlation coefficient (R) of 0.9589 and a root-mean-square error (RMSE) of 4.45, which indicates that the proposed method gives the best performance on the current data.
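The evaluation protocol described above (leave-one-out cross validation scored by R and RMSE) can be illustrated as follows. This is a minimal sketch, not the authors' implementation: the regressor is a deliberately simple nearest-neighbour stand-in rather than an SVM, the helper names are hypothetical, and the data are made up. Any model exposing the same `fit_predict(X_train, y_train, x)` shape could be substituted.

```python
from math import sqrt

def loo_predictions(X, y, fit_predict):
    """Leave-one-out CV: predict each sample from a model fitted on all the others."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, X[i]))
    return preds

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(y_true, y_pred):
    """Correlation coefficient R between observed and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    vt = sum((t - mt) ** 2 for t in y_true)
    vp = sum((p - mp) ** 2 for p in y_pred)
    return cov / sqrt(vt * vp)

def nearest_neighbour(X_train, y_train, x):
    """Stand-in regressor (NOT an SVM): return the target of the closest training sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X_train]
    return y_train[dists.index(min(dists))]

# Made-up toy data: one descriptor per molecule, IE values in percent.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [10.0, 20.0, 30.0, 40.0]

y_loo = loo_predictions(X, y, nearest_neighbour)
score_r = pearson_r(y, y_loo)
score_rmse = rmse(y, y_loo)
```

Because every prediction is made for a molecule the model has not seen, the resulting R and RMSE estimate how the model generalises, which is why the protocol suits small data sets where a separate test set is not affordable.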
Conclusion
Based on our model, six new benzimidazole molecules were designed, and their predicted IE values indicate that two of them have high potential as outstanding corrosion inhibitors.
Acknowledgements
This work was financially supported by the Major Science and Technology Project of China National Petroleum Co. Ltd (No. 2016E-0609). We also thank the Comprehensive Training Platform of Specialized Laboratory, College of Chemistry, Sichuan University, for sample analysis.
Ethics declarations
Conflict of interest
The authors declare no competing financial interests.
Cite this article
Liu, Y., Guo, Y., Wu, W. et al. A Machine Learning-Based QSAR Model for Benzimidazole Derivatives as Corrosion Inhibitors by Incorporating Comprehensive Feature Selection. Interdiscip Sci Comput Life Sci 11, 738–747 (2019). https://doi.org/10.1007/s12539-019-00346-7
DOI: https://doi.org/10.1007/s12539-019-00346-7