A Machine Learning-Based QSAR Model for Benzimidazole Derivatives as Corrosion Inhibitors by Incorporating Comprehensive Feature Selection

  • Original research article
Interdisciplinary Sciences: Computational Life Sciences

Abstract

Background

Computational prediction of the inhibition efficiency (IE) of inhibitor molecules is a valuable complement to experiment in designing novel molecules that efficiently inhibit corrosion of metallic surfaces.

Purpose

Here we develop a new machine learning-based predictor of the IE of benzimidazole derivatives.

Methods

First, each inhibitor molecule was given a comprehensive numerical representation covering energy, electronic, topological, physicochemical and spatial properties derived from its 3-D structure, which yielded 150 valid structural descriptors. These descriptors were then examined thoroughly. Multicollinearity-based clustering analysis was performed to remove linearly correlated feature variables, producing 47 feature clusters. Gini importance from a random forest (RF) was then used to measure the contribution of the descriptors within each cluster, and the descriptor with the highest Gini importance score in each cluster was selected, giving 47 descriptors that are not linearly correlated with one another. Finally, given the limited number of available inhibitors, different feature subsets were constructed according to the Gini importance ranking of these 47 descriptors.
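The selection steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation. The correlation threshold of 0.9, the single-linkage grouping, the descriptor names and the importance scores are all assumptions made for the example. In the paper the importance scores come from a random forest, whereas here they are supplied by hand.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_correlated(columns, threshold=0.9):
    """Group descriptors linked by |r| >= threshold (single linkage via union-find)."""
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


def select_representatives(clusters, importance):
    """Keep the highest-importance descriptor of each cluster, ranked best first."""
    reps = [max(cluster, key=importance.get) for cluster in clusters]
    return sorted(reps, key=importance.get, reverse=True)


def nested_subsets(ranked):
    """Top-1, top-2, ... feature subsets built from the ranked descriptor list."""
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


# Toy data: hypothetical descriptor names and values, not taken from the paper.
columns = {
    "homo": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lumo": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly collinear with "homo"
    "dipole": [5.0, 1.0, 4.0, 2.0, 3.0],
    "volume": [1.0, 3.0, 2.0, 5.0, 4.0],
}
# Hypothetical importance scores standing in for random forest Gini importance.
importance = {"homo": 0.40, "lumo": 0.25, "dipole": 0.20, "volume": 0.15}

clusters = cluster_correlated(columns, threshold=0.9)
ranked = select_representatives(clusters, importance)
subsets = nested_subsets(ranked)
```

On this toy input the two nearly collinear descriptors fall into one cluster, so three representatives remain and three nested subsets are produced. Each subset would then be passed to the regression model for evaluation.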

Results

Support vector machine (SVM) models built on the different feature subsets were evaluated by leave-one-out cross-validation. Comparison of these models showed that the optimal SVM model uses the top 11 descriptors with a polynomial kernel. This model achieves a correlation coefficient (R) of 0.9589 and a root-mean-square error (RMSE) of 4.45, which indicates that the proposed method gives the best performance on the current data.
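The evaluation protocol can be illustrated with a short standard-library Python sketch of leave-one-out cross-validation that reports R and RMSE. To keep the example self-contained, a one-descriptor least-squares line stands in for the SVM with polynomial kernel used in the paper. The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt


def leave_one_out_predictions(X, y, fit, predict):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        preds.append(predict(model, X[i]))
    return preds


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation (R) between observed and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    stp = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    stt = sum((a - mt) ** 2 for a in y_true)
    spp = sum((b - mp) ** 2 for b in y_pred)
    return stp / sqrt(stt * spp)


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def fit_line(X, y):
    """Stand-in model: least-squares line on the first descriptor only."""
    xs = [row[0] for row in X]
    n = len(xs)
    mx, my = sum(xs) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, y)) / sum((a - mx) ** 2 for a in xs)
    return slope, my - slope * mx


def predict_line(model, row):
    """Evaluate the fitted line for one sample."""
    slope, intercept = model
    return slope * row[0] + intercept


# Toy data: one hypothetical descriptor and made-up IE values (percent).
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [52.0, 61.0, 68.0, 81.0, 88.0, 97.0]

preds = leave_one_out_predictions(X, y, fit_line, predict_line)
r = correlation_coefficient(y, preds)
error = rmse(y, preds)
```

Because `fit` and `predict` are passed in as callables, the same loop can wrap any regressor. Running it once per candidate feature subset and comparing the resulting R and RMSE values mirrors the model comparison described above.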

Conclusion

Based on our model, six new benzimidazole molecules were designed. Their predicted IE values indicate that two of them have high potential as outstanding corrosion inhibitors.

Acknowledgements

This work was financially supported by the Major Science and Technology Project of China National Petroleum Co. Ltd (No. 2016E-0609). We also thank the Comprehensive Training Platform of Specialized Laboratory, College of Chemistry, Sichuan University, for sample analysis.

Author information

Corresponding authors

Correspondence to Youquan Liu or Yanzhi Guo.

Ethics declarations

Conflict of interest

The authors declare no competing financial interests.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 27 kb)

About this article

Cite this article

Liu, Y., Guo, Y., Wu, W. et al. A Machine Learning-Based QSAR Model for Benzimidazole Derivatives as Corrosion Inhibitors by Incorporating Comprehensive Feature Selection. Interdiscip Sci Comput Life Sci 11, 738–747 (2019). https://doi.org/10.1007/s12539-019-00346-7

  • DOI: https://doi.org/10.1007/s12539-019-00346-7
