Volume 8, Issue 1, pp 384–389

Prediction of Aromatic Hydroxylation Sites for Human CYP1A2 Substrates Using Condensed Graph of Reactions

  • T. I. Madzhidov
  • A. A. Khakimova
  • R. I. Nugmanov
  • C. Muller
  • G. Marcou
  • A. Varnek


In this paper, support vector machine (SVM) and condensed graph of reaction (CGR) approaches were used to predict the regioselectivity of aromatic hydroxylation for human CYP1A2 substrates. Experimental data on aromatic hydroxylation by human cytochrome CYP1A2 (experimentally observed, or “real”, transformations) used in the modeling were extracted from the Metabolite database and the XenoSite database. In addition, all potential but unobserved (“unreal”) transformations were generated. The dataset containing “real” and “unreal” transformations was converted into an ensemble of CGRs, which represent pseudomolecules with conventional (single, double, aromatic, etc.) bonds and dynamic bonds characterizing chemical transformations. ISIDA fragment descriptors generated for the CGRs were used for the modeling. The models were validated by fivefold cross-validation repeated three times on the training set and then on an external test set. The final model was constructed as a consensus over models built on different descriptor sets. The predictive performance of our model on the external test set was similar to that of the XenoSite and Way2Drug tools. Unlike previously used approaches based on atom labeling, the proposed CGR-based representation of metabolic transformations could be applied to different types of reactions catalyzed by the same enzyme and is therefore more suitable for automated handling of metabolic data.
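The idea of enumerating “real” and “unreal” transformations and encoding each one as a CGR can be illustrated with a minimal sketch. It is not the authors' software and does not use ISIDA. The toy substrate, the atom numbering, the "observed" site, and all function names (`condensed_graph`, `dynamic_bonds`, `enumerate_transformations`) are assumptions made purely for illustration. Hydrogens are left implicit, so hydroxylation appears only as a newly created C–O bond.

```python
def bond(a, b):
    """Order-independent key for a bond between atoms a and b."""
    return frozenset((a, b))


def condensed_graph(reactant_bonds, product_bonds):
    """Merge reactant and product bond tables into one CGR-style edge table.

    Every edge maps to (order_before, order_after); None means the bond
    does not exist on that side of the transformation.
    """
    pairs = set(reactant_bonds) | set(product_bonds)
    return {p: (reactant_bonds.get(p), product_bonds.get(p)) for p in pairs}


def dynamic_bonds(cgr):
    """Return only the edges whose order changes (the 'dynamic' bonds)."""
    return {p: orders for p, orders in cgr.items() if orders[0] != orders[1]}


def hydroxylate(bonds, carbon, oxygen):
    """Product bond table after attaching a hydroxyl oxygen to `carbon`."""
    product = dict(bonds)
    product[bond(carbon, oxygen)] = 1
    return product


def enumerate_transformations(bonds, aromatic_ch, observed):
    """Build one candidate CGR per aromatic C-H atom.

    Candidates whose site is in `observed` are labeled real (True);
    all other candidates are labeled unreal (False).
    """
    oxygen = max(atom for pair in bonds for atom in pair) + 1  # new atom index
    result = []
    for carbon in aromatic_ch:
        cgr = condensed_graph(bonds, hydroxylate(bonds, carbon, oxygen))
        result.append((carbon, cgr, carbon in observed))
    return result


# Toy substrate (hypothetical): a six-membered aromatic ring (atoms 1-6)
# with one substituent carbon (atom 7) on ring atom 1.
RING = [1, 2, 3, 4, 5, 6]
substrate = {bond(RING[i], RING[(i + 1) % 6]): "ar" for i in range(6)}
substrate[bond(1, 7)] = 1

# Assumed for illustration only: hydroxylation was observed at atom 4.
candidates = enumerate_transformations(substrate, [2, 3, 4, 5, 6], observed={4})

for site, cgr, is_real in candidates:
    label = "real" if is_real else "unreal"
    print(site, label, dynamic_bonds(cgr))
```

In this sketch every candidate shares the same conventional bonds and differs only in its single dynamic bond, which is what lets a fragment-based descriptor scheme distinguish the candidate sites of one substrate. A classifier would then be trained on descriptors of such labeled candidates.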


CYP1A2; Aromatic hydroxylation; Support vector machine (SVM); Condensed graph of reaction (CGR)



We thank Prof. Vladimir Poroikov for providing us with the experimental data set and for useful discussion. ChemAxon is acknowledged for the software tools used in this study for data storage and standardization. The study was supported by the Russian Science Foundation (Contract 14-43-00024).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Kazan Federal University, Kazan, Russia
  2. Université de Strasbourg, Strasbourg, France
