In silico prediction of drug-induced developmental toxicity by using machine learning approaches


Some drugs and xenobiotics have the potential to disturb homeostasis, normal growth, differentiation, development or behavior during prenatal development or postnatally until puberty. Assessment of developmental toxicity is an important safety consideration required by international regulatory agencies. In this investigation, seven machine learning methods (naïve Bayes, support vector machine, recursive partitioning, k-nearest neighbor, C4.5 decision tree, random forest and AdaBoost) were used to build binary classification models for developmental toxicity. Among these models, the naïve Bayes classifier showed the best predictive performance and stability, giving 91.11% overall prediction accuracy, 91.50% balanced accuracy and an MCC of 0.818 for the training set, and 83.93% concordance, 81.85% balanced accuracy and an MCC of 0.627 for the test set. The applicability domain was analyzed, and only one chemical in the test set was identified as lying outside it. In addition, 10 important molecular descriptors related to developmental toxicity were selected by the genetic algorithm, which may help explain the mechanisms of developmental toxicants. The best naïve Bayes classification model should be employed as an alternative method for qualitative prediction of chemical-induced developmental toxicity in the early stages of drug development.
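
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how overall accuracy, balanced accuracy and the Matthews correlation coefficient (MCC) are usually computed for a binary classifier. The function name `binary_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration. They are not taken from the study, and the sketch assumes both classes are present in the data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, balanced accuracy and MCC from confusion-matrix counts.

    Assumes at least one positive and one negative sample (tp + fn > 0, tn + fp > 0).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total

    # Balanced accuracy is the mean of sensitivity and specificity.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2

    # MCC ranges from -1 to 1; by convention it is set to 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy and MCC are typically reported alongside overall accuracy because they stay informative when toxic and non-toxic compounds are unevenly represented.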



This work was supported by the National Natural Science Foundation of China (Grant nos. 81660589 and 81903543).

Author information



Corresponding authors

Correspondence to Hui Zhang or Lan Ding.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLSX 22 kb)

Cite this article

Zhang, H., Mao, J., Qi, HZ. et al. In silico prediction of drug-induced developmental toxicity by using machine learning approaches. Mol Divers 24, 1281–1290 (2020).

Keywords

  • Developmental toxicity
  • Machine learning
  • In silico prediction
  • Molecular descriptor
  • Genetic algorithm