Computational Modeling of Nonlinear Phenomena Using Machine Learning

  • Anthony J. Hickey
  • Hugh D. C. Smyth
Part of the AAPS Introductions in the Pharmaceutical Sciences book series (AAPSINSTR)


Machine learning (ML) is a field of computer science in which algorithms interrogate data, adjust how they navigate those data (learning), and, through statistical derivation, predict unseen data or events. ML has been a high-profile topic for many years and is ubiquitous in daily life, from e-mail spam and malware filtering to search result refinement, online customer service, and fraud detection. More recently, ML has become pervasive in modeling complex nonlinear phenomena in the pharmaceutical and medical sciences. Although it has been used to model chemical data sets for two decades, it has only recently become a useful approach for improving healthcare diagnoses and providing personalized medical treatments. The rapid growth in data collection and integration, together with increasingly accessible computing power, especially through cloud services, explains this unforeseen capacity to transform data into information, information into knowledge, and knowledge into wisdom (see Fig. 7.1). In this section, we briefly introduce the concepts and types of ML and its applications in drug discovery, drug product development, and clinical practice. The literature in these fields and the importance and challenges of interpreting ML results are also discussed.


Computational modeling; Machine learning; Artificial intelligence; Drug discovery; Product development; Clinical application



Copyright information

© American Association of Pharmaceutical Scientists 2020

Authors and Affiliations

  • Anthony J. Hickey (1)
  • Hugh D. C. Smyth (2)
  1. RTI International, Research Triangle Park, USA
  2. College of Pharmacy, The University of Texas at Austin, Austin, USA
