Abstract

Artificial intelligence has become an indispensable resource in chemoinformatics. Numerous machine learning algorithms for activity prediction have recently emerged, offering an effective way to mine chemical information from large compound datasets. These approaches enable the automation of compound discovery to find biologically active molecules with important properties. Here, we review some of the main machine learning studies on predicting the biological activity of compounds, with a particular focus on sweetness prediction. We also discuss some of the most widely used compound featurization techniques and the major chemical compound databases relevant to these tasks.

Acknowledgments

This study was supported by the European Commission through project SHIKIFACTORY100 - Modular cell factories for the production of 100 compounds from the shikimate pathway (Reference 814408), and by the Portuguese FCT under the scope of the strategic funding of the UID/BIO/04469/2019 unit and the BioTecNorte operation (NORTE-01-0145-FEDER-000004), funded by the European Regional Development Fund under the scope of Norte2020.

Corresponding author

Correspondence to João Correia.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Correia, J., Resende, T., Baptista, D., Rocha, M. (2020). Artificial Intelligence in Biological Activity Prediction. In: Fdez-Riverola, F., Rocha, M., Mohamad, M., Zaki, N., Castellanos-Garzón, J. (eds) Practical Applications of Computational Biology and Bioinformatics, 13th International Conference. PACBB 2019. Advances in Intelligent Systems and Computing, vol 1005. Springer, Cham. https://doi.org/10.1007/978-3-030-23873-5_20
