Abstract
Artificial intelligence has become an indispensable resource in chemoinformatics. Numerous machine learning algorithms for activity prediction have recently emerged, offering an effective way to mine chemical information from large compound datasets. These approaches enable the automation of compound discovery, helping to find biologically active molecules with important properties. Here, we review some of the main machine learning studies on predicting the biological activity of compounds, with a particular focus on sweetness prediction. We also discuss some of the most widely used compound featurization techniques and the major databases of chemical compounds relevant to these tasks.
Acknowledgments
This study was supported by the European Commission through project SHIKIFACTORY100 - Modular cell factories for the production of 100 compounds from the shikimate pathway (Reference 814408), and by the Portuguese FCT under the scope of the strategic funding of UID/BIO/04469/2019 unit and BioTecNorte operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund under the scope of Norte2020.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Correia, J., Resende, T., Baptista, D., Rocha, M. (2020). Artificial Intelligence in Biological Activity Prediction. In: Fdez-Riverola, F., Rocha, M., Mohamad, M., Zaki, N., Castellanos-Garzón, J. (eds) Practical Applications of Computational Biology and Bioinformatics, 13th International Conference. PACBB 2019. Advances in Intelligent Systems and Computing, vol 1005. Springer, Cham. https://doi.org/10.1007/978-3-030-23873-5_20
DOI: https://doi.org/10.1007/978-3-030-23873-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23872-8
Online ISBN: 978-3-030-23873-5
eBook Packages: Intelligent Technologies and Robotics (R0)