Statistical Machine Learning for Agriculture and Human Health Care Based on Biomedical Big Data
The availability of biomedical big data provides an opportunity to develop data-driven approaches in agriculture and human health care research. In this study, we investigate statistical machine learning approaches to metabolic pathway reconstruction and the prediction of drug–target interactions, using heterogeneous biomedical big data. We present an \(L_1\)-regularized pairwise support vector machine to predict unknown enzymatic reactions among metabolome-scale compounds, based on chemical transformation patterns of compounds. We also present supervised bipartite graph inference with kernel methods to predict unknown interactions between drugs and target proteins, based on the chemical structures of drugs and the amino acid sequences of proteins. We experimentally demonstrate that these methods can be applied to rational compound synthesis and efficient drug discovery for a range of human diseases. Such methods are expected to increase the productivity of research in the food and pharmaceutical industries.
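To make the idea of an \(L_1\)-regularized pairwise classifier concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy pair representation (the absolute difference of two binary fingerprints, standing in for a chemical transformation pattern), an invented labeling rule, and a generic proximal subgradient solver for the hinge loss with an \(L_1\) penalty. The names `pair_features`, `soft_threshold`, `train_l1_svm`, and `predict` and all parameter values are hypothetical.

```python
from itertools import product


def pair_features(x, z):
    """Toy 'transformation pattern' (assumption): which fingerprint bits differ."""
    return [abs(a - b) for a, b in zip(x, z)]


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def train_l1_svm(X, y, lam=0.01, eta=0.1, n_iter=500):
    """Minimize mean hinge loss + lam * ||w||_1 by proximal subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge loss is active for this pair
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        # Subgradient step on the hinge term, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - eta * gj, eta * lam) for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b  # the bias is not regularized
    return w, b


def predict(w, b, x):
    """Return +1 (pair predicted to react) or -1 (no reaction)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Toy data: all ordered pairs of 2-bit fingerprints. A pair is labeled +1
# exactly when bit 0 changes; this rule is invented purely for illustration.
fingerprints = list(product([0, 1], repeat=2))
pairs = [(x, z) for x in fingerprints for z in fingerprints]
X = [pair_features(x, z) for x, z in pairs]
y = [1 if x[0] != z[0] else -1 for x, z in pairs]

w, b = train_l1_svm(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
print("weights:", w, "bias:", b, "training accuracy:", accuracy)
```

In this sketch the \(L_1\) penalty pushes the weight of the uninformative bit toward zero, which is the property that makes such models useful for extracting informative features from very high-dimensional pair representations. Real compound fingerprints and reaction labels would replace the toy data.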
Keywords: Metabolic pathways; Drug targets; Machine learning; Classification; Feature extraction; Graph inference
This work is supported by JST PRESTO Grant Number JPMJPR15D8; JSPS KAKENHI Grant Numbers 25700029 and 15K14980; the Program to Disseminate Tenure Tracking System, MEXT, Japan; and Kyushu University Interdisciplinary Programs in Education and Projects in Research Development.