Abstract
Bioinformatics is characterised by a growing diversity of large-scale databases containing information on genetics, proteins, metabolism and disease. It is widely agreed that there is an increasingly urgent need for technologies which can integrate these disparate knowledge sources. In this paper we propose that not only is machine learning a good candidate technology for such data integration, but Inductive Logic Programming, in particular, has strengths for handling the relational aspects of this task. Relations can be used to capture, in a single representation, not only biochemical reaction information but also protein and ligand structure as well as metabolic network information. Resources such as the Gene Ontology (GO) and the Enzyme Commission (EC) system both provide isa-hierarchies of enzyme functions. On the face of it GO and EC should be invaluable resources for supporting automation within Functional Genomics, which aims at predicting the function of unassigned enzymes from the genome projects. However, neither GO nor EC can be directly used for this purpose since the classes have only a natural language description. In this paper we make an initial attempt at machine learning EC classes for the purpose of enzyme function prediction in terms of biochemical reaction descriptions found in the LIGAND database. To our knowledge this is the first attempt to do so. In our experiments we learn descriptions for a small set of EC classes including Oxidoreductase and Phosphotransferase. Predictive accuracies are provided for all learned classes. In further work we hope to complete the learning of enzyme classes and integrate the learned models with metabolic network descriptions to support “gap-filling” in the present understanding of metabolism.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Muggleton, S., Tamaddoni-Nezhad, A., Watanabe, H. (2003). Induction of Enzyme Classes from Biological Databases. In: Horváth, T., Yamamoto, A. (eds) Inductive Logic Programming. ILP 2003. Lecture Notes in Computer Science, vol 2835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39917-9_18
DOI: https://doi.org/10.1007/978-3-540-39917-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20144-1
Online ISBN: 978-3-540-39917-9
eBook Packages: Springer Book Archive