Abstract
This work presents a machine-learning approach that predicts the putative enzymatic function of unknown proteins, based on the PFAM protein domain database and the Enzyme Commission (EC) numbers that describe enzymatic activities. The classifier was trained on well-annotated protein datasets from the UniProt database in order to define the characteristic domains of each enzymatic sub-category within the class of Hydrolases. In conclusion, a machine-learning procedure based on HMMER3 scores against the PFAM database can accurately predict the enzymatic activity of unknown proteins as part of metagenomic analysis workflows.
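The general idea of classifying proteins from their domain scores can be illustrated with a minimal sketch. The abstract does not name the classifier or the feature encoding, so everything below is an assumption made for illustration: each protein is represented as a vector of per-domain bit scores (0 where a domain has no hit), and a nearest-centroid rule with cosine similarity stands in for the unspecified learning method. The domain identifiers `PF_A`, `PF_B`, `PF_C` are hypothetical placeholders, the scores are toy values rather than real HMMER3 output, and the labels `3.1` and `3.4` are used only as example Hydrolase sub-classes.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical domain identifiers (placeholders, not real PFAM accessions).
DOMAINS = ["PF_A", "PF_B", "PF_C"]


def build_vector(hits, domains):
    """Turn {domain_id: bit_score} into a fixed-order feature vector."""
    return [hits.get(d, 0.0) for d in domains]


def train_centroids(samples, domains):
    """Average the feature vectors of each EC label.

    samples: list of (hits_dict, ec_label) pairs.
    Returns {ec_label: centroid_vector}.
    """
    sums = defaultdict(lambda: [0.0] * len(domains))
    counts = defaultdict(int)
    for hits, label in samples:
        vec = build_vector(hits, domains)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(hits, centroids, domains):
    """Return the EC label whose centroid is most similar to the query."""
    vec = build_vector(hits, domains)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy training set: invented scores, for illustration only.
TRAINING = [
    ({"PF_A": 80.0}, "3.1"),
    ({"PF_A": 65.0, "PF_C": 10.0}, "3.1"),
    ({"PF_B": 90.0}, "3.4"),
    ({"PF_B": 70.0, "PF_C": 12.0}, "3.4"),
]

centroids = train_centroids(TRAINING, DOMAINS)
print(predict({"PF_A": 50.0}, centroids, DOMAINS))
```

In a real workflow the score dictionaries would be parsed from HMMER3 search results against PFAM, and the toy nearest-centroid rule would be replaced by whichever classifier the full paper describes.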
Copyright information
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Koutsandreas, T., Pilalis, E., Chatziioannou, A. (2013). A Machine-Learning Approach for the Prediction of Enzymatic Activity of Proteins in Metagenomic Samples. In: Papadopoulos, H., Andreou, A.S., Iliadis, L., Maglogiannis, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2013. IFIP Advances in Information and Communication Technology, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41142-7_9
DOI: https://doi.org/10.1007/978-3-642-41142-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41141-0
Online ISBN: 978-3-642-41142-7
eBook Packages: Computer Science, Computer Science (R0)