Abstract
The main objective of this research is to build a comprehensible enzyme function classification using high-coverage, high-precision reactive motifs. Reactive motifs are extracted directly from protein active and binding sites. Their main advantage is high coverage; however, generalization tends to make them over-generalized, which lowers their precision. In this paper, a method for generating reactive motifs with negative patterns is proposed. Negative patterns make it possible to control the level of motif generalization. As a result, reactive motifs that are not over-generalized and have high precision are generated. Each reactive motif is associated with a specific enzyme function, so the motifs can directly predict the enzyme function of an unknown protein sequence. Instead of a complex classification model, a set of voting methods is proposed and used to construct a comprehensible enzyme function classification. Essential clues about the enzyme mechanism are thereby provided to biologist end-users.
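The idea of motif-based prediction with negative patterns and voting can be illustrated with a minimal Python sketch. It is not the authors' method: the regular-expression encoding of motifs, the example patterns, the labels `class_A` and `class_B`, and the simple majority vote are all assumptions made for illustration, since the abstract does not specify the motif representation or the voting schemes.

```python
import re
from collections import Counter

# Hypothetical motifs: (positive pattern, negative patterns, function label).
# The patterns and labels are invented for illustration only.
MOTIFS = [
    ("G[DE]S[AG]G", ["GDSAG"], "class_A"),
    ("H[ILV]D", [], "class_B"),
    ("C[ST]H", [], "class_A"),
]

def motif_matches(sequence, positive, negatives):
    """True if the positive pattern occurs at a site that no negative pattern excludes."""
    for match in re.finditer(positive, sequence):
        segment = match.group(0)
        if not any(re.fullmatch(neg, segment) for neg in negatives):
            return True
    return False

def predict_function(sequence, motifs=MOTIFS):
    """Assumed voting rule: each matching motif casts one vote for its label."""
    votes = Counter(
        label
        for positive, negatives, label in motifs
        if motif_matches(sequence, positive, negatives)
    )
    if not votes:
        return None  # no motif matched, so no prediction is made
    return votes.most_common(1)[0][0]

print(predict_function("AAGESGGKKCSHAA"))
print(predict_function("AAGDSAGHLD"))
```

In the second call, the negative pattern rejects the segment `GDSAG` that the generalized pattern would otherwise accept, so only the `class_B` motif votes. Because the prediction is just a tally of matched motifs, the motifs behind a prediction can be listed directly, which is the sense in which such a classifier stays comprehensible.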
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kangkachit, T., Waiyamai, K. (2016). Comprehensible Enzyme Function Classification Using Reactive Motifs with Negative Patterns. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_44
DOI: https://doi.org/10.1007/978-3-319-41920-6_44
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science (R0)