Comprehensible Enzyme Function Classification Using Reactive Motifs with Negative Patterns

Kangkachit, Thanapat; Waiyamai, Kitsana

doi:10.1007/978-3-319-41920-6_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

The main objective of this work is to build a comprehensible enzyme function classifier using high-coverage, high-precision reactive motifs. Reactive motifs are extracted directly from protein active and binding sites. Their main advantage is high coverage; however, generalization can leave them over-generalized, which lowers their precision. This paper proposes a method for generating reactive motifs with negative patterns, which control the level of motif generalization. As a result, the generated reactive motifs are not over-generalized and have high precision. Each reactive motif is associated with a specific enzyme function, so the motifs can directly predict the enzyme function of an unknown protein sequence. Instead of a complex classification model, a set of voting methods is proposed and used to construct the comprehensible enzyme function classifier, which provides biologist end users with essential clues about the enzyme mechanism.
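
The abstract does not give the motif representation or the voting rules, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a motif is a fixed-length sequence of positions, that a negative pattern is a position that excludes certain residues (similar in spirit to the `{P}` element of PROSITE notation), and that prediction is a simple majority vote over matching motifs. All names (`Position`, `Motif`, `predict`), the toy motifs, and the labels `function_A` and `function_B` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """One motif position: a set of residues, allowed or (if negated) excluded."""
    residues: frozenset
    negated: bool = False  # assumption: a "negative pattern" excludes these residues

    def accepts(self, aa: str) -> bool:
        # Positive position: aa must be in the set. Negated: aa must not be.
        return (aa in self.residues) != self.negated


@dataclass(frozen=True)
class Motif:
    positions: tuple
    function: str  # enzyme function label tied to this motif

    def matches(self, seq: str) -> bool:
        # Slide the motif over the sequence; match if any window fits.
        n = len(self.positions)
        return any(
            all(p.accepts(aa) for p, aa in zip(self.positions, seq[i:i + n]))
            for i in range(len(seq) - n + 1)
        )


def pos(letters: str, negated: bool = False) -> Position:
    return Position(frozenset(letters), negated)


def predict(seq: str, motifs):
    """Majority vote over the functions of all matching motifs (hypothetical rule)."""
    votes = Counter(m.function for m in motifs if m.matches(seq))
    return votes.most_common(1)[0][0] if votes else None


# Toy motifs, invented for illustration only.
MOTIFS = [
    # [GA]-S-{P}-H: the third position accepts anything except proline.
    Motif((pos("GA"), pos("S"), pos("P", negated=True), pos("H")), "function_A"),
    Motif((pos("S"), pos("KR"), pos("H")), "function_A"),
    Motif((pos("G"), pos("S"), pos("P"), pos("H")), "function_B"),
]

print(predict("MKGSAHSKH", MOTIFS))  # two function_A motifs match
print(predict("MGSPHK", MOTIFS))     # the negative pattern blocks the first motif
```

In this sketch the negated position is what keeps the first motif from matching `GSPH`: without it, a fully generalized third position would also fire on the `function_B` sequence. Because the prediction is just a tally of matched motifs, the motifs that voted can be shown to the user as the explanation.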

Author information

Authors and Affiliations

Department of Computer Engineering, Kasetsart University, Bangkok, Thailand
Thanapat Kangkachit & Kitsana Waiyamai

Corresponding author

Correspondence to Thanapat Kangkachit.

Editor information

Editors and Affiliations

IBaI, Inst of Comp Vision and applied Comp Sci, Leipzig, Sachsen, Germany
Petra Perner

About this paper

Cite this paper

Kangkachit, T., Waiyamai, K. (2016). Comprehensible Enzyme Function Classification Using Reactive Motifs with Negative Patterns. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_44

DOI: https://doi.org/10.1007/978-3-319-41920-6_44
Published: 28 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)
