Machine Learning Techniques Applied to the Cleavage Site Prediction Problem

Alvarez, Gloria Inés; Bravo, Enrique; Linares, Diego; Vargas, Jheyson Faride; Velasco, Jairo Andrés

doi:10.1007/978-3-642-45114-0_39

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8265)

Abstract

The genome of the Potyviridae virus family is usually expressed as a polyprotein, which is divided into ten proteins through the action of proteases, enzymes that cut the chain at specific places called cleavage sites. Three different techniques were employed to model each cleavage site: Hidden Markov Models (HMM), the Order Independent Language (OIL) grammatical inference algorithm, and Artificial Neural Networks (ANN).

Based on experimentation, the Hidden Markov Model has the best classification performance and is highly robust to class imbalance. However, the Order Independent Language (OIL) algorithm improves when models are trained on a greater number of samples, regardless of how heavily imbalanced those samples are.
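The class imbalance mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate cut position in a polyprotein is represented by a fixed-width window of residues and labelled positive only when a known cleavage site falls at the window centre. The function name `window_samples`, the window width, the toy sequence, and the site positions are all hypothetical; the abstract does not state how the authors encoded their samples.

```python
from typing import List, Tuple


def window_samples(sequence: str,
                   cleavage_positions: List[int],
                   half_width: int = 4) -> List[Tuple[str, int]]:
    """Return (window, label) pairs for every candidate cut position.

    Position i denotes a cut between residue i-1 and residue i.
    The label is 1 when i is a known cleavage site, otherwise 0.
    """
    sites = set(cleavage_positions)
    samples = []
    for i in range(half_width, len(sequence) - half_width + 1):
        # half_width residues on each side of the candidate cut
        window = sequence[i - half_width:i + half_width]
        samples.append((window, 1 if i in sites else 0))
    return samples


# Hypothetical toy data, not a real Potyviridae polyprotein.
toy_sequence = "MASTVQLGAYEHQSGKDTALNWPRFIVC" * 2
toy_sites = [10, 30]

samples = window_samples(toy_sequence, toy_sites)
positives = sum(label for _, label in samples)
negatives = len(samples) - positives
print(f"positive windows: {positives}, negative windows: {negatives}")
```

Under this framing, each sequence contributes only a handful of positive windows and many negative ones, which is one plausible reason robustness to imbalance matters when comparing the three techniques.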

The translation for publication in English was done by John Field Palencia Roth, assistant professor in the Department of Communication and Language of the Faculty of Humanities and Social Sciences at the Pontificia Universidad Javeriana Cali. This work is funded by the Departamento Administrativo de Ciencia, Tecnología e Innovación de Colombia (COLCIENCIAS) under the grant project code 1251-521-28290.

Author information

Authors and Affiliations

Pontificia Universidad Javeriana Cali, Colombia
Gloria Inés Alvarez, Diego Linares, Jheyson Faride Vargas & Jairo Andrés Velasco
Universidad del Valle, Colombia
Enrique Bravo

Editor information

Editors and Affiliations

Universidad Autónoma del Estado de Hidalgo, Ciudad Universitaria, Carretera Pachuca–Tulancingo km 4.5, Hidalgo, Mexico
Félix Castro
Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Alexander Gelbukh
Tecnológico de Monterrey, Campus Estado de México, Carretera Lago de Guadalupe Km 3.5, Atizapán de Zaragoza, CP 52926, Estado de México, Mexico
Miguel González

About this paper

Cite this paper

Alvarez, G.I., Bravo, E., Linares, D., Vargas, J.F., Velasco, J.A. (2013). Machine Learning Techniques Applied to the Cleavage Site Prediction Problem. In: Castro, F., Gelbukh, A., González, M. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2013. Lecture Notes in Computer Science, vol 8265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45114-0_39

DOI: https://doi.org/10.1007/978-3-642-45114-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45113-3
Online ISBN: 978-3-642-45114-0
eBook Packages: Computer Science, Computer Science (R0)
