Machine Learning Techniques Applied to the Cleavage Site Prediction Problem

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8265)

Abstract

The genome of viruses in the Potyviridae family is usually expressed as a single polyprotein, which proteases cut at specific positions, called cleavage sites, to yield ten proteins. Three techniques were used to model each cleavage site: Hidden Markov Models (HMM), the grammatical inference algorithm OIL, and Artificial Neural Networks (ANN).
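As a rough illustration of how a cleavage site can be framed as a classification problem, the sketch below cuts a sequence into fixed-width windows and labels each window by whether it is centred on a known cleavage position. The window width, the toy sequence, the toy site positions, and the function name `extract_windows` are all assumptions made for illustration; the abstract does not describe the authors' actual data preparation.

```python
from typing import List, Set, Tuple


def extract_windows(
    sequence: str, cleavage_positions: Set[int], half_width: int = 4
) -> List[Tuple[str, int]]:
    """Return (window, label) pairs for every candidate cut position.

    Position i denotes the bond between residues i-1 and i (0-based).
    Each window holds half_width residues on either side of that bond.
    The label is 1 if i is a known cleavage position, otherwise 0.
    """
    samples = []
    for i in range(half_width, len(sequence) - half_width + 1):
        window = sequence[i - half_width : i + half_width]
        label = 1 if i in cleavage_positions else 0
        samples.append((window, label))
    return samples


# Hypothetical toy data, not a real potyvirus polyprotein.
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
toy_sites = {8, 15}

samples = extract_windows(toy_sequence, toy_sites, half_width=4)
positives = [w for w, y in samples if y == 1]
negatives = [w for w, y in samples if y == 0]
print(len(positives), "positive windows,", len(negatives), "negative windows")
```

Even in this toy case the negative windows far outnumber the positive ones, which is the kind of class imbalance a cleavage site classifier has to cope with.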

In the experiments, the Hidden Markov Model achieved the best classification performance and was highly robust to class imbalance. The Order Independent Language (OIL) algorithm, however, improved when its models were trained on a larger number of samples, even though those samples were heavily imbalanced.
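To see why class imbalance matters when comparing classifiers, the sketch below contrasts plain accuracy with the Matthews correlation coefficient (MCC) on hypothetical confusion-matrix counts. The abstract does not say which metric the authors used, so MCC is an assumption here, chosen only because it is a common imbalance-aware measure; the counts are invented for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (for example, when the
    classifier never predicts the positive class), a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 10 true cleavage windows among 1000 candidates.
always_negative = dict(tp=0, tn=990, fp=0, fn=10)  # predicts "no site" everywhere
useful_model = dict(tp=8, tn=980, fp=10, fn=2)     # finds most of the sites

for name, counts in [("always negative", always_negative), ("useful model", useful_model)]:
    print(name, round(accuracy(**counts), 3), round(matthews_corrcoef(**counts), 3))
```

With these counts the trivial classifier scores slightly higher on accuracy than the useful one, while MCC separates them clearly, which is why accuracy alone can mislead on heavily imbalanced data.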

The translation for publication in English was done by John Field Palencia Roth, assistant professor in the Department of Communication and Language of the Faculty of Humanities and Social Sciences at the Pontificia Universidad Javeriana Cali. This work is funded by the Departamento Administrativo de Ciencia, Tecnología e Innovación de Colombia (COLCIENCIAS) under grant project code 1251-521-28290.



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alvarez, G.I., Bravo, E., Linares, D., Vargas, J.F., Velasco, J.A. (2013). Machine Learning Techniques Applied to the Cleavage Site Prediction Problem. In: Castro, F., Gelbukh, A., González, M. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2013. Lecture Notes in Computer Science, vol 8265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45114-0_39

  • DOI: https://doi.org/10.1007/978-3-642-45114-0_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45113-3

  • Online ISBN: 978-3-642-45114-0

  • eBook Packages: Computer Science, Computer Science (R0)
