Functional Protein Prediction Using HMM Based Feature Representation and Relevance Analysis

  • Conference paper
Advances in Computational Biology

Abstract

Predicting the subcellular location of proteins helps to understand the biological processes carried out within the cell. Here, a feature representation methodology is proposed to identify subcellular locations in Gram-positive bacteria. To this end, a hidden Markov model is trained for each considered class, and the probability that a sequence of amino acids was generated by each trained model is used as a feature in a subsequent classification stage. Our proposal is tested on a well-known database containing amino acid sequences of bacteria. Specifically, sequences with less than 80% identity are studied, using a multi-label soft-margin Support Vector Machine classifier. The attained results show that our approach improves on issues raised in PfamFeat. Moreover, it appears to be an appropriate tool for predicting the subcellular location of proteins.
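
The feature representation described in the abstract can be illustrated with a minimal sketch. It assumes that one hidden Markov model per class has already been trained, and it scores a sequence under each model with the scaled forward algorithm, collecting the per-class log-likelihoods into a feature vector. The two toy models, their parameters, the four-letter alphabet, and the names `forward_log_likelihood`, `hmm_features`, and `TOY_MODELS` are hypothetical and are not taken from the paper. Training and the SVM stage are omitted.

```python
import math


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | model) using the scaled forward algorithm.

    start: list of initial state probabilities
    trans: trans[r][s] is the probability of moving from state r to state s
    emit:  emit[s][symbol] is the probability that state s emits symbol
    """
    n_states = len(start)
    # Initialise with the start distribution weighted by the first emission.
    alpha = [start[s] * emit[s][seq[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate the normalised forward variables one step.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * emit[s][seq[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow; the log of the scales sums to log P.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical, hand-set parameters standing in for trained class models.
# All emission probabilities are nonzero, so math.log never receives 0.
TOY_MODELS = {
    "class_a": {
        "start": [0.6, 0.4],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [
            {"A": 0.5, "L": 0.3, "K": 0.1, "D": 0.1},
            {"A": 0.1, "L": 0.1, "K": 0.4, "D": 0.4},
        ],
    },
    "class_b": {
        "start": [0.5, 0.5],
        "trans": [[0.9, 0.1], [0.2, 0.8]],
        "emit": [
            {"A": 0.25, "L": 0.25, "K": 0.25, "D": 0.25},
            {"A": 0.1, "L": 0.6, "K": 0.2, "D": 0.1},
        ],
    },
}


def hmm_features(seq, models):
    """Map a sequence to one log-likelihood per class model."""
    return [
        forward_log_likelihood(seq, m["start"], m["trans"], m["emit"])
        for m in models.values()
    ]


if __name__ == "__main__":
    # Each entry is the log-likelihood of the sequence under one class model.
    print(hmm_features("ALKDDKLA", TOY_MODELS))
```

The resulting vector has one entry per class, so it could be passed to any downstream classifier. Log-likelihoods are used here instead of raw probabilities only to keep the values numerically stable for long sequences; whether the paper does the same is not stated in the abstract.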

Corresponding author

Correspondence to Diego Fabian Collazos-Huertas.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Collazos-Huertas, D.F., Giraldo-Forero, A.F., Cárdenas-Peña, D., Álvarez-Meza, A.M., Castellanos-Domínguez, G. (2014). Functional Protein Prediction Using HMM Based Feature Representation and Relevance Analysis. In: Castillo, L., Cristancho, M., Isaza, G., Pinzón, A., Rodríguez, J. (eds) Advances in Computational Biology. Advances in Intelligent Systems and Computing, vol 232. Springer, Cham. https://doi.org/10.1007/978-3-319-01568-2_10

  • DOI: https://doi.org/10.1007/978-3-319-01568-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01567-5

  • Online ISBN: 978-3-319-01568-2

  • eBook Packages: Engineering, Engineering (R0)
