Prediction of Mitochondrial Matrix Protein Structures Based on Feature Selection and Fragment Assembly

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7246)

Abstract

Protein structure prediction consists of determining the three-dimensional conformation of a protein based only on its amino acid sequence. This remains a difficult and significant challenge in structural bioinformatics, since these structures are needed for drug design. This work proposes a method that reconstructs protein structures from protein fragments assembled according to their physico-chemical similarities, using information extracted from known protein structures. Our prediction system represents protein structures as distance maps, which carry more information than the contact maps predicted by many proposals in the literature. The most commonly used amino acid physico-chemical properties are hydrophobicity, polarity and charge. In our method, we performed a feature selection on the 544 properties of the AAindex repository, resulting in 16 properties that were used for prediction. We tested our proposal on 74 mitochondrial matrix proteins with a maximum sequence identity of 30%, obtained from the Protein Data Bank. We achieved a recall of 0.80 and a precision of 0.79 with an 8-angstrom cut-off and a minimum sequence separation of 7 amino acids. Finally, we compared our system with another relevant proposal on the same benchmark and achieved a recall improvement of 50.82%. Therefore, for the studied proteins, our method provides a notable improvement in terms of recall.
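To make the evaluation terms in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of how a distance map can be thresholded into contacts and scored with precision and recall. It assumes one representative 3D point per residue (for example the C-alpha atom, which the abstract does not specify), treats a pair as a contact when its distance is at most the cut-off, and reads "minimum sequence separation of 7" as j - i >= 7. The function names `distance_map`, `contact_pairs` and `precision_recall` are hypothetical.

```python
import numpy as np

def distance_map(coords):
    """Pairwise Euclidean distances between per-residue points (N x 3)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def contact_pairs(dist, cutoff=8.0, min_sep=7):
    """Boolean N x N mask of pairs (i < j) in contact.

    Assumption: a contact is dist <= cutoff, and only pairs with
    j - i >= min_sep are considered.
    """
    n = dist.shape[0]
    i, j = np.triu_indices(n, k=min_sep)  # pairs with j - i >= min_sep
    mask = np.zeros((n, n), dtype=bool)
    mask[i, j] = dist[i, j] <= cutoff
    return mask

def precision_recall(pred_dist, true_dist, cutoff=8.0, min_sep=7):
    """Score predicted contacts against contacts from the real structure."""
    pred = contact_pairs(pred_dist, cutoff, min_sep)
    true = contact_pairs(true_dist, cutoff, min_sep)
    tp = np.sum(pred & true)    # predicted and real contacts
    fp = np.sum(pred & ~true)   # predicted but not real
    fn = np.sum(~pred & true)   # real but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(precision), float(recall)

# Toy usage: ten residues on a straight line, 1.0 A apart in the "real"
# structure and 1.1 A apart in the "predicted" one (illustrative data only).
true_xyz = [[float(i), 0.0, 0.0] for i in range(10)]
pred_xyz = [[1.1 * i, 0.0, 0.0] for i in range(10)]
p, r = precision_recall(distance_map(pred_xyz), distance_map(true_xyz))
print(p, r)
```

In this toy case the slightly stretched prediction misses two of the five real contacts, so precision stays at 1.0 while recall drops to 0.6. A distance map keeps the full matrix of values, whereas the contact map retains only the thresholded booleans, which is the sense in which the former provides more information.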

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Asencio-Cortés, G., Aguilar-Ruiz, J.S., Márquez-Chamorro, A.E., Ruiz, R., Santiesteban-Toca, C.E. (2012). Prediction of Mitochondrial Matrix Protein Structures Based on Feature Selection and Fragment Assembly. In: Giacobini, M., Vanneschi, L., Bush, W.S. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2012. Lecture Notes in Computer Science, vol 7246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29066-4_14

  • DOI: https://doi.org/10.1007/978-3-642-29066-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29065-7

  • Online ISBN: 978-3-642-29066-4

  • eBook Packages: Computer Science, Computer Science (R0)
