Prediction of Mitochondrial Matrix Protein Structures Based on Feature Selection and Fragment Assembly

Asencio-Cortés, Gualberto; Aguilar-Ruiz, Jesús S.; Márquez-Chamorro, Alfonso E.; Ruiz, Roberto; Santiesteban-Toca, Cosme E.

doi:10.1007/978-3-642-29066-4_14

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7246)

Abstract

Protein structure prediction consists of determining the three-dimensional conformation of a protein from its amino acid sequence alone. It remains a difficult and significant challenge in structural bioinformatics, because these structures are needed for drug design. This work proposes a method that reconstructs protein structures from protein fragments assembled according to their physico-chemical similarities, using information extracted from known protein structures. Our prediction system produces distance maps to represent protein structures; these carry more information than the contact maps predicted by many proposals in the literature. The most commonly used physico-chemical properties of amino acids are hydrophobicity, polarity and charge. In our method, we performed feature selection on the 544 properties of the AAindex repository, yielding 16 properties that were used for prediction. We tested our proposal on 74 mitochondrial matrix proteins with a maximum sequence identity of 30%, obtained from the Protein Data Bank. We achieved a recall of 0.80 and a precision of 0.79 with an 8-angstrom cut-off and a minimum sequence separation of 7 amino acids. Finally, we compared our system with another relevant proposal on the same benchmark and achieved a recall improvement of 50.82%. Therefore, for the studied proteins, our method provides a notable improvement in terms of recall.
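To make the evaluation measures in the abstract concrete, the following is a minimal sketch of how recall and precision might be computed from a real and a predicted distance map. It is not the authors' implementation. The function names, the choice of one representative atom per residue, and the conventions that a contact means a distance strictly below the cut-off and a sequence separation of at least 7 are all assumptions made for illustration.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates.

    `coords` holds one (x, y, z) tuple per residue; which atom represents
    a residue (e.g. C-alpha) is an assumption left to the caller.
    """
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contacts(dmap, cutoff=8.0, min_sep=7):
    """Residue pairs (i, j) with j - i >= min_sep and distance < cutoff.

    The strict '<' and the '>=' on separation are assumed conventions.
    """
    n = len(dmap)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_sep, n)
        if dmap[i][j] < cutoff
    }


def recall_precision(true_dmap, pred_dmap, cutoff=8.0, min_sep=7):
    """Recall = TP / real contacts; precision = TP / predicted contacts."""
    real = contacts(true_dmap, cutoff, min_sep)
    pred = contacts(pred_dmap, cutoff, min_sep)
    tp = len(real & pred)
    recall = tp / len(real) if real else 0.0
    precision = tp / len(pred) if pred else 0.0
    return recall, precision


def toy_map(n, close_pairs, far=20.0):
    """Hypothetical helper: a symmetric map that is `far` everywhere except
    the listed pairs. Used only to build small illustrative inputs."""
    m = [[0.0 if i == j else far for j in range(n)] for i in range(n)]
    for (i, j), d in close_pairs.items():
        m[i][j] = m[j][i] = d
    return m


# Toy data (invented for illustration, not taken from the paper).
true_map = toy_map(10, {(0, 7): 5.0, (0, 8): 6.0, (1, 9): 7.0, (2, 9): 7.5, (0, 3): 4.0})
pred_map = toy_map(10, {(0, 7): 5.5, (0, 8): 9.0, (1, 9): 6.0, (2, 9): 7.0, (1, 8): 7.9, (0, 3): 4.5})
recall, precision = recall_precision(true_map, pred_map)
```

In this toy case the pair (0, 3) is ignored because its sequence separation is below 7, the predicted map misses one real contact and adds one spurious contact, and both measures come out at 0.75. Thresholding a distance map in this way is also what reduces it to a contact map, which is why a distance map is the richer representation.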

Author information

Authors and Affiliations

School of Engineering, Pablo de Olavide University, Seville, Spain
Gualberto Asencio-Cortés, Jesús S. Aguilar-Ruiz, Alfonso E. Márquez-Chamorro & Roberto Ruiz
Centro de Bioplantas, University of Ciego de Ávila, Cuba
Cosme E. Santiesteban-Toca

Editor information

Editors and Affiliations

Department of Animal Production Epidemiology and Ecology, University of Torino, Via Leonardo da Vinci 44, 10095 Grugliasco (TO), Italy
Mario Giacobini
Universidade Nova de Lisboa, ISEGI, 1070-312 Lisboa, Portugal and University of Milano-Bicocca, D.I.S.Co., Viale Sarca 336, 20126 Milan, Italy
Leonardo Vanneschi
Center for Human Genetics Research, Department of Biomedical Informatics, Vanderbilt University, 519 Light Hall, 37232, Nashville, TN, USA
William S. Bush

Cite this paper

Asencio-Cortés, G., Aguilar-Ruiz, J.S., Márquez-Chamorro, A.E., Ruiz, R., Santiesteban-Toca, C.E. (2012). Prediction of Mitochondrial Matrix Protein Structures Based on Feature Selection and Fragment Assembly. In: Giacobini, M., Vanneschi, L., Bush, W.S. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2012. Lecture Notes in Computer Science, vol 7246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29066-4_14

DOI: https://doi.org/10.1007/978-3-642-29066-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29065-7
Online ISBN: 978-3-642-29066-4
eBook Packages: Computer Science, Computer Science (R0)
