Exploiting Long-Range Dependencies in Protein β-Sheet Secondary Structure Prediction

Ni, Yizhao; Niranjan, Mahesan

doi:10.1007/978-3-642-16001-1_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6282)

Abstract

We investigate whether interactions of longer range than those typically considered in local protein secondary structure prediction methods can be captured in a simple machine learning framework to improve the prediction of β-sheets. Using support vector machines and recursive feature elimination, we show that the small signals available in long-range interactions can indeed be exploited. The improvement is small but statistically significant on the benchmark datasets we used. We also show that feature selection within a long window and over amino acids at specific positions typically selects amino acids that have been shown to be more relevant in the initiation and termination of β-sheet formation.
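As an illustration of what "feature selection within a long window and over amino acids at specific positions" could operate on, the following is a minimal sketch of a sliding-window one-hot encoding. It is not taken from the chapter: the window half-width, the extra padding symbol, and the names `encode_windows` and `feature_name` are all assumptions made for the example. Each feature column corresponds to one (window offset, amino acid) pair, which is the kind of feature a linear support vector machine with recursive feature elimination could rank and prune.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
# Assumption: one extra slot marks padding beyond the chain ends
# and any non-standard residue symbol.
N_SYMBOLS = len(AMINO_ACIDS) + 1


def encode_windows(sequence: str, half_width: int) -> List[List[int]]:
    """Return one one-hot feature vector per residue of `sequence`.

    Each vector concatenates (2 * half_width + 1) blocks of N_SYMBOLS bits,
    one block per window position, centred on the residue being labelled.
    """
    features = []
    for centre in range(len(sequence)):
        vector: List[int] = []
        for offset in range(-half_width, half_width + 1):
            pos = centre + offset
            block = [0] * N_SYMBOLS
            if 0 <= pos < len(sequence):
                block[AA_INDEX.get(sequence[pos], N_SYMBOLS - 1)] = 1
            else:
                block[N_SYMBOLS - 1] = 1  # window runs past the chain end
            vector.extend(block)
        features.append(vector)
    return features


def feature_name(index: int, half_width: int) -> str:
    """Map a feature column back to a readable 'residue@offset' label."""
    block, slot = divmod(index, N_SYMBOLS)
    offset = block - half_width
    symbol = AMINO_ACIDS[slot] if slot < len(AMINO_ACIDS) else "pad"
    return f"{symbol}@{offset:+d}"


if __name__ == "__main__":
    X = encode_windows("MKVLAAGIVG", half_width=4)  # hypothetical fragment
    print(len(X), "residues,", len(X[0]), "features each")
```

Because every column maps back to a position and a residue through `feature_name`, the features that survive elimination can be read directly as statements such as "valine two positions before the centre", which is how position-specific selections could be compared with known β-sheet capping signals.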

Author information

Authors and Affiliations

ISIS Group, School of Electronics and Computer Science, University of Southampton, U.K.
Yizhao Ni & Mahesan Niranjan

Editor information

Editors and Affiliations

Institute for Computing and Information Sciences, Radboud University Nijmegen, Heyendaalseweg 135, 6525AJ, Nijmegen, The Netherlands
Tjeerd M. H. Dijkstra, Elena Marchiori & Tom Heskes
Institute for Computing and Information Sciences, Turku Centre for Computer Science, Radboud University Nijmegen, Heyendaalseweg 135, 6525AJ, Nijmegen, The Netherlands
Evgeni Tsivtsivadze

About this paper

Cite this paper

Ni, Y., Niranjan, M. (2010). Exploiting Long-Range Dependencies in Protein β-Sheet Secondary Structure Prediction. In: Dijkstra, T.M.H., Tsivtsivadze, E., Marchiori, E., Heskes, T. (eds) Pattern Recognition in Bioinformatics. PRIB 2010. Lecture Notes in Computer Science, vol 6282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16001-1_30

DOI: https://doi.org/10.1007/978-3-642-16001-1_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16000-4
Online ISBN: 978-3-642-16001-1
eBook Packages: Computer Science, Computer Science (R0)
