Analysis of the Effects of Multiple Sequence Alignments in Protein Secondary Structure Prediction

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3594)


Abstract

Secondary structure prediction methods are widely used bioinformatics algorithms that provide initial insights into protein structure from sequence information. Significant efforts have been made over the past years to improve prediction accuracy, especially through the incorporation of information from multiple sequence alignments. This motivated the search for the factors contributing to this improvement. We show that in two highly ranked secondary structure prediction methods, DSC and PREDATOR, the use of multiple alignments consistently improves prediction accuracy compared with the use of single sequences. This is validated using different measures of accuracy, which also reveal that helical regions benefit the most from alignments, whereas β-strands seem to have reached a plateau in terms of predictability. The origins of this improvement are also explored in terms of sequence specificity, secondary structure composition, and the extent of sequence similarity that provides optimal performance. Divergent sequences, in the identity range of 25–55%, are found to provide the largest accuracy gain, and above 65% identity there is almost no advantage in using multiple alignments.
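
The abstract does not name the accuracy measures or the identity definition used in the paper. As an illustration only, the Python sketch below computes two measures that are common in this field, the three-state per-residue accuracy (Q3) and a per-state Matthews correlation coefficient, together with one simple definition of pairwise percent identity. The function names, the H/E/C state alphabet, and the choice to count identity over gap-free aligned columns are assumptions made for this example, not details taken from the paper.

```python
from math import sqrt


def q3(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)


def matthews(observed: str, predicted: str, state: str) -> float:
    """Matthews correlation coefficient for one state against all others."""
    if len(observed) != len(predicted):
        raise ValueError("strings must be of equal length")
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1
        elif p == state:
            fp += 1
        elif o == state:
            fn += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when the coefficient is undefined (a row or column is empty).
    return (tp * tn - fp * fn) / denom if denom else 0.0


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumed definition: other denominators (alignment length, shorter
    sequence length) are also in use and give different values.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must be of equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)
```

With per-state coefficients, a gain in helix prediction can be read separately from the strand result, and `percent_identity` can be used to sort aligned homologues into identity ranges such as 25–55% or above 65% before comparing accuracies.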



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pappas, G.J., Subramaniam, S. (2005). Analysis of the Effects of Multiple Sequence Alignments in Protein Secondary Structure Prediction. In: Setubal, J.C., Verjovski-Almeida, S. (eds) Advances in Bioinformatics and Computational Biology. BSB 2005. Lecture Notes in Computer Science, vol 3594. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532323_14

  • DOI: https://doi.org/10.1007/11532323_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28008-8

  • Online ISBN: 978-3-540-31861-3

  • eBook Packages: Computer Science, Computer Science (R0)
