Predicting Secondary Structure of All-Helical Proteins Using Hidden Markov Support Vector Machines

  • Blaise Gassend
  • Charles W. O’Donnell
  • William Thies
  • Andrew Lee
  • Marten van Dijk
  • Srinivas Devadas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4146)


Our goal is to develop a state-of-the-art secondary structure predictor with an intuitive and biophysically motivated energy model through the use of Hidden Markov Support Vector Machines (HM-SVMs), a recent innovation in the field of machine learning. We focus on the prediction of alpha helices and show that by using HM-SVMs, a simple 7-state HMM with 302 parameters can achieve a Q_α value of 77.6% and a SOV_α value of 73.4%. As detailed in an accompanying technical report [11], these performance numbers are among the best for techniques that do not rely on external databases (such as multiple sequence alignments).
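To make the Q_α figure concrete, the following is a minimal sketch of a per-residue accuracy computation. It assumes that Q_α is the percentage of residues whose two-state label (helix vs. non-helix) is predicted correctly. The function name `q_alpha`, the one-letter labels `H` and `C`, and the example strings are illustrative assumptions and are not taken from the paper.

```python
def q_alpha(true_labels: str, predicted_labels: str) -> float:
    """Percentage of residues whose helix / non-helix label matches.

    Assumption: both strings use one character per residue,
    'H' for helix and 'C' for everything else (hypothetical encoding).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label strings must have the same length")
    if not true_labels:
        raise ValueError("label strings must not be empty")
    # Count positions where the predicted label equals the true label.
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Made-up example: 16 residues, 2 of which are mislabeled.
true_ss = "CCHHHHHHCCCHHHHC"
pred_ss = "CCCHHHHHCCHHHHHC"
print(f"Q_alpha = {q_alpha(true_ss, pred_ss):.1f}%")
```

A per-residue score of this kind does not reflect whether whole helical segments are recovered, which is why a segment-based measure such as SOV [19] is reported alongside it.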


Support Vector Machine; Secondary Structure; Cost Function; Hidden Markov Model; Protein Data Bank
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. Altun, Y., Tsochantaridis, I., Hofmann, T.: Hidden Markov Support Vector Machines. In: ICML (2003)
  2. Aurora, R., Rose, G.: Helix capping. Protein Science 7 (1998)
  3. Baldi, P., Brunak, S., Frasconi, P., Soda, G., Pollastri, G.: Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 15 (1999)
  4. Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T., Weissig, H., Shindyalov, I., Bourne, P.: The protein data bank. Nucleic Acids Research 28 (2000)
  5. Bystroff, C., Thorsson, V., Baker, D.: HMMSTR: a Hidden Markov Model for Local Sequence-Structure Correlations in Proteins. J. of Mol. Bio. 301 (2000)
  6. EVA Largest sequence of unique subset of PDB,
  7. Joachims, T.: Making large-scale SVM learning practical. In: Advances in Kernel Methods – Support Vector Learning, pp. 169–185. MIT Press, Cambridge (1998)
  8. Jones, D.T.: Protein Secondary Structure Prediction Based on Position-specific Scoring Matrices. Journal of Molecular Biology 292, 195–202 (1999)
  9. Kabsch, W., Sander, C.: Dictionary of protein secondary structure. Biopolymers 22 (1983)
  10. Eyrich, V., et al.: EVA: Continuous automatic evaluation of protein structure prediction servers. Bioinformatics 17(12), 1242–1243 (2001)
  11. Gassend, B., et al.: Secondary Structure Prediction of All-Helical Proteins Using Hidden Markov Support Vector Machines. Technical Report MIT-CSAIL-TR-2005-060, MIT (December 2005),
  12. Nguyen, M.N., Rajapakse, J.C.: Prediction of protein secondary structure using Bayesian method and support vector machines. In: ICONIP (2002)
  13. Riis, S., Krogh, A.: Improving prediction of protein secondary structure using structured neural networks and multiple sequence alignments. Journal of Computational Biology 3, 163–183 (1996)
  14. Rost, B.: Review: Protein Secondary Structure Prediction Continues to Rise. Journal of Structural Biology 134(2), 204–218 (2001)
  15. Rost, B., Sander, C.: Prediction of protein secondary structure at better than 70% accuracy. Journal of Molecular Biology 232, 584–599 (1993)
  16. Tsochantaridis, I., Altun, Y., Hofmann, T.: A crossover between SVMs and HMMs for protein structure prediction. In: NIPS Workshop on Machine Learning Techniques for Bioinformatics (2002)
  17. Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support Vector Machine Learning for Interdependent and Structured Output Spaces. In: ICML (2004)
  18. Won, K., Hamelryck, T., Prügel-Bennett, A., Krogh, A.: Evolving Hidden Markov Models for Protein Secondary Structure Prediction. In: Proceedings of IEEE Congress on Evolutionary Computation, pp. 33–40 (2005)
  19. Zemla, A., Venclovas, Č., Fidelis, K., Rost, B.: A Modified Definition of Sov, a Segment-Based Measure for Protein Secondary Structure Prediction Assessment. Proteins 34(2), 220–223 (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Blaise Gassend (1)
  • Charles W. O’Donnell (1)
  • William Thies (1)
  • Andrew Lee (1)
  • Marten van Dijk (1)
  • Srinivas Devadas (1)

  1. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
