Local Structure Prediction of Proteins

  • Victor A. Simossis
  • Jaap Heringa


Protein architecture represents a complex and multilayered hierarchy (Fig. 7.1; Crippen, 1978; Rose, 1979). It starts from a linear chain of amino acid residues (primary structure) that arrange themselves in space to form local structures (secondary structure and supersecondary structure) and extends up to the globular three-dimensional structure of a fully functional folded protein (tertiary and quaternary structure).


Secondary Structure, Hidden Markov Model, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Albrecht, M., Tosatto, S.C., Lengauer, T., and Valle, G. 2003. Simple consensus procedures are effective and sufficient in secondary structure prediction. Protein Eng. 16:459–462.
  2. Altschul, S.F., and Koonin, E.V. 1998. Iterated profile searches with PSI-BLAST: A tool for discovery in protein databases. Trends Biochem. Sci. 23:444–447.
  3. Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
  4. An, J., Totrov, M., and Abagyan, R. 2005. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol. Cell. Proteomics 4:752–761.
  5. An, Y., and Friesner, R.A. 2002. A novel fold recognition method using composite predicted secondary structures. Proteins 48:352–366.
  6. Andrade, M.A., Ponting, C.P., Gibson, T.J., and Bork, P. 2000. Homology-based method for identification of protein repeats using statistical significance estimates. J. Mol. Biol. 298:521–537.
  7. Argos, P. 1987. Analysis of sequence-similar pentapeptides in unrelated protein tertiary structures. Strategies for protein folding and a guide for site-directed mutagenesis. J. Mol. Biol. 197:331–348.
  8. Bagos, P.G., Liakopoulos, T.D., and Hamodrakas, S.J. 2005. Evaluation of methods for predicting the topology of beta-barrel outer membrane proteins and a consensus prediction method. BMC Bioinformatics 6:7.
  9. Bairoch, A., and Boeckmann, B. 1991. The SWISS-PROT protein sequence data bank. Nucleic Acids Res. 19(Suppl.):2247–2249.
  10. Baldi, P., Brunak, S., Frasconi, P., Soda, G., and Pollastri, G. 1999. Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 15:937–946.
  11. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucleic Acids Res. 28:235–242.
  12. Bishop, C.M. 1995. Neural Networks for Pattern Recognition. Oxford, Clarendon Press.
  13. Blanco, F.J., Rivas, G., and Serrano, L. 1994. A short linear peptide that folds into a native stable beta-hairpin in aqueous solution. Nat. Struct. Biol. 1:584–590.
  14. Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.C., Estreicher, A., Gasteiger, E., Martin, M.J., Michoud, K., O'Donovan, C., Phan, I., Pilbout, S., and Schneider, M. 2003. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31:365–370.
  15. Bordner, A.J., and Abagyan, R. 2005. Statistical analysis and prediction of protein–protein interfaces. Proteins Struct. Funct. Bioinf. 60:353–366.
  16. Boswell, D.R., and McLachlan, A.D. 1984. Sequence comparison by exponentially-damped alignment. Nucleic Acids Res. 12:457–464.
  17. Bracken, C. 2001. NMR spin relaxation methods for characterization of disorder and folding in proteins. J. Mol. Graph. Model. 19:3–12.
  18. Bystroff, C., Thorsson, V., and Baker, D. 2000. HMMSTR: A hidden Markov model for local sequence–structure correlations in proteins. J. Mol. Biol. 301:173–190.
  19. Byvatov, E., and Schneider, G. 2003. Support vector machine applications in bioinformatics. Appl. Bioinf. 2:67–77.
  20. Cai, Y.D., Feng, K.Y., Li, Y.X., and Chou, K.C. 2003. Support vector machine for predicting alpha-turn types. Peptides 24:629–630.
  21. Capriotti, E., Fariselli, P., Rossi, I., and Casadio, R. 2004. A Shannon entropy-based filter detects high-quality profile–profile alignments in searches for remote homologues. Proteins 54:351–360.
  22. Chandonia, J.M., and Karplus, M. 1999. New methods for accurate prediction of protein secondary structure. Proteins 35:293–306.
  23. Cheng, J., Sweredoski, M.J., and Baldi, P. 2005. Accurate prediction of protein disordered regions by mining protein structure data. Data Mining Knowledge Discovery 11:213–222.
  24. Chothia, C. 1984. Principles that determine the structure of proteins. Annu. Rev. Biochem. 53:537–572.
  25. Chothia, C., and Lesk, A.M. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J. 5:823–826.
  26. Chou, P.Y., and Fasman, G.D. 1974. Prediction of protein conformation. Biochemistry 13:222–245.
  27. Chung, R., and Yona, G. 2004. Protein family comparison using statistical models and predicted structural information. BMC Bioinformatics 5:183.
  28. Churchill, G.A. 1989. Stochastic models for heterogeneous DNA sequences. Bull. Math. Biol. 51:79–94.
  29. Cozzetto, D., and Tramontano, A. 2005. Relationship between multiple sequence alignments and quality of protein comparative models. Proteins 58:151–157.
  30. Cregut, D., Civera, C., Macias, M.J., Wallon, G., and Serrano, L. 1999. A tale of two secondary structure elements: When a beta-hairpin becomes an alpha-helix. J. Mol. Biol. 292:389–401.
  31. Crippen, G.M. 1978. The tree structural organization of proteins. J. Mol. Biol. 126:315–332.
  32. Cristianini, N., and Shawe-Taylor, J. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. New York, Cambridge University Press.
  33. Cuff, J.A., and Barton, G.J. 1999. Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 34:508–519.
  34. Cuff, J.A., and Barton, G.J. 2000. Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins 40:502–511.
  35. Cuff, J.A., Clamp, M.E., Siddiqui, A.S., Finlay, M., and Barton, G.J. 1998. JPred: A consensus secondary structure prediction server. Bioinformatics 14:892–893.
  36. Dayhoff, M.O., Barker, W.C., and Hunt, L.T. 1983. Establishing homologies in protein sequences. Methods Enzymol. 91:524–545.
  37. de la Cruz, X., Hutchinson, E.G., Shepherd, A., and Thornton, J.M. 2002. Toward predicting protein topology: An approach to identifying beta hairpins. Proc. Natl. Acad. Sci. USA 99:11157–11162.
  38. de la Cruz, X., and Thornton, J.M. 1999. Factors limiting the performance of prediction-based fold recognition methods. Protein Sci. 8:750–759.
  39. Derreumaux, P. 2001. Evidence that the 127–164 region of prion proteins has two equi-energetic conformations with beta or alpha features. Biophys. J. 81:1657–1665.
  40. Dickerson, R.E., Timkovich, R., and Almassy, R.J. 1976. The cytochrome fold and the evolution of bacterial energy metabolism. J. Mol. Biol. 100:473–491.
  41. Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M., and Obradovic, Z. 2002. Intrinsic disorder and protein function. Biochemistry 41:6573–6582.
  42. Dunker, A.K., Lawson, J.D., Brown, C.J., Williams, R.M., Romero, P., Oh, J.S., Oldfield, C.J., Campen, A.M., Ratliff, C.M., Hipps, K.W., Ausio, J., Nissen, M.S., Reeves, R., Kang, C., Kissinger, C.R., Bailey, R.W., Griswold, M.D., Chiu, W., Garner, E.C., and Obradovic, Z. 2001. Intrinsically disordered protein. J. Mol. Graph. Model. 19:26–59.
  43. Dunker, A.K., Obradovic, Z., Romero, P., Garner, E.C., and Brown, C.J. 2000. Intrinsic protein disorder in complete genomes. Genome Inform. Ser. Workshop Genome Inform. 11:161–171.
  44. Durbin, R. 1998. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. New York, Cambridge University Press.
  45. Durbin, R., Eddy, S., Krogh, A., and Mitchison, G. 2000. Markov chains and hidden Markov models. In Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. New York, Cambridge University Press, pp. 46–79.
  46. Dutta, S., and Berman, H.M. 2005. Large macromolecular complexes in the Protein Data Bank: A status report. Structure 13:381–388.
  47. Dyson, H.J., and Wright, P.E. 2002. Insights into the structure and dynamics of unfolded proteins from nuclear magnetic resonance. Adv. Protein Chem. 62:311–340.
  48. Eddy, S.R. 1996. Hidden Markov models. Curr. Opin. Struct. Biol. 6:361–365.
  49. Edgar, R.C., and Sjolander, K. 2004. COACH: Profile–profile alignment of protein families using hidden Markov models. Bioinformatics 20:1309–1318.
  50. Forcellino, F., and Derreumaux, P. 2001. Computer simulations aimed at structure prediction of supersecondary motifs in proteins. Proteins 45:159–166.
  51. Frenkel, D., and Smit, B. 2002. Monte Carlo simulations. In Understanding Molecular Simulation: From Algorithms to Applications (D. Frenkel, M. Klein, M. Parrinello, and B. Smit, Eds.). San Diego, Academic Press, pp. 23–58.
  52. Friedberg, I., Kaplan, T., and Margalit, H. 2000. Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments. Protein Sci. 9:2278–2284.
  53. Frishman, D., and Argos, P. 1996. Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Eng. 9:133–142.
  54. Frishman, D., and Argos, P. 1997. Seventy-five percent accuracy in protein secondary structure prediction. Proteins 27:329–335.
  55. Garnier, J., Gibrat, J.F., and Robson, B. 1996. GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol. 266:540–553.
  56. Garnier, J., Osguthorpe, D.J., and Robson, B. 1978. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97–120.
  57. George, R.A., and Heringa, J. 2000. The REPRO server: Finding protein internal sequence repeats through the Web. Trends Biochem. Sci. 25:515–517.
  58. Gibrat, J.F., Garnier, J., and Robson, B. 1987. Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198:425–443.
  59. Ginalski, K., Pas, J., Wyrwicz, L.S., von Grotthuss, M., Bujnicki, J.M., and Rychlewski, L. 2003. ORFeus: Detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res. 31:3804–3807.
  60. Ginalski, K., von Grotthuss, M., Grishin, N.V., and Rychlewski, L. 2004. Detecting distant homology with Meta-BASIC. Nucleic Acids Res. 32:W576–581.
  61. Guermeur, Y., Geourjon, C., Gallinari, P., and Deleage, G. 1999. Improved performance in protein secondary structure prediction by inhomogeneous score combination. Bioinformatics 15:413–421.
  62. Guo, J., Chen, H., Sun, Z., and Lin, Y. 2004. A novel method for protein secondary structure prediction using dual-layer SVM and profiles. Proteins 54:738–743.
  63. Hedman, M., Deloof, H., Von Heijne, G., and Elofsson, A. 2002. Improved detection of homologous membrane proteins by inclusion of information from topology predictions. Protein Sci. 11:652–658.
  64. Heger, A., and Holm, L. 2000. Rapid automatic detection and alignment of repeats in protein sequences. Proteins 41:224–237.
  65. Heringa, J. 1994. The evolution and recognition of protein sequence repeats. Comput. Chem. 18:233–243.
  66. Heringa, J. 1998. Detection of internal repeats: How common are they? Curr. Opin. Struct. Biol. 8:338–345.
  67. Heringa, J. 1999. Two strategies for sequence comparison: Profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem. 23:341–364.
  68. Heringa, J. 2000. Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr. Protein Pept. Sci. 1:273–301.
  69. Heringa, J. 2002. Local weighting schemes for protein multiple sequence alignment. Comput. Chem. 26:459–477.
  70. Heringa, J., and Argos, P. 1993. A method to recognize distant repeats in protein sequences. Proteins 17:391–411.
  71. Hu, H.J., Pan, Y., Harrison, R., and Tai, P.C. 2004. Improved protein secondary structure prediction using support vector machine with a new encoding scheme and an advanced tertiary classifier. IEEE Trans. Nanobiosci. 3:265–271.
  72. Hu, W.P., Kolinski, A., and Skolnick, J. 1997. Improved method for prediction of protein backbone U-turn positions and major secondary structural elements between U-turns. Proteins 29:443–460.
  73. Hua, S., and Sun, Z. 2001. A novel method of protein secondary structure prediction with high segment overlap measure: Support vector machine approach. J. Mol. Biol. 308:397–407.
  74. Huang, C.H., Lin, Y.S., Yang, Y.L., Huang, S.W., and Chen, C.W. 1998. The telomeres of Streptomyces chromosomes contain conserved palindromic sequences with potential to form complex secondary structures. Mol. Microbiol. 28:905–916.
  75. Huang, X.Q., Hardison, R.C., and Miller, W. 1990. A space-efficient algorithm for local similarities. Comput. Appl. Biosci. 6:373–381.
  76. Hughey, R., and Krogh, A. 1996. Hidden Markov models for sequence analysis: Extension and analysis of the basic method. Comput. Appl. Biosci. 12:95–107.
  77. Hutchinson, E.G., and Thornton, J.M. 1993. The Greek key motif: Extraction, classification and analysis. Protein Eng. 6:233–245.
  78. Jones, D.T. 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292:195–202.
  79. Karplus, K., Barrett, C., Cline, M., Diekhans, M., Grate, L., and Hughey, R. 1999. Predicting protein structure using only sequence information. Proteins Suppl. 3:121–125.
  80. Karplus, K., Barrett, C., and Hughey, R. 1998. Hidden Markov models for detecting remote protein homologies. Bioinformatics 14:846–856.
  81. Karplus, K., Karchin, R., Barrett, C., Tu, S., Cline, M., Diekhans, M., Grate, L., Casper, J., and Hughey, R. 2001. What is the value added by human intervention in protein structure prediction? Proteins Suppl. 5:86–91.
  82. Karplus, K., Karchin, R., Draper, J., Casper, J., Mandel-Gutfreund, Y., Diekhans, M., and Hughey, R. 2003. Combining local-structure, fold-recognition, and new fold methods for protein structure prediction. Proteins 53(Suppl. 6):491–496.
  83. Karplus, K., Karchin, R., Hughey, R., Draper, J., Mandel-Gutfreund, Y., Casper, J., and Diekhans, M. 2002. SAM-T02: Protein structure prediction with neural nets, hidden Markov models, and fragment packing. CASP 5.
  84. Kim, D.E., Chivian, D., and Baker, D. 2004. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res. 32:W526–531.
  85. Kim, H., and Park, H. 2003. Protein secondary structure prediction based on an improved support vector machines approach. Protein Eng. 16:553–560.
  86. King, R.D., Ouali, M., Strong, A.T., Aly, A., Elmaghraby, A., Kantardzic, M., and Page, D. 2000. Is it better to combine predictions? Protein Eng. 13:15–19.
  87. Kirshenbaum, K., Young, M., and Highsmith, S. 1999. Predicting allosteric switches in myosins. Protein Sci. 8:1806–1815.
  88. Kleinjung, J., Romein, J., Lin, K., and Heringa, J. 2004. Contact-based sequence alignment. Nucleic Acids Res. 32:2464–2473.
  89. Koh, I.Y., Eyrich, V.A., Marti-Renom, M.A., Przybylski, D., Madhusudhan, M.S., Eswar, N., Grana, O., Pazos, F., Valencia, A., Sali, A., and Rost, B. 2003. EVA: Evaluation of protein structure prediction servers. Nucleic Acids Res. 31:3311–3315.
  90. Kolaskar, A.S., and Kulkarni-Kale, U. 1992. Sequence alignment approach to pick up conformationally similar protein fragments. J. Mol. Biol. 223:1053–1061.
  91. Kolinski, A., Skolnick, J., Godzik, A., and Hu, W.P. 1997. A method for the prediction of surface “U”-turns and transglobular connections in small proteins. Proteins 27:290–308.
  92. Krogh, A., Brown, M., Mian, I.S., Sjolander, K., and Haussler, D. 1994. Hidden Markov models in computational biology. Applications to protein modeling. J. Mol. Biol. 235:1501–1531.
  93. Kuhn, M., Meiler, J., and Baker, D. 2004. Strand–loop–strand motifs: Prediction of hairpins and diverging turns in proteins. Proteins 54:282–288.
  94. Kurtz, S., and Schleiermacher, C. 1999. REPuter: Fast computation of maximal repeats in complete genomes. Bioinformatics 15:426–427.
  95. Langosch, D., and Heringa, J. 1998. Interaction of transmembrane helices by a knobs-into-holes packing characteristic of soluble coiled coils. Proteins Struct. Funct. Genet. 31:150–159.
  96. Lim, V.I. 1974. Structural principles of the globular organization of protein chains. A stereochemical theory of globular protein secondary structure. J. Mol. Biol. 88:857–872.
  97. Lin, K., Simossis, V.A., Taylor, W.R., and Heringa, J. 2005. A simple and fast secondary structure prediction method using hidden neural networks. Bioinformatics 21:152–159.
  98. Linding, R., Jensen, L.J., Diella, F., Bork, P., Gibson, T.J., and Russell, R.B. 2003a. Protein disorder prediction: Implications for structural proteomics. Structure 11:1453–1459.
  99. Linding, R., Russell, R.B., Neduva, V., and Gibson, T.J. 2003b. GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res. 31:3701–3708.
  100. Luisi, D.L., Wu, W.J., and Raleigh, D.P. 1999. Conformational analysis of a set of peptides corresponding to the entire primary sequence of the N-terminal domain of the ribosomal protein L9: Evidence for stable native-like secondary structure in the unfolded state. J. Mol. Biol. 287:395–407.
  101. Lupas, A. 1996. Prediction and analysis of coiled-coil structures. Methods Enzymol. 266:513–525.
  102. Lupas, A., Van Dyke, M., and Stock, J. 1991. Predicting coiled coils from protein sequences. Science 252:1162–1164.
  103. Luthy, R., Xenarios, I., and Bucher, P. 1994. Improving the sensitivity of the sequence profile method. Protein Sci. 3:139–146.
  104. Macdonald, J.R., and Johnson, W.C., Jr. 2001. Environmental features are important in determining protein secondary structure. Protein Sci. 10:1172–1177.
  105. Marcotte, E.M., Pellegrini, M., Yeates, T.O., and Eisenberg, D. 1999. A census of protein repeats. J. Mol. Biol. 293:151–160.
  106. McGuffin, L.J., and Jones, D.T. 2003. Benchmarking secondary structure prediction for fold recognition. Proteins 52:166–175.
  107. McLachlan, A.D. 1972. Repeating sequences and gene duplication in proteins. J. Mol. Biol. 64:417–437.
  108. McLachlan, A.D. 1977. Analysis of periodic patterns in amino acid sequences: Collagen. Biopolymers 16:1271–1297.
  109. McLachlan, A.D. 1979. Gene duplications in the structural evolution of chymotrypsin. J. Mol. Biol. 128:49–79.
  110. McLachlan, A.D. 1983. Analysis of gene duplication repeats in the myosin rod. J. Mol. Biol. 169:15–30.
  111. McLachlan, A.D., and Stewart, M. 1976. The 14-fold periodicity in alpha-tropomyosin and the interaction with actin. J. Mol. Biol. 103:271–298.
  112. Mehta, P.K., Heringa, J., and Argos, P. 1995. A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%. Protein Sci. 4:2517–2525.
  113. Meiler, J., and Baker, D. 2003. Coupled prediction of protein secondary and tertiary structure. Proc. Natl. Acad. Sci. USA 100:12105–12110.
  114. Metropolis, N., and Ulam, S. 1949. The Monte Carlo method. J. Am. Stat. Assoc. 44:335–341.
  115. Minor, D.L., Jr., and Kim, P.S. 1996. Context-dependent secondary structure formation of a designed protein sequence. Nature 380:730–734.
  116. Minsky, M.L., and Papert, S. 1988. Perceptrons: An Introduction to Computational Geometry. Cambridge, Mass., MIT Press.
  117. Mittelman, D., Sadreyev, R., and Grishin, N. 2003. Probabilistic scoring measures for profile–profile comparison yield more accurate short seed alignments. Bioinformatics 19:1531–1539.
  118. Nagano, K. 1973. Logical analysis of the mechanism of protein folding. I. Predictions of helices, loops and beta-structures from primary structure. J. Mol. Biol. 75:401–420.
  119. Needleman, S.B., and Wunsch, C.D. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443–453.
  120. Noble, W.S. 2004. Support vector machine applications in computational biology. In Kernel Methods in Computational Biology (J.-P. Vert, B. Schoelkopf, and K. Tsuda, Eds.). Cambridge, Mass., MIT Press, pp. 71–92.
  121. Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., Brown, C.J., and Dunker, A.K. 2003. Predicting intrinsic disorder from amino acid sequence. Proteins 53(Suppl. 6):566–572.
  122. Ohlson, T., Wallner, B., and Elofsson, A. 2004. Profile–profile methods provide improved fold-recognition: A study of different profile–profile alignment methods. Proteins 57:188–197.
  123. Ouali, M., and King, R.D. 2000. Cascaded multiple classifiers for secondary structure prediction. Protein Sci. 9:1162–1176.
  124. Park, J., Karplus, K., Barrett, C., Hughey, R., Haussler, D., Hubbard, T., and Chothia, C. 1998. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J. Mol. Biol. 284:1201–1210.
  125. Pellegrini, M., Marcotte, E.M., and Yeates, T.O. 1999. A fast algorithm for genome-wide analysis of proteins with repeated sequences. Proteins 35:440–446.
  126. Petersen, T.N., Lundegaard, C., Nielsen, M., Bohr, H., Bohr, J., Brunak, S., Gippert, G.P., and Lund, O. 2000. Prediction of protein secondary structure at 80% accuracy. Proteins 41:17–20.
  127. Pollastri, G., Przybylski, D., Rost, B., and Baldi, P. 2002. Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 47:228–235.
  128. Prilusky, J., Felder, C.E., Zeev-Ben-Mordehai, T., Rydberg, E., Man, O., Beckmann, J.S., Silman, I., and Sussman, J.L. 2005. FoldIndex: A simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21:3435–3438.
  129. Przybylski, D., and Rost, B. 2002. Alignments grow, secondary structure prediction improves. Proteins 46:197–205.
  130. Ptitsyn, O.B. 1994. Kinetic and equilibrium intermediates in protein folding. Protein Eng. 7:593–596.
  131. Raghava, G.P.S. 2000. Protein secondary structure prediction using nearest neighbor and neural network approach. CASP 4, 75–76.
  132. Raghava, G.P.S. 2002a. APSSP2: A combination method for protein secondary structure prediction based on neural network and example based learning. CASP 5.
  133. Raghava, G.P.S. 2002b. APSSP: Automatic method for protein secondary structure prediction. CASP 5.
  134. Ramirez-Alvarado, M., Serrano, L., and Blanco, F.J. 1997. Conformational analysis of peptides corresponding to all the secondary structure elements of protein L B1 domain: Secondary structure propensities are not conserved in proteins with the same fold. Protein Sci. 6:162–174.
  135. Rao, S.T., and Rossmann, M.G. 1973. Comparison of super secondary structures in proteins. J. Mol. Biol. 76:241–256.
  136. Reymond, M.T., Merutka, G., Dyson, H.J., and Wright, P.E. 1997. Folding propensities of peptide fragments of myoglobin. Protein Sci. 6:706–716.
  137. Romero, P., Obradovic, Z., Li, X., Garner, E.C., Brown, C.J., and Dunker, A.K. 2001. Sequence complexity of disordered protein. Proteins 42:38–48.
  138. Rose, G.D. 1979. Hierarchic organization of domains in globular proteins. J. Mol. Biol. 134:447–470.
  139. Rost, B., and Sander, C. 1993. Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232:584–599.
  140. Rost, B., Sander, C., and Schneider, R. 1994. Redefining the goals of protein secondary structure prediction. J. Mol. Biol. 235:13–26.
  141. Rost, B., Schneider, R., and Sander, C. 1997. Protein fold recognition by prediction-based threading. J. Mol. Biol. 270:471–480.
  142. Rychlewski, L., Jaroszewski, L., Li, W., and Godzik, A. 2000. Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci. 9:232–241.
  143. Salem, G.M., Hutchinson, E.G., Orengo, C.A., and Thornton, J.M. 1999. Correlation of observed fold frequency with the occurrence of local structural motifs. J. Mol. Biol. 287:969–981.
  144. Sander, C., and Schneider, R. 1991. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins 9:56–68.
  145. Schaffer, A.A., Aravind, L., Madden, T.L., Shavirin, S., Spouge, J.L., Wolf, Y.I., Koonin, E.V., and Altschul, S.F. 2001. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29:2994–3005.
  146. Schiffer, M., and Edmundson, A.B. 1967. Use of helical wheels to represent the structures of proteins and to identify segments with helical potential. Biophys. J. 7:121–135.
  147. Schoelkopf, B., Tsuda, K., and Vert, J.-P. (Eds.). 2004. Kernel Methods in Computational Biology. Cambridge, Mass., MIT Press.
  148. Schulz, G.E. 1988. A critical evaluation of methods for prediction of protein secondary structures. Annu. Rev. Biophys. Biophys. Chem. 17:1–21.
  149. Selbig, J., Mevissen, T., and Lengauer, T. 1999. Decision tree-based formation of consensus protein secondary structure prediction. Bioinformatics 15:1039–1046.
  150. Simossis, V.A., and Heringa, J. 2004a. The influence of gapped positions in multiple sequence alignments on secondary structure prediction methods. Comput. Biol. Chem. 28(5–6):351–366.
  151. Simossis, V.A., and Heringa, J. 2004b. Integrating protein secondary structure prediction and multiple sequence alignment. Curr. Protein Pept. Sci. 5:249–266.
  152. Simossis, V.A., and Heringa, J. 2005. SYMPRED consensus secondary structure prediction.
  153. Smit, A., Hubley, R., and Green, P. 2004. RepeatMasker open-3.0. 1996–2004.
  154. Smith, T.F., and Waterman, M.S. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195–197.
  155. Soding, J. 2005. Protein homology detection by HMM–HMM comparison. Bioinformatics 21:951–960.
  156. Stultz, C.M., White, J.V., and Smith, T.F. 1993. Structural analysis based on state-space modeling. Protein Sci. 2:305–314.
  157. Sun, Z., Rao, X., Peng, L., and Xu, D. 1997. Prediction of protein supersecondary structures based on the artificial neural network method. Protein Eng. 10:763–769.
  158. Szklarczyk, R., and Heringa, J. 2004. Tracking repeats using significance and transitivity. Bioinformatics 20(Suppl. 1):I311–I317.
  159. Taylor, W.R., Heringa, J., Baud, F., and Flores, T.P. 2002. A Fourier analysis of symmetry in protein structure. Protein Eng. 15:79–89.
  160. Teodorescu, O., Galor, T., Pillardy, J., and Elber, R. 2004. Enriching the sequence substitution matrix by structural information. Proteins 54:41–48.
  161. Tomii, K., and Akiyama, Y. 2004. FORTE: A profile–profile comparison tool for protein fold recognition. Bioinformatics 20:594–595.
  162. Unger, R., and Moult, J. 1993. Finding the lowest free energy conformation of a protein is an NP-hard problem: Proof and implications. Bull. Math. Biol. 55:1183–1198.
  163. Uversky, V.N., Gillespie, J.R., and Fink, A.L. 2000. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins 41:415–427.
  164. van Belkum, A., Scherer, S., van Alphen, L., and Verbrugh, H. 1998. Short-sequence DNA repeats in prokaryotic genomes. Microbiol. Mol. Biol. Rev. 62:275–293.
  165. Vapnik, V.N. 1995. The Nature of Statistical Learning Theory. New York, Springer.
  166. Vapnik, V.N. 1998. Statistical Learning Theory. New York, Wiley.
  167. Vihinen, M., Torkkila, E., and Riikonen, P. 1994. Accuracy of protein flexibility predictions. Proteins 19:141–149.
  168. von Ohsen, N., Sommer, I., and Zimmer, R. 2003. Profile–profile alignment: A powerful tool for protein structure prediction. Pac. Symp. Biocomput. 252–263.
  169. von Ohsen, N., Sommer, I., Zimmer, R., and Lengauer, T. 2004. Arby: Automatic protein structure prediction using profile–profile alignment and confidence measures. Bioinformatics 20:2228–2235.
  170. Vucetic, S., Brown, C.J., Dunker, A.K., and Obradovic, Z. 2003. Flavors of protein disorder. Proteins 52:573–584.
  171. Wang, G., and Dunbrack, R.L., Jr. 2004. Scoring profile-to-profile sequence alignments. Protein Sci. 13:1612–1626.
  172. Ward, J.J., McGuffin, L.J., Bryson, K., Buxton, B.F., and Jones, D.T. 2004. The DISOPRED server for the prediction of protein disorder. Bioinformatics 20:2138–2139.
  173. Ward, J.J., McGuffin, L.J., Buxton, B.F., and Jones, D.T. 2003. Secondary structure prediction with support vector machines. Bioinformatics 19:1650–1655.
  174. Waterman, M.S., and Eggert, M. 1987. A new algorithm for best subsequence alignments with application to tRNA–rRNA comparisons. J. Mol. Biol. 197:723–728.
  175. Wetlaufer, D.B. 1973. Nucleation, rapid folding, and globular intrachain regions in proteins. Proc. Natl. Acad. Sci. USA 70:697–701.
  176. White, J.V., Stultz, C.M., and Smith, T.F. 1994. Protein classification by stochastic modeling and optimal filtering of amino-acid sequences. Math. Biosci. 119:35–75.
  177. Wootton, J.C., and Federhen, S. 1996. Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 266:554–571.
  178. Wright, P.E., and Dyson, H.J. 1999. Intrinsically unstructured proteins: Re-assessing the protein structure–function paradigm. J. Mol. Biol. 293:321–331.
  179. Xie, Q., Arnold, G.E., Romero, P., Obradovic, Z., Garner, E., and Dunker, A.K. 1998. The sequence attribute method for determining relationships between sequence and protein disorder. Genome Inform. Ser. Workshop Genome Inform. 9:193–200.
  180. Yona, G., and Levitt, M. 2002. Within the twilight zone: A sensitive profile–profile comparison tool based on information theory. J. Mol. Biol. 315:1257–1275.
  181. Young, M., Kirshenbaum, K., Dill, K.A., and Highsmith, S. 1999. Predicting conformational switches in proteins. Protein Sci. 8:1752–1764.
  182. Yu, L., White, J.V., and Smith, T.F. 1998. A homology identification method that combines protein sequence and structure information. Protein Sci. 7:2499–2510.
  183. Zemla, A., Venclovas, C., Fidelis, K., and Rost, B. 1999. A modified definition of Sov, a segment-based measure for protein secondary structure prediction assessment. Proteins 34:220–223.
  184. Zvelebil, M.J., Barton, G.J., Taylor, W.R., and Sternberg, M.J. 1987. Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J. Mol. Biol. 195:957–961.
  185. Jones, N.C., and Pevzner, P.A. 2004. An Introduction to Bioinformatics Algorithms. Cambridge, MA, MIT Press.
  186. Konopka, A.K., and Crabbe, M.J.C. (Eds.). 2004. Compact Handbook of Computational Biology. New York, Dekker.

Copyright information

© Springer 2007

