Protein Contact Map Prediction

  • Xin Yuan
  • Christopher Bystroff


Proteins are linear chains that fold into characteristic shapes and features. To understand proteins and protein folding, we try to represent the protein molecule in such a way that its features are easy to see and manipulate. A simple representation facilitates algorithm design for structure prediction. The simplicity of the three-state character string representation of secondary structure is part of the reason secondary structure prediction received so much attention early in the era of computational biology. One-dimensional strings are easily understood, parsed, mined, and manipulated. But secondary structure alone does not tell us enough about the overall shapes and features of a protein. We need a simple way to represent the overall tertiary structure of a protein.


Keywords (added by machine, not by the authors): Association Rule, Local Contact, Correlate Mutation, Residue Pair, Residue Contact





Copyright information

© Springer 2007
