De Novo Protein Structure Prediction

  • Ling-Hong Hung
  • Shing-Chung Ngan
  • Ram Samudrala
Part of the Biological and Medical Physics, Biomedical Engineering book series (BIOMEDICAL)


Large-scale genome sequencing efforts are making an unparalleled amount of sequence data available. These data offer a shortcut to determining the function of a gene of interest, provided that a sequenced gene with a similar sequence and a known function already exists. This has spurred structural genomics initiatives whose goal is to determine as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose is twofold. First, the structure of a gene product can often lead directly to an inference of its function. Second, because the function of a protein depends on its structure, comparing the structures of gene products directly can detect homology more sensitively than comparing the sequences of genes. At present, structure determination by crystallography and NMR remains slow and expensive in manpower and resources, despite attempts to automate the processes. Computational structure prediction algorithms, while not matching the accuracy of these experimental techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures that the structural genomics projects are attempting to solve, there would be a considerable gain even if computational structure prediction were applicable to only a subset of proteins.


Keywords: Energy Function; Structure Prediction; Protein Structure Prediction; Iterative Density; Hydrophobic Moment




  1. Andreeva, A., Howorth, D., Brenner, S.E., Hubbard, T.J., Chothia, C., and Murzin, A.G. 2004. SCOP database in 2004: Refinements integrate structure and sequence family data. Nucleic Acids Res. 32:D226–D229.
  2. Beck, D.A.C., and Daggett, V. 2004. Methods for molecular dynamics simulations of protein folding/unfolding in solution. Methods 34:112–120.
  3. Berman, H.M., Bourne, P.E., and Westbrook, J. 2004. The Protein Data Bank: A case study in management of community data. Curr. Proteomics 1:49–57.
  4. Boniecki, M., Rotkiewicz, P., Skolnick, J., and Kolinski, A. 2003. Protein fragment reconstruction using various modeling techniques. J. Comput. Aided Mol. Des. 17:725–738.
  5. Bonneau, R., and Baker, D. 2001. Ab initio protein structure prediction: Progress and prospects. Annu. Rev. Biophys. Biomol. Struct. 30:173–189.
  6. Bonneau, R., Strauss, C.E., Rohl, C.A., Chivian, D., Bradley, P., Malmstrom, L., Robertson, T., and Baker, D. 2002. De novo prediction of three-dimensional structures for major protein families. J. Mol. Biol. 322:65–78.
  7. Bowie, J.U., and Eisenberg, D. 1994. An evolutionary approach to folding small alpha-helical proteins that uses sequence information and an empirical guiding fitness function. Proc. Natl. Acad. Sci. USA 91:4436–4440.
  8. Bradley, P., Chivian, D., Meiler, J., Misura, K.M., Rohl, C.A., Schief, W.R., Wedemeyer, W.J., Schueler-Furman, O., Murphy, P., Schonbrun, J., Strauss, C.E., and Baker, D. 2003. Rosetta predictions in CASP5: Successes, failures, and prospects for complete automation. Proteins 53(Suppl. 6):457–468.
  9. Bradley, P., Malmstrom, L., Qian, B., Schonbrun, J., Chivian, D., Kim, D.E., Meiler, J., Misura, K.M.S., and Baker, D. 2005a. Free modeling with Rosetta in CASP6. Proteins 61(Suppl. 7):128–134.
  10. Bradley, P., Misura, K.M.S., and Baker, D. 2005b. Toward high-resolution de novo structure prediction for small proteins. Science 309:1868–1871.
  11. Brenner, S.E. 2001. A tour of structural genomics. Nat. Rev. Genet. 2:801–809.
  12. Brenner, S., and Levitt, M. 2000. Expectations from structural genomics. Protein Sci. 9:197–200.
  13. Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., and Karplus, M. 1983. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4:187–217.
  14. Brunger, A.T., Clore, G.M., Gronenborn, A.M., and Karplus, M. 1986. Three-dimensional structure of proteins determined by molecular dynamics with interproton distance restraints: Application to crambin. Proc. Natl. Acad. Sci. USA 83:3801–3805.
  15. Burley, S.K. 2000. An overview of structural genomics. Nat. Struct. Biol. 7(Suppl.):932–934.
  16. Chivian, D., Kim, D.E., Malmstrom, L., Bradley, P., Robertson, T., Murphy, P., Strauss, C.E., Bonneau, R., Rohl, C.A., and Baker, D. 2003. Automated prediction of CASP-5 structures using the Robetta server. Proteins 53(Suppl. 6):524–533.
  17. Crivelli, S., Eskow, E., Bader, B., Lamberti, V., Byrd, R., Schnabel, R., and Head-Gordon, T. 2002. A physical approach to protein structure prediction. Biophys. J. 82:36–49.
  18. Daggett, V., and Fersht, A.R. 2003. The present view of the mechanism of protein folding. Nat. Rev. Mol. Cell Biol. 4:497–502.
  19. Daggett, L.P., Sacaan, A.I., Akong, M., Rao, S.P., Hess, S.D., Liaw, C., Urrutia, A., Jachec, C., Ellis, S.B., Dreessen, J., et al. 1995. Molecular and functional characterization of recombinant human metabotropic glutamate receptor subtype 5. Neuropharmacology 34:871–886.
  20. Dill, K.A. 1990. Dominant forces in protein folding. Biochemistry 29:7133–7155.
  21. Eisenberg, D., Weiss, R.M., and Terwilliger, T.C. 1982. The helical hydrophobic moment: A measure of the amphiphilicity of a helix. Nature 299:371–374.
  22. Frank, H.S., and Evans, M.W. 1945. Free volume and entropy in condensed systems. III. Entropy in binary liquid mixtures; partial molal entropy in dilute solutions; structure and thermodynamics in aqueous electrolytes. J. Chem. Phys. 13:507–532.
  23. Hartree, D.R. 1957. The Calculation of Atomic Structure. New York, John Wiley & Sons.
  24. Head-Gordon, T., and Brown, S. 2003. Minimalist models for protein folding and design. Curr. Opin. Struct. Biol. 13:160–167.
  25. Heinemann, U., Illing, G., and Oschkinat, H. 2001. High-throughput three-dimensional protein structure determination. Curr. Opin. Biotechnol. 12:348–354.
  26. Hinds, D.A., and Levitt, M. 1992. A lattice model for protein structure prediction at low resolution. Proc. Natl. Acad. Sci. USA 89:2536–2540.
  27. Hohenberg, P., and Kohn, W. 1964. Inhomogeneous electron gas. Phys. Rev. 136:864.
  28. Hung, L.-H., Ngan, S.-C., Liu, T., and Samudrala, R. 2005. PROTINFO: New algorithms for enhanced protein structure predictions. Nucleic Acids Res. 33: (in press).
  29. Hung, L.-H., and Samudrala, R. 2003. PROTINFO: Secondary and tertiary protein structure prediction. Nucleic Acids Res. 31:3296–3299.
  30. Jones, D.T. 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292:195–202.
  31. Jones, D.T. 2001. Predicting novel protein folds by using FRAGFOLD. Proteins Suppl. 5:127–132.
  32. Jones, D.T., Bryson, K., Coleman, A., McGuffin, L.J., Sadowski, M.I., Sodhi, J.S., and Ward, J.J. 2005. Prediction of novel and analogous folds using fragment assembly and fold recognition. Proteins 61(Suppl. 7):143–151.
  33. Jones, D.T., and McGuffin, L.J. 2003. Assembling novel protein folds from supersecondary structural fragments. Proteins 53(Suppl. 6):480–485.
  34. Jones, D.T., Taylor, W.R., and Thornton, J.M. 1992. A new approach to protein fold recognition. Nature 358:86–89.
  35. Karplus, K., Karchin, R., Draper, J., Casper, J., Mandel-Gutfreund, Y., Diekhans, M., and Hughey, R. 2003. Combining local-structure, fold-recognition, and new fold methods for protein structure prediction. Proteins 53(Suppl. 6):491–496.
  36. Karplus, K., Katzman, S., Shackleford, G., Koeva, M., Draper, J., Barnes, B., Soriano, M., and Hughey, R. 2005. SAM-T04: What is new in protein-structure prediction for CASP6. Proteins 61(Suppl. 7):135–142.
  37. Kauzmann, W. 1959. Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14:1–64.
  38. Kolinski, A., Betancourt, M.R., Kihara, D., Rotkiewicz, P., and Skolnick, J. 2001. Generalized comparative modeling (GENECOMP): A combination of sequence comparison, threading, and lattice modeling for protein structure prediction and refinement. Proteins 44:133–149.
  39. Kolinski, A., and Bujnicki, J.M. 2005. Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models. Proteins 61(Suppl. 7):84–90.
  40. Kolinski, A., Gront, D., Pokarowski, P., and Skolnick, J. 2003. A simple lattice model that exhibits a protein-like cooperative all-or-none folding transition. Biopolymers 69:399–405.
  41. Koonin, E.V., Wolf, Y.I., and Karev, G.P. 2002. The structure of the protein universe and genome evolution. Nature 420:218–223.
  42. Kosinski, J., Gajda, M.J., Cymerman, I.A., Kurowski, M.A., Pawlowski, M., Boniecki, M., Obarska, A., Papaj, G., Sroczynska-Obuchowicz, P., Tkaczuk, K.L., Sniezynska, P., Sasin, J.M., Augustyn, A., Bujnicki, J.M., and Feder, M. 2005. FRankenstein becomes a cyborg: The automatic recombination and realignment of fold-recognition models in CASP6. Proteins 61(Suppl. 7):106–113.
  43. Lee, B., Kurochkina, N., and Kang, H.S. 1996. Protein folding by a biased Monte Carlo procedure in the dihedral angle space. FASEB J. 10:119–125.
  44. Levinthal, C. 1968. Are there pathways for protein folding? J. Chim. Phys. 65:44.
  45. Levitt, M. 1983a. Molecular dynamics of native protein. I. Computer simulation of trajectories. J. Mol. Biol. 168:595–617.
  46. Levitt, M. 1983b. Protein folding by restrained energy minimization and molecular dynamics. J. Mol. Biol. 170:723–764.
  47. Levitt, M., Hirshberg, M., Sharon, R., and Daggett, V. 1995. Potential energy function and parameters for simulations of the molecular dynamics of proteins and nucleic acids in solution. Comput. Phys. Commun. 91:215–231.
  48. Levitt, M., and Warshel, A. 1975. Computer simulation of protein folding. Nature 253:694–698.
  49. Li, W., Zhang, Y., Kihara, D., Huang, Y.J., Zheng, D., Montelione, G.T., Kolinski, A., and Skolnick, J. 2003. TOUCHSTONEX: Protein structure prediction with sparse NMR data. Proteins 53:290–306.
  50. Melo, F., Sanchez, R., and Sali, A. 2002. Statistical potentials for fold assessment. Protein Sci. 11:430–448.
  51. Moore, G.E. 1965. Cramming more components onto integrated circuits. Electronics 38:114–117.
  52. Morozov, A.V., Kortemme, T., Tsemekhman, K., and Baker, D. 2004. Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations. Proc. Natl. Acad. Sci. USA 101:6946–6951.
  53. Moult, J. 1997. Comparison of database potentials and molecular mechanics force fields. Curr. Opin. Struct. Biol. 7:194–199.
  54. Moult, J. 1999. Predicting protein three-dimensional structure. Curr. Opin. Biotechnol. 10:583–588.
  55. Moult, J., Fidelis, K., Tramontano, A., Rost, B., and Hubbard, T. 2005. Critical assessment of methods of protein structure prediction (CASP): Round VI. Proteins (accepted preprint).
  56. Moult, J., Fidelis, K., Zemla, A., and Hubbard, T. 2001. Critical assessment of methods of protein structure prediction (CASP): Round IV. Proteins Suppl. 5:2–7.
  57. Moult, J., Fidelis, K., Zemla, A., and Hubbard, T. 2003. Critical assessment of methods of protein structure prediction (CASP): Round V. Proteins 53(Suppl. 6):334–339.
  58. Moult, J., Hubbard, T., Bryant, S.H., Fidelis, K., and Pedersen, J.T. 1997. Critical assessment of methods of protein structure prediction (CASP): Round II. Proteins Suppl. 1:2–6.
  59. Moult, J., Hubbard, T., Fidelis, K., and Pedersen, J.T. 1999. Critical assessment of methods of protein structure prediction (CASP): Round III. Proteins Suppl. 3:2–6.
  60. Moult, J., Pedersen, J.T., Judson, R., and Fidelis, K. 1995. A large-scale experiment to assess protein structure prediction methods. Proteins 23:ii–v.
  61. Murzin, A.G., Brenner, S.E., Hubbard, T., and Chothia, C. 1995. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247:536–540.
  62. Onuchic, J.N., and Wolynes, P.G. 2004. Theory of protein folding. Curr. Opin. Struct. Biol. 14:70–75.
  63. Park, B.H., and Levitt, M. 1995. The complexity and accuracy of discrete state models of protein structure. J. Mol. Biol. 249:493–507.
  64. Qian, B., Ortiz, A.R., and Baker, D. 2004. Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation. Proc. Natl. Acad. Sci. USA 101:15346–15351.
  65. Rabow, A.A., and Scheraga, H.A. 1996. Improved genetic algorithm for the protein folding problem by use of a Cartesian combination operator. Protein Sci. 5:1800–1815.
  66. Rohl, C.A., and Baker, D. 2002. De novo determination of protein backbone structure from residual dipolar couplings using Rosetta. J. Am. Chem. Soc. 124:2723–2729.
  67. Samudrala, R., and Moult, J. 1998. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J. Mol. Biol. 275:895–916.
  68. Samudrala, R., Xia, Y., Huang, E., and Levitt, M. 1999a. Ab initio protein structure prediction using a combined hierarchical approach. Proteins Suppl. 3:194–198.
  69. Samudrala, R., Xia, Y., Levitt, M., and Huang, E.S. 1999b. A combined approach for ab initio construction of low resolution protein tertiary structures from sequence. Pac. Symp. Biocomput. pp. 505–516.
  70. Simons, K.T., Bonneau, R., Ruczinski, I., and Baker, D. 1999. Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins Suppl. 3:171–176.
  71. Sippl, M.J., and Weitckus, S. 1992. Detection of native-like models for amino acid sequences of unknown three-dimensional structure in a data base of known protein conformations. Proteins 13:258–271.
  72. Venclovas, C., Zemla, A., Fidelis, K., and Moult, J. 1999. Some measures of comparative performance in the three CASPs. Proteins Suppl. 3:231–237.
  73. Wang, K., Fain, B., Levitt, M., and Samudrala, R. 2004. Improved protein structure selection using decoy-dependent discriminatory functions. BMC Struct. Biol. 4:8.
  74. Weiner, P.K., and Kollman, P.A. 1981. AMBER: Assisted model building with energy refinement. A general program for modeling molecules and their interactions. J. Comput. Chem. 2:287–303.
  75. Wolynes, P.G. 2005. Energy landscapes and solved protein folding problems. Philos. Trans. R. Soc. London Ser. A 363:453–464.
  76. Zhang, C., Liu, S., Zhou, H., and Zhou, Y. 2004a. The dependence of all-atom statistical potentials on training structural database. Biophys. J. 86:3349–3358.
  77. Zhang, C., Liu, S., Zhou, H., and Zhou, Y. 2004b. An accurate residue-level pair potential of mean force for folding and binding based on the distance-scaled ideal-gas reference state. Protein Sci. 13:400–411.
  78. Zhang, Y., and Skolnick, J. 2004a. Automated structure prediction of weakly homologous proteins on a genomic scale. Proc. Natl. Acad. Sci. USA 101:7594–7599.
  79. Zhang, Y., and Skolnick, J. 2004b. SPICKER: A clustering approach to identify near-native protein folds. J. Comput. Chem. 25:865–871.
  80. Zhang, Y., and Skolnick, J. 2004c. Tertiary structure predictions on a comprehensive benchmark of medium to large size proteins. Biophys. J. 87:2647–2655.
  81. Zhou, H., and Zhou, Y. 2002. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 11:2714–2726.

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Ling-Hong Hung (1)
  • Shing-Chung Ngan (1)
  • Ram Samudrala (1)
  1. Department of Microbiology, University of Washington, Seattle
