Scoring Functions for ab initio Protein Structure Prediction

  • Protocol
Protein Structure Prediction

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 143)

Abstract

The native conformation of a protein is generally assumed to be the one with the lowest free energy (1). Successful prediction of protein structure depends on solving three subproblems: (i) choosing a representation of protein conformation that includes structures similar to the correct conformation but limits the search space; (ii) formulating a scoring function that relates a particular protein conformation to its free energy; and (iii) devising a method that combines the first two elements in a search through conformational space for the state with the globally optimal score. These three requirements apply to the major classes of protein structure prediction: homology modeling, threading (fold recognition), and ab initio folding. This chapter focuses on the second subproblem, the development of energy functions, with an emphasis on functions tailored for ab initio folding, although much of the discussion also applies to threading.
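To make the three subproblems concrete, the following is a minimal sketch on a toy model and is not the chapter's method. It assumes a two-letter hydrophobic/polar (H/P) alphabet, a 2D square-lattice representation, a contact score of -1 per non-bonded H-H neighbour pair, and exhaustive enumeration as the search. The names build_coords, score, and search are hypothetical and chosen only for illustration.

```python
from itertools import product

# Subproblem 1 (representation): a conformation is a self-avoiding walk on a
# 2D square lattice, encoded as one move per bond. This is an assumed toy model.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def build_coords(moves):
    """Return lattice coordinates for a move sequence, or None if the chain overlaps itself."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:
            return None  # two residues on one site: not a valid conformation
        coords.append(nxt)
    return coords


def score(sequence, coords):
    """Subproblem 2 (scoring): -1 for each pair of H residues that are lattice
    neighbours but not adjacent along the chain. Lower scores are better."""
    total = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    total -= 1
    return total


def search(sequence):
    """Subproblem 3 (search): enumerate every move sequence and keep the
    lowest-scoring valid conformation. Feasible only for very short chains."""
    best = None
    for moves in product(MOVES, repeat=len(sequence) - 1):
        coords = build_coords(moves)
        if coords is None:
            continue
        s = score(sequence, coords)
        if best is None or s < best[0]:
            best = (s, "".join(moves))
    return best
```

Calling search("HPPH") returns the best score together with one move string that achieves it. The enumeration grows as 4^(n-1) with chain length n, which is why the choice of representation and search strategy matters as much as the scoring function itself.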

References

  1. Anfinsen, C. B. (1973) Principles that govern the folding of protein chains. Science 181, 223–230.

  2. Miyazawa, S. and Jernigan, R. L. (1985) Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation. Macromolecules 18, 534–552.

  3. Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Jr, Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977) Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542.

  4. Levitt, M., Hirshberg, M., Sharon, R., and Daggett, V. (1995) Potential energy function and parameters for simulations of the molecular dynamics of proteins and nucleic acids in solution. Comp. Phys. Commun. 91, 215–231.

  5. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., and Karplus, M. (1983) CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187–217.

  6. Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Jr, Ferguson, D. M., Spellmeyer, D. C., Fox, T., Caldwell, J. W., and Kollman, P. A. (1995) A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179–5197.

  7. Jorgensen, W. and Tirado-Rives, J. (1988) The OPLS potential function for proteins. Energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc. 110, 1657–1666.

  8. Sippl, M. J. (1995) Knowledge-based potentials for proteins. Curr. Opin. Struct. Biol. 5, 229–235.

  9. Jernigan, R. L. and Bahar, I. (1996) Structure-derived potentials and protein simulations. Curr. Opin. Struct. Biol. 6, 195–209.

  10. Sippl, M. J. (1990) Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. J. Mol. Biol. 213, 859–883.

  11. Bowie, J. U., Lüthy, R., and Eisenberg, D. (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Science 253, 164–170.

  12. Jones, D. T., Taylor, W. R., and Thornton, J. M. (1992) A new approach to protein fold recognition. Nature 358, 86–89.

  13. Bryant, S. H. and Lawrence, C. E. (1993) An empirical energy function for threading protein sequence through folding motif. Proteins: Struct. Funct. Genet. 16, 92–112.

  14. Kocher, J.-P. A., Rooman, M. J., and Wodak, S. J. (1994) Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches. J. Mol. Biol. 235, 1598–1613.

  15. Godzik, A., Kolinski, A., and Skolnick, J. (1995) Are proteins ideal mixtures of amino acids? Analysis of energy parameter sets. Protein Sci. 4, 2107–2117.

  16. Godzik, A. (1996) Knowledge-based potentials for protein folding: what can we learn from known protein sequences? Structure 4, 363–366.

  17. Thomas, P. D. and Dill, K. A. (1996) Statistical potentials extracted from protein structures: how accurate are they? J. Mol. Biol. 257, 457–469.

  18. Ben-Naim, A. (1997) Statistical potentials extracted from protein structures: are these meaningful potentials? J. Chem. Phys. 107, 3698–3706.

  19. Rooman, M. J. and Wodak, S. J. (1995) Are database-derived potentials valid for scoring both forward and inverted protein folding? Protein Eng. 8, 849–858.

  20. Skolnick, J., Jaroszewski, L., Kolinski, A., and Godzik, A. (1997) Derivation and testing of pair potentials for protein folding. When is the quasi-chemical approximation correct? Protein Sci. 6, 676–688.

  21. Simons, K. T., Kooperberg, C., Huang, E., and Baker, D. (1997) Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225.

  22. Samudrala, R. and Moult, J. (1997) An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J. Mol. Biol. 275, 893–914.

  23. Wodak, S. J. and Rooman, M. J. (1993) Generating and testing protein folds. Curr. Opin. Struct. Biol. 3, 247–259.

  24. Moult, J. (1997) Comparison of database potentials and molecular mechanics force fields. Curr. Opin. Struct. Biol. 7, 194–199.

  25. Godzik, A., Kolinski, A., and Skolnick, J. (1992) Topology fingerprint approach to the inverse protein folding problem. J. Mol. Biol. 227, 227–238.

  26. Rice, D. W. and Eisenberg, D. (1997) A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J. Mol. Biol. 267, 1026–1038.

  27. Russell, R. B., Copley, R. R., and Barton, G. J. (1996) Protein fold recognition by mapping predicted secondary structures. J. Mol. Biol. 259, 349–365.

  28. Di Francesco, V., Garnier, J., and Munson, P. J. (1997) Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins. J. Mol. Biol. 267, 446–463.

  29. Defay, T. R. and Cohen, F. E. (1996) Multiple sequence information for threading algorithms. J. Mol. Biol. 262, 314–323.

  30. Park, B. and Levitt, M. (1996) Energy functions that discriminate X-ray and near-native folds from well-constructed decoys. J. Mol. Biol. 258, 367–392.

  31. Hinds, D. A. and Levitt, M. (1992) A lattice model for protein structure prediction at low resolution. Proc. Natl. Acad. Sci. USA 89, 2536–2540.

  32. Park, B. and Levitt, M. (1995) The complexity and accuracy of discrete state models of protein structure. J. Mol. Biol. 249, 493–507.

  33. Park, B. H., Huang, E. S., and Levitt, M. (1997) Factors affecting the ability of energy functions to discriminate correct from incorrect folds. J. Mol. Biol. 266, 831–846.

  34. Huang, E. S., Subbiah, S., and Levitt, M. (1995) Recognizing native folds by the arrangement of hydrophobic and polar residues. J. Mol. Biol. 252, 709–720.

  35. Kolinski, A. and Skolnick, J. (1994) Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme. Proteins: Struct. Funct. Genet. 18, 338–352.

  36. Kolinski, A. and Skolnick, J. (1994) Monte Carlo simulations of protein folding. II. Application to Protein A, ROP, and crambin. Proteins: Struct. Funct. Genet. 18, 353–366.

  37. Sun, S., Thomas, P. D., and Dill, K. A. (1995) A simple protein folding algorithm using a binary code and secondary structure constraints. Protein Eng. 8, 769–778.

  38. Holland, J. H. (1975) Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. University of Michigan Press, Ann Arbor, MI.

  39. Holm, L. and Sander, C. (1992) Evaluation of protein models by atomic solvation preference. J. Mol. Biol. 225, 93–105.

  40. Levitt, M. (1992) Accurate modeling of protein conformation by automatic segment matching. J. Mol. Biol. 226, 507–533.

  41. Holm, L. and Sander, C. (1991) Database algorithm for generating protein backbone and side-chain coordinates from a Cα trace: application to model building and detection of coordinate errors. J. Mol. Biol. 218, 183–194.

  42. Wallqvist, A. and Ullner, M. (1994) A simplified amino acid potential for use in structure prediction of proteins. Proteins: Struct. Funct. Genet. 18, 267–280.

  43. Munson, P. J. and Singh, R. K. (1997) Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence-structure alignment. Protein Sci. 6, 1467–1481.

Copyright information

© 2000 Humana Press Inc.

About this protocol

Cite this protocol

Huang, E.S., Samudrala, R., Park, B.H. (2000). Scoring Functions for ab initio Protein Structure Prediction. In: Webster, D.M. (eds) Protein Structure Prediction. Methods in Molecular Biology™, vol 143. Humana Press. https://doi.org/10.1385/1-59259-368-2:223

  • DOI: https://doi.org/10.1385/1-59259-368-2:223

  • Publisher Name: Humana Press

  • Print ISBN: 978-0-89603-637-6

  • Online ISBN: 978-1-59259-368-2

  • eBook Packages: Springer Protocols
