Evolution teaches neural networks to predict protein structure

  • Conference paper
Scientific Applications of Neural Nets

Part of the book series: Lecture Notes in Physics ((LNP,volume 522))

Abstract

In the wake of the flood of genome data, we need accurate tools to predict protein structure more urgently than ever. The problem of predicting protein structure from sequence remains fundamentally unsolved despite more than three decades of intensive research. However, the wealth of evolutionary information deposited in current databases has enabled a significant improvement in methods that predict protein structure in 1D: secondary structure, transmembrane helices, and solvent accessibility. In particular, the combination of evolutionary information with neural networks has proved extremely successful. The new generation of prediction methods has proved accurate and reliable enough to be useful in genome analysis and in experimental structure determination. Moreover, these theoretical methods are increasingly influencing experiments in molecular biology.

Neural networks have been applied to many pattern classification problems. Here, I review applications to the problem of predicting protein structure from protein sequence. Initially, methods were designed as a ‘quick and dirty’ demonstration that artificial intelligence-based machines could solve real-life problems. At that stage, biologists using their expertise typically reached higher levels of accuracy than computer scientists using their machines. However, more thorough investigations introduced the information used by experts into neural network-based tools. Now, some tools are, on average, as accurate as the best experts, and experts using such tools often arrive at even more accurate predictions. Thus, several neural network-based methods have contributed significantly to advancing the field of bioinformatics, and some are clearly influencing molecular biology.

Editor information

John W. Clark Thomas Lindenau Manfred L. Ristig

Copyright information

© 1999 Springer-Verlag

About this paper

Cite this paper

Rost, B. (1999). Evolution teaches neural networks to predict protein structure. In: Clark, J.W., Lindenau, T., Ristig, M.L. (eds) Scientific Applications of Neural Nets. Lecture Notes in Physics, vol 522. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0104282

  • DOI: https://doi.org/10.1007/BFb0104282

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65737-8

  • Online ISBN: 978-3-540-48980-1

  • eBook Packages: Springer Book Archive
