Abstract
In the wake of the flow of genome data, accurate tools for predicting protein structure are needed more urgently than ever. The problem of predicting protein structure from sequence remains fundamentally unsolved despite more than three decades of intensive research. However, the wealth of evolutionary information deposited in current databases has enabled significant improvements in methods that predict protein structure in 1D: secondary structure, transmembrane helices, and solvent accessibility. In particular, the combination of evolutionary information with neural networks has proved extremely successful. The new generation of prediction methods has proved accurate and reliable enough to be useful in genome analysis and in experimental structure determination. Moreover, this new generation of theoretical methods is increasingly influencing experiments in molecular biology.

Neural networks have been applied to many pattern classification problems. Here, I review applications to the problem of predicting protein structure from protein sequence. Initially, methods were designed as a ‘quick and dirty’ demonstration that artificial intelligence-based machines could solve real-life problems. At that stage, biologists using their expertise typically reached higher levels of accuracy than computer scientists using their machines. However, more thorough investigations introduced the information used by experts into neural network-based tools. Now, some tools are, on average, as accurate as the best experts, and experts using such tools often arrive at even more accurate predictions. Thus, several neural network-based methods have contributed significantly to advancing the field of bioinformatics, and some are clearly influencing molecular biology.
© 1999 Springer-Verlag
About this paper
Cite this paper
Rost, B. (1999). Evolution teaches neural networks to predict protein structure. In: Clark, J.W., Lindenau, T., Ristig, M.L. (eds) Scientific Applications of Neural Nets. Lecture Notes in Physics, vol 522. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0104282
DOI: https://doi.org/10.1007/BFb0104282
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65737-8
Online ISBN: 978-3-540-48980-1
eBook Packages: Springer Book Archive