Amino acid sequence analysis provides important insight into the structure of proteins, which in turn greatly facilitates the understanding of their biochemical and cellular function. Efforts to use computational methods to predict protein structure from sequence information alone started 30 years ago (Nagano 1973; Chou and Fasman 1974). However, only during the last decade have new computational techniques such as protein fold recognition, together with the growth of sequence and structure databases driven by modern high-throughput technologies, increased the success rate of prediction methods to the point where molecular biologists and biochemists can use them as an aid in experimental investigations.
References
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403-410
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389-3402
Aravind L, Koonin EV (1999) Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J Mol Biol 287:1023-1040
Aravind L, Mazumder R, Vasudevan S, Koonin EV (2002) Trends in protein evolution inferred from sequence and structure analysis. Curr Opin Struct Biol 12:392-399
Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL (2002) The Pfam protein families database. Nucleic Acids Res 30:276-280
Bowie JU, Luthy R, Eisenberg D (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Science 253:164-170
Bryant SH, Lawrence CE (1993) An empirical energy function for threading protein sequence through the folding motif. Proteins 16:92-112
Bujnicki JM, Elofsson A, Fischer D, Rychlewski L (2001a) LiveBench-1: continuous benchmarking of protein structure prediction servers. Protein Sci 10:352-361
Bujnicki JM, Elofsson A, Fischer D, Rychlewski L (2001b) LiveBench-2: Large-scale automated evaluation of protein structure prediction servers. Proteins 45:184-191
Bujnicki JM, Elofsson A, Fischer D, Rychlewski L (2001c) Structure prediction Meta Server. Bioinformatics 17:750-751
Bystroff C, Thorsson V, Baker D (2000) HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins. J Mol Biol 301:173-190
Chandonia JM, Karplus M (1995) Neural networks for secondary structure and structural class predictions. Protein Sci 4:275-285
Chothia C (1992) Proteins. One thousand families for the molecular biologist. Nature 357:543-544
Chothia C, Lesk AM (1986) The relation between the divergence of sequence and structure in proteins. EMBO J 5:823-826
Chou PY, Fasman GD (1974) Prediction of protein conformation. Biochemistry 13:222-245
Combet C, Blanchet C, Geourjon C, Deleage G (2000) NPS@: network protein sequence analysis. Trends Biochem Sci 25:147-150
Cserzo M, Wallin E, Simon I, von Heijne G, Elofsson A (1997) Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng 10:673-676
Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ (1998) JPred: a consensus secondary structure prediction server. Bioinformatics 14:892-893
Douguet D, Labesse G (2001) Easier threading through web-based comparisons and cross-validations. Bioinformatics 17:752-753
Eddy SR (1996) Hidden Markov models. Curr Opin Struct Biol 6:361-365
Eyrich VA, Rost B (2003) META-PP: single interface to crucial prediction servers. Nucleic Acids Res 31:3308-3310
Fischer D (2000) Hybrid fold recognition: combining sequence derived properties with evolutionary information. Pac Symp Biocomput, pp 119-130
Fischer D, Elofsson A, Rice D, Eisenberg D (1996) Assessing the performance of fold recognition methods by means of a comprehensive benchmark. Pac Symp Biocomput, pp 300-318
Frishman D, Argos P (1997) Seventy-five percent accuracy in protein secondary structure prediction. Proteins 27:329-335
Garnier J, Osguthorpe DJ, Robson B (1978) Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120:97-120
Gerstein M, Levitt M (1997) A structural census of the current population of protein sequences. Proc Natl Acad Sci USA 94:11911-11916
Ginalski K, Elofsson A, Fischer D, Rychlewski L (2003a) 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics 19:1015-1018
Ginalski K, Pas J, Wyrwicz LS, von Grotthuss M, Bujnicki JM, Rychlewski L (2003b) ORFeus: detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res 31:3804-3807
Godzik A, Kolinski A, Skolnick J (1992) Topology fingerprint approach to the inverse protein folding problem. J Mol Biol 227:227-238
Grishin NV (2001a) Fold change in evolution of protein structures. J Struct Biol 134:167-185
Grishin NV (2001b) Treble clef finger-a functionally diverse zinc-binding structural motif. Nucleic Acids Res 29:1703-1714
Haft DH, Selengut JD, White O (2003) The TIGRFAMs database of protein families. Nucleic Acids Res 31:371-373
Henikoff JG, Greene EA, Pietrokovski S, Henikoff S (2000) Increased coverage of protein families with the blocks database servers. Nucleic Acids Res 28:228-230
Hirokawa T, Boon-Chieng S, Mitaku S (1998) SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14:378-379
Hofmann K, Stoffel W (1993) TMbase - a database of membrane spanning proteins segments. Biol Chem 374:166
Ikeda M, Arai M, Lao DM, Shimizu T (2002) Transmembrane topology prediction methods: a re-assessment and improvement by a consensus method using a dataset of experimentally-characterized transmembrane topologies. In Silico Biol 2:19-33
Jones DT (1999a) GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol 287:797-815
Jones DT (1999b) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292:195-202
Jones DT, Taylor WR, Thornton JM (1992) A new approach to protein fold recognition. Nature 358:86-89
Jones DT, Taylor WR, Thornton JM (1994) A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry 33:3038-3049
Karplus K, Barrett C, Hughey R (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics 14:846-856
Karplus K, Karchin R, Barrett C, Tu S, Cline M, Diekhans M, Grate L, Casper J, Hughey R (2001) What is the value added by human intervention in protein structure prediction? Proteins 45(Suppl 5):86-91
Kaur H, Raghava GP (2003a) A neural-network based method for prediction of gamma-turns in proteins from multiple sequence alignment. Protein Sci 12:923-929
Kaur H, Raghava GP (2003b) Prediction of beta-turns in proteins from multiple alignment using neural network. Protein Sci 12:627-634
Kelley LA, McCallum CM, Sternberg MJ (2000) Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 299:501-522
Kihara D, Lu H, Kolinski A, Skolnick J (2001) TOUCHSTONE: an ab initio protein structure prediction method that uses threading-based tertiary restraints. Proc Natl Acad Sci USA 98:10125-10130
King RD, Ouali M, Strong AT, Aly A, Elmaghraby A, Kantardzic M, Page D (2000) Is it better to combine predictions? Protein Eng 13:15-19
Kneller DG, Cohen FE, Langridge R (1990) Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol 214:171-182
Koh IY, Eyrich VA, Marti-Renom MA, Przybylski D, Madhusudhan MS, Eswar N, Grana O, Pazos F, Valencia A, Sali A, Rost B (2003) EVA: evaluation of protein structure prediction servers. Nucleic Acids Res 31:3311-3315
Koonin EV, Wolf YI, Karev GP (2002) The structure of the protein universe and genome evolution. Nature 420:218-223
Krieger E, Nabuurs SB, Vriend G (2003) Homology modeling. Methods Biochem Anal 44:509-523
Kuhlmann UC, Moore GR, James R, Kleanthous C, Hemmings AM (1999) Structural parsimony in endonuclease active sites: should the number of homing endonuclease families be redefined? FEBS Lett 463:1-2
Kurowski MA, Bujnicki JM (2003) GeneSilico protein structure prediction meta-server. Nucleic Acids Res 31:3305-3307
Lambert C, Leonard N, De Bolle X, Depiereux E (2002) ESyPred3D: Prediction of proteins 3D structures. Bioinformatics 18:1250-1256
Lathrop RH (1994) The protein threading problem with sequence amino acid interaction preferences is NP-complete. Protein Eng 7:1059-1068
Lemer CM, Rooman MJ, Wodak SJ (1995) Protein structure prediction by threading methods: evaluation of current techniques. Proteins 23:337-355
Letunic I, Goodstadt L, Dickens NJ, Doerks T, Schultz J, Mott R, Ciccarelli F, Copley RR, Ponting CP, Bork P (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res 30:242-244
Levin JM, Pascarella S, Argos P, Garnier J (1993) Quantification of secondary structure prediction improvement using multiple alignments. Protein Eng 6:849-854
Li W, Pio F, Pawlowski K, Godzik A (2000) Saturated BLAST: an automated multiple intermediate sequence search used to detect distant homology. Bioinformatics 16:1105-1110
Liakopoulos TD, Pasquier C, Hamodrakas SJ (2001) A novel tool for the prediction of transmembrane protein topology based on a statistical analysis of the SwissProt database: the OrienTM algorithm. Protein Eng 14:387-390
Liu J, Tan H, Rost B (2002) Loopy proteins appear conserved in evolution. J Mol Biol 322:53-64
Lundstrom J, Rychlewski L, Bujnicki JM, Elofsson A (2001) Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci 10:2354-2362
Lupas A, Van Dyke M, Stock J (1991) Predicting coiled coils from protein sequences. Science 252:1162-1164
Marchler-Bauer A, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He S, Hurwitz DI, Jackson JD, Jacobs AR, Lanczycki CJ, Liebert CA, Liu C, Madej T, Marchler GH, Mazumder R, Nikolskaya AN, Panchenko AR, Rao BS, Shoemaker BA, Simonyan V, Song JS, Thiessen PA, Vasudevan S, Wang Y, Yamashita RA, Yin JJ, Bryant SH (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res 31:383-387
Martelli PL, Fariselli P, Krogh A, Casadio R (2002) A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins. Bioinformatics 18(Suppl 1):S46-S53
Milpetz F, Argos P, Persson B (1995) TMAP: a new email and WWW service for membrane-protein structural predictions. Trends Biochem Sci 20:204-205
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman A, Binns D, Biswas M, Bradley P, Bork P, Bucher P, Copley RR, Courcelle E, Das U, Durbin R, Falquet L, Fleischmann W, Griffiths-Jones S, Haft D, Harte N, Hulo N, Kahn D, Kanapin A, Krestyaninova M, Lopez R, Letunic I, Lonsdale D, Silventoinen V, Orchard SE, Pagni M, Peyruc D, Ponting CP, Selengut JD, Servant F, Sigrist CJ, Vaughan R, Zdobnov EM (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res 31:315-318
Murzin AG (1998) How far divergent evolution goes in proteins. Curr Opin Struct Biol 8:380-387
Nagano K (1973) Logical analysis of the mechanism of protein folding. I. Predictions of helices, loops and beta-structures from primary structure. J Mol Biol 75:401-420
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443-453
Ouali M, King RD (2000) Cascaded multiple classifiers for secondary structure prediction. Protein Sci 9:1162-1176
Ouzounis C, Sander C, Scharf M, Schneider R (1993) Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures. J Mol Biol 232:805-825
Pagni M, Jongeneel CV (2001) Making sense of score statistics for sequence alignments. Brief Bioinform 2:51-67
Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, Chothia C (1998) Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 284:1201-1210
Park J, Teichmann SA, Hubbard T, Chothia C (1997) Intermediate sequences increase the detection of homology between sequences. J Mol Biol 273:349-354
Pasquier C, Promponas VJ, Palaios GA, Hamodrakas JS, Hamodrakas SJ (1999) A novel method for predicting transmembrane segments in proteins based on a statistical analysis of the SwissProt database: the PRED-TMR algorithm. Protein Eng 12:381-385
Pearson WR (1998) Empirical statistical estimates for sequence similarity searches. J Mol Biol 276:71-84
Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 85:2444-2448
Pizzi E, Frontali C (2001) Low-complexity regions in Plasmodium falciparum proteins. Genome Res 11:218-229
Pollastri G, Przybylski D, Rost B, Baldi P (2002) Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 47:228-235
Rost B, Fariselli P, Casadio R (1996) Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci 5:1704-1718
Rost B, Sander C, Schneider R (1994) PHD-an automatic mail server for protein secondary structure prediction. Comput Appl Biosci 10:53-60
Rychlewski L, Jaroszewski L, Li W, Godzik A (2000) Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci 9:232-241
Salamov AA, Solovyev VV (1995) Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J Mol Biol 247:11-15
Samudrala R, Levitt M (2002) A comprehensive analysis of 40 blind protein structure predictions. BMC Struct Biol 2:3
Sanchez R, Sali A (2000) Comparative protein structure modeling. Introduction and practical examples with modeller. Methods Mol Biol 143:97-129
Schaffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV, Altschul SF (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 29:2994-3005
Servant F, Bru C, Carrere S, Courcelle E, Gouzy J, Peyruc D, Kahn D (2002) ProDom: automated clustering of homologous domains. Brief Bioinform 3:246-251
Shi J, Blundell TL, Mizuguchi K (2001) Fugue: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 310:243-257
Sigrist CJ, Cerutti L, Hulo N, Gattiker A, Falquet L, Pagni M, Bairoch A, Bucher P (2002) PROSITE: a documented database using patterns and profiles as motif descriptors. Brief Bioinform 3:265-274
Simons KT, Kooperberg C, Huang E, Baker D (1997) Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol 268:209-225
Sippl MJ, Weitckus S (1992) Detection of native-like models for amino acid sequences of unknown three-dimensional structure in a data base of known protein conformations. Proteins 13:258-271
Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195-197
Sonnhammer EL, von Heijne G, Krogh A (1998) A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol 6:175-182
Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV (2001) The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res 29:22-28
Taylor PD, Attwood TK, Flower DR (2003) BPROMPT: a consensus server for membrane protein prediction. Nucleic Acids Res 31:3698-3700
Thornton JM, Orengo CA, Todd AE, Pearl FM (1999) Protein folds, functions and evolution. J Mol Biol 293:333-342
Tompa P (2002) Intrinsically unstructured proteins. Trends Biochem Sci 27:527-533
Tusnady GE, Simon I (2001) The HMMTOP transmembrane topology prediction server. Bioinformatics 17:849-850
Vlahovicek K, Kajan L, Murvai J, Hegedus Z, Pongor S (2003) The SBASE domain sequence library, release 10: domain architecture prediction. Nucleic Acids Res 31:403-405
von Heijne G (1986) The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology. EMBO J 5:3021-3027
von Heijne G (1992) Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. J Mol Biol 225:487-494
Wallner B, Elofsson A (2003) Can correct protein models be identified? Protein Sci 12:1073-1086
Webber C, Barton GJ (2003) Increased coverage obtained by combination of methods for protein sequence database searching. Bioinformatics 19:1397-1403
Wolf YI, Grishin NV, Koonin EV (2000) Estimating the number of protein folds and families from complete genome data. J Mol Biol 299:897-905
Wootton JC (1994) Sequences with “unusual” amino acid composition. Curr Opin Struct Biol 4:413-421
Wootton JC, Federhen S (1996) Analysis of compositionally biased regions in sequence databases. Methods Enzymol 266:554-571
Wright PE, Dyson HJ (1999) Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol 293:321-331
Xu J, Li M, Lin G, Kim D, Xu Y (2003) Protein structure prediction by linear programming. Pac Symp Biocomput, pp 264-275
Xu Y, Xu D (2000) Protein threading using PROSPECT: design and evaluation. Proteins 40:343-354
Yona G, Levitt M (2002) Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. J Mol Biol 315:1257-1275
Zhai Y, Saier MH Jr (2001) A web-based program (WHAT) for the simultaneous prediction of hydropathy, amphipathicity, secondary structure and transmembrane topology for a single protein sequence. J Mol Microbiol Biotechnol 3:501-502
Zhai Y, Saier MH Jr (2002) The beta-barrel finder (BBF) program, allowing identification of outer membrane beta-barrel proteins encoded within prokaryotic genomes. Protein Sci 11:2196-2207
Zhang C, DeLisi C (1998) Estimating the number of protein folds. J Mol Biol 284:1301-1305
Zhou H, Zhou Y (2003) Predicting the topology of transmembrane helical proteins using mean burial propensity and a hidden-Markov-model-based method. Protein Sci 12:1547-1555
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Cymerman, I.A., Feder, M., Pawłowski, M., Kurowski, M.A., Bujnicki, J.M. (2008). Computational Methods for Protein Structure Prediction and Fold Recognition. In: Bujnicki, J.M. (eds) Practical Bioinformatics. Nucleic Acids and Molecular Biology, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74268-5_1
DOI: https://doi.org/10.1007/978-3-540-74268-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74267-8
Online ISBN: 978-3-540-74268-5
eBook Packages: Springer Book Archive