Abstract
Protein structure prediction has matured over the past few years to the point that even fully automated methods can provide reasonably accurate three-dimensional models of protein structures. However, until now it has not been possible to develop programs able to perform as well as human experts, who are still capable of systematically producing better models than automated servers. Although the precise details of protein structure prediction procedures are different for virtually every protein, this chapter describes a generic procedure to obtain a three-dimensional protein model starting from the amino acid sequence. This procedure takes advantage both of programs and servers that have been shown to perform best in blind tests and of the current knowledge about evolutionary relationships between proteins, gained from detailed analyses of protein sequence, structure, and functional data.
Acknowledgments
The authors gratefully acknowledge Claudia Bertonati, Gianni Colotti, Andrea Ilari, Romina Oliva, and Christine Vogel for manuscript reading and suggestions, and Julian Gough and Martin Madera for discussions.
Copyright information
© 2008 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Al-Lazikani, B., Hill, E.E., Morea, V. (2008). Protein Structure Prediction. In: Keith, J.M. (eds) Bioinformatics. Methods in Molecular Biology™, vol 453. Humana Press. https://doi.org/10.1007/978-1-60327-429-6_2
DOI: https://doi.org/10.1007/978-1-60327-429-6_2
Publisher Name: Humana Press
Print ISBN: 978-1-60327-428-9
Online ISBN: 978-1-60327-429-6
eBook Packages: Springer Protocols