Protein Structure Prediction

Al-Lazikani, Bissan; Hill, Emma E.; Morea, Veronica

doi:10.1007/978-1-60327-429-6_2

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 453)

Abstract

Protein structure prediction has matured over the past few years to the point that even fully automated methods can provide reasonably accurate three-dimensional models of protein structures. However, until now it has not been possible to develop programs able to perform as well as human experts, who are still capable of systematically producing better models than automated servers. Although the precise details of protein structure prediction procedures are different for virtually every protein, this chapter describes a generic procedure to obtain a three-dimensional protein model starting from the amino acid sequence. This procedure takes advantage both of programs and servers that have been shown to perform best in blind tests and of the current knowledge about evolutionary relationships between proteins, gained from detailed analyses of protein sequence, structure, and functional data.

References

Moult, J., Pedersen, J. T., Judson, R., et al. (1995) A large-scale experiment to assess protein structure prediction methods. Proteins 23, ii–v.
PubMed CAS Google Scholar
Moult, J., Hubbard, T., Bryant, S. H., et al. (1997) Critical assessment of methods of protein structure prediction (CASP): round II. Proteins Suppl. 1, 2–6.
PubMed Google Scholar
Moult, J., Hubbard, T., Fidelis, K., et al. (1999) Critical assessment of methods of protein structure prediction (CASP): round III. Proteins Suppl. 3, 2–6.
PubMed Google Scholar
Moult, J., Fidelis, K., Zemla, A., et al. (2001) Critical assessment of methods of protein structure prediction (CASP): round IV. Proteins Suppl. 5, 2–7.
PubMed Google Scholar
Moult, J., Fidelis, K., Zemla, A., et al. (2003) Critical assessment of methods of protein structure prediction (CASP): round V. Proteins 53, Suppl. 6, 334–339.
PubMed CAS Google Scholar
Moult, J., Fidelis, K., Rost, B., et al. (2005) Critical assessment of methods of protein structure prediction (CASP): round 6. Proteins 61, Suppl. 7, 3–7.
PubMed CAS Google Scholar
Fischer, D., Barret, C., Bryson, K., et al. (1999) CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins Suppl. 3, 209–217.
PubMed Google Scholar
Fischer, D., Elofsson, A., Rychlewski, L., et al. (2001) CAFASP2: the second critical assessment of fully automated structure prediction methods. Proteins Suppl. 5, 171–183.
PubMed Google Scholar
Fischer, D., Rychlewski, L., Dunbrack, R. L., Jr., et al. (2003) CAFASP3: the third critical assessment of fully automated structure prediction methods. Proteins 53, Suppl. 6, 503–516.
PubMed CAS Google Scholar
Berman, H. M., Westbrook, J., Feng, Z., et al. (2000) The Protein Data Bank. Nucleic Acids Res 28, 235–242.
PubMed CAS Google Scholar
Rychlewski, L., Fischer, D. (2005) Live Bench-8: the large-scale, continuous assessment of automated protein structure prediction. Protein Sci 14, 240–245.
PubMed CAS Google Scholar
Koh, I. Y., Eyrich, V. A., Marti-Renom, M. A., et al. (2003) EVA: Evaluation of protein structure prediction servers. Nucleic Acids Res 31, 3311–3315.
PubMed CAS Google Scholar
Chothia, C., Lesk, A. M. (1986) The relation between the divergence of sequence and structure in proteins. Embo J 5, 823–826.
PubMed CAS Google Scholar
Tress, M., Ezkurdia, I., Grana, O., et al. (2005) Assessment of predictions submitted for the CASP6 comparative modeling category. Proteins 61, Suppl. 7, 27–45.
PubMed CAS Google Scholar
Bowie, J. U., Luthy, R., Eisenberg, D. (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Science 253, 164–170.
PubMed CAS Google Scholar
Jones, D. T., Taylor, W. R., Thornton, J. M. (1992) A new approach to protein fold recognition. Nature 358, 86–89.
PubMed CAS Google Scholar
Sippl, M. J., Weitckus, S. (1992) Detection of native-like models for amino acid sequences of unknown three-dimensional structure in a data base of known protein conformations. Proteins 13, 258–271.
PubMed CAS Google Scholar
Jones, D. T. (1997) Successful ab initio prediction of the tertiary structure of NK-lysin using multiple sequences and recognized supersecondary structural motifs. Proteins Suppl. 1, 185–191.
Google Scholar
Simons, K. T., Kooperberg, C., Huang, E., et al. (1997) Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol 268, 209–225.
PubMed CAS Google Scholar
Sprague, E. R., Wang, C., Baker, D., et al. (2006) Crystal structure of the HSV-1 Fc receptor bound to Fc reveals a mechanism for antibody bipolar bridging. PLoS Biol 4, e148.
PubMed Google Scholar
Galperin, M. Y. (2006) The Molecular Biology Database Collection: 2006 update. Nucleic Acids Res 34, D3–5.
PubMed CAS Google Scholar
Fox, J. A., McMillan, S., Ouellette, B. F. (2006) A compilation of molecular biology web servers: 2006 update on the Bioinfor-matics Links Directory. Nucleic Acids Res 34, W3–5.
PubMed CAS Google Scholar
Benson, D. A., Boguski, M. S., Lipman, D. J., et al. (1997) GenBank. Nucleic Acids Res 25, 1–6.
PubMed CAS Google Scholar
Wu, C. H., Apweiler, R., Bairoch, A., et al. (2006) The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res 34, D187–191.
PubMed CAS Google Scholar
Coutinho, P. M., Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In Recent Advances in Carbohydrate Bioengineering. H.J. Gilbert, G. Davies, B. Henrissat and B. Svensson eds., The Royal Society of Chemistry, Cambridge, UK, pp. 3–12.
Google Scholar
Lander, E. S., Linton, L. M., Birren, B., et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.
PubMed CAS Google Scholar
LoVerde, P. T., Hirai, H., Merrick, J. M., et al. (2004) Schistosoma mansoni genome project: an update. Parasitol Int 53, 183–192.
PubMed CAS Google Scholar
Andreeva, A., Howorth, D., Brenner, S. E., et al. (2004) SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res 32, D226–229.
PubMed CAS Google Scholar
Pearl, F., Todd, A., Sillitoe, I., et al. (2005) The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucleic Acids Res 33, D247–251.
PubMed CAS Google Scholar
Holm, L., Ouzounis, C., Sander, C., et al. (1992) A database of protein structure families with common folding motifs. Protein Sci 1, 1691–1698.
PubMed CAS Google Scholar
Holm, L., Sander, C. (1997) Dali/FSSP classification of three-dimensional protein folds. Nucleic Acids Res 25, 231–234.
PubMed CAS Google Scholar
Mizuguchi, K., Deane, C. M., Blundell, T. L., et al. (1998) HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci 7, 2469–2471.
PubMed CAS Google Scholar
Gasteiger, E., Gattiker, A., Hoogland, C., et al. (2003) ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31, 3784–3788.
PubMed CAS Google Scholar
Smith, R. F., Wiese, B. A., Wojzynski, M. K., et al. (1996) BCM Search Launcher— an integrated interface to molecular biology data base search and analysis services available on the World Wide Web. Genome Res 6, 454–462.
PubMed CAS Google Scholar
Stothard, P. (2000) The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques 28, 1102, 1104.
PubMed CAS Google Scholar
Janin, J. (2005) Assessing predictions of protein-protein interaction: the CAPRI experiment. Protein Sci 14, 278–283.
PubMed CAS Google Scholar
Janin, J., Henrick, K., Moult, J., et al. (2003) CAPRI: a Critical Assessment of PRedicted Interactions. Proteins 52, 2–9.
PubMed CAS Google Scholar
Tai, C. H., Lee, W. J., Vincent, J. J., et al. (2005) Evaluation of domain prediction in CASP6. Proteins 61, Suppl. 7, 183–192.
PubMed CAS Google Scholar
Kim, D. E., Chivian, D., Malmstrom, L., et al. (2005) Automated prediction of domain boundaries in CASP6 targets using Ginzu and RosettaDOM. Proteins 61, Suppl. 7, 193– 200.
PubMed CAS Google Scholar
Suyama, M., Ohara, O. (2003) DomCut: prediction of inter-domain linker regions in amino acid sequences. Bioinformatics 19, 673–674.
PubMed CAS Google Scholar
Marchler-Bauer, A., Anderson, J. B., Cherukuri, P. F., et al. (2005) CDD: a Conserved Domain Database for protein classification. Nucleic Acids Res 33, D192–196.
PubMed CAS Google Scholar
Finn, R. D., Mistry, J., Schuster-Bockler, B., et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res 34, D247–251.
PubMed CAS Google Scholar
Letunic, I., Copley, R. R., Pils, B., et al. (2006) SMART 5: domains in the context of genomes and networks. Nucleic Acids Res 34, D257–260.
PubMed CAS Google Scholar
Bru, C., Courcelle, E., Carrere, S., et al.(2005) The ProDom database of protein domain families: more emphasis on 3D. Nucleic Acids Res 33, D212–215.
PubMed CAS Google Scholar
Mulder, N. J., Apweiler, R., Attwood, T. K., et al. (2005) InterPro, progress and status in 2005. Nucleic Acids Res 33, D201–205.
PubMed CAS Google Scholar
Hulo, N., Bairoch, A., Bulliard, V., et al. (2006) The PROSITE database. Nucleic Acids Res 34, D227–230.
PubMed CAS Google Scholar
Gough, J., Chothia, C. (2002) SUPER-FAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments. Nucleic Acids Res 30, 268–272.
PubMed CAS Google Scholar
Madera, M., Vogel, C., Kummerfeld, S. K., et al. (2004) The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res 32, D235–239.
PubMed CAS Google Scholar
Jin, Y., Dunbrack, R. L., Jr. (2005) Assessment of disorder predictions in CASP6. Proteins 61, Suppl. 7, 167–175.
PubMed CAS Google Scholar
Obradovic, Z., Peng, K., Vucetic, S., et al. (2005) Exploiting heterogeneous sequence properties improves prediction of protein disorder. Proteins 61, Suppl. 7, 176–182.
PubMed CAS Google Scholar
Peng, K., Radivojac, P., Vucetic, S., et al.(2006) Length-dependent prediction of protein intrinsic disorder. BMC Bioinfor-matics 7, 208.
Google Scholar
Cheng, J., Sweredoski, M., Baldi, P. (2005) Accurate prediction of protein disordered regions by mining protein structure data. Data Mining Knowl Disc 11, 213–222.
Google Scholar
Dosztanyi, Z., Csizmok, V., Tompa, P., et al.(2005) IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434.
PubMed CAS Google Scholar
Vullo, A., Bortolami, O., Pollastri, G., et al. (2006) Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res 34, W164–168.
PubMed CAS Google Scholar
Ward, J. J., Sodhi, J. S., McGuffin, L. J., et al. (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337, 635–645.
PubMed CAS Google Scholar
Bryson, K., McGuffin, L. J., Marsden, R. L., et al. (2005) Protein structure prediction servers at University College London. Nucleic Acids Res 33, W36–38.
PubMed CAS Google Scholar
Krogh, A., Larsson, B., von Heijne, G., et al. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305, 567–580.
PubMed CAS Google Scholar
Rost, B., Yachdav, G., Liu, J. (2004) The PredictProtein server. Nucleic Acids Res 32, W321–326.
PubMed CAS Google Scholar
Bagos, P. G., Liakopoulos, T. D., Spyro-poulos, I. C., et al. (2004) PRED-TMBB: a web server for predicting the topology of beta-barrel outer membrane proteins. Nucleic Acids Res. 32, W400–404.
PubMed CAS Google Scholar
Natt, N. K., Kaur, H., Raghava, G. P. (2004) Prediction of transmembrane regions of beta-barrel proteins using ANN- and SVM-based methods. Proteins 56, 11–18.
PubMed CAS Google Scholar
Jones, D. T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292, 195–202.
PubMed CAS Google Scholar
Karplus, K., Barrett, C., Hughey, R. (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics 14, 846–856.
PubMed CAS Google Scholar
Pollastri, G., McLysaght, A. (2005) Porter: a new, accurate server for protein secondary structure prediction. Bioinformatics 21, 1719–1720.
PubMed CAS Google Scholar
Cuff, J. A., Barton, G. J. (2000) Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins 40, 502–511.
PubMed CAS Google Scholar
Cuff, J. A., Clamp, M. E., Siddiqui, A. S., et al. (1998) JPred: a consensus secondary structure prediction server. Bioinformatics 14, 892–893.
PubMed CAS Google Scholar
Altschul, S. F., Madden, T. L., Schaffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–3402.
PubMed CAS Google Scholar
Tress, M., Tai, C. H., Wang, G., et al. (2005) Domain definition and target classification for CASP6. Proteins 61, Suppl. 7, 8–18.
PubMed CAS Google Scholar
Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol 183, 63–98.
PubMed CAS Google Scholar
Pearson, W. R. (1995) Comparison of methods for searching protein sequence databases. Protein Sci 4, 1145–1160.
PubMed CAS Google Scholar
Park, J., Karplus, K., Barrett, C., et al. (1998) Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol 284, 1201–1210.
PubMed CAS Google Scholar
Eddy, S. R. (1996) Hidden Markov models. Curr Opin Struct Biol 6, 361–365.
PubMed CAS Google Scholar
Eddy, S. R. (1998) Profile hidden Markov models. Bioinformatics 14, 755–763.
PubMed CAS Google Scholar
Madera, M., Gough, J. (2002) A comparison of profile hidden Markov model procedures for remote homology detection. Nucleic Acids Res 30, 4321–4328.
PubMed CAS Google Scholar
Karplus, K., Karchin, R., Draper, J., et al. (2003) Combining local-structure, fold-recognition, and new fold methods for protein structure prediction. Proteins 53, Suppl. 6, 491–496.
PubMed CAS Google Scholar
Karplus, K., Katzman, S., Shackleford, G., et al. (2005) SAM-T04: what is new in protein-structure prediction for CASP6. Proteins 61, Suppl. 7, 135–142.
PubMed CAS Google Scholar
Schaffer, A. A., Wolf, Y. I., Ponting, C. P., et al. (1999) IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices. Bioinformatics 15, 1000–1011.
PubMed CAS Google Scholar
Ohlson, T., Wallner, B., Elofsson, A. (2004) Profile-profile methods provide improved fold-recognition: a study of different profile-profile alignment methods. Proteins 57, 188–197.
PubMed CAS Google Scholar
Yona, G., Levitt, M. (2002) Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. J Mol Biol 315, 1257–1275.
PubMed CAS Google Scholar
von Ohsen, N., Sommer, I., Zimmer, R. (2003) Profile-profile alignment: a powerful tool for protein structure prediction. Pac Symp Biocomput 252–263.
Google Scholar
von Ohsen, N., Sommer, I., Zimmer, R., et al. (2004) Arby: automatic protein structure prediction using profile-profile alignment and confidence measures. Bioin-formatics 20, 2228–2235.
Google Scholar
Sadreyev, R., Grishin, N. (2003) COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol 326, 317–336.
PubMed CAS Google Scholar
Mittelman, D., Sadreyev, R., Grishin, N. (2003) Probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments. Bioinformatics 19, 1531–1539.
PubMed CAS Google Scholar
Sadreyev, R. I., Baker, D., Grishin, N. V. (2003) Profile-profile comparisons by COMPASS predict intricate homologies between protein families. Protein Sci 12, 2262–2272.
PubMed CAS Google Scholar
Heger, A., Holm, L. (2001) Picasso: generating a covering set of protein family profiles. Bioinformatics 17, 272–279.
PubMed CAS Google Scholar
Edgar, R. C., Sjolander, K. (2004) COACH: profile-profile alignment of protein families using hidden Markov models. Bioinformat-ics 20, 1309–1318.
CAS Google Scholar
Pietrokovski, S. (1996) Searching databases of conserved sequence regions by aligning protein multiple-alignments. Nucleic Acids Res 24, 3836–3845.
PubMed CAS Google Scholar
Jaroszewski, L., Rychlewski, L., Li, Z., et al. (2005) FFAS03: a server for profile–profile sequence alignments. Nucleic Acids Res 33, W284–288.
PubMed CAS Google Scholar
Tomii, K., Akiyama, Y. (2004) FORTE: a profile-profile comparison tool for protein fold recognition. Bioinformatics 20, 594–595.
PubMed CAS Google Scholar
Ginalski, K., Pas, J., Wyrwicz, L. S., et al. (2003) ORFeus: detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res 31, 3804–3807.
PubMed CAS Google Scholar
Soding, J., Biegert, A., Lupas, A. N. (2005) The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33, W244–248.
PubMed Google Scholar
Kabsch, W., Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637.
PubMed CAS Google Scholar
Sippl, M. J. (1995) Knowledge-based potentials for proteins. Curr Opin Struct Biol 5, 229–235.
PubMed CAS Google Scholar
Kelley, L. A., MacCallum, R. M., Sternberg, M. J. (2000) Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 299, 499–520.
PubMed CAS Google Scholar
Jones, D. T. (1999) GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol 287, 797–815.
PubMed CAS Google Scholar
McGuffin, L. J., Bryson, K., Jones, D. T. (2000) The PSIPRED protein structure prediction server. Bioinformatics 16, 404–405.
PubMed CAS Google Scholar
Zhang, Y., Arakaki, A. K., Skolnick, J. (2005) TASSER: an automated method for the prediction of protein tertiary structures in CASP6. Proteins 61, Suppl. 7, 91–98.
PubMed CAS Google Scholar
Skolnick, J., Kihara, D., Zhang, Y. (2004) Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm. Proteins 56, 502–518.
PubMed CAS Google Scholar
Shi, J., Blundell, T. L., Mizuguchi, K. (2001) FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 310, 243–257.
PubMed CAS Google Scholar
Xu, J., Li, M., Kim, D., et al. (2003) RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol 1, 95–117.
PubMed CAS Google Scholar
Tang, C. L., Xie, L., Koh, I. Y., et al. (2003) On the role of structural information in remote homology detection and sequence alignment: new methods using hybrid sequence profiles. J Mol Biol 334, 1043–1062.
PubMed CAS Google Scholar
Teodorescu, O., Galor, T., Pillardy, J., et al. (2004) Enriching the sequence substitution matrix by structural information. Proteins 54, 41–48.
PubMed CAS Google Scholar
Zhou, H., Zhou, Y. (2004) Single-body residue-level knowledge-based energy score combined with sequence-profile and secondary structure information for fold recognition. Proteins 55, 1005–1013.
PubMed CAS Google Scholar
Zhou, H., Zhou, Y. (2005) SPARKS 2 and SP3 servers in CASP6. Proteins 61, Suppl. 7, 152–156.
PubMed CAS Google Scholar
Zhou, H., Zhou, Y. (2005) Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments. Proteins 58, 321–328.
PubMed CAS Google Scholar
Thompson, J. D., Higgins, D. G., Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
PubMed CAS Google Scholar
Notredame, C., Higgins, D. G., Heringa, J. (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302, 205–217.
PubMed CAS Google Scholar
Thompson, J. D., Gibson, T. J., Plewniak, F., et al. (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.
PubMed CAS Google Scholar
Crooks, G. E., Hon, G., Chandonia, J. M., et al. (2004) WebLogo: a sequence logo generator. Genome Res 14, 1188–1190.
PubMed CAS Google Scholar
Sonnhammer, E. L., Hollich, V. (2005) Scoredist: a simple and robust protein sequence distance estimator. BMC Bioin-formatics 6, 108.
Google Scholar
Galtier, N., Gouy, M., Gautier, C. (1996) SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput Appl Biosci 12, 543–548.
PubMed CAS Google Scholar
Parry-Smith, D. J., Payne, A. W., Michie, A. D., et al. (1998) CINEMA—a novel colour INteractive editor for multiple alignments. Gene 221, GC57–63.
PubMed CAS Google Scholar
Ginalski, K., von Grotthuss, M., Grishin, N. V., et al. (2004) Detecting distant homol-ogy with Meta-BASIC. Nucleic Acids Res 32, W576–581.
PubMed CAS Google Scholar
Xu, Y., Xu, D., Gabow, H. N. (2000) Protein domain decomposition using a graph-theoretic approach. Bioinformatics 16, 1091–1104.
PubMed CAS Google Scholar
Guo, J. T., Xu, D., Kim, D., et al. (2003) Improving the performance of Domain-Parser for structural domain partition using neural network. Nucleic Acids Res 31, 944–952.
PubMed CAS Google Scholar
Alexandrov, N., Shindyalov, I. (2003) PDP: protein domain parser. Bioinformatics 19, 429–430.
PubMed CAS Google Scholar
Todd, A. E., Orengo, C. A., Thornton, J. M. (1999) DOMPLOT: a program to generate schematic diagrams of the structural domain organization within proteins, annotated by ligand contacts. Protein Eng 12, 375–379.
PubMed CAS Google Scholar
Zemla, A. (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Res 31, 3370–3374.
PubMed CAS Google Scholar
Holm, L., Park, J. (2000) DaliLite workbench for protein structure comparison. Bioinformatics 16, 566–567.
PubMed CAS Google Scholar
Ortiz, A. R., Strauss, C. E., Olmea, O. (2002) MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci 11, 2606–2621.
PubMed CAS Google Scholar
Gibrat, J. F., Madej, T., Br yant, S. H. (1996) Surprising similarities in structure comparison. Curr Opin Struct Biol 6, 377–385.
PubMed CAS Google Scholar
Shindyalov, I. N., Bourne, P. E. (1998) Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 11, 739–747.
PubMed CAS Google Scholar
Orengo, C. A., Taylor, W. R. (1996) SSAP: sequential structure alignment program for protein structure comparison. Methods Enzymol 266, 617–635.
PubMed CAS Google Scholar
Krissinel, E., Henrick, K. (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr 60, 2256–2268.
PubMed CAS Google Scholar
Yang, A. S., Honig, B. (1999) Sequence to structure alignment in comparative modeling using PrISM. Proteins Suppl. 3, 66–72.
PubMed Google Scholar
Lupyan, D., Leo-Macias, A., Ortiz, A. R. (2005) A new progressive-iterative algorithm for multiple structure alignment. Bioinformatics 21, 3255–3263.
PubMed CAS Google Scholar
Ye, Y., Godzik, A. (2005) Multiple flexible structure alignment using partial order graphs. Bioinformatics 21, 2362–2369.
PubMed CAS Google Scholar
Hill, E. E., Morea, V., Chothia, C. (2002) Sequence conservation in families whose members have little or no sequence similarity: the four-helical cytokines and cyto-chromes. J Mol Biol 322, 205–233.
PubMed CAS Google Scholar
Chothia, C., Jones, E. Y. (1997) The molecular structure of cell adhesion molecules. Annu Rev Biochem 66, 823–862.
PubMed CAS Google Scholar
Hill, E., Broadbent, I. D., Chothia, C., et al. (2001) Cadherin superfamily proteins in Caenorhabditis elegans and Drosophila melanogaster. J Mol Biol 305, 1011–1024.
PubMed CAS Google Scholar
Chothia, C., Lesk, A. M. (1987) Canonical structures for the hypervariable regions of immunoglobulins. J Mol Biol 196, 901–917.
PubMed CAS Google Scholar
Chothia, C., Lesk, A. M., Tramontano, A., et al. (1989) Conformations of immu-noglobulin hypervariable regions. Nature 342, 877–883.
PubMed CAS Google Scholar
Al-Lazikani, B., Lesk, A. M., Chothia, C. (1997) Standard conformations for the canonical structures of immunoglobulins. J Mol Biol 273, 927–948.
PubMed CAS Google Scholar
Morea, V., Tramontano, A., Rustici, M., et al. (1998) Conformations of the third hypervariable region in the VH domain of immunoglobulins. J Mol Biol 275, 269–294.
PubMed CAS Google Scholar
Mizuguchi, K., Deane, C. M., Blundell, T. L., et al. (1998) JOY: protein sequence-structure representation and analysis. Bio-informatics 14, 617–623.
CAS Google Scholar
Hubbard, S. J., Thornton, J. M., (1993) NACCESS. Department of Biochemistry and Molecular Biology, University College London.
Google Scholar
McDonald, I. K., Thornton, J. M. (1994) Satisfying hydrogen bonding potential in proteins. J Mol Biol 238, 777–793.
PubMed CAS Google Scholar
Morris, A. L., MacArthur, M. W., Hutch-inson, E. G., et al. (1992) Stereochemical quality of protein structure coordinates. Proteins 12, 345–364.
PubMed CAS Google Scholar
Laskowski, R. A., MacArthur, M. W., Moss, D. S., et al. (1993) PROCHECK: a program to check the stereochemical quality of protein structures J Appl Cryst 26, 283–291.
CAS Google Scholar
Wallace, A. C., Laskowski, R. A., Thornton, J. M. (1995) LIGPLOT: a program to generate schematic diagrams of protein-lig-and interactions. Protein Eng 8, 127–134.
PubMed CAS Google Scholar
Laskowski, R. A., Hutchinson, E. G., Michie, A. D., et al. (1997) PDBsum: a Web-based database of summaries and analyses of all PDB structures. Trends Biochem Sci 22, 488–490.
PubMed CAS Google Scholar
Sasin, J. M., Bujnicki, J. M. (2004) COLO-RADO3D, a web server for the visual analysis of protein structures. Nucleic Acids Res 32, W586–589.
PubMed CAS Google Scholar
Landau, M., Mayrose, I., Rosenberg, Y., et al. (2005) ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res 33, W299–302.
PubMed CAS Google Scholar
Guex, N., Peitsch, M. C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714–2723.
PubMed CAS Google Scholar
Sayle, R. A., Milner-White, E. J. (1995) RASMOL: biomolecular graphics for all. Trends Biochem Sci 20, 374.
PubMed CAS Google Scholar
Martz, E. (2002) Protein Explorer: easy yet powerful macromolecular visualization. Trends Biochem Sci 27, 107–109.
PubMed CAS Google Scholar
Wang, Y., Geer, L. Y., Chappey, C., et al. (2000) Cn3D: sequence and structure views for Entrez. Trends Biochem Sci 25, 300–302.
PubMed CAS Google Scholar
Vriend, G. (1990) WHAT IF: a molecular modeling and drug design program. J Mol Graph 8, 52–56.
PubMed CAS Google Scholar
Koradi, R., Billeter, M., Wuthrich, K. (1996) MOLMOL: a program for display and analysis of macromolecular structures. J Mol Graph 14, 51–55, 29–32.
PubMed CAS Google Scholar
Humphrey, W., Dalke, A., Schulten, K. (1996) VMD: visual molecular dynamics. J Mol Graph 14, 33–38, 27–38.
PubMed CAS Google Scholar
Tramontano, A., Chothia, C., Lesk, A. M. (1990) Framework residue 71 is a major determinant of the position and conformation of the second hypervariable region in the VH domains of immunoglobulins. J Mol Biol 215, 175–182.
PubMed CAS Google Scholar
Sibanda, B. L., Thornton, J. M. (1985) Beta-hairpin families in globular proteins. Nature 316, 170–174.
PubMed CAS Google Scholar
Sibanda, B. L., Blundell, T. L., Thornton, J. M. (1989) Conformation of beta-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J Mol Biol 206, 759–777.
PubMed CAS Google Scholar
Bruccoleri, R. E. (2000) Ab initio loop modeling and its application to homology modeling. Methods Mol Biol 143, 247–264.
PubMed CAS Google Scholar
Xiang, Z., Soto, C. S., Honig, B. (2002) Evaluating conformational free energies: the colony energy and its application to the problem of loop prediction. Proc Natl Acad Sci U S A 99, 7432–7437.
PubMed CAS Google Scholar
Tosatto, S. C., Bindewald, E., Hesser, J., et al. (2002) A divide and conquer approach to fast loop modeling. Protein Eng 15, 279–286.
PubMed CAS Google Scholar
Fiser, A., Sali, A. (2003) ModLoop: automated modeling of loops in protein structures. Bioinformatics 19, 2500–2501.
PubMed CAS Google Scholar
Canutescu, A. A., Shelenkov, A. A., Dun-brack, R. L., Jr. (2003) A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 12, 2001–2014.
PubMed CAS Google Scholar
Hung, L. H., Ngan, S. C., Liu, T., et al. (2005) PROTINFO: new algorithms for enhanced protein structure predictions. Nucleic Acids Res 33, W77–80.
PubMed CAS Google Scholar
Xiang, Z., Honig, B. (2001) Extending the accuracy limits of prediction for side-chain conformations. J Mol Biol 311, 421–430.
PubMed CAS Google Scholar
Marti-Renom, M. A., Stuart, A. C., Fiser, A., et al. (2000) Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29, 291–325.
PubMed CAS Google Scholar
Levitt, M. (1992) Accurate modeling of protein conformation by automatic segment matching.J Mol Biol 226, 507–533.
PubMed CAS Google Scholar
Schwede, T., Kopp, J., Guex, N., et al. (2003) SWISS-MODEL: an automated protein homology-modeling server.Nucleic Acids Res 31, 3381–3385.
PubMed CAS Google Scholar
Bates, P. A., Kelley, L. A., MacCallum, R. M., et al. (2001) Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM.Proteins Suppl. 5, 39–46.
PubMed Google Scholar
Petrey, D., Xiang, Z., Tang, C. L., et al. (2003) Using multiple structure alignments, fast model building, and energetic analysis in fold recognition and homology modeling.Proteins 53, Suppl. 6, 430–435.
PubMed CAS Google Scholar
Koehl, P., Delarue, M. (1994) Application of a self-consistent mean field theory to predict protein side-chains conformation and estimate their conformational entropy.J Mol Biol 239, 249–275.
PubMed CAS Google Scholar
Wallner, B., Elofsson, A. (2005) All are not equal: a benchmark of different homology modeling programs.Protein Sci 14, 1315–1327.
PubMed CAS Google Scholar
Lund, O., Frimand, K., Gorodkin, J., et al. (1997) Protein distance constraints predicted by neural networks and probability density functions.Protein Eng 10, 1241–1248.
PubMed CAS Google Scholar
Lambert, C., Leonard, N., De Bolle, X., et al. (2002) ESyPred3D: prediction of proteins 3D structures.Bioinformatics 18, 1250–1256.
PubMed CAS Google Scholar
Hooft, R. W., Vriend, G., Sander, C., et al. (1996) Errors in protein structures.Nature 381, 272.
PubMed CAS Google Scholar
Sippl, M. J. (1993) Recognition of errors in three-dimensional structures of proteins.Proteins 17, 355–362.
PubMed CAS Google Scholar
Luthy, R., Bowie, J. U., Eisenberg, D. (1992) Assessment of protein models with three-dimensional profiles.Nature 356, 83–85.
Melo, F., Devos, D., Depiereux, E., et al. (1997) ANOLEA: a www server to assess protein structures. Proc Int Conf Intell Syst Mol Biol 5, 187–190.
Melo, F., Feytmans, E. (1998) Assessing protein structures with a non-local atomic interaction energy. J Mol Biol 277, 1141–1152.
Wallner, B., Elofsson, A. (2003) Can correct protein models be identified? Protein Sci 12, 1073–1086.
Wallner, B., Elofsson, A. (2006) Identification of correct regions in protein models using structural, alignment, and consensus information. Protein Sci 15, 900–913.
Fischer, D. (2006) Servers for protein structure prediction. Curr Opin Struct Biol 16, 178–182.
Dayringer, H. E., Tramontano, A., Sprang, S. R., et al. (1986) Interactive program for visualization and modeling of protein, nucleic acid and small molecules. J Mol Graph 4, 82–87.
Spoel, D. v. d., Lindahl, E., Hess, B., et al. (2005) GROMACS: fast, flexible and free. J Comput Chem 26, 1701–1718.
Phillips, J. C., Braun, R., Wang, W., et al. (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26, 1781–1802.
Simons, K. T., Ruczinski, I., Kooperberg, C., et al. (1999) Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins. Proteins 34, 82–95.
Bonneau, R., Tsai, J., Ruczinski, I., et al. (2001) Rosetta in CASP4: progress in ab initio protein structure prediction. Proteins Suppl. 5, 119–126.
Bonneau, R., Strauss, C. E., Rohl, C. A., et al. (2002) De novo prediction of three-dimensional structures for major protein families. J Mol Biol 322, 65–78.
Rohl, C. A., Strauss, C. E., Chivian, D., et al. (2004) Modeling structurally variable regions in homologous proteins with Rosetta. Proteins 55, 656–677.
Bradley, P., Malmstrom, L., Qian, B., et al. (2005) Free modeling with Rosetta in CASP6. Proteins 61, Suppl. 7, 128–134.
Chivian, D., Kim, D. E., Malmstrom, L., et al. (2003) Automated prediction of CASP-5 structures using the Robetta server. Proteins 53, Suppl. 6, 524–533.
Chivian, D., Kim, D. E., Malmstrom, L., et al. (2005) Prediction of CASP6 structures using automated Robetta protocols. Proteins 61, Suppl. 7, 157–166.
Kim, D. E., Chivian, D., Baker, D. (2004) Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res 32, W526–531.
Vincent, J. J., Tai, C. H., Sathyanarayana, B. K., et al. (2005) Assessment of CASP6 predictions for new and nearly new fold targets. Proteins 61, Suppl. 7, 67–83.
Wang, G., Jin, Y., Dunbrack, R. L., Jr. (2005) Assessment of fold recognition predictions in CASP6. Proteins 61, Suppl. 7, 46–66.
Jones, D. T., Bryson, K., Coleman, A., et al. (2005) Prediction of novel and analogous folds using fragment assembly and fold recognition. Proteins 61, Suppl. 7, 143–151.
Kolinski, A., Bujnicki, J. M. (2005) Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models. Proteins 61, Suppl. 7, 84–90.
Fujikawa, K., Jin, W., Park, S. J., et al. (2005) Applying a grid technology to protein structure predictor “ROKKY”. Stud Health Technol Inform 112, 27–36.
Debe, D. A., Danzer, J. F., Goddard, W. A., et al. (2006) STRUCTFAST: protein sequence remote homology detection and alignment using novel dynamic programming and profile-profile scoring. Proteins 64, 960–967.
Ginalski, K., Elofsson, A., Fischer, D., et al. (2003) 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics 19, 1015–1018.
Fischer, D. (2003) 3DS3 and 3DS5 3D-SHOTGUN meta-predictors in CAFASP3. Proteins 53, Suppl. 6, 517–523.
Sasson, I., Fischer, D. (2003) Modeling three-dimensional protein structures for CASP5 using the 3D-SHOTGUN meta-predictors. Proteins 53, Suppl. 6, 389–394.
Fischer, D. (2003) 3D-SHOTGUN: a novel, cooperative, fold-recognition meta-predictor. Proteins 51, 434–441.
Fischer, D. (2000) Hybrid fold recognition: combining sequence derived properties with evolutionary information. Pac Symp Biocomput 119–130.
Lundstrom, J., Rychlewski, L., Bujnicki, J., et al. (2001) Pcons: a neural-network-based consensus predictor that improves fold recognition. Protein Sci 10, 2354–2362.
Kurowski, M. A., Bujnicki, J. M. (2003) GeneSilico protein structure prediction metaserver. Nucleic Acids Res 31, 3305–3307.
Plaxco, K. W., Simons, K. T., Baker, D. (1998) Contact order, transition state placement and the refolding rates of single domain proteins. J Mol Biol 277, 985–994.
Bonneau, R., Ruczinski, I., Tsai, J., et al. (2002) Contact order and ab initio protein structure prediction. Protein Sci 11, 1937–1944.
Shortle, D., Simons, K. T., Baker, D. (1998) Clustering of low-energy conformations near the native structures of small proteins. Proc Natl Acad Sci U S A 95, 11158–11162.
Venclovas, C., Margelevicius, M. (2005) Comparative modeling in CASP6 using consensus approach to template selection, sequence-structure alignment, and structure assessment. Proteins 61, Suppl. 7, 99–105.
Kosinski, J., Gajda, M. J., Cymerman, I. A., et al. (2005) FRankenstein becomes a cyborg: the automatic recombination and realignment of fold recognition models in CASP6. Proteins 61, Suppl. 7, 106–113.
Wallner, B., Fang, H., Elofsson, A. (2003) Automatic consensus-based fold recognition using Pcons, ProQ, and Pmodeller. Proteins 53, Suppl. 6, 534–541.
Wallner, B., Elofsson, A. (2005) Pcons5: combining consensus, structural evaluation and fold recognition scores. Bioinformatics 21, 4248–4254.
Douguet, D., Labesse, G. (2001) Easier threading through web-based comparisons and cross-validations. Bioinformatics 17, 752–753.
Takeda-Shitaka, M., Terashi, G., Takaya, D., et al. (2005) Protein structure prediction in CASP6 using CHIMERA and FAMS. Proteins 61, Suppl. 7, 122–127.
Kopp, J., Schwede, T. (2004) The SWISS-MODEL Repository of annotated three-dimensional protein structure homology models. Nucleic Acids Res 32, D230–234.
Pieper, U., Eswar, N., Braberg, H., et al. (2004) MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Res 32, D217–222.
Yamaguchi, A., Iwadate, M., Suzuki, E., et al. (2003) Enlarged FAMSBASE: protein 3D structure models of genome sequences for 41 species. Nucleic Acids Res 31, 463–468.
Castrignano, T., De Meo, P. D., Cozzetto, D., et al. (2006) The PMDB Protein Model Database. Nucleic Acids Res 34, D306–309.
Dayhoff, M. O., Schwartz, R. M., Orcutt, B. C. (1978) A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure. M. O. Dayhoff, ed. National Biomedical Research Foundation, Washington, DC.
Henikoff, S., Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89, 10915–10919.

Acknowledgments

The authors gratefully acknowledge Claudia Bertonati, Gianni Colotti, Andrea Ilari, Romina Oliva, and Christine Vogel for manuscript reading and suggestions, and Julian Gough and Martin Madera for discussions.

Author information

Authors and Affiliations

Biofocus DPI, London, United Kingdom
Bissan Al-Lazikani
The Journal of Cell Biology, Rockefeller University Press, New York, NY
Emma E. Hill
Institute of Molecular Biology and Pathology (IBPN), National Research Council (CNR), Rome, Italy
Veronica Morea

Editor information

Editors and Affiliations

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
Jonathan M. Keith PhD

About this protocol

Cite this protocol

Al-Lazikani, B., Hill, E.E., Morea, V. (2008). Protein Structure Prediction. In: Keith, J.M. (eds) Bioinformatics. Methods in Molecular Biology™, vol 453. Humana Press. https://doi.org/10.1007/978-1-60327-429-6_2

DOI: https://doi.org/10.1007/978-1-60327-429-6_2
Publisher Name: Humana Press
Print ISBN: 978-1-60327-428-9
Online ISBN: 978-1-60327-429-6
eBook Packages: Springer Protocols
