An Introduction to Protein Contact Prediction

Hamilton, Nicholas; Huber, Thomas

doi:10.1007/978-1-60327-429-6_3

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 453)

Abstract

A fundamental problem in molecular biology is the prediction of the three-dimensional structure of a protein from its amino acid sequence. However, molecular modeling to find the structure is at present intractable and is likely to remain so for some time. Intermediate steps, such as predicting which residue pairs are in contact, have therefore been developed. Predicted contact pairs have been used for fold prediction, as an initial condition or constraint for molecular modeling, and as a filter to rank multiple models arising from homology modeling. As contact prediction has advanced, it has become more common for 3D structure predictors to integrate contact prediction into structure building, as this often gives information that is orthogonal to that produced by other methods. This chapter shows how evolutionary information contained in protein sequences and multiple sequence alignments can be used to predict protein structure, and reviews the state-of-the-art predictors and their methodologies.

Acknowledgments

The authors gratefully acknowledge financial support from the University of Queensland, the ARC Australian Centre for Bioinformatics, and the Institute for Molecular Bioscience. The first author would also like to acknowledge the support of Prof. Kevin Burrage's Australian Federation Fellowship.

Author information

Authors and Affiliations

ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience and Advanced Computational Modelling Centre, The University of Queensland, Brisbane, Queensland, Australia
Nicholas Hamilton
School of Molecular and Microbial Sciences and Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia
Thomas Huber

Editor information

Editors and Affiliations

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
Jonathan M. Keith PhD

Cite this protocol

Hamilton, N., Huber, T. (2008). An Introduction to Protein Contact Prediction. In: Keith, J.M. (eds) Bioinformatics. Methods in Molecular Biology™, vol 453. Humana Press. https://doi.org/10.1007/978-1-60327-429-6_3

DOI: https://doi.org/10.1007/978-1-60327-429-6_3
Publisher Name: Humana Press
Print ISBN: 978-1-60327-428-9
Online ISBN: 978-1-60327-429-6
eBook Packages: Springer Protocols
