Mining Residue Contacts in Proteins

Zaki, Mohammed J.; Bystroff, Chris

doi:10.1007/978-1-4615-1733-7_9

Part of the book series: Massive Computing (MACO, volume 2)

Abstract

In this paper, we develop data mining techniques to predict 3D contact potentials among protein residues (amino acids), based on the hierarchical nucleation-propagation model of protein folding. We apply a hybrid approach: a Hidden Markov Model extracts folding initiation sites, and association mining then discovers contact potentials. This hybrid approach achieves higher accuracy than previously reported results.
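
To make the notion of a residue contact concrete, here is a minimal sketch of how a contact map might be derived from 3D coordinates. It is an illustration only and does not reproduce the chapter's HMM or association-mining steps. The function name `contact_map`, the 7.0 Å distance threshold, the minimum sequence separation of 4, and the toy coordinates are all assumptions chosen for the example, not values taken from the abstract.

```python
from math import dist


def contact_map(ca_coords, threshold=7.0, min_separation=4):
    """Return the set of residue index pairs (i, j), i < j, treated as contacts.

    ca_coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    threshold: distance cutoff in angstroms (illustrative assumption).
    min_separation: minimum |i - j| along the sequence, so that trivially
        close sequence neighbours are not counted (illustrative assumption).
    """
    contacts = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist(ca_coords[i], ca_coords[j]) <= threshold:
                contacts.add((i, j))
    return contacts


# Hypothetical toy chain of six residues folded back on itself
# (made-up coordinates, not from any real structure).
toy = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
print(sorted(contact_map(toy)))
```

A contact predictor of the kind the abstract describes would try to recover such index pairs from sequence alone; the set computed here from known coordinates would serve as the ground truth for measuring accuracy.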

Editors and Affiliations

Robert L. Grossman, University of Illinois, Chicago, USA
Chandrika Kamath, Lawrence Livermore National Laboratory, Livermore, USA
Philip Kegelmeyer, Sandia National Laboratories, Livermore, USA
Vipin Kumar, Army High Performance Computing Research Center (AHPCRC), Minneapolis, USA
Raju R. Namburu, Army Research Laboratory, Aberdeen Proving Ground, USA

About this chapter

Cite this chapter

Zaki, M.J., Bystroff, C. (2001). Mining Residue Contacts in Proteins. In: Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., Namburu, R.R. (eds) Data Mining for Scientific and Engineering Applications. Massive Computing, vol 2. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1733-7_9

DOI: https://doi.org/10.1007/978-1-4615-1733-7_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-0114-7
Online ISBN: 978-1-4615-1733-7
eBook Packages: Springer Book Archive
