Abstract
An approach to recognizing recurrent sequence-structure patterns in proteins has been developed, based on Delaunay tessellation of protein structure. Starting with a united-residue (side chain centroid) representation of a protein structure, tessellation partitions the structure into a unique set of irregular tetrahedra, or simplices, whose vertices correspond to four nearest-neighbor residues. Tetrahedral clusters composed of residues not adjacent along the polypeptide chain have been classified according to their amino acid composition and the three distances separating the residues along the sequence, defined as the sequence lengths from the first to the second, the second to the third, and the third to the fourth residue. An elementary tertiary packing motif is defined as a Delaunay simplex with a specific amino acid composition, together with the three sequence distances (i.e., the number of residues along the sequence) between vertex residues. Analysis of three databases of diverse protein structures (<30% sequence identity between any pair; 1922 structures in total) identified 224 motifs, each found in at least two proteins from different fold families. To further substantiate the methodology, three groups of proteins representing unique structural and functional families were analyzed, and packing motifs characteristic of each group were identified. The proposed methodology is termed Simplicial Neighborhood Analysis of Protein Packing (SNAPP). SNAPP can be used to locate recurrent tertiary structural motifs as well as sequence-specific, functionally relevant patterns similar to Prosite (Hofmann et al., 1999) signatures. We anticipate that the SNAPP methodology will be useful in automating the analysis and comparison of protein structures determined in structural and functional genomics projects.
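The motif definition in the abstract can be illustrated with a small Python sketch. This is not the authors' SNAPP implementation: the function names, the toy coordinates, and the residue labels are all hypothetical. The tessellation is a brute-force empty-circumsphere test written for readability (a production pipeline would use a dedicated computational geometry library), the composition is treated as an unordered set of one-letter codes, and "not adjacent along the polypeptide chain" is read here as "no two vertex residues are sequence neighbors". Each of these is an assumption, not something the abstract specifies.

```python
from itertools import combinations


def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def circumsphere(p0, p1, p2, p3, eps=1e-9):
    """Return (center, radius_squared) of the sphere through four points,
    or None when the points are (nearly) coplanar."""
    others = (p1, p2, p3)
    # Equidistance from p0 and pi gives 2*(pi - p0) . c = |pi|^2 - |p0|^2.
    rows = [[2 * (p[k] - p0[k]) for k in range(3)] for p in others]
    rhs = [sum(p[k] ** 2 - p0[k] ** 2 for k in range(3)) for p in others]
    d = _det3(rows)
    if abs(d) < eps:
        return None
    center = []
    for k in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][k] = rhs[i]
        center.append(_det3(m) / d)
    r2 = sum((center[k] - p0[k]) ** 2 for k in range(3))
    return center, r2


def delaunay_simplices(points, eps=1e-9):
    """Brute-force 3D Delaunay tessellation: keep every quadruplet whose
    circumsphere contains no other point. Only suitable for small inputs."""
    simplices = []
    for quad in combinations(range(len(points)), 4):
        sphere = circumsphere(*(points[i] for i in quad))
        if sphere is None:
            continue
        center, r2 = sphere
        empty = all(
            sum((points[j][k] - center[k]) ** 2 for k in range(3)) >= r2 - eps
            for j in range(len(points)) if j not in quad
        )
        if empty:
            simplices.append(quad)
    return simplices


def motif_key(quad, residues):
    """Return (composition, (d12, d23, d34)) for one simplex, or None if any
    two vertex residues are adjacent in sequence (assumed reading)."""
    ordered = sorted(quad, key=lambda i: residues[i][0])
    seqnums = [residues[i][0] for i in ordered]
    gaps = tuple(b - a for a, b in zip(seqnums, seqnums[1:]))
    if min(gaps) < 2:
        return None
    composition = "".join(sorted(residues[i][1] for i in ordered))
    return composition, gaps


def packing_motifs(points, residues):
    """List the motif keys of all non-adjacent Delaunay simplices."""
    keys = (motif_key(q, residues) for q in delaunay_simplices(points))
    return [k for k in keys if k is not None]


# Toy input: five hypothetical side chain centroids and their residues,
# given as (sequence number, one-letter amino acid code).
centroids = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 2, 2)]
residues = [(3, "L"), (10, "A"), (17, "V"), (25, "I"), (26, "F")]
print(packing_motifs(centroids, residues))  # [('AILV', (7, 7, 8))]
```

The toy tessellation yields two simplices. The one containing residues 25 and 26 is discarded because those residues are sequence neighbors, leaving a single motif key. Counting how often each key recurs across proteins from different fold families would be the next step toward the kind of survey the abstract reports.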
References
Aurenhammer F. (1991). Voronoi diagrams: a survey of a fundamental data structure. ACM Comput. Surveys 23, 345–405.
Altschul SF, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W, Lipman D. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Bryant SH, Lawrence CE. (1993). An empirical energy function for threading protein sequence through the folding motif. Proteins 16, 92–112.
Carter CW Jr, LeFebvre BC, Cammer SA, Tropsha A, Edgell MH. (2001). Four-body potentials reveal protein-specific correlations to stability changes caused by hydrophobic core mutations. J. Mol. Biol. 311, 625–638.
Casari G, Sippl M. (1992). Structure-derived hydrophobic potential. Hydrophobic potential derived from X-ray structures of globular proteins is able to identify native folds. Proteins 13, 258–271.
Chothia C. (1975). Structural invariants in protein folding. Nature 254, 304–308.
Chothia C, Levitt M, Richardson D. (1977). Structure of proteins: packing of α-helices and pleated sheets. Proc. Natl. Acad. Sci. USA 74, 4130–4134.
Chothia C, Janin J. (1980). Packing of α-helices onto β-pleated sheets and the anatomy of α/β proteins. J. Mol. Biol. 143, 95–128.
Chothia C, Levitt M, Richardson D. (1981). Helix to helix packing in proteins. J. Mol. Biol. 145, 215–250.
Finney JL. (1970). Random packing and the structure of simple liquids. Proc. R. Soc. A 319, 479–493.
Gan HH, Tropsha A, Schlick T. (2000). Generating folded protein structures with a lattice chain algorithm. J. Chem. Phys. 113, 5511–5524.
Gan HH, Tropsha A, Schlick T. (2001). Lattice protein folding with two and four-body statistical potentials. Proteins: Struct. Funct. Genetics 43, 161–174.
Gernert K.M., Thomas B.D., Plurad J.C., Richardson J.S., Richardson D.C., Bergman L.D. (1996). Puzzle pieces defined: locating common packing units in tertiary protein contacts. In: Pacific Symposium on Biocomputing ’96, Hawaii, Jan. 3–6, 1996, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 331–349.
Gerstein M, Tsai J, Levitt M. (1995). The volume of atoms on the protein surface: calculated from simulation using Voronoi polyhedra. J. Mol. Biol. 249, 955–966.
Godzik A, Skolnick J. (1992). Sequence Structure Matching in Globular Proteins — Application to Supersecondary and Tertiary Structure Determination. Proc. Natl. Acad. Sci. USA. 89, 12098–12102.
Harpaz, Y., Gerstein, M., and Chothia, C. (1994) Volume changes on protein folding. Structure, 2, 641–649.
Henikoff S, Henikoff J, Pietrokovski S. (1999). Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics. 15, 471–9.
Hobohm U, Scharf M, Schneider R, Sander C. (1992). Selection of representative protein data sets. Protein Sci. 1, 409–417.
Hofmann K, Bucher P, Falquet L, Bairoch A. (1999). The PROSITE database, its status in 1999. Nucleic Acids Res. 27, 215–219.
Holm L, Sander C. (1998). Touring protein fold space with Dali/FSSP. Nucleic Acids Res. 26, 316–9.
Hooft RWW, Sander C, Vriend G. (1996). Verification of protein structures: sidechain planarity. J. Appl. Cryst. 29, 714–716.
Hutchinson EG, Sessions R, Thornton J, Woolfson D. (1998). Determinants of strand register in antiparallel β-sheets of proteins. Protein Sci. 7, 2287–2300.
Jonassen I, Eidhammer I, Taylor W. (1999). Discovery of local packing motifs in protein structures. Proteins 34, 206–219.
Jones D, Thornton J. (1996). Potential energy functions for threading. Curr. Opin. Struct. Biol. 6, 210–216.
Koretke KK, Luthey-Schulten Z, Wolynes PG. (1996) Self-consistently optimized statistical mechanical energy functions for sequence structure alignment. Protein Sci. 5, 1043–1059.
Lahr SJ, Broadwater A, Carter CW Jr, Collier ML, Hensley L, Waldner JC, Pielak GJ, Edgell MH. (1999). Patterned library analysis: a method for the quantitative assessment of hypotheses concerning the determinants of protein structure. Proc. Natl. Acad. Sci. USA. 96, 14860–5.
Maiorov VN, Crippen GM. (1992). Contact potential that recognizes the correct folding of globular proteins. J. Mol. Biol. 227, 876–88.
Miyazawa S, Jernigan RL. (1996). Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J. Mol. Biol. 256, 623–44.
Munson P, Singh R. (1997). Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence structure alignment. Protein Sci. 6, 1467–81.
Murzin AG, Brenner SE, Hubbard T, Chothia C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540.
Okabe, A., Boots, B., and Sugihara, K. (1992) Spatial tessellations: concepts and applications of Voronoi diagrams. Chichester, Wiley.
Richards FM. (1974). The interpretation of protein structures: total volume, group volume distribution and packing density. J. Mol. Biol. 82, 1–14.
Singh R, Tropsha A, Vaisman I. (1996). Delaunay tessellation of proteins: four body nearest-neighbor propensities of amino acid residues. J. Comput. Biol., 3, 213–222.
Sippl, MJ. (1995). Knowledge-based potentials for proteins. Curr. Opin. Struct. Biol. 5, 229–35.
Tropsha A, Singh RK, Vaisman II, Zheng W. (1996). Statistical geometry analysis of proteins: implications for inverted structure prediction. In: Pacific Symposium on Biocomputing ’96, Hawaii, Jan. 3–6, 1996, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 614–623.
Wako H, Yamato T. (1998). Novel method to detect a motif of local structures in different protein conformations. Protein Eng. 11, 981–90.
Watson DF. (1981). Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Comput. J. 24, 167–172.
Young M, Skillman A, Kuntz I. (1999). A rapid method for exploring the protein structure universe. Proteins 34, 317–32.
Zheng W, Cho J, Vaisman I, Tropsha A. (1997). A new approach to protein fold recognition based on Delaunay tessellation of protein structure. In: Pacific Symposium on Biocomputing ’97, Hawaii, Jan. 6–9, 1997, Eds. L. Hunter and T.E. Klein, World Scientific, Singapore, pp. 486–497.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cammer, S.A., Carter, C.W., Tropsha, A. (2002). Identification of Sequence-Specific Tertiary Packing Motifs in Protein Structures using Delaunay Tessellation. In: Schlick, T., Gan, H.H. (eds) Computational Methods for Macromolecules: Challenges and Applications. Lecture Notes in Computational Science and Engineering, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56080-4_19
DOI: https://doi.org/10.1007/978-3-642-56080-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43756-7
Online ISBN: 978-3-642-56080-4
eBook Packages: Springer Book Archive