Abstract
Pattern recognition in proteins has become centrally important in molecular biology. Proteins are macromolecules composed of an ordered sequence of amino acids, also referred to as residues. The sequence of residues in a protein is called its primary structure, and its 3-D conformation is called its tertiary structure. Over the last few decades, thousands of protein sequences have been decoded. More recently, the 3-D conformations of several hundred proteins have been resolved using X-ray crystallographic techniques.
To date, most work on 3-D structural protein comparison has been limited to the linear matching of the 3-D conformations of contiguous segments (allowing insertions and deletions) of the amino acid chains. Several techniques originally developed for string matching have been modified to perform 3-D structural comparison based on the sequential order of the structures. We present an application of pattern recognition techniques (in particular, matching algorithms) to the structural comparison of proteins. The problem we face is to devise efficient techniques for routine scanning of structural databases, searching for recurrences of inexact structural motifs that are not necessarily composed of contiguous segments of the amino acid chain. The method uses the Geometric Hashing technique, which was originally developed for model-based object recognition problems in Computer Vision. Given the three-dimensional coordinate data of the structures to be compared, our method automatically identifies every region of structural similarity between the structures without prior knowledge of an initial alignment. Typical structure comparison problems are examined, and the results of the new method are compared with published results from previous methods. Examples of the application of the method to identify and search for non-linear 3-D motifs are included.
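To make the idea of order-independent matching concrete, the following is a minimal sketch of geometric hashing on 3-D point sets. It is not the authors' implementation: the choice of ordered point triplets as reference frames, the bin size CELL, the function names (frame, key, build_table, vote) and the six coordinates are all assumptions made for illustration. Each ordered, non-collinear triplet of model points defines a right-handed frame; the remaining points are expressed in that frame, quantized, and stored in a hash table. A triplet chosen in the second structure then votes for the model frames that place its remaining points in the same bins, with no reference to the order of the points along the chain.

```python
from collections import Counter, defaultdict
from itertools import permutations
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)

def frame(p0, p1, p2):
    """Right-handed orthonormal frame at p0 from an ordered triplet, or None if collinear."""
    x = unit(sub(p1, p0))
    if x is None:
        return None
    z = unit(cross(x, sub(p2, p0)))
    if z is None:
        return None
    return p0, x, cross(z, x), z

def key(frm, p, cell):
    """Quantized coordinates of p in the frame; unchanged by rotation and translation."""
    origin, x, y, z = frm
    d = sub(p, origin)
    return (round(dot(d, x) / cell), round(dot(d, y) / cell), round(dot(d, z) / cell))

def build_table(points, cell):
    """Preprocessing: hash every non-basis point under every ordered basis triplet."""
    table = defaultdict(list)
    for basis in permutations(range(len(points)), 3):
        frm = frame(*(points[i] for i in basis))
        if frm is None:
            continue
        for i, p in enumerate(points):
            if i not in basis:
                table[key(frm, p, cell)].append(basis)
    return table

def vote(table, points, basis, cell):
    """Recognition: one basis of the second structure votes for model bases."""
    votes = Counter()
    frm = frame(*(points[i] for i in basis))
    if frm is None:
        return votes
    for i, p in enumerate(points):
        if i not in basis:
            for model_basis in table.get(key(frm, p, cell), ()):
                votes[model_basis] += 1
    return votes

CELL = 1.0  # assumed bin size, in the same units as the coordinates
# Made-up coordinates for illustration only, not taken from any real structure.
MODEL = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
         (8.1, 4.2, 1.3), (9.7, 1.1, 3.4), (6.2, -2.3, 4.9)]

def move(p):
    """Rigid motion: 90 degree rotation about the z axis, then a translation."""
    x, y, z = p
    return (-y + 10.0, x - 4.0, z + 2.5)

# The second structure lists the same points in a different order.
order = [3, 0, 5, 1, 2, 4]
scene = [move(MODEL[i]) for i in order]

table = build_table(MODEL, CELL)
votes = vote(table, scene, (1, 3, 4), CELL)  # scene points 1, 3, 4 are model points 0, 1, 2
best_basis, best_votes = votes.most_common(1)[0]
print(best_basis, best_votes)
```

In this toy case the model basis (0, 1, 2) collects one vote from each of the three remaining points. A practical search would try many bases in the second structure, tolerate coordinate noise by also checking neighbouring bins, and verify high-scoring candidates by superimposing the matched points; none of that is shown here.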
Work on this paper was supported by grant No. 89-00481 from the US-Israel Binational Science Foundation (BSF), Jerusalem, Israel.
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fischer, D., Nussinov, R., Wolfson, H.J. (1992). 3-D substructure matching in protein Molecules. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_11
DOI: https://doi.org/10.1007/3-540-56024-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56024-1
Online ISBN: 978-3-540-47357-2
eBook Packages: Springer Book Archive