Three-dimensional pattern matching in protein structure analysis

  • Arthur M. Lesk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 937)


Many pattern-matching problems that arise in “one dimension” in the analysis of genomic sequences have three-dimensional analogs in the analysis of protein structures. This report focuses on the identification and matching of common substructures, and treats two problems: probing a database of structures with a segment of a protein to identify regions of other proteins whose conformations are similar to that of the probe, and determining the maximal common “rigid subunit” when comparing alternative conformations of a single protein. Approaches that represent structures as lists of coordinates are compared with approaches that represent them as distance matrices.
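The first problem, probing a structure with a segment, can be illustrated with a minimal sketch based on the distance-matrix representation. This is not the method of the paper: the function names (`distance_matrix`, `dm_difference`, `probe_search`), the use of a root-mean-square difference between matrix entries as the score, the restriction to contiguous windows, and the threshold value are all assumptions made for illustration. Each structure is taken to be a list of (x, y, z) points, for example one point per residue.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def dm_difference(d1, d2):
    """Root-mean-square difference over corresponding entries with i < j.

    Both matrices must have the same size. A score of 0 means every
    pairwise distance is the same in the two point sets.
    """
    n = len(d1)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += (d1[i][j] - d2[i][j]) ** 2
            count += 1
    return math.sqrt(total / count) if count else 0.0


def probe_search(probe, target, threshold):
    """Slide the probe along the target and report similar windows.

    Returns a list of (start_index, score) for every contiguous window
    of the target whose distance matrix differs from the probe's by at
    most `threshold` (a hypothetical, user-chosen cutoff).
    """
    k = len(probe)
    d_probe = distance_matrix(probe)
    hits = []
    for start in range(len(target) - k + 1):
        window = target[start:start + k]
        score = dm_difference(d_probe, distance_matrix(window))
        if score <= threshold:
            hits.append((start, score))
    return hits


# Hypothetical data: the target contains a rotated and translated copy
# of the probe at positions 2..5.
probe = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
target = [(10, 10, 10), (12, 10, 10),
          (5, 5, 5), (5, 6, 5), (4, 6, 5), (4, 6, 6),
          (20, 0, 0)]
hits = probe_search(probe, target, threshold=1e-6)
```

Because interpoint distances are unchanged by rotation and translation, no superposition is needed before comparing windows. The same invariance extends to reflection, so a distance-matrix score alone cannot tell a fragment from its mirror image; a coordinate-based comparison, which does require an optimal superposition, can.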







Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Arthur M. Lesk (1)
  1. MRC Centre, University of Cambridge Clinical School, Cambridge, UK
