# Approximation Algorithms for 3-D Common Substructure Identification in Drug and Protein Molecules

## Abstract

Identifying the common 3-D substructure between two drug or protein molecules is an important problem in synthetic drug design and molecular biology. This problem can be represented as the following geometric pattern matching problem: given two point sets *A* and *B* in three dimensions and a real number *ε* > 0, find the maximum-cardinality subset *S* ⊆ *A* for which there is an isometry *I* such that each point of *I*(*S*) is within distance *ε* of a distinct point of *B*. Since this problem is difficult to solve exactly, in this paper we propose several approximation algorithms with guaranteed approximation ratios. Our algorithms can be classified into two groups. In the first, we extend the notion of partial decision algorithms for *ε*-congruence of point sets in 2-D in order to approximate the size of *S*. All the algorithms in this class satisfy the constraint imposed by *ε* exactly. In the second class of algorithms, this constraint is satisfied only approximately. For this second class, we improve the known approximation ratio while keeping the time complexity unchanged; for the existing approximation ratio, we propose algorithms with substantially better running times. We also suggest several improvements of our basic algorithms, all of which have a running time of *O*(*n*^{8.5}). These improvements consist of using randomization and/or an approximate maximum matching scheme for bipartite graphs.
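
To make the problem statement concrete, the following minimal sketch evaluates one *fixed* candidate isometry: it builds the bipartite graph joining each transformed point of *A* to every point of *B* within distance *ε*, and returns the size of a maximum matching, which is the size of the largest admissible *S* for that isometry. This is not one of the approximation algorithms of the paper, which must also search over isometries. The function names, the rotation-plus-translation representation of the isometry, and the simple augmenting-path matching routine (used here in place of a faster or approximate matching scheme) are all assumptions made for illustration.

```python
import math


def apply_isometry(rotation, translation, p):
    """Map a 3-D point p to R p + t (R: 3x3 row-major matrix, t: 3-vector)."""
    return tuple(
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def matched_subset_size(A, B, rotation, translation, eps):
    """Largest |S|, S a subset of A, such that each transformed point of S
    lies within eps of a distinct point of B, for this one isometry."""
    images = [apply_isometry(rotation, translation, a) for a in A]

    # adj[i] lists the indices j of points of B within eps of the image of A[i].
    adj = [
        [j for j, b in enumerate(B) if math.dist(img, b) <= eps]
        for img in images
    ]

    match_of_b = [-1] * len(B)  # match_of_b[j] = index in A matched to B[j]

    def try_augment(i, seen):
        # Standard augmenting-path search (illustrative; not tuned for speed).
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_b[j] == -1 or try_augment(match_of_b[j], seen):
                match_of_b[j] = i
                return True
        return False

    return sum(1 for i in range(len(A)) if try_augment(i, set()))
```

The requirement that matched points of *B* be distinct is what turns the check into a matching problem rather than a nearest-neighbour test: two points of *A* that both land near the same point of *B* contribute only one element to *S*.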

## Keywords

Bipartite Graph, Approximation Ratio, Computational Geometry, Decision Algorithm, Bijective Mapping
