An Integrated Approach to 2-D and 3-D Similarity Searching for the Cambridge Structural Database (CSD)

  • Conference paper
Chemical Structures 2

Abstract

Similarity searching in chemical databases depends crucially upon the chosen molecular attribute sets. The current 2-D implementation in the Cambridge Structural Database System uses substructural bit screens as attributes. These contain chemical information at a restricted connectivity level around each atom or bond; the only larger pattern units represented are rings and ring systems. Gross pattern attributes can, however, be assigned in terms of inter-nodal bond separation frequencies established using a shortest path algorithm. This information can be used alone (or in combination with the chemical attributes) to provide an alternative (or enhanced) approach to the 2-D problem. In 3-D, similarity concepts have meaning at both the substructural and full structural levels. A specific chemical substructure may exist in a variety of 3-D conformations. A modified Minkowski metric based on torsion angle descriptors is used to compare 3-D shapes. This results in a 1-D ‘conformational spectrum’ graphical representation in which different conformers often appear in well separated groups for rapid identification. At the full molecular level, comparison of complete distance matrices provides the most complete solution. However, due to the vast computational effort this requires, the distance matrix may be reduced to a distance-frequency distribution. Ultimately it is planned that the 2-D (inter-nodal bond separations) and 3-D (Å distances) approaches will be combined to provide suitable descriptors for similarity work.
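The three descriptor types named in the abstract can be illustrated with a minimal sketch. This is not the CSD System's implementation: the function names, the adjacency-list input, the 1.0 Å bin width, and the wrap-around treatment of torsion angles (one plausible reading of the "modified" Minkowski metric, whose modification the abstract does not specify) are all assumptions made here for illustration. Breadth-first search stands in for the shortest path algorithm, which is valid because every bond counts as one step.

```python
from collections import Counter, deque
from itertools import combinations
import math


def bond_separation_frequencies(adjacency):
    """2-D descriptor: number of atom pairs at each shortest-path bond separation.

    `adjacency` maps an integer atom index to a list of bonded atom indices
    (hypothetical input format, assumed for this sketch).
    """
    counts = Counter()
    for source in adjacency:
        # Breadth-first search yields shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        for target, separation in dist.items():
            if target > source:  # count each unordered pair once
                counts[separation] += 1
    return dict(counts)


def distance_frequencies(coords, bin_width=1.0):
    """3-D descriptor: histogram of interatomic distances, keyed by bin index.

    `coords` is a list of (x, y, z) tuples in angstroms; `bin_width` is an
    assumed parameter, not a value taken from the paper.
    """
    counts = Counter()
    for a, b in combinations(coords, 2):
        counts[int(math.dist(a, b) // bin_width)] += 1
    return dict(counts)


def torsion_distance(angles_a, angles_b, order=2):
    """Minkowski distance between two torsion-angle vectors (degrees).

    Differences are wrapped into [0, 180] so that 350 and 10 degrees are
    treated as 20 degrees apart (an assumption of this sketch).
    """
    total = 0.0
    for a, b in zip(angles_a, angles_b):
        diff = abs(a - b) % 360.0
        diff = min(diff, 360.0 - diff)
        total += diff ** order
    return total ** (1.0 / order)
```

For a six-membered ring, `bond_separation_frequencies` gives six pairs at one bond, six at two bonds, and three at three bonds. Reducing a full distance matrix to such a frequency distribution discards which atoms are involved, which is the trade-off the abstract accepts in exchange for a much cheaper comparison.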

Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mitchell, E.M., Allen, F.H., Mitchell, G.F., Rowland, R.S. (1993). An Integrated Approach to 2-D and 3-D Similarity Searching for the Cambridge Structural Database (CSD). In: Warr, W.A. (ed.) Chemical Structures 2. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-78027-1_31

  • DOI: https://doi.org/10.1007/978-3-642-78027-1_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-78029-5

  • Online ISBN: 978-3-642-78027-1

  • eBook Packages: Springer Book Archive
