Similarity Searching In Chemical Databases Using Molecular Fields And Data Fusion

  • Peter Willett
Part of the Mathematical and Computational Chemistry book series (MACC)


Similarity searching provides a popular means of accessing databases of both 2D and 3D chemical structures [1,2], and involves finding those molecules (the nearest neighbours) that are most similar to a user-defined target structure. The target structure is characterised by a set of structural features, and this set is compared with the corresponding sets of features for each of the structures in the database that is to be searched. Each such comparison permits the calculation of a measure of similarity between the target structure and a database structure, using some quantitative definition of inter-molecular structural similarity [3–5]. The database molecules are then sorted into order of decreasing similarity with the target, giving a ranked list in which the nearest neighbours are located at the top of the list and are thus displayed first to the user. Accordingly, if an appropriate measure of similarity has been used, the first database structures inspected will be those that have the greatest probability of being of interest to the user. Since its introduction in the mid-Eighties [6, 7], similarity searching has proved extremely popular with users, who have found that it provides a means of accessing chemical databases that is complementary to the existing structure and substructure searching facilities.
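The search procedure described above (characterise the target by a set of features, score each database structure against it, then sort by decreasing similarity) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the text does not name a specific similarity measure here, so the Tanimoto coefficient is assumed as the quantitative definition, and the feature labels, molecule names and the helper functions `tanimoto` and `rank_by_similarity` are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient between two feature sets.

    Assumed measure for illustration; the source text only requires
    "some quantitative definition" of structural similarity.
    """
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def rank_by_similarity(
    target: FrozenSet[str], database: Dict[str, FrozenSet[str]]
) -> List[Tuple[str, float]]:
    """Score every database structure against the target and return
    (name, similarity) pairs sorted by decreasing similarity."""
    scored = [(name, tanimoto(target, feats)) for name, feats in database.items()]
    # Nearest neighbours first; ties broken by name for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical toy data: each structure is reduced to a set of feature labels.
database = {
    "mol_A": frozenset({"aromatic_ring", "hydroxyl", "carboxyl"}),
    "mol_B": frozenset({"aromatic_ring", "amine"}),
    "mol_C": frozenset({"alkyl_chain", "ester"}),
}
target = frozenset({"aromatic_ring", "hydroxyl"})

ranking = rank_by_similarity(target, database)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real system the feature sets would typically be fragment bit-strings or, for 3D searching, field-based descriptors; the ranking step stays the same whatever measure is substituted for `tanimoto`.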


Keywords: Similarity Measure; Data Fusion; Target Structure; Fusion Rule; Molecular Electrostatic Potential




  1. Downs, G.M. and Willett, P., Rev. Comput. Chem., 7 (1995) 1.
  2. Willett, P., Barnard, J.M. and Downs, G.M., J. Chem. Inf. Comput. Sci., 38 (1998) 983.
  3. Johnson, M.A. and Maggiora, G.M. (editors), Concepts and Applications of Molecular Similarity, John Wiley, New York (1990).
  4. Special issue devoted to molecular similarity, J. Chem. Inf. Comput. Sci., 32 (1992) 577.
  5. Dean, P.M. (editor), Molecular Similarity in Drug Design, Chapman and Hall, Glasgow (1995).
  6. Carhart, R.E., Smith, D.H. and Venkataraghavan, R., J. Chem. Inf. Comput. Sci., 25 (1985) 64.
  7. Willett, P., Winterman, V. and Bawden, D., J. Chem. Inf. Comput. Sci., 26 (1986) 36.
  8. Willett, P., Similarity and Clustering in Chemical Information Systems, Research Studies Press, Letchworth (1987).
  9. Drayton, S.K., Edwards, K., Jewell, N.E., Turner, D.B., Wild, D.J., Willett, P., Wright, P.M. and Simmons, K., Internet J. Chem. at URL
  10. Ginn, C.M.R., Willett, P. and Bradshaw, J., Submitted for publication.
  11. Dean, P.M. and Perkins, T.D.J., In Martin, Y.C. and Willett, P. (editors), Designing Bioactive Molecules: Three-Dimensional Techniques and Applications, American Chemical Society, Washington (1998).
  12. Kubinyi, H., Folkers, G. and Martin, Y.C. (editors), 3D QSAR in Drug Design, Kluwer/ESCOM, Leiden (1998).
  13. Carbó, R., Leyda, L. and Arnau, M., Int. J. Quant. Chem., 17 (1980) 1185.
  14. Manaut, F., Sanz, F., Jose, J. and Milesi, M., J. Comput.-Aid. Mol. Design, 5 (1991) 371.
  15. Richard, A.M., J. Comput. Chem., 12 (1991) 959.
  16. Good, A.C., Hodgkin, E.E. and Richards, W.G., J. Chem. Inf. Comput. Sci., 32 (1992) 188.
  17. Good, A.C. and Richards, W.G., J. Chem. Inf. Comput. Sci., 33 (1993) 112.
  18. Petke, J.D., J. Comput. Chem., 14 (1993) 928.
  19. Mestres, J., Rohrer, D.C. and Maggiora, G.M., J. Comput.-Aid. Mol. Design, 13 (1999) 79.
  20. Turner, D.B., Willett, P., Ferguson, A. and Heritage, T.W., SAR QSAR Environ. Res., 3 (1995) 101.
  21. Thorner, D.A., Willett, P., Wright, P.M. and Taylor, R., J. Comput.-Aid. Mol. Design, 11 (1997) 163.
  22. Wild, D.J. and Willett, P., J. Chem. Inf. Comput. Sci., 36 (1996) 159.
  23. Thorner, D.A., Wild, D.J., Willett, P. and Wright, P.M., J. Chem. Inf. Comput. Sci., 36 (1996) 900.
  24. Lipinski, C.A., Ann. Reports Med. Chem., 21 (1986) 283.
  25. Gaillard, P., Carrupt, P., Testa, B. and Boudon, A., J. Comput.-Aid. Mol. Design, 8 (1994) 83.
  26. Croizet, F., Dubost, J.P., Langlois, M.H. and Audrey, E., Quant. Struct.-Activ. Relat., 10 (1991) 211.
  27. The World Drug Index database is available from Derwent Information at URL
  28. Kearsley, S.K., Sallamack, S., Fluder, E.M., Andose, J.D., Mosley, R.T. and Sheridan, R.P., J. Chem. Inf. Comput. Sci., 36 (1996) 118.
  29. UNITY is available from Tripos Inc. at
  30. Pepperrell, C.A., Willett, P. and Taylor, R., Tetrahed. Comp. Methodol., 3 (1990) 575.
  31. Briem, H. and Kuntz, I.D., J. Med. Chem., 39 (1996) 3401.
  32. Turner, D.B., Tyrrell, S.M. and Willett, P., J. Chem. Inf. Comput. Sci., 37 (1997) 18.
  33. The BIOSTER database is available from Synopsys Systems at URL
  34. Gillet, V.J., Schuffenhauer, A. and Willett, P., Submitted for publication.
  35. Hall, D.L., Mathematical Techniques in Multisensor Data Fusion, Artech House, Northwood MA (1992).
  36. Belkin, N.J., Kantor, P., Fox, E.A. and Shaw, J.B., Inf. Proc. Manag., 31 (1995) 431.
  37. Ginn, C.M.R., Turner, D.B., Willett, P., Ferguson, A.M. and Heritage, T.W., J. Chem. Inf. Comput. Sci., 37 (1997) 23.
  38. Kahn, S.D., In Schleyer, P.V.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer, H.F. and Schreiner, P.R. (editors), Encyclopedia of Computational Chemistry, John Wiley, Chichester (1998).
  39. Molecular Simulations Inc. is at URL
  40. ChemX products are available from Oxford Molecular Limited at URL
  41. Daylight Chemical Information Systems Inc. is at URL
  42.

Copyright information

© Springer Science+Business Media New York 2001

Authors and Affiliations

  • Peter Willett (1)
  1. Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Western Bank, Sheffield, UK
