Similarity Searching In Chemical Databases Using Molecular Fields And Data Fusion
Similarity searching provides a popular means of accessing databases of both 2D and 3D chemical structures [1,2], and involves finding those molecules (the nearest neighbours) that are most similar to a user-defined target structure. The target structure is characterised by a set of structural features, and this set is compared with the corresponding sets of features for each of the structures in the database that is to be searched. Each such comparison permits the calculation of a measure of similarity between the target structure and a database structure, using some quantitative definition of inter-molecular structural similarity [3–5]. The database molecules are then sorted into order of decreasing similarity with the target, giving a ranked list in which the nearest neighbours are located at the top of the list and are thus displayed first to the user. Accordingly, if an appropriate measure of similarity has been used, the first database structures inspected will be those that have the greatest probability of being of interest to the user. Since its introduction in the mid-1980s [6,7], similarity searching has proved extremely popular with users, who have found that it provides a means of accessing chemical databases that is complementary to the existing structure and substructure searching facilities.
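The procedure described above (characterise the target by a feature set, score every database structure against it, then sort by decreasing similarity) can be illustrated with a minimal sketch. The abstract does not name a specific similarity measure, so the Tanimoto coefficient used here is an assumption chosen only because it is a common choice for feature sets; the feature labels, molecule names, and the functions `tanimoto` and `similarity_search` are likewise hypothetical and do not come from the chapter.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient between two feature sets (assumed measure)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def similarity_search(
    target: FrozenSet[str],
    database: Dict[str, FrozenSet[str]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Rank database structures by decreasing similarity to the target."""
    scored = [(name, tanimoto(target, feats)) for name, feats in database.items()]
    # Highest similarity first; ties are broken by name so the order is stable.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored if top_k is None else scored[:top_k]


# Toy data: hypothetical feature labels standing in for structural features.
target = frozenset({"aromatic_ring", "hydroxyl", "carboxyl"})
database = {
    "mol_A": frozenset({"aromatic_ring", "hydroxyl", "carboxyl"}),
    "mol_B": frozenset({"aromatic_ring", "hydroxyl", "amine"}),
    "mol_C": frozenset({"amine", "alkyl_chain"}),
    "mol_D": frozenset({"aromatic_ring", "carboxyl"}),
}

ranking = similarity_search(target, database)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy case the nearest neighbours appear at the top of the ranked list, which is the property the text relies on: a user who inspects only the first few entries sees the structures judged most similar to the target under the chosen measure.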
Keywords: Similarity Measure, Data Fusion, Target Structure, Fusion Rule, Molecular Electrostatic Potential
- 1. Downs, G.M. and Willett, P., Rev. Comput. Chem., 7 (1995) 1.
- 3. Johnson, M.A. and Maggiora, G.M. (editors), Concepts and Applications of Molecular Similarity, John Wiley, New York (1990).
- 4. Special issue devoted to molecular similarity, J. Chem. Inf. Comput. Sci., 32 (1992) 577.
- 5. Dean, P.M. (editor), Molecular Similarity in Drug Design, Chapman and Hall, Glasgow (1995).
- 8. Willett, P., Similarity and Clustering in Chemical Information Systems, Research Studies Press, Letchworth (1987).
- 9. Drayton, S.K., Edwards, K., Jewell, N.E., Turner, D.B., Wild, D.J., Willett, P., Wright, P.M. and Simmons, K., Internet J. Chem. at URL http://www.ijc.com/articles/1998v1/37/.
- 10. Ginn, C.M.R., Willett, P. and Bradshaw, J., Submitted for publication.
- 11. Dean, P.M. and Perkins, T.D.J., In Martin, Y.C. and Willett, P. (editors), Designing Bioactive Molecules: Three-Dimensional Techniques and Applications, American Chemical Society, Washington (1998).
- 12. Kubinyi, H., Folkers, G. and Martin, Y.C. (editors), 3D QSAR in Drug Design, Kluwer/ESCOM, Leiden (1998).
- 27. The World Drug Index database is available from Derwent Information at URL http://www.derwent.co.uk
- 29. UNITY is available from Tripos Inc. at URL http://www.tripos.com
- 33. The BIOSTER database is available from Synopsys Systems at URL http://www.synopsys.co.uk/
- 34. Gillet, V.J., Schuffenhauer, A. and Willett, P., Submitted for publication.
- 35. Hall, D.L., Mathematical Techniques in Multisensor Data Fusion, Artech House, Northwood MA (1992).
- 38. Kahn, S.D., In Schleyer, P.V.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer, H.F. and Schreiner, P.R. (editors), Encyclopedia of Computational Chemistry, John Wiley, Chichester (1998).
- 39. Molecular Simulations Inc. is at URL http://www.msi.com
- 40. ChemX products are available from Oxford Molecular Limited at URL http://www.oxmol.co.uk
- 41. Daylight Chemical Information Systems Inc. is at URL http://www.daylight.com
- 42. Bradshaw, J., at URL http://www.daylight.com/meetings/mug97/Bradshaw/MUG97/tv_tversky.html