Enhancing the Effectiveness of Fingerprint-Based Virtual Screening: Use of Turbo Similarity Searching and of Fragment Frequencies of Occurrence

  • Shereena M. Arif
  • Jérôme Hert
  • John D. Holliday
  • Nurul Malim
  • Peter Willett
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5780)


Binary fingerprints that encode the presence of 2D fragment substructures in molecules are used extensively for similarity-based virtual screening in the agrochemical and pharmaceutical industries. This paper describes two techniques for enhancing the effectiveness of such screening: a second-level search based on the nearest neighbours of the initial reference structure (turbo similarity searching), and weighted fingerprints that encode the frequency of occurrence, rather than the mere presence, of substructures. Experiments on several databases for which both structural and bioactivity data are available demonstrate the effectiveness of both approaches.
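The two ideas in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which similarity coefficient, fusion rule, or number of nearest neighbours the paper uses, so everything here is an assumption made for illustration: the Tanimoto coefficient generalised to count vectors, fusion by taking the maximum similarity over all reference structures, the hypothetical parameter `k`, and a toy four-molecule database with invented fingerprints.

```python
from typing import Dict, List, Sequence

# A fingerprint is a vector of fragment counts. Restricting the values
# to 0/1 gives an ordinary binary (presence/absence) fingerprint.
Fingerprint = Sequence[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient for non-negative count vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def binarise(fp: Fingerprint) -> List[int]:
    """Discard frequency information, keeping only fragment presence."""
    return [1 if x > 0 else 0 for x in fp]


def rank(reference: Fingerprint, database: Dict[str, Fingerprint]) -> List[str]:
    """Conventional similarity search: sort the database by similarity."""
    return sorted(database,
                  key=lambda name: tanimoto(reference, database[name]),
                  reverse=True)


def turbo_search(reference: Fingerprint,
                 database: Dict[str, Fingerprint],
                 k: int = 1) -> List[str]:
    """Second-level search using the k nearest neighbours as extra references.

    Scores are fused with the maximum rule, which is an assumption here.
    """
    first_pass = rank(reference, database)
    references = [reference] + [database[name] for name in first_pass[:k]]
    fused = {name: max(tanimoto(r, fp) for r in references)
             for name, fp in database.items()}
    return sorted(fused, key=fused.get, reverse=True)


# Toy data, invented purely for illustration.
reference = [1, 1, 1, 0, 0, 0]
database = {
    "m1": [1, 1, 1, 1, 0, 0],
    "m2": [1, 1, 0, 0, 0, 0],
    "m3": [0, 1, 1, 1, 0, 0],
    "m4": [0, 0, 0, 0, 1, 1],
}

conventional = rank(reference, database)
turbo = turbo_search(reference, database, k=1)

# Frequency weighting: the same pair compared with and without counts.
weighted = tanimoto([3, 1, 0], [1, 1, 0])
unweighted = tanimoto(binarise([3, 1, 0]), binarise([1, 1, 0]))
```

In this toy case the conventional search ranks `m2` above `m3`, while the turbo search promotes `m3` because it closely resembles the nearest neighbour `m1`. The last two lines show that two molecules with identical fragment sets but different fragment counts are indistinguishable in the binary representation and only separated once frequencies are encoded.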


Keywords: Chemoinformatics; Fingerprint; Fragment substructure; Similarity measure; Similarity searching; Turbo similarity searching; Virtual screening; Weighting scheme



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Shereena M. Arif (1)
  • Jérôme Hert (1)
  • John D. Holliday (1)
  • Nurul Malim (1)
  • Peter Willett (1)

  1. Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Sheffield, United Kingdom
