Journal of Computer-Aided Molecular Design, Volume 26, Issue 3, pp 279–287

Ligand expansion in ligand-based virtual screening using relevance feedback

  • Ammar Abdo
  • Faisal Saeed
  • Hentabli Hamza
  • Ali Ahmed
  • Naomie Salim

Abstract

Query expansion is the process of reformulating an original query to improve retrieval performance in information retrieval systems. Relevance feedback is one of the most useful query modification techniques in such systems. In this paper, we introduce query expansion into ligand-based virtual screening (LBVS) using the relevance feedback technique. In this approach, a few high-ranking molecules of unknown activity are filtered from the output of a Bayesian inference network search based on a single ligand molecule to form a set of ligand molecules. This set of ligand molecules is then used to form a new ligand molecule. Simulated virtual screening experiments with the MDL Drug Data Report and maximum unbiased validation data sets show that ligand expansion provides a very simple way of improving LBVS, especially when the active molecules being sought have a high degree of structural heterogeneity. However, the effectiveness of ligand expansion is slightly lower when structurally homogeneous sets of actives are being sought.
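
The ligand expansion loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Bayesian inference network is replaced here by a plain Tanimoto coefficient over fragment sets, and the new ligand is assumed to be the union of the fragments of the original ligand and its top-ranked neighbours. The function names, the parameter `k`, and the representation of a molecule as a set of fragment labels are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a molecule is represented as a set of fragment labels.
Fingerprint = FrozenSet[str]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two fragment sets.

    Stand-in for the Bayesian inference network scoring used in the paper.
    """
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank(query: Fingerprint,
         library: Dict[str, Fingerprint]) -> List[Tuple[str, float]]:
    """Rank library molecules by decreasing similarity to the query.

    Ties are broken by molecule name so the ordering is deterministic.
    """
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


def expand_ligand(query: Fingerprint,
                  library: Dict[str, Fingerprint],
                  k: int) -> Fingerprint:
    """Build an expanded ligand from the query and its k top-ranked neighbours.

    Assumption: the neighbours are merged by taking the union of fragments.
    Their activity is unknown; they are only assumed to be relevant.
    """
    expanded = set(query)
    for name, _score in rank(query, library)[:k]:
        expanded |= library[name]
    return frozenset(expanded)


def screen_with_expansion(query: Fingerprint,
                          library: Dict[str, Fingerprint],
                          k: int = 2) -> List[Tuple[str, float]]:
    """Run an initial search, expand the ligand, then search again."""
    return rank(expand_ligand(query, library, k), library)
```

In this sketch the second search can retrieve molecules that share no fragments with the original ligand but do share fragments with its nearest neighbours, which is the intuition behind the reported gains for structurally heterogeneous sets of actives.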

Keywords

  • Virtual screening
  • Bayesian inference network
  • Ligand expansion
  • Nearest neighbours
  • Similarity searching
  • Drug discovery

Notes

Acknowledgments

This work is supported by the Ministry of Higher Education (MOHE) and the Research Management Centre (RMC) at the Universiti Teknologi Malaysia (UTM) under the Research University Grant Category (VOT Q.J130000.7128.00H72).

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Ammar Abdo (1, 2)
  • Faisal Saeed (1)
  • Hentabli Hamza (1)
  • Ali Ahmed (1)
  • Naomie Salim (1)

  1. Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, Skudai, Malaysia
  2. Computer Science Department, Hodeidah University, Hodeidah, Yemen