Methods to Improve Ranking Chemical Structures in Ligand-Based Virtual Screening

  • Mohammed Mumtaz Al-Dabbagh
  • Naomie Salim
  • Faisal Saeed
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1073)


One of the main tasks in chemoinformatics is searching screening databases for active chemical compounds. These databases can contain thousands or millions of chemical structures, so there is a growing need for computational methods that reduce the time and cost of drug discovery and design. Computational tools can rank chemical compounds according to their chances of clinical success. This paper highlights the techniques that have been used to improve the ranking of chemical structures in similarity searching methods, grouped into two categories. First, a taxonomy of the machine learning techniques used to rank chemical structures is introduced. Second, alternative chemical ranking approaches that can replace classical ranking criteria to enhance the performance of similarity searching methods are discussed.


Molecular ranking; Machine learning techniques; Ranking chemical compounds; Ligand-based; Virtual screening; Alternative ranking techniques
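
As a rough illustration of what a classical ranking criterion in similarity searching can look like, the sketch below sorts database compounds by their Tanimoto coefficient to a single reference structure. The abstract does not name a specific coefficient or fingerprint, so the choice of Tanimoto, the representation of fingerprints as sets of on-bit indices, and the names `tanimoto`, `rank_by_similarity`, and `mol_a` to `mol_c` are all assumptions made for this example, not the authors' method.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def rank_by_similarity(reference, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints; real ones would come from a cheminformatics toolkit.
reference = {1, 4, 7, 9}
database = {
    "mol_b": {1, 4, 8},
    "mol_c": {2, 3, 5},
    "mol_a": {1, 4, 7, 9},
}

ranking = rank_by_similarity(reference, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The machine learning and alternative ranking approaches surveyed in the paper would replace or re-order the output of a baseline of this kind.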



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Mohammed Mumtaz Al-Dabbagh (1)
  • Naomie Salim (2)
  • Faisal Saeed (3)
  1. Tishk International University, Erbil, Iraq
  2. Universiti Teknologi Malaysia, Skudai, Malaysia
  3. College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
