Cheminformatics Approaches to Study Drug Polypharmacology

  • J. Jesús Naveja
  • Fernanda I. Saldívar-González
  • Norberto Sánchez-Cruz
  • José L. Medina-Franco
Part of the Methods in Pharmacology and Toxicology book series (MIPT)


This chapter presents a tutorial overview of selected chemoinformatics methods for assembling and curating a chemical database and for assessing its diversity and coverage of chemical space. Methods for evaluating structure–activity relationships (SAR) and polypharmacology are also included. The use of open-source tools is emphasized, and step-by-step KNIME workflows illustrate the methods. The methods are applied to a chemical database especially relevant to epi-polypharmacology, an emerging area in drug discovery. However, they could be extended to other therapeutic areas and potentially to other areas of chemistry.


Keywords: Chemoinformatics; ChemMaps; Chemical space; Data mining; Epigenetics; Epi-informatics; KNIME; Molecular diversity; Open-access; Polypharmacology; Structure–activity relationships; SmARt



This work was supported by the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT) grant IA203718 and National Council of Science and Technology (CONACyT), Mexico grant number 282785. JJN, FIS-G, and NS-C are thankful to CONACyT for the granted scholarships number 622969, 629458, and 335997, respectively.

Supplementary material

457674_1_En_6_MOESM1_ESM.knwf (169 kb)
Supplementary KNIME Workflow 1 Chemical preprocessing and database curation (KNWF 168 kb)
457674_1_En_6_MOESM2_ESM.knwf (93 kb)
Supplementary KNIME Workflow 2 Chemical diversity analysis (KNWF 92 kb)
457674_1_En_6_MOESM3_ESM.knwf (128 kb)
Supplementary KNIME Workflow 3 Consensus diversity plots (KNWF 127 kb)
457674_1_En_6_MOESM4_ESM.knwf (212 kb)
Supplementary KNIME Workflow 4 SmARt analyses (KNWF 211 kb)
457674_1_En_6_MOESM5_ESM.knwf (400 kb)
Supplementary KNIME Workflow 5 Chemical space (KNWF 399 kb)



Copyright information

© Springer Science+Business Media New York 2018

Authors and Affiliations

  • J. Jesús Naveja (1, 2)
  • Fernanda I. Saldívar-González (1)
  • Norberto Sánchez-Cruz (1)
  • José L. Medina-Franco (1), corresponding author
  1. Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City, Mexico
  2. PECEM, School of Medicine, Universidad Nacional Autónoma de México, Mexico City, Mexico
