Multi-task generative topographic mapping in virtual screening

Lin, Arkadii; Horvath, Dragos; Marcou, Gilles; Beck, Bernd; Varnek, Alexandre

doi:10.1007/s10822-019-00188-x

Published: 09 February 2019

Volume 33, pages 331–343, (2019)

Journal of Computer-Aided Molecular Design

Arkadii Lin¹,²,
Dragos Horvath¹,
Gilles Marcou¹,
Bernd Beck² &
Alexandre Varnek¹ (ORCID: orcid.org/0000-0003-1886-925X)

Abstract

The previously reported procedure for generating “universal” Generative Topographic Maps (GTMs) of the drug-like chemical space is, in practice, a multi-task learning process in which both operational GTM parameters (for example, the map grid size) and hyperparameters (most importantly, the molecular descriptor space to be used) are chosen by an evolutionary process in order to fit and select “universal” GTM manifolds. Selection is a one-time task that optimizes the compromise in neighborhood behavior compliance over a large pool of diverse biological targets; once selected, the manifolds are ready to provide “fit-free” predictive models for any further use. For any structure–activity set, irrespective of whether the associated target served at the map fitting stage, generating (“coloring”) a property landscape enables prediction of that property for any external molecule, with no additional fittable parameters involved. While previous works reported excellent behavior of such models in aggressive three-fold cross-validation assessments of their predictive power, the present work explores their behavior in Virtual Screening (VS), simulated here using external DUD ligand and decoy series that are fully disjoint from the ChEMBL-extracted landscape coloring sets. Beyond the rather robust results of the universal GTM manifolds in this challenge, the descriptor spaces selected by the evolutionary multi-task learner proved intrinsically able to serve as excellent support for many other VS procedures, ranging from parameter-free similarity searching, through local (target-specific) GTM models, to parameter-rich, nonlinear Random Forest and Neural Network approaches.
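
To make the idea of "fit-free" landscape coloring more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a pre-fitted manifold has already projected each molecule onto the map as a vector of node responsibilities (each row summing to 1), and it uses a responsibility-weighted class fraction per node, which is one common way to color a GTM activity landscape; the abstract itself does not specify the weighting. The function names (`color_landscape`, `predict_scores`), the `prior` fallback for unpopulated nodes, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def color_landscape(resp, labels, prior=0.5):
    """Color an activity landscape from training-set responsibilities.

    resp   : (n_molecules, n_nodes) array, rows sum to 1. Assumed to come
             from projecting molecules onto an already fitted manifold.
    labels : (n_molecules,) array of 0/1 activity labels.
    prior  : hypothetical fallback value for nodes no molecule populates.

    Returns, for each node, the fraction of its responsibility mass that
    is carried by active molecules. No parameter is fitted here.
    """
    resp = np.asarray(resp, dtype=float)
    labels = np.asarray(labels, dtype=float)
    active_mass = resp.T @ labels      # responsibility of actives per node
    total_mass = resp.sum(axis=0)      # responsibility of all molecules per node
    landscape = np.full(resp.shape[1], float(prior))
    populated = total_mass > 1e-12
    landscape[populated] = active_mass[populated] / total_mass[populated]
    return landscape


def predict_scores(resp_new, landscape):
    """Score external molecules as the responsibility-weighted landscape value."""
    return np.asarray(resp_new, dtype=float) @ landscape


# Toy coloring set: 3 molecules on a 3-node map (made-up responsibilities).
train_resp = np.array([
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
])
train_labels = np.array([1, 1, 0])  # 1 = active, 0 = inactive
landscape = color_landscape(train_resp, train_labels)

# Toy screening set: 2 external molecules projected on the same map.
screen_resp = np.array([
    [0.8, 0.2, 0.0],
    [0.0, 1.0, 0.0],
])
scores = predict_scores(screen_resp, landscape)
ranking = np.argsort(-scores)  # indices of screened molecules, best first

print("landscape:", landscape)
print("scores:", scores)
print("ranking:", ranking)
```

In a virtual screening setting, the resulting ranking would then be compared against the known ligand and decoy labels of the external set, for instance with a ROC AUC; that evaluation step is left out of the sketch.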

Abbreviations

GTM: Generative topographic mapping
UGTM: Universal generative topographic mapping
GA: Genetic algorithm
CV: Cross-validation
DUD: Directory of Useful Decoys
NN: Neural network
RF: Random forest

Funding

The project leading to this article has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant agreement No 676434, “Big Data in Chemistry” (“BIGCHEM”, http://bigchem.eu).

Author information

Authors and Affiliations

Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France
Arkadii Lin, Dragos Horvath, Gilles Marcou & Alexandre Varnek
Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
Arkadii Lin & Bernd Beck

Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Corresponding author

Correspondence to Alexandre Varnek.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 31 KB)

Supplementary material 2 (ZIP 2577 KB)

About this article

Cite this article

Lin, A., Horvath, D., Marcou, G. et al. Multi-task generative topographic mapping in virtual screening. J Comput Aided Mol Des 33, 331–343 (2019). https://doi.org/10.1007/s10822-019-00188-x

Received: 15 September 2018
Accepted: 02 February 2019
Published: 09 February 2019
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s10822-019-00188-x
