
Impact of Molecular Descriptors on Computational Models

  • Francesca Grisoni (corresponding author)
  • Viviana Consonni
  • Roberto Todeschini
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1825)

Abstract

Molecular descriptors encode a wide variety of molecular information and underpin many contemporary chemoinformatic and bioinformatic applications. They capture specific molecular features (e.g., geometry, shape, pharmacophores, or atomic properties) and directly affect the outcome, performance, and applicability of computational models. This chapter illustrates how different molecular descriptors affect the structural information captured and the perceived chemical similarity among molecules. After introducing the fundamental concepts of molecular descriptor theory and application, a step-by-step retrospective virtual screening procedure guides users through the essential processing steps and discusses the impact of different types of molecular descriptors.
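
As a rough illustration of how the choice of descriptors can change the perceived similarity among molecules, the following minimal sketch ranks two candidate molecules against a query under two representations. All names and values (`fingerprints`, `properties`, `mol_a`, `mol_b`) are hypothetical toy data chosen for illustration and are not taken from the chapter; the Jaccard-Tanimoto coefficient and the Euclidean distance are used here only as common examples of similarity and distance measures.

```python
from math import sqrt

# Hypothetical toy data: illustrative values, not computed from real structures.
# Representation A: binary substructure fingerprints, stored as sets of "on" bits.
fingerprints = {
    "query": {1, 2, 3, 4},
    "mol_a": {1, 2, 3, 5},
    "mol_b": {4, 6, 7, 8},
}
# Representation B: two continuous descriptors, assumed already scaled to [0, 1].
properties = {
    "query": (0.50, 0.20),
    "mol_a": (0.90, 0.80),
    "mol_b": (0.55, 0.25),
}


def tanimoto(a, b):
    """Jaccard-Tanimoto similarity between two sets of on-bits."""
    return len(a & b) / len(a | b)


def euclidean(x, y):
    """Euclidean distance between two descriptor vectors."""
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def nearest_by_fingerprint(query, candidates):
    """Candidate with the highest Tanimoto similarity to the query."""
    return max(candidates, key=lambda m: tanimoto(fingerprints[query], fingerprints[m]))


def nearest_by_properties(query, candidates):
    """Candidate with the smallest Euclidean distance to the query."""
    return min(candidates, key=lambda m: euclidean(properties[query], properties[m]))


if __name__ == "__main__":
    candidates = ["mol_a", "mol_b"]
    print("fingerprints:", nearest_by_fingerprint("query", candidates))
    print("properties:  ", nearest_by_properties("query", candidates))
```

With these toy values, the fingerprint representation ranks `mol_a` closest to the query, whereas the continuous descriptors rank `mol_b` closest: the same molecules appear similar or dissimilar depending on the information the descriptors encode.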

Key words

Molecular descriptors; Molecular similarity; Chemical space; Mathematical chemistry; Virtual screening; Similarity search; Distance measure

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Francesca Grisoni (1), corresponding author
  • Viviana Consonni (1)
  • Roberto Todeschini (1)
  1. Department of Earth and Environmental Sciences, Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Milan, Italy
