Molecular Descriptors for Structure–Activity Applications: A Hands-On Approach

Computational Toxicology

Part of the book series: Methods in Molecular Biology (MIMB, volume 1800)

Abstract

Molecular descriptors capture diverse parts of the structural information of molecules and underpin many contemporary computer-assisted toxicological and chemical applications. After briefly introducing some fundamental concepts of structure–activity applications (e.g., molecular descriptor dimensionality, classical vs. fingerprint description, and activity landscapes), this chapter guides readers step by step through the rationale behind molecular descriptors and their application. To this end, it presents a case study based on a recently published application of molecular descriptors to modeling activity on cytochrome P450.

Author information

Corresponding author

Correspondence to Francesca Grisoni.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Grisoni, F., Ballabio, D., Todeschini, R., Consonni, V. (2018). Molecular Descriptors for Structure–Activity Applications: A Hands-On Approach. In: Nicolotti, O. (eds) Computational Toxicology. Methods in Molecular Biology, vol 1800. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7899-1_1

  • DOI: https://doi.org/10.1007/978-1-4939-7899-1_1

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7898-4

  • Online ISBN: 978-1-4939-7899-1

  • eBook Packages: Springer Protocols
