Application of Chemoinformatics to High-Throughput Screening

Practical Considerations

  • Protocol
Chemoinformatics

Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 275))

Abstract

This chapter summarizes and evaluates some of the most common chemoinformatic methods applied to the analysis of high-throughput-screening data. It briefly describes current high-throughput-screening practices and stresses that the major constraint on the application of chemoinformatics is often the quality of the high-throughput-screening data. The NCI dataset, and how it differs from most high-throughput-screening datasets, is discussed to highlight this point.

Copyright information

© 2004 Humana Press Inc.

About this protocol

Cite this protocol

Parker, C.N., Schreyer, S.K. (2004). Application of Chemoinformatics to High-Throughput Screening. In: Bajorath, J. (ed.) Chemoinformatics. Methods in Molecular Biology™, vol 275. Humana Press. https://doi.org/10.1385/1-59259-802-1:085

  • DOI: https://doi.org/10.1385/1-59259-802-1:085

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-261-2

  • Online ISBN: 978-1-59259-802-1

  • eBook Packages: Springer Protocols
