Application of Chemoinformatics to High-Throughput Screening

Parker, Christian N.; Schreyer, Suzanne K.

doi:10.1385/1-59259-802-1:085

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 275)

Abstract

The objective of this chapter is to summarize and evaluate some of the most common chemoinformatic methods applied to the analysis of high-throughput-screening data. The chapter briefly describes current high-throughput-screening practices and stresses that the major constraint on the application of chemoinformatics is often the quality of the high-throughput-screening data. The NCI dataset, and how it differs from most high-throughput-screening datasets, is discussed to highlight this point.

Author information

Authors and Affiliations

Novartis Institute for BioMedical Research, Cambridge, Massachusetts, USA
Christian N. Parker
Chemical Computing Group Inc., Montreal, Quebec, Canada
Suzanne K. Schreyer

Editor information

Editors and Affiliations

Albany Molecular Research Inc., Bothell Research Center, Bothell, WA
Jürgen Bajorath
University of Washington, Seattle, WA
Jürgen Bajorath

About this protocol

Cite this protocol

Parker, C.N., Schreyer, S.K. (2004). Application of Chemoinformatics to High-Throughput Screening. In: Bajorath, J. (eds) Chemoinformatics. Methods in Molecular Biology™, vol 275. Humana Press. https://doi.org/10.1385/1-59259-802-1:085

DOI: https://doi.org/10.1385/1-59259-802-1:085
Publisher Name: Humana Press
Print ISBN: 978-1-58829-261-2
Online ISBN: 978-1-59259-802-1
eBook Packages: Springer Protocols
