Analysis of a Large, High-Throughput Screening Data Using Recursive Partitioning

Young, S. Stanley; Sacks, Jerome

doi:10.1007/978-1-4615-4141-7_17


Abstract

As the Human Genome Project multiplies the number of biological drug targets and the number of chemical compounds available for screening grows very large, the expense of screening every compound against every target becomes prohibitive. The drug screening process must become more efficient so that active compounds can be found for more biological targets and turned over to medicinal chemists for atom-by-atom optimization. We create a method for analyzing the very large, complex data sets that come from high-throughput screening, and then integrate the analysis with the selection of compounds for screening, so that the structure-activity (SAR) rules derived from an initial compound set can be used to suggest additional compounds for screening. Cycles of screening and analysis become sequential screening, which replaces the mass screening of all available compounds. We extend the analysis method to deal with multivariate responses. Previously, a screening campaign might screen hundreds of thousands of compounds; sequential screening can cut the number of compounds screened by up to eighty percent. Sequential screening also yields SAR rules that can be used to mathematically screen compound collections or virtual chemical libraries.
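The screen-analyze-select cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each compound is a list of binary descriptors (a feature is present or absent), uses a plain variance-reduction split in place of whatever splitting criterion the chapter uses, and handles only a single response. The names `grow`, `predict`, `sequential_screen`, and the `assay` callback are hypothetical.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, min_size):
    """Pick the binary descriptor whose present/absent split most reduces SSE."""
    parent = sse([y for _, y in rows])
    best_feature, best_gain = None, 0.0
    for f in range(len(rows[0][0])):
        present = [y for x, y in rows if x[f]]
        absent = [y for x, y in rows if not x[f]]
        if len(present) < min_size or len(absent) < min_size:
            continue  # skip splits that leave a child node too small
        gain = parent - sse(present) - sse(absent)
        if gain > best_gain:
            best_feature, best_gain = f, gain
    return best_feature

def grow(rows, min_size=5, depth=3):
    """Recursively partition (descriptors, activity) rows into a tree of dicts."""
    feature = best_split(rows, min_size) if depth > 0 else None
    if feature is None:
        return {"mean": mean([y for _, y in rows]), "n": len(rows)}
    return {
        "feature": feature,
        "present": grow([r for r in rows if r[0][feature]], min_size, depth - 1),
        "absent": grow([r for r in rows if not r[0][feature]], min_size, depth - 1),
    }

def predict(tree, x):
    """Follow the splits for descriptor vector x and return the leaf mean activity."""
    while "feature" in tree:
        tree = tree["present"] if x[tree["feature"]] else tree["absent"]
    return tree["mean"]

def sequential_screen(pool, assay, n_initial, batch, rounds, rng):
    """Assay a random initial set, then repeatedly assay the top-predicted untested compounds."""
    tested = {i: assay(pool[i]) for i in rng.sample(range(len(pool)), n_initial)}
    for _ in range(rounds):
        # Refit the tree on everything assayed so far.
        tree = grow([(pool[i], y) for i, y in tested.items()])
        untested = [i for i in range(len(pool)) if i not in tested]
        untested.sort(key=lambda i: predict(tree, pool[i]), reverse=True)
        for i in untested[:batch]:
            tested[i] = assay(pool[i])
    return tested
```

Each path from the root to a high-mean leaf reads as a rule of the form "feature A present and feature B absent", which is the sense in which a fitted tree can rank compounds that have not yet been assayed. In practice `assay` would be a real measurement, and the batch size and number of rounds would be set by screening cost.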



Author information

Authors and Affiliations

Glaxo Wellcome Inc., 5 Moore Drive RTP, North Carolina, 27709, USA
S. Stanley Young
National Institute of Statistical Science, P.O. Box 14006, RTP, North Carolina, 27709-4006, USA
Jerome Sacks

Editor information

Editors and Affiliations

H. Lundbeck A/S, Valby, Denmark
Klaus Gundertofte
Royal Danish School of Pharmacy, Copenhagen, Denmark
Flemming Steen Jørgensen

About this chapter

Cite this chapter

Young, S.S., Sacks, J. (2000). Analysis of a Large, High-Throughput Screening Data Using Recursive Partitioning. In: Gundertofte, K., Jørgensen, F.S. (eds) Molecular Modeling and Prediction of Bioactivity. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-4141-7_17

DOI: https://doi.org/10.1007/978-1-4615-4141-7_17
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-6857-1
Online ISBN: 978-1-4615-4141-7
eBook Packages: Springer Book Archive
