Abstract
As the Human Genome Project multiplies the number of biological drug targets and the number of chemical compounds available for screening grows very large, the expense of screening every compound against every target becomes prohibitive. The efficiency of the drug screening process must improve so that active compounds can be found for more biological targets and turned over to medicinal chemists for atom-by-atom optimization. We develop a method for analyzing the very large, complex data sets that come from high-throughput screening, and we integrate the analysis with the selection of compounds for screening, so that the structure-activity relationship (SAR) rules derived from an initial compound set can be used to suggest additional compounds to screen. Repeated cycles of screening and analysis constitute sequential screening, which replaces the mass screening of all available compounds. We also extend the analysis method to handle multivariate responses. Where a conventional screening campaign might screen hundreds of thousands of compounds, sequential screening can cut the number of compounds screened by up to eighty percent. Sequential screening also yields SAR rules that can be used to mathematically screen compound collections or virtual chemical libraries.
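The screen-analyze-select cycle described above can be illustrated with a small sketch. This is not the chapter's implementation: it assumes binary structural descriptors, a 0/1 activity readout, and a single best split per cycle (one level of a recursive partitioning tree, where a full analysis would keep splitting each subset). The names `best_split`, `sequential_screen`, and `toy_assay`, the 16-compound library, and the initial selection are all hypothetical.

```python
from itertools import product


def best_split(descriptors, activities):
    """Return the index of the binary descriptor whose presence raises the
    mean activity the most, or None if no descriptor raises it."""
    best_feature, best_gap = None, 0.0
    for j in range(len(descriptors[0])):
        present = [a for d, a in zip(descriptors, activities) if d[j]]
        absent = [a for d, a in zip(descriptors, activities) if not d[j]]
        if not present or not absent:
            continue  # this descriptor does not split the screened set
        gap = sum(present) / len(present) - sum(absent) / len(absent)
        if gap > best_gap:
            best_feature, best_gap = j, gap
    return best_feature


def sequential_screen(library, assay, initial, batch_size, cycles):
    """Screen an initial set, then repeatedly derive a one-split rule from
    the results and screen untested compounds that satisfy the rule."""
    screened = {i: assay(library[i]) for i in initial}
    for _ in range(cycles):
        tested = list(screened)
        feature = best_split([library[i] for i in tested],
                             [screened[i] for i in tested])
        if feature is None:
            break  # no rule separates actives from inactives
        batch = [i for i in range(len(library))
                 if i not in screened and library[i][feature]][:batch_size]
        if not batch:
            break  # the rule suggests no further untested compounds
        for i in batch:
            screened[i] = assay(library[i])
    return screened


# Hypothetical toy library: every combination of four binary descriptors.
library = list(product([0, 1], repeat=4))


def toy_assay(descriptor):
    """Assumed ground truth for illustration: active iff descriptor 2 is set."""
    return descriptor[2]


results = sequential_screen(library, toy_assay,
                            initial=[0, 3, 5, 10], batch_size=3, cycles=2)
print(len(results), "compounds screened,", sum(results.values()), "actives found")
```

In this toy run, 10 of the 16 compounds are screened and all 8 actives are found, because the rule learned from the first four compounds steers later batches toward the active descriptor. Real screening data are noisy and far larger, so the numbers here say nothing about the savings reported in the abstract.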
Copyright information
© 2000 Springer Science+Business Media New York
About this chapter
Cite this chapter
Young, S.S., Sacks, J. (2000). Analysis of a Large, High-Throughput Screening Data Using Recursive Partitioning. In: Gundertofte, K., Jørgensen, F.S. (eds) Molecular Modeling and Prediction of Bioactivity. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-4141-7_17
DOI: https://doi.org/10.1007/978-1-4615-4141-7_17
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-6857-1
Online ISBN: 978-1-4615-4141-7
eBook Packages: Springer Book Archive