Partitioning in Binary-Transformed Chemical Descriptor Spaces

  • Protocol
Chemoinformatics

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 275)

Abstract

Here we describe a statistically based partitioning method called median partitioning (MP), which transforms the value distributions of molecular property descriptors into a binary classification scheme. MP differs fundamentally from partitioning approaches that rely on dimension reduction of chemical space, such as cell-based partitioning, because MP operates directly in the original, albeit simplified, chemical space. Modified versions of the MP algorithm have been implemented and successfully applied in diversity selection, compound classification, and virtual screening. These findings demonstrate that dimension reduction techniques, although elegant in their design, are not necessarily required for effective partitioning of molecular datasets. An attractive feature of statistical partitioning approaches such as decision tree methods or MP is their computational efficiency, which is becoming an important criterion for the analysis of compound databases containing millions of molecules.
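The binary transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `median_partition`, the toy compound identifiers, and the descriptor values are hypothetical, and the choice to encode values strictly above the median as 1 (and all others as 0) is an assumption made for illustration. Each descriptor contributes one bit, so n descriptors yield at most 2^n partitions, and compounds sharing the same bit string fall into the same partition.

```python
from collections import defaultdict
from statistics import median


def median_partition(descriptors):
    """Group compounds by a bit string derived from per-descriptor medians.

    descriptors: dict mapping a compound id to a list of descriptor values
    (all lists must have the same length).
    Returns a dict mapping each bit string to the list of compound ids in it.
    """
    ids = list(descriptors)
    n_desc = len(descriptors[ids[0]])
    # Median of each descriptor, computed over the whole compound set.
    medians = [median(descriptors[c][j] for c in ids) for j in range(n_desc)]

    partitions = defaultdict(list)
    for c in ids:
        # Assumption: values strictly above the median map to "1", others to "0".
        code = "".join(
            "1" if value > m else "0"
            for value, m in zip(descriptors[c], medians)
        )
        partitions[code].append(c)
    return dict(partitions)


# Hypothetical toy data: two descriptors per compound (made-up values).
toy = {
    "cpd_a": [180.2, 1.2],
    "cpd_b": [320.4, 3.1],
    "cpd_c": [250.3, 0.4],
    "cpd_d": [410.5, 2.2],
}
result = median_partition(toy)
for code, members in sorted(result.items()):
    print(code, members)
```

Because the sketch needs only one median per descriptor and one pass over the compounds, its cost grows roughly linearly with the number of compounds once the medians are known, which is in keeping with the computational efficiency the abstract emphasizes.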

Copyright information

© 2004 Humana Press Inc.

About this protocol

Cite this protocol

Godden, J.W., Bajorath, J. (2004). Partitioning in Binary-Transformed Chemical Descriptor Spaces. In: Bajorath, J. (eds) Chemoinformatics. Methods in Molecular Biology™, vol 275. Humana Press. https://doi.org/10.1385/1-59259-802-1:291

  • DOI: https://doi.org/10.1385/1-59259-802-1:291

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-261-2

  • Online ISBN: 978-1-59259-802-1

  • eBook Packages: Springer Protocols
