Accessing the High-Throughput Screening Data Landscape

  • Daniel P. Russo
  • Hao Zhu
Part of the Methods in Molecular Biology book series (MIMB, volume 1473)


Advances in high-throughput screening (HTS) techniques are changing the chemical data landscape by producing massive amounts of biological data for tested compounds. Public data repositories (e.g., PubChem) receive HTS data from various institutes, and this data pool is updated daily. The goal of these data-sharing efforts is to let users quickly obtain the biological data of target compounds. Because there is no universal chemical identifier, these repositories offer users several ways to query and retrieve chemical properties and biological data using different chemical identifiers (e.g., SMILES, InChIKey, and IUPAC name). The major challenge for most users, especially computational modelers, is obtaining biological data for a large dataset of compounds (e.g., thousands of drug molecules) rather than for a single compound. This chapter introduces the steps for accessing public data repositories for target compounds, with specific emphasis on automated data downloading for large datasets.
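As a rough illustration of the large-dataset workflow described above, the following Python sketch resolves compound identifiers to PubChem compound IDs (CIDs) and then requests bioassay summaries in batches through PubChem's PUG-REST web service. It is a minimal sketch under stated assumptions, not the chapter's own protocol: the function names, the batch size of 50, and the 0.25 s pause between requests are illustrative choices, and structures containing special characters (common in SMILES) may be better submitted by HTTP POST than in the URL path.

```python
import json
import time
import urllib.parse
import urllib.request

# Base address of PubChem's PUG-REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
# Identifier types handled by this sketch (an illustrative subset).
NAMESPACES = {"name", "smiles", "inchikey"}


def cid_lookup_url(namespace, identifier):
    """Build a URL that asks PubChem for the CIDs matching one identifier."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unsupported namespace: {namespace}")
    # Percent-encode everything, including "/" and "#", which occur in SMILES.
    encoded = urllib.parse.quote(identifier, safe="")
    return f"{PUG_REST}/compound/{namespace}/{encoded}/cids/JSON"


def assay_summary_url(cids):
    """Build a URL for the bioassay summary table of a batch of CIDs."""
    joined = ",".join(str(cid) for cid in cids)
    return f"{PUG_REST}/compound/cid/{joined}/assaysummary/CSV"


def chunked(items, size):
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch(url, pause=0.25):
    """Download one URL as text, then pause to limit the request rate."""
    # urlopen raises urllib.error.HTTPError if PubChem reports an error,
    # for example when an identifier cannot be resolved.
    with urllib.request.urlopen(url, timeout=60) as response:
        text = response.read().decode("utf-8")
    time.sleep(pause)  # assumed pause; adjust to the service's usage policy
    return text


def download_assay_summaries(names, batch_size=50):
    """Resolve names to CIDs, then fetch assay summaries batch by batch."""
    cids = []
    for name in names:
        reply = json.loads(fetch(cid_lookup_url("name", name)))
        cids.extend(reply.get("IdentifierList", {}).get("CID", []))
    return [fetch(assay_summary_url(batch)) for batch in chunked(cids, batch_size)]
```

Calling `download_assay_summaries(["aspirin", "ibuprofen"])` would return one CSV string per batch of CIDs. Batching keeps each URL short and reduces the number of requests needed when a dataset contains thousands of compounds.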

Key words

Compounds; Chemical identifier; Biological data; PubChem



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Chemistry, Rutgers Center for Computational and Integrative Biology, Rutgers University, Camden, USA
