Curating and Preparing High-Throughput Screening Data for Quantitative Structure-Activity Relationship Modeling

  • Marlene T. Kim
  • Wenyi Wang
  • Alexander Sedykh
  • Hao Zhu
Part of the Methods in Molecular Biology book series (MIMB, volume 1473)


Publicly available bioassay data often contain errors. Curating massive bioassay data sets, especially high-throughput screening (HTS) data, for Quantitative Structure-Activity Relationship (QSAR) modeling requires automated data curation tools. Such tools benefit users, especially those without prior computer skills, because many platforms have been developed and optimized around standardized requirements. As a result, users do not need to configure the curation tool extensively before applying it. This chapter describes a freely available automated tool for curating and preparing HTS data for QSAR modeling.

Key words

QSAR; Data curation; Chemical structures; Computational modeling



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Marlene T. Kim (1)
  • Wenyi Wang (1)
  • Alexander Sedykh (2)
  • Hao Zhu (1), corresponding author
  1. Department of Chemistry, Rutgers Center for Computational and Integrative Biology, Rutgers University, Camden, USA
  2. Multicase Inc., Beachwood, USA
