web cellHTS2: A web-application for the analysis of high-throughput screening data
The analysis of high-throughput screening data sets is an expanding field in bioinformatics. High-throughput screens by RNAi generate large primary data sets which need to be analyzed and annotated to identify relevant phenotypic hits. Large-scale RNAi screens are frequently used to identify novel factors that influence a broad range of cellular processes, including signaling pathway activity, cell proliferation, and host cell infection. Here, we present a web-based application utility for the end-to-end analysis of large cell-based screening experiments by cellHTS2.
The software guides the user through the configuration steps that are required for the analysis of single or multi-channel experiments. The web-application provides options for various standardization and normalization methods, annotation of data sets and a comprehensive HTML report of the screening data analysis, including a ranked hit list. Sessions can be saved and restored for later re-analysis. The web frontend for the cellHTS2 R/Bioconductor package interacts with it through an R-server implementation that enables highly parallel analysis of screening data sets. web cellHTS2 further provides a file import and configuration module for common file formats.
The implemented web-application facilitates the analysis of high-throughput data sets and provides a user-friendly interface. web cellHTS2 is accessible online at http://web-cellHTS2.dkfz.de. A standalone version as a virtual appliance and source code for platforms supporting Java 1.5.0 can be downloaded from the web cellHTS2 page. web cellHTS2 is freely distributed under GPL.
Keywords: Statistical Quality Control, Java Server Page, Virtual Appliance, Error Detection Mechanism, Small Molecule Screen
High-throughput cell-based screens have become an important experimental tool for the analysis of many cellular processes. Whole genome sequences and methods for gene silencing by RNA interference (RNAi) have enabled loss-of-function analysis ex vivo and in vivo, opening avenues for functional analysis that were previously unfeasible [1, 2]. Different experimental methods are used to assess phenotypic changes, from single-channel homogeneous readouts to multi-channel cytometry and imaging, producing large data sets that need to be analyzed to extract phenotypically relevant information. RNAi screening has found a broad user base as a genetic method to dissect many different cellular processes, such as cell survival, signaling pathways and other cellular phenotypes, in a high-throughput manner [3, 4, 5, 6].
High-throughput screens are mostly performed using 96- to 384-well plates and produce large data sets that need to be normalized, summarized and ranked to generate a list of significant phenotypic modifiers. Large-scale RNAi screens can easily exceed 100,000 data points per screening experiment, and specialized statistical approaches have been developed for their analysis [7, 8, 9, 10]. Quality control assessments of assays and screening data provide benchmarks for overall performance, such as the experiment-wide performance of controls, the reproducibility between replicate experiments, and other statistical quality control measures [10, 11, 12, 13].
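The quality control measures mentioned above can be illustrated with a minimal sketch. It assumes two commonly used statistics that the text does not name explicitly: the Z'-factor for the separation of positive and negative controls, and the Pearson correlation between two replicates. The function names and the example values are hypothetical and are not taken from cellHTS2.

```python
import math
import statistics


def z_prime_factor(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|.

    Values close to 1 indicate well separated controls.
    """
    separation = abs(statistics.mean(positive_controls)
                     - statistics.mean(negative_controls))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    spread = (statistics.stdev(positive_controls)
              + statistics.stdev(negative_controls))
    return 1.0 - 3.0 * spread / separation


def replicate_correlation(replicate_a, replicate_b):
    """Pearson correlation between the well values of two replicates."""
    if len(replicate_a) != len(replicate_b) or len(replicate_a) < 2:
        raise ValueError("replicates must have equal length of at least 2")
    mean_a = statistics.mean(replicate_a)
    mean_b = statistics.mean(replicate_b)
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(replicate_a, replicate_b))
    var_a = sum((a - mean_a) ** 2 for a in replicate_a)
    var_b = sum((b - mean_b) ** 2 for b in replicate_b)
    return covariance / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical control readouts from a single plate.
    print(z_prime_factor([10, 12, 11, 9], [100, 102, 98, 100]))
    print(replicate_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Both functions operate on plain lists of well measurements, so they can be applied per plate or across a whole experiment.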
We have previously described cellHTS as an analysis toolbox for cell-based high-throughput screens. cellHTS is implemented in R/Bioconductor as a command-line utility that provides a workflow for the analysis of high-throughput data sets. cellHTS and cellHTS2 have become widely used in the community because they provide an end-to-end solution for the analysis of high-throughput screening data sets while retaining the flexibility to incorporate further functions for statistical analysis as the field matures. However, an obstacle to general use in the laboratory was the lack of an integrated and easy-to-use solution for the configuration of screening plates and the choice of controls and analysis methods.
Data files needed for the analysis are generated through the graphical user interface or can be provided through upload forms. web cellHTS2 also provides an import module that supports upload of a spectrum of different file formats. web cellHTS2 implements error detection mechanisms for each data file or website input, checking for common input errors prior to running cellHTS2. Once the configuration of a screening experiment is completed, the analysis project, containing information on the complete session including all input files and processing parameters, can be saved for re-use. This function allows for rapid re-processing of similar datasets and generation of a full documentation of the analysis. The results of the analysis can be streamed to the web browser or can be sent via E-mail directly.
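The kind of input check described above can be sketched as follows. This is an illustration only: the two-column, tab-separated layout (well identifier, measurement), the function name `validate_plate_lines` and the specific checks are assumptions and do not reproduce the checks implemented in web cellHTS2.

```python
import re

# Rows x columns for the two plate formats mentioned in the text.
PLATE_FORMATS = {96: (8, 12), 384: (16, 24)}
WELL_PATTERN = re.compile(r"^([A-Z])(\d{1,2})$")


def validate_plate_lines(lines, plate_size=384):
    """Return a list of human-readable problems found in one plate file.

    Each line is assumed (hypothetically) to hold a well identifier such
    as 'A01' and a numeric measurement, separated by a tab.
    """
    rows, cols = PLATE_FORMATS[plate_size]
    errors = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            errors.append(f"line {number}: expected 2 fields, found {len(fields)}")
            continue
        well, value = fields[0].strip().upper(), fields[1].strip()
        match = WELL_PATTERN.match(well)
        if not match:
            errors.append(f"line {number}: invalid well identifier '{well}'")
            continue
        row_index = ord(match.group(1)) - ord("A")
        column = int(match.group(2))
        if row_index >= rows or not 1 <= column <= cols:
            errors.append(f"line {number}: well '{well}' is outside a "
                          f"{plate_size}-well plate")
            continue
        if (row_index, column) in seen:
            errors.append(f"line {number}: duplicate well '{well}'")
        seen.add((row_index, column))
        try:
            float(value)
        except ValueError:
            errors.append(f"line {number}: value '{value}' is not a number")
    missing = rows * cols - len(seen)
    if missing:
        errors.append(f"{missing} of {rows * cols} wells have no measurement")
    return errors


if __name__ == "__main__":
    for problem in validate_plate_lines(["A01\t1.0", "A01\t2.0", "B02\tabc"], 96):
        print(problem)
```

Collecting all problems in a list, rather than stopping at the first one, lets a user correct a file in a single pass before the analysis is started.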
web cellHTS2 facilitates the analysis of high-throughput screening data by providing an easy-to-use web application. It has been developed with a view towards large-scale RNAi screens but can also be employed for the analysis of small molecule screens. A particular focus has been to provide a user-friendly interface to select analysis parameters and to generate "re-usable" analysis workflows. Furthermore, error-checking procedures for raw data and annotation files and automated pre-processing of uploaded data have been implemented. web cellHTS2 can be accessed online or downloaded for local installation. web cellHTS2 can also be downloaded as a virtual appliance that runs in a contained environment.
Examples of normalization options
- Measurements are divided by the median of all sample wells in the plate
- The midpoint of the 'shorth' of the distribution of all sample wells is used for normalization
- Measurements are divided by the mean of all sample wells in the plate
- Measurements are divided by the median of negative controls in the plate
- Measurements are divided by the mean of the plate's positive control
- Normalized percent control: Measurements are divided by the difference between the plate's positive and negative controls
- A two-way (row and column) median polish is applied to each plate
- Robust local fit regression: Spatial effects are normalized by fitting a bivariate local regression
- Spatial effects are normalized using Loess regression
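Three of the options listed above can be sketched in a few lines. This is a simplified illustration and not the cellHTS2 implementation. The function names are hypothetical. The control-based formula assumes the common definition (mean of positive controls minus the measurement, divided by the difference between the positive and negative control means), which goes beyond the short description in the list. The median polish returns plain residuals without any further scaling.

```python
import statistics


def plate_median_normalize(sample_wells):
    """Divide each measurement by the median of all sample wells."""
    center = statistics.median(sample_wells)
    return [value / center for value in sample_wells]


def control_based_normalize(values, positive_controls, negative_controls):
    """Scale measurements by the difference between the control means.

    Assumed definition: (mean_pos - x) / (mean_pos - mean_neg), so that
    positive controls map to 0 and negative controls map to 1.
    """
    mean_pos = statistics.mean(positive_controls)
    mean_neg = statistics.mean(negative_controls)
    return [(mean_pos - x) / (mean_pos - mean_neg) for x in values]


def median_polish_residuals(plate, iterations=10):
    """Two-way median polish on a plate given as a list of rows.

    Row and column medians are subtracted alternately; the residuals
    that remain are free of additive row and column effects.
    """
    residuals = [list(row) for row in plate]
    n_cols = len(residuals[0])
    for _ in range(iterations):
        for row in residuals:
            row_median = statistics.median(row)
            for j in range(n_cols):
                row[j] -= row_median
        for j in range(n_cols):
            col_median = statistics.median(row[j] for row in residuals)
            for row in residuals:
                row[j] -= col_median
    return residuals


if __name__ == "__main__":
    # Hypothetical 3 x 4 plate with additive row and column effects
    # and one strong outlier in row 1, column 2.
    plate = [[100 + 5 * i + j for j in range(4)] for i in range(3)]
    plate[1][2] -= 50
    for row in median_polish_residuals(plate):
        print(row)
```

In the example plate, the row and column trends are removed and only the outlier well keeps a large residual, which is the behaviour that makes median polish useful against spatial plate effects.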
The application presented here is a web or stand-alone program to facilitate the analysis of high-throughput screening data. High-throughput screening experiments are of increasing importance, both for basic science and drug discovery. Such data sets easily exceed the complexity of transcriptome experiments; however, comparatively few tools are available that enable an easy-to-use analysis. cellHTS and other software packages have started to address this issue by enabling an end-to-end analysis of high-throughput screening data sets and have become widely used in the community. Here, we provide a web application as a front-end for cellHTS2 to increase its accessibility and accelerate the analysis of high-throughput screening data sets. The web application can be used both for RNAi and compound screening experiments and can be extended to meet future needs. In contrast to commercial packages, we provide an open-source and extensible solution for online and offline usage.
Conclusions and future directions
web cellHTS2 provides an intuitive interface for the analysis of high-throughput screens. The user can choose among different options for the analysis of screening data sets. Statistical analysis options will be expanded as new methods become available and broadly used [9, 18]. The graphical user interface for the configuration of screening experiments and the option to save "re-usable" session templates make it convenient to use in the laboratory. Future developments of the application will provide direct links to phenotype databases, e.g. to compare hit lists, annotate hit lists with additional information from public databases, e.g. through BioMart, and extend the analysis with functional annotation data such as GO enrichment analysis. It is also planned to provide diagnostic plots "on-the-fly" to allow the user to compare different normalization strategies.
Availability and requirements
Project name: web cellHTS2
Project home page: http://web-cellHTS2.dkfz.de
Operating system(s): Platform independent
Programming language: Java
Other requirements: Java 1.5.0
Downloadable Version: R 2.10.0, cellHTS2 2.11.1 and Rserve 0.6.0
Virtual appliance: Open source software Virtual box http://www.virtualbox.org
License: GNU GPL
Any restrictions to use by non-academics: none
We thank Grainne Kerr, Xian Zhang, Gregoire Pau and Wolfgang Huber for helpful suggestions and critical comments on the manuscript. We are grateful to Tobias Reber for help with IT infrastructure. Funding was provided by grants from the Helmholtz Alliance for Systems Biology and BMBF NGFN-Plus (01GS08181).
- 6. Huang S, Laoukili J, Epping MT, Koster J, Hölzel M, Westerman BA, Nijkamp W, Hata A, Asgharzadeh S, Seeger RC, Versteeg R, Beijersbergen RL, Bernards R: ZNF423 is critically required for retinoic acid-induced differentiation and is a marker of neuroblastoma outcome. Cancer Cell 2009, 15: 328–340. doi:10.1016/j.ccr.2009.02.023
- 8. Birmingham A, Selfors LM, Forster T, Wrobel D, Kennedy CJ, Shanks E, Santoyo-Lopez J, Dunican DJ, Long A, Kelleher D, Smith Q, Beijersbergen RL, Ghazal P, Shamu CE: Statistical methods for analysis of high-throughput RNA interference screens. Nature Methods 2009, 6: 569–75. doi:10.1038/nmeth.1351
- 12. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biology 2004, 5: R80. doi:10.1186/gb-2004-5-10-r80
- 14. Apache Tapestry 5 [http://tapestry.apache.org]
- 15. Apache Tomcat 5.5 [http://tomcat.apache.org]
- 16. Jetty Java webserver [http://www.mortbay.org/jetty/]
- 17. Virtual Box [http://www.virtualbox.org/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.