Abstract
Background
Single-cell RNA sequencing (scRNA-seq) is an emerging technology that enables high-resolution detection of heterogeneity between cells. One important application of scRNA-seq data is detecting differential expression (DE) of genes. Some researchers still apply DE analysis methods developed for bulk RNA-seq data to single-cell data, while new methods for scRNA-seq data have also been developed. Because bulk and single-cell RNA-seq data have different characteristics, a systematic evaluation of the two types of methods on scRNA-seq data is needed.
Results
In this study, we conducted a series of experiments on scRNA-seq data to quantitatively evaluate 14 popular DE analysis methods, including both traditional methods developed for bulk RNA-seq data and new methods specifically designed for scRNA-seq data. From these experiments we derived observations and recommendations for using the methods in different situations.
Conclusions
DE analysis methods for scRNA-seq data should be chosen with great caution, taking the characteristics of the data into account. Different strategies should be adopted for data with different sample sizes and/or different strengths of the expected signals. Several methods for scRNA-seq data show advantages in some aspects, and DEGseq tends to outperform other methods with respect to consistency, reproducibility and accuracy of predictions on scRNA-seq data.
Cite this article
Miao, Z., Zhang, X. Differential expression analyses for single-cell RNA-Seq: old questions on new data. Quant Biol 4, 243–260 (2016). https://doi.org/10.1007/s40484-016-0089-7
DOI: https://doi.org/10.1007/s40484-016-0089-7