Abstract
Count data are not purely relative: the count pair (1, 2) carries different information than the pair (1000, 2000), even though the relative amounts of the two components are the same.
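One way to make this concrete is to look at how precisely each pair pins down the shared proportion. The sketch below is illustrative only and not taken from the chapter: it assumes, for the sake of the example, that the first count is binomially distributed given the pair total, and the helper name `proportion_and_se` is hypothetical. Under that assumption both pairs give the same estimated proportion, but the standard error of the estimate shrinks as the total grows.

```python
import math


def proportion_and_se(count_a, count_b):
    """Return the proportion of component A and its binomial standard error.

    Assumption (not from the source): count_a ~ Binomial(total, p),
    so the standard error of the estimate is sqrt(p * (1 - p) / total).
    """
    total = count_a + count_b
    p = count_a / total
    se = math.sqrt(p * (1 - p) / total)
    return p, se


# Same relative amounts, very different totals.
small = proportion_and_se(1, 2)
large = proportion_and_se(1000, 2000)

print(f"(1, 2):       proportion={small[0]:.4f}, standard error={small[1]:.4f}")
print(f"(1000, 2000): proportion={large[0]:.4f}, standard error={large[1]:.4f}")
```

Both pairs estimate the same proportion, yet the larger pair does so with a much smaller standard error. That difference in precision is information a purely relative (compositional) view discards.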
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Xia, Y., Sun, J., Chen, DG. (2018). Modeling Over-Dispersed Microbiome Data. In: Statistical Analysis of Microbiome Data with R. ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-13-1534-3_11
DOI: https://doi.org/10.1007/978-981-13-1534-3_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1533-6
Online ISBN: 978-981-13-1534-3
eBook Packages: Mathematics and Statistics (R0)