Abstract
The proper analysis of high-throughput sequencing datasets of mixed microbial communities (meta-transcriptomics) is substantially more complex than the analysis of datasets composed of single organisms. Adapting commonly used RNA-seq methods to meta-transcriptome datasets can be misleading and may fail to use all the available information in a consistent manner. However, meta-transcriptomic experiments can be investigated in a principled manner using Bayesian probabilistic modeling of the data at a functional level, coupled with analysis under a compositional data analysis paradigm. We present a worked example for the differential functional evaluation of mixed-species microbial communities obtained from human clinical samples that were sequenced on an Illumina platform. We demonstrate methods to functionally map reads directly, conduct a compositionally appropriate exploratory data analysis, evaluate differential relative abundance, and finally identify compositionally associated (constant ratio) functions. Using these approaches, we have found that meta-transcriptomic functional analyses are highly reproducible and convey significant information regarding the ecosystem.
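As a rough illustration of the compositional ideas named in the abstract, the sketch below applies a centered log-ratio (CLR) transform to per-sample counts and uses the variance of a log-ratio to flag a pair of functions whose ratio stays nearly constant across samples. This is not the chapter's protocol: the function names, the toy counts, and the fixed pseudocount of 0.5 are assumptions made for illustration, and a simple pseudocount stands in for the Bayesian probabilistic modeling the abstract describes.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's count vector.

    The pseudocount is an illustrative stand-in for proper zero handling.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [v - mean_log for v in logs]

def log_ratio_variance(x, y, pseudocount=0.5):
    """Sample variance of log(x/y) across samples.

    A value near zero indicates a near-constant ratio between the two features.
    """
    ratios = [math.log((a + pseudocount) / (b + pseudocount))
              for a, b in zip(x, y)]
    mean = sum(ratios) / len(ratios)
    return sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)

# Hypothetical toy data: rows are samples, columns are functions A, B, C.
samples = [
    [10, 20, 300],
    [20, 40, 150],
    [40, 80, 900],
]

clr_samples = [clr(s) for s in samples]
func_a = [s[0] for s in samples]
func_b = [s[1] for s in samples]
func_c = [s[2] for s in samples]

var_ab = log_ratio_variance(func_a, func_b)  # A and B keep a ~1:2 ratio
var_ac = log_ratio_variance(func_a, func_c)  # A and C do not
```

In this toy data, functions A and B hold a ratio of about 1:2 in every sample, so their log-ratio variance is much smaller than that of A and C. Each CLR-transformed sample sums to zero, which reflects that only ratios between functions, not absolute counts, carry information.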
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Macklaim, J.M., Gloor, G.B. (2018). From RNA-seq to Biological Inference: Using Compositional Data Analysis in Meta-Transcriptomics. In: Beiko, R., Hsiao, W., Parkinson, J. (eds) Microbiome Analysis. Methods in Molecular Biology, vol 1849. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8728-3_13
DOI: https://doi.org/10.1007/978-1-4939-8728-3_13
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8726-9
Online ISBN: 978-1-4939-8728-3
eBook Packages: Springer Protocols