Abstract
With the development of next-generation sequencing technologies, the human microbiome can now be studied by direct DNA sequencing. Many human diseases have been shown to be associated with a disordered human microbiome. Previous statistical methods for associating microbiome composition with an outcome such as disease status focus on the association between the outcome variable and the abundances of individual taxa or their abundance ratios. However, testing many taxa individually creates a multiple-testing problem, which reduces the power to detect an association. When taxon-level association tests fail, an overall test that pools the individually weak association signals can be used to test whether the overall microbiome composition has a significant effect on the outcome variable. In this paper, we propose a kernel-based semi-parametric regression method for testing the significance of the effect of the microbiome composition on a continuous or binary outcome. The method can incorporate phylogenetic information into the kernels and adjusts naturally for covariate effects. We evaluate the method using simulations and a real data set, in which we test the effect of the human gut microbiome composition on body mass index (BMI) while adjusting for total fat intake. Our results suggest that the gut microbiome has a strong effect on BMI and that this effect is independent of total fat intake.
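As a rough illustration of the kind of overall test described above, the following Python sketch turns a between-sample distance matrix into a kernel, regresses a continuous outcome on the covariates, and tests whether the residuals vary along the kernel. It is a minimal sketch under stated assumptions, not the procedure of the chapter: the function names are hypothetical, Bray-Curtis distance stands in for a phylogenetic distance such as UniFrac, the p-value comes from a simple permutation of residuals rather than an analytic null distribution, and only the continuous-outcome case is covered.

```python
import numpy as np

def bray_curtis(P):
    """Pairwise Bray-Curtis distances between rows of an abundance matrix.
    Assumes every row has a positive total (stand-in for a phylogenetic distance)."""
    diff = np.abs(P[:, None, :] - P[None, :, :]).sum(axis=2)
    total = (P[:, None, :] + P[None, :, :]).sum(axis=2)
    return diff / total

def distance_to_kernel(D):
    """Gower-center the squared distances, K = -1/2 J D^2 J, and clip
    negative eigenvalues so that K is positive semi-definite."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh((K + K.T) / 2)
    return (V * np.clip(w, 0, None)) @ V.T

def kernel_score_test(y, X, K, n_perm=999, seed=0):
    """Score-type statistic Q = r' K r, where r are the residuals of y
    regressed on an intercept and the covariates X. The p-value is an
    approximation obtained by permuting the residuals (an assumption of
    this sketch)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Z = np.column_stack([np.ones(n), X])   # intercept + covariates
    H = Z @ np.linalg.pinv(Z)              # projection onto covariate space
    r = y - H @ y                          # null-model residuals
    q_obs = r @ K @ r
    q_perm = np.empty(n_perm)
    for b in range(n_perm):
        rp = rng.permutation(r)
        q_perm[b] = rp @ K @ rp
    p_value = (1 + np.sum(q_perm >= q_obs)) / (n_perm + 1)
    return q_obs, p_value

# Simulated illustration only (no real data): 5 taxa, 80 samples.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(5), size=80)     # compositions (rows sum to 1)
fat = rng.normal(size=80)                  # hypothetical covariate
bmi = 10 * P[:, 0] + 0.5 * fat + 0.1 * rng.normal(size=80)
K = distance_to_kernel(bray_curtis(P))
q, p = kernel_score_test(bmi, fat, K)
print(f"Q = {q:.3f}, permutation p-value = {p:.3f}")
```

Replacing `bray_curtis` with a tree-based distance is where phylogenetic information would enter in this sketch; the covariate adjustment happens entirely through the null-model residuals.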
Acknowledgments
We thank Rick Bushman, James Lewis, and Gary Wu for sharing the data and for many helpful discussions. This research is supported by NIH grants CA127334 and GM097505.
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Chen, J., Li, H. (2013). Kernel Methods for Regression Analysis of Microbiome Compositional Data. In: Hu, M., Liu, Y., Lin, J. (eds) Topics in Applied Statistics. Springer Proceedings in Mathematics & Statistics, vol 55. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7846-1_16
DOI: https://doi.org/10.1007/978-1-4614-7846-1_16
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7845-4
Online ISBN: 978-1-4614-7846-1
eBook Packages: Mathematics and Statistics (R0)