Sparse Treatment-Effect Model for Taxon Identification with High-Dimensional Metagenomic Data

  • Protocol
Microbiome Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1849))

Abstract

Identifying disease-associated taxa is an important task in metagenomics. Many methods have been proposed for feature selection and prediction. However, existing methods either use univariate (generalized) regression to obtain P-values without considering interactions among taxa, or use lasso or L0-type sparse modeling to identify the taxa with the best predictive performance without providing P-values. To the best of our knowledge, no available method both accounts for taxon interactions and generates P-values.

In this paper, we propose a sparse treatment-effect model for taxon identification (STEMIT) and statistical inference with high-dimensional metagenomic data. STEMIT provides a P-value for a taxon through a two-step treatment-effect maximization, and it supports causal inference if the study is a clinical trial. We first use sparse modeling to identify taxa associated with the treatment-effect variable and with the target feature, and then estimate the P-value of the target feature with ordinary least squares (OLS) regression. Using a real metagenomic data set, we demonstrate that the proposed method is efficient and can identify biologically important taxa. The software for L0 sparse modeling can be downloaded at https://cran.r-project.org/web/packages/l0ara/.
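
The two-step idea described above (sparse selection of control taxa, then OLS for the P-value of the target) can be illustrated with a minimal sketch. This is not the STEMIT implementation: it assumes a greedy forward selection as a simple stand-in for L0 sparse modeling, a normal approximation for the two-sided P-value, and simulated Gaussian data in place of real metagenomic counts. The function names `greedy_select`, `ols_pvalue`, and `two_step_pvalue`, and the sparsity level `k`, are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def greedy_select(X, y, k):
    """Pick up to k columns of X by greedy forward selection.

    Stand-in for L0 sparse modeling (an assumption, not the l0ara algorithm).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0) + 1e-12  # avoid division by zero
    resid = yc.copy()
    chosen = []
    for _ in range(min(k, X.shape[1])):
        scores = np.abs(Xc.T @ resid) / norms
        if chosen:
            scores[chosen] = -np.inf  # never pick the same column twice
        chosen.append(int(np.argmax(scores)))
        beta, *_ = np.linalg.lstsq(Xc[:, chosen], yc, rcond=None)
        resid = yc - Xc[:, chosen] @ beta
    return chosen


def ols_pvalue(y, Z, col=0):
    """OLS of y on [1, Z]; return the estimate and P-value for column `col` of Z."""
    n = len(y)
    D = np.column_stack([np.ones(n), Z])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ beta
    dof = n - D.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(D.T @ D)
    idx = col + 1  # shift past the intercept
    t_stat = beta[idx] / math.sqrt(cov[idx, idx])
    # Two-sided P-value using a normal approximation to the t distribution.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return float(beta[idx]), p_value


def two_step_pvalue(X, y, j, k=5):
    """Step 1: select controls for y and for taxon j. Step 2: OLS on the union."""
    others = [c for c in range(X.shape[1]) if c != j]
    Xo = X[:, others]
    sel_y = greedy_select(Xo, y, k)        # taxa associated with the outcome
    sel_x = greedy_select(Xo, X[:, j], k)  # taxa associated with the target taxon
    controls = sorted(set(sel_y) | set(sel_x))
    Z = np.column_stack([X[:, j], Xo[:, controls]])
    return ols_pvalue(y, Z, col=0)


# Simulated example (hypothetical data): taxa 0 and 3 drive the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 2.0 * X[:, 0] + 1.0 * X[:, 3] + rng.normal(size=200)

est_signal, p_signal = two_step_pvalue(X, y, j=0)
est_null, p_null = two_step_pvalue(X, y, j=10)
print("taxon 0:", est_signal, p_signal)
print("taxon 10:", est_null, p_null)
```

Taking the union of the two selected sets means the final regression adjusts for taxa related to either the outcome or the target taxon, which is what allows the P-value to account for interactions among taxa rather than testing the target in isolation.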

Author information

Corresponding author

Correspondence to Zhenqiu Liu.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Liu, Z., Lin, S. (2018). Sparse Treatment-Effect Model for Taxon Identification with High-Dimensional Metagenomic Data. In: Beiko, R., Hsiao, W., Parkinson, J. (eds) Microbiome Analysis. Methods in Molecular Biology, vol 1849. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8728-3_19

  • DOI: https://doi.org/10.1007/978-1-4939-8728-3_19

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8726-9

  • Online ISBN: 978-1-4939-8728-3

  • eBook Packages: Springer Protocols
