Microbiome Data Mining for Microbial Interactions and Relationships

  • Xingpeng Jiang
  • Xiaohua Hu


Understanding how microbial species coexist and interact in host-associated and natural environments is crucial for advancing basic microbiology and for understanding human health and disease. Researchers have begun to infer common interspecies interactions, such as competition and cooperation, and species–phenotype relations by leveraging large-scale microbiome data. These efforts have facilitated the discovery of previously unknown principles of the microbial world and expedited the understanding of disease mechanisms. In this review, we summarize current computational efforts in microbiome data mining for discovering microbial interactions and relationships, including dimension reduction and data visualization, association analysis, microbial network reconstruction, and dynamic modeling and simulation.


Microbial Community; Canonical Correlation Analysis; Flux Balance Analysis; Nonnegative Matrix Factorization; Microbial Interaction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was supported in part by NSF IIP 1160960, NSF IIP 1332024, NSFC 61532008, the China National 12-5 plan (2012BAK24B01), the international cooperation project of Hubei Province (No. 2014BHE0017), and the Self-determined Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE (No. CCNU16KFY04).



Copyright information

© Springer India 2016

Authors and Affiliations

  1. School of Computer, Central China Normal University, Wuhan, China
  2. College of Computing and Informatics, Drexel University, Philadelphia, USA
