Bioinformatics tools to assess metagenomic data for applied microbiology
The falling cost of DNA sequencing has produced large data sets that must be handled and analyzed, especially for microbial ecosystems, which are characterized by high taxonomic and functional diversity. Assessing the properties of these complex ecosystems requires a conceptual background in the application of next-generation sequencing (NGS) technology and bioinformatics analysis to metagenomics. Accordingly, this article presents an overview of how knowledge of microbial ecology has evolved from traditional culture-dependent methods to culture-independent methods and to the latest frontier, metagenomics. Topics covered include sample preparation for NGS, starting with total DNA extraction and library preparation, followed by a brief discussion of NGS chemistry to help researchers understand which bioinformatics pipeline may best serve their goals. The importance of selecting appropriate sequencing coverage and depth to obtain a suitable measure of microbial diversity is discussed. Because all DNA sequencing processes produce base-calling errors that compromise downstream analyses, including genome assembly and microbial functional analysis, dedicated software is presented and discussed conceptually with regard to potential applications in the general field of microbial ecology.
Keywords: Metagenomics, NGS, Applied bioinformatics, Microbial diversity
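As a rough illustration of two quantities the abstract refers to, sequencing depth and base-calling error, the sketch below uses two standard relations: the mean depth C = N × L / G (number of reads times read length, divided by genome size) and the Phred relation P = 10^(−Q/10) between a quality score Q and the probability that a base call is wrong. The function names and the input values are hypothetical and are not taken from the article. For a metagenome, the combined genome size of the community is usually unknown, so G here is an assumed value.

```python
def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred quality q is wrong: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average sequencing depth C = N * L / G for an assumed genome size G."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


if __name__ == "__main__":
    # Hypothetical run: one million 150 bp reads over an assumed 5 Mbp genome.
    print("mean depth:", mean_depth(1_000_000, 150, 5_000_000))
    # Error probability for Phred scores commonly used as quality thresholds.
    for q in (20, 30):
        print("Q", q, "error probability:", phred_to_error_prob(q))
```

Higher Phred scores correspond to exponentially smaller error probabilities, which is why quality filtering is often applied before assembly or functional analysis.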
ECP De Martinis is a fellow of the National Council for Scientific and Technological Development, Brazil (grant #6762/2006-4), and is grateful for a research grant from the São Paulo Research Foundation (FAPESP), Brazil (grant #2017/18928-0). OGG Almeida is grateful to the São Paulo Research Foundation (FAPESP), Brazil, for a Ph.D. fellowship (grant #2017/13759-6).
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Research involving human participants and/or animals
This article does not contain any studies with human participants or animals performed by any of the authors.