Journal of Microbiology, Volume 56, Issue 4, pp 280–285

UBCG: Up-to-date bacterial core gene set and pipeline for phylogenomic tree reconstruction

  • Seong-In Na
  • Yeong Ouk Kim
  • Seok-Hwan Yoon
  • Sung-min Ha
  • Inwoo Baek
  • Jongsik Chun
Systems and Synthetic Microbiology and Bioinformatics

Abstract

Genome-based phylogeny will play a central role in the future taxonomy and phylogenetics of Bacteria and Archaea by replacing 16S rRNA gene phylogeny. Concatenated core gene alignments are frequently used for this purpose. Bacterial core genes are defined as single-copy, homologous genes that are present in most known bacterial species. Several studies have described such gene sets, but the number of species they considered was rather small. Here we present an up-to-date bacterial core gene set, named UBCG, and software suites that cover the steps needed to generate and evaluate phylogenetic trees. The method was successfully used to infer the phylogenomic relationships of Escherichia and related taxa, and it can be applied to a set of genomes at any taxonomic rank of Bacteria. The UBCG pipeline and file viewer are freely available at https://www.ezbiocloud.net/tools/ubcg and https://www.ezbiocloud.net/tools/ubcg_viewer, respectively.
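
The two ideas in the abstract, selecting single-copy genes present in most genomes and concatenating their alignments, can be illustrated with a minimal Python sketch. This is not the UBCG implementation: the function names, the `min_fraction` threshold, the gene and genome labels, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def select_core_genes(copy_counts: Dict[str, Dict[str, int]],
                      min_fraction: float) -> List[str]:
    """Return genes present as exactly one copy in at least
    min_fraction of the genomes (threshold is an assumption)."""
    genomes = list(copy_counts)
    genes = sorted({g for counts in copy_counts.values() for g in counts})
    core = []
    for gene in genes:
        single = sum(1 for genome in genomes
                     if copy_counts[genome].get(gene, 0) == 1)
        if single / len(genomes) >= min_fraction:
            core.append(gene)
    return core


def concatenate_alignments(alignments: Dict[str, Dict[str, str]],
                           genes: List[str],
                           genomes: List[str]) -> Dict[str, str]:
    """Join per-gene alignments in the given gene order, padding a
    genome that lacks a gene with gap characters."""
    parts: Dict[str, List[str]] = {genome: [] for genome in genomes}
    for gene in genes:
        aligned = alignments[gene]
        length = len(next(iter(aligned.values())))
        for genome in genomes:
            parts[genome].append(aligned.get(genome, "-" * length))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}


# Hypothetical copy numbers per genome (toy data, not real annotations).
copy_counts = {
    "genomeA": {"gene1": 1, "gene2": 1, "gene3": 3},
    "genomeB": {"gene1": 1, "gene2": 1},
    "genomeC": {"gene2": 1, "gene3": 1},
}

# Hypothetical per-gene alignments; every row of a gene has equal length.
alignments = {
    "gene1": {"genomeA": "ATG-", "genomeB": "ATGC"},
    "gene2": {"genomeA": "GG", "genomeB": "GA", "genomeC": "GG"},
}

core = select_core_genes(copy_counts, min_fraction=0.6)
supermatrix = concatenate_alignments(alignments, core, list(copy_counts))
print(core)
print(supermatrix)
```

In this toy input, `gene3` is excluded because it is multi-copy in one genome and absent from another, while a genome missing a selected gene is padded with gaps so that all concatenated rows keep the same length.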

Keywords

phylogeny, phylogenetic analysis, phylogenomics, bacterial core gene

Supplementary material

12275_2018_8014_MOESM1_ESM.pdf (702 kb)
Supplementary material, approximately 701 KB.

Copyright information

© The Microbiological Society of Korea and Springer Nature B.V. 2018

Authors and Affiliations

  • Seong-In Na (1, 2)
  • Yeong Ouk Kim (1, 2)
  • Seok-Hwan Yoon (4)
  • Sung-min Ha (3, 4)
  • Inwoo Baek (2, 3)
  • Jongsik Chun (1, 2, 3, 4)

  1. Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  2. Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea
  3. School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
  4. ChunLab, Inc., Seoul, Republic of Korea
