Aerobic Hydrocarbon-Degrading Gammaproteobacteria: Oleiphilaceae and Relatives

  • Aleksei A. Korzhenkov
  • Stepan V. Toshchakov
  • Olga V. Golyshina
  • Manuel Ferrer
  • Tatyana N. Chernikova
  • Karl-Erich Jaeger
  • Michail M. Yakimov
  • Peter N. Golyshin
Living reference work entry
Part of the Handbook of Hydrocarbon and Lipid Microbiology book series (HHLM)

Abstract

Despite the ubiquity of marine hydrocarbon-degrading bacteria from the family Oleiphilaceae, only one strain from this family, Oleiphilus messinensis strain ME102 (= DSM 13489), currently has a validly published name and a fully assembled genome. The availability of draft genomes of 27 other isolates gave us the opportunity to gain insight into genome evolution and speciation patterns within this group. Whole-genome alignments and genome-to-genome distance calculations demonstrated that Oleiphilaceae consists of four distinct genome clusters that correspond to the species level. Furthermore, we suggest that all known Oleiphilaceae genomes cluster into two genera, the first being Oleiphilus, which includes O. messinensis ME102, and the second represented by bacteria isolated near Hawaii. The Oleiphilaceae pangenome contains 1796 core gene clusters, which roughly corresponds to two-thirds of an Oleiphilaceae genome. All high-quality genomes had double copies of almA, which codes for a flavin-binding family monooxygenase linked with the degradation of long-chain alkanes. Alkane monooxygenases with pairwise identities between 43% and 86.5% were encoded by four genomes, with two of them having double loci. Cytochromes P450 were present in all genomes and were assigned to two distinct clusters, which, together with the low redundancy of alkane monooxygenases, points to different microorganisms as the sources from which Oleiphilaceae acquired their alkane-monooxygenation enzymes.

1 Introduction

Hydrocarbonoclastic bacteria (HCB) represent a wide group of microorganisms involved in the degradation of oil and oil derivatives, which are common pollutants in marine environments. HCB are present in Proteobacteria, Actinobacteria, Firmicutes, the FCB (Fibrobacteres, Chlorobi, and Bacteroidetes) group, and other bacteria. In turn, known obligate HCB, for which hydrocarbons are the sole, or almost sole, source of energy and carbon, are represented only by taxa within Gammaproteobacteria, with the exception of Thalassospira (Alphaproteobacteria) (Berry and Gutierrez 2017). To date, despite the availability of a vast number of metagenome-derived genome sequences in public genome databases, only a small number of studies have involved comparative genomics and/or detailed genome-based analysis of phylogenetic diversity within different groups of obligate HCB. One such group is the family Oleiphilaceae, which includes Oleiphilus messinensis, one of the first discovered obligate alkane degraders, isolated and described in 1998 (Yakimov et al. 1998) and sequenced only recently, in 2017 (Toshchakov et al. 2017). Here we present a brief analysis of publicly available Oleiphilaceae genomes, including verification of taxonomic attribution based on 16S rRNA genes, average nucleotide identity, and pangenome analysis.

2 Type Strain, Global Distribution, and Genomic Properties of Oleiphilaceae

The type strain, O. messinensis strain ME102 (= DSM 13489), was isolated in Messina harbor, Italy (38° 11′ 22″ N; 15° 33′ 55″ E) from sediments polluted by hydrocarbons from heavy marine traffic, and formed a new genus, Oleiphilus, within the novel family Oleiphilaceae in the order Oceanospirillales (Gammaproteobacteria) (Golyshin et al. 2002). For many years, O. messinensis ME102 was the only strain in the Oleiphilaceae with a validly published name, despite the presence of Oleiphilus-related bacteria, e.g., in:
  1. Subtidal sediments of the North Atlantic coast of Spain impacted by the tanker Prestige oil spill (JQ580103.1, Rodas Beach, 42° 13′ 56″ N; 8° 53′ 50″ W and JQ579692.1, Figueiras Beach, 42° 8′ 8″ N; 8° 32′ 6″ W) (Acosta-González et al. 2013)
  2. Chronically contaminated coastal sediments of Etang de Berre lagoon (FM242233.1, France, 43° 28′ 00″ N; 5° 10′ 00″ E) (Yakimov and Golyshin 2014)
  3. 100-m-deep water sampled from the Southern Ocean iron fertilization experiment (JX530194.1, Southern Ocean, 47° 30′ 05″ S, 15° 26′ 42″ W) (Singh et al. 2015)
  4. Enrichment cultures derived from samples taken from the bottom of the euphotic zone (deep chlorophyll maximum, 95 m depth) and the upper mesopelagic zone (250 m depth) near the Hawaiian Islands (22° 46′ 41.5″ N, 158° 04′ 10.2″ W) (Sosa et al. 2017)

Thus, Oleiphilaceae can be found in a wide range of marine environments from tropical to moderate, in both the water column and sediments, showing high potential for adaptation to environmental stimuli in diverse and rapidly changing marine environments (Fig. 1).
Fig. 1

Sampling sites where Oleiphilaceae bacteria were detected. White circles mark sampling sites where 16S rRNA gene sequences of Oleiphilaceae bacteria were detected, in particular: sediments of Messina harbor, Sicily, the site of O. messinensis type strain isolation (CP021425); subtidal sediments of the North Atlantic coast of Spain (JQ580103 and JQ579692); chronically contaminated coastal sediments of Etang de Berre lagoon (FM242233); and 100-m-deep water sampled from the Southern Ocean iron fertilization experiment (JX530194)

The complete genome of O. messinensis ME102T was sequenced and analyzed recently, revealing genes necessary for both short- and long-chain alkane catabolism, as well as coding potential for the utilization of other alkane derivatives, e.g., a gene for haloalkane dehalogenase (Toshchakov et al. 2017). Notably, strain ME102T showed an unprecedented level of genome mobility for an obligate HCB (OHCB), stressing the importance of wider genome analysis of Oleiphilaceae representatives to better understand their biology and diversity. In 2017, the family Oleiphilaceae gained 27 draft genomes, acquired during a study of bacterial degraders of phosphonates associated with high-molecular-weight dissolved organic matter (HMWDOM) produced by photosynthetic microorganisms of surface waters. Water samples for that study were taken near the Hawaiian Islands from the bottom of the euphotic zone (deep chlorophyll maximum, 95 m depth) and the upper mesopelagic zone (250 m depth). These samples were used to set up a series of enrichments amended with HMWDOM obtained by ultrafiltration of seawater collected at 20 m depth. High-throughput sequencing of the resulting cultures showed that the most abundant group was Oleiphilus-related Gammaproteobacteria, which were specifically enriched in mesopelagic samples. The abundance of Oleiphilus-related OTUs increased continuously with sampling depth, reaching a maximum at >200 m depth. The authors also reported that the growth of Oleiphilaceae isolates was significantly stimulated by the addition of HMWDOM, hydrocarbon compounds, and fatty acids, reflecting the broad substrate specificity of Oleiphilaceae representatives (Sosa et al. 2017).

3 Genome-Based Assessment of Taxonomic Relatedness in Oleiphilaceae

As of January 2019, 28 genome assemblies attributed to Oleiphilus had been deposited in NCBI GenBank, with the only complete genome belonging to O. messinensis ME102T (Table 1). All available genome assemblies were assessed for completeness and contamination using CheckM (Parks et al. 2014) with the Oceanospirillales marker set. O. messinensis ME102T showed 99.13% completeness, lacking only three marker genes and having five duplicated marker genes, which can be explained by extensive gene transfer and the large number of mobile elements in the genome (Toshchakov et al. 2017). Initial assemblies of the “Hawaiian group” had a mean completeness of 98.92 ± 0.85% and a mean contamination of 49.35 ± 24.88% (contamination ranged from 19.80% to 99.67%). Genome contamination of more than 15% is usually considered “very high” and occurs mostly in single-cell assemblies and metagenome-derived genomes (Parks et al. 2017). Detailed information about all publicly available Oleiphilaceae genome assemblies is shown in Table 1.
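The contamination summary quoted above can be recomputed from the Table 1 values. The following is a minimal sketch rather than the authors' procedure: the variable names are illustrative, the 15% cut-off is the one quoted in the text, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

# Contamination (%) of the 27 original "Hawaiian group" assemblies, copied from Table 1.
contamination = {
    "HI0009": 89.45, "HI0043": 49.91, "HI0050": 38.33, "HI0061": 40.59,
    "HI0065": 93.42, "HI0066": 56.3, "HI0067": 49.52, "HI0068": 29.58,
    "HI0069": 60.24, "HI0071": 70.33, "HI0072": 36.27, "HI0073": 90.21,
    "HI0078": 20.89, "HI0079": 33.79, "HI0080": 22.59, "HI0081": 23.99,
    "HI0085": 28.47, "HI0086": 28.81, "HI0117": 19.8, "HI0118": 99.67,
    "HI0122": 82.6, "HI0123": 21.72, "HI0125": 47.16, "HI0128": 36.07,
    "HI0130": 52.88, "HI0132": 36.0, "HI0133": 73.76,
}

VERY_HIGH = 15.0  # "very high" contamination threshold quoted in the text

values = list(contamination.values())
summary = {
    "n": len(values),
    "mean": round(mean(values), 2),
    "sd": round(stdev(values), 2),  # sample SD; the estimator is an assumption
    "min": min(values),
    "max": max(values),
    # Strains whose contamination exceeds the threshold
    "very_high": sorted(s for s, c in contamination.items() if c > VERY_HIGH),
}
print(summary)
```

Under these assumptions, every original Hawaiian assembly falls above the 15% threshold, which is consistent with the decision to reassemble the genomes from raw reads.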
Table 1

Oleiphilaceae genome assemblies available by January 2019 in NCBI Assembly database

Cluster  GenBank accession  Strain                Contig number  Assembly length  N50      Completeness, %  Contamination, %  Strain heterogeneity, %
4        GCA_002162375.1    O. messinensis ME102  1              6379281          6379281  99.13            1.67              0
3        GCA_001635535.1    HI0009                2650           6316839          16347    99.53            89.45             88.08
1        GCA_001634935.1    HI0043                1975           5472755          25101    99.31            49.91             93.57
1        GCA_001635615.1    HI0050                2036           4819895          5000     97.29            38.33             89.22
1        GCA_001635675.1    HI0061                1885           5095272          8507     99.04            40.59             89.59
2        GCA_001635715.1    HI0065                2693           6398614          28355    99.56            93.42             96.27
3        GCA_001635725.1    HI0066                1887           5044867          22586    99.31            56.3              93.89
3        GCA_001635735.1    HI0067                1752           4840075          20427    99.53            49.52             93.1
1        GCA_001635765.1    HI0068                2001           4842151          5172     98.34            29.58             87.58
1        GCA_001635795.1    HI0069                2238           5901752          14127    99.25            60.24             90.16
2        GCA_001635805.1    HI0071                2289           5698317          24676    99.56            70.33             95.28
1        GCA_001634975.1    HI0072                2192           4921543          4687     98.82            36.27             90.05
2        GCA_001634985.1    HI0073                2628           6103244          56040    99.34            90.21             96.07
1        GCA_001635075.1    HI0078                2530           4431832          2905     96.47            20.89             76.11
2        GCA_001635105.1    HI0079                1448           4407017          13007    99.56            33.79             91.33
2        GCA_001635135.1    HI0080                1249           3938001          8128     99.56            22.59             87.96
1        GCA_001635145.1    HI0081                2463           4402732          3030     98.43            23.99             88.06
1        GCA_001635215.1    HI0085                3230           4482542          2080     97.01            28.47             83.23
1        GCA_001635225.1    HI0086                1534           4753747          10771    99.25            28.81             90.32
1        GCA_001635235.1    HI0117                1521           4620570          7176     98.49            19.8              88.19
2        GCA_001635265.1    HI0118                2826           6560891          15926    99.56            99.67             94.23
2        GCA_001635305.1    HI0122                2490           6039240          37188    99.56            82.6              94.25
1        GCA_001635315.1    HI0123                1909           4468126          5022     98.21            21.72             85.12
3        GCA_001635355.1    HI0125                1531           4553828          19709    99.53            47.16             92.75
1        GCA_001635365.1    HI0128                1636           4885910          11631    98.49            36.07             92.75
2        GCA_001635385.1    HI0130                2064           5194401          10326    99.56            52.88             96.11
1        GCA_001635845.1    HI0132                1901           5170237          9473     98.82            36                92.27
2        GCA_001635835.1    HI0133                2393           5728633          47727    99.56            73.76             94.93

Because of the poor quality of the “Hawaiian group” genome assemblies, the draft genomes were reassembled from the raw sequencing reads of the study by Sosa and coauthors, deposited in the NCBI Sequence Read Archive (Kodama et al. 2011). Sequencing reads were quality-trimmed and filtered by length using the fastq-mcf tool from the ea-utils package (Aronesty 2013). Overlapping paired reads were merged using the SeqPrep tool (https://github.com/jstjohn/SeqPrep). De novo genome assembly was performed using SPAdes 3.10 (Bankevich et al. 2012) in “careful” mode with default settings and automatic choice of k-mer length. The reassembled genomes showed much better levels of completeness and contamination, which suggests that minor contamination of the dilution-to-extinction enrichment cultures obtained by Sosa et al. had, for reasons that remain unclear, significantly compromised the quality of the original assemblies. Thus, the initial assemblies had a mean length of 5.15 Mbp with significant levels of contamination and strain heterogeneity (Table 1), while the reassembled genomes had a mean length of 3.39 Mbp, mean completeness of 94.06 ± 7.07%, and mean contamination of 3.20 ± 1.51% (Table 2). To assess the agreement between the two assembly pipelines, average nucleotide identities (ANI) between the original genome assemblies (Sosa et al. 2017) and the respective reassembled genomes were calculated using the ani.rb script from the “enveomics” tool collection (https://github.com/lmrodriguezr/enveomics; Rodriguez-R and Konstantinidis 2016). All ANI values were close to 100%, indicating that the biological conclusions made by Sosa and coauthors should not be compromised by the low homogeneity of their assemblies.
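Tables 1 and 2 report N50 as a contiguity metric without defining it. As a reminder of what the number means, here is a minimal sketch of the standard N50 definition; the function name `n50` is illustrative, and the assembly tools used in the study compute this value themselves.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# A single-contig assembly has N50 equal to its length,
# as for the complete O. messinensis ME102 genome in Table 1.
print(n50([6379281]))
# Toy contig set: total length 32, so the two contigs of length 8 reach the halfway point.
print(n50([8, 8, 4, 3, 3, 2, 2, 2]))
```

A higher N50 together with fewer contigs, as seen for the reassembled genomes in Table 2, indicates a less fragmented assembly.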
Table 2

Reassembled Oleiphilaceae draft genomes, representing distinct clusters, revealed with ANI analysis and clustering

Cluster  Strain  Contig number  Assembly length  N50     Completeness, %  Contamination, %  Strain heterogeneity, %
3        HI0009  235            3593725          41038   99.53            4.27              0
1        HI0043  504            3686850          12403   98.60            2.65              9.09
2        HI0118  98             3382672          119088  99.56            0.96              0


4 Four Distinct Genomic Clusters Within Oleiphilaceae

Reassembled genomes from 27 “Hawaiian” strains (Sosa et al. 2017) and the genome of O. messinensis ME102 (Toshchakov et al. 2017) were clustered on the basis of ANI (calculated as described above) in the R environment using hierarchical cluster analysis (Fig. 2). ANI values ranged from 77.61% to 99.99%. Hierarchical clustering resulted in four distinct genome clusters (Fig. 2). The minimal intra-cluster ANI value was 99.41 ± 0.36%, which leads to the conclusion that each cluster contains genomes of the same species. For several intra-cluster pairs, digital DNA–DNA hybridization (dDDH) was performed using the genome-to-genome distance calculator (Meier-Kolthoff et al. 2013), resulting in values close to 100% (data not shown). Clustering of the reassembled genomes by ANI value gave the same results as clustering of the original assemblies (data not shown).
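To illustrate how pairwise ANI values translate into genome clusters, the sketch below groups genomes by a fixed ANI cut-off. Note that this is a simpler, swapped-in technique (threshold-based single-linkage grouping) and not the hierarchical cluster analysis in R used in the study; the genome names `g1` to `g5` and all ANI values are hypothetical placeholders, and the 95% default is the species-level threshold discussed later in the text.

```python
def cluster_by_ani(names, ani, threshold=95.0):
    """Group genomes so that any pair with ANI >= threshold ends up
    in the same cluster (single linkage via union-find)."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in ani.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical genomes and ANI values (%), for illustration only.
names = ["g1", "g2", "g3", "g4", "g5"]
toy_ani = {
    ("g1", "g2"): 99.6, ("g1", "g3"): 84.0, ("g1", "g4"): 83.5, ("g1", "g5"): 78.0,
    ("g2", "g3"): 84.2, ("g2", "g4"): 83.7, ("g2", "g5"): 78.1,
    ("g3", "g4"): 99.4, ("g3", "g5"): 77.9,
    ("g4", "g5"): 77.7,
}
clusters = cluster_by_ani(names, toy_ani)
print(clusters)
```

When intra-cluster ANI is far above the cut-off and inter-cluster ANI is far below it, as reported for the Oleiphilaceae genomes, a threshold grouping of this kind and a hierarchical dendrogram would be expected to give the same partition.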
Fig. 2

Average nucleotide identity between Oleiphilaceae genomes. Heatmap displays hierarchical clustering of Oleiphilaceae genomes based on average nucleotide identities between genome assemblies. Color represents the ANI value (white corresponding to 77.61% ANI and dark green to almost identical genomes with 99.99% ANI)

For further analysis, one genomic assembly was selected from each cluster on the basis of high genome completeness, number of contigs, N50 value, and level of contamination: Oleiphilus sp. HI0043, Oleiphilus sp. HI0118, Oleiphilus sp. HI0009, and O. messinensis strain ME102T. 16S rRNA genes were identified using the Infernal tool (Nawrocki and Eddy 2013) and the Rfam database (Nawrocki et al. 2014) and aligned against each other using BLASTn (Altschul et al. 1990). O. messinensis ME102T had the most distant 16S rRNA gene sequence, showing just 93.37% average identity to the Hawaiian strains (Table 3). Meanwhile, the 16S rRNA genes of HI0043, HI0118, and HI0009 shared more than 96.5% identity (Table 3). dDDH was performed for all possible pairs of the four genomes and showed that no pair of genomes could be considered to belong to the same species (data not shown).
Table 3

16S rRNA gene identity matrix; values at intersections are gene identities in percent

                       Oleiphilus sp. HI0043  Oleiphilus sp. HI0118  O. messinensis ME102T
Oleiphilus sp. HI0009  96.74                  98.17                  93.56
Oleiphilus sp. HI0043  -                      96.54                  92.91
Oleiphilus sp. HI0118  -                      -                      93.63

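The average identity quoted in the text can be reproduced directly from the Table 3 values. This is a small arithmetic check, not part of the original analysis; the dictionary layout and the function name `mean_identity_to_others` are illustrative choices.

```python
from statistics import mean

# Pairwise 16S rRNA gene identities (%) copied from Table 3.
identity = {
    frozenset({"HI0009", "HI0043"}): 96.74,
    frozenset({"HI0009", "HI0118"}): 98.17,
    frozenset({"HI0009", "ME102"}): 93.56,
    frozenset({"HI0043", "HI0118"}): 96.54,
    frozenset({"HI0043", "ME102"}): 92.91,
    frozenset({"HI0118", "ME102"}): 93.63,
}


def mean_identity_to_others(strain):
    """Average identity between one strain and every other strain in the table."""
    return mean(v for pair, v in identity.items() if strain in pair)


# Average identity of ME102 to the three Hawaiian strains.
me102_mean = round(mean_identity_to_others("ME102"), 2)
# Lowest identity among the Hawaiian strains themselves.
hawaiian_min = min(v for pair, v in identity.items() if "ME102" not in pair)
print(me102_mean, hawaiian_min)
```

The two numbers correspond to the 93.37% average identity of ME102T to the Hawaiian strains and to the statement that the Hawaiian strains share more than 96.5% identity with each other.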

5 New Uncultivated Taxa Within Oleiphilaceae

According to the 95% 16S rRNA gene identity threshold for genus delimitation (Stackebrandt and Goebel 1994), we propose that Oleiphilus sp. HI0043, Oleiphilus sp. HI0118, and Oleiphilus sp. HI0009 form a new genus of Oleiphilaceae. Analysis of ANI values and dDDH estimates suggests that these genomes represent three different species. Thus, while the 16S rRNA gene sequences of Oleiphilus sp. HI0118 and Oleiphilus sp. HI0009 have 98.17% identity, falling close to the 98.65% 16S rRNA identity threshold (Kim et al. 2014), the ANI value between these genomes is significantly lower than the 95% threshold for ANI-based species delimitation proposed by Richter and Rosselló-Móra (2009).
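The threshold reasoning in this paragraph can be written down as a pair of simple checks. This is a simplified sketch of the decision logic, not a taxonomic tool: the function names are illustrative, requiring both markers to pass for a same-species call is an assumption made here, and the ANI value of 85.0 used in the example is a placeholder, since the text only states that the real value is well below 95%.

```python
GENUS_16S = 95.0     # genus-level 16S rRNA identity threshold cited in the text
SPECIES_16S = 98.65  # species-level 16S rRNA identity threshold (Kim et al. 2014)
SPECIES_ANI = 95.0   # species-level ANI threshold (Richter and Rosselló-Móra 2009)


def same_genus(identity_16s):
    """True if the 16S identity reaches the genus-level threshold."""
    return identity_16s >= GENUS_16S


def same_species(identity_16s, ani):
    """True only if both the 16S identity and the ANI reach their thresholds
    (a simplifying assumption of this sketch)."""
    return identity_16s >= SPECIES_16S and ani >= SPECIES_ANI


# HI0118 vs HI0009: 16S identity from Table 3; the ANI is a hypothetical placeholder.
print(same_genus(98.17), same_species(98.17, 85.0))
# HI0009 vs ME102: 16S identity from Table 3 is below the genus threshold.
print(same_genus(93.56))
```

With the Table 3 identities, the Hawaiian strains fall within one genus while O. messinensis ME102T falls outside it, which matches the two-genus proposal made in the text.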

The 16S rRNA sequences of O. messinensis, of representatives of the “Hawaiian group,” and of the closest uncultivated microorganisms found in the NCBI nr nucleotide database were used for phylogenetic reconstruction of Oleiphilaceae (Fig. 3). Multiple sequence alignment was performed using MUSCLE (Edgar 2004), and a phylogenetic tree was inferred using RAxML (Stamatakis 2014) with the GTR-Gamma model. The phylogenetic reconstruction supports our proposal and indicates the possible presence of 3–4 genus-level and 5–6 species-level lineages based on 16S rRNA gene sequences.
Fig. 3

Phylogenetic reconstruction of Oleiphilaceae based on 16S rRNA gene sequences. Acc. no. JQ580103.1 clone RII-OX016 and acc. no. JQ579692.1 clone FII-OX043 are from subtidal sediments of the North Atlantic coast of Spain impacted by the tanker Prestige oil spill; acc. no. FM242233.1 clone 26 T0 h-oil is from chronically contaminated coastal sediments of Etang de Berre lagoon, France; acc. no. JX530194.1 clone C146100020 is from 100 m depth water samples derived from the Southern Ocean iron fertilization experiment (Singh et al. 2015). Bootstrap support values are displayed next to nodes (based on 100 resamplings). The 16S rRNA gene sequence of A. borkumensis was used as an out-group. The scale bar represents the estimated number of nucleotide substitutions per site

6 Pangenome, Hydrocarbon Utilization, and Osmoprotection

Reassembled genomes of strains HI0043, HI0118, and HI0009 and the genome of O. messinensis strain ME102 were used for prediction of protein-coding genes and subsequent pangenome analysis. Protein-coding sequences were predicted using Prodigal in single-genome mode (Hyatt et al. 2010). Pangenome analysis was performed using OrthoMCL according to the instructions provided by the software developers (Li et al. 2003). The results of the pangenome analysis were visualized using the ClusterVenn web server (Wang et al. 2015). The Oleiphilaceae pangenome consists of 3117 clusters including 11,488 protein-coding genes, of which 1686 were single-copy gene clusters. Furthermore, clusters containing genes from only one genome were found: 198 such clusters for O. messinensis strain ME102, 25 for HI0009, 24 for HI0043, and 12 for HI0118. These clusters mainly consist of mobile elements or proteins of unknown function, supporting the remote phylogenetic position of O. messinensis ME102T relative to the “Hawaii strains.” In total, 3997 genes were found to be singletons (825 for HI0043, 840 for HI0009, 338 for HI0118, and 1994 for ME102, which is in good agreement with genome size). Strains HI0009 and HI0118 have the closest proteomes, supporting the 16S rRNA-based phylogenetic reconstruction: 2370 shared gene clusters and 228 and 273 unique clusters, respectively (Fig. 4).
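The categories used in this paragraph (core clusters, single-copy clusters, genome-specific clusters) can be derived from any table of orthologous clusters. The sketch below shows one way to count them; it does not reproduce OrthoMCL itself, and the cluster IDs, genome labels, and gene IDs are hypothetical toy data. The final line only re-adds the singleton counts reported in the text.

```python
from collections import Counter


def summarize_pangenome(clusters, genomes):
    """Classify orthologous clusters by which genomes they contain.

    clusters: dict mapping cluster ID -> list of (genome, gene_id) pairs.
    Returns overall category counts and genome-specific cluster counts.
    """
    summary = Counter()
    specific_per_genome = Counter()
    all_genomes = set(genomes)
    for members in clusters.values():
        present = {genome for genome, _ in members}
        if present == all_genomes:
            summary["core"] += 1
            # Single-copy core: exactly one gene from each genome.
            if len(members) == len(all_genomes):
                summary["single_copy_core"] += 1
        elif len(present) == 1:
            summary["genome_specific"] += 1
            specific_per_genome[next(iter(present))] += 1
        else:
            summary["shared_by_some"] += 1
    return summary, specific_per_genome


# Hypothetical toy input with three genomes.
toy_clusters = {
    "c1": [("A", "a1"), ("B", "b1"), ("C", "c1")],
    "c2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],
    "c3": [("A", "a4"), ("B", "b3")],
    "c4": [("C", "c3"), ("C", "c4")],
}
summary, specific = summarize_pangenome(toy_clusters, ["A", "B", "C"])
print(dict(summary), dict(specific))

# Singleton counts reported in the text add up to the stated total.
singleton_total = 825 + 840 + 338 + 1994
print(singleton_total)
```

In this scheme a genome-specific cluster holds paralogs found in one genome only, whereas singletons are genes that were not placed in any cluster, which is why the two sets of numbers in the text differ.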
Fig. 4

Oleiphilaceae pangenome. Top chart: shared and unique gene clusters in Oleiphilaceae genomes. Middle panel: gene clusters number shown for each genome. Bottom panel: numbers of gene clusters shared by 4, 3, or 2 genomes and specific gene clusters

We analyzed the gene clusters responsible for hydrocarbon utilization in the genomes of Oleiphilaceae bacteria. Genes of enzymes involved in hydrocarbon utilization and ectoine biosynthesis were identified either by alignment against annotated sequences from the NCBI nr protein database using the BLASTp algorithm (Altschul et al. 1997) or by alignment against the respective Pfam HMM profile (Finn et al. 2015) using the hmmsearch tool from the HMMER package (Eddy 2011). Two clusters of cytochrome P450 monooxygenases were identified: one containing proteins present in all four genomes analyzed and the second present in only three genomes (HI0009, HI0118, and O. messinensis ME102T). Notably, the O. messinensis ME102T genome contains three CYP450 genes, one of which is disrupted by an active IS4 mobile element, which underlines the dynamic nature of Oleiphilaceae genomes (Toshchakov et al. 2017). The gene almA, coding for a flavin-binding family monooxygenase linked with the degradation of long-chain alkanes (Shao and Wang 2013), was found twice in all genomes except that of strain HI0043. All seven sequences formed one orthologous cluster. Alkane monooxygenases were present as single copies in O. messinensis strain ME102T and strain HI0009, and as double copies in strains HI0043 and HI0118.

Ectoine is a compatible solute that protects bacteria against osmotic stress (Louis and Galinski 1997). To study possible differences in adaptation to lower temperatures, high osmolarity, or elevated hydrostatic pressure, we analyzed the genetic loci responsible for ectoine biosynthesis. The ectoine synthesis operon ectABCR includes genes coding for diaminobutyric acid (DABA) aminotransferase (EctB), DABA acetyltransferase (EctA), ectoine synthase (EctC), and a MarR-like transcriptional regulatory protein (EctR). This operon was found in all Oleiphilaceae genomes except HI0009, which was sampled from a depth of 95 m, as opposed to HI0043 and HI0118, which were sampled from 250 m (Sosa et al. 2017). An increased level of transcription of the ectoine biosynthesis operon at elevated hydrostatic pressure was reported for Alcanivorax borkumensis (Scoma et al. 2016). Although the role of ectoine in the response to stress induced by hydrostatic pressure is unclear, the lack of an ectoine biosynthesis operon in the shallow-water strain HI0009 allows us to speculate that this solute might be used by Oleiphilaceae not only as an osmoprotector but also as a piezoprotector.

7 Conclusion

In this study, we analyzed all Oleiphilaceae genomes publicly available by the beginning of 2019 and the closest environmental 16S rRNA gene sequences. Whole-genome alignments and digital DNA–DNA hybridization demonstrated that Oleiphilaceae includes four distinct genome clusters that correspond to species. Additional 16S rRNA gene alignment and phylogenetic reconstruction showed that Oleiphilaceae genomes could be divided into two genera; the first includes O. messinensis ME102 and the second is represented by bacteria isolated near the Hawaiian Islands. The low assembly quality and high level of contamination motivated us to reassemble the genomes of the “Hawaiian group” from the raw reads available at NCBI SRA. The resulting assemblies had higher completeness, lower contamination, improved N50 metrics, and fewer contigs.

Pangenome analysis of the complete and reassembled genomes of Oleiphilaceae resulted in 1796 core gene clusters, which roughly correspond to two-thirds of an Oleiphilaceae genome. The number of unique genes in each genome is in good agreement with genome size and distance from the other genomes. All genomes of Oleiphilaceae bacteria have genes responsible for alkane degradation, such as genes coding for alkane monooxygenase, flavin-binding family monooxygenase, and cytochrome P450 monooxygenase.

Notes

Acknowledgments

The work of ST was supported by the RSF project # 17-74-30025. MF acknowledges grants PCIN-2014-107 (within ERA NET IB2 grant nr. ERA-IB-14-030—MetaCat), PCIN-2017-078 (within the Marine Biotechnology ERA-NET (ERA-MBT) funded under the European Commission’s Seventh Framework Programme, 2013–2017, Grant agreement 604814), BIO2014-54494-R, and BIO2017-85522-R from the Ministerio de Ciencia, Innovación y Universidades, formerly Ministerio de Economía, Industria y Competitividad. MMY, TNC, OVG, MF, KEJ, and PNG received funding from the European Union’s Horizon 2020 research and innovation program Blue Growth: Unlocking the potential of Seas and Oceans under grant agreement no. [634486] (project acronym INMARE). PNG acknowledges ERA NET IB2, grant no. ERA-IB-14-030, and UK Biotechnology and Biological Sciences Research Council (BBSRC), grant no. BB/M029085/1. TCH, OVG and PNG acknowledge the support of the Centre for Environmental Biotechnology Project funded by the European Regional Development Fund (ERDF) through the Welsh Government.

References

  1. Acosta-González A, Rosselló-Móra R, Marqués S (2013) Characterization of the anaerobic microbial community in oil-polluted subtidal sediments: aromatic biodegradation potential after the Prestige oil spill. Environ Microbiol 15(1):77–92
  2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410
  3. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402
  4. Aronesty E (2013) Comparison of sequencing utility programs. The Open Bioinformatics Journal 7:1–8. https://doi.org/10.2174/1875036201307010001
  5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA (2012) SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19(5):455–477
  6. Berry D, Gutierrez T (2017) Evaluating the detection of hydrocarbon-degrading bacteria in 16S rRNA gene sequencing surveys. Front Microbiol 8:896
  7. Eddy SR (2011) Accelerated profile HMM searches. PLoS Comput Biol 7(10):e1002195
  8. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32(5):1792–1797
  9. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A (2015) The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res 44(D1):D279–D285
  10. Golyshin PN, Chernikova T, Abraham WR, Lunsdorf H, Timmis KN, Yakimov MM (2002) Oleiphilaceae fam. nov., to include Oleiphilus messinensis gen. nov., sp. nov., a novel marine bacterium that obligately utilizes hydrocarbons. Int J Syst Evol Microbiol 52:901–911
  11. Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11(1):119
  12. Kim M, Oh HS, Park SC, Chun J (2014) Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int J Syst Evol Microbiol 64(2):346–351
  13. Kodama Y, Shumway M, Leinonen R (2011) The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res 40(D1):D54–D56
  14. Li L, Stoeckert CJ, Roos DS (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13(9):2178–2189
  15. Louis P, Galinski EA (1997) Characterization of genes for the biosynthesis of the compatible solute ectoine from Marinococcus halophilus and osmoregulated expression in Escherichia coli. Microbiology 143(4):1141–1149
  16. Meier-Kolthoff JP, Auch AF, Klenk H-P, Göker M (2013) Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics 14(1):60
  17. Nawrocki EP, Eddy SR (2013) Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29(22):2933–2935
  18. Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD (2014) Rfam 12.0: updates to the RNA families database. Nucleic Acids Res 43(D1):D130–D137
  19. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW (2014) Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055
  20. Parks DH, Rinke C, Chuvochina M, Chaumeil PA, Woodcroft BJ, Evans PN, Hugenholtz P, Tyson GW (2017) Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat Microbiol 2:1533–1542. https://doi.org/10.1038/s41564-017-0012-7
  21. Richter M, Rosselló-Móra R (2009) Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci USA 106(45):19126–19131
  22. Rodriguez-R LM, Konstantinidis KT (2016) The enveomics collection: a toolbox for specialized analyses of microbial genomes and metagenomes. PeerJ Preprints 4:e1900v1
  23. Scoma A, Barbato M, Borin S, Daffonchio D, Boon N (2016) An impaired metabolic response to hydrostatic pressure explains Alcanivorax borkumensis recorded distribution in the deep marine water column. Sci Rep 6:31316. https://doi.org/10.1038/srep31316
  24. Shao Z, Wang W (2013) Enzymes and genes involved in aerobic alkane degradation. Front Microbiol 4:116
  25. Singh SK, Kotakonda A, Kapardar RJ, Kankipati HK, Rao PS, Sankaranarayanan PM, Vetaikorumagan SR, Gundlapally SP, Nagappa R, Shivaji S (2015) Response of bacterioplankton to iron fertilization of the Southern Ocean, Antarctica. Front Microbiol 6:863
  26. Sosa OA, Repeta DJ, Ferrón S, Bryant JA, Mende DR, Karl DM, DeLong EF (2017) Isolation and characterization of bacteria that degrade phosphonates in marine dissolved organic matter. Front Microbiol 8:1786
  27. Stackebrandt E, Goebel BM (1994) Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int J Syst Bacteriol 44:846–849
  28. Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30(9):1312–1313
  29. Toshchakov SV, Korzhenkov AA, Chernikova TN, Ferrer M, Golyshina OV, Yakimov MM, Golyshin PN (2017) The genome analysis of Oleiphilus messinensis ME102 (DSM 13489T) reveals backgrounds of its obligate alkane-devouring marine lifestyle. Mar Genomics 36:41–47
  30. Wang Y, Coleman-Derr D, Chen G, Gu YQ (2015) OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res 43:W78–W84
  31. Yakimov MM, Golyshin PN (2014) The Family Oleiphilaceae. In: Rosenberg E, DeLong EF, Lory S, Stackebrandt E, Thompson F (eds) The Prokaryotes. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38922-1_285
  32. Yakimov MM, Golyshin PN, Lang S, Moore ER, Abraham WR, Lünsdorf H, Timmis KN (1998) Alcanivorax borkumensis gen. nov., sp. nov., a new, hydrocarbon-degrading and surfactant-producing marine bacterium. Int J Syst Bacteriol 48(2):339–348

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Aleksei A. Korzhenkov (1)
  • Stepan V. Toshchakov (2)
  • Olga V. Golyshina (3)
  • Manuel Ferrer (4)
  • Tatyana N. Chernikova (3)
  • Karl-Erich Jaeger (5)
  • Michail M. Yakimov (6)
  • Peter N. Golyshin (3; corresponding author)
  1. National Research Centre “Kurchatov Institute”, Moscow, Russia
  2. Winogradsky Institute of Microbiology, FRC “Biotechnology” RAS, Moscow, Russia
  3. School of Natural Sciences, Bangor University, Bangor, UK
  4. Department of Applied Biocatalysis, CSIC – Institute of Catalysis, Madrid, Spain
  5. Helmholtz Centre Juelich, Heinrich Heine University of Düsseldorf, Jülich, Germany
  6. Institute for Biological Resources and Marine Biotechnology, CNR, Messina, Italy
