Recent metagenomics surveys have provided insights into the marine virosphere. However, these surveys have focused solely on viruses in seawater, neglecting those associated with biofilms. By analyzing 1.75 terabases of biofilm metagenomic data, 3974 viral sequences were identified from eight locations around the world. Over 90% of these viral sequences were not found in previously reported datasets. Comparisons between biofilm and seawater metagenomes identified viruses that are endemic to the biofilm niche. Analysis of viral sequences integrated within biofilm-derived microbial genomes revealed potential functional genes for trimeric autotransporter adhesin and polysaccharide metabolism, which may contribute to biofilm formation by the bacterial hosts. However, more than 70% of the genes could not be annotated. These findings show marine biofilms to be a reservoir of novel viruses and have enhanced our understanding of natural virus-bacteria ecosystems.
Viruses make a significant contribution to nutrient and energy conversion processes in marine ecosystems via the modulation of the structure and functions of protistan, bacterial, and archaeal communities (Breitbart 2012; Kristensen et al. 2011; Suttle 2005; Zhang et al. 2014). The viral shunt releases around 10 billion tons of carbon per day and is probably a fundamental step in marine carbon cycling (Breitbart 2012). Viruses that infect bacterial hosts, known as phages, express auxiliary metabolic genes (AMGs) that influence the central metabolic processes of their hosts, such as photosynthesis and nutrient acquisition (Thompson et al. 2011; Xu et al. 2018).
However, because of their large and highly dynamic populations, the vast majority of marine viruses remain unexplored. Considerable effort has been devoted to the isolation of viruses infecting many major marine bacterial lineages, such as Prochlorococcus (Sullivan et al. 2003), SAR11 (Zhao et al. 2013), and Roseobacter (Zhang et al. 2019a). Recent advances in culture-free approaches (e.g., metagenomics) have facilitated an unprecedented increase in the analysis of the diversity of marine microbes (bacteria and archaea) and viruses (reviewed in Coutinho et al. 2018). The global ocean dsDNA viromic dataset was established by the Tara Oceans project with the goal of exploring ocean virus diversity to better understand the ecological and evolutionary drivers behind these viral communities and to reveal new mechanisms by which these viruses affect global oceanic microbial processes (Brum et al. 2015). In addition to the Tara Oceans project, several other projects have revealed viruses to be the most abundant biological entities in marine ecosystems, e.g., in coral reefs (Thurber et al. 2017) and in marine sediments (Danovaro et al. 2008; Engelhardt et al. 2014).
Most oceanic surveys have focused on the viruses infecting free-living bacterioplankton, while those associated with biofilms have been neglected. Biofilm formation confers several ecological advantages on bacteria and archaea, such as environmental protection, increased access to nutrients, and enhanced interspecies interactions (Dang and Lovell 2016). Biofilm architecture also confers individual and collective protection against viruses (Vidakovic et al. 2018). Biofilms supported on artificial surfaces have been used as models to study biofilm developmental processes, microbe-invertebrate interactions, and novel microbial diversity and functions in marine environments (Chung et al. 2010; Salta et al. 2013; Zhang et al. 2015). In a recent study, Zhang et al. (2019b) examined 101 biofilm samples formed on man-made panels and natural rocks immersed in eight locations across the Atlantic, Indian, and Pacific Oceans and investigated the microbial (bacterial and archaeal) diversity and functional potential within these microbiomes. In the present study, we analyzed the viral sequences extracted from the same 101 biofilm metagenomes with the aim of obtaining a systematic understanding of viral diversity and function.
Metagenomic identification of viruses in marine biofilms
The locations where the 101 biofilm samples were collected are shown in Fig. 1. The biofilms were developed on eight types of artificial substrates (polystyrene Petri dishes, zinc panels, aluminum, poly(ether-ether-ketone), polytetrafluoroethylene, poly(vinyl chloride), stainless steel, and titanium). The substrates were deployed at a depth of 1–2 m at eight locations around the world (the South Atlantic, the Red Sea, the waters off Hong Kong, Yung Shue O Bay, the East China Sea, and three sites in the South China Sea). Metagenome assembly generated 72,132,494 contigs in total, from which 3974 viral sequences (longer than 5.0 kbp) were predicted. The viral sequences had a maximum length of 351.55 kbp, an average length of 14.57 kbp, and an average guanine-cytosine (GC) content of 47.75%. The novelty of the viruses was evaluated by comparing the viral sequences with the Integrated Microbial Genome/Virus (IMG/VR) database, which contains sequences from almost 8500 isolated viruses and over 700,000 viral contigs from metagenomes. Consistent with the rules used in a previous study (Paez-Espino et al. 2017), biofilm-derived viral sequences of over 1 kbp with 90% or higher similarity to sequences in the IMG/VR database were considered to be known viruses; only 358 (9.01%) biofilm-derived viral sequences were found in the IMG/VR database (Fig. 2).
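The novelty criterion described above can be illustrated with a minimal sketch. It assumes that alignment hits against the reference database are already available as (query ID, aligned length in bp, percent identity) tuples, for example parsed from a tabular alignment report. The function name, the hit format, and the toy values are illustrative assumptions and do not reproduce the pipeline used in the study.

```python
def known_viruses(hits, min_len_bp=1000, min_identity=90.0):
    """Return IDs of query sequences with at least one qualifying hit.

    A hit qualifies when the aligned region is longer than min_len_bp
    and the identity is min_identity percent or higher (assumed reading
    of the "over 1 kbp with 90% or higher similarity" rule).
    """
    return {
        query
        for query, aligned_len, identity in hits
        if aligned_len > min_len_bp and identity >= min_identity
    }


# Hypothetical toy hits: (query ID, aligned length in bp, percent identity)
toy_hits = [
    ("contig_1", 1500, 95.2),   # qualifies
    ("contig_2", 800, 99.0),    # too short
    ("contig_3", 2400, 88.7),   # identity too low
    ("contig_1", 300, 100.0),   # second, non-qualifying hit for contig_1
]
toy_known = known_viruses(toy_hits)
toy_novel = {"contig_1", "contig_2", "contig_3"} - toy_known
print(sorted(toy_known), sorted(toy_novel))
```

Sequences without any qualifying hit are treated as novel, which mirrors how the proportion of known sequences is reported here.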
In total, 74,895 open reading frames (ORFs) were predicted from the biofilm-derived viral sequences. To confirm the VirSorter prediction, HMMER was used to search the ORFs against the virus orthologous groups (VOGs) database. As a result, all the viral sequences had ORFs that achieved hits in the VOG database. In total, 31,038 VOG hits (41.44% of the total ORFs) were obtained, of which 2764 were non-redundant. The 30 most abundant VOGs consisted of genes encoding viral structural proteins, such as terminase large subunit (VOG09355), base plate protein J (VOG00195), terminase large subunit gp2 (VOG00080), and probable capsid protein gp17 (VOG02249) (Supplementary Fig. S1). The other abundant VOGs included genes responsible for DNA replication and transcription, such as DNA polymerase (VOG00073) (Supplementary Fig. S1).
Taxonomic classification indicated that 81.60% of the VOGs were associated with Caudovirales and 0.32% with Maveriviricetes, with the remaining VOGs assigned to unclassified viruses (Supplementary Fig. S2). Phylogenetic analysis using the terminase large subunit (VOG09355), identified from the biofilm viral sequences and sequences from the VOG database, revealed three relatively independent branches formed by the biofilm viruses, most likely representing novel viral lineages (Fig. 3).
Endemism of the viruses to marine biofilms
To explore the niche specificity of the viruses detected in the biofilms, the abundance of the biofilm-derived viruses was investigated by mapping the metagenomic reads of the 101 biofilm and 91 seawater samples (10 million reads per sample) to the viral sequences. In total, 250 viral sequences with coverage > 1 in at least one biofilm and coverage = 0 in all seawater samples were identified (Fig. 4), suggesting the existence of viruses that are endemic to the biofilm niche. To confirm this result, five phages that were abundant in the Red Sea biofilms were selected, and their distribution in nine Red Sea biofilm samples and nine adjacent seawater samples was investigated. The number of reads mapped to these phages exceeded 100 in almost all of the biofilm samples but was close to zero in the seawater metagenomes (Supplementary Fig. S3).
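The endemism rule stated above (coverage > 1 in at least one biofilm and coverage = 0 in every seawater sample) can be expressed as a short filter. The sketch below assumes that per-sample coverage values have already been computed and stored in dictionaries keyed by viral sequence ID; the data layout, names, and toy values are hypothetical.

```python
def biofilm_endemic(biofilm_cov, seawater_cov):
    """Return IDs of viral sequences that meet the endemism rule.

    biofilm_cov / seawater_cov: dict mapping a viral sequence ID to a
    list of coverage values, one per sample. A sequence with no entry
    in seawater_cov is treated as having zero seawater coverage
    (an assumption of this sketch).
    """
    endemic = []
    for virus_id, coverages in biofilm_cov.items():
        present_in_biofilm = any(c > 1 for c in coverages)
        absent_from_seawater = all(c == 0 for c in seawater_cov.get(virus_id, []))
        if present_in_biofilm and absent_from_seawater:
            endemic.append(virus_id)
    return endemic


# Hypothetical toy coverage tables (two biofilm and two seawater samples)
toy_biofilm = {"v1": [0.0, 3.2], "v2": [5.0, 1.5], "v3": [0.5, 0.9]}
toy_seawater = {"v1": [0.0, 0.0], "v2": [0.0, 0.2], "v3": [0.0, 0.0]}
print(biofilm_endemic(toy_biofilm, toy_seawater))
```

In the toy data, v2 is excluded because it has non-zero coverage in one seawater sample, and v3 is excluded because its biofilm coverage never exceeds 1.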
Viruses in single genomes and their functions
To investigate the hosts of the viruses and the potential virus-host interactions, 479 microbial genome bins extracted from the biofilm metagenomes were analyzed. These genome bins belonged to 20 different microbial phyla, including Proteobacteria (272 genomes), six ‘Candidatus’ phyla (7 genomes), Acidobacteria (6 genomes), Actinobacteria (10 genomes), Bacteroidetes (100 genomes), Cyanobacteria (34 genomes), Deinococcus-Thermus (1 genome), Firmicutes (2 genomes), Lentisphaerae (2 genomes), Parcubacteria (1 genome), Planctomycetes (22 genomes), Rhodothermaeota (1 genome), Verrucomicrobia (19 genomes), and Euryarchaeota (2 genomes) (Supplementary Fig. S4). Viral sequences from the genome bins were identified using the software PHASEER (McCoy et al. 2007), which was designed for mining phage sequences from draft genomes. In total, 149 phage sequences were distributed in 101 bacterial genome bins of Alphaproteobacteria, Gammaproteobacteria, Acidobacteria, Actinobacteria, Bacteroidetes, Candidatus Gracilibacteria, Cyanobacteria, Firmicutes, Lentisphaerae, Oligoflexia, Planctomycetes, and Rhodothermaeota (Fig. 5a). Within these taxa, Gammaproteobacteria (n = 43), followed by Alphaproteobacteria (n = 30), possessed the largest number of phage-containing genome bins (Fig. 5a). The GC content of the phage contigs was compared with that of the bacterial genomes and found to be very similar (Supplementary Fig. S5).
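The GC comparison between phage contigs and their host genomes rests on a simple calculation, sketched below. The function is a generic illustration: it counts G and C among unambiguous bases only, which is an assumption of this sketch, and the sequences shown are invented toy strings rather than data from the study.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        return 0.0  # empty sequence or only ambiguous bases
    return 100.0 * gc / total


# Hypothetical toy sequences standing in for a phage contig and its host bin
toy_phage = "ATGCGCGTTA"
toy_host = "ATGCGCATTAGC"
difference = abs(gc_content(toy_phage) - gc_content(toy_host))
print(f"phage {gc_content(toy_phage):.1f}%, host {gc_content(toy_host):.1f}%, "
      f"difference {difference:.1f} points")
```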
To detect potential functions encoded by these phages, all genes derived from the phage sequences (4121 predicted ORFs) were analyzed by classifying the gene functions using the COG database (Galperin et al. 2015; Tatusov et al. 2000), which resulted in 22 COG categories (Fig. 5b). In total, 1023 ORFs (24.82%) resulted in hits in the COG database; however, 521 ORFs were classified as “general function prediction only” [R] or as “function unknown” [S]. Of the remaining 502 COGs, 40 were classified as being involved in amino acid transport and metabolism [E], nucleotide transport and metabolism [F], or carbohydrate transport and metabolism [G], such as the genes encoding Na+/glutamate symporter [COG0786], deoxynucleotide kinases [COG1428], and chitinase [COG3325] (Fig. 5b).
The functions of all genes derived from these phage contigs (4121 predicted ORFs) were further analyzed by searching them against the KEGG (Kanehisa et al. 2017) and CAZy databases (Lombard et al. 2014). In total, 1062 ORFs matched annotated sequences in the KEGG database, and the 18 most abundant KEGG matches are shown in Fig. 6. Interestingly, the most abundant KEGG hit was for trimeric autotransporter adhesin (K21449); genes for viral structure (e.g., K06909), transcriptional regulation (e.g., ParB family transcriptional regulator, chromosome partitioning protein K03497), and DNA replication (e.g., putative DNA primase/helicase K06919) were also annotated. KEGG annotation also revealed uncharacterized but relatively conserved genes (n = 96; 9.0% of all KEGG hits), such as K06903, K06907, and K06904. In parallel, 351 ORFs were annotated by CAZy: these mostly included genes for lysozymes, chitinases, lyases, and peptidoglycan lytic transglycosylases (Supplementary Fig. S6). In total, 1133 ORFs were annotated by the CAZy or KEGG database, while the remaining 72.51% achieved no hits.
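The annotation proportions reported in this section follow directly from the ORF counts. The short calculation below uses only counts given in the text (4121 phage ORFs, 1023 COG hits, 1133 ORFs annotated by KEGG or CAZy); the helper name is an assumption made for illustration.

```python
def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


total_phage_orfs = 4121      # predicted ORFs in the phage sequences
cog_hits = 1023              # ORFs with hits in the COG database
kegg_or_cazy_hits = 1133     # ORFs annotated by KEGG or CAZy

cog_pct = percent(cog_hits, total_phage_orfs)
unannotated_pct = percent(total_phage_orfs - kegg_or_cazy_hits, total_phage_orfs)

print(f"COG-annotated: {cog_pct:.2f}%")
print(f"No KEGG or CAZy hit: {unannotated_pct:.2f}%")
```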
The finding here that biofilms harbor a number of previously unknown viruses is consistent with the notion that biofilm formation promotes virus accumulation and that biofilms may be a potential library of infectious pathogens (Bettarel et al. 2006). When the biofilm-derived viral sequences were aligned with the VOG database, the most abundant genes were found to be related to structure and replication. More specifically, the base plate is a part of tailed prokaryotic viruses, such as Caudovirales, and its abundance suggests the prevalence of tailed viruses in marine biofilms. The terminase large subunit is a viral DNA-packaging motor, which cleaves viral DNA into smaller pieces and inserts them into a procapsid, powered by ATP hydrolysis (Rao and Feiss 2008). Capsid proteins, encoded by relatively short genes, function to protect nucleic acids, and the tertiary structure of capsid proteins contains all the information required for virus assembly (Hagan and Zandi 2016). The annotation of these VOGs validates the conserved structure and function of biofilm-derived viruses; however, phylogenetic analysis of these proteins also indicates the existence of novel viral lineages in marine biofilms.
There have been few studies reporting on virus endemism in environmental niches. In this study, it is shown that biofilm virus endemism is much greater than in seawater: 250 viral sequences were present in the biofilms collected from the different oceans, but they were absent from all the seawater samples. While surface-associated microbes and viruses must be seeded from seawater, many viruses are very scarce in seawater and so are unlikely to be sampled. Extracellular DNA released through cell lysis mediated by phages has been shown to enhance biofilm formation (Gödeke et al. 2011). Certain viruses are capable of forming biofilm-like assemblies for propagation (Thoulouze and Alcover 2011). In addition, phages can select for a mucoid bacterial phenotype to co-evolve and induce biofilm formation (Scanlan and Buckling 2012). One of the underlying mechanisms coordinating this relationship between viruses and biofilms involves quorum-sensing signals, which upregulate the expression of CRISPR-related genes (Høyland-Kroghsbo et al. 2017; Patterson et al. 2016) and decrease the level of phage receptors (Høyland-Kroghsbo et al. 2013; Tan et al. 2015). Another possible reason why so many novel viruses were discovered in biofilms is the seawater filtering process, which can concentrate low-abundance viruses that are missed during seawater sampling.
According to previous metagenomic analyses of marine viruses (Coutinho et al. 2017; Mizuno et al. 2013), Cyanobacteria, Actinobacteria, Alphaproteobacteria, Gammaproteobacteria, and Verrucomicrobia are the most prevalent phage hosts. Results presented here are consistent with previous reports, with Alpha- and Gammaproteobacteria being the major hosts of phages in the biofilms. The proportion of guanine and cytosine (GC content) in DNA provides survival advantages in the adaptation to environmental conditions (Almpanis et al. 2018; Mann and Chen 2010). Results presented here show a similar GC content between the phages and their hosts, suggesting that the viruses have adapted to their hosts and that certain environmental factors have had roles in shaping the intimate relationships between the phages and the bacteria in the biofilms (Motlagh et al. 2017). Viral sequences identified from microbial genomes are probably phages; however, due to technical limitations, it is difficult to extract all the genomes from metagenomes and distinguish all the phages from free viruses.
With regard to phage function, more than 70% of the ORFs could not be annotated by the COG, KEGG, or CAZy databases, indicating the limited understanding of the function of biofilm-derived viruses and the need for additional experimental research. COG annotation suggested that the phages inhabiting biofilms may encode enzymes involved in central carbon metabolism. No phage genes for photosynthesis were detected, suggesting that the phages contribute little to carbon fixation in the biofilm communities, which is in contrast to previous findings that showed photosynthetic genes are prevalent in phages infecting subtidal microbial communities (McMinn et al. 2020; Sullivan et al. 2005; Thompson et al. 2011). Notably, 89 genes were found to code for trimeric autotransporter adhesin (K21449), which promotes biofilm formation in bacteria (Fey et al. 2002; Luqman et al. 2018; Raghunathan et al. 2011). Mutation of this gene abolished the ability of biofilms to attach to plastic surfaces (Lazar Adler et al. 2013), and over-expression of this gene in Salmonella enterica increased cell aggregation and adhesion to human intestinal Caco-2 epithelial cells (Raghunathan et al. 2011). Similarly, a recent study showed that SadA-expressing staphylococci from the human gut showed increased cell adherence and internalization (Luqman et al. 2018). The high abundance of K21449 points to a role of phages in facilitating biofilm formation by the bacterial hosts and thus provides clues to the specificity of the virosphere in marine biofilms. Transcriptional regulators may also have significant mediating effects on the interactions between humans and Epstein-Barr virus (Arvey et al. 2012); however, the function of transcriptional regulators in marine viruses is unclear.
Furthermore, the polysaccharide metabolism genes (e.g., chitinases) annotated by CAZy are probably used by phages to lyse hosts and are involved in carbon recycling within the biofilm communities.
Here we found that over 90% of the biofilm-derived viruses had no overlap with the IMG/VR database and provided evidence for the existence of viruses endemic to biofilms, suggesting that biofilm formation enables the discovery and reconstruction of viral genomes from marine environments. We identified potential auxiliary metabolic genes for trimeric autotransporter adhesin and polysaccharide metabolism in viral sequences integrated into the biofilm-derived microbial genomes, suggesting that phages may contribute to biofilm formation by the bacterial hosts, yet the functions of more than 70% of the phage genes remain unknown. Taken together, the present study has unveiled a hidden marine virosphere with novel viral diversity and unexplored functions.
Materials and methods
The biofilms were developed on eight types of artificial substrates: polystyrene Petri dishes (9 × 1.2 cm), zinc panels (11 × 11 cm), aluminum, poly(ether-ether-ketone), polytetrafluoroethylene, poly(vinyl chloride), stainless steel, and titanium (5 × 5 cm). The artificial substrates were deployed at a depth of 1–2 m at eight locations around the world: the South Atlantic, the Red Sea, the waters off Hong Kong, Yung Shue O Bay, the East China Sea, and three sites in the South China Sea. The Petri dishes were immersed in seawater for 12 days to allow for biofilm formation; the other artificial substrates were immersed for 30 days to allow for visible bacterial attachment. Biofilms that had formed on natural rocks were also collected. After collection, the biofilms were immediately transferred to the laboratory, and the surface bacterial cells were removed using sterile cotton tips and stored in 5 ml of DNA storage buffer (500 mmol/L NaCl, 50 mmol/L Tris–HCl, 40 mmol/L EDTA, and 50 mmol/L glucose) at −80 °C. During biofilm development, adjacent seawater samples were collected and successively filtered through 0.1-μm polycarbonate membrane filters (Millipore, Massachusetts, USA). The filters were stored in 5 ml of DNA storage buffer at −80 °C. In total, 101 biofilms and 24 seawater samples were collected. Additionally, 67 Tara seawater samples collected from the ocean surface (Sunagawa et al. 2015) were used for comparisons between the biofilms and seawater (Supplementary Table S1).
DNA extraction and sequencing
Biofilms from the cotton tips and seawater samples on the filters were re-suspended in Tris–HCl buffer, pelleted by centrifugation at 4000 g for 10 min and then lysed with lysozyme (37 °C for 30 min) and the lysis buffer provided by the TIANamp Genomic DNA Kit (Tiangen Biotech, Beijing, China). Then, DNA extraction was performed using the TIANamp Genomic DNA Kit, following the manufacturer’s protocol. DNA sequencing for the Red Sea samples was performed at the Beijing Genomics Institute (BGI, Beijing, China), and the other samples were sequenced at the Novogene Bioinformatics Institute (Novogene, Beijing, China). After the construction of 350-bp insert libraries, the DNA was sequenced on the HiSeq X Ten System at Novogene and the HiSeq 2500 System at BGI. Quality control was performed on a local server using the software NGS QC Toolkit (version 2.0) (Patel and Jain 2012) to remove low-quality reads (assigned by a quality score < 20 for > 30% of the read length) or unpaired high-quality reads. Information on metagenomic reads is given in Supplementary Table S1.
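The read-filtering rule stated above (a read is low quality when bases with a quality score < 20 make up > 30% of its length, and reads left without a high-quality mate are discarded) can be sketched as follows. This is an illustration of the rule only, not the NGS QC Toolkit implementation; it assumes Phred+33 encoded quality strings, and the function names are hypothetical.

```python
def is_low_quality(quality, min_q=20, max_low_fraction=0.30, offset=33):
    """Return True if more than max_low_fraction of bases have Q < min_q.

    quality: FASTQ quality string, assumed to be Phred+33 encoded.
    Empty quality strings are treated as low quality (sketch assumption).
    """
    if not quality:
        return True
    low_bases = sum(1 for ch in quality if ord(ch) - offset < min_q)
    return low_bases / len(quality) > max_low_fraction


def keep_pair(quality_1, quality_2):
    """Keep a read pair only when both mates pass the quality rule."""
    return not is_low_quality(quality_1) and not is_low_quality(quality_2)


# Toy quality strings: 'I' encodes Q40 and '#' encodes Q2 under Phred+33
good = "I" * 10
borderline = "#" * 3 + "I" * 7   # 30% low-quality bases, not above the cutoff
bad = "#" * 4 + "I" * 6          # 40% low-quality bases
print(keep_pair(good, borderline), keep_pair(good, bad))
```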
Metagenomic assembly and microbial genome binning
Following quality control, reads from the biofilm metagenomes were assembled into contigs using the software MEGAHIT (version 1.0.2) (Li et al. 2015) with kmer values of 21–121, increasing in steps of 10. Coverage information was generated by mapping metagenomic reads to the contigs using Bowtie2 (fastq as input format under the sensitive-local mode). The contigs as well as the coverage information were used as input for MaxBin (version 2.0) (Wu et al. 2016) to assign the contigs to single genomes. The single genomes were further analyzed using MetaBAT for purification. The completeness and contamination of the genome bins were analyzed using CheckM (Parks et al. 2015). Duplicated genomes were removed based on the average nucleotide identity (ANI) information provided by the ANI calculator (Yoon et al. 2017), where genome pairs with ANI values exceeding 0.99 were taken as redundant genomes. Information on the assembled metagenomic contigs is given in Supplementary Table S2. Information on the genome bins is provided in Supplementary Table S3.
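The removal of redundant genomes based on pairwise ANI can be illustrated with a greedy sketch. The text specifies only the threshold (ANI exceeding 0.99); which member of a redundant pair is retained is not stated, so keeping the first genome encountered is an assumption of this example, as are the function name and the toy ANI values.

```python
def dereplicate(genome_ids, pairwise_ani, threshold=0.99):
    """Greedily keep genomes that are not redundant with any kept genome.

    pairwise_ani: dict mapping frozenset({id_a, id_b}) to an ANI value
    in the range 0-1. Missing pairs are treated as non-redundant.
    """
    kept = []
    for genome in genome_ids:
        redundant = any(
            pairwise_ani.get(frozenset((genome, other)), 0.0) > threshold
            for other in kept
        )
        if not redundant:
            kept.append(genome)
    return kept


# Hypothetical toy ANI table for three genome bins
toy_ani = {
    frozenset(("bin1", "bin2")): 0.995,  # redundant pair
    frozenset(("bin1", "bin3")): 0.80,
    frozenset(("bin2", "bin3")): 0.82,
}
print(dereplicate(["bin1", "bin2", "bin3"], toy_ani))
```

In practice the input order could be sorted by genome quality (for example, completeness) so that the better bin of each redundant pair is retained; that ordering is a design choice left open here.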
Viral sequences prediction and annotation
The software VirSorter (version 1.0.5) (Roux et al. 2015), installed on a local server, was used to identify viral sequences from the metagenomic contigs and genome bins. The database ‘Refseqdb’ and the mode ‘BLASTp’ were used for mining viral sequences, and only viruses in the categories of ‘sure’ or ‘somewhat sure’ were retained for the following analyses. Metagenomic reads of the 101 biofilm and 91 seawater samples were mapped to the viral sequences using BBMap (version 2) (Bushnell 2014) to indicate viral coverage in biofilms and seawater (minimum alignment identity = 0.76). All the metagenomes for mapping were normalized to 10 million reads per metagenome, and all reads were trimmed to 101 bp in length by NGS QC Toolkit (version 2.0). The viral ORFs were predicted using Prodigal (version 2.0) (Hyatt et al. 2010) in the meta mode (only closed ends were allowed). An hmmscan search in HMMER (Johnson et al. 2010) against the VOG database (https://vogdb.org) was performed to classify the ORFs using an e-value cutoff of 1e-7, and then the taxonomic affiliation was examined by MEGAN (Huson et al. 2016). The reference genes were selected from the VOG database with hmmscan, and a phylogenetic tree was established with ClustalW and 1000 bootstraps by MEGA 6 (Tamura et al. 2013). For potential function mining, annotation of the phage genes was performed by BLASTp (e-value 1e-7) searching against the COG (Galperin et al. 2015; Tatusov et al. 2000), KEGG (Kanehisa et al. 2017), and CAZy (Lombard et al. 2014) databases. The workflow of the present study is summarized in Supplementary Fig. S7.
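Normalizing every metagenome to the same number of reads and the same read length before mapping makes coverage values comparable across samples. The sketch below shows one way this could be done. The text does not state how reads were selected, so random subsampling with a fixed seed is an assumption here, and the function name and toy reads are hypothetical.

```python
import random


def normalize_reads(reads, n_reads=10_000_000, read_len=101, seed=0):
    """Subsample to at most n_reads reads and trim each to read_len bases.

    Reads shorter than read_len are returned unchanged in length, and
    samples with fewer than n_reads reads are kept whole (both are
    assumptions of this sketch).
    """
    rng = random.Random(seed)  # fixed seed for a reproducible subsample
    if len(reads) > n_reads:
        reads = rng.sample(reads, n_reads)
    return [read[:read_len] for read in reads]


# Toy example with small numbers standing in for 10 million reads
toy_reads = ["A" * 150] * 5 + ["C" * 120] * 5
toy_normalized = normalize_reads(toy_reads, n_reads=4, read_len=101)
print(len(toy_normalized), {len(r) for r in toy_normalized})
```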
All the metagenomic datasets (101 biofilm and 24 adjacent seawater metagenomes) have been deposited in the NCBI database under BioProject accession no. PRJNA438384. The 479 microbial genome bins are uploaded to figshare (https://figshare.com/s/2994fdafe79112b99907, https://doi.org/10.6084/m9.figshare.7082684).
Almpanis A, Swain M, Gatherer D, McEwan N (2018) Correlation between bacterial G+C content, genome size and the G+C content of associated plasmids and bacteriophages. Microb Genom 4:000168
Arvey A, Tempera I, Tsai K, Chen HS, Tikhmyanova N, Klichinsky M, Leslie C, Lieberman PM (2012) An atlas of the Epstein-Barr virus transcriptome and epigenome reveals host-virus regulatory interactions. Cell Host Microbe 12:233–245
Bettarel Y, Bouvy M, Dumont C, Sime-Ngando T (2006) Virus-bacterium interactions in water and sediment of West African inland aquatic systems. Appl Environ Microbiol 72:5274–5282
Breitbart M (2012) Marine viruses: truth or dare. Ann Rev Mar Sci 4:425–448
Brum JR, Ignacio-Espinoza JC, Roux S, Doulcier G, Acinas SG, Alberti A, Chaffron S, Cruaud C, De Vargas C, Gasol JM, Gorsky G, Gregory AC, Guidi L, Hingamp P, Iudicone D, Not F, Ogata H, Pesant S, Poulos BT, Schwenck SM et al (2015) Patterns and ecological drivers of ocean viral communities. Science 348:1261498
Bushnell B (2014) BBMap: a fast, accurate, splice-aware aligner. Lawrence Berkeley National Lab (LBNL), Berkeley
Chung HC, Lee OO, Huang YL, Mok SY, Kolter R, Qian PY (2010) Bacterial community succession and chemical profiles of subtidal biofilms in relation to larval settlement of the polychaete Hydroides elegans. ISME J 4:817–828
Coutinho FH, Silveira CB, Gregoracci GB, Thompson CC, Edwards RA, Brussaard CP, Dutilh BE, Thompson FL (2017) Marine viruses discovered via metagenomics shed light on viral strategies throughout the oceans. Nat Commun 8:1–2
Coutinho FH, Gregoracci GB, Walter JM, Thompson CC, Thompson FL (2018) Metagenomics sheds light on the ecology of marine microbes and their viruses. Trends Microbiol 26:955–965
Dang H, Lovell CR (2016) Microbial surface colonization and biofilm development in marine environments. Microbiol Mol Biol Rev 80:91–138
Danovaro R, Dell’Anno A, Corinaldesi C, Magagnini M, Noble R, Tamburini C, Weinbauer M (2008) Major viral impact on the functioning of benthic deep-sea ecosystems. Nature 454:1084–1087
Engelhardt T, Kallmeyer J, Cypionka H, Engelen B (2014) High virus-to-cell ratios indicate ongoing production of viruses in deep subsurface sediments. ISME J 8:1503–1509
Fey P, Stephens S, Titus MA, Chisholm RL (2002) SadA, a novel adhesion receptor in Dictyostelium. J Cell Biol 159:1109–1119
Galperin MY, Makarova KS, Wolf YI, Koonin EV (2015) Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res 43:261–269
Gödeke J, Paul K, Lassak J, Thormann KM (2011) Phage-induced lysis enhances biofilm formation in Shewanella oneidensis MR-1. ISME J 5:613–626
Hagan MF, Zandi R (2016) Recent advances in coarse-grained modeling of virus assembly. Curr Opin Virol 18:36–43
Høyland-Kroghsbo NM, Mærkedahl RB, Svenningsen SL (2013) A quorum-sensing-induced bacteriophage defense mechanism. mBio 4:00362
Høyland-Kroghsbo NM, Paczkowski J, Mukherjee S, Broniewski J, Westra E, Bondy-Denomy J, Bassler BL (2017) Quorum sensing controls the Pseudomonas aeruginosa CRISPR-Cas adaptive immune system. Proc Natl Acad Sci USA 114:131–135
Huson DH, Beier S, Flade I, Górska A, El-Hadidi M, Mitra S, Ruscheweyh HJ, Tappu R (2016) MEGAN community edition-interactive exploration and analysis of large-scale microbiome sequencing data. PLoS Comput Biol 12:e1004957
Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinf 11:119
Johnson LS, Eddy SR, Portugaly E (2010) Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinf 11:431
Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K (2017) KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res 45:353–361
Kristensen DM, Mushegian AR, Koonin EV (2011) Systems biology of bacteriophage proteins and new dimensions of the virus world discovered through metagenomics. Genome Biol 12:9
Lazar Adler NR, Dean RE, Saint RJ, Stevens MP, Prior JL, Atkins TP, Galyov EE (2013) Identification of a predicted trimeric autotransporter adhesin required for biofilm formation of Burkholderia pseudomallei. PLoS ONE 8:79461
Li D, Liu CM, Luo R, Sadakane K, Lam TW (2015) MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31:1674–1676
Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B (2014) The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res 42:490–495
Luqman A, Nega M, Nguyen MT, Ebner P, Götz F (2018) SadA-expressing staphylococci in the human gut show increased cell adherence and internalization. Cell Rep 22:535–545
Mann S, Chen YP (2010) Bacterial genomic G+C composition-eliciting environmental adaptation. Genomics 95:7–15
Marshall D, Sample C (1995) Epstein-Barr virus nuclear antigen 3C is a transcriptional regulator. J Virol 69:3624–3630
McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ (2007) Phaser crystallographic software. J Appl Crystallogr 40:658–674
McMinn A, Liang Y, Wang M (2020) Minireview: the role of viruses in marine photosynthetic biofilms. Mar Life Sci Technol 2:203–208
Mizuno CM, Rodriguez-Valera F, Kimes NE, Ghai R (2013) Expanding the marine virosphere using metagenomics. PLoS Genet 9:1003987
Motlagh AM, Bhattacharjee AS, Coutinho FH, Dutilh BE, Casjens SR, Goel RK (2017) Insights of phage-host interaction in hypersaline ecosystem through metagenomics analyses. Front Microbiol 1:1–15
Paez-Espino D, Chen IM, Palaniappan K, Ratner A, Chu K, Szeto E, Pillay M, Huang J, Markowitz VM, Nielsen T, Huntemann M, Reddy TBK, Pavlopoulos GA, Sullivan MB, Campbell BJ, Chen F, Mcmahon KD, Hallam SJ, Denef VJ, Cavicchioli R et al (2016) IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Res 30:1030
Paez-Espino D, Pavlopoulos GA, Ivanova NN, Kyrpides NC (2017) Nontargeted virus sequence discovery pipeline and virus clustering for metagenomic data. Nat Protoc 12:1673–1682
Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW (2015) CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055
Patel RK, Jain M (2012) NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS ONE 7:30619
Patterson AG, Jackson SA, Taylor C, Evans GB, Salmond GP, Przybilski R, Staals RH, Fineran PC (2016) Quorum sensing controls adaptive immunity through the regulation of multiple CRISPR-Cas systems. Mol Cell 64:1102–1108
Raghunathan D, Wells TJ, Morris FC, Shaw RK, Bobat S, Peters SE, Paterson GK, Jensen KT, Leyton DL, Blair JM, Browning DF, Pravin J, Floreslangarica A, Hitchcock J, Moraes CTP, Piazza RMF, Maskell DJ, Webber M, May RC, Maclennan CA et al (2011) SadA, a trimeric autotransporter from Salmonella enterica serovar Typhimurium, can promote biofilm formation and provides limited protection against infection. Infect Immun 79:4342–4352
Rao VB, Feiss M (2008) The bacteriophage DNA packaging motor. Annu Rev Genet 42:647–681
Roux S, Enault F, Hurwitz BL, Sullivan MB (2015) VirSorter: mining viral signal from microbial genomic data. PeerJ 3:e985
Salta M, Wharton JA, Blache Y, Stokes KR, Briand JF (2013) Marine biofilms on artificial surfaces: structure and dynamics. Environ Microbiol 15:2879–2893
Scanlan PD, Buckling A (2012) Co-evolution with lytic phage selects for the mucoid phenotype of Pseudomonas fluorescens SBW25. ISME J 6:1148–1158
Sullivan MB, Waterbury JB, Chisholm SW (2003) Cyanophages infecting the oceanic cyanobacterium Prochlorococcus. Nature 424:1047–1051
Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW (2005) Three Prochlorococcus cyanophage genomes: signature features and ecological interpretations. PLoS Biol 3:790–806
Sunagawa S, Coelho LP, Chaffron S, Kultima JR, Labadie K, Salazar G, Djahanschiri B, Zeller G, Mende DR, Alberti A, Cornejocastillo FM, Costea PI, Cruaud C, Dovidio F, Engelen S, Ferrera I, Gasol JM, Guidi L, Hildebrand F, Kokoszka F et al (2015) Structure and function of the global ocean microbiome. Science 348:1261359
Suttle CA (2005) Viruses in the sea. Nature 437:356–361
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30:2725–2729
Tan D, Svenningsen SL, Middelboe M (2015) Quorum sensing determines the choice of antiphage defense strategy in Vibrio anguillarum. MBio 6:00627
Tatusov RL, Galperin MY, Natale DA, Koonin EV (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res 28:33–36
Thompson LR, Zeng Q, Kelly L, Huang KH, Singer AU, Stubbe J, Chisholm SW (2011) Phage auxiliary metabolic genes and the redirection of cyanobacterial host carbon metabolism. Proc Natl Acad Sci USA 108:757–764
Thoulouze MI, Alcover A (2011) Can viruses form biofilms? Trends Microbiol 19:257–262
Thurber RV, Payet JP, Thurber AR, Correa AM (2017) Virus–host interactions and their roles in coral reef health and disease. Nat Rev Microbiol 15:205–216
Vidakovic L, Singh PK, Hartmann R, Nadell CD, Drescher K (2018) Dynamic biofilm architecture confers individual and collective mechanisms of viral protection. Nat Microbiol 3:26–31
Wu YW, Simmons BA, Singer SW (2016) MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32:605–607
Xu Y, Zhang R, Wang N, Cai L, Tong Y, Sun Q, Chen F, Jiao N (2018) Novel phage-host interactions and evolution as revealed by a cyanomyovirus isolated from an estuarine environment. Environ Microbiol 20:2974–2989
Yoon SH, Ha SM, Lim J, Kwon S, Chun J (2017) A large-scale evaluation of algorithms to calculate average nucleotide identity. Antonie Van Leeuwenhoek 110:1281–1286
Zhang R, Wei W, Cai L (2014) The fate and biogeochemical cycling of viral elements. Nat Rev Microbiol 12:850–851
Zhang W, Wang Y, Bougouffa S, Tian R, Cao H, Li Y, Cai L, Wong YH, Zhang G, Zhou G, Zhang X, Bajic VB, Al-Suwailem A, Qian PY (2015) Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool. Environ Microbiol 17:4089–4104
Zhang W, Ding W, Li YX, Tam C, Bougouffa S, Wang R, Pei B, Chiang H, Leung P, Lu Y, Sun J, Fu H, Bajic VB, Liu H, Webster NS, Qian PY (2019b) Marine biofilms constitute a bank of hidden microbial diversity and functional potential. Nat Commun 10:517
Zhang Z, Chen F, Chu X, Zhang H, Luo H, Qin F, Zhai Z, Yang M, Sun J, Zhao Y (2019a) Diverse, abundant, and novel viruses infecting the marine Roseobacter RCA lineage. mSystems 4:e00494-e519
Zhao Y, Temperton B, Thrash JC, Schwalbach MS, Vergin KL, Landry ZC, Ellisman M, Deerinck T, Sullivan MB, Giovannoni SJ (2013) Abundant SAR11 viruses in the ocean. Nature 494:357–360
The authors are grateful for a grant from the National Key Research and Development Program of China (2018YFC0310600) and two grants from Ocean University of China (841912035 and 842041010) to W.Z. The authors are also grateful for a grant from China Ocean Mineral Resources Research and Development Association (DY135-B2-03) and a grant from the Hong Kong Branch of South Marine Science and Engineering Guangdong Laboratory (SMSEGL20SC01) to P.Y.Q.
Conflict of interest
The authors declare that they have no conflict of interest.
Animal and human rights statement
This article does not contain any studies with human participants or animals performed by any of the authors.
Edited by Chengchao Chen.
Cite this article
Ding, W., Wang, R., Liang, Z. et al. Expanding our understanding of marine viral diversity through metagenomic analyses of biofilms. Mar Life Sci Technol (2021). https://doi.org/10.1007/s42995-020-00078-4