Organ transcriptomes of the lucinid clam Loripes orbiculatus (Poli, 1791) provide insights into their specialised roles in the biology of a chemosymbiotic bivalve
The lucinid clam Loripes orbiculatus lives in a nutritional symbiosis with sulphur-oxidizing bacteria housed in its gills. Although our understanding of the lucinid endosymbiont physiology and metabolism has made significant progress, relatively little is known about how the host regulates the symbiosis at the genetic and molecular levels. We generated transcriptomes from four L. orbiculatus organs (gills, foot, visceral mass, and mantle) for differential expression analyses, to better understand this clam’s physiological adaptations to a chemosymbiotic lifestyle, and how it regulates nutritional and immune interactions with its symbionts.
The transcriptome profile of the symbiont-housing gill suggests the regulation of apoptosis and innate immunity are important processes in this organ. We also identified many transcripts encoding ion transporters from the solute carrier family that possibly allow metabolite exchange between host and symbiont. Despite the clam holobiont’s clear reliance on chemosynthesis, the clam’s visceral mass, which contains the digestive tract, is characterised by enzymes involved in digestion, carbohydrate recognition and metabolism, suggesting that L. orbiculatus has a mixotrophic diet. The foot transcriptome is dominated by the biosynthesis of glycoproteins for the construction of mucus tubes, and receptors that mediate the detection of chemical cues in the environment.
The transcriptome profiles of gills, mantle, foot and visceral mass provide insights into the molecular basis underlying the functional specialisation of bivalve organs adapted to a chemosymbiotic lifestyle.
KeywordsChemosynthetic symbiosis RNA-Seq Bivalve Lucinid Innate immunity Nutritional symbiosis
Benchmarking Universal Single-Copy Ortholog
False discovery rate
C-terminal fibrinogen-related domain
G protein-coupled receptor
Ionotropic glutamate receptor
Membrane attack complex/Perforin
Microbe Associated Molecular Pattern
Principal component analysis
Pattern Recognition Receptor
Scavenger receptor cysteine-rich
Tumor necrosis factor receptor
Associations between invertebrates and chemoautotrophic bacteria contributed to the evolutionary success of diverse animal lineages, and are also fundamental to the functioning of marine ecosystems [1, 2]. One prominent example is chemosynthetic symbioses that are found in at least seven phyla of metazoan invertebrates . In the deep sea, these symbioses underpin primary productivity that supports an unexpectedly large biomass (hydrothermal vent mussels and tubeworms) in a food scarce environment, thus serving as the foundation of entire ecosystems [1, 2]. Invertebrates that host chemoautotrophic bacteria are also ubiquitous in shallow water habitats and lucinid clams, in particular, are highly abundant members of the macrofaunal community in shallow water sediments. Indeed, lucinids are the most species rich and widely distributed group of chemosynthetic bivalves , and their ecological and evolutionary success can be attributed to their highly specific association with a single phylotype of sulphur-oxidising gammaproteobacteria.
The symbiosis between lucinid clams and sulphur-oxidising bacteria is a nutritional strategy; the symbionts use reduced sulphur compounds – produced by organic matter decomposing in anoxic sediment – as electron donors and autotrophically fix carbon dioxide to synthesise organic compounds that are transferred to the host . Lucinid clams possess multiple atypical morphological traits associated with the peculiar nature of this nutritional strategy. To house significant bacterial biomass, the gills of lucinids are modified into a plump and conspicuous symbiont-housing organ with a large surface area extending beyond the limits of the visceral mass [4, 5]. The gill symbionts are enclosed in vacuoles of specialised cells known as bacteriocytes and the gills can constitute up to a third of the host’s weight in soft tissue . This heavy reliance on symbionts for nutrition is reflected in the morphological simplification of the lucinid feeding structures and digestive system; poorly developed labial palps, a truncated stomach and digestive glands with a reduced number of tubules . The lucinid foot is also modified to become vermiform and highly extensible (up to 5 times body length) . In addition to locomotory and burrowing activities, the foot is responsible for building a tube connecting the clam to the sediment-water interface, which is a functional replacement for the loss of their inhalant siphon . The lucinid foot also constructs tubes burrowing into the sediments below the animal to mine pockets of dissolved sulphide associated with decaying organic matter or the insoluble iron sulphides attached to sediment, in order to support the sulphide-requiring metabolism of their endosymbionts . These unusual morphological adaptations to a chemosymbiotic lifestyle have allowed lucinids to successfully exploit the deep layers of sulphide-rich sediment, an ecological niche inaccessible to most other animals.
A wealth of literature documents the taxonomy and ecology of lucinid symbioses [4, 8, 9, 10], and we are beginning to unravel metabolic and physiological capabilities of the endosymbionts [11, 12, 13]. However, we still have little understanding of how the host regulates symbiotic interactions at the molecular level. Previous studies suggest a high degree of control over symbionts in the gills, as lucinids appear to inhibit symbiont cell division and are able to selectively digest symbionts or reacquire the same strain from the environment [14, 15, 16]. These studies suggest lucinid-symbiont crosstalk is not limited to nutrient exchange and may be regulated by other processes including host immunity. Indeed, immunity has been implicated in regulating interactions between deep sea mussels and the endosymbiotic bacteria harboured in their gills [17, 18, 19]. Additionally, non-symbiotic lucinid organs can also influence lucinid-symbiont interactions by controlling the availability of essential resources required for metabolism. Examples include filter feeding and digestion in the digestive tract [20, 21, 22], and sulphide and oxygen acquisition by the foot . These examples show that the functions and physiology of the non-symbiotic host organs also contribute to the functioning of the chemosynthetic holobiont. Therefore, transcriptional profiles of symbiotic and non-symbiotic organs will provide more complete insights into how the lucinid host regulates this symbiosis.
Results & discussion
Assembly of a comprehensive lucinid transcriptome
We sequenced 12 RNA-Seq libraries from three replicates each of four organ types: the gills, foot, mantle, and visceral mass. After error correction, filtering, and trimming, each library comprised an average of 22,186,325 million read pairs that were de novo assembled into transcriptomes. These were combined with previously sequenced gill RNA-Seq libraries into a reference transcriptome of 104, 568 contigs with an N50 of 1365 bp, and 1996 contigs above 1 kbp in size (assembly statistics in Additional file 1). Analysis with BUSCO revealed 99% (97.1% complete and 1.8% partial) of the metazoan BUSCO gene set to be present; a higher level of completeness than most other published bivalve transcriptomes [23, 24]. This transcriptome is expected to be a valuable addition to the existing genetic resources for studying the molecular underpinnings of lucinid symbioses [11, 12]. Despite the large number of contigs in the final assembly, only 2.5% of the BUSCOs were identified as duplicates while 20,163 transcripts were functionally annotated with gene ontology terms. This could be due to a number of factors that might include a high frequency of uncharacterised lineage-specific genes and non-coding transcripts. Indeed, this large number of predicted gene models is not unprecedented as similar trends have been reported in other members of the Imparidentia; the draft genomes of Ruditapes phillippinarum and Dreissena polymorpha are predicted to encode over 100, 000 and 60, 000 genes, respectively [25, 26].
Each organ is characterised by a unique transcriptome profile
We used this highly complete transcriptome as a reference to test for genes differentially expressed amongst the gill, visceral mass, mantle and foot (Fig. 1). Identifying clusters of organ-specific genes provides informative baseline data on the functions and physiology of each organ, and thus how both symbiotic and non-symbiotic organs may contribute to the functioning of the lucinid holobiont. A principal component analysis (PCA) of the variance stabilised dataset showed clear separation amongst the organs without any overlaps, indicating each organ has a clearly distinct transcriptomic profile (Fig. 1 b). This is supported by differential gene expression analyses, which showed that 1917, 1064, 823, and 999 corset unigenes were most highly expressed in the gill, visceral mass, mantle, and foot, respectively (Fig. 1 c). We then used gene set enrichment analyses of gene ontology (GO) terms and PFAM domains, as well as manual curation to gain deeper insights into the functional implications of these organ-specific gene clusters.
Immunity and cell death are important processes in the symbiont-housing organ
To identify candidate immune genes involved in mediating lucinid clam-symbiont crosstalk, we used KEGG term annotation and conserved Interpro domains to analyse the reference transcriptome for metazoan immune pathways and receptors. The key components of the main immune signalling pathways conserved across most metazoan lineages were present in the transcriptome of L. orbiculatus (Additional file 2). We also investigated the repertoire of different pattern recognition receptor (PRR) gene families in the L. orbiculatus transcriptomes. PRRs of the evolutionarily ancient innate immune system detect Microbe-Associated Molecular Patterns, the non-self molecules that illicit the animal immune response [27, 28]. All major intracellular and extracellular PRR families typically observed in bivalve genomes are expressed by L. orbiculatus and we found no evidence massive immune gene family shrinkage nor loss of immune signalling pathways (Additional file 2). Some of these PRR families have undergone expansions similar to those in non-symbiotic bivalve genomes: the Toll-like receptors (TLRs), Scavenger Receptor Cysteine-Rich domain-containing genes (SRCRs), C1q-domain containing genes, and C-terminal fibrinogen-related domain-containing proteins (FReDs) (Additional file 2).
TLRs recruit proteins downstream to transduce their signals through the dimerization of the cytosolic TIR domain with the TIR domain of adaptor proteins . The remaining TIR domain-containing unigene highly expressed in the gills is unlikely to mediate MAMP recognition because it lacks the conserved organisation of TLRs and is further predicted to be localised to the cytosol as evidenced by the absence of a transmembrane domain or signal peptide motif (Fig. 2 b). However, it could regulate lucinid-symbiont interactions by transducing signals from TLRs through their TIR domains. This gene comprises a long N-terminal region containing ARMADILLO repeat-like motifs, a central Death domain, and two N-terminal TIR domains (Fig. 2 b); an unusual domain composition characteristic of the ecTIR-DC family 6 genes reported only in other invertebrates . The Death and TIR domain combination is characteristic of MyD88, the adaptor protein that transduces signals from ligand-bound TLRs , while ARMADILLO repeats are also found on proteins that regulate immune signalling . The functions of these domains and the co-expression of this unigene with TLRs in the gills leads us to speculate it could be involved in transducing TLR signals.
The expression of these diverse molecules involved in various arms of innate immunity suggests cells in the lucinid gills have the capacity to detect and raise a complex response to bacteria. We cannot rule out the possibility that the PRRs, immune signalling and effector molecules expressed in the gills are used to defend the host against harmful bacteria because the gills are constantly exposed to the water from the environment. An important caveat to note is that the KEGG immune signalling pathway annotations are biased towards classical model organisms and little conclusive functional evidence exists to support the involvement of proteins containing the domains discussed (e.g. C-type lectin, Death and SRCR) in bivalve immunity. Nevertheless, these data allow to shortlist candidates for future studies investigating the role of innate immunity in lucinid-symbiont crosstalk as well as candidates for investigating the roles of cell death in regulating endosymbiont abundance or bacteriocyte turnover dynamics.
Unigenes expressed in the gill facilitate the chemoautotrophic metabolism of the endosymbiotic bacteria
The beating of the lateral cilia on bivalve gills generates the water currents that facilitate gas exchange and filter-feeding, and previous studies indicate this activity is under neurohormonal control . This is consistent with the enrichment of BP GO terms associated with ion transmembrane transport in the gills, because half of the unigenes contributing to this GO term encode ion channels typical of the animal nervous system known as nicotinic acetylcholine receptors (Fig. 3; Additional file 3). Unigenes associated with the dynein complex are also enriched amongst the cellular component GO terms (Additional file 4); dyneins are critical to ciliary movement because they convert ATP to mechanical motion . Nicotinic acetylcholine receptors and dynein complex genes are similarly over-represented in the gills of non-chemosymbiotic bivalves , which indicates that these genes are unlikely to be directly involved in the unique symbiont housing role of lucinid gills. Nevertheless, their functions influence host-symbiont interactions because the water flow driven by ciliary beating ensures symbionts are supplied with the oxygen and sulphide necessary for chemoautotrophic metabolism.
In order to supply their symbionts with resources for chemoautotrophic metabolism, the haemoglobins of thioautotrophic invertebrates have evolved to bind sulphide and oxygen while preventing the rapid oxidation of sulphides . Two globin domain-containing unigenes are expressed in the gills, one of which groups with the sulphide-binding hemoglobin HBI of the lucinid Phacoides pectinatus, while the other forms a clade with P. pectinatus HBII and HBIII, which are responsible for oxygen binding (Additional file 5). These findings suggest that like other invertebrates associated with thioautrophic symbionts, L. orbiculatus uses specialised haemoglobins to supply its symbionts with sulphide and oxygen. Through the oxidation of sulphur, endosymbionts produce energy that is used to fixed inorganic carbon in the form of CO2 into sugars . However, dissolved CO2 levels are generally low in seawater and the spontaneous generation of CO2 from HCO3− is minimal due to the high alkalinity of sea water. To solve this physiological challenge, all known thioautotrophic symbioses rely on carbonic anhydrases (CAs) to catalyse the reversible dehydration of bicarbonate to carbon dioxide, which diffuses easily across cell membranes and facilitates inorganic carbon uptake . Two CA unigenes are expressed in the gills (Fig. 4, Additional file 3, Additional file 4), both of which have signal peptides that indicate they are secreted. However, the larger of the two has a predicted GPI anchor site (Serine position 432) that could mediate the attachment of the protein to the cell membrane (Additional file 5). A similar secreted CA bearing a GPI anchor site is expressed in bacteriocytes of the chemosynthetic vesicomyid clam Phreagena okutanii and is probably responsible for catalysing the conversion of HCO3− in the hemolymph into CO2 that can diffuse into the bacteriocytes . The CAs in bivalves regulate multiple physiological functions including respiration, calcification and mineralization [45, 46]. However, in addition to these other physiological roles, we hypothesise that the CAs expressed in the gills of L. orbiculatus could be localised to the bacteriocyte cell membrane or secreted into the gill hemolymph to make inorganic carbon available of to the endosymbionts. An alternative but non-mutually exclusive route for providing the symbionts with inorganic carbon is to transport HCO3− directly into the bacteriocytes. Diatoms rely on the SLC4 family of carbonate anion transporters to obtain the inorganic carbon required for photosynthesis . Unigenes encoding SLC4 were not more abundant in the gills compared to the other organs analysed, but we identified a unigene from the SLC26 family of transporters more highly expressed in the gills (Fig. 4, Additional file 4, Additional file 5). This SLC26 unigene has a highly conserved domain architecture and forms a clade with Drosophila melanogaster and Anopheles gambiae genes encoding Prestin (Additional file 5), a protein with the capacity to transport bicarbonate, oxalate, and sulphate anions . The role of SLC26 in transporting bicarbonate into the bacteriocytes is worth further investigation as the symbiont genome encodes multiple carbonic anhydrases that could convert bicarbonate into CO2 in bacteriocytes .
The lucinid metabolism relies on both nutrient translocation and heterotrophy
Previous studies have inferred that lucinids acquire nutrients from their symbionts through a carbon transfer strategy known as “milking”, whereby nutrients are ‘leaked’ or translocated to the host following carbon fixation . This was supported by microscopic evidence indicating low rates of symbiont digestion in the gills and that symbiont cell division is inhibited [15, 50]. Expressed in the gills are 20 unigenes encoding transporters belonging the to the Solute Carrier (SLC), and folate (Additional file 5). Indeed, the SLC family has been implicated in regulating nutrient transfer in family that contribute to the enrichment of BP GO terms associated with ion transmembrane transport (Fig. 4, Additional file 5). There are a total 46 SLC families that are united by a common mode of function – they use the ion gradients across cell membranes to drive the transmembrane transport of solutes. Fifteen different SLC families are expressed in the gills, with specificities for a diversity of substrates including amino acids, glucose, phosphate, monocarboxylates, organic cations and anions, zinc, ascorbic acidother nutritional symbioses. The SLC5 and SLC46A3 families are thought to regulate nutrient transfer between cnidarians and endosymbiotic algae [51, 52], while members of SLC families 5, 6, 13, and 16 are enriched in the symbiont-housing root of the bone eating worm Osedax japonicas . The diverse transporters expressed in the L. orbiculatus gill therefore likely facilitate lucinid-endosymbiont nutrient exchange and supports the inference that the host obtains nutrients through “milking”, although intracellular digestion may also play a role (see below).
We also identified unigenes from other transporter superfamilies in the gills, including monocarboxylate and organic cation transporters from the Major Facilitator Superfamily, and a DUR3-like unigene encoding a urea-proton symporter (Fig. 4, Additional file 5). DUR3 is thought to be involved in nitrogen transfer between invertebrate hosts (Hydra and the giant clam) and their algal symbionts [54, 55], but the expression of a urea transporter in the lucinid gills is somewhat unexpected because Candidatus Thiodiazotropha endoloripes, the endosymbiont of L. orbiculatus, has the capacity to fix nitrogen. Nevertheless, the endosymbiont genome also encodes the transporters and enzymes (Additional file 5) necessary for metabolising urea , which suggests the L. orbiculatus DUR3 could be used to provide symbionts with a nitrogen source in the form of urea. The recycling of nitrogenous compounds like urea could be a symbiont strategy for coping with nitrogen-limitation while also helping the host remove metabolic waste. An alternative possibility is that the lucinid host could play an important role in controlling symbiont nitrogen metabolism. Indeed, the pea plant prevents rhizobium from assimilating ammonium by providing the bacteria with amino acids, thereby regulating symbiotic nitrogen fixation . It is thus interesting to speculate that metabolite transporters could also serve as a mechanism for controlling symbionts by regulating access to nutrients.
Metabolic flexibility in both L. orbiculatus and its symbionts is likely important in the coastal sediments of Elba where the waters are oligotrophic and the availability of sulphide fluctuates in time and space [57, 58]. Biochemical signals of acid phosphatase and arylsulfatase activity, both markers of lysosomal activity, were detected in freshly collected clams indicating that lucinids also directly digest their symbionts . This is consistent with the expression of unigenes encoding proteins involved in lysosomal activity in the gills, including two acid phosphatases, two arylsulfatases, and one Cathepsin L (Fig. 4, Additional file 3). We also identified two unigenes from the Membrane Attack Complex/Perforin (MACPF) superfamily (PF01823) expressed in the gills. Proteins from this superfamily are pore forming toxins involved in immunity that form holes in the membrane of bacteria leading to cell lysis . A MACPF gene in the oyster Crassostrea gigas is upregulated under bacterial exposure, localised to late endosomes, and is highly expressed in the gills, digestive gland and gonads . While it is possible the lucinid MACPF proteins are involved primarily in immune defence, these genes could potentially be co-opted to facilitate the digestion of endosymbiont bacteria.
Gill endosymbiont digestion is, however, unlikely to play a significant role in nutrient acquisition except under exceptional circumstances such as prolonged sulphide limitation . In fact, many more enzymes typically associated with metazoan digestion are highly expressed in the lucinid visceral mass rather than the gills (Additional file 3). This includes carboxypeptidases, papain and pepsin family aspartic peptidases such as cathepsins (D, F, and L), as well as chymotrypsin and subtilisin family serine peptidases (Additional file 3). The importance of digestion in the visceral mass is further reflected in the enrichment of BP GO terms with chitin and polysaccharide metabolism, as well as the Glycosyl hydrolase family 9 PFAM domain (PF00759) (Fig. 3). These expression patterns suggest the lucinid digestive system is able to digest complex polysaccharides that are typical constituents of phytoplankton in the marine environment . This correlates with the enrichment of lectin C-type domains (PF00059) – corresponding to 25 C-type lectin unigenes – in the visceral mass, suggesting that carbohydrate recognition is important in the lucinid gut. C-type lectins are characterised by having one or more copies of the C-type lectin domain, and they fulfil diverse immunological and physiological roles through the recognition of an array of primarily sugar-based ligands . In the mosquito gut, C-type lectins bind to microbes to enable antimicrobial peptide evasion, thereby allowing microbial colonisation and gut homeostasis . It is likely that the lucinid digestive track is also associated with a thus far unstudied microbiota, and the visceral mass C-type lectins could thus play a role in gut immunity and homeostasis. Alternatively, the lucinid C-type lectins could also regulate food sorting as in Crassostrea gigas, where they are expressed on the mucosal surfaces of feeding organs . Regardless of their precise roles, these lines of evidence suggest that heterotrophy remains an important strategy for acquiring nutrients despite the morphological simplification of the lucinid digestive system and its symbiosis with chemosynthetic bacteria. Our findings thus provide molecular support for previous observations that lucinids are able to capture and ingest particulate organic matter .
The foot as a sensor and regulator of interactions with the environment
The BP GO terms enriched in the foot include transmission of nerve impulse, protein glycosylation and carbohydrate metabolism (Fig. 3). These GO terms correlate with the enrichment of Glycosyl transferase (PF17039 and PF00852), sulfotransferase (PF03567), and von wildebrand A (PF00092) PFAM domains in the foot (Fig. 3). VWA domains are typically found in glycoproteins that form the basis of mucus and are secreted into the extracellular matrix [65, 66]. These proteins are often post-translationally modified with glycan attachment in a process known as glycosylation involving enzymes called glycosyltransferases. Further post-translational modifications are carried out by various enzymes such as sulfotransferases and fucosyltransferases . Therefore, the enrichment of GO terms and PFAM domains associated with glycosylation indicates a battery of genes expressed in the lucinid foot are dedicated to mucus biosynthesis and is consistent with the foot’s ecological and physiological functions. Unlike most other bivalves, lucinids use the foot to build a mucus-lined inhalant tube to access the oxygenated water column while they inhabit the deeper sulphide-rich sediment layers inaccessible to other bivalves . They also dig mucus-lined sulphide-mining tubes ventrally to probe deeper sediments for sulphide to provide to their symbionts [5, 7]. By contrast, gene expression in the non-symbiotic mussel foot is instead dominated by byssal cuticle proteins and other proteins that contribute to connective tissue structures , which reflects the mussel foot’s primary role in secreting byssal threads that facilitate attachment to hard substrates.
The chemosynthetic invertebrates inhabiting deep sea hydrothermal vent ecosystems are constantly bathed in dissolved sulphide, while lucinids, on the other hand, inhabit sediments where sulphides levels may be low or patchy [5, 7]. Consequently, the lucinid foot has become a specialised probe for locating and acquiring sulphide in the clam’s surrounding environment [5, 7]. The enrichment of GO terms associated with nerve impulse transmission indicates that the lucinid nervous system plays an important role in coordinating this behaviour (Fig. 3); unigenes contributing to this GO term include both nicotinic acetylcholine receptors and ionotropic glutamate receptors (iGluRs). In muscles, the former are located at neuromuscular junctions where electrical signals from neurons control the contraction of muscle cells . Three iGluRs unigenes are expressed in the foot, one of which belongs to the NMDA2 subfamily while the remaining two are members of the AKDF subfamily (Additional file 3). iGluRs similarly mediate synaptic transmission throughout the nervous system, and while they do not directly bind H2S, it is interesting that NMDA2 receptor signalling is enhanced by physiological levels of H2S in the mammalian nervous system [69, 70]. Although little is known about the receptors specific for H2S, it is interesting to speculate that the iGluRs expressed in the foot could be involved in coordinating the clam’s behavioural response to sulphide in the environment.
G-protein coupled receptors (GPCRs) are also more abundant in the foot (16 unigenes) than in all other organs, except the mantle (18 unigenes), which is consistent with the mantle’s role in sensing and initiating responses to environmental cues . Based on the Interpro annotations of the unigenes, most of the GPCRs expressed in the foot contain a 7tm_1 domain (PF0001) and have short N-termini regions, which indicate that they belong to the rhodopsin family of GPCRs . GPCRs belonging to this family interact with a broad spectrum of ligands and their expression in the lucinid foot suggests they could play a role in enabling the foot to detect cues in the immediate environment. Furthermore, the deployment of non-overlapping subsets of GPCRs in the mantle and foot also suggests the molecular set-up for sensing environmental cues is specialised to the distinct functions of each organ (Additional file 3). In addition to receptors for chemical stimuli, a unigene encoding a piezo-type mechanosensitive ion channel component is also expressed in the foot; this unigene is likely to be important for mechanical signal transduction as the foot probes the sediment . These findings indicate the lucinid foot possesses an intricate array of receptors deployed for probing the environment and we hypothesise these chemosensory abilities play an important role in detecting and acquiring resources for the symbiosis.
The global transcriptomic profiles of the L. orbiculatus organs are broadly consistent with the expected functions of each organ based on our current understanding of lucinid biology and ecology. However, our findings also allow us to generate new questions to address in future studies. For example, in addition to detecting resources in the environment, does the foot also play a role in replenishing symbiont abundance by secreting mucus that serves as a chemoattractant and first point of contact a potential symbiont might have with the host? Given the importance of heterotrophy, how do seasonal changes to resource availability in the environment (sediment sulphide and phytoplankton abundance) influence the dynamics of nutritional and immune exchanges in the symbiont-housing gills? The transcriptome we have assembled and the accompanying insights it has provided into molecular basis each lucinid organ’s functions, set up the foundation for future studies to build a picture of how lucinids use their molecular toolkit to orchestrate the tripartite interactions linking host, symbiont and the environment.
Sample collection and processing
L. orbiculatus (roughly 1.5 cm in length) were dug up from 10 to 20 cm deep sediment in the Bay of Fetovaia, Elba, Italy (May 2017). The clams were brought back to the Hydra Institute where they were kept in sediment and aerated sea water from the bay for no longer than 3 h before they were fixed in RNAlater (Thermo Fisher Scientific, Vilnius, Lithuania) and stored at − 20 °C. The gills, mantle, visceral mass, and foot from three individuals were dissected, placed in a tube containing sterile glass beads and TRIzol (Thermo Fisher Scientific), and subsequently homogenised using a pellet pestle. Total RNA was extracted following the manufacturer’s instructions. The extracted RNA were quantified using a Qubit 3.0 fluorometer (Thermo Fisher Scientific) and their quality assessed using the RNA 6000 Nano Kit on the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, USA).
To generate a comprehensive reference assembly, we also included 5 additional gill cDNA libraries that were previously sequenced by the group. These gill samples were excised from freshly collected specimens from the same location Elba in October 2015 and from the same batch of specimens that were brought back to the University of Vienna and kept in oxygenated water and Elba sediment until December 2016. The samples were fixed and RNA was extracted as described above. Each RNA extract from these samples was DNase-treated with the Turbo DNA-free Kit (Thermo Fisher Scientific, Vilnius, Lithuania) following the manufacturer’s instructions. The treated RNA was ethanol precipitated by adding 150 μl DEPC water, 20 μl sodium acetate (3 M, pH 5.2), 2 μl glycogen (20 mg/ml), 660 μl ethanol and precipitated at − 80 °C for 30 min. The precipitated RNA was then pelleted at 21,000 x g for 15 min at 4 °C, washed with 70% ethanol, centrifuged at 21,000 x g for 10 min at 4 °C, air dried, and eluted in 30 μl DEPC water.
Library construction and sequencing, sequence pre-processing, and filtering
The RNA samples were poly (A) enriched and cDNA libraries were prepared using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) by the Vienna Biocenter Core Facilities GmbH (VBCF, Vienna, Austria). A total of 12 stranded RNA-Seq libraries (4 tissues × 3 biological replicates per tissue) were multiplexed and sequenced with TruSeq V4 chemistry by the VBCF in one lane of a flow cell on the Illumina HiSeq 2500 to generate 125 bp paired-end reads. Predicted errors in the raw reads were corrected using Rcorrector under default settings and the uncorrectable reads were removed . SortMeRNA was used to remove rRNA sequences from the libraries by searching against the 5S, 5.8S, 16S, 23S, 18S, and 28S rRNA sequences in the package’s default rRNA database . The filtered and corrected reads were then trimmed using Trimmomatic v0.36 with the following settings “SLIDINGWINDOW:4:15 MINLEN:36” . Illumina adapter sequences were also removed from the paired-end libraries with the option ILLUMINACLIP:adapters/TruSeq3-PE-2.fa:2:30. The quality of the trimmed and filtered reads were assessed using FastQC v0.11.5, taking into consideration per base quality scores, GC-content, and read length (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc).
For the additional 5 gill libraries that were added to our comprehensive reference dataset, the RNA samples were processed with a ribosomal RNA depletion step using the Ribo-Zero Gold rRNA removal Kit (Epidemiology) prior to library preparation. Library preparation and sequencing were then carried out by the VBCF as described above. These libraries were processed using the same workflow as the other libraries sequenced for the present study but with an additional step to remove symbiont-derived reads. Briefly, after filtering out rRNA reads using SortMeRNA, the remaining reads were mapped under default parameters to the symbiont genome (accession IDs LVJW00000000 and LVKA00000000; genomes available on NCBI in in BioProject PRJNA314435), using Bowtie2 v18.104.22.168 [12, 77]. The unmapped reads were retained for trimming and processing as described above.
De novo transcriptome assembly, quality assessment
Each filtered library of reads was individually assembled into contiguous cDNA sequences with IDBA_tran v1.1.3 with the parameters -max_isoforms 100 --mink 20 --maxk 80 --step 5 . The assembled libraries were then concatenated and CD-HIT-EST v4.7 was used to cluster the contigs and stringently remove redundancies with the parameters -c 1.0 -G 0 -aS 1.0 -aL 0.005 . The Evidential Gene tr2aacds pipeline was used to predict the coding sequence regions within the transcripts (https://sourceforge.net/projects/evidentialgene/), and highly similar peptide sequences (90% identity) were clustered using CD-HIT v4.7 (−c 0.90) to create the final list of transcripts . We used TransRate to assess the assembled transcriptome with variety of metrics, including number of transcripts, total number of bases, N50 values and GC content using . Assembled transcripts with predicted open reading frames shorter than 80 amino acids were filtered out. Although small open reading frames and other non-coding transcripts are likely to have important albeit unknown biological functions, this arbitrary threshold was chosen to reduce the number of transcripts in the final assembly to reduce the computational resources required for subsequent analyses. We used the BUSCO program v3.0.2 to assess the completeness of the final reference transcriptome through a search against a set of 978 metazoan BUSCOs (Benchmarking set of Universal Single-Copy Orthologs) [81, 82].
Functional annotation of the predicted proteins
The predicted peptide sequences were annotated with protein domain and conserved motif information using InterProScan (5.28–67.0)  and hmmscan against the PFAM-A database . We also annotated the predicted proteins by blastp searching against the SwissProt database (−max_target_seqs 1 -evalue 1e-6). The BLAST and InterProScan outputs were combined and then annotated with GO terms using BLAST2GO v5.0.21 [85, 86]. The predicted peptide sequences were also annotated with KEGG terms using the BlastKOALA webserver (https://www.kegg.jp/blastkoala/) and then mapped to the KEGG pathways database to identify unigenes involved in immune signalling [87, 88].
Identification of differentially expressed genes and functional enrichment of organ-specific expression profiles
Only the libraries sequenced from the organs collected in May 2017 were used for differential expression analyses. We used a combination of Salmon and Corset to map reads to each transcript and cluster transcripts into putative unigenes [89, 90]; each unigene is assembled from transcripts that appear to originate from the same transcriptional locus. A reference transcriptome index was constructed using Salmon v0.9.1 with the parameters “--type quasi -k 31”, and we used the “pseudo-alignment” algorithm to map the reads from each library to the index with default settings . The numbers of reads mapped to each transcript were generated using Corset v1.07 under default parameters but with ‘-m 0’ applied to retain all transcripts . Corset uses the Fragment equivalence classes output by Salmon to cluster transcripts into putative genes based on sequence similarity and read mapping . A Principal component analysis (PCA) of the vsd-transformed (variance stabilizing transformation) counts was used to explore the global transcriptomic variation across samples; implemented through DESeq2 version v1.18.1 according to instructions in the DESeq2 manual  and visualised using the R package “ggplot2” .
Analyses to identify unigenes differentially expressed across the organs were conducted through pairwise comparisons between each organ using DESeq2 v1.18.1 . TXimport v1.6.0 was used to import the read count data from the Salmon output . We excluded unigenes with fewer than 10 reads mapped across all samples to reduce computational demands. A false discovery rate (FDR) of 10% was implemented using the Benjamini–Hochberg method. An FDR-adjusted p-value ≤0.01 and log2 fold change threshold of 1 (equates to a minimum 2-fold change) were used to produce the final list of differentially expressed genes. Organ-specific clusters of unigenes upregulated in one organ versus all others were identified with a Venn diagram (http://bioinfogp.cnb.csic.es/tools/venny).
A hypergeometric test, implemented through the Cytoscape plug-in BiNGO v3.0.3, was used to test for statistical enrichment (FDR-adjusted p-value < 0.05) of GO terms in the lists of organ-specific unigenes . The GO terms enriched in each organ were visually summarised using Revigo . GO terms were clustered using semantic similarity measure (SimRel) in REVIGO and GO terms with > 0.7 redundancy were collapsed . We also tested for statistically enriched PFAM domains amongst the different organs using a hypergeometric test and the p-values were adjusted for an FDR of 10%. The PFAM domain enrichment analyses were carried out in R using the script provided in Supplemental Text 1 of Chandran et al. (2009) .
The authors thank Miriam Weber, Christian Lott and the team of the HYDRA Field Station Elba for their assistance during fieldwork. We would also like to thank Andrew Calcino and Andre Luiz De Oliveira for their invaluable bioinformatics advice.
BY carried out the assembly, annotation and bioinformatics analyses, and wrote the paper. BY and JPetersen conceived the project and experimental plan. BY and JPolzin collected the biological material, prepared the tissues, performed the RNA extractions and evaluated the quality of the RNA. Preparation of the cDNA libraries and sequencing were performed at the VBCF NGS Unit (www.viennabiocenter.org/facilities). All the authors provided critical contributions to improve the early drafts of the manuscript and approved its final version.
This study was supported by a Vienna Research Groups for Young Investigators grant (VRG14–021) awarded to Jillian Petersen from the Vienna Science and Technology Fund (WWTF). The funding agencies played no role in the design of the study, sample collection, analysis or interpretation of the data nor in the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
- 22.Cardini U, Bartoli M, Lücker S, Mooshammer M, Polzin J, Lee RW, Micić V, Hofmann T, Weber M, Petersen JM. Chemosymbiotic bivalves contribute to the nitrogen budget of seagrass ecosystems. ISME J. 2019. https://www.nature.com/articles/s41396-019-0486-9.
- 23.Gerdol M, Fujii Y, Hasan I, Koike T, Shimojo S, Spazzali F, Yamamoto K, Ozeki Y, Pallavicini A, Fujita H. The purplish bifurcate mussel Mytilisepta virgata gene expression atlas reveals a remarkable tissue functional specialization. BMC Genomics. 2017;18(1):590.PubMedPubMedCentralCrossRefGoogle Scholar
- 26.McCartney MA, Auch B, Kono T, Mallez S, Zhang Y, Obille A, Becker A, Abrahante JE, Garbe J, Badalamenti JP, et al. The genome of the zebra mussel, Dreissena polymorpha: A resource for invasive species research. bioRxiv. 2019:696732.Google Scholar
- 40.Détrée C, Haddad I, Demey-Thomas E, Vinh J, Lallier FH, Tanguy A, Mary J. Global host molecular perturbations upon in situ loss of bacterial endosymbionts in the deep-sea mussel Bathymodiolus azoricus assessed using proteomics and transcriptomics. BMC Genomics. 2019;20(1):109.PubMedPubMedCentralCrossRefGoogle Scholar
- 45.Wilbur KM, Saleuddin ASM. 6 - Shell Formation. In: ASM S, Wilbur KM, editors. The Mollusca: Academic Press; 1983. vol. 4. p. 235–87.Google Scholar
- 55.Chan CYL, Hiong KC, Boo MV, Choo CYL, Wong WP, Chew SF, Ip YK. Light exposure enhances urea absorption in the fluted giant clam, Tridacna squamosa, and up-regulates the protein abundance of a light-dependent urea active transporter, DUR3-like, in its ctenidium. J Exp Biol. 2018;221(Pt 8).PubMedCrossRefGoogle Scholar
- 58.Kleiner M, Wentrup C, Lott C, Teeling H, Wetzel S, Young J, Chang Y-J, Shah M, VerBerkmoes NC, Zarzycki J, et al. Metaproteomics of a gutless marine worm and its symbiotic microbial community reveal unusual pathways for carbon and energy use. Proc Natl Acad Sci. 2012;109(19):E1173–82.PubMedCrossRefGoogle Scholar
- 67.Brockhausen I, Stanley P. O-GalNAc Glycans. 2017. In: Varki A, Cummings RD, Esko JD, et al., editors. Essentials of Glycobiology [Internet]. 3rd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2015-2017. Chapter 10. Available from: https://www.ncbi.nlm.nih.gov/books/NBK453030/.
- 71.Schmidt-Rhaesa A, Harzsch S, Purschke G. Structure and evolution of invertebrate nervous systems. Oxford: Oxford University press; 2016.Google Scholar
- 92.Wickham H. ggplot2 Elegant graphics for data analysis. New York: Springer; 2009.Google Scholar
- 96.Chandran D, Tai YC, Hather G, Dewdney J, Denoux C, Burgess DG, Ausubel FM, Speed TP, Wildermuth MC. Temporal global expression data reveal known and novel salicylate-impacted processes and regulators mediating powdery mildew growth and reproduction on Arabidopsis. Plant Physiol. 2009;149(3):1435–51.PubMedPubMedCentralCrossRefGoogle Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.