The methylome of the marbled crayfish links gene body methylation to stable expression of poorly accessible genes
The parthenogenetic marbled crayfish (Procambarus virginalis) is a novel species that has rapidly invaded and colonized a variety of habitats. Adaptation to different environments appears to be independent of the selection of genetic variants, but epigenetic programming of the marbled crayfish genome remains to be understood.
Here, we provide a comprehensive analysis of DNA methylation in marbled crayfish. Whole-genome bisulfite sequencing of multiple replicates and different tissues revealed a methylation pattern that is characterized by gene body methylation of housekeeping genes. Interestingly, this pattern was largely tissue invariant, suggesting a function that is unrelated to cell fate specification. Indeed, integrative analysis of DNA methylation, chromatin accessibility and mRNA expression patterns revealed that gene body methylation correlated with limited chromatin accessibility and stable gene expression, while low-methylated genes often resided in chromatin with higher accessibility and showed increased expression variation. Interestingly, marbled crayfish also showed reduced gene body methylation and higher gene expression variability when compared with their noninvasive mother species, Procambarus fallax.
Our results provide novel insights into invertebrate gene body methylation and its potential role in adaptive gene regulation.
The marbled crayfish (Procambarus virginalis) represents a novel freshwater crayfish species that emerged in the German aquarium trade in 1995. Marbled crayfish reproduce by apomictic parthenogenesis, thus producing large numbers of genetically identical offspring [2, 3, 4]. It is assumed that P. virginalis originated from a very recent evolutionary macromutation in the Florida slough crayfish Procambarus fallax [5, 6]. Comparative whole-genome sequencing of a diverse set of animals from various sources has shown that the global population can be considered a single genetic clone with negligible genetic variation.
Despite their genetic homogeneity, marbled crayfish have successfully invaded and colonized a variety of habitats in subtropical and temperate regions [8, 9]. This is exemplified by the rapid propagation of marbled crayfish on Madagascar, where the animals have increased their distribution area 100-fold over the past 10 years. Of note, the genetic homogeneity of the population precludes the selection of genetic variants as an explanation for rapid adaptation. As such, it is important to understand epigenetic regulation in marbled crayfish.
DNA methylation represents a conserved and well-established epigenetic modification [10, 11, 12]. It is mediated by the family of DNA methyltransferases (DNMTs) which catalyze the methylation of genomic cytosine residues in a wide range of organisms and provide an important toolkit for epigenetic regulation. Two previous studies have used capillary electrophoresis and mass spectrometry to demonstrate the presence of DNA methylation in marbled crayfish [4, 6]. In addition, changes in global DNA methylation levels have been linked to phenotypic variants. However, information about the DNA methylation toolkit, the patterning of DNA methylation and its potential function has been lacking.
Single-base resolution methylation maps have been generated for a large variety of species and have shown a surprising diversity of DNA methylation patterns in the animal kingdom [14, 15, 16], ranging from almost ubiquitous methylation to low levels or no methylation. Also, different species can show methylation in distinct subgenomic compartments, including promoters, repetitive elements and gene bodies [14, 15]. Interestingly, gene body methylation is often associated with housekeeping genes [17, 18].
The function of gene body methylation remains to be fully understood [19, 20]. The modification has been linked to various aspects of gene regulation, including transcriptional elongation, mRNA splicing, chromatin structure and the suppression of cryptic intragenic promoters in transcribed chromatin [21, 22, 23, 24]. In invertebrates, the preferential methylation of highly conserved genes with housekeeping functions and their moderate but stable expression similarly suggests a regulating function for gene expression, possibly through the suppression of transcriptional noise or expression variation [17, 18, 25, 26]. It has also been shown that conflicting chromatin states, determined by the absence of active histone marks and presence of repressive histone marks, are associated with high levels of transcription noise in actively transcribed genes. However, a clear mechanistic understanding of the role of gene body methylation in gene expression variability remains lacking.
We have now used whole-genome bisulfite sequencing to establish high-resolution methylation maps of P. virginalis from several independent animals and tissues. We also performed RNA-seq and chromatin accessibility assay sequencing (ATAC-seq) to integratively analyze the effect of gene body methylation on chromatin accessibility states and gene expression. The results reveal a DNA methylation pattern that is characterized by tissue-invariant gene body methylation of housekeeping genes. While gene body methylation was negatively associated with chromatin accessibility, we also found that active genes with variable expression were less accessible than stably expressed genes. Comparing gene body methylation levels and gene expression variation levels in the marbled crayfish and its parent species P. fallax, we observed overall lower gene body methylation levels and higher gene expression variation levels in the marbled crayfish. Together, these findings establish the methylome of an emerging invasive animal and provide novel insight into invertebrate gene body methylation.
Identification of a conserved DNA methylation system in marbled crayfish
To confirm the expression of the marbled crayfish DNA methylation system, we used qRT-PCR analysis of various developmental stages and dissected tissues from adult animals. Based on published data from Daphnia pulex and an evaluation of five different housekeeping genes (Additional file 1), TATA-box-binding protein (TBP) mRNA was used as an internal reference. The results showed low mRNA levels for all three genes (Dnmt1, Dnmt3 and Tet) in early embryonic stages. Dnmt1 became strongly upregulated in embryonic stage 1.5, while Dnmt3 expression continuously increased during embryogenesis (Fig. 1d). Tet mRNA levels became strongly increased during mid-embryogenesis and remained high in juveniles (Fig. 1d). In adult tissues, Dnmt1 was stably expressed at moderate levels (Fig. 1e), which is consistent with a general maintenance methyltransferase function of the enzyme. The expression pattern of Dnmt3 appeared to be more tissue specific, with the highest level in hemocytes and lowest level in the ovary (Fig. 1e). Tet expression was high in most tissues, but low in the ovary (Fig. 1e). Together, these data show that the components of the DNA methylation system are dynamically expressed during marbled crayfish development and in adult tissues.
Characterization of the marbled crayfish methylome
We also quantified CpG methylation levels for various subgenomic features, such as repeats, exons and introns. This revealed that repeats were relatively hypomethylated (Fig. 2b), while methylation was strongly enriched at gene bodies (Fig. 2b, c). More specifically, average methylation levels were found slightly increased over the genome average in 5’-UTRs and more strongly increased in exons, introns and 3’-UTRs (Fig. 2b). Interestingly, gene body methylation showed a bimodal distribution, with distinct populations of low-methylated and high-methylated genes (Fig. 2d). Further analysis revealed that methylation preferentially targets long, CpG-poor and evolutionarily conserved genes (Additional file 4). These characteristics represent defining features of housekeeping genes, and indeed, methylation levels of housekeeping genes were strongly elevated when compared with other genes (Fig. 2e). Together, our findings thus suggest that DNA methylation in marbled crayfish is enriched at gene bodies and identify housekeeping genes as an important methylation target.
The marbled crayfish methylome is largely tissue invariant
Further analysis also showed that the majority of repetitive elements was unmethylated, while some minor repeat classes, such as DNA transposons and TcMar-Tigger elements, showed higher methylation levels (Fig. 3c, Additional file 5). On the genome level, repeat methylation appeared strongly associated with gene body methylation, as repeats outside of genes had lower methylation levels compared to repeats that were located within genes (Fig. 3d). Of note, repeat methylation again appeared largely invariant between different tissues (Additional file 5), which is consistent with the methylation patterns observed for genes.
Gene body methylation, chromatin accessibility and gene expression variability
To investigate potential gene regulatory functions of DNA methylation in marbled crayfish, we integrated our methylation datasets with RNA-seq datasets that were obtained from three independent hepatopancreas and abdominal musculature samples, respectively (Additional file 6). The results showed a parabolic correlation between gene body methylation and gene expression levels, with the highest and lowest expression ranks being relatively undermethylated (Additional file 7). These results are similar to findings originally made in plants [22, 32]. For further insight, we also addressed the relationship between DNA methylation and chromatin accessibility. For this purpose, we used ATAC-seq to generate high-resolution chromatin accessibility maps that could be analyzed together with DNA methylation maps. ATAC-seq was successfully established for marbled crayfish hemocytes (Additional file 8), which are isolated cells that are suitable for ATAC-seq analysis (Additional file 9). We also generated RNA-seq data from hemocytes (Additional file 10) and integrated the WGBS data from hemocytes into our analysis.
In further analyses, we also explored the relationship between gene body methylation, chromatin accessibility and gene expression variability. Consistent with earlier observations in other organisms [25, 26, 35], we observed that low-methylated genes show greater gene expression variability (Fig. 4d). This inverse correlation between gene body methylation and gene expression variability was also conserved in the two other tissues that were analyzed, hepatopancreas and abdominal muscle (Additional file 12). Also, convincing examples of high-methylated genes with low gene expression variability, and low-methylated genes with high expression variability were identified (Additional file 12). We next divided genes into low-methylated (average methylation ratio < 0.4) and high-methylated (average methylation ratio > 0.4) genes to understand the association between gene expression variation and chromatin accessibility. The results showed that low-methylated genes with stable expression were distinctly more accessible than low-methylated genes with high gene expression variability (Fig. 4e). In contrast, high-methylated genes showed no major difference in accessibility for stably and variably expressed genes (Fig. 4f). These findings suggest that gene body methylation promotes stable expression of poorly accessible genes.
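The grouping step described above can be illustrated with a minimal Python sketch. It splits genes at the stated methylation ratio of 0.4 and compares the median expression variability of the two groups. The function names and the example values are hypothetical, and because the text does not assign genes with a ratio of exactly 0.4 to either group, the sketch leaves them unclassified.

```python
from statistics import median

THRESHOLD = 0.4  # average methylation ratio cutoff stated in the text

def methylation_class(avg_ratio):
    """Label a gene as 'low' or 'high' methylated; None if exactly at the cutoff."""
    if avg_ratio < THRESHOLD:
        return "low"
    if avg_ratio > THRESHOLD:
        return "high"
    return None  # assumption: the boundary value is left unclassified

def median_cv_by_class(genes):
    """genes: iterable of (average methylation ratio, expression CV) pairs."""
    groups = {"low": [], "high": []}
    for ratio, cv in genes:
        label = methylation_class(ratio)
        if label is not None:
            groups[label].append(cv)
    return {label: (median(cvs) if cvs else None) for label, cvs in groups.items()}

# Hypothetical example values, not taken from the study
genes = [(0.05, 0.8), (0.10, 0.6), (0.20, 0.7),
         (0.85, 0.2), (0.90, 0.1), (0.95, 0.3),
         (0.40, 0.5)]
print(median_cv_by_class(genes))  # {'low': 0.7, 'high': 0.2}
```

The median is used here only because it is robust to outliers; the text does not state which summary statistic was used for the comparison.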
Increased gene body methylation and reduced gene expression variability in P. fallax
DNA methylation is a highly conserved modification in the animal kingdom [14, 15]. However, relatively little is known about its potential function in epigenetic regulation outside of mammals. Our study provides an in-depth analysis of DNA methylation in marbled crayfish, an emerging model organism and new invasive species that is characterized by genetic uniformity and high adaptive potential.
A large number of arthropod methylomes have been published over the past years [14, 15, 16]. This includes the methylomes from two crustaceans, Daphnia and P. hawaiensis. However, all known arthropod methylomes have so far been analyzed using DNA preparations from whole animals or a single, specific tissue. We carried out a direct comparison of arthropod methylation patterns from different tissues and from different animals. Our results show that the marbled crayfish methylome is highly conserved between different tissues and therefore substantially different from paradigmatic mammalian methylomes. Tissue-invariant methylation patterns have also been concluded from a comparison of single sperm and muscle methylomes from the tunicate Ciona intestinalis. However, this feature has not been investigated systematically yet and it will be interesting to determine its conservation in invertebrates.
The stability of DNA methylation levels and patterns in marbled crayfish contrasts with the dynamic expression of the DNA methylation toolkit during development and in different adult tissues and may suggest additional non-catalytic functions of these enzymes. It should also be noted that the erasure of parental DNA methylation patterns is considered a fundamental requirement for the establishment of totipotency and organismal development in mammals [41, 42]. It is possible that DNA methylation reprogramming in marbled crayfish occurs before the earliest developmental stage that could be investigated by whole-genome bisulfite sequencing (stage 1.7). Alternatively, DNA methylation may not play a major role in the development of the parthenogenetic marbled crayfish.
The marbled crayfish methylome also showed several additional defining features. This includes the relatively sparse methylation of repeats. While similar observations have been made in a few other organisms, including honeybees, repeat methylation is a functionally important feature of many animal and plant genomes [44, 45, 46]. Also, while mammalian genomes are characterized by dynamic methylation at regulatory regions, such as promoters and enhancers [47, 48, 49], the marbled crayfish methylome is characterized by gene body methylation.
Of note, we found a large number of genes to be hypomethylated in marbled crayfish when compared with its parent species, P. fallax. In addition, we observed that gene expression variability was significantly increased in marbled crayfish. Variable gene expression has recently been identified as a key mechanism for coral adaptation to a variable environment. Furthermore, epigenetic variability has been shown to increase fitness in simulations with fluctuating environments. Altogether, our findings thus raise the interesting possibility that hypomethylation of gene bodies and its associated gene expression variability provide a mechanism for the adaptability and invasive potential of marbled crayfish.
Laboratory animals were bred as described before and handled according to institutional guidelines for the care of laboratory animals. Wild animals were collected from lake Moosweiher (Germany, N48°01.844′E07°48.368′), Moramanga (Madagascar, S18°47.350′E48°14.764′) and lake Reilingen (Germany, N49°17.649′E08°32.672′), in compliance with local fishery regulations. Additional information is provided in Additional file 2.
Identification of P. virginalis Dnmt and Tet homologs
P. virginalis Dnmt1, Dnmt3 and Tet homologs were identified by using the respective annotated coding sequences from Daphnia pulex. Blast (v. 2.2.29+) was used to align assembled marbled crayfish transcripts against the Daphnia pulex sequences. Candidate sequences were validated by searching with Blast against the non-redundant protein sequence database. Read coverage was analyzed by using Bowtie2 (v. 2.2.3) to map transcriptome reads to the respective sequences. Finally, alignments were illustrated in IGV (v. 2.3.34). In addition, the 3′ SMARTER RACE kit (Takara) was used to resolve remaining ambiguities at the 3′-end of Dnmt3.
RNA and DNA preparation
Samples of organs and tissues from adult crayfish for DNA and RNA preparation were taken under a dissection microscope, snap-frozen in liquid nitrogen and stored at −80 °C until extraction of nucleic acids. Embryonated eggs and juveniles were snap-frozen in liquid nitrogen. Genomic DNA was isolated using the Blood and Cell Culture DNA Kit (Qiagen, Hilden, Germany), and total RNA was purified with Trizol (Invitrogen, Darmstadt, Germany).
For first-strand cDNA synthesis, RNA was reverse-transcribed using the QuantiTect Reverse Transcription Kit (Qiagen, Hilden, Germany). qRT-PCR analyses were performed on a LightCycler 480 Real-Time PCR System (Roche, Mannheim, Germany) using the Absolute QPCR SYBR Green Mix (Thermo Scientific, St. Leon-Rot, Germany). The expression levels of Dnmt1, Dnmt3 and Tet were determined by the mean crossing point (Cp) value of three technical replicates using the TATA-box-binding protein (TBP) as a reference gene. Primer sequences were as follows: DNMT1_for: GGGAGAAGGCACTGATTGG and DNMT1_rev: CGATCATCGTTGTTCACCAG; DNMT3_for: GAATGGAACATCAGCACCTGC and DNMT3_rev: CGGTGCTCTCATTCCACAATC; Tet_for: CCAGTAGAAGTGATCAACAGTG and Tet_rev: CCTCCAATATCTGGATCGTGG; TBP_for: CCACAGCTACAGAACATCG and TBP_rev: CTCATGATGACGGCTGC.
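As an illustration of how mean Cp values can be converted into expression relative to a reference gene, the following minimal Python sketch averages the technical replicates and normalizes a target gene to TBP. It assumes the common 2^-ΔCp model with perfect amplification efficiency, which the text does not specify. The function name and the example Cp values are hypothetical.

```python
from statistics import mean

def relative_expression(target_cp, reference_cp):
    """Return target expression relative to the reference gene as 2^-dCp.

    Assumption: both amplicons double exactly once per cycle.
    """
    if not target_cp or not reference_cp:
        raise ValueError("need at least one Cp value per gene")
    # Mean crossing point of the technical replicates for each gene
    delta_cp = mean(target_cp) - mean(reference_cp)
    return 2.0 ** (-delta_cp)

# Hypothetical Cp values for three technical replicates each
dnmt1_cp = [26.0, 26.0, 26.0]
tbp_cp = [24.0, 24.0, 24.0]
print(relative_expression(dnmt1_cp, tbp_cp))  # 0.25
```

A target that crosses the threshold two cycles later than TBP is thus reported at one quarter of the TBP level under this model.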
Whole-genome bisulfite sequencing
Genomic DNA was isolated as described above. The TruSeq PCR-Free Library Prep Kit (LT; Illumina, San Diego, US) was used for library preparation and the EpiTect Kit (Qiagen) for bisulfite conversion. Library amplification was performed using the Kapa HiFi HotStart Uracil+ ReadyMix (2×; Kapa Biosystems). Samples were then sequenced on an Illumina HiSeq platform.
Approximately 500 µL of hemolymph was collected from three independent animals using a 23-G needle inserted in the abdomen of a cold-anesthetized crayfish. One volume of anticoagulant solution (0.14 M NaCl, 0.1 M glucose, 30 mM Na3 citrate·2H2O, 26 mM citric acid, 0.5 M EDTA) was added prior to centrifugation for 5 min at 300 × g and 4 °C. After washing the cell pellet twice with sterile, cold 1× PBS, 50,000 hemocytes were immediately used for the ATAC library preparation. The transposase reaction was optimized and a 20-min reaction was used to avoid DNA “over transposition.” The subsequent steps were as described in the original protocol. Libraries were sequenced on an Illumina HiSeq platform (Additional file 8).
Whole-genome bisulfite sequencing data analysis
Read pairs were quality trimmed (minimum quality value ≥ 15 and minimum length ≥ 36 bp), and both marbled crayfish and P. fallax data were mapped to the marbled crayfish genome assembly using BSMAP version 2.73. Correctly mapped read pairs (appropriate orientation and distance to each other) with both reads mapping uniquely to the same scaffold were used for methylation calling. The methylation ratio (methylation calling) for each CpG was determined by the Python script distributed with the BSMAP package. The provided Python script was slightly changed to analyze only reads fulfilling the following additional criteria: (1) minimum quality value of the base at C position ≥ 30 and (2) minimum quality value of the two bases before and after the C position ≥ 20. Only C-positions with a minimum coverage of three reads per strand were used in further analyses. Bisulfite conversion rates and mapping rates are provided in Additional file 2.
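The additional filtering criteria can be sketched in a few lines of Python. This is not the modified BSMAP script, only a simplified illustration of the three stated thresholds applied to one strand-specific cytosine position. The data layout, the function names and the handling of cytosines near read ends are assumptions.

```python
MIN_C_QUALITY = 30      # criterion (1): quality of the base at the C position
MIN_FLANK_QUALITY = 20  # criterion (2): quality of the two bases on each side
MIN_COVERAGE = 3        # minimum number of reads per strand-specific position

def base_passes(quals, i):
    """Return True if the read base at index i meets both quality criteria."""
    # Assumption: cytosines closer than two bases to a read end are skipped,
    # because their flanking qualities cannot be checked.
    if i < 2 or i + 2 >= len(quals):
        return False
    if quals[i] < MIN_C_QUALITY:
        return False
    flanks = quals[i - 2:i] + quals[i + 1:i + 3]
    return all(q >= MIN_FLANK_QUALITY for q in flanks)

def methylation_ratio(calls):
    """Methylation ratio for one cytosine on one strand, or None if under-covered.

    calls: list of (is_methylated, quals, i), one entry per covering read,
    where is_methylated is True for an unconverted C and False for a T.
    """
    kept = [is_meth for is_meth, quals, i in calls if base_passes(quals, i)]
    if len(kept) < MIN_COVERAGE:
        return None
    return sum(kept) / len(kept)

# Hypothetical reads: four pass the filters, one has a low-quality flanking base
good = [35] * 10
poor = [35, 35, 35, 10, 35, 35, 35, 35, 35, 35]
calls = [(True, good, 5), (True, good, 5), (False, good, 5),
         (False, good, 5), (True, poor, 5)]
print(methylation_ratio(calls))      # 0.5
print(methylation_ratio(calls[:2]))  # None
```

Returning None for under-covered positions keeps them out of downstream averages instead of reporting a ratio based on too few reads.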
Raw data for P. hawaiensis and D. pulex were downloaded from NCBI using accessions PRJNA306836 and GSE60475, respectively. Custom R scripts were used to determine methylation levels in the genome and genomic features. Violin plots were generated for individual methylomes by R’s ggplot2.violinplot function. Housekeeping genes were identified by blasting human housekeeping genes against the genome assembly. Heatmaps were generated by the heatmap.2 function of R. Data were tested for differential methylation using a Wilcoxon rank-sum test. This was applied to each gene as a paired difference test to see whether the means of the two tissue groups are significantly different from each other. The wilcox.test function in R was used with a p value cutoff of 0.1. Barplots for repeat count and methylation were generated in R using ggplot2’s geom_bar function.
RNA-seq data analysis
RSEM was used to calculate expression levels (TPM values) for each sample of the RNA-seq datasets. TPM values were then used to calculate the coefficient of variation of expression levels per tissue and per species, methylation deciles and correlations between the two. For expression ranks, genes were grouped into quintiles and octiles by their TPM values. For hemocytes, correlation of expression levels (TPMs) between samples was confirmed (Additional file 10) and datasets were pooled for gene expression level analyses.
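The two summary quantities named above can be sketched in Python as follows. The example computes a per-gene coefficient of variation from replicate TPM values and assigns rank-based deciles to a list of methylation values. Using the sample standard deviation, skipping genes with zero mean expression and breaking ties by input order are assumptions, and all names and numbers are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(tpms):
    """Return sd / mean of replicate TPM values, or None if undefined."""
    if len(tpms) < 2:
        return None
    m = mean(tpms)
    if m == 0:
        return None  # assumption: unexpressed genes are excluded
    return stdev(tpms) / m  # assumption: sample standard deviation

def decile_of(values):
    """Assign each value a decile from 1 to 10 by rank (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(values) + 1
    return deciles

# Hypothetical TPM values for three replicates per gene
tpm = {"geneA": [10.0, 10.0, 10.0], "geneB": [5.0, 10.0, 15.0]}
for gene, values in tpm.items():
    print(gene, coefficient_of_variation(values))

# Hypothetical average methylation ratios for ten genes
methylation = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.0]
print(decile_of(methylation))  # [10, 2, 6, 4, 8, 3, 9, 5, 7, 1]
```

Averaging the coefficient of variation within each decile would then give one point per decile for a methylation versus variability comparison.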
ATAC-seq data analysis
Raw sequencing data were trimmed and quality filtered (TrimGalore-v0.4.5). Reads were then mapped against the reference genome using Bowtie2, and duplicates were removed with samtools. Broad ATAC peaks were called using MACS2. After confirming a high correlation between the three biological replicates (Additional file 9), we pooled the three samples for downstream analyses. ATAC-seq coverage was directly extracted from the merged bam file for regions surrounding the transcription start site by using samtools’ bedcov function. Heatmaps and metagene plots were produced using the image function of R and the geom_smooth function of ggplot2.
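Regions surrounding the transcription start site have to be defined before coverage can be extracted. A minimal Python sketch of this step is given below. It derives a strand-aware window around each TSS in BED-style coordinates (0-based, half-open). The window size of 1000 bp on each side is a hypothetical choice, since the text does not state it, and the function name and example coordinates are likewise assumptions.

```python
def tss_window(scaffold, start, end, strand, flank=1000, scaffold_len=None):
    """Return a BED-style interval centered on the TSS of one gene.

    Gene coordinates are 0-based and half-open. The TSS is the first base
    of the gene on the plus strand and the last base on the minus strand.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    tss = start if strand == "+" else end - 1
    lo = max(0, tss - flank)          # clip at the scaffold start
    hi = tss + flank + 1              # half-open end, includes the TSS base
    if scaffold_len is not None:
        hi = min(hi, scaffold_len)    # clip at the scaffold end if known
    return (scaffold, lo, hi)

# Hypothetical genes on a hypothetical scaffold
for gene in [("scaffold1", 5000, 8000, "+"),
             ("scaffold1", 5000, 8000, "-"),
             ("scaffold1", 200, 900, "+")]:
    print("\t".join(str(field) for field in tss_window(*gene)))
```

Written to a file, lines of this form could serve as the interval input for a coverage tool such as the samtools bedcov function named in the text.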
FG, CF, JG, GR and VCC analyzed the data. KH and VCC performed experiments. FL conceived the study and wrote the paper with contributions from the coauthors. All authors read and approved the final manuscript.
We thank Stephan Wolf and the DKFZ Genomics and Proteomics Core Facility for sequencing services. We also thank Dr. Manuel Paredes-Rodriguez for help with ATAC-seq library preparation.
The authors declare that they have no competing interests.
Availability of data and materials
The coding sequences of P. virginalis Dnmt1, Dnmt3 and Tet are available in GenBank (accession nos. KM453737, KM453738 and KM453739). The whole-genome bisulfite sequencing, RNA-seq and ATAC-seq datasets are available in the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession no. GSE112411.
Consent for publication
Ethics approval and consent to participate
VCC is supported by a postdoctoral fellowship from CellNetworks.
- 5. Martin P, Dorn NJ, Kawai T, van der Heiden C, Scholtz G. The enigmatic Marmorkrebs (marbled crayfish) is the parthenogenetic form of Procambarus fallax (Hagen, 1870). Contrib Zool. 2010;79:107–18.
- 30. Kao D, Lai AG, Stamataki E, Rosic S, Konstantinides N, Jarvis E, Di Donfrancesco A, Pouchkina-Stancheva N, Semon M, Grillo M, et al. The genome of the crustacean Parhyale hawaiensis, a model for animal development, regeneration, immunity and lignocellulose digestion. Elife. 2016;5:e20062.
- 36. Hobbs HHJ. The crayfishes of Florida. Biol Sci Ser. 1942;3:1–179.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.