Systematic evaluation of C. elegans lincRNAs with CRISPR knockout mutants
Long intergenic RNAs (lincRNAs) play critical roles in eukaryotic cells, but systematic analyses of the lincRNAs of an animal for phenotypes are lacking. We generate CRISPR knockout strains for Caenorhabditis elegans lincRNAs and evaluate their phenotypes.
C. elegans lincRNAs demonstrate global features such as shorter length and fewer exons than mRNAs. For the systematic evaluation of C. elegans lincRNAs, we produce CRISPR knockout strains for 155 of the total 170 C. elegans lincRNAs. Mutants of 23 lincRNAs show phenotypes in 6 analyzed traits. We investigate these lincRNAs by phenotype for their gene expression patterns and potential functional mechanisms. Some C. elegans lincRNAs play cis roles to modulate the expression of their neighboring genes, and several lincRNAs play trans roles as ceRNAs against microRNAs. We also examine the regulation of lincRNA expression by transcription factors, and we dissect the pathway by which two transcription factors, UNC-30 and UNC-55, together control the expression of linc-73. Furthermore, linc-73 possesses a cis function to modulate the expression of its neighboring kinesin gene unc-104 and thus plays roles in C. elegans locomotion.
By using CRISPR/Cas9 technology, we generate knockout strains of 155 C. elegans lincRNAs as valuable resources for studies in noncoding RNAs, and we provide biological insights for 23 lincRNAs with the phenotypes identified in this study.
Keywords: C. elegans, lincRNA, microRNA, CRISPR, phenotype, transcription factor
Abbreviations
C. elegans: Caenorhabditis elegans
ChIP-seq: Chromatin immunoprecipitation sequencing
CRISPR: Clustered regularly interspaced short palindromic repeats
D mns: D motor neurons
lincRNA: Long intergenic noncoding RNA
PRC2: Polycomb repressive complex 2
RPKM: Reads per kilobase per million mapped reads
Long intergenic RNAs (lincRNAs) are a specific class of long noncoding RNAs (lncRNAs) that are encoded by genomic sequences without overlap with genomic sequences of known coding genes [1, 2]. LincRNAs were identified first in mammalian cells, and they are key regulators of diverse biological processes such as transcription and chromatin epigenetics [3, 4]. Mutations in lincRNAs have been shown to promote the development of many complex diseases, such as inflammation, viral infection, and carcinogenesis [3, 5, 6]. For example, one extensively studied lincRNA, hotair, regulates epidermal differentiation and associates with cancer metastasis by interacting with epigenetic factors such as Polycomb repressive complex 2 (PRC2) [7, 8]. LincRNA-p21 has been shown to play crucial roles in hypoxia-enhanced glycolysis by forming a positive feedback loop with HIF-1α. These roles have been characterized mostly with cultured cells, tumor xenografts, and tissues, and only recently, and for a very limited number of lincRNAs, at the whole organismal level [10, 11]. For example, linc1405 has recently been found to modulate the Eomes/WDR5/GCN5 complex in mouse ESCs, and at the whole animal level, depletion of linc1405 impedes heart development in mice. In another study, lincRNA-EPS was found to play a trans role in recruiting the heterochromatin binding protein hnRNP L to control nucleosome positioning and inhibit the transcription of immune response genes, and lincRNA-EPS traditional knockout mice demonstrate enhanced inflammation.
Hundreds of lincRNAs have also been identified in other metazoans such as Caenorhabditis elegans, Drosophila, and zebrafish [12, 13, 14]. There are 170 lincRNAs encoded in the current annotated C. elegans genome [15, 16]. Thus far, little is known about the functions and phenotypes associated with these C. elegans lincRNAs. Furthermore, there has been essentially no systematic analysis of all lincRNAs with knockout strains for any given animal.
CRISPR technology enables efficient production of C. elegans knockout and insertion strains [17, 18, 19, 20, 21, 22, 23]. In this study, we generated knockout strains using CRISPR for 155 of the 170 C. elegans lincRNAs. Among the 6 traits we analyzed, mutants of 23 lincRNAs exhibited phenotypes. We also provided mechanistic insights for these lincRNAs.
Genome-wide characteristics of C. elegans lincRNAs
Compared to mRNAs, lincRNAs were less conserved in 26 nematode species (Fig. 1c). Where conserved sequences were present, they were also shorter in lincRNAs than in mRNAs (Fig. 1c). lincRNAs had significantly fewer exons than mRNAs (Fig. 1d) and were also significantly shorter than mRNAs (Fig. 1e). These differences in exon number and sequence length have also been reported for lincRNAs in several other organisms [1, 12].
Phenotypes of lincRNA CRISPR knockout strains
Expression patterns of lincRNAs with a mutant phenotype
Correlations between lincRNAs and mRNAs
We also analyzed the expression correlations between the lincRNAs and the corresponding coding genes within the 100 kb upstream and downstream genomic regions (Additional file 5: Figure S2a, b); for either all the 170 lincRNAs or the 23 lincRNAs with phenotypes, the correlation between the expression of lincRNAs and mRNAs seemed to have no relevance to the position of the mRNA relative to the lincRNA locus. We further examined the relationship between the mean expression profiles of mRNAs and lincRNAs using Short Time-series Expression Miner (STEM), based on RNA-seq data generated by our group for embryos, L1, L2, L3, and L4 larvae, and young adults. Ten expression profile patterns were obtained after normalizing the mean expression of both lincRNAs and mRNAs in L1, L2, L3, and L4 larvae and young adults to the mean expression in the embryo (Fig. 4c). Nine of the 10 expression profiles (all except profile pattern 2) contained lincRNAs that showed a correlated expression similar to the mRNAs. Among these 10 expression profile patterns, profile patterns 3 and 4 contained the largest number of lincRNAs (11 lincRNAs in each pattern) (Fig. 4c). Gene ontology (GO) analysis of coding genes in profile 3 revealed enrichment for genes involved in the regulation of embryonic development and embryo development ending in birth or egg hatching, among others (Fig. 4d). Among the 11 lincRNAs in profile 3, only one lincRNA, linc-4, had a phenotype (egg retention) (Figs. 2a and 4d). Among the 11 lincRNAs in profile 4, two lincRNAs, linc-17 (developmental delay) and linc-109 (pharyngeal pumping), had phenotypes (Fig. 2a). GO terms in profile 4 showed enrichment for genes in system development, larval development, and pharyngeal pumping (Fig. 4e, f).
Interactions between lincRNAs and microRNAs
It is known that some lincRNAs play cis regulatory roles, and we were interested in whether some lincRNAs might also have trans roles. Many lncRNAs play trans roles as competing endogenous RNAs (ceRNAs) that block the inhibitory regulation of microRNAs (miRNAs) on mRNA targets [25, 26, 27].
To illustrate the interaction of lincRNAs and microRNAs, we also sequenced the microRNA expression profiles of C. elegans in the nine different stages and populations. A functional interaction network between lincRNAs and miRNAs was then built (Fig. 4g). We observed that of the 170 lincRNAs, 28 of them contained at least two miRNA seed regions in their sequences and showed a negative correlation with the corresponding microRNA at expression levels (Fig. 4g, Additional file 6: Table S4). Among these 28 lincRNAs, six, linc-22, linc-60, linc-73, linc-107, linc-109, and linc-126, showed phenotypes in this study (Figs. 2a and 4g). In fact, linc-109 was the lincRNA with the most microRNA interactions in this network.
Rescuing lincRNA phenotypes
Transcriptional regulation of lincRNAs
Previous studies by our group and others have shown that two transcription factors, UNC-30 and UNC-55, work together to specify GABAergic DD and VD motor neurons (mns) in C. elegans [32, 33, 34]. Therefore, we analyzed the ChIP-seq data from endogenously expressed UNC-30 and UNC-55 for their lincRNA targets. UNC-30 regulated 10 lincRNAs, and UNC-55 regulated 9 lincRNAs (Fig. 7j). UNC-30 and UNC-55 shared 6 lincRNA target genes (linc-5, linc-58, linc-73, linc-146, linc-149, and linc-152) (Fig. 7j, k, Additional file 10: Figure S5). The 6 shared lincRNA targets showed a higher relative enrichment in ChIP-seq compared with lincRNA targets that were regulated by either UNC-30 or UNC-55 alone (Fig. 7k). Among the shared lincRNA targets of UNC-30 and UNC-55, linc-5 and linc-73 had phenotypes of pharyngeal pumping and locomotion, respectively (Fig. 2a). Promoter reporters of linc-5 and linc-73 demonstrated that both lincRNAs were expressed in the head region and the D mns (Fig. 3a, b).
Molecular mechanism of linc-73 in locomotion
It is well known that unc-104 plays essential roles in the transportation of presynaptic proteins [35, 36, 37]. There was a slight decrease in the dorsal presynaptic puncta for DD mns in linc-73 mutants compared to a more dramatic decrease in the number of ventral presynaptic VD mn puncta (Fig. 8g). The detailed mechanism by which increased levels of UNC-104 in D mns resulted in an asymmetric presynaptic punctum distribution remains to be investigated. These changes in DD and VD mns in linc-73 mutants would result in relatively weaker inhibition of ventral vs dorsal body wall muscles and thus in a ventral coil phenotype. Taken together, these data suggested a model in which two transcription factors, UNC-30 and UNC-55, co-regulated the expression of linc-73, which then regulated the expression of unc-104 in cis by affecting histone modifications, thereby modulating the formation of presynapses in the D mns and, in turn, C. elegans locomotion (Fig. 8h).
LincRNAs are now recognized as critical players in eukaryotic cells [1, 2, 3, 4]. Studies at the cellular level have uncovered a myriad of functions and functional mechanisms for many mammalian lincRNAs [7, 9, 38]. These lincRNAs can play roles either in the nucleus or in the cytoplasm with an array of trans and cis mechanisms [39, 40].
CRISPR enables fast and efficient genetic engineering, thus providing an opportunity to generate KO strains for nearly all the lincRNAs of an animal, C. elegans. Systematic analyses of these strains for just six traits identified 23 phenotypic lincRNAs; it would be reasonable to speculate that many lincRNAs or even most of them may be phenotypic lincRNAs given the analysis of more (or more complex) traits, such as chemosensory, longevity, and male mating. Researchers have just started to explore the roles of lincRNAs and other lncRNAs systematically with CRISPR screening in mammalian cell cultures [41, 42, 43, 44]. LincRNAs do not have overlapping sequences with other genes, which makes them relatively more adaptive to perturbation, and the results from the manipulations are relatively easier to explain. Our understandings of lincRNAs could also be true for other lncRNAs, as lincRNAs have multiple features that are shared by many other lncRNAs. The study of lincRNAs and lncRNAs in C. elegans is relatively lagging behind that in mammalian cells. The C. elegans KO strains of lincRNAs from this study would be valuable resources for future studies, as this animal is a supreme model organism with powerful genetic and cell biology tools.
Critical roles of lincRNAs at the cellular level sometimes do not translate into physiological significance at the whole organismal level. For example, studies at the cellular level have demonstrated that MALAT1 plays major roles in nuclear speckles for mRNA processing, splicing, and export [45, 46]. However, there is no obvious phenotype in MALAT1 KO mice [47, 48]. Additionally, the physiological roles of hotair have recently been debated, as some researchers report that hotair KO mice do not show an apparent phenotype [49, 50]. Therefore, it is of great value to study lincRNAs both at the cellular level and in animals. Our lincRNA KO strains would facilitate studies at the whole organismal level. A pilot study using traditional methods generated KO strains for 18 murine lincRNAs, and essentially all of these mutants have phenotypes of embryonic lethality or severe developmental defects leading to early death. It is somewhat surprising that none of the 155 C. elegans lincRNA mutants have a lethal phenotype. It could be that mammalian development is much more complicated, and the previous study also selected for lincRNAs with expression patterns more closely associated with neural development.
To analyze the connections of C. elegans lincRNAs to other transcripts and epigenetic markers, we performed ChIP-seq of H3K4me3 and H3K9me3 for L4 worms and RNA-seq for both long RNAs (e.g., lncRNAs, mRNAs, and circular RNAs) and small RNAs (e.g., microRNAs) in nine worm developmental stages and populations (GSE115324). These are also valuable resources for future studies. Network construction and expression profile association can provide mechanism insights in the roles of lincRNAs. For example, co-expression analysis revealed that linc-109 was associated with muscle development and pharyngeal pumping, as well as microtubule-based movement (Fig. 4f), and the phenotype of linc-109 mutant was a pharyngeal pumping defect. The lincRNA-microRNA co-expression and bioinformatic analyses revealed that linc-109 might be regulated by multiple microRNAs (Fig. 4g), and indeed, some of these regulatory effects were experimentally confirmed (Fig. 5). These points and the complete rescue of the linc-109 phenotype by overexpressing this lincRNA (Fig. 6a, c) strongly suggested a trans regulatory role of linc-109, making it highly plausible that it serves as a ceRNA against microRNAs. lincRNAs can play trans roles other than ceRNA [39, 52, 53], and other potential trans roles of C. elegans lincRNAs require further investigations.
For the 8 lincRNAs that were expressed exclusively at one particular stage, only the linc-155 mutant had a phenotype, and the phenotype of a decreased number of progenies seemed to match its exclusive expression in the early embryo (Figs. 1a, b and 2a, f). For the 12 lincRNAs that were ubiquitously expressed, only the linc-4 mutant demonstrated a phenotype, egg retention (Figs. 1a, b and 2a, e), and it was difficult to speculate on any direct link between the ubiquitous expression of linc-4 and the mutant phenotype. For the remaining 150 lincRNAs that were not expressed either ubiquitously or exclusively, the mutants of 21 lincRNAs showed phenotypes in the six traits examined (Figs. 1a, b and 2). For locomotion, defecation, pharyngeal pumping, egg retention, and offspring number, only young adults were examined. Therefore, it was difficult to identify links between the corresponding expression pattern and the phenotype. For the four lincRNAs (linc-17, linc-18, linc-36, and linc-74) with a developmental delay, the mutants already displayed retardation in early development within 24 h of hatching (Figs. 1a, b and 2a, g). All four of them showed relatively high expression levels in the embryo (Fig. 1a, b, Additional file 1: Table S1).
The expression of lincRNAs is under the control of transcription factors, and we noticed that a small portion (8 of ~ 300) of transcription factors (LIN-39, EOR-1, BLMP-1, NHR-77, HLH-1, DAF-16, W03F9.2, and NHR-237) regulated the expression of ≥ 50 lincRNAs (Fig. 7a–f). It would be interesting to further investigate the biological relevance underlying this regulatory phenomenon. A lincRNA can be transcriptionally regulated by multiple transcription factors together (Fig. 7). For example, linc-73 is regulated by 48 transcription factors, including UNC-30 and UNC-55, two transcription factors that converge to control the differentiation and plasticity of GABAergic D mns [32, 33, 34]. Six lincRNAs are co-regulated by UNC-30 and UNC-55 (Fig. 7j). It was surprising that CRISPR knockout of only one of the six lincRNAs, linc-73, gave rise to uncoordination (Figs. 2a, b and 8). It is known how linc-73 plays a cell-autonomous role in D mns to regulate the expression of unc-104 (Fig. 8), but the roles of the other 5 lincRNAs that are also commonly regulated by UNC-30 and UNC-55, and why KO strains of these lincRNAs do not show a locomotion defect, remain to be elucidated. The 23 lincRNAs with mutant phenotypes in this study tended to be regulated by more transcription factors in L1, L2, and L3 worms (Fig. 7g–i). It is possible that these lincRNAs are subject to greater physiological regulation, and thus, their perturbation may be more likely to cause defects. As for the regulation by histone modifications, our results show that both H3K4me3 and H3K9me3 regulate linc-73 at the L2 stage, although only H3K4me3 but not H3K9me3 binds to linc-73 at the L4 stage (Figs. 1c and 8e). H3K9me3 has fewer genomic binding peaks than H3K4me3 in our study and also in data from others (NCBI BioProject: PRJEB20485).
We have presented data to support that linc-73 plays a cis role to regulate the expression of unc-104 (Figs. 4a and 8), although it is possible that linc-73 also has a trans role because the overexpression of linc-73 via an extrachromosomal construct could partially rescue the linc-73 phenotype (Fig. 6a, b). However, linc-109 has been shown to function with trans roles (Figs. 4g, 5b–d, and 6a, c), although the expression of neighboring genes is altered in linc-109 KO, which may be an indication of a cis role (Fig. 4a). The effects of linc-109 KO on the expression of its neighboring genes may not contribute to the mutant phenotype, as the extrachromosomal construct could fully rescue the linc-109 phenotype (Fig. 6a, c). The application of CRISPR actually deletes the DNA sequences of lincRNAs, which may harbor DNA elements that regulate the expression of neighboring genes. Thus, for each individual lincRNA, an array of experiments must be performed to elucidate the potential cis and/or trans role.
By using CRISPR, we have generated knockout strains of 155 C. elegans lincRNAs as valuable resources for studies in ncRNAs. Systematic analyses of these strains for just six traits identified phenotypes in 23 lincRNA mutants. We have characterized some aspects of the expression patterns, molecular mechanisms, and other regulatory relevance of these lincRNAs.
Animal cultures and strains
Unless otherwise stated, all C. elegans strains used in this study were maintained on standard nematode growth medium (NGM) at 20 °C or 25 °C. N2 Bristol was obtained from the Caenorhabditis Genetics Center (CGC). Eight strains, XIL0375, XIL0389, XIL1172, XIL0354, XIL1177, XIL0386, XIL0411, and XIL1237, were gifts from Dr. Xiao Liu. All worm strains generated or used in this study are listed in Additional file 3: Table S2.
Gravid adult worms were washed three times with M9 and collected into 1.5 ml tubes, after which the tubes were centrifuged at 600 g. Animals were then treated with hypochlorite. Synchronized embryos were cultured at 20 °C on NGM plates with seeded OP50.
pDD162, expressing Cas9 II protein, was a kind gift from Dr. Guangshuo Ou. For long lincRNAs (> 2 kb), 3–6 sgRNAs were designed to target the 5′ ends of the lincRNA. In the case of short lincRNAs (< 2 kb), 2–3 sgRNAs targeting the 5′ and 3′ ends of each lincRNA were used. In order to enhance the efficiency of the sgRNAs, we specifically selected sgRNAs containing two NGG PAM motifs at the 3′ end of the sgRNA sequence. All the sgRNA sequences used in this work were assessed at http://crispor.tefor.net/. The 20 nt sgRNA sequence was inserted behind the U6 promoter of the pDD162 plasmid between the EcoRI and HindIII restriction endonuclease sites. Homologous recombination plasmids were generated by cloning the 1.5 kb DNA sequence upstream of the site of interest, 2 kb lincRNA promoter sequence, GFP sequence, and 1.5 kb DNA sequence downstream of the site of interest between the Sph I and Apa I sites of pPD117.01. For lincRNA transcriptional reporters, an approximately 2.5 kb promoter sequence was cloned from genomic DNA. The corresponding product was fused with the sl2 sequence and was inserted between the Sph I and Age I sites of pPD117.01 expressing GFP or between the Pst I and Age I sites of pPD95.67 expressing RFP (Andrew Fire collection, Addgene). For rescue plasmids, a 2 kb promoter sequence was cloned from genomic DNA, and the full-length lincRNA sequence was cloned from cDNA. All of these products were inserted into pPD117.01 between the Sph I and Apa I double-digested sites. In the dual-color system for the in vivo analysis of miRNA-lincRNA interaction, we constructed GFP reporters for the selected lincRNA by replacing the 3′ UTR region of pPD117.01 with the complete wild-type sequence of the lincRNA of interest. As a control, mutated versions of each lincRNA, in which the respective miRNA binding sites within the lincRNA were mutated, were also cloned into pPD117.01.
miRNA overexpression plasmids were constructed by cloning the pri-miRNA sequence of the miRNA into pPD95.67 driven by the promoter of the corresponding lincRNA. For construction of the linc-73::Punc-104::mCherry plasmid, the linc-73 (TTS insertion)::Punc-104::mCherry plasmid, and the linc-73 (mutated UNC-30 binding site)::Punc-104::mCherry plasmid, the linc-73 promoter and gene body sequence, the unc-104 promoter sequence, and the mCherry sequence were cloned separately and inserted into pPD117.01. The UNC-30 or UNC-55 binding site in the linc-73 promoter was mutated from GATTA to CTCAG (for UNC-30) or from ATCGATCCAT to CGATCGAACG (for UNC-55). A 2X transcriptional terminal site (2X TTS, AAATAAAATTTTCAGAAATAAAATTTTACA) was inserted into the 5′ portion of linc-73. A list of primers used is provided in Additional file 12: Table S6.
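The sgRNA selection step described above can be illustrated with a minimal sketch that lists candidate 20 nt protospacers followed by an NGG PAM on the forward strand of a target sequence. This is not the CRISPOR procedure used in the study; the function name `find_protospacers` is hypothetical, only the forward strand is scanned, and the additional criterion of two NGG motifs is not implemented.

```python
import re


def find_protospacers(seq, guide_len=20):
    """Return (start, protospacer, pam) for every guide_len-nt site that is
    immediately followed by an NGG PAM on the forward strand of seq.

    Hypothetical helper for illustration only; it does not score guides.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, not a real lincRNA locus.
    toy = "ACGTACGTACGTACGTACGTAGGCC"
    for start, spacer, pam in find_protospacers(toy):
        print(start, spacer, pam)
```

Candidates found this way would still need to be assessed for efficiency and off-target potential, as the authors did with CRISPOR.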
Injection of CRISPR/Cas9 knockout and knock-in and other plasmids
The CRISPR/Cas9 system was used as previously described with modifications. For the knockout system, we mixed 3–6 PU6::lincRNA sgRNA plasmids (30 ng/μl of each) and pDD162 expressing Cas9 II protein (30 ng/μl), as well as the co-injection marker Pmyo-2::mCherry (pCFJ90) (10 ng/μl). The mixture was injected into about 30 N2 adults (adulthood day 1). For the CRISPR knock-in, the upstream locus of linc-1 in chromosome I was selected as the knock-in site due to the presence of fewer genes located in the linc-1 neighborhood. PU6::sgRNA plasmids (30 ng/μl), pDD162 plasmid (30 ng/μl), co-marker plasmid (10 ng/μl), and homologous recombination plasmid (40 ng/μl) were injected into 30 gravid worms, and transgenes were selected as described above. All the knockout or knock-in mutant worms were transferred to new plates and outcrossed for at least three generations to eliminate off-target mutations. In the dual-color system, wild-type or mutated lincRNA reporters (20 ng/μl) were mixed with miRNA overexpression plasmids (20 ng/μl), control plasmids (20 ng/μl), and a 1-kb DNA ladder (Invitrogen) standard. For the rescue experiments, the overexpression plasmid of each lincRNA (20 ng/μl) was mixed with the co-marker plasmid (pCFJ90, 20 ng/μl) as well as DNA ladder (Invitrogen). The linc-73::Punc-104::mCherry plasmid (20 ng/μl), linc-73 (TTS insertion)::Punc-104::mCherry plasmid (20 ng/μl), linc-73 (mutated UNC-30 binding site)::Punc-104::mCherry plasmid (20 ng/μl), and linc-73 (mutated UNC-55 binding site)::Punc-104::mCherry plasmid (20 ng/μl) were each mixed with myo-2::GFP and injected into gravid young adults. Standard microinjection techniques were used.
Screening for CRISPR deletion and knock-in strains
Approximately 200 F1 worms were singled after the injection and cultured at 25 °C. Genomic DNA of the F3 generation was extracted and examined by PCR. Worms were harvested and transferred to 100 μl lysis buffer (20 μg/ml proteinase K, 100 mM KCl, 10 mM Tris-HCl pH 8.3, 1.5 mM MgCl2), placed at − 80 °C for 10 min, and then incubated at 65 °C for at least 2 h. Worms were then placed at 95 °C for 15 min to inactivate proteinase K, and 2 μl of each worm lysate was used as DNA template for PCR amplification with primers spanning the sgRNA-targeted regions. For the verification of the knock-in strains, we amplified genomic regions spanning the point of insertion. Worms with the correct PCR products were singled to NGM plates and further confirmed by DNA sequencing of the genomic PCR products. CRISPR worms were outcrossed at least three times before being used in experiments. The primers used for PCR screening are listed in Additional file 12: Table S6.
To examine locomotion of worms, young adult worms were moved from the bacterial lawn of an agar culture plate to bacteria-free plates at room temperature and allowed to crawl away from any food remains for about 10–20 s. Complete body bends per 20 s were then counted under a dissecting microscope after animals were gently touched at the tail end (n, number of worms = 5; N, number of replicates = 3).
Defecation assays were performed according to a previous report. Data were collected by recording the time between defecation cycles of young adult worms (n, number of worms = 5; N, number of replicates = 3).
Pharyngeal pumping behavior was assayed as previously described [55, 57]. Pharyngeal pumping was examined by counting grinder movements for 20 s at 20 °C (n, number of worms = 7; N, number of replicates = 3).
The egg retention assay was carried out as described earlier with some modifications. Adult worms one day after the last molt were singled out and lysed in hypochlorite solution for 6 min in a 96-well plate, and the number of eggs was counted (n, number of worms = 12; N, number of replicates = 3).
Examination of development stages
To examine the developmental stages of worms, synchronized eggs were allowed to hatch at 20 °C and grow on NGM plates with adequate food, and their developmental stages were examined after 24 h and 48 h (n, number of worms = 30; N, number of replicates = 3).
Number of progenies
L4 worms were singled on NGM plates and allowed to lay eggs at 20 °C. Individual worms were transferred daily from the start of egg laying until egg laying stopped. The number of live offspring (L1) was counted (n, number of worms = 7; N, number of replicates = 3). All experiments were performed under a dissecting microscope.
Quantitative RT-PCR (qRT-PCR) and quantitative PCR (qPCR)
RNAs were extracted from worms in TRIzol L/S solution (Invitrogen) after three cycles of freezing at − 80 °C and thawing at room temperature. Five hundred nanograms of total RNA was reverse transcribed into cDNA with a cDNA synthesis kit (Goscript™ Reverse Transcription System, Promega). qRT-PCR (with cDNA template) and qPCR (with genomic DNA template) were performed using a GoTaq qPCR Master Mix kit (Promega) on a PikoReal 96 real-time PCR system (Thermo Scientific) according to standard procedures. 18S RNA was used for normalization. All PCR products were sequenced for confirmation. All primers used are listed in Additional file 12: Table S6.
Microscopy and calculating the relative fluorescence intensity
For all the lincRNA reporter worms, an Axio Scope A1 compound microscope (Zeiss, Oberkochen, Germany) was used for the examination of fluorescence. L4 stage worms were anesthetized in 10 mM sodium azide, and images were taken using the × 20 objective. All the images were analyzed with ImageJ (an open-source image processing software). Confocal imaging was carried out as previously reported with some modifications. Imaging of anesthetized worms was carried out on an Andor Revolution XD laser confocal microscope system (Andor Technology PLC) based on a spinning-disk confocal scanning head CSU-X1 (Yokogawa Electric Corporation) under control of Andor IQ 10.1 software or on a two-photon confocal laser scanning microscope FV1200MPE (Olympus) with a GaAsP-NDD detector. Z-stack images were obtained on an Olympus IX-71 inverted microscope (Olympus Corporation) with a × 60 1.45 NA oil-immersion objective. An Andor iXonEM+ DV897K EM CCD camera was used for capturing the 14-bit digital images with an Andor LC-401A Laser Combiner with diode-pumped solid state (DPSS) lasers, with emissions at 458 nm, 488 nm, 515 nm, and 561 nm.
Counting the presynaptic puncta in ventral and dorsal
Dorsal nerve cord and ventral nerve cord images were obtained, and puncta were counted between VD9 and VD11. The ImageJ plot profile tool was used to plot the intensity along the nerve cords, and the number of SNB-1::GFP (Punc-25::snb-1::gfp) puncta was calculated by counting the number of crests of the plot profile (n = 4).
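Counting crests in an intensity profile can be sketched as follows. This is only an illustration of the idea, not the procedure run in ImageJ: the function name `count_crests` and the optional `min_height` threshold are assumptions, and the input is taken to be a list of intensity values exported from a plot profile.

```python
def count_crests(profile, min_height=0.0):
    """Count local maxima (crests) in a 1-D intensity profile.

    A crest is counted when the signal rises and then falls; flat tops are
    counted once. Crests lower than min_height are ignored. Values at the
    two ends of the profile are never counted as crests.
    """
    crests = 0
    rising = False
    for prev, cur in zip(profile, profile[1:]):
        if cur > prev:
            rising = True
        elif cur < prev:
            # A fall right after a rise marks the end of a crest.
            if rising and prev >= min_height:
                crests += 1
            rising = False
    return crests


if __name__ == "__main__":
    # Toy profile with three bumps, one of them below the threshold.
    toy_profile = [0, 5, 1, 6, 6, 2, 0, 1, 0]
    print(count_crests(toy_profile))
    print(count_crests(toy_profile, min_height=2))
```

A threshold of this kind is one way to avoid counting background noise as puncta; the value would have to be chosen per imaging setup.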
For next-generation RNA sequencing, total RNAs were isolated from nine different stages and populations of worms (embryos, L1, L2, dauer, L3, L4, young adult, male, and mixed stage with starvation). Sequencing libraries were prepared as previously described with modifications. Whole transcriptome libraries were constructed with the TruSeq Ribo Profile Library Prep Kit (Illumina, USA), according to the manufacturer’s instructions. In brief, rRNA was depleted from 10 μg of total RNA with an Illumina Ribo-Zero Gold kit, and the RNA was purified for end repair and 5′-adaptor ligation. Then, reverse transcription was performed with random primers containing 3′ adaptor sequences and randomized hexamers. The cDNAs were purified and amplified, and PCR products of 200–500 bp were purified, quantified, and stored at − 80 °C until sequencing. For RNA sequencing of long RNAs, the libraries were prepared according to the manufacturer’s instructions and subjected to 150 nt paired-end sequencing with an Illumina Hiseq 2500 system (Novogene, China). We sequenced each library to a depth of 10–50 million read pairs, and the reads were mapped to the C. elegans genome (ce11). For small RNA (sRNA) sequencing, nine sRNA libraries were generated with TruSeq small RNA (Illumina, USA) according to the manufacturer’s instructions. Then, the prepared libraries were sequenced with an Illumina Nextseq 500 system (Novogene, China). After filtering out the reads shorter than 15 nt, the remaining reads were mapped to the C. elegans genome (ce11) and the miRNA database in miRBase with bowtie (-v 1).
Conservation, length, and exon number analysis of lincRNAs
For the genome-wide feature analysis of lincRNAs, the control was 200 mRNAs randomly picked from the C. elegans transcriptome. The length and exon number of lincRNAs and mRNAs were extracted from the C. elegans annotation. For the analysis of sequence conservation, we retrieved the 26-nematode phastCons conservation scores from UCSC for each base of each C. elegans lincRNA or mRNA and averaged the scores over each transcript. The distributions for lincRNAs and mRNAs were compared with a two-sided Mann-Whitney U test.
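The comparison described above, a per-transcript mean score followed by a two-sided Mann-Whitney U test, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the p value uses the normal approximation with tie correction and no continuity correction (the source does not say which variant was used), and the numbers are toy values rather than phastCons data.

```python
import math
from statistics import mean


def mean_score(per_base_scores):
    """Average per-base conservation scores over one transcript."""
    return mean(per_base_scores)


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u_x, 1.0
    z = (u_x - mu) / math.sqrt(var)
    return u_x, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Toy per-transcript means, not real data.
    linc_means = [mean_score(s) for s in ([0.1, 0.2], [0.0, 0.3], [0.2, 0.2])]
    mrna_means = [mean_score(s) for s in ([0.6, 0.8], [0.5, 0.9], [0.7, 0.9])]
    print(mann_whitney_u(linc_means, mrna_means))
```

For real analyses an established implementation with exact p values for small samples would be preferable to this approximation.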
Construction of lincRNA-miRNA co-expression network
Functional networks of miRNA and lincRNA pairs were illustrated with Cytoscape v3.5.1. A lincRNA and a miRNA were connected when the lincRNA contained at least two 7-mer matches to that miRNA and its expression was negatively correlated with that of the miRNA (Pearson R < − 0.1) across the nine stages.
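The edge criterion above can be sketched as a small function that counts 7-mer seed matches and checks the Pearson correlation. Several details are assumptions made for illustration, because the text does not specify them: the 7-mer is taken to be the reverse complement of miRNA nucleotides 2–8, overlapping matches are counted, and all function names are hypothetical. The expression values in the example are toy numbers.

```python
import math

_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """7-mer a target would carry: reverse complement of miRNA nt 2-8
    (assumed seed definition)."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return "".join(_COMPLEMENT[base] for base in reversed(seed))


def count_sites(transcript, mirna):
    """Number of (possibly overlapping) seed-match 7-mers in the transcript."""
    site = seed_site(mirna)
    rna = transcript.upper().replace("T", "U")
    return sum(1 for i in range(len(rna) - 6) if rna[i:i + 7] == site)


def pearson_r(a, b):
    """Pearson correlation; returns 0.0 if either series is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def is_edge(transcript, mirna, linc_expr, mir_expr, min_sites=2, r_cutoff=-0.1):
    """Apply the two criteria: enough seed matches and negative correlation."""
    enough_sites = count_sites(transcript, mirna) >= min_sites
    return enough_sites and pearson_r(linc_expr, mir_expr) < r_cutoff


if __name__ == "__main__":
    let7 = "UGAGGUAGUAGGUUGUAUAGUU"        # let-7 mature sequence
    toy_linc = "AACUACCUCGGCUACCUCAA"      # toy transcript with two sites
    print(seed_site(let7), count_sites(toy_linc, let7))
    print(is_edge(toy_linc, let7, [1, 2, 3], [3, 2, 1]))
```

The resulting pairs could then be exported as an edge list for visualization in a network tool such as Cytoscape.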
Chromatin immunoprecipitation (ChIP)
ChIP assays were performed as described in our previous report with modifications. N2 and linc-73 mutant worms were bleached with hypochlorite solution, and the eggs were incubated at 20 °C on NGM plates seeded with OP50 to be synchronized to the L2 stage (for ChIP-qPCR experiments) or L4 stage (for ChIP-seq experiments). Synchronized worms were then washed with three changes of M9 buffer and fixed with 2% formaldehyde for 35 min, followed by quenching with 100 mM Tris pH 7.5 for 2 min. Worm pellets were washed with FA buffer supplemented with 10 μl 1 M DTT, 50 μl 0.1 M PMSF, 100 μl 10% SDS, 500 μl 20% N-lauroyl sarcosine sodium, and 2 tablets of protease inhibitors in 10 ml FA buffer. Worms were sonicated on ice for 15 min with the setting of high power, 4 °C, and 15 cycles, 30 s on, 30 s off. The tubes were then spun at 14,000 g for 10 min at 4 °C. The supernatant was carefully removed into new tubes, and an aliquot (5% of each sample) was taken as input. Prewashed salmon sperm Protein G beads were added to the supernatant for 1 h for pre-clearing. Beads were discarded, and 2 μg anti-H3K4me3 or anti-H3K9me3 (Abcam) was added to each tube overnight at 4 °C. The beads were washed twice with 150 mM NaCl FA buffer for 5 min each, washed once with 1 M NaCl FA buffer for 5 min, twice with 500 mM NaCl FA buffer for 10 min, once with TEL buffer (0.25 M LiCl, 1% NP-40, 1% sodium deoxycholate, 1 mM EDTA, 10 mM Tris-HCl, pH 8.0) for 10 min, and finally with three changes of 1X TE buffer (1 M Tris-HCl, 0.5 M EDTA). DNA-protein complexes were eluted in 200 μl of ChIP elution buffer (1% SDS in TE with 250 mM NaCl) and incubated at 65 °C for 20 min with regular shaking every 5–10 min. Both samples and inputs were treated with RNase A (2 μg/μl) and proteinase K (2 μg/μl) for 2 h at 55 °C for 1 h and then reverse cross-linked at 65 °C overnight. DNA was purified by phenol/chloroform/isoamyl extraction and then used for ChIP-qPCR or ChIP-seq.
For ChIP-seq, DNA from ChIP (along with the input) was iron fragmented at 95 °C followed by end repair and 5′ adaptor ligation, then purified and amplified. PCR products corresponding to 200–500 bp were purified for sequencing. Sequencing was then performed on an Illumina NextSeq 500 system with 150-nt paired-end reads (Novogene).
Analysis of ChIP-seq data of transcription factors and H3K4me3 and H3K9me3
A total of 774 raw ChIP-seq fastq data sets and 561 computed gff3 files were downloaded from modENCODE (ftp://data.modencode.org/C.elegans/Transcriptional-Factor/ChIP-seq/) [30, 31], and the regulation patterns of lincRNAs by all of these transcription factors in C. elegans were analyzed. The quality of all 774 raw fastq data sets was verified by mapping the reads to the C. elegans genome (ce11) with bowtie2. We then re-analyzed the calculated peaks to investigate the regulation of the various lincRNAs by transcription factors. Considering the shorter length of lincRNA transcripts compared with mRNAs, we considered the region within 1 kb upstream or 200 bp downstream of the transcription start site of each lincRNA. ChIP-seq data of UNC-30::GFP and UNC-55::GFP from our previous study using endogenous GFP knock-in unc-30 and unc-55 mutant worms were also analyzed (GEO: GSE102213). Reads from genomic repeats were first filtered out, and the unique reads were then mapped to the C. elegans genome (ce11) with bowtie2. Peaks of UNC-30 and UNC-55 were assigned by CisGenome with default parameters (cutoff > 3 and p value < 10⁻⁵). H3K4me3 and H3K9me3 ChIP-seq data of the L4 stage were mapped to the C. elegans genome (ce11) with bowtie2 using the default parameters. SAMtools was used to filter SAM files and remove duplicated reads. MACS2 was used for peak calling (the q parameter was set to 0.001).
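The window around each lincRNA transcription start site (TSS) can be illustrated with a short sketch. It is a minimal example under stated assumptions rather than the code used in the study: coordinates are taken as 1-based and inclusive, the window is assumed to be strand-aware (upstream is toward lower coordinates on the plus strand and toward higher coordinates on the minus strand), and the function names and the peak tuple layout `(chrom, start, end, factor)` are hypothetical.

```python
def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Return (start, end) of a strand-aware window around a TSS."""
    if strand == "+":
        return max(1, tss - upstream), tss + downstream
    if strand == "-":
        return max(1, tss - downstream), tss + upstream
    raise ValueError("strand must be '+' or '-'")


def peaks_in_window(peaks, chrom, window):
    """Return the peaks on `chrom` that overlap `window` by at least 1 bp.

    Each peak is a tuple (chrom, start, end, factor).
    """
    start, end = window
    return [p for p in peaks if p[0] == chrom and p[1] <= end and p[2] >= start]
```

With a hypothetical TSS at position 5000 on the plus strand, `promoter_window(5000, "+")` spans positions 4000 to 5200, and `peaks_in_window` then keeps only the transcription factor peaks that touch that interval on the same chromosome.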
Short Time-series Expression Miner analysis (STEM)
The co-expression patterns of lincRNAs and mRNAs were calculated with STEM (a software program designed for clustering, comparing, and visualizing gene expression data from short time series experiments) using RNA-seq data from nine different developmental stages. RNA-seq data from the embryonic stage were set as the zero point, and the other developmental stages were normalized to the embryonic stage data. The K-means method was used to cluster the genes into specific profiles according to their expression patterns. In all, nearly 20,000 genes with reads per kilobase per million mapped reads (RPKM) greater than 1 were clustered into 10 profiles according to changes in their expression patterns at different stages of development. The function of genes in specific clusters with similar expression patterns was analyzed by gene ontology analysis.
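The filtering and normalization that precede clustering can be sketched as follows. This is only an illustration with several assumptions that the text does not specify: a gene is kept if its RPKM exceeds 1 in at least one stage, normalization is expressed as a log2 ratio relative to the embryonic stage, and a pseudocount of 1 is added to avoid division by zero. The function name and the example values are hypothetical, and the clustering step itself (performed in STEM) is not reproduced here.

```python
from math import log2


def filter_and_normalize(expr, min_rpkm=1.0, pseudocount=1.0):
    """Filter genes by RPKM and normalize each profile to its first stage.

    expr maps a gene name to a list of RPKM values, embryonic stage first.
    Returns gene -> log2 ratios relative to the embryonic stage, so the
    first value of every profile is 0.
    """
    normalized = {}
    for gene, values in expr.items():
        # Assumption: keep a gene if any stage exceeds the RPKM cutoff.
        if max(values) <= min_rpkm:
            continue
        base = values[0] + pseudocount
        normalized[gene] = [log2((v + pseudocount) / base) for v in values]
    return normalized
```

For example, a made-up profile of `[1, 3, 7]` RPKM becomes `[0.0, 1.0, 2.0]` after normalization, while a profile that never exceeds 1 RPKM is dropped before clustering.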
Significantly enriched genes were analyzed with the GOrilla web server. P values were calculated with default parameters.
For Student’s t tests, the values reported in the graphs represent averages of independent experiments, with error bars showing s.e.m. in all figures, except for Fig. 5 and Additional file 7: Figure S3, in which error bars show S.D. Statistical methods are also indicated in the figure legends. All statistical significances were determined using GraphPad Prism software (version 7). A two-sided Mann-Whitney U test was used in Figs. 1d, e and 7g–i and Additional file 7: Figure S3. An unpaired Student’s t test was used in Figs. 2b–f, 5, 6b–f, and 8c–g and Additional file 7: Figure S3. A chi-square test was used in Figs. 2g and 6g.
We thank Dr. Guangshuo Ou for providing plasmids, Dr. Xiao Liu for providing strains used in this work, and Dr. Shouhong Guang for providing experimental facilities. We thank the Bioinformatics Center of the USTC, School of Life Sciences, for providing supercomputing resources.
This work was supported by the National Basic Research Program of China (2015CB943000), the National Key R&D Program of China (2018YFC1004500), the National Natural Science Foundation of China (31725016 and 31471225), the Major/Innovative Program of Development Foundation of Hefei Center for Physical Science and Technology (2016FXCX006), the Open Project of the CAS Key Laboratory of Innate Immunity and Chronic Disease (KLIICD-201603), and the Strategic Priority Research Program (Pilot study) “Biological basis of aging and therapeutic strategies” of the Chinese Academy of Sciences (XDPB10).
Availability of data and materials
The lincRNA and microRNA sequencing data from nine developmental stages, as well as the H3K4me3 and H3K9me3 ChIP-seq data generated in this study, have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO accession number GSE115324. The accession number for the ChIP-seq data of UNC-30 and UNC-55 is GSE102213. ChIP-seq data of transcription factors at different stages of development are publicly available at http://www.modencode.org/ [30, 31].
GS designed and initiated this project, provided the major funding, and supervised the experiments. SW, HC, EED, TF, BY, XW, JL, LL, SF, and WL performed experiments. SW, HC, EED, and GS analyzed the data. GS, SW, EED, and HC wrote the manuscript. All authors have discussed the results and made comments on the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
- 1. Ransohoff JD, Wei Y, Khavari PA. The functions and unique features of long intergenic non-coding RNA. Nat Rev Mol Cell Biol. 2018;19(3):143–57.
- 16. WormBase version 264 [https://wormbase.org/]. Accessed 10 Oct 2016.
- 20. Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini L. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339(6121):819–23.
- 23. Yu B, Wang X, Wei S, Fu T, Dzakah EE, Waqas A, Walthall WW, Shan G. Convergent transcriptional programs regulate cAMP levels in C. elegans GABAergic motor neurons. Dev Cell. 2017;43(2):212–26 e217. https://doi.org/10.1016/j.devcel.2017.09.013.
- 30. The National Human Genome Research Institute model organism ENCyclopedia Of DNA Elements [http://www.modencode.org/]. Accessed 10 May 2018.
- 31. Niu W, Lu ZJ, Zhong M, Sarov M, Murray JI, Brdlik CM, Janette J, Chen C, Alves P, Preston E, Slightham C, Jiang L, Hyman AA, Kim SK, Waterston RH, Gerstein M, Snyder M, Reinke V. Diverse transcription factor binding features revealed by genome-wide ChIP-seq in C. elegans. Genome Res. 2011;21(2):245–54. https://doi.org/10.1101/gr.114587.110.
- 42. Liu SJ, Horlbeck MA, Cho SW, Birk HS, Malatesta M, He D, Attenello FJ, Villalta JE, Cho MY, Chen Y, et al. CRISPRi-based genome-scale identification of functional long noncoding RNA loci in human cells. Science. 2017;355(6320):aah7111.
- 44. Goyal A, Fiškin E, Gutschner T, Polycarpou-Schwarz M, Groß M, Neugebauer J, Gandhi M, Caudron-Herger M, Benes V, Diederichs S. A cautionary tale of sense-antisense gene pairs: independent regulation despite inverse correlation of expression. Nucleic Acids Res. 2017;45(21):12496–508.
- 52. Bassett AR, Akhtar A, Barlow DP, Bird AP, Brockdorff N, Duboule D, Ephrussi A, Ferguson-Smith AC, Gingeras TR, Haerty W, Higgs DR, Miska EA, Ponting CP. Considerations when investigating lncRNA function in vivo. Elife. 2014;3:e03058.
- 61. UCSC [http://hgdownload.soe.ucsc.edu/goldenPath/ce11/]. Accessed 28 Mar 2018.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.