Temporal transcriptome profiling of developing seeds reveals a concerted gene regulation in relation to oil accumulation in Pongamia (Millettia pinnata)
Pongamia (Millettia pinnata syn. Pongamia pinnata), an oilseed legume species, is emerging as a potential feedstock for sustainable biodiesel production. Breeding Pongamia for traits favorable to commercial application will rely on a comprehensive understanding of the molecular mechanisms regulating oil accumulation during seed development. To date, only limited genomic or transcript sequences are available for Pongamia, and a temporal transcriptome profile of developing seeds is still lacking for this species.
In this work, we conducted a time-series analysis of morphological and physiological characters, oil contents and compositions, as well as global gene expression profiles in developing Pongamia seeds. First, three major developmental phases were characterized based on combined evidence from embryonic shape, seed weight, seed moisture content, and seed color. Then, gene expression levels at these three phases were quantified by RNA-Seq analyses with three biological replicates from each phase. Nearly 94% of unigenes were expressed at all three phases, whereas less than 2% of unigenes were exclusively expressed at one of these phases. A total of 8881 differentially expressed genes (DEGs) were identified between phases. Furthermore, qRT-PCR analyses of 10 DEGs involved in lipid metabolism demonstrated the reliability of our RNA-Seq data for temporal gene expression profiling. We observed a dramatic increase in seed oil content from the embryogenesis phase to the early seed-filling phase, followed by a steady and moderate increase towards the maximum at the desiccation phase. We propose that highly active expression of most genes related to fatty acid (FA) and triacylglycerol (TAG) biosynthesis at the embryogenesis phase might trigger both the substantial oil accumulation and the membrane lipid synthesis for rapid cell proliferation at this phase, while a concerted reactivation of TAG synthesis-related genes at the desiccation phase might further promote storage lipid synthesis to achieve the maximum content of seed oils.
This study not only built a bridge between gene expression profiles and oil accumulation in developing seeds, but also laid a foundation for future attempts at genetic engineering of Pongamia varieties to achieve higher oil yield or improved oil properties for biofuel applications.
Keywords: Millettia pinnata; Oil accumulation; Temporal transcriptome profiling; Seed development; Concerted regulation; Biofuel
ABI4: ABSCISIC ACID INSENSITIVE4
ACP: Acyl carrier protein
DEG: Differentially expressed gene
FAD6: Omega-6 FA desaturase
FAD3, FAD7: Omega-3 FA desaturase
FATA: Fatty acyl-ACP thioesterase A
FATB: Fatty acyl-ACP thioesterase B
FDR: False discovery rate
KEGG: Kyoto Encyclopedia of Genes and Genomes
LACS: Long-chain acyl-CoA synthetase
PAP: Phosphatidic acid phosphohydrolase
PDAT: Phospholipid: diacylglycerol acyltransferase
PDCT: Phosphatidylcholine: diacylglycerol cholinephosphotransferase
RPKM: Reads per kilobase per million mapped reads
WAF: Weeks after flowering
A growing global population and depleting fossil fuels have spurred a rising demand for alternative and renewable energy sources over the past few decades. Biodiesel, usually derived from plant oils, is one of the most promising substitutes for conventional diesel fuel, with multiple advantages such as lower greenhouse gas emissions, faster biodegradation, greater lubricity, and a higher flashpoint for safer storage and transport. A major challenge for the production and commercialization of biodiesel is the limited feedstock supply intertwined with its high price. Although a number of oil-bearing plants can theoretically serve as sources of raw materials for biodiesel, most of them are not suitable for industrialized production owing to their adverse impacts on food supply or land use. For example, an increased utilization of soybean as biodiesel feedstock might reduce its supply of protein and oil for humans and animals, while an expanded plantation of oil palm for biofuel application might cause rainforest fragmentation and biodiversity loss. Therefore, it is imperative to seek out more oil-yielding plants, which do not compete with food crops or forest trees, to extend the repertoire of biodiesel feedstocks.
Pongamia (Millettia pinnata syn. Pongamia pinnata) is one such oleiferous tree species that has received increasing attention in recent years [5, 6]. It belongs to the legume family (Fabaceae) and is widely distributed from India and Southeast Asia to Polynesia and North Australia. Pongamia trees have a high yield of non-edible seed oils that can be easily extracted and converted into biodiesel [7, 8]. The annual oil yield of this species can reach about 6000 L/ha, which is much higher than the amounts reported for several other feedstock species. Moreover, Pongamia seed oils are rich in oleic acid [10, 11], which may endow the biodiesel products with more desirable fuel properties. Most importantly, Pongamia trees can tolerate a wide range of abiotic stresses and also improve the soil nutrient status, which means they can be planted on marginal or degraded lands without affecting food production or forest protection. In fact, this species has already been introduced to subtropical and arid regions of Africa, India, Malaysia, Australia, and the USA for commercial cultivation. Besides, legume trees are capable of biological nitrogen fixation and thus reduce the consumption of nitrogen fertilizers, which also makes this species more cost-effective and eco-friendly in biodiesel application.
Developing Pongamia varieties with applicable traits via either marker-assisted selection or genetic manipulation will benefit substantially from a better understanding of the genetic background of this species. As an outbreeding diploid (2n = 22) legume, Pongamia has a haploid genome size of nearly 1200 Mb. While its reference genome is not yet available, dozens of genes or genomic regions have been isolated and sequenced in Pongamia for phylogenetic and population genetic analyses [16, 17, 18, 19]. In contrast, only a handful of Pongamia genes have been characterized in functional studies. A recent study identified four circadian clock genes (ELF4, LCL1, PRR7, and TOC1) of Pongamia and found their expression to be diurnally regulated under long-day conditions. Two other studies successively isolated full-length cDNA clones for two Pongamia desaturase genes (PpSAD and PpFAD2), which displayed distinct expression patterns during different stages of seed development [21, 22].
Like other legume species, Pongamia mainly synthesizes and stores its oils in seeds. During seed development, the formation of oils as well as other major storage compounds like starch and proteins is promoted by various physiological events that are in turn governed by a mosaic of gene expression programs. It is thus of great importance to achieve a global measurement of transcript abundance for clarifying the molecular basis underlying oil accumulation in developing seeds. So far, global transcriptional profiling of developing seeds has been reported for several legume species, such as soybean [24, 25, 26], Medicago, Lotus, and chickpea, using either microarray or RNA sequencing (RNA-Seq) platforms. However, these works have not placed special emphasis on genes involved in lipid metabolism. As for Pongamia, we initiated the first transcriptome analysis with root and leaf tissues using RNA-Seq and uncovered a large set of candidate salt-responsive genes. Recently, Wegrzyn et al. constructed a leaf transcriptome with RNAs from 72 seedlings, while Sreeharsha et al. generated a comprehensive transcriptome with pooled RNAs from leaf, flower, pod, and seed tissues. Parallel to these two works, we built a seed transcriptome for gene discovery and molecular marker development. Nevertheless, a systematic examination of transcriptional profiles for further exploration of specific regulatory mechanisms is still lacking in this species.
In the current study, we first characterized the developmental process of Pongamia seeds according to their morphological and physiological changes. Meanwhile, we monitored the variations in oil content and fatty acid (FA) composition along this process. Then, we performed high-throughput sequencing of representative RNA samples from three major developmental phases of legume seeds and generated a dataset providing a panoramic view of gene expression during seed development. Furthermore, we sorted out the differentially expressed genes (DEGs) between developmental phases and focused on the expression patterns of genes related to FA and triacylglycerol (TAG) metabolism. Our findings will contribute to elucidating possible correlations between the transcriptional reprogramming of certain lipid-metabolism-related genes and the dynamic pattern of oil accumulation in developing Pongamia seeds.
Morphological and physiological changes of developing Pongamia seeds
Oil content and fatty acid composition of developing Pongamia seeds
Table 1 Fatty acid composition of Pongamia seeds at different time points of development. Values are relative proportions (%) followed by their ± deviations; each row is one sampling time point, in the order given, and each column is one fatty acid species (row and column labels are missing from this copy).
14.10 ± 0.17    7.86 ± 0.10    27.30 ± 0.26    49.40 ± 0.09    0.32 ± 0.03    0.64 ± 0.08    0.28 ± 0.05    0.10 ± 0.02
12.98 ± 0.26    7.62 ± 0.47    31.75 ± 0.73    46.60 ± 0.27    0.25 ± 0.01    0.51 ± 0.08    0.21 ± 0.03    0.09 ± 0.01
12.47 ± 0.15    7.34 ± 0.15    38.51 ± 0.59    40.66 ± 0.26    0.21 ± 0.03    0.50 ± 0.07    0.18 ± 0.03    0.13 ± 0.02
12.40 ± 0.13    7.19 ± 0.12    39.90 ± 0.13    39.36 ± 0.21    0.22 ± 0.02    0.56 ± 0.05    0.22 ± 0.02    0.16 ± 0.02
12.19 ± 0.20    6.97 ± 0.18    40.74 ± 0.40    39.04 ± 0.37    0.20 ± 0.01    0.52 ± 0.06    0.20 ± 0.03    0.14 ± 0.02
11.63 ± 0.13    6.33 ± 0.09    43.26 ± 0.09    37.94 ± 0.04    0.20 ± 0.02    0.42 ± 0.04    0.15 ± 0.02    0.08 ± 0.02
Assessment of gene expression levels in three developmental phases of Pongamia seeds
Statistics of sequencing reads of Pongamia seeds. Columns: Bio Rep 1, Bio Rep 2, and Bio Rep 3 for each of MpSI (10 WAF), MpSII (20 WAF), and MpSIII (30 WAF). Rows: number of raw reads, number of clean reads, number of unique mapped reads, number of multiple mapped reads, and number of mapped reads. (Cell values are missing from this copy.)
To evaluate the reproducibility of our RNA-Seq data among the three biological replicates at each phase, we performed a Pearson’s correlation analysis based on the RPKM values of all nine samples. The correlation dendrogram indicated high correlations of gene expression levels among replicates, with average coefficients of 0.9664, 0.9925, and 0.9764 for samples at the MpSI, MpSII, and MpSIII phases, respectively (Additional file 1: Figure S2). Principal component analysis revealed that the nine samples could be clearly assigned to three groups corresponding to the three developmental phases (Additional file 1: Figure S3), which also demonstrated the good reproducibility of the gene expression data generated in this study.
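The replicate check described above can be illustrated with a minimal Python sketch that converts raw read counts to RPKM and averages the pairwise Pearson coefficients among the replicates of one phase. The counts, gene lengths, and function names are hypothetical, and the text does not state whether raw or log-transformed RPKM values were correlated, so raw values are assumed here.

```python
import math
from itertools import combinations


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_pairwise_pearson(replicates):
    """Average Pearson coefficient over all pairs of replicate vectors."""
    pairs = list(combinations(replicates, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four unigenes, three replicates of one phase.
gene_lengths_bp = [1000, 2000, 500, 1500]
raw_counts = [
    [100, 400, 10, 30],  # replicate 1
    [110, 380, 12, 28],  # replicate 2
    [95, 420, 9, 33],    # replicate 3
]
# Total mapped reads per replicate are approximated by the column sums.
rpkm_vectors = [
    [rpkm(c, length, sum(rep)) for c, length in zip(rep, gene_lengths_bp)]
    for rep in raw_counts
]
print(round(mean_pairwise_pearson(rpkm_vectors), 4))
```

With three replicates there are three pairs per phase, so each reported average would summarize three coefficients under this reading.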
Identification and functional categorization of differentially expressed genes between seed developmental phases
We further used GO and KEGG assignments to classify the functions of DEGs identified in the two successive comparisons. First, 2522 DEGs were assigned 1975 GO terms in three major GO categories (Additional file 5: Table S4). For comparison, 14,027 of the 53,586 reference genes were assigned GO terms and served as the background for enrichment analysis. In the category of biological process, DEGs in both comparisons were associated with several lipid metabolic processes, such as ‘fatty acid metabolic process’, ‘glycerolipid metabolic process’, ‘glycerophospholipid metabolic process’, ‘glycolipid metabolic process’, and ‘glycosphingolipid metabolic process’. Among them, only ‘fatty acid metabolic process’ was significantly (P ≤ 0.05) enriched among DEGs in the MpSI-vs-MpSII comparison (Additional file 6: Table S5). In the category of molecular function, although a number of lipid-metabolism-related activities, such as ‘lipase activity’, ‘fatty acid synthase activity’, ‘O-acyltransferase activity’, ‘CoA-ligase activity’, ‘lipid binding’, and ‘lipid transporter activity’, were represented by DEGs in both comparisons, only ‘CoA-ligase activity’ was among the seven terms enriched among DEGs in the MpSII-vs-MpSIII comparison (Additional file 6: Table S5). In the category of cellular component, the GO terms related to ‘photosystem’, ‘thylakoid’, and ‘organelle subcompartment’ were significantly enriched among DEGs in both comparisons (Additional file 6: Table S5). Second, 1506 and 201 DEGs were mapped to 125 and 89 KEGG pathways in the MpSI-vs-MpSII and the MpSII-vs-MpSIII comparisons, respectively (Additional file 7: Table S6). The lipid-metabolism-related pathways, such as ‘fatty acid metabolism’, ‘glycerolipid metabolism’, ‘glycerophospholipid metabolism’, ‘sphingolipid metabolism’, and ‘ether lipid metabolism’, appeared in both comparisons. Likewise, 8498 of the 53,586 reference genes were assigned KEGG pathway annotations and served as the background for enrichment analysis. As a result, 14 and 6 pathways were significantly (P ≤ 0.05) enriched among DEGs in the former and the latter comparison, respectively (Additional file 7: Table S6). Notably, the pathway of ‘fatty acid biosynthesis’ was enriched among DEGs only in the MpSI-vs-MpSII comparison.
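The enrichment logic described above, in which annotated DEGs are compared against an annotated background, can be sketched as a one-sided hypergeometric test. This is an assumption for illustration: the text does not name the statistical test that was used. The background size (14,027) and the number of GO-annotated DEGs (2522) come from the text and only set the scale; the per-term counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation of interest."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)


N_BACKGROUND = 14027       # GO-annotated reference genes (from the text)
N_DEG = 2522               # GO-annotated DEGs (from the text)
TERM_IN_BACKGROUND = 120   # hypothetical: background genes carrying the term
TERM_IN_DEG = 40           # hypothetical: DEGs carrying the term

p_value = hypergeom_upper_tail(TERM_IN_DEG, TERM_IN_BACKGROUND, N_DEG, N_BACKGROUND)
print(f"P(X >= {TERM_IN_DEG}) = {p_value:.3e}")
print("enriched at P <= 0.05:", p_value <= 0.05)
```

Exact integer binomials are used so that the sketch needs only the standard library; a multiple-testing correction across terms would normally follow and is omitted here.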
Characterization of transcriptional profiles for unigenes involved in oil accumulation
The free FAs synthesized in plastids are acylated by LACSs to form a pool of fatty acyl-CoAs at the plastid envelope and are then exported to the cytosol. In this work, we observed the expression of 14 unigenes for six members of the LACS enzyme family, including LACS1, LACS2, LACS4, LACS6, LACS8, and LACS9 (Additional file 8: Table S8). Most of them displayed stable expression throughout the three developmental phases. Only two unigenes, one for peroxisomal LACS6 (20808) and one for chloroplastic LACS9 (47997), were identified as DEGs, with opposite trends of transcriptional change (Fig. 6). The pool of acyl-CoAs can be transported from the cytosol to the endoplasmic reticulum (ER) and then utilized for either TAG or polyunsaturated FA synthesis.
De novo TAG assembly in the ER is initiated by the GPAT enzyme, which esterifies an acyl group to the sn-1 position of glycerol-3-phosphate (G-3-P). Here, seven homologs belonging to the GPAT multigene family were found to be expressed in Pongamia seeds (Additional file 8: Table S8). Among them, two GPAT1 and three GPAT3 transcripts were expressed in low abundance without significant alteration, while the other two GPAT transcripts (17378, 25602) were significantly down-regulated from the MpSI phase to the MpSII phase, and then greatly up-regulated at the MpSIII phase (Fig. 6). Then, the lysophosphatidic acid (LPA) resulting from the above step is subjected to a second esterification reaction catalyzed by the LPAT enzyme to form phosphatidic acid (PA). There were eight LPAT homologs expressed in Pongamia seeds. However, only one LPAT2 transcript (52868) showed significantly different expression during seed development. Before a third esterification reaction, the phosphate group of PA is removed by phosphatidic acid phosphohydrolase (PAP), resulting in the formation of diacylglycerol (DAG). Although four unigenes for two PAP genes (PAH1 and PAH2) were found to be expressed, none of them was identified as a DEG. DAG can accept an acyl group either from acyl-CoAs through the activity of the DGAT enzyme or from phosphatidylcholine (PC) through the activity of the PDAT enzyme. There were three DGAT1 transcripts (8402, 21767, 24414) with significantly different expression, as well as seven DGAT2 transcripts and one DGAT3 transcript showing no significant changes in expression level. The unigenes for DGAT1 were more abundantly expressed than those for DGAT2. As for the PDAT enzyme, one (36776) out of six unigenes was identified as a DEG, with its expression dramatically reduced at the MpSII phase and then elevated at the MpSIII phase (Fig. 6). In addition to providing the acyl group to DAG for TAG formation, PC can also exchange phosphocholine with DAG through the activity of phosphatidylcholine: diacylglycerol cholinephosphotransferase (PDCT). Only one transcript (36710) was detected for the PDCT gene, and its expression also changed significantly during seed development (Fig. 6). Lastly, the newly synthesized TAGs are surrounded by a layer of phospholipids and amphipathic proteins to form oil bodies in seeds. As mentioned above, most transcripts for oleosins were expressed in stable and high abundance at all three developmental phases (Additional file 8: Table S8). Among them, only two OLE unigenes (20847, 22766) were identified as DEGs, with opposite trends of transcriptional change (Fig. 6). Transcripts for caleosins or steroleosins were not identified in Pongamia seeds.
The biosynthesis of polyunsaturated FAs is mainly based on further desaturation of C18:1 by separate pathways in plastids and the ER. In the ER, the C18:1 acyl group might be incorporated into PC by acyl-CoA: lysophosphatidylcholine acyltransferase (LPCAT), and then sequentially desaturated by microsomal FAD2 and omega-3 FA desaturase (FAD3) to form C18:2 and C18:3. Alternatively, C18:1 can also be converted to C18:2 and C18:3 by chloroplast omega-6 FA desaturase (FAD6) and omega-3 FA desaturase (FAD7). Our study did not find any transcript for LPCAT in Pongamia seeds. Nevertheless, two transcripts for each of the two FAD2 isoforms were observed (Additional file 8: Table S8). Among them, one FAD2–1 transcript (48824) showed a bell-shaped pattern with peak expression at the MpSII phase, whereas one FAD2–2 transcript (48822) was down-regulated from the MpSI phase to the MpSIII phase (Fig. 6). Besides, one FAD3 transcript (36822) underwent a significant down-regulation from the MpSI phase to the MpSII phase and thereafter remained at a constant expression level during the two later phases. By comparison, one transcript for the FAD6 gene and four transcripts for FAD7 genes were found to be expressed, yet none of them was identified as a DEG (Additional file 8: Table S8).
The accumulation of seed oils is determined not only by TAG production but also by TAG degradation. The TAG lipases, coupled with the enzymes participating in FA beta-oxidation, including acyl-CoA dehydrogenase (ACD), enoyl-CoA hydratase (ECH), HDH, and 3-ketoacyl-CoA thiolase (KAT), are responsible for oil breakdown in seeds. In this study, we found seven SDP1 transcripts, all of which exhibited decreasing expression from the MpSI phase to the MpSIII phase (Fig. 6). Similarly, nearly all the transcripts for ACD, ECH, and KAT exhibited suppressed expression levels (Additional file 8: Table S8). Only one peroxisomal HDH transcript (25781) displayed a significant up-regulation from the MpSII phase to the MpSIII phase. This transcript encoded a dehydrogenase for peroxisomal beta-oxidation, which was suggested to be essential for seedling establishment in Arabidopsis. Collectively, the suppression of the above TAG-disassembling genes might be conducive to oil accumulation in Pongamia seeds.
Finally, we examined the transcriptional profiles of certain transcription factors with potential roles in oil accumulation. WRINKLED1 (WRI1) is a master regulator of plant oil synthesis belonging to the APETALA2/ETHYLENE RESPONSE FACTOR (AP2/ERF) family. We found only one WRI1 transcript (47905) in the Pongamia seed transcriptome, with a down-regulated expression pattern (Additional file 8: Table S8). FUSCA3 (FUS3) and ABSCISIC ACID INSENSITIVE4 (ABI4) are two other lipid-metabolism-related transcription factors whose expression was supported by our RNA-Seq data. A FUS3 transcript (4198) showed a down-regulated expression pattern similar to that of the WRI1 transcript, while an ABI4 transcript (10739) was highly expressed at all three phases without significant alteration (Additional file 8: Table S8). Apart from these three unigenes, we did not identify transcripts for other transcription factors relevant to lipid metabolism, such as LEAFY COTYLEDON1 (LEC1), LEC2, ABI3, and MYB89. More efforts are needed to enlarge the pool of transcription factors for this species in future studies.
Potential yields and properties of biodiesel produced from Pongamia are largely affected by its seed oil content and FA composition, which vary considerably not only among trees from different locations but also among different phases of seed development. In this study, the regulation of lipid metabolism in seeds was investigated in Pongamia trees from China by lipid profiling and gene expression analysis in a developmental phase-specific manner. Compared with the Pongamia trees from India, whose flowers appear in April to June and whose seeds ripen during February to May of the following year, the trees from China take a shorter period of time for seed maturation as observed in our field surveys, with their flowers emerging in April to May and seeds ripening during October to December. The main reason for the slower seed development in the Indian trees may lie in the fact that they usually experience several months of minimal growth in embryo size, accompanied by an extension of the pod to its maximum size, before entering continuous embryo enlargement, whereas the trees from China spend less than 1 month on pod extension prior to embryo enlargement.
Studies on changes in oil content and FA profile during seed development have already been carried out in some Indian Pongamia accessions. Pavithra et al. reported a gradual increase in seed oil content from 32.06 to 36.53% during 30 to 42 WAF, which represented a time span from the green pod stage to the brown pod stage. They also observed that the fresh weight of seeds increased from 30 to 39 WAF and subsequently decreased at 42 WAF, while the moisture content dropped from above 50% to below 15% during 30 to 42 WAF. Therefore, the time span that they used for lipid profiling might roughly correspond to the seed-filling and the desiccation phases. Based on the same sampling time scale from 30 to 42 WAF, Sreeharsha et al. recorded a more marked rise in oil content from 13 to 36%. Both studies indicated that seed development in Indian Pongamia accessions proceeded at a slow pace with negligible oil content before 25 WAF [11, 31]. Sharma et al. sampled the seeds over a wider time span from 7 to 37 WAF and detected a similar range of oil content from 15.96 to 36.93%. In this study, we noticed a sharp increment (10.67–21.49%) from 10 WAF at the embryogenesis phase to 14 WAF at the early seed-filling phase, followed by a steady increment (21.49–29.59%) through the seed-filling and the desiccation phases, with the maximum appearing at 26 WAF (Fig. 2). The maximum oil content of the Pongamia seeds detected in our study was close to the mean value (31.70%) of oil contents obtained from 157 Indian accessions. Intriguingly, unlike the observation that oil biosynthesis usually occurs at the mid-late stage of seed development in oilseed plants [36, 37, 38], our study and the study by Sharma et al. provided two cases of considerable oil accumulation at the early developmental stage of Pongamia seeds.
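To make the contrast between the sharp and the steady increments concrete, the oil contents quoted above can be converted into average rates of change. The short Python sketch below does this arithmetic, assuming that the 29.59% maximum is the value measured at 26 WAF; the variable and function names are illustrative only.

```python
# Oil content (%) at selected sampling points, as quoted in the text.
oil_content = {10: 10.67, 14: 21.49, 26: 29.59}  # WAF -> oil content (%)


def rate_per_week(start_waf, end_waf, content=oil_content):
    """Average change in oil content, in percentage points per week."""
    return (content[end_waf] - content[start_waf]) / (end_waf - start_waf)


early_rate = rate_per_week(10, 14)  # embryogenesis to early seed filling
late_rate = rate_per_week(14, 26)   # seed filling to early desiccation
print(f"10-14 WAF: {early_rate:.3f} percentage points per week")
print(f"14-26 WAF: {late_rate:.3f} percentage points per week")
print(f"ratio: {early_rate / late_rate:.1f}")
```

Under this assumption the early rate is about 2.7 percentage points per week against about 0.68 afterwards, roughly a fourfold difference.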
With regard to FA composition, our study supported the predominance of palmitic, stearic, oleic, and linoleic acids in Pongamia seed oils as shown in previous studies [10, 11, 34, 35]. These four types of FAs are essential constituents of either the cell membrane or certain cell components, and they are required throughout the seed developmental process. Hence, it was unsurprising that their relative proportions in seed oils were much higher than those of other types of FAs at all sampling time points. Besides, linolenic acid, eicosanoic acid, and behenic acid were detected in all samples, each accounting for less than 1% of seed oils. Formerly, a substantial amount of erucic acid was recorded in Pongamia seed oils by Bala et al., but it was not detected in our study or in several other studies [11, 34, 35]. Except for oleic acid, which steadily increased from the embryogenesis phase and became the most abundant FA at the late seed-filling phase, all types of detectable FAs in Pongamia displayed a diminishing proportion as seeds matured (Table 1). This trend in relative proportions for most types of FAs was largely consistent with those reported earlier [11, 34]. On the other hand, the range of variation for each type of FA differed greatly among studies, which might result from the genetic divergence among the sampled trees as well as the environmental effects of the sampling locations.
To unravel possible correlations between the alteration in oil content or FA composition and the differential regulation of gene expression during seed development, we carried out temporal transcriptome analysis. Through Illumina sequencing, we generated more than 108 million short reads, which were efficiently mapped to the reference seed transcriptome of Pongamia set up in our previous study. The results from both Pearson’s correlation analysis and principal component analysis supported high consistency between biological replicates. Furthermore, the qRT-PCR analysis also validated the reliability of our RNA-Seq data in temporal gene expression profiling. Previously, RNA-Seq technology has been successfully applied in a number of oilseed plants, such as rapeseed [40, 41], castor bean, jatropha, and camelina, to characterize the set of genes and their regulatory networks controlling oil accumulation in developing seeds. The results of these studies have revealed both conserved and species-specific temporal expression patterns responsible for lipid metabolism regulation. In our study, the RNA-Seq results indicated that a high proportion (93.88%) of unigenes were expressed at all three developmental phases in Pongamia seeds, while less than 2% were exclusively expressed at one of the three phases. This observation coincided with the suggestions that most genes involved in various seed functions were shared by all developmental stages, while each stage might have a very small set of stage-specific genes. In addition, the number of expressed genes at each phase of Pongamia seeds slightly decreased as seed development progressed, which was also noticed in developing seeds of soybean and chickpea [25, 29].
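The distinction drawn above between unigenes shared by all phases and those exclusive to one phase amounts to simple set operations on per-phase expression calls. The Python sketch below illustrates this under the assumption that a unigene counts as expressed when its value exceeds a threshold; the threshold, the toy values, and the unigene labels are hypothetical and are not taken from the study.

```python
PHASES = ("MpSI", "MpSII", "MpSIII")


def shared_and_specific(expression, threshold=0.0):
    """Split unigenes into those expressed at all phases and phase-exclusive ones.

    expression maps a unigene label to a tuple of expression values,
    one per phase in the order given by PHASES.
    """
    expressed_in = {phase: set() for phase in PHASES}
    for gene, values in expression.items():
        for phase, value in zip(PHASES, values):
            if value > threshold:
                expressed_in[phase].add(gene)
    shared = set.intersection(*expressed_in.values())
    specific = {}
    for phase in PHASES:
        others = set().union(*(expressed_in[p] for p in PHASES if p != phase))
        specific[phase] = expressed_in[phase] - others
    return shared, specific


# Hypothetical toy values (arbitrary expression units).
toy = {
    "u1": (12.0, 8.5, 3.1),
    "u2": (40.2, 35.0, 29.9),
    "u3": (0.0, 0.0, 2.4),
    "u4": (5.5, 0.0, 0.0),
    "u5": (1.2, 0.9, 0.0),
}
shared, specific = shared_and_specific(toy)
print("shared by all phases:", sorted(shared))
print("phase-exclusive:", {p: sorted(g) for p, g in specific.items()})
```

Dividing the size of the shared set by the number of expressed unigenes would give a proportion analogous to the 93.88% reported above.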
Using stringent criteria, we identified 8881 DEGs in at least one pairwise comparison between phases. In general, there were substantially more down-regulated genes (5672) than up-regulated ones (716) from the embryogenesis phase to the seed-filling phase. In soybean, the down-regulated genes also outnumbered the up-regulated ones at the seed-filling stage relative to the seed set stage, and the down-regulated genes found in the maturing seeds were mostly related to cell growth, cellular maintenance, and photosynthesis. Similarly, most genes encoding metabolic enzymes were down-regulated in the seeds approaching the mature stage as compared to the early developmental stage in chickpea. Such preferential expression of the majority of genes at the embryogenesis phase was reasonable, since this is a phase with high metabolic activity for nascent protein and lipid generation in favor of cell proliferation in seeds. In other words, it reflected a requirement for high expression of the genes for synthesizing structural materials to support rapid cell division at this phase. Besides, decreasing levels of metabolic enzymes during seed filling were also observed in Medicago and were suggested to be indicative of a metabolic shift from a highly active to a quiescent state as the embryo assimilated nutrients. Later, during the transition from the seed-filling phase to the desiccation phase, far fewer DEGs were identified. Although former transcriptomic studies paid less attention to the desiccation phase than to the earlier phases, the existing evidence from Arabidopsis and soybean still implied that seed desiccation was an active rather than quiescent stage in terms of gene expression, and that the transition from late reserve accumulation to desiccation was associated with a major transcriptional switch [26, 48]. Hence, the finding of many more up-regulated genes (579) than down-regulated ones (306) during this transition in developing Pongamia seeds was not unexpected.
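The up- and down-regulated counts discussed above come from labelling each unigene in a pairwise comparison. A minimal Python sketch of such a labelling step is given below. The thresholds (an absolute log2 fold change of at least 1 and an FDR of at most 0.001), the zero-handling floor, and the toy values are assumptions for illustration; the stringent criteria actually applied are not restated in this section.

```python
import math
from collections import Counter


def call_deg(rpkm_early, rpkm_late, fdr,
             min_abs_log2fc=1.0, max_fdr=0.001, floor=0.01):
    """Label a unigene as 'up', 'down', or 'not DE' for late versus early phase.

    All thresholds are hypothetical defaults; floor avoids division by zero.
    """
    log2fc = math.log2(max(rpkm_late, floor) / max(rpkm_early, floor))
    if fdr > max_fdr or abs(log2fc) < min_abs_log2fc:
        return "not DE"
    return "up" if log2fc > 0 else "down"


# Hypothetical toy comparison: (early RPKM, late RPKM, FDR) per unigene.
toy_comparison = {
    "u1": (80.0, 10.0, 1e-6),
    "u2": (45.0, 12.0, 1e-4),
    "u3": (3.0, 30.0, 1e-5),
    "u4": (20.0, 25.0, 1e-5),
    "u5": (5.0, 40.0, 0.2),
}
tally = Counter(call_deg(*values) for values in toy_comparison.values())
print(dict(tally))
```

Running the same tally on each successive comparison would yield paired counts of the kind quoted above (for example, 5672 down versus 716 up).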
With respect to plastidial FA synthesis from acetyl-CoA, the unigenes for all core enzymes including ACC, MAT, KASIII, KAR, HAD, EAR, and KASI showed a significant down-regulation from the embryogenesis phase to the seed-filling phase (Fig. 6). The declining trend continued in most of these unigenes from the seed-filling phase to the desiccation phase, but their changes in expression levels were not statistically significant during this developmental transition. Such a coordinated and declining expression pattern for FA synthesis-related genes was also observed in developing seeds of diverse species like Arabidopsis, rapeseed, and castor bean [48, 49]. As for FA elongation and desaturation, the unigenes for KASII and SAD were also most actively expressed at the embryogenesis phase. In accordance with previous findings in most oilseed species, the expression levels of SAD genes were much higher than those of any other FA synthesis-related genes, which could possibly be explained by the low catalytic efficiency of SAD. Three unigenes for two FAD2 isoforms and FAD3 were identified as DEGs with different temporal expression patterns. A previous study reported differential expression patterns for two FAD2 transcripts in Pongamia and suggested the existence of more than two copies for each of PpFAD2–1 and PpFAD2–2. Judging from sequence similarity and expression pattern, it seemed that the FAD2–1 and FAD2–2 transcripts in our study might represent new copies of each isoform dissimilar to those in that study. Despite a reduction in the share of saturated FAs along with an increase in the share of unsaturated FAs as the seeds developed, we found no significant changes in expression levels of the unigenes for FATA and FATB, which preferentially hydrolyze unsaturated and saturated FAs, respectively. Hence, we speculated that other transcripts of acyl-ACP thioesterases or post-transcriptional regulation might jointly account for the opposite shifts in the shares of the two classes of FAs in developing Pongamia seeds.
The free FAs released by thioesterases are first esterified to CoA by LACS before being assembled into TAGs. Our results confirmed the expression of six LACS isoforms. One unigene (20808) encoding peroxisomal LACS6 was significantly up-regulated from the embryogenesis phase to the desiccation phase, implying a critical role in preparing more acyl-CoAs for TAG assembly in Pongamia seeds. For the three acyltransferases catalyzing the stepwise acylation in TAG biosynthesis, we verified the expression not only of several members of the GPAT and the LPAT families, but also of two unrelated types of DGAT enzymes. The unigenes for DGAT1 were much more abundantly expressed than those for DGAT2 in Pongamia, as is also the case in rapeseed and soybean [49, 52]. Moreover, we detected the expression of several unigenes encoding another acyltransferase, PDAT. Interestingly, most of the DEGs identified among these acyltransferases for TAG synthesis showed a V-shaped expression pattern (Fig. 6), which meant that they were actively expressed at both the embryogenesis and the desiccation phases, but not at the seed-filling phase. Such a V-shaped expression pattern of TAG synthesis-related genes in developing seeds has not been previously reported in Arabidopsis or oilseed plants, where a continuous down-regulation or a bell-shaped expression has been the dominant pattern [48, 49]. In Pongamia, a recent work revealed that most genes involved in TAG synthesis were up-regulated to various extents during the mature green pod stages in an Indian accession, which roughly corresponded to a time span from the seed-filling phase to the early desiccation phase. In a sense, our results regarding the reactivation of most TAG synthesis-related genes at the desiccation phase were in agreement with the results of that work. Moreover, our results for gene expression profiles were based on a wider time span covering the embryogenesis phase and showed a concerted activation of TAG synthesis-related genes at this phase, which was not surveyed by that work.
Considering that the sampling time point representing the embryogenesis phase for RNA-Seq experiments was 10 WAF, the highly active expression of both FA and TAG synthesis-related genes at this phase might be largely responsible for the sharp increase in oil content from 10 WAF to 14 WAF (Fig. 2). In addition, since the newly formed FAs could be utilized for synthesizing phospholipids as well, the activation of the above two sets of genes might also promote rapid synthesis of membrane lipids to support cell proliferation at this early phase. On the other hand, the concerted gene reactivation at the desiccation phase mainly appeared in TAG synthesis-related genes, but not in FA synthesis-related genes. It seemed most likely that the Pongamia seeds prioritized storage lipid biosynthesis at this late phase. Such a preference for synthesizing storage lipids over membrane lipids at later developmental stages was also observed in Jatropha seeds. Meanwhile, the decreasing expression of most TAG degradation-related genes observed in this study would also contribute to oil accumulation as Pongamia seeds matured.
In the present study, temporal analyses of morphological and physiological characters, oil contents and FA compositions, as well as gene expression profiles were conducted in developing Pongamia seeds to provide integrative information for understanding the molecular basis underlying oil accumulation. By monitoring embryonic shape, seed weight, seed moisture content, and seed color at reasonable intervals, we defined three major developmental phases of Pongamia seeds, with the embryogenesis phase spanning from 1 WAF to 11 WAF, the seed-filling phase from 11 WAF to 24 WAF, and the desiccation phase after 24 WAF. It should be noted that the time span of each developmental phase may vary among Pongamia trees with different origins. Nine samples from three representative time points were selected for comparative transcriptome analysis using the Illumina sequencing technology. We identified 8881 DEGs in pairwise comparisons between phases and highlighted those DEGs in relation to oil accumulation. Determination of oil content revealed a dramatic increase during the transition from the embryogenesis phase to the seed-filling phase, followed by a steady increase towards the maximum at the early desiccation phase. Such an early increase in seed oil content was associated with an active expression of most FA and TAG synthesis-related genes at the embryogenesis phase, which might also be responsible for synthesizing abundant membrane lipids to meet the needs of rapid cell proliferation at this phase. Later on, there was a concerted down-regulation of these two sets of genes until the desiccation phase, when the set of TAG synthesis-related genes was reactivated for storage lipid synthesis to achieve the maximum content of seed oils.
Beyond shedding light on potential relatedness between developmental phase-specific regulation of gene expression and oil accumulation, the mass data generated in this study would provide valuable information for pinpointing crucial genes in lipid metabolism, such as those unigenes with a V-shaped expression pattern encoding GPAT (17378, 25602), LPAT (52868), or DGAT (21767), and facilitating genetic manipulation in Pongamia or related species for improved biofuel production.
Three 10-year-old Pongamia trees located at the Garden Expo Park in Shenzhen, China, were used as biological replicates for seed sampling. The inflorescences on different sub-branches of each tree were tagged at their first flowering dates. For microscopic analysis, pods were harvested from 5 WAF to 9 WAF at three-day intervals. For quantitative analyses of seed weight, oil content, and FA composition, pods were harvested from 9 WAF to 30 WAF at regular intervals. For RNA-Seq and qRT-PCR analyses, pods were harvested at 10 WAF, 20 WAF, and 30 WAF, representing the three developmental phases of Pongamia seeds as defined by their morphological and physiological changes. At each time point, the seeds were manually separated from pods for subsequent experiments. Young leaves were also harvested from the same trees for the qRT-PCR assay. The newly collected seeds and leaves were washed with distilled water, immediately frozen in liquid nitrogen, and then stored at −80 °C before RNA extraction.
The Pongamia seeds were fixed in FAA solution (100 mL formaldehyde, 80 mL 75% ethanol, and 10 mL acetic acid) for 24 h at room temperature, washed three times with highly purified water (10 min each time), and then soaked in highly purified water for 2 h. Next, the seed samples were stained in Mayer’s Hematoxylin solution for 1 h, rinsed in distilled water for 2 min, and then dehydrated in a series of ethanol solutions with increasing concentrations (i.e., 50, 70, 85, 95, and 100%) for 10 min in each solution. After that, the dehydrated samples were hyalinized with xylene, embedded in paraffin wax, and cut into sections with a thickness of 6 μm. Finally, the sections were observed under an Olympus BX51 microscope (Olympus, Japan) and photographed with a DP72 digital camera (Olympus, Japan).
Quantitative analyses of seed weight, oil content and FA composition
To minimize random variation in seed traits, the following quantitative analyses were all based on 100 seeds collected from each of the three trees at each time point. The fresh weight of seeds was measured immediately after removing pods, then the seeds were dried in an oven at 60 °C until no further weight loss occurred for the determination of dry weight. The moisture content of seeds was calculated by subtracting the dry weight from the fresh weight. For oil content analysis, the dry seeds collected at each time point were separately ground to powder using a mortar and pestle, and then subjected to oil extraction in a Soxhlet apparatus with n-hexane as solvent. The oil content was calculated as percentage (w/w) of dry seed. The extracted oil samples were incubated with sodium methoxide for 20 min, followed by addition of iso-octane and sodium chloride, and incubated for another 20 min. The upper phase was passed through sodium sulfate to eliminate water and transferred to a gas chromatography vial. Subsequently, the FA profile of each oil sample was analyzed by gas chromatography-mass spectrometry (Agilent 7890A-5975C, Agilent Technologies, USA). The capillary column selected was HP-5MS (30.0 m × 250 μm × 0.25 μm). Helium was used as carrier gas. The oven temperature was ramped from 180 °C to 240 °C at 5 °C min−1, with an oven equilibration time of 1 min. The injector temperature was set at 230 °C, and the detector temperature was maintained at 280 °C. The assay was performed with three biological replicates. FAs were identified by use of the NIST05 Mass Spectral Library. The abundance of each FA was expressed as percentage of total FAs.
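The arithmetic behind these seed measurements can be illustrated with a minimal Python sketch. It is not the authors' analysis script. The function names are hypothetical, expressing moisture as a percentage of fresh weight is an assumption (the text only states that dry weight was subtracted from fresh weight), and using peak areas as the measure of FA abundance is likewise an assumption.

```python
def moisture_content(fresh_weight_g, dry_weight_g):
    """Return (water mass in g, water as % of fresh weight).

    The subtraction follows the text; the percentage basis is an assumption.
    """
    if fresh_weight_g <= 0 or dry_weight_g > fresh_weight_g:
        raise ValueError("expected 0 < dry weight <= fresh weight")
    water_g = fresh_weight_g - dry_weight_g
    return water_g, 100.0 * water_g / fresh_weight_g


def oil_content_percent(oil_g, dry_seed_g):
    """Oil content as percentage (w/w) of dry seed, as stated in the text."""
    if dry_seed_g <= 0:
        raise ValueError("dry seed mass must be positive")
    return 100.0 * oil_g / dry_seed_g


def fa_percentages(peak_areas):
    """Express each FA as a percentage of total FAs.

    peak_areas maps an FA name to its abundance measure (assumed here to be
    a chromatographic peak area).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total FA abundance must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}
```

With made-up numbers, `moisture_content(10.0, 4.0)` gives 6 g of water, or 60% of fresh weight, and `oil_content_percent(2.0, 5.0)` gives 40%.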
RNA extraction and library construction
Total RNA was isolated from Pongamia seeds using a modified CTAB method. For each of the three representative time points, seeds from each of the three trees were separately subjected to RNA extraction. The resulting nine RNA samples were further purified with the RNeasy Plant Mini Kit (Qiagen, Germany) according to the manufacturer’s protocol. Then, the concentration and quality of each RNA sample were determined with an Agilent 2100 Bioanalyzer (Agilent Technologies, USA). All the samples showed an OD260/OD280 ratio from 2.0 to 2.1, as well as a RIN (RNA Integrity Number) value above 7.0. For each sample, a total of 10 μg of purified total RNA was used for library construction. Firstly, poly-(A) mRNA was enriched from total RNA with Sera-mag Magnetic Oligo (dT) Beads (Thermo Fisher Scientific, USA). Next, the mRNA was digested into short fragments with fragmentation buffer (Ambion, USA). Then, these cleaved RNA fragments were used as templates for the first-strand cDNA synthesis with random hexamer primers, which was followed by the second-strand cDNA synthesis using the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, USA). The double-stranded cDNA fragments were purified with the QiaQuick PCR Extraction Kit (Qiagen, Germany) and ligated with sequencing adaptors. Finally, the short fragments were enriched by PCR amplification to create the sequencing libraries.
Illumina sequencing and reads mapping against reference seed transcriptome
Nine RNA-Seq libraries were sequenced on an Illumina HiSeq 2000. After filtering reads containing adaptor sequences and low-quality sequences, the resulting clean reads from each sequencing library were mapped to the reference seed transcriptome generated in our previous study. The read mapping was performed with the SOAPaligner/soap2 software, allowing mismatches of no more than two bases. To quantify gene expression abundance, the number of uniquely mapped reads for each reference unigene was normalized to RPKM (reads per kilobase per million mapped reads), which could eliminate the influence of gene length and differences in sequencing depth on the calculation of gene expression. Pearson correlation coefficients among the three samples at each representative time point were calculated for each reference unigene based on its RPKM values. Principal component analysis was also performed for all nine samples using the edgeR package.
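As an illustration of the normalization described above, the following minimal Python sketch computes RPKM using the standard definition, RPKM = 10^9 × C / (N × L), where C is the number of reads uniquely mapped to a unigene, N is the total number of uniquely mapped reads in the sample, and L is the unigene length in bases. It also includes a plain Pearson correlation for comparing two replicates. The function names and the example numbers are hypothetical; this is not the pipeline used in the study.

```python
from math import sqrt


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. the RPKM values of the same unigenes in two replicates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

For example, 500 reads on a 2000-base unigene in a library of 10 million mapped reads correspond to an RPKM of 25.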
Four total RNA samples, including three from the seeds collected at the same time points as those for RNA-Seq experiments and one from the young leaves, were used for the qRT-PCR assay. First-strand cDNA was prepared from 6 μg of total RNA using the SuperScript First-Strand cDNA Synthesis Kit (Invitrogen, USA). Primers were designed for 10 lipid-metabolism-related unigenes as listed in Additional file 1: Table S9. The reactions were performed on an ABI PRISM 7300 Sequence Detection System (Applied Biosystems, USA) following the manufacturer’s instructions. Each 20 μl reaction mixture contained 10 μl of SYBR Premix Ex Taq (Takara, Japan), 0.5 μl of each primer (10 μM), 1 μl of cDNA template, and 8 μl of RNase-free water. The reactions for each gene were conducted in triplicate with the thermal cycling conditions as follows: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s. The primer specificity was confirmed by melting curve analysis. The relative expression levels of the tested genes were calculated using the 2^(−ΔΔCt) method with normalization to that of the actin gene (4651).
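The relative quantification step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' calculation: the function name is hypothetical, the choice of calibrator sample is not given in the text and is left to the caller, and the formula assumes amplification efficiencies close to 100% for both the target and the actin reference.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of technical-replicate Ct values (triplicates
    in the text). "ref" is the reference gene (actin in the text); the
    calibrator is whichever sample is chosen as the baseline (assumption).
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    # Compare the sample of interest with the calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)
```

With made-up Ct values, a target that crosses the threshold three cycles earlier in the sample than in the calibrator, at equal reference Ct, yields an eightfold relative expression.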
Identification and functional categorization of DEGs
Comparison of unigene expression between seed developmental phases was performed with the edgeR package. The t test was used to judge the statistical significance of expression difference, with the false discovery rate (FDR) used to set the P-value threshold in multiple testing. In this study, DEGs were filtered with RPKM ≥ 0.1, |log2 fold change| ≥ 1, and FDR ≤ 0.001 in each pairwise comparison between phases. To further characterize the function of DEGs, they were assigned GO annotations by use of Blast2GO, and assigned metabolic pathway annotations by BLAST against the KEGG database. Both GO and KEGG pathway enrichment analyses for the DEGs were conducted with hypergeometric tests, using the whole seed transcriptome as the background.
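The filtering thresholds and the enrichment test described above can be sketched in Python as follows. This is a hedged illustration rather than the study's code. The function names are hypothetical, the log2 fold change and FDR are assumed to come from the upstream differential expression tool, and applying the RPKM cutoff to the higher of the two phases being compared is an assumption, since the text does not say how the cutoff was applied. The enrichment P-value is the upper tail of the hypergeometric distribution.

```python
from math import comb


def passes_deg_filter(rpkm_a, rpkm_b, log2_fold_change, fdr,
                      min_rpkm=0.1, min_abs_log2fc=1.0, max_fdr=0.001):
    """Apply the three thresholds stated in the text to one unigene."""
    # Assumption: the unigene must reach the RPKM cutoff in at least one phase.
    expressed = max(rpkm_a, rpkm_b) >= min_rpkm
    return (expressed
            and abs(log2_fold_change) >= min_abs_log2fc
            and fdr <= max_fdr)


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric test.

    N: annotated unigenes in the background transcriptome
    K: background unigenes belonging to the GO term or pathway
    n: annotated DEGs
    k: DEGs belonging to the GO term or pathway
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    favorable = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(k, min(n, K) + 1))
    return favorable / comb(N, n)
```

In practice the resulting P-values would still need a multiple-testing correction across all tested terms, which is omitted here.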
We thank Dr. Jiangxin Wang and Dr. Anping Lei for their help in lipid profiling experiments. We are also grateful to the reviewers and editors for their comments and suggestions for improving the manuscript.
This work was supported by the National Natural Science Foundation of China (Nos. 31300275 and 31370289), the Guangdong Innovation Research Team Fund (No. 2014ZT05S078), and the Research and Development Foundation of Science and Technology of Shenzhen (No. JCYJ20140724165855348).
Availability of data and materials
Illumina read data used for expression profiling of the Pongamia reference genes have been submitted to the NCBI Sequence Read Archive (SRA) under the accession number SRP132431. All other data supporting our findings can be found in Additional files 1, 2, 3, 4, 5, 6, 7, and 8.
JH, CPJ, and YZ conceived the study. XH and YJ collected the Pongamia seeds at different time points, conducted microscopic analyses, and measured seed weights. YJ and QS quantified oil content and FA composition in developing seeds. XG prepared RNAs for Illumina sequencing. KSK and YKA performed qRT-PCR experiments. JH and XH analyzed all phenotypic and molecular data. JH, DEH, CPJ, and YZ drafted and revised the manuscript. All authors have read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.