Global transcriptome analysis reveals circadian regulation of key pathways in plant growth and development
As nonmotile organisms, plants must rapidly adapt to ever-changing environmental conditions, including those caused by daily light/dark cycles. One important mechanism for anticipating and preparing for such predictable changes is the circadian clock. Nearly all organisms have circadian oscillators that, when they are in phase with the Earth's rotation, provide a competitive advantage. In order to understand how circadian clocks benefit plants, it is necessary to identify the pathways and processes that are clock controlled.
We have integrated information from multiple circadian microarray experiments performed on Arabidopsis thaliana in order to better estimate the fraction of the plant transcriptome that is circadian regulated. Analyzing the promoters of clock-controlled genes, we identified circadian clock regulatory elements correlated with phase-specific transcript accumulation. We have also identified several physiological pathways enriched for clock-regulated changes in transcript abundance, suggesting they may be modulated by the circadian clock.
Our analysis suggests that transcript abundance of roughly one-third of expressed A. thaliana genes is circadian regulated. We found four promoter elements, enriched in the promoters of genes with four discrete phases, which may contribute to the time-of-day specific changes in the transcript abundance of these genes. Clock-regulated genes are over-represented among all of the classical plant hormone and multiple stress response pathways, suggesting that all of these pathways are influenced by the circadian clock. Further exploration of the links between the clock and these pathways will lead to a better understanding of how the circadian clock affects plant growth and leads to improved fitness.
Keywords: Carotenoid; Transcript Abundance; Circadian Clock; Additional Data File; Methyl Jasmonate
CCA1, circadian clock associated 1
CCRE, circadian clock regulatory element
MEP, methyl erythritol phosphate
PBX, protein box element
ROS, reactive oxygen species
RT-PCR, reverse transcription polymerase chain reaction
TOC1, timing of CAB expression 1.
Harsh environmental extremes often accompany the daily light-dark cycle. In nearly every organism studied, an endogenous timekeeping mechanism has evolved that enables anticipation of these predictable changes. This is especially critical for sessile organisms such as plants. The circadian clock produces self-sustained rhythms with a period length of approximately 24 hours. To keep these rhythms in proper alignment with the day-night cycle, the clock is set or entrained by environmental timing cues such as changes in light or temperature. This is important because a functional clock can only provide an organism with a competitive advantage when it is correctly matched to the external environment [2, 3].
Although this advantage has been demonstrated for both phytoplankton and higher plants, the mechanistic link between the circadian clock and increased fitness remains unclear. Understanding how clocks confer an adaptive advantage requires a thorough knowledge of circadian-regulated pathways and processes. Fortunately, several microarray experiments have been performed to identify the circadian transcriptome of the model plant system Arabidopsis [4, 5, 6, 7, 8]. These studies have shown that a substantial portion of the plant genome is clock controlled, with transcript levels of different genes showing peak accumulation at all times, or phases, of the circadian cycle. We and others refer to genes with rhythmic regulation of transcript abundance as 'clock-regulated'; this may reflect circadian regulation of promoter activity and/or mRNA stability.
This raises another major question in circadian biology: how does the central clock mechanism control the vast array of circadian outputs and phase them to the appropriate time of day? Although the circadian clocks of higher plants, animals, and fungi consist of interlocking transcriptional feedback loops, the individual components vary [9, 10, 11]. In plants, one of these loops involves the reciprocal regulation of CCA1 (circadian clock associated 1) and TOC1 (timing of CAB expression 1), which have morning and evening phases of peak expression, respectively. Whereas TOC1 promotes CCA1 expression, the myb-related transcription factor CCA1 represses TOC1 expression upon binding to a circadian clock regulatory element (CCRE) in the TOC1 promoter [12, 13]. This CCRE, called the evening element (EE), is over-represented in the promoters of evening expressed circadian genes, and when multimerized it drives evening-phased circadian regulation of a reporter gene. The EE is one of the few CCREs that have been characterized [4, 8, 14, 15]. Several more CCREs, however, are likely required to generate the enormous diversity observed in phases of transcript accumulation of clock-regulated genes.
Here we suggest that the abundance of as many as one-third of expressed transcripts in Arabidopsis is circadian regulated; we use data from multiple circadian microarray experiments to discover known and potential circadian clock regulatory elements; and we identify new circadian-enriched pathways that may help to explain the physiological importance of the clock. These findings may help explain how clock outputs are regulated so that they occur at the appropriate time of day, a central function of the circadian clock. In addition, the enrichment of clock-regulated genes among many phytohormone- and stress-response pathways suggests that the circadian system modulates plant responses to most hormones and stresses, probably contributing to the adaptive advantage provided by a properly phased clock. These findings suggest the clock plays fundamental roles in nearly all aspects of plant growth and development, as well as in plant-environment interactions.
Results and discussion
Comparison of circadian microarray datasets
Experimental differences in original circadian microarray analyses
Columns: Number of time points; Light intensity (μmol/m² per second); Circadian detection algorithm
Harmer and coworkers: Affymetrix Arabidopsis Genome; Affymetrix MAS 4.0
Edwards and coworkers: 60 to 65; Affymetrix Arabidopsis ATH1; COSOPT (less stringent)
Covington and Harmer: Affymetrix Arabidopsis ATH1; COSOPT (more stringent)
Genes present in ≥ 4 of 12 samples
We next compared the degree of circadian regulation found in the Harmer and Covington datasets when the same analytical techniques are used. Comparing only genes found on both of the array platforms used in these experiments, the degree of circadian regulation in the Harmer and Covington datasets is quite similar (Figure 2c). When the Covington and Edwards datasets are analyzed using the same method used in the original Edwards analysis, the percentage of genes designated as clock regulated in the two experiments also becomes much more similar (Figure 2d). However, the degree of overlap between the genes defined as clock regulated in both the Harmer and Covington datasets or Edwards and Covington datasets is limited: about 33% and 37%, respectively (Figure 2e).
We suspected that genes identified as circadian regulated in both the Covington and Edwards microarray studies would have high-amplitude rhythms, whereas genes with low-amplitude rhythms would tend to be identified in only one of the studies. As predicted, we found a strikingly significant difference (P = 1.7 × 10^-106) between the relative amplitude of rhythmic genes identified by both datasets (0.21) and that of rhythmic genes identified only by the Covington dataset (0.12). This, together with our analysis of the Harmer dataset, suggested that identification of clock-regulated genes might be limited by technical issues and would benefit from increased sample numbers.
Because the Edwards and Covington experimental procedures were very similar, we reasoned that we might gain power by analyzing the 25 microarrays from these two experiments as a single time series. After normalizing the expression values for each probe set to its median for each dataset, we combined the two experiments in three ways: by interweaving these datasets to generate a 2-hour resolution time course spanning two days ('CECE' dataset); by appending the Edwards series after the Covington series to generate a 4-hour resolution time course over four days ('CCEE' dataset); and by appending the Covington series after the Edwards series to generate a different 4-day time course ('EECC' dataset; see Additional data file 1).
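The three combinations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling times (`cov_t`, `edw_t`), the toy expression values, and the helper names are hypothetical and are not taken from the original experiments; only the median normalization, the 12 + 13 = 25 arrays, and the interweave/append logic follow the text.

```python
from statistics import median

def median_normalize(values):
    """Divide each value of one probe set by that probe set's median in one dataset."""
    m = median(values)
    return [v / m for v in values]

def interweave(times_a, vals_a, times_b, vals_b):
    """Merge two series by sampling time (the 'CECE' style of combination)."""
    merged = sorted(zip(times_a + times_b, vals_a + vals_b))
    return [t for t, _ in merged], [v for _, v in merged]

def append(times_a, vals_a, times_b, vals_b, step=4):
    """Place series b after series a, keeping a constant sampling interval."""
    offset = times_a[-1] + step - times_b[0]
    return times_a + [t + offset for t in times_b], vals_a + vals_b

# Hypothetical sampling times in hours: 12 Covington and 13 Edwards arrays,
# each at 4-hour resolution and offset from one another by 2 hours.
cov_t = list(range(4, 52, 4))
edw_t = list(range(2, 54, 4))

# Toy expression values for a single probe set (not real data).
cov_v = median_normalize([float(10 + (i % 6)) for i in range(12)])
edw_v = median_normalize([float(20 + (i % 6)) for i in range(13)])

cece_t, cece_v = interweave(cov_t, cov_v, edw_t, edw_v)  # 2-hour resolution
ccee_t, ccee_v = append(cov_t, cov_v, edw_t, edw_v)      # Covington then Edwards
eecc_t, eecc_v = append(edw_t, edw_v, cov_t, cov_v)      # Edwards then Covington
```

Normalizing each dataset to its own median before combining keeps a constant offset between the two experiments from being mistaken for a rhythm.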
All three time courses were analyzed in accordance with the parameters used in the original Edwards analysis. In each case the abundance of 35% to 37% of expressed transcripts was found to be clock-regulated (Figure 2d). These three gene lists were remarkably consistent, with all two-way comparisons of these gene lists having 81% to 84% overlap (Figure 2e) and the intersection of all three lists being 76% of the union (Figure 2f). This group of 3,975 predicted circadian-regulated genes ('C+E intersection') at the intersection of the combined Covington and Edwards datasets contains almost all of the circadian genes found by analysis of the individual Covington and Edwards datasets (79% and 87%, respectively) as well as by the 'shuffled' Harmer time courses (81% to 88%; Figure 2g). Analysis of simulated data indicates that the strategy to identify the circadian-regulated genes in the C+E intersection has a false-positive rate of 1.1% and a false-discovery rate of 2.8%, which are much better than those for a single time course of 12 time points analyzed with the more stringent parameters used in the original Covington analysis (1.6% and 9.6%, respectively).
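The overlap figures quoted above are ratios of set sizes. The sketch below computes the intersection as a fraction of the union, which is how the text describes the three-way comparison; using the same ratio for two-way comparisons is an assumption, since the exact definition of the two-way overlap is not given here. The gene identifiers are placeholders.

```python
def overlap_fraction(*gene_lists):
    """Size of the intersection of all lists divided by the size of their union."""
    sets = [set(genes) for genes in gene_lists]
    shared = set.intersection(*sets)
    combined = set.union(*sets)
    return len(shared) / len(combined)

# Placeholder gene lists standing in for the CECE, CCEE, and EECC results.
cece = ["g1", "g2", "g3", "g4"]
ccee = ["g2", "g3", "g4", "g5"]
eecc = ["g2", "g3", "g4", "g6"]

two_way = overlap_fraction(cece, ccee)
three_way = overlap_fraction(cece, ccee, eecc)
```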
Two additional circadian microarray experiments ('Michael datasets') were recently performed using Arabidopsis seedlings and the same platform as the Covington and Edwards datasets. Subjecting the Michael datasets to analysis with our parameters reveals 17% circadian regulation in each dataset (Figure 2d) with limited overlap of circadian genes (Figure 2e). Seedlings harvested for the Michael datasets were grown differently from those used for the Covington, Edwards, and Harmer datasets. These differences included growth on media lacking sucrose and entrainment by daily changes in temperature, either in constant light ('Michael 1' dataset) or in combination with light/dark cycles ('Michael 2' dataset). Remarkably, despite these differences, more than two-thirds of the circadian genes identified in our analysis of the Michael datasets are also found in the C+E intersection (Figure 2g).
A recent comparison of five independent microarray studies to identify circadian-regulated genes in Drosophila demonstrated that differences in circadian detection algorithms as well as laboratory-dependent differences both have significant impacts on the overlap of lists of circadian-regulated genes. Even when they were reanalyzed in a uniform manner, the maximum observed overlap between lists of circadian-regulated genes from any two Drosophila datasets was only 24%, with an average overlap of 11%. The extensive overlap of cycling genes found between the C+E intersection and each of the individual datasets (Harmer, Covington, Edwards, and the two Michael datasets) suggests that a major limitation for detecting clock-regulated genes in circadian microarray experiments is not laboratory-dependent or biological variation, but rather technical issues that can be alleviated by increasing the number of time points. This can be accomplished by increasing the duration of the time course, the sampling frequency during the time course, or the degree of biological replication of samples. The first two approaches provide more biological information and thus appear to be preferable to the third. In order to minimize developmental effects and the damping of rhythms that often occurs during free running conditions, we recommend circadian time courses with increased sampling frequency rather than increased duration.
Given the impressive overlap between the genes designated as clock regulated when the Covington and Edwards datasets are either appended end-to-end or interwoven (Figure 2e, f), it appears reasonable to conclude that between 31% and 41% of expressed genes (representing the intersection and the union of the cyclers found in these datasets, respectively) are under circadian regulation (Figure 2f). This is consistent with an estimate of 36% of genes being circadian regulated based on a luciferase-based enhancer-trapping approach. For a summary of the genes that are expressed and circadian in the individual and combined datasets, see Additional data file 2.
Genome organization of circadian-regulated genes
Co-expressed genes have been shown to occur in clusters throughout the Arabidopsis genome [19, 20]. Similar patterns of genome organization have also been observed in animals and fungi [21, 22]. To determine whether genome organization plays an important role in circadian regulation of gene expression, we used three computational approaches to look for patterns in genome location of clock-regulated genes. We calculated the Pearson product-moment correlation coefficient, the fraction of clustered clock-regulated genes, and the mean pMMC-β value (a significance measure for circadian rhythmicity) in a sliding window across multiple genes to test whether circadian-regulated genes are co-localized in the Arabidopsis genome.
Analysis of circadian clock regulatory elements
To evaluate the biological relevance of the CCA1-binding site (CBS), we examined the phase distributions of circadian-regulated genes containing the CBS and, as a control, the related EE motif. EEs are over-represented in the promoters of evening-phased genes and are under-represented in the promoters of genes with transcripts that accumulate at any other time of day, as previously reported (Figure 4a) [4, 8]. In contrast, the CBS is under-represented in only one phase group and is not over-represented in any (Figure 4a), which suggests that the CBS is not involved in phase-specific transcript accumulation. It may be that both the in vitro binding of CCA1 to the CBS and the evening-phased circadian regulation conferred by the multimerized CBS are artifacts caused by the high similarity between the CBS and the EE.
Only two other CCREs have been demonstrated to control phase-specific expression; when multimerized, the morning element (ME; AACCACGAAAAT) confers dawn-phased expression and the protein box element (PBX; ATGGGCC) confers midnight-phased expression on a luciferase reporter gene [8, 14]. Therefore, the question remains, how is the observed diverse array of circadian phases of transcript abundance generated? To identify motifs that are important for time-of-day-specific circadian expression, we developed a multipronged promoter motif discovery and validation approach (described in Materials and methods, see below). We reduced the number of possible CCREs with the stringent requirement that each candidate motif exhibit phase-specific over-representation among genes classified as circadian in both the Covington and Edwards datasets. These candidate CCREs were then clustered based on their sequence similarity, leading to the identification of clades of related motifs (Figure 4b). When we calculated the frequency of each motif in the promoters of circadian-regulated genes, we found that most of the clades exhibit the same phase of peak transcript abundance in both the Covington and the Edwards datasets, validating our approach (see heat map in Figure 4b). The clusters with the greatest degree of phase consolidation contain genes with transcript abundance peaking during subjective dawn (Figure 4e), early day (Figure 4f), late day (Figure 4c), and subjective dusk (Figure 4d). As expected, the frequency distribution data for these consensus sequences correlate with the mean phase-specific frequencies of all motifs in the indicated clades (Figure 4g-j).
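As a rough illustration of what 'phase-specific frequency' means here, the sketch below tabulates the fraction of promoters in each phase bin that contain a given motif on either strand. It is not the authors' discovery pipeline (which is described in their Materials and methods); the promoter sequences, gene names, phase values, and the 4-hour bin width are hypothetical. Only the PBX sequence ATGGGCC is taken from the text.

```python
from collections import defaultdict

def revcomp(seq):
    """Reverse complement of a DNA sequence written in upper-case A/C/G/T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def motif_frequency_by_phase(promoters, phases, motif, bin_width=4):
    """Fraction of promoters per phase bin containing the motif on either strand."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    rc = revcomp(motif)
    for gene, seq in promoters.items():
        # Assign the gene to a bin labeled by the bin's starting hour.
        phase_bin = int(phases[gene] % 24 // bin_width) * bin_width
        totals[phase_bin] += 1
        if motif in seq or rc in seq:
            hits[phase_bin] += 1
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Hypothetical promoter fragments and peak phases (hours), for illustration only.
promoters = {
    "geneA": "TTTATGGGCCAAA",  # contains ATGGGCC on the forward strand
    "geneB": "GGGCCCATTT",     # contains the reverse complement GGCCCAT
    "geneC": "ACGTACGTACGT",
    "geneD": "TTTTTTTT",
}
phases = {"geneA": 22, "geneB": 21, "geneC": 2, "geneD": 10}

freq = motif_frequency_by_phase(promoters, phases, "ATGGGCC")
```

A motif whose frequency is high in one bin and near background elsewhere would be a candidate for the kind of phase-specific over-representation discussed above; assessing significance requires a statistical test that this sketch does not include.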
The putative CCREs that we identified are related to motifs recently found by others to be enriched in the promoters of circadian genes [4, 8, 14, 15]. The CCACA motif that we found to be enriched in the promoters of dawn-phased genes (Figure 4e) is almost identical to the ME computationally defined by Michael and coworkers and similar to the ME found by Harmer and Kay to confer dawn-phased rhythms on a reporter gene. Similarly, the early day-phased motif shown in Figure 4f contains a G-box sequence, which Michael and coworkers found to be enriched in dawn-phased genes. The late day-phased motif (Figure 4c) contains a GATA core element, which is also found within the longer EE motif (Figure 4d). Interestingly, the GATA cluster has a slightly earlier phase than the EE cluster, suggesting that specific flanking sequences might modify the phase conferred by a CCRE. Indeed, we previously showed that placing a ME adjacent to an EE in the promoter of a reporter gene results in an advanced phase of expression relative to an EE alone. Michael and coworkers also found that GATA motifs are enriched in the promoters of genes with an afternoon phase of transcript accumulation.
Despite using different analytical strategies and gene lists, we and Michael and coworkers found many of the same motifs to show phase-specific enrichment. This strongly suggests that the field has now identified at least four major motifs important for clock-regulated transcript accumulation at multiple phases during the subjective day and night. There may be other important CCREs yet to be discovered, because our analysis did not identify the PBX motif found by Michael and coworkers.
It will next be critical to test whether the GATA and G-box motifs do confer different day-phased rhythms of transcript accumulation and to determine whether different combinations of the four known CCREs in the promoters of circadian genes are sufficient to confer every phase of circadian transcript accumulation. Identification of the transcription factors that bind to these CCREs will provide insight into the circuitry of the circadian clock and the regulatory network between the clock and its outputs.
Circadian transcription factors
To begin to define this regulatory network, we next wished to identify transcription factors found to be clock regulated in the C+E intersection. Only 732 of the 1,690 genes with the GOslim annotation 'transcription factor activity' are detectably expressed in the C+E intersection, perhaps reflecting specialized functions of many transcription factors in nonseedling tissues. Of these 732 genes, we found 247 (33.7%) - from a variety of families - to be circadian regulated. Although this degree of circadian regulation is no higher than would be expected by chance, seven transcription factor families exhibit a significant circadian enrichment: Constans (CO)-like, Myb-related, basic leucine zipper (bZIP), multiprotein bridging factor 1 (MBF1), barley B recombinant-basic pentacysteine 1 (BBR-BPC), tubby-like protein (TLP), and teosinte branched1/cycloidea/PCF (TCP).
Links to the circadian clock were previously described for the first three families [10, 26, 27, 28, 29, 30, 31, 32] but not for the others. A role for plant homologs of MBF1 in defense responses to pathogens has been suggested, whereas members of the BBR-BPC, TLP, and TCP families have been implicated in multiple aspects of developmental control [34, 35, 36, 37]. For the TCP transcription factors, this includes cell growth and proliferation, organ shape and border delimitation, and shoot branching. Perturbation of expression of clock-regulated TCP genes causes phenotypes often found in clock mutants, such as late flowering and elongated hypocotyls, suggesting these plants may have impaired circadian function.
Identification of pathways with an under- or over-representation of circadian-regulated genes
In order to understand the physiological relevance of the circadian system and how a functional clock can confer a competitive advantage, we must know which pathways and processes are controlled by the clock. We therefore identified functionally related gene groups with either more or fewer circadian-regulated genes than expected by chance. Many core processes had significantly fewer than expected oscillatory transcripts, including the following: RNA processing; DNA synthesis and chromatin structure; protein synthesis, secretion, and ubiquitin-mediated degradation; G-protein-mediated signaling; and cell cycle. It may be that these processes are not clock regulated because they must occur at all times during the day/night cycle. On the other hand, transcript abundance of these genes may only be clock regulated in a subset of tissue types; if this is the case, then we might not detect circadian regulation given the whole-plant sampling performed in published microarray studies. Finally, these pathways might be influenced by the circadian clock either via clock-controlled transcription of one or a few key regulators or via circadian influence on post-transcriptional mechanisms such as protein degradation or phosphorylation [39, 40].
Circadian regulation of isoprenoid biosynthetic pathways and ABA biosynthetic genes
Many genes that encode enzymes acting downstream of the MEP pathway in the biosynthesis of complex isoprenoids are themselves clock regulated. More than 85% (7/8; P value for circadian enrichment = 1.7 × 10^-3) of the genes involved in the conversion of GGDP and tyrosine into the various tocopherols and tocotrienols that together comprise the antioxidant vitamin E are clock regulated, six with a morning phase of peak transcript abundance (Figure 5c). Furthermore, genes encoding enzymes that act several steps upstream of tyrosine synthesis are also circadian regulated with the same morning phase (data not shown).
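To give a sense of how an enrichment such as 7 of 8 genes can be scored, the sketch below computes an upper-tail binomial probability. This is an assumption for illustration: the text here does not state which statistical test the authors used, and the background rate of 0.31 is simply the lower estimate of the circadian fraction quoted earlier in the article, not a parameter taken from their analysis.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k hits among n genes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumed background: fraction of expressed genes called circadian (hypothetical choice).
background_rate = 0.31

# Seven of the eight vitamin E biosynthetic genes are reported as clock regulated.
p_value = binomial_tail(7, 8, background_rate)
```

A hypergeometric test, which samples without replacement from the full set of expressed genes, would be a natural alternative when the total gene counts are available.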
Similarly, we found a strikingly significant enrichment (10/12 [83%]; P = 3.1 × 10^-4) of circadian regulation among genes encoding enzymes that are involved in the synthesis of carotenoids from GGDP, with most showing a peak phase of transcript abundance at around subjective dawn (Figure 5d). Notably, the transcript abundance of PSY (PHYTOENE SYNTHASE), encoding the first and rate-limiting enzyme in carotenoid biosynthesis, is clock controlled (Figure 5d). Carotenoids play an essential role in the process of nonphotochemical quenching, which allows plants to quench excited chlorophyll and prevent oxidative damage under excessive light conditions. In contrast to the dawn-phased transcript accumulation of carotenoid biosynthetic genes, NPQ1 (a gene encoding violaxanthin deepoxidase) has peak transcript levels at subjective dusk (Figure 5d). Violaxanthin deepoxidase acts antagonistically to the other clock-regulated carotenoid biosynthetic genes by recycling the carotenoid violaxanthin into compounds upstream of violaxanthin synthesis as part of the nonphotochemical quenching process. Therefore, the antagonistic function of NPQ1 coincides well with its transcript accumulation pattern, which is antiphasic to that of the other clock-regulated carotenoid genes.
Carotenoids are also precursors to the hormone ABA, and over-expression of either CLA1 or PSY results in increased levels of carotenoids and ABA [46, 49]. Additionally, the transcripts of the clock-regulated ABA metabolic genes NCED3 (NINE-CIS-EPOXYCAROTENOID DIOXYGENASE) and ABA2 (ABA DEFICIENT 2) accumulate during the subjective morning (Figure 5e). NCED3 encodes the rate-limiting activity for ABA biosynthesis. The extensive clock regulation of genes implicated in ABA synthesis led us to examine whether ABA-responsive genes might also be enriched for circadian regulation.
Extensive circadian regulation of hormone-responsive genes
Circadian-enriched hormone and stress response pathways
Columns: % Circadian (circadian/expressed [n]); P value for over-representation; Other reports of enrichment of genes with
Abscisic acid (up): P = 2.7 × 10^-14
Abscisic acid (down): P = 5.5 × 10^-10
1-Aminocyclopropane-1-carboxylic acid (up): no P value listed
1-Aminocyclopropane-1-carboxylic acid (down): P = 1.6 × 10^-05
Rows without a recoverable label: P = 6.2 × 10^-08; 1.6 × 10^-02; 8.3 × 10^-04
Gibberellic acid (up): no P value listed
Gibberellic acid (down): P = 3.9 × 10^-07
Indole-3-acetic acid (up): P = 2.9 × 10^-02
Indole-3-acetic acid (down): P = 2.2 × 10^-08
Methyl jasmonate (up): P = 5.9 × 10^-06
Methyl jasmonate (down): P = 9.9 × 10^-20
Salicylic acid (up): P = 7.0 × 10^-09
Salicylic acid (down): P = 1.3 × 10^-03
Rows without a recoverable label: P = 1.5 × 10^-02; 6.6 × 10^-04; 2.0 × 10^-02; 1.1 × 10^-05; 3.6 × 10^-04
In addition to diurnal changes in ABA abundance, it has been reported that other hormones such as auxins, brassinosteroids, cytokinins, ethylene, and gibberellins fluctuate over day/night cycles [52, 53, 54, 55, 58, 59, 60, 61]. Furthermore, there is a significant overlap between brassinolide-induced and clock-regulated genes. To investigate further the connections between the circadian clock and hormone signaling, we systematically examined genes that respond to these or other hormones within 30 minutes to 4 hours after treatment [57, 63]. Strikingly, for every plant hormone analyzed there is a significant enrichment of circadian-regulated hormone-responsive genes. Specifically, we found circadian enrichments for genes that are induced in response to ABA, cytokinin, indole-3-acetic acid (IAA), methyl jasmonate (MJ), or salicylic acid (SA), as well as for genes downregulated in response to ABA, 1-aminocyclopropane-1-carboxylic acid (ACC; a key intermediate in ethylene biosynthesis), brassinolide, cytokinin, GA, IAA, MJ, or SA (Figure 6 and Table 2). Although changes in transcript abundance do not always correlate with changes in the abundance or activity of the corresponding protein [64, 65], circadian changes in transcript levels of hormone-regulated genes probably indicate changes in either hormone levels or signaling pathway activity. Thus, our data suggest that the circadian clock modulates all of these hormone signaling pathways, perhaps helping to explain the pervasive effects of the clock on plant growth and development.
Possible links between the clock and hormone signaling
The gaseous hormone ethylene plays well-known roles in fruit ripening and the triple response during seedling emergence; in addition, it is involved in organ senescence and abscission and responses to both abiotic and biotic stresses. Production of ethylene has long been recognized as robustly clock regulated [68, 69, 70], but the mechanism linking the clock to rhythmic ethylene production is not currently understood. ACS8 (ACC SYNTHASE 8; At4g37770), a gene that is involved in the production of ethylene, has previously been shown to be circadian regulated with peak accumulation during the subjective day, the same time as peak ethylene emission; however, plants with a T-DNA insertion within the ACS8 coding region do not exhibit altered ethylene rhythms. Under typical conditions, ACC synthase is believed to catalyze the rate-limiting step of ethylene biosynthesis. Under certain circumstances, however, the step catalyzed by ACC oxidase becomes rate limiting. Intriguingly, we found that two genes that encode putative ACC oxidase enzymes (At1g04350 and At5g63600) are circadian regulated, with a phase of transcript accumulation similar to that of ACS8 (data not shown). It is possible that all three enzymes act together to generate circadian ethylene emission.
Circadian regulation of abiotic stress responses
Multiple plant hormones have been implicated in stress responses [67, 75, 76, 77] and many acute abiotic stresses are the direct result of daily light/dark cycles. As such, genes that are involved in perception, signaling and/or responses related to environmental stresses might be expected to be under clock control. Indeed, circadian regulation of salt-, osmoticum-, and cold-regulated genes has previously been demonstrated [4, 78] (Table 2). By analyzing circadian fluctuations in transcript levels from genes grouped by Gene Ontology term, we identified additional stress-response pathways that are likely to be influenced by the clock, suggesting that the circadian clock is implicated not only in plant responses to cold, salt and drought, but also in responses to heat and reactive oxygen species (ROS).
As well as generating predictable changes in temperature, the Earth's daily rotation causes rhythms in light availability. Although light is essential for photosynthesis and plant survival, excess light leads to the accumulation of ROS that can damage the photosynthetic machinery and the plant. ROS production is even more pronounced under stress conditions such as bright light, drought, or extreme temperatures. Because genes that are involved in the synthesis of the compounds (carotenoids and tocopherols) that prevent ROS production through nonphotochemical quenching are clock regulated, with transcript levels peaking near subjective dawn (Figure 5c-d), it is interesting that 34% (41/122) of genes induced by ROS or oxidative damage are also clock-regulated. Although this is not a statistically significant enrichment, the average transcript profile for these genes peaks early in the subjective day, with a phase similar to that of genes involved in the light-harvesting reactions of photosynthesis (Figure 8b). It may be that clock regulation of photosynthetic and ROS responsive genes helps plants optimize photosynthetic activity while minimizing cellular damage caused by this process.
Abiotic stress responses appear to be highly interconnected, perhaps because related stresses often occur concurrently. Signaling pathways for stress-related hormones such as ABA, SA, MJ, and ethylene are believed to be important components in the crosstalk between stress signaling pathways. The high degree of circadian regulation among genes responsive to various hormones and stresses might lead one to predict that the same clock-controlled genes are regulated by many different abiotic stimuli. However, this is not the case; most circadian-regulated genes are regulated by only one or two different stresses or hormones. This is reminiscent of the limited overlap between hormone-responsive genes in general; multiple hormones may regulate the expression of a family of genes with similar functions, but each individual gene is seldom controlled by more than one or two hormones. This pathway specificity may allow the plant to fine-tune responses for a variety of stress conditions. For example, the gene response profile of plants subjected to drought and heat stress together is very different from the union of genes regulated by heat or drought alone.
Our analysis of several circadian microarray experiments suggests that between 30% and 40% of expressed genes are clock regulated in seedlings. Transcript profiling and bioinformatic analyses are leading to a better understanding of the cis and trans factors that control these rhythmic changes in transcript abundance; in particular, bioinformatic analysis of promoter sequences has implicated several discrete motifs in phase-specific regulation of clock-controlled genes. Examination of pathways with an over-representation of clock-regulated genes is giving us insight into new aspects of plant physiology influenced by the clock. Of special interest is the extensive circadian regulation of all of the hormone and many of the environmental stress signaling pathways that we have examined. These new findings suggest most aspects of plant physiology are influenced by the circadian system and will help to lead us to a mechanistic understanding of how clocks provide an adaptive advantage.
Materials and methods
Verification of rhythmic expression by RT-PCR
We randomly selected genes with varying degrees of rhythmic robustness. We chose three genes from the highest-amplitude third of cyclers (At1g06460: 5'-CAT CTC TCG TCC CCT TGA AC-3' and 5'-AGG CCT TTC CTT TTG CAG AT-3'; At1g69830: 5'-CCC AGT TTC TTC GTC CTT CA-3' and 5'-CAA AAG TCA ATC GCG GAA AT-3'; and At5g12110: 5'-ATC TCC ACA CAG AGC GAG GT-3' and 5'-GCA GCT TCT CTC TCT TCA GCA-3') and three from the lowest-amplitude third of cyclers (At3g22970: 5'-GCC ATT TAC GAT GAA GAT CCA-3' and 5'-CGT CGG CTA ACA GAT TCC TC-3'; At1g45688: 5'-AAT CAC CAT CAC GCG ACT CT-3' and 5'-CAG CTT GGA TCT TAA GCG TCT-3'; and At3g04760: 5'-TCA GGC TGT CCG AAT TTC TCG AGA-3' and 5'-CCT CTG AAC TCG TTG GTT TCA CTA TCC-3'). For each time point, circadian transcript levels were normalized by dividing by transcript levels of the control gene UBQ10 (which encodes polyubiquitin 10; At4g05320: 5'-TCA AAT CTC TCT ACC GTG ATC AAG-3' and 5'-TTA CAT GAA ACG AAA CAT TGA ACT TC-3'). Semi-quantitative PCR was conducted as previously described.
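The per-time-point normalization to UBQ10 amounts to an element-wise division. A minimal sketch follows; the function name and the band-intensity values are hypothetical and serve only to show the arithmetic.

```python
def normalize_to_control(target_levels, control_levels):
    """Divide the target transcript level by the control level at each time point."""
    if len(target_levels) != len(control_levels):
        raise ValueError("target and control must cover the same time points")
    return [t / c for t, c in zip(target_levels, control_levels)]

# Hypothetical band intensities across four time points (not measured data).
target = [120.0, 300.0, 150.0, 60.0]
ubq10 = [100.0, 150.0, 100.0, 120.0]

relative = normalize_to_control(target, ubq10)
```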
Comparison of circadian microarray datasets
The Harmer dataset was composed of technical replicates using Affymetrix Arabidopsis Genome Arrays (Affymetrix Inc., Santa Clara, CA, USA). We randomly assigned these replicates into separate unreplicated sets 20 different times. These were reanalyzed side-by-side with the Covington dataset (Affymetrix Arabidopsis ATH1 Genome Array). Because different sets of genes are represented on the two microarray platforms, we focused on genes common to both arrays that are also expressed in each dataset. We defined a gene as expressed if the Affymetrix MAS5.0 software called it 'Present' in at least four out of 12 samples (or out of the first 12 of 13 samples for the Edwards dataset).
Both the Edwards and Covington datasets were originally analyzed with the same circadian detection algorithm, namely COSOPT. However, the Edwards analysis did not use the initial sampling density weighted linear regression detrending, resulting in an increased number of genes identified as circadian. To compare the extent of circadian regulation of genes expressed in both datasets, we reanalyzed the Covington dataset using the Edwards protocol, ignoring the dChip-derived standard error value and omitting the detrending step. Similarly, we analyzed the Michael datasets using the COSOPT parameters originally reported by Edwards and coworkers. The Edwards and Covington datasets were combined in three different ways (as described under Results and discussion, above), and then analyzed using COSOPT. Only genes defined as expressed in both individual datasets were considered expressed in the combined dataset.
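The expression filter and the random re-assignment of technical replicates can be sketched in Python as follows. This is a minimal illustration under stated assumptions: detection calls are assumed to be encoded as the strings "P", "M", and "A", each time point is assumed to have exactly two replicates, and all function names are hypothetical rather than taken from the original analysis scripts.

```python
import random


def is_expressed(calls, min_present=4, n_samples=12):
    """Return True if at least `min_present` of the first `n_samples`
    detection calls are 'P' (Present)."""
    return sum(1 for call in calls[:n_samples] if call == "P") >= min_present


def split_replicates(rep_a, rep_b, rng):
    """Randomly assign the two replicates at each time point to two
    unreplicated time series."""
    set1, set2 = [], []
    for a, b in zip(rep_a, rep_b):
        if rng.random() < 0.5:
            a, b = b, a  # swap which series receives which replicate
        set1.append(a)
        set2.append(b)
    return set1, set2


# Hypothetical values for one gene across four time points (not real data).
rng = random.Random(1)
rep_a = [10.0, 20.0, 30.0, 40.0]
rep_b = [11.0, 19.0, 33.0, 38.0]

# Repeat the random assignment 20 times, as described in the text.
splits = [split_replicates(rep_a, rep_b, rng) for _ in range(20)]
```

Slicing to the first `n_samples` calls is how the sketch handles a 13-sample series such as the Edwards dataset.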
Genome organization of circadian-regulated genes
Groups of adjacent expressed genes in a sliding window (of sizes two, five, and ten genes) were evaluated based on the proportion displaying circadian expression patterns, the mean pMMC-β value (a measure of circadian rhythmicity), or the mean combinatorial pair-wise Pearson correlation coefficient. Threshold values were empirically derived via an approach based on a method originally proposed for quantitative trait mapping. Specifically, we calculated the strongest cluster score for each of 1,000 random permutations of the data. From these values, we used the 95th percentile as an estimated experiment-wise critical value to detect circadian clusters in the genome with an overall type I error rate less than 5%. For the first two approaches, statistically significant local clusters of circadian-regulated genes were only detected when we grouped genes by phase of peak transcript abundance (using bins either 2 hours or 4 hours wide). This analysis was performed using scripts written in the statistical programming language R.
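The permutation-derived threshold can be illustrated with a short Python sketch. The original analysis was written in R; the version below is not that code. It assumes the simplest of the three scores (the proportion of circadian genes in a window), represents genes in chromosomal order as a list of 0/1 flags, and uses hypothetical function names and toy data.

```python
import random


def max_window_score(flags, window):
    """Largest fraction of flagged (circadian) genes found in any run of
    `window` adjacent genes."""
    best = 0.0
    for start in range(len(flags) - window + 1):
        fraction = sum(flags[start:start + window]) / window
        best = max(best, fraction)
    return best


def permutation_threshold(flags, window, n_perm=1000, quantile=0.95, seed=0):
    """Estimate an experiment-wise critical value: shuffle gene order
    `n_perm` times, record the strongest window score of each shuffle,
    and return the requested quantile of those maxima."""
    rng = random.Random(seed)
    shuffled = list(flags)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max_window_score(shuffled, window))
    maxima.sort()
    index = min(n_perm - 1, round(quantile * n_perm))
    return maxima[index]


# Toy chromosome (not real data): 1 marks a circadian gene in a given phase bin.
toy_flags = [0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
observed = max_window_score(toy_flags, window=5)
critical = permutation_threshold(toy_flags, window=5)
print(observed, critical)
```

Because each permutation contributes only its single strongest window, comparing an observed window score against the resulting critical value controls the error rate across all windows at once rather than window by window.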
Analysis of circadian clock regulatory elements
We employed four different strategies to identify potential motifs of interest: a trio of established motif discovery tools (stand-alone versions of AlignACE v2004 [88, 89], Weeder v1.2 [90, 91], and MotifSampler v3.2 [92, 93]) and an exhaustive in silico testing of 6-mer and 8-mer nucleotide sequences.
The following validation protocol using both the Covington and Edwards datasets helped to narrow the list of putative CCREs to a more tractable size (from 55,107 to 126). For both the Covington and Edwards datasets, a potential motif must be over-represented in circadian genes versus all expressed genes; over-represented in at least one phase-specific subset of circadian genes versus all circadian genes; and under-represented in at least one phase-specific subset of circadian genes versus all circadian genes. Over-representation and under-representation were determined using a previously described permutation testing approach [7, 94]. Subsequent clustering of motifs based solely on sequence similarity (as measured using a scoring approach based on that used for Clustal) enabled us to reduce further the number of motifs of interest by consolidating sequences with slight variations. These analyses were performed using scripts written in Perl and the statistical programming language R.
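A generic permutation test for motif over-representation can be sketched in Python as shown below. This is an assumption-laden illustration, not the cited procedure [7, 94] and not the authors' Perl or R code: it scans one strand only, counts a promoter once regardless of how many copies of the motif it contains, and uses hypothetical function names and toy sequences.

```python
import random


def enrichment_p_value(promoters, subset_ids, motif, n_perm=1000, seed=0):
    """Permutation p-value for over-representation of `motif` in the
    promoters of `subset_ids` relative to all genes in `promoters`.

    promoters:  dict mapping gene ID -> promoter sequence
    subset_ids: set of gene IDs (for example, one phase-specific group)
    """
    rng = random.Random(seed)
    gene_ids = sorted(promoters)
    has_motif = {g: motif in promoters[g].upper() for g in gene_ids}
    observed = sum(has_motif[g] for g in subset_ids)
    size = len(subset_ids)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        sample = rng.sample(gene_ids, size)  # random gene set of equal size
        if sum(has_motif[g] for g in sample) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so that the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy promoters (not real sequences): only g0..g3 carry the test motif.
toy_promoters = {f"g{i}": "CCCCCCCCCCCC" for i in range(20)}
for i in range(4):
    toy_promoters[f"g{i}"] = "CCAAAATATCTC"

p_enriched = enrichment_p_value(toy_promoters, {"g0", "g1", "g2", "g3"}, "AAAATATCT")
p_absent = enrichment_p_value(toy_promoters, {"g10", "g11", "g12", "g13"}, "AAAATATCT")
print(p_enriched, p_absent)
```

Under-representation can be tested the same way by counting permutations whose motif count is less than or equal to the observed count.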
Determination of pathway over-representation
Using annotations for the circadian-regulated genes found in the C+E intersection (see Additional data file 2), we searched for functionally related gene groups enriched for circadian patterns of transcript accumulation. Genes were grouped according to annotations based on MapMan bins, Gene Ontology terms, and The Arabidopsis Information Resource gene families, as well as information gleaned from the primary literature. Over-representation of circadian-regulated genes was determined using Fisher's exact test.
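For a single functional group, the over-representation test reduces to a 2 x 2 table (in group or not, circadian or not). The Python sketch below computes a one-sided Fisher's exact p-value as the upper tail of the hypergeometric distribution. The one-sided alternative is an assumption, since the text does not state which alternative was used, and the function name and counts are hypothetical.

```python
from math import comb


def fisher_over_representation(circ_in_group, group_size, circ_total, expressed_total):
    """One-sided Fisher's exact p-value: the probability of seeing at least
    `circ_in_group` circadian genes in a group of `group_size` genes drawn
    from `expressed_total` expressed genes, `circ_total` of which are circadian."""
    denominator = comb(expressed_total, group_size)
    upper = min(group_size, circ_total)
    p_value = 0.0
    for k in range(circ_in_group, upper + 1):
        p_value += (
            comb(circ_total, k)
            * comb(expressed_total - circ_total, group_size - k)
            / denominator
        )
    return p_value


# Hypothetical counts (not from the paper): 30 of 50 genes in a pathway are
# circadian, against a background of 3,000 circadian genes among 10,000 expressed.
p_pathway = fisher_over_representation(30, 50, 3000, 10000)
print(p_pathway)
```

When many groups are tested, the resulting p-values would normally need a multiple-testing correction; the text does not say which correction, if any, was applied.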
Additional data files
The following additional data are available with the online version of this paper. Additional data file 1 is a table listing the normalized circadian expression data for the combined Covington and Edwards dataset CCEE. Additional data file 2 is a table summarizing the expressed and circadian genes identified using different circadian microarray datasets.
Acknowledgements
We thank B Usadel and M Stitt for early access to MapMan annotations, M Waugh for technical assistance, and anonymous reviewers for helpful suggestions. This project was supported by the National Research Initiative of the US Department of Agriculture Cooperative State Research, Education and Extension Service, grant number 2004-35100-14903 (to MFC) and by the National Institutes of Health grant number GM069418 and National Science Foundation grant number 0616179 (to SLH).
References
- 8. Michael TP, Mockler TC, Breton G, McEntee C, Byer A, Trout JD, Hazen SP, Shen R, Priest HD, Sullivan CM, Givan SA, Yanovsky M, Hong F, Kay SA, Chory J: Network discovery pipeline elucidates conserved time-of-day-specific cis-regulatory modules. PLoS Genet. 2008, 4: e14.
- 25. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29.
- 52. Burschka C, Tenhunen JD, Hartung W: Diurnal variations in abscisic acid content and stomatal response to applied abscisic acid in leaves of irrigated and non-irrigated Arbutus unedo plants under naturally fluctuating environmental conditions. Oecologia (Berlin). 1983, 58: 128-131.
- 65. Gibon Y, Blaesing OE, Hannemann J, Carillo P, Hohne M, Hendriks JH, Palacios N, Cross J, Selbig J, Stitt M: A robot-based platform to measure multiple enzyme activities in Arabidopsis using a set of cycling assays: comparison of changes of enzyme activities and transcript levels during diurnal cycles and in prolonged darkness. Plant Cell. 2004, 16: 3304-3325.
- 73. AtGenExpress Visualization Tool. [http://jsp.weigelworld.org/expviz/expviz.jsp]
- 87. R Development Core Team: R: a Language and Environment for Statistical Computing. 2007, Vienna, Austria: R Foundation for Statistical Computing.
- 97. Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, Radenbaugh A, Singh S, Swing V, Tissier C, Zhang P, Huala E: The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 2008, 36: D1009-D1014.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.