High level transgenic expression of soybean (Glycine max) GmERF and Gmubi gene promoters isolated by a novel promoter analysis pipeline
Although numerous factors can influence gene expression, promoters are perhaps the most important component of the regulatory control process. Promoter regions are often defined as a region upstream of the transcriptional start. They contain regulatory elements that interact with regulatory proteins to modulate gene expression. Most genes possess their own unique promoter and large numbers of promoters are therefore available for study. Unfortunately, relatively few promoters have been isolated and characterized, particularly from soybean (Glycine max).
In this research, a bioinformatics approach was first used to identify members of the Gmubi (G. max ubiquitin) and the GmERF (G. max Ethylene Response Factor) gene families of soybean. Ten Gmubi and ten GmERF promoters from selected genes were cloned upstream of the gfp gene and successfully characterized using rapid validation tools developed for both transient and stable expression. Quantification of promoter strength using transient expression in lima bean (Phaseolus lunatus) cotyledonary tissue and stable expression in soybean hairy roots showed that the intensity of gfp gene expression was mostly conserved across the two expression systems. Seven of the ten Gmubi promoters yielded from 2- to 7-fold higher expression than a standard CaMV35S promoter, while four of the ten GmERF promoters showed from 1.5- to 2.2-times higher GFP levels compared to the CaMV35S promoter. Quantification of GFP expression in stably-transformed hairy roots of soybean was variable among roots derived from different transformation events but consistent among secondary roots derived from the same primary transformation events. Molecular analysis of hairy root events revealed a direct relationship between copy number and expression intensity; higher copy number events displayed higher GFP expression.
In this study, we present expression intensity data on 20 novel soybean promoters from two different gene families, ubiquitin and ERF. We also demonstrate the utility of lima bean cotyledons and soybean hairy roots for rapid promoter analyses and provide novel insights towards the utilization of these expression systems. The soybean promoters characterized here will be useful for production of transgenic soybean plants for both basic research and commercial plant improvement.
Keywords: Hairy Root, Lima Bean, Promoter Strength, Ethylene Response Factor, Transgenic Hairy Root
List of abbreviations
CaMV35S: Cauliflower Mosaic Virus 35S
MS: Murashige and Skoog
MS medium containing no plant growth regulators
ERF: Ethylene Response Factor
Gmubi: Glycine max Ubiquitin
GmERF: Glycine max Ethylene Response Factor
5' UTR: 5' Untranslated Region
pFLEV: Finer Laboratory Expression Vector
MCS: Multiple Cloning Site
PCR: Polymerase Chain Reaction
With the increasing amount of biological information derived from genome sequencing projects of several plant species [1, 2], opportunities exist for functional analysis of those sequences using a combination of computational approaches and various methods of wet laboratory analyses of gene expression. The recent release of the soybean genome has tremendously facilitated computational genome-wide analyses of the soybean genome and identification of specific DNA sequences, which need to be validated using functional analysis tools. The availability of the soybean genome has also provided unprecedented access to sequences for a wide range of promoters from diverse gene families, which will lead to a better understanding of the regulation of gene expression and the discovery of novel soybean promoters for use in basic research and applied crop biotechnology.
Promoters are the primary regulators of gene expression at the transcriptional level and are key to controlling transgenes in transgenic organisms. The use of one or only a few different promoters to direct expression of different genes in transgene stacks can lead to homology-based gene silencing and unpredictable transgene expression in transgenic plants. Consequently, it is necessary to increase the availability of different promoters for plant transformation. Although the constitutive highly-expressed Cauliflower Mosaic Virus 35S (CaMV35S) promoter is commonly used for gene regulation in plants, different plant genomes can provide additional useful native plant promoters ranging from highly-expressing constitutive to tissue-specific and inducible. Likewise, analyses of native promoters will most likely reveal a large variety of heretofore undiscovered cis-regulatory elements, which will increase our understanding of gene expression regulation. Although several plant promoters are available as an alternative to the CaMV35S promoter, very few soybean promoters have been isolated and extensively characterized in soybean [7, 8, 9], in spite of the world-wide economic impact of this crop.
We recently reported the isolation and characterization of a Glycine max polyubiquitin (Gmubi) promoter, which leads to high constitutive levels of both transient and stable gene expression in various tissues of transgenic soybeans. Other plant ubiquitin promoters have also been isolated and characterized in a wide variety of plant species. In particular, ubiquitin promoters from rice and maize have been extensively characterized and frequently used in both basic research and in the production of commercial transgenics. Ubiquitin promoters typically drive strong constitutive gene expression, which is especially high in young tissues, vascular tissues and pollen. The enhancement of gene expression from the presence of the leading intron in the different ubiquitin promoters has also received considerable attention. In spite of the emphasis on the use of ubiquitin promoters, most studies to date have relied on single promoter sequences isolated from different plant species [15, 16, 17]. However, the ubiquitin gene family is quite large in most plants, and isolation and characterization of different ubiquitin promoters from the same plant could serve as a source of additional promoters and provide useful information on how different ubiquitin genes are differentially regulated.
As ubiquitin promoters tend to drive constitutive gene expression, additional promoter sequences from inducible genes may also be of interest. The Ethylene Response Factor (ERF) gene family encodes a large group of transcription factors characterized by the presence of a single AP2/ERF domain. ERF proteins play important roles in ethylene-mediated gene transcription and in a wide range of biotic and abiotic stress responses such as pathogen attack, drought tolerance, salt tolerance and low temperatures [22, 23]. The ERF genes therefore could be excellent sources for inducible promoters, which most likely contain interesting cis-regulatory elements within their sequences.
Promoter characterization typically involves the introduction and analysis of DNA constructs containing promoters fused to a reporter gene. Temporal and tissue-specific expression of the reporter gene can then be directly observed and quantified in transgenic plant tissues. Although soybean transformation was first reported many years ago [24, 25, 26], it remains consistent but inefficient  and it may not be entirely suitable for medium- to high-throughput analysis of soybean promoters. Due to this limitation, analyses of soybean promoters and their cis-regulatory elements are often performed using heterologous plant expression systems such as Arabidopsis and tobacco [9, 28, 29]. Analyses using heterologous systems have value but validation of soybean promoters in soybean [7, 8], or at least in another member of the Fabaceae family, is preferred as heterologous systems may not accurately reflect promoter strength and specificity [30, 31, 32].
For rapid analysis of promoters, transient gene expression offers many advantages and some disadvantages compared with the use of stably-transformed tissues. Transient expression can be detected as early as 2 h post DNA introduction in soybean tissues, which is quite useful for rapid estimation of promoter activity. Depending on the method for DNA introduction, different tissue types can be targeted for gene delivery, allowing increased flexibility in construct evaluations. In our laboratory, transient expression has been successfully used for evaluation of soybean promoter variants, but these evaluations were performed using lima bean (Phaseolus lunatus) cotyledons. Transient expression analysis for promoter validation using soybean cotyledons as an alternate target to lima bean cotyledons has not been previously reported.
For evaluation of constructs in stably-transformed soybean tissues, the production of hairy roots provides the most rapid and efficient method for generation of transgenic soybean tissues. Soybean hairy root cultures induced by Agrobacterium rhizogenes have been successfully used for rapid analysis of soybean cyst nematode infestation , improvement of genetic transformation efficiencies  and analysis of phenolic metabolism [28, 37]. As an alternate approach, composite plants  consisting of hairy roots on non-transgenic shoots are also useful for rapid evaluation of gene expression in stably transformed soybean tissues . Previous molecular analyses conducted on soybean hairy roots have revealed the presence of high copy number integrations [35, 36, 38], although the relationship between high copy number insertions and gene expression in hairy roots has not been reported.
With the aim of discovering unique and useful soybean promoters with potential applications in both basic research and crop improvement, we here identify, clone and validate 20 novel soybean promoters from the ubiquitin and ERF gene families. We present two different and complementary promoter validation tools based on transient expression in lima bean cotyledons and production of stably-transformed soybean hairy roots. Quantitative gene expression analysis of these 20 new soybean promoters using 2 different promoter validation tools allows us to greatly expand the toolbox of available soybean promoters.
Phylogenetic analysis of the Gmubi and GmERF genes
Phylogenetic analyses of the ERF/AP2 genes from soybean revealed a total of 371 genes, which could be annotated as AP2/ERF genes (Additional file 2). Of these, 12 genes were not incorporated into the phylogenetic tree as they were either too divergent or incorrectly predicted (Glyma01g22260.1, Glyma02g11060.1, Glyma05g07840.1, Glyma08g24110.1, Glyma11g05450.1, Glyma14g00600.1, Glyma15g25120.1, Glyma17g17010.1, Glyma18g01030.1, Glyma19g43260.1, Glyma19g43260.2, Glyma19g45390.1). A total of 359 ERF/AP2 genes were retained, including the ten chosen for this study. The soybean AP2/ERF family is broadly similar to that from other higher plant species and can be subdivided into the ERF and AP2 subfamilies (Figure 1b). Similar to Arabidopsis  and tobacco , the ERF family could be further subdivided into the DREB (groups I-V) and the ERF subfamilies (groups VI-X). One additional subfamily was apparent and may be related to the members of group VI-L and Xb-L as these proteins were omitted from both the Arabidopsis and tobacco analyses. This phylogenetic analysis provides a framework for the study of promoters from other members of the soybean AP2/ERF multigene family and illustrates the phylogenetic positions of the 10 group IX GmERF genes used for promoter isolation in this study. The GmERF1-10 genes were chosen as they are likely to be wound- and/or jasmonate-inducible based on their phylogenetic position .
Evaluation of soybean and lima bean cotyledons for transient expression analysis
To investigate if the apparent GFP diffusion visualized in soybean cotyledonary cells was related to the small size of GFP, a translational fusion of GFP::Hygromycin  was introduced into both soybean and lima bean cotyledons. Although GFP levels and the numbers of GFP-expressing cells were considerably lower than obtained earlier with the 35S-GFP introduction, GFP expression from the translational fusion remained in the targeted cells longer in both plants and was detected until over 100 h after transformation (Figure 2a-b and data not shown). Confocal microscopy of soybean cotyledons, bombarded with the GFP::Hygromycin translational fusion confirmed strong GFP expression in the cytoplasm and nuclei of targeted cells but a clear reduction of GFP levels in the adjacent cells (Figure 3). Confocal analysis of lima bean cotyledons showed high levels of GFP in the cytoplasm and nuclei of targeted cells and no detectable GFP levels in the adjacent cells (Figure 3).
Transient expression analysis of promoters using lima bean cotyledons
The transient expression profiles were mostly similar for all the Gmubi promoters regardless of promoter strength. However, GFP expression peaks for the strong Gmubi promoters appeared to be reached later than the low-expressing promoters or the CaMV35S promoter. Most of the Gmubi promoters gave rise to exceptionally high levels of transient GFP expression based on a comparison to the CaMV35S promoter; the y-axis in Figure 4 is the percent of peak CaMV35S expression. The Gmubi1, Gmubi3, Gmubi4, Gmubi5, Gmubi6, Gmubi7 and Gmubi9 promoters displayed a ~2-7-fold increase in expression over levels obtained with the CaMV35S promoter (Figure 4). The Gmubi2 and Gmubi8 promoters showed similar levels of transient GFP expression compared with the CaMV35S, while use of the Gmubi10 promoter resulted in very low levels of GFP expression.
The transient expression profiles generated for the GmERF promoters also showed a range of promoter strengths but reasonable consistency in the timing of peak expression (Figure 4). The times for peak GFP expression driven by the GmERF promoters were more variable than those observed for the Gmubi promoters but were consistently later than CaMV35S-driven GFP peak (Figure 4). Although many of the GmERF promoters resulted in lower GFP levels than the Gmubi promoters, some gave higher expression than the CaMV35S promoter. The GmERF3, GmERF5, GmERF6, and GmERF10 promoters exhibited ~1.5-2.2-times higher GFP levels compared to the CaMV35S promoter. The GmERF2, GmERF4 and GmERF7 promoters showed similar GFP levels to CaMV35S, while GmERF1, GmERF8 and GmERF9 promoters gave rise to lower levels of transient GFP expression (Figure 4).
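The normalization described above, in which each promoter's GFP readings are expressed as a percent of peak CaMV35S expression, can be sketched as follows. This is a minimal illustration rather than the authors' image-analysis code: the function names and all numeric readings are hypothetical, and only the idea of scaling to the CaMV35S peak comes from the text.

```python
def percent_of_reference_peak(series, reference):
    """Scale every reading to a percentage of the reference promoter's peak."""
    reference_peak = max(reference.values())
    return {hour: 100.0 * value / reference_peak for hour, value in series.items()}


def peak(series):
    """Return (hour, value) for the highest reading in a time series."""
    hour = max(series, key=series.get)
    return hour, series[hour]


# Hypothetical data: hours after bombardment -> mean GFP intensity (arbitrary units).
camv35s = {6: 10.0, 12: 30.0, 24: 40.0, 48: 20.0}
promoter_x = {6: 15.0, 12: 80.0, 24: 160.0, 48: 200.0, 72: 120.0}

normalized = percent_of_reference_peak(promoter_x, camv35s)
peak_hour, peak_percent = peak(normalized)
fold_over_camv35s = peak_percent / 100.0

print(f"peak at {peak_hour} h: {peak_percent:.0f}% of CaMV35S peak "
      f"({fold_over_camv35s:.1f}-fold)")
```

With these made-up readings the hypothetical promoter peaks later than CaMV35S and reaches several times its peak level, which is the kind of pattern reported for the strong Gmubi promoters.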
Stable expression analysis using soybean hairy roots
Table 1. Average of primary hairy roots expressing GFP mediated by promoter constructs (columns include the number of roots analyzed).
The GmERF promoters displayed somewhat lower GFP intensities in hairy roots than the Gmubi promoters, but some of these promoters displayed higher expression levels than the CaMV35S promoter (ANOVA, P > 0.0001, Figure 5b). The GmERF2, GmERF6 and GmERF10 promoters showed ~1.4-1.7-times higher GFP than CaMV35S (Figure 5b). The GmERF3, GmERF4 and GmERF7 promoters exhibited similar GFP levels compared to the CaMV35S promoter, whereas the GmERF1, GmERF5, GmERF8 and GmERF9 promoters directed lower levels of GFP compared to the CaMV35S promoter.
Southern hybridization analysis
Transgene copy number from the Southern hybridization analysis was also directly correlated with the GFP expression intensity displayed in the transgenic hairy roots that were used for genomic DNA extraction (Additional file 4). Hairy root events with high transgene copy number generally displayed high GFP intensities, whereas hairy root events with single or low T-DNA copy number gave low or moderate GFP intensities.
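One way to quantify a relationship of this kind is a rank correlation between copy number and GFP intensity across events. The sketch below is an illustration only: the source does not state which statistic was used, the Spearman coefficient is swapped in here as a reasonable choice for monotonic relationships, and the copy numbers and intensities are hypothetical values, not data from Additional file 4.

```python
from math import sqrt


def ranks(values):
    """Assign 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical hairy root events: T-DNA copy number and mean GFP intensity.
copy_number = [1, 1, 2, 3, 5, 7]
gfp_intensity = [12.0, 15.0, 22.0, 30.0, 41.0, 55.0]

rho = spearman(copy_number, gfp_intensity)
print(f"Spearman rho = {rho:.3f}")
```

A coefficient close to 1 would be consistent with higher copy number events showing higher GFP expression; with so few events, any such value should be read as descriptive rather than conclusive.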
Bioinformatics analysis of the Gmubi and GmERF gene families of soybean
The polyubiquitin gene family in soybean (Figure 1a) contains three moderately well characterized genes (Gmubi1, Gmubi2 and Gmubi3); however, other family members have received little to no attention. The promoters regulating these genes have likewise not been well characterized but show promise as strong constitutive promoters based on recently-reported transcriptome data [45, 46] and previous characterizations of a soybean polyubiquitin promoter (Gmubi) [7, 10], which was recloned in the current research as a slightly longer promoter and renamed "Gmubi3".
The ERF genes were classified based on their coding sequences, and particularly on the presence of the well-conserved AP2/ERF DNA-binding domain. The phylogeny of GmERF genes in this study (Figure 1b) was very similar to phylogenies previously reported for ERFs in rice, Arabidopsis and tobacco [19, 42, 47], confirming that this family of transcription factors is quite conserved among different plants. A previous phylogenetic analysis of GmERF genes revealed the presence of 98 unigenes containing a complete AP2/ERF domain in soybean; however, we here report 359 AP2/ERF-containing GmERF genes using data from the recently released soybean genome assembly, representing a significant update for this gene family in soybean.
Transient expression assays in soybean and lima bean cotyledons
Quantitative characterization of soybean promoters was rapidly assessed using both transient gene expression in lima bean cotyledons and stable expression in soybean hairy roots. We have previously reported the use of lima bean cotyledons for rapid analyses of transient gene expression  and characterization of viral suppressors of gene silencing [43, 48]. In this report, we also evaluate soybean cotyledons as a potential target tissue for rapid validation of soybean promoters. In soybean cotyledons, initial attempts to visualize GFP at the 24 hour time point, which is the peak expression time for the lima bean target , were unsuccessful as only very low levels of GFP were observed. However, use of our automated image collection system  for semi-continuous monitoring of GFP expression revealed that the GFP protein apparently diffused rapidly from the initial target cell in soybean cotyledons, leading to depletion of scorable GFP levels (Figure 2, Additional file 3). In lima bean cells, rapid diffusion of GFP was not detected in the cells surrounding the original targeted cell, although it may occur at reduced levels. The loss of the GFP protein observed using soybean cotyledons suggests that there are basic differences in the epidermal cell structures in lima beans and soybeans. Confocal microscopy indeed confirmed some major differences in the anatomy of epidermal cells (Figure 3).
Retention of GFP in the targeted cells after bombardment is preferable for gene expression analysis. The rapid loss of GFP in soybean cotyledonary cells made analysis difficult, and this target tissue is unsuitable for transient expression analysis using single time point determinations. The presence of small amounts of GFP at the 24 h time point could be misinterpreted as the absence of expression, which was not the case. Since single time point determinations at 24 h are often used for transient expression analysis using GFP and GUS, loss of transient gene expression as reported here in soybean cotyledonary tissues should be recognized as a potential problem in interpreting results. The use of dynamic semi-continuous monitoring of gene expression using our automated image collection system facilitated the detection of GFP loss from targeted cells and movement into the surrounding cells. Without semi-continuous monitoring, movement of GFP may not have been perceived.
Transient expression of GFP in cells of lima bean cotyledons was far more consistent over time compared to soybean cotyledons (Figure 2). Lima bean cotyledons therefore offer a more suitable target tissue for quantitative transient GFP expression assays. Loss of GFP from the targeted soybean cotyledonary cells was somewhat reduced through the use of a translational fusion of GFP to the hygromycin resistance gene (Figure 2, 3), resulting in production of a larger fusion protein. However, use of this translational fusion resulted in much lower apparent GFP intensities and fewer foci (Figure 2 lower panels). We have previously reported that translational fusions containing GFP give rise to considerable reductions of transient GFP intensities in lima bean cotyledonary cells, probably due to either a quenching of fluorescence by the protein partner or conformational changes in GFP as a result of an alteration of the chromophore structure [43, 48]. Although use of translational fusions can be used to minimize loss of the small GFP protein from certain target tissues, the effects of the fusion partner on GFP detection need to be considered when this approach is utilized.
Transient expression mediated by Gmubi and GmERF promoters
In this study, the Gmubi1-9 promoters were isolated from polyubiquitin genes sharing high homology (Figure 1a) but containing variable numbers of the ubiquitin-coding unit . The Gmubi1, Gmubi2, Gmubi3 and Gmubi8 contained 4 ubiquitin-coding units; the Gmubi4 and Gmubi6 contained 7 ubiquitin-coding units; and the Gmubi5, Gmubi7 and Gmubi9 contained 6, 5 and 2 ubiquitin-coding units, respectively. The Gmubi10 promoter was isolated from a more distant relative gene containing a monomeric ubiquitin-coding unit (Figure 1a). Although the Gmubi1-9 promoters gave rise to relatively high levels of gene expression, the Gmubi10 promoter displayed consistently low expression levels in both transient expression and hairy roots. All of the reports to date describing ubiquitin promoters in different plants have focused on polyubiquitin gene promoters [10, 15, 51].
The Gmubi promoters characterized here were either intron-containing or intron-less promoters. The Gmubi1-7 gene sequences contained predicted introns in the 5'-UTR, which were predicted to splice to acceptor sites generated during promoter cloning just prior to the initiation codon of the gfp coding sequence. The Gmubi8-10 gene sequences contained no predicted introns in the 5'-UTR. To our knowledge, no characterization of native intron-less plant ubiquitin promoters has been previously reported. Although in this study there were no evident differences between transient GFP expression levels mediated by the intron-containing or the intron-less Gmubi promoters, the introns within the 5'UTR of most polyubiquitin promoters quantitatively enhance transgene expression levels [51, 52].
Although most of the Gmubi promoters directed overall high expression levels, the Gmubi3 promoter gave exceptionally high levels of GFP expression. This high gene expression driven by the Gmubi3 promoter is not surprising as the Gmubi3 gene is highly active in different organs of soybean [45, 46]. We previously reported 5-fold greater transient GFP expression using a slightly truncated version of the Gmubi3 promoter (917 bp; Gmubi) compared to a CaMV35S promoter . In the present study, the Gmubi3 promoter (1438 bp) gave rise to 7-fold greater transient GFP expression compared to the same CaMV35S promoter. We have also reported that removal of the intron from the 5'UTR of the Gmubi promoter resulted in much lower levels of both transient expression in lima bean cotyledons  and stable expression in transgenic soybeans . Although the intensity of expression was altered by the removal of the intron from the 5'UTR of the Gmubi promoter, the pattern of expression remained the same. Collectively, these results indicate that the intronic and upstream regions of this promoter may contain important cis-regulatory elements responsible for high levels of expression. An in-depth functional analysis of the Gmubi3 promoter may allow the identification of specific promoter elements that lead to this high gene expression.
The transient expression profiles from the GmERF promoters (Figure 4) were similar regardless of promoter strength. However, the time of peak GFP expression for the different GmERF promoters was more inconsistent compared to the expression peaks for the Gmubi promoters. This variability in expression peaks among the GmERF promoters may be associated with the transcriptional regulation of ERF genes under conditions of stress [22, 23].
Table 2. Gene IDs for Gmubi and GmERF genes and the respective sizes of their isolated promoters
Gene expression mediated by Gmubi and GmERF promoters in hairy roots
The percentage of GFP-positive hairy roots achieved here (72%, Table 1) is substantially higher than previously reported for A. rhizogenes-induced hairy roots of soybean (50%). The development of the hairy root phenotype caused by A. rhizogenes is the result of the integration and expression of T-DNA contained in the bacterial root inducing (Ri) plasmid in the plant genome. A. rhizogenes can also transfer the T-DNA from binary vectors, leading to the formation of hairy roots with or without the binary vector T-DNA. The ratios of hairy roots with and without the binary vector T-DNA can vary tremendously across different plants.
GFP detection and analysis in hairy roots was relatively straightforward as hairy roots do not contain chlorophyll, which can otherwise interfere with GFP detection. To counteract chlorophyll interference with GFP detection, different methodologies have been developed for chlorophyll elimination in photosynthetic tissues, including exposure to alcohol, application of photobleaching herbicides or use of gene silencing to suppress the Phytoene desaturase (PDS) gene. However, chlorophyll elimination treatments are notably harsh and demand additional manipulation of tissues, which may alter transgene expression, particularly expression of inducible DNA constructs.
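Table 3. Grouping of the Gmubi and GmERF promoters based on the CaMV35S-driven GFP expression.
Promoter family | Expression level | Transient GFP expression | GFP expression in hairy roots
Gmubi | Low | 10 | 10
Gmubi | Moderate | 2, 8 | 5, 6, 8
Gmubi | High | 1, 3, 4, 5, 6, 7, 9 | 1, 2, 3, 4, 7, 9
GmERF | Low | 1, 8, 9 | 1, 5, 8, 9
GmERF | Moderate | 2, 4, 7 | 3, 4, 7
GmERF | High | 3, 5, 6, 10 | 2, 6, 10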
The GFP intensities determined for GmERF promoters using hairy roots in general also correlated with the transient GFP levels determined using lima bean cotyledons. The GmERF2, GmERF3 and GmERF5 promoters were the most inconsistent expressers in this group (moderate, high and high transient GFP expression, but high, moderate and low GFP intensities in hairy roots, respectively; Table 3). Expression directed by these promoters may be affected by the wounding or other stresses caused by tissue manipulation and particle bombardment. Further studies on these 3 promoters in stably-transformed tissues may be of particular interest to identify regulatory regions within promoters that are responsive to various stimuli.
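The comparison between the two validation systems can be made explicit by checking, promoter by promoter, whether the expression category changes. The sketch below uses the GmERF groupings reported in this article (low, moderate or high relative to CaMV35S); the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# GmERF promoter numbers grouped by expression level relative to CaMV35S,
# as reported for the two validation systems.
transient = {"low": [1, 8, 9], "moderate": [2, 4, 7], "high": [3, 5, 6, 10]}
hairy_root = {"low": [1, 5, 8, 9], "moderate": [3, 4, 7], "high": [2, 6, 10]}


def by_promoter(groups):
    """Invert {level: [promoter numbers]} into {promoter number: level}."""
    return {number: level for level, members in groups.items() for number in members}


transient_level = by_promoter(transient)
hairy_root_level = by_promoter(hairy_root)

# Promoters whose category differs between transient and stable expression.
discordant = {
    f"GmERF{number}": (transient_level[number], hairy_root_level[number])
    for number in sorted(transient_level)
    if transient_level[number] != hairy_root_level[number]
}
concordant_count = len(transient_level) - len(discordant)

for name, (in_transient, in_roots) in discordant.items():
    print(f"{name}: {in_transient} (transient) vs {in_roots} (hairy roots)")
print(f"{concordant_count} of {len(transient_level)} promoters keep the same category")
```

Run as written, this flags GmERF2, GmERF3 and GmERF5 and leaves seven of the ten GmERF promoters in the same category in both systems.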
The transient expression system reported here differs considerably from the hairy root expression system with respect to the fate of the introduced DNA and the nature of the expressing tissue. Any consistency in expression intensity between the two validation tools therefore suggests a certain robustness in promoter activity. For transient expression using particle bombardment, large amounts of DNA are introduced on each particle, and cells that express the introduced DNA usually contain a particle in, or adjacent to, the nucleus [60, 61]. Transient expression results in a rapid increase in gene expression, followed by a rapid decline (Figure 4), which has been partly attributed to gene silencing of transient expression [43, 48]. Therefore, during transient expression in lima bean cotyledonary cells, large amounts of plasmid DNA are delivered to the nucleus, which results in very high levels of extrachromosomal gene expression. Preintegrative, extrachromosomal DNAs may not be subject to the same regulatory influences as genomic DNA and this DNA may have different access to transcription factors. Nevertheless, transient expression might be a good early indicator of promoter strength in stably-transformed tissues.
Stably-expressed promoters that are introduced in soybean hairy roots are integrated into genomic DNA and expression in this tissue may more accurately reflect promoter activity in its native context. However, gene expression may also be affected by integration site and transgene copy number, as well as the status of the transgenic tissues. Although the soybean hairy root system may not be optimal for validation of some tissue-specific promoters, we have successfully used this system for validation of a large number of promoters, including promoters identified as "seed specific" (data not shown). Consistency in the intensity of gene expression using these two different validation tools suggests good stability and accurate prediction of relative promoter strengths.
Southern hybridization analysis and transgene copy number
GFP intensities were quite variable in independent primary hairy root events (Figure 6). This variation in gene expression across stably-transformed events has often been attributed to the site(s) of transgene insertion and transgene copy number [5, 34]. The insertion site, copy number and structure of integrated DNA differ, depending on the transformation method utilized. Direct transformation methods such as particle bombardment can frequently result in the insertion of large copy numbers of plasmid DNA at a single site, leading to transgene silencing [63, 64]. Gene cassettes or minimal constructs can reduce or eliminate this effect [65, 66]. On the other hand, transformation using Agrobacterium typically results in lower copy number gene introductions, which has been reported to give more consistent transgene expression [63, 64].
Our results suggest that the variability in gfp gene expression in soybean hairy roots was associated with the copy number of the introduced T-DNA. The highest GFP expression levels were associated with roots that contained the highest copy numbers of introduced DNAs. Use of Agrobacterium tumefaciens for transformation usually results in the integration of single or low T-DNA copies into the plant genome [67, 68]. Although use of A. rhizogenes can lead to high copy T-DNA integration [35, 36, 38], the relationship between high copy number integration and transgene expression has not been previously reported in hairy roots. Using Arabidopsis plants containing sequentially increasing copy numbers of a CaMV35S-driven gfp gene, Schubert et al.  demonstrated increases in GFP expression levels when up to 4 copies of a CaMV35S-driven gfp gene were present. As the copy number was increased to 5 and greater, GFP expression was suppressed. Schubert et al.  further suggested that suppression occurs once a gene expression threshold is reached and is gene-specific.
In this study, hairy roots containing up to 7 T-DNA inserts (Figure 7) displayed the highest GFP expression and did not show gene suppression. A significant correlation of high GFP expression with high copy number integration was observed with the GmERF6 and GmERF10 promoters (Additional file 4), both of which displayed higher expression levels than the CaMV35S promoter in soybean hairy roots (Figure 5b). If a threshold copy number/expression level is required to silence the gfp gene, that threshold was not reached in the transgenic hairy roots.
The use of hairy roots to validate promoter activity is a simple alternative for gauging promoter strength in stably-transformed plants, although the influence of copy number on gene expression should be considered . The transient expression analysis used in this research may nevertheless be more reflective of general promoter strength as each cell receives similar high copy numbers of each DNA construct, and hundreds to thousands of cells are collectively analyzed. As transient expression is analyzed prior to DNA integration, complications from conformational and positional effects in genomic DNA are avoided.
We report here the isolation and characterization of 20 novel soybean promoters from two different gene families, ubiquitin and ERF. A rapid quantitative evaluation of promoter strength was consistently performed in both transiently-expressing cotyledonary tissues of lima bean and stably-transformed hairy roots of soybean. We also provide novel insights towards the utilization of transient and stable expression systems for promoter validation.
Phylogenetic analysis of the Gmubi and GmERF genes
The ubiquitin genes were identified in the soybean genome assembly (accessed in April, 2009; ftp://ftp.jgi-psf.org/pub/JGI_data/Glycine_max/Glyma1/annotation/) based on the presence of the highly conserved ubiquitin-coding unit. The soybean ERF/AP2 genes were obtained from SoyDB: A Knowledge Database of Soybean Transcription Factors (http://casp.rnet.missouri.edu/soydb/) and verified using the Soybean Transcription Factor Knowledge Base (http://www.igece.org/Soybean_TF/).
The phylogenetic trees for the Gmubi and GmERF gene families were constructed with the aligned amino acid sequences using MEGA 4.0 and the Neighbor-Joining (NJ) method. For each gene family, the bootstrap consensus tree was inferred from 1000 replicates and drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method and are in the units of the number of amino acid substitutions per site.
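As an illustration of the distance measure named above, the Poisson correction converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The following is a minimal sketch only, not the MEGA implementation; the function names are hypothetical, and the handling of alignment gaps (sites with a gap in either sequence are skipped) is an assumption, since the gap treatment used in the analysis is not stated.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues between two aligned sequences.

    Assumption: sites with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p), in substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        return math.inf  # distance is undefined when every site differs
    return -math.log(1.0 - p)
```

A matrix of such pairwise distances is the input that a Neighbor-Joining procedure would cluster into a tree.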
The DNA sequences lying immediately upstream of the coding regions of 20 selected Gmubi and GmERF genes (Table 2) were PCR-amplified using specific primers (Additional file 5). Intronic regions (5' UTR) present in the coding sequences of Gmubi1-7 were also included in the 3' end of their respective cloned promoters. PCR-amplifications were conducted on genomic DNA from soybean (G. max 'Jack') using the FailSafe™ PCR Kit (EPICENTRE® Biotechnologies, Madison, WI, USA). PCR products were purified, digested and inserted into the MCS of pFLEV (Figure 8a). All sequences of cloned promoters were confirmed by DNA sequencing. Promoter-containing pFLEV constructs were used for transient expression analysis in lima bean (P. lunatus 'Henderson-Bush') cotyledons.
For construction of the binary version of the promoter constructs, the complete expression cassettes composed of promoter, gfp coding sequence and NOS terminator were excised from pFLEV using appropriate restriction enzymes and cloned into the MCS of appropriately digested pCAMBIA1300 (CAMBIA, Canberra, Australia; Figure 8b). For soybean hairy root production, pCAMBIA1300-promoter constructs were introduced into A. rhizogenes strain K599 (kindly provided by Dr. Harold Trick, Kansas State University) by the freeze-thaw method.
Transient expression analysis
Soybean (G. max 'Jack') seeds were harvested from plants grown in the greenhouse (16/8 h light:dark, 28°C) with supplemental lighting from high pressure sodium lamps. Lima bean seeds were harvested from plants grown in a growth chamber (50% relative humidity, 16/8 h light:dark, 25/23°C day/night). Both soybean and lima bean seeds were surface sterilized in a 10% (v/v) bleach solution with slow agitation for 20 min, rinsed 4-7 times with sterile water and germinated between moistened sterile paper towels contained in GA7 culture vessels.
Transient expression was initially compared in soybean and lima bean cotyledons using a 35S-driven GFP construct and a 35S-driven GFP::Hygromycin gene fusion (GFP::Hygromycin). Soybean and lima bean cotyledons were excised from 2-d-old and 4-d-old germinating seedlings, respectively. DNA constructs were precipitated onto tungsten particles and introduced into the adaxial surface of the cotyledons utilizing a Particle Inflow Gun. Bombarded cotyledons were placed adaxial side up on OMS culture medium containing MS salts, B5 vitamins, 3% sucrose and 0.2% Gelrite (pH 5.7) for GFP monitoring. Semi-continuous image acquisition was performed using an automated image collection system composed of a MZFLIII dissecting microscope (Leica, Heerbrugg, Switzerland) equipped with a "GFP-2" filter set (Excitation 480 ± 40 nm, Emission 510 nm), a Spot-RT CCD digital camera (Diagnostic Instruments Inc., Sterling Heights, MI, USA) and a robotics platform (Arrick Robotics Inc., Hurst, TX, USA). Soybean and lima bean cotyledons showing transient GFP expression were also examined 10 h post bombardment using a Leica TCS SP5 II confocal laser microscope (Leica, Heerbrugg, Switzerland). Based on the more consistent GFP expression patterns obtained using lima bean cotyledons, transient expression analysis of all 20 different cloned soybean promoters was conducted using lima bean cotyledons as the target tissue.
Quantitative analysis of transient GFP expression directed by the 20 novel soybean promoters in lima bean cotyledons was performed as previously described [10, 44]. GFP expression levels for each promoter were calculated and presented as the percentage of the peak GFP expression of the CaMV35S promoter. For each promoter construct, 5 to 9 cotyledons were bombarded and monitored for 100 h, over at least two independent experiments.
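The normalization described above, in which each promoter's GFP expression is expressed as a percentage of the peak expression of the CaMV35S promoter, can be sketched as follows. This is an illustrative outline only; the function name is hypothetical, and each time series is assumed to hold background-corrected expression values collected over the monitoring period.

```python
def percent_of_reference_peak(promoter_series, reference_series):
    """Express each time point of a promoter's GFP time series as a
    percentage of the peak value reached by the reference (CaMV35S) series.

    Both arguments are sequences of expression values, one per time point
    (an assumed data layout).
    """
    reference_peak = max(reference_series)
    if reference_peak <= 0:
        raise ValueError("reference series has no positive peak")
    return [100.0 * value / reference_peak for value in promoter_series]
```

A promoter whose series peaks above 100 under this scaling is one that exceeded the peak expression of the CaMV35S control.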
Hairy root induction and analysis
For induction of soybean hairy roots, cotyledons were inoculated as previously described with some modifications. A. rhizogenes harboring the pCAMBIA1300-promoter constructs was grown overnight in 2 ml liquid YEP (Yeast Extract Peptone) medium containing 100 mg l⁻¹ kanamycin. A. rhizogenes without the binary vector was grown in YEP medium lacking antibiotics. Soybean (G. max 'Williams82') seeds were surface-sterilized and germinated in GA7 containers as described above. After 5 d, cotyledons were excised and wounded several times on the abaxial side with a sterile scalpel dipped in the bacterial cultures. Inoculated cotyledons were cultured abaxial side up on P5 Fisherbrand® (Fisher Scientific, Pittsburgh, PA, USA) filter paper moistened with sterile distilled water. After 3 d, cotyledons were transferred to OMS medium containing 400 mg l⁻¹ Timentin for hairy root induction. Cotyledons were incubated at 25°C with a 16:8 h light:dark photoperiod under an illumination of 40 μE m⁻² s⁻¹.
GFP-expressing hairy roots (~2 cm) were excised from cotyledons and subcultured for 4 d on OMS medium containing 400 mg l⁻¹ Timentin. Root tip regions were imaged utilizing the same microscope and camera used previously for transient GFP detection but the robotics components were disabled. Image analysis of roots was performed using ImageJ software. High-resolution images (1600 × 1200 pixels) of individual root tips, ~5 mm in length, were separated into red, blue and green channels and only the green channel data were used for quantification of GFP intensity. Due to the reflection of fluorescence through the culture medium next to GFP-expressing roots, the background gray value of a 100 × 100 pixel area adjacent to each root was first subtracted from every pixel present in this channel. The threshold levels were then adjusted to segment the expressing pixels from root images and the grayscale mean value of the background-corrected channel was then determined. An average grayscale mean value from the slight background fluorescence in the green channel from hairy roots induced with A. rhizogenes without the binary vector was also determined. GFP intensity for each root was calculated by subtracting the average grayscale mean of roots induced with A. rhizogenes containing no binary vector from the green-channel grayscale mean of each transgenic GFP-expressing hairy root. For each promoter construct, 14 to 32 independent hairy root events were analyzed, over at least two independent experiments. Statistical analysis was performed using SAS 9.2 TS (SAS Institute Inc., Cary, NC, USA).
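The quantification steps above can be outlined in code. The sketch below is not the ImageJ procedure itself; it is a plain-Python approximation in which the green channel is a list of pixel rows, and the function names, the clipping of negative values to zero, and the use of a single fixed threshold are all assumptions made for illustration.

```python
from statistics import mean


def region_mean(channel, top, left, height, width):
    """Mean gray value of a rectangular region of a 2-D pixel grid."""
    return mean(
        float(channel[r][c])
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def root_gfp_intensity(green, background_box, threshold, control_mean=0.0):
    """Estimate GFP intensity for one root image.

    green          : green-channel pixel grid (list of rows)
    background_box : (top, left, height, width) of an area adjacent to the
                     root (100 x 100 pixels in the text)
    threshold      : corrected gray value above which a pixel is counted
                     as expressing (assumed to be a fixed cutoff)
    control_mean   : average grayscale mean of roots induced without the
                     binary vector
    """
    background = region_mean(green, *background_box)
    # Subtract the local background from every pixel; clipping negative
    # values to zero is an assumption.
    corrected = [[max(v - background, 0.0) for v in row] for row in green]
    # Segment expressing pixels with the threshold.
    expressing = [v for row in corrected for v in row if v > threshold]
    if not expressing:
        return 0.0  # assumed convention when no pixel passes the threshold
    return mean(expressing) - control_mean
```

For a real 1600 × 1200 image an array library would be a more practical choice; nested lists are used here only to keep the sketch free of dependencies.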
Southern hybridization analysis
Southern blot analysis was performed using genomic DNA isolated from 18 transformed hairy root events containing the gfp gene regulated by either the GmERF6 or GmERF10 promoter. Genomic DNA was extracted from lyophilized root tissues according to Murray and Thompson as modified by Fulton et al. DNAs from each independent root event (10 μg) were digested overnight with BsrGI, which cuts the T-DNA harboring the GmERF6 or GmERF10 promoter at a single site, only 10 bp from the 3' end of the gfp gene. Digested DNAs were separated on 0.8% (w/v) agarose gels and then transferred to nylon membranes (Roche Diagnostics GmbH, Indianapolis, IN, USA) as described by Sambrook et al. The hybridization probe was a 717 bp fragment of the gfp coding region amplified by PCR using the primers 5'ATGGTGAGCAAGGGCGAGGAGCTG3' and 5'TTACTTGTACAGCTCGTCCATG3'. The probe was labeled with [α-32P]-dCTP (Perkin-Elmer, Boston, MA, USA) using the Prime-It® II Random Labeling Kit (Stratagene, La Jolla, CA, USA) according to the manufacturer's instructions. The labeled probe was hybridized to the membranes and incubated overnight at 60°C. The hybridized membranes were exposed to a phosphor screen holder for 24 h and then scanned with a Storm 860 PhosphorImager™ System (Molecular Dynamics, Sunnyvale, CA, USA) for visualization of hybridization patterns.
We would like to thank Dr. Tea Meulia (MCIC/OARDC/OSU) for the technical assistance with confocal microscopy. We also thank Drs. Eric Stockinger and Leah McHale for the critical reading of this manuscript. Salaries and research support were provided by the United Soybean Board, and by State and Federal funds appropriated to The Ohio State University/Ohio Agricultural Research and Development Center. This research was partially supported by a fellowship from CONACYT, Mexico, to CMHG. Mention of trademark or proprietary products does not constitute a guarantee or warranty of the product by OSU/OARDC and also does not imply approval to the exclusion of other products that may also be suitable. Journal Article No HCS 10-09.
- 1. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, et al: The B73 maize genome: complexity, diversity, and dynamics. Science. 2009, 326: 1112-1115. 10.1126/science.1178534.
- 2. Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, et al: The Sorghum bicolor genome and the diversification of grasses. Nature. 2009, 457: 551-556. 10.1038/nature07723.
- 3. Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, et al: Genome sequence of the palaeopolyploid soybean. Nature. 2010, 463: 178-183. 10.1038/nature08670.
- 10. Chiera JM, Bouchard RA, Dorsey SL, Park E, Buenrostro-Nava MT, Ling PP, Finer JJ: Isolation of two highly active soybean (Glycine max (L.) Merr.) promoters and their characterization using a new automated image collection and analysis system. Plant Cell Rep. 2007, 26: 1501-1509. 10.1007/s00299-007-0359-y.
- 22. Stockinger EJ, Gilmour SJ, Thomashow MF: Arabidopsis thaliana CBF1 encodes an AP2 domain-containing transcriptional activator that binds to the C-repeat/DRE, a cis-acting DNA regulatory element that stimulates transcription in response to low temperature and water deficit. Proc Natl Acad Sci USA. 1997, 94: 1035-1040. 10.1073/pnas.94.3.1035.
- 23. Zhang G, Chen M, Li L, Xu Z, Chen X, Guo J, Ma Y: Overexpression of the soybean GmERF3 gene, an AP2/ERF type transcription factor for increased tolerances to salt, drought, and diseases in transgenic tobacco. J Exp Bot. 2009, 60: 3781-3796. 10.1093/jxb/erp214.
- 27. Finer JJ, Larkin KM: Genetic transformation of soybean using particle bombardment and SAAT approaches. Handbook of new technologies for genetic improvement of legumes. Edited by: Kirti P. Boca Raton, Florida: CRC Press; 2008: 103-125.
- 32. Kamo K, Blowers A, Smith F, Van Eck J, Lawson R: Stable transformation of Gladiolus using suspension cells and callus. J Am Soc Hortic Sci. 1995, 120: 347-352.
- 45. Severin A, Woody J, Bolon Y-T, Joseph B, Diers B, Farmer A, Muehlbauer G, Nelson R, Grant D, Specht J, Graham M, Cannon S, May G, Vance C, Shoemaker R: RNA-Seq Atlas of Glycine max: A guide to the soybean transcriptome. BMC Plant Biol. 2010, 10: 160. 10.1186/1471-2229-10-160.
- 65. Fu X, Duc LT, Fontana S, Bong BB, Tinjuangjun P, Sudhakar D, Twyman RM, Christou P, Kohli A: Linear transgene constructs lacking vector backbone sequences generate low-copy-number transgenic plants with simple integration patterns. Transgenic Res. 2000, 9: 11-19. 10.1023/A:1008993730505.
- 73. Zuckerkandl E, Pauling L, Bryson V, Vogel HJ: Evolutionary divergence and convergence in proteins. Evolving Genes and Proteins. New York: Academic Press; 1965, 97-166.
- 82. Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory; 2nd edition, 1989.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.