Directed PCR-free engineering of highly repetitive DNA sequences
Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, and polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, which results in unspecific products.
For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays.
Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.
Keywords: Repetitive Sequence, Polyglutamine, Magnesium Chloride, Potassium Acetate, Sumo Protease
Expansions of DNA repeat sequences are associated with many inherited neurodegenerative diseases [1, 2, 3]. One of the best-studied examples for a trinucleotide expansion disease is the neurological disorder Huntington's chorea, where the accumulation of CAG triplets within the first exon of the gene encoding the Huntingtin (Htt) protein leads to an elongated polyglutamine (Poly-Q) stretch in the polypeptide. It has been shown that more than 36 consecutive glutamine residues are pathogenic as they promote Htt aggregation into amyloid-like fibrils. The translation product of the first exon of the htt gene was previously used to study the aggregation behavior of Poly-Q proteins in vitro. In order to investigate the influence of the length of the Poly-Q stretch on aggregation kinetics, we wanted to clone reporter constructs containing defined numbers of glutamine residues. We developed a PCR-free cloning strategy allowing us to create repetitive DNA sequences encoding glutamine stretches of defined length. These sequences were used to generate improved constructs for filter retardation assays to study Huntingtin aggregation in vitro.
Several methods have been described for cloning long DNA repeat tracts. However, most of them include PCR-based amplification steps, generate imperfect repeats, or result in a pool of clones that differ in the number of repeats [6, 7, 8, 9, 10, 11, 12, 13]. As a consequence, additional effort is required to identify and isolate clones with the desired length of nucleotide repeats. Therefore, we developed a simple multi-cycle cloning strategy using synthetic oligonucleotides and Type IIS restriction endonucleases to engineer highly repetitive DNA fragments. Our approach is PCR-free and generates exclusively clones carrying the desired length of repeat sequences in each step. Moreover, it is not only suitable to multiply the length of existing repetitive sequences but also to vary the number of inserted repeats in every elongation cycle. Finally, we can easily recombine constructs of the same or different repeat lengths to accelerate the construction of the desired number of repeats.
Results and discussion
The antiparallel Q-block oligonucleotides described above were commercially synthesized, annealed, and subcloned via the BsaI and SacI restriction sites into a vector containing a unique BsaI site (Figure 1B and 1C). The following elongation steps of the Poly-Q encoding sequence only require the double-stranded oligonucleotides treated with the appropriate enzymes as described above and a cloning vector without BsmBI sites. Hence, we performed the elongation cycles in the plasmid pMK1 from which all BsmBI sites were removed by site-directed mutagenesis. The initial introduction of the subcloned Q-block fragment into pMK1 resulted in the plasmid pMK1-Q11 (Figure 1C, upper panel). As a consequence, this vector now contains a single BsmBI site downstream of the glutamine-encoding triplets, which could be used for the further elongation of the Q-block (Figure 1C). Next, pMK1-Q11 was digested with the enzymes BsmBI and SacI. The resulting cohesive ends were compatible with the overhangs of the double-stranded Q-block oligonucleotides cut by BsaI and SacI (Figure 1C). Ligation of the insert with pMK1-Q11 gave rise to pMK1-Q20. Thereby a unique BsmBI site was re-introduced downstream of the elongated Poly-Q encoding region (Figure 1C). As the cohesive ends of the annealed and digested oligonucleotides are only compatible with the doubly-cut plasmid, multiple insertions were efficiently prevented. Using this strategy the directed elongation of the Q-block was achieved by repeated cycles of digestion and ligation.
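The elongation cycle can be sketched on the top strand alone. The following Python model is an illustration under stated assumptions, not the authors' software: it takes the forward Q-block oligonucleotide given in the Methods, uses the published cut offsets of BsaI and BsmBI (1 nt downstream of the site on one strand, 5 nt on the other), and replaces the pMK1 backbone by two short hypothetical flanks (LEFT_FLANK, RIGHT_FLANK). All function names are invented for this sketch.

```python
# Top-strand model of one elongation cycle; all sequences are written 5'->3'.
BSAI = "GGTCTC"      # BsaI site: top strand cut 1 nt, bottom strand 5 nt downstream
BSMBI_RC = "GAGACG"  # BsmBI site (CGTCTC) as it reads on the top strand
SACI = "GAGCTC"      # SacI cuts the top strand after GAGCT

# Forward Q-block oligonucleotide from the Methods, upper-cased.
Q_BLOCK = "ccagtgGGTCTCaCAGCAACAGCAACAGCAGCAGCAACAGCAGcagagacgGAGCTCgatc".upper()

# Hypothetical flanks standing in for the pMK1 backbone (assumption).
LEFT_FLANK = "AAGGAGATATACATATG"
RIGHT_FLANK = "CGATCC"  # starts with C so that the SacI site is restored


def bsai_saci_insert(block: str) -> str:
    """Top strand of the Q-block after BsaI + SacI digestion."""
    start = block.index(BSAI) + len(BSAI) + 1  # skip the site and one spacer nt
    end = block.index(SACI) + 5                # keep GAGCT
    return block[start:end]


def elongate(plasmid: str, block: str = Q_BLOCK) -> str:
    """Digest the plasmid with BsmBI + SacI and ligate one Q-block."""
    if plasmid.count(BSMBI_RC) != 1:
        raise ValueError("plasmid must carry exactly one BsmBI site")
    site = plasmid.index(BSMBI_RC)
    overhang = plasmid[site - 5:site - 1]      # 4-nt cohesive end left on the vector
    insert = bsai_saci_insert(block)
    if insert[:4] != overhang:
        raise ValueError("cohesive ends are not compatible")
    sac = plasmid.index(SACI)
    return plasmid[:site - 5] + insert + plasmid[sac + 5:]


def count_gln_codons(plasmid: str) -> int:
    """Count consecutive CAG/CAA codons directly after the left flank."""
    pos, n = len(LEFT_FLANK), 0
    while plasmid[pos:pos + 3] in ("CAG", "CAA"):
        n += 1
        pos += 3
    return n


# Toy equivalent of pMK1-Q11: flanks plus one digested Q-block.
Q11 = LEFT_FLANK + bsai_saci_insert(Q_BLOCK) + RIGHT_FLANK

if __name__ == "__main__":
    construct = Q11
    for cycle in range(5):
        print(cycle, count_gln_codons(construct))
        construct = elongate(construct)
```

In this model the BsmBI cut removes the last two glutamine codons of the existing tract and the incoming block restores them together with nine new ones, so the toy Q11 construct becomes Q20 after one cycle and the single BsmBI site reappears downstream of the elongated tract.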
Our approach is also suitable to increase the length of repetitive sequences by multiple insertions as described previously. Thereby, simultaneous digestion of the annealed oligonucleotides or pMK1-Qn plasmids with both Type IIS restriction enzymes would result in Q-block fragments with compatible overhangs (Figure 1A). This would allow for ligation of multiple Q-blocks in a single step.
To study the aggregation kinetics of Poly-Q proteins in vitro, we applied a widely used filter retardation assay, which allows the detection and quantification of small amounts of Poly-Q containing aggregates. This assay is based on the characteristic of Poly-Q fibrils to be insoluble in solutions containing 2% sodium dodecyl sulfate (SDS). Such fibrils are specifically retained on a cellulose-acetate filter, whereas the soluble monomeric species are denatured by SDS and filtered through the membrane. The captured aggregates can then be visualized by immunodetection. For such experiments fusion constructs were commonly used carrying an N-terminal glutathione S-transferase (GST) domain fused to the first exon of htt via a factor Xa cleavage site. In these constructs the globular GST domain prevents aggregation of the monomers. However, aggregation can be induced by release of the GST moiety upon proteolytic factor Xa cleavage. Based on this principle we designed a new construct in which the first exon of the htt gene was fused to a SUMO domain (Saccharomyces cerevisiae Smt3) carrying an N-terminal His6-tag for affinity purification (Figure 6A). Importantly, the His6-SUMO domain keeps the purified Poly-Q containing fusion proteins soluble and allows for the controlled induction of aggregation by treatment with the Ulp1 protease (SUMO protease from S. cerevisiae), which cleaves behind the two conserved C-terminal glycine residues of folded SUMO.
Using the His6-SUMO fusion strategy the Poly-Q proteins can be easily affinity purified under denaturing conditions using silica-based Ni2+-matrices (Figure 6B). As Poly-Q proteins often tend to form insoluble inclusions when expressed in E. coli, we purified our constructs under denaturing conditions using 6 M guanidine hydrochloride and refolded them on the affinity matrix by slowly decreasing the concentration of the denaturant. The efficiency of the refolding process was monitored by digestion of the purified and soluble Poly-Q fusion proteins with Ulp1 (Figure 6C). Importantly, this protease does not recognize a specific amino acid sequence but rather the native structure of the SUMO-domain. Under our experimental conditions the purified fusion proteins were almost completely digested by Ulp1 within one minute, indicating that refolding of the constructs was successful. Therefore, we performed the filter retardation assay with the Poly-Q fusion proteins and induced aggregation by Ulp1 mediated cleavage. Figure 6D exemplifies the assay with our construct containing 47 glutamine residues. In the control reaction, where Ulp1 was omitted, no aggregation was observed over six hours beyond marginal background signals. By contrast, significant amounts of SDS-resistant Poly-Q fibrils were formed after one hour upon induction of SUMO cleavage. Accordingly, release of His6-SUMO from the Poly-Q constructs occurs much faster than fibril formation (Figure 6C and 6D). Thus, our His6-SUMO fusion strategy provides a useful tool to study the kinetics of Poly-Q aggregation in vitro.
As mentioned above, several other seamless cloning strategies were described previously, indicating the demand for such methods. However, our approach has several advantages compared to existing methods. Here we used synthetic oligonucleotides to overcome the limitations of PCR-based amplification of repetitive DNA sequences and combined this strategy with seamless cloning using Type IIS restriction enzymes. The usefulness of this method was demonstrated by means of two frequently occurring problems: the cloning of defined stretches of Poly-Q encoding sequences and the generation of polyadenine sequences for in vitro transcription of polyadenylated mRNA. We chose the latter application because we wanted to demonstrate that the technique is suitable for generating fully homogenous nucleotide sequences of defined length. Another important advantage is that our method is directed. This means that in each elongation cycle, the number of nucleotides added to the repetitive sequence can be determined precisely by simply designing the oligonucleotides accordingly. Importantly, a single product is obtained in each step. Other methods often generate a mixture of products due to multiple insertions. Consequently, products of the right length or sequence have to be selected or purified via agarose gels, which makes the procedure more complicated, time-consuming, and error-prone. As we did not observe multiple insertions and only rarely religations of the vector, this is not necessary using our technique. In addition, the protocol presented here is straightforward. One round of elongation can be done within one day, including restriction digestion of the vector, ligation, and transformation. Bacterial growth and sequencing require approximately two further days. We obtained approximately 100-300 clones per transformation and in most cases it was sufficient to sequence a single clone.
Moreover, combining pre-existing constructs can rapidly increase the length of the repetitive sequence (Figure 2 and 3). For example, once the method was established, the pMK1-constructs could be generated within three weeks including sequencing (Figure 3A). We did not encounter a limitation of the number of repeats for our purposes.
In general, our method is cheap because only the synthetic oligonucleotides have to be purchased for each application. We usually ordered 0.02 μmol of each commercially synthesized, HPLC-purified oligonucleotide. The costs therefore depend on the list price per nucleotide, purity, scale, and length of the oligonucleotides. Once annealed and digested, the oligonucleotides can be used for all subsequent elongation rounds and do not have to be re-ordered. All other materials (e.g. restriction enzymes, agarose gels, competent cells, DNA ligase) are standard equipment for molecular cloning. It should be mentioned that in principle the strategy could also be used to assemble non-repetitive sequences. In this case, the DNA fragments could even be amplified by PCR with primers carrying the Type IIS and Type IIP restriction sites in the 5' overhang regions. Finally, it should also be possible to fuse different repetitive sequences in a directed and seamless manner (Figure 5).
In summary, our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning techniques. Besides their applications in basic research, as exemplified here, repetitive nucleotide sequences become increasingly important in biopolymer technology. As artificial proteins with unique physical properties such as elastomeric polypeptides and synthetic silk fibers often contain multiple amino acid repeats, new strategies are required that improve the cloning of synthetic genes for the production of protein-based polymeric materials in biological systems [22, 23]. The method presented here is both cheap and fast, and can be easily adapted to produce any desired DNA sequences for a wide range of applications.
All cloning steps were performed according to standard protocols [24]. Restriction enzymes were named according to [25]. HPLC-purified synthetic Q-block (5'-ccagtgGGTCTCaCAGCAACAGCAACAGCAGCAGCAACAGCAGcagagacgGAGCTCgatc-3' and 5'-gatcGAGCTCcgtctctgCTGCTGTTGCTGCTGCTGTTGCTGTTGCTGtGAGACCcactgg-3') and poly(A) oligonucleotides A31 (5'-gccgAAGCTTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaGAGACCcactgg-3' and 5'-ccagtgGGTCTCttttttttttttttttttttttttttttttttAAGCTTcggc-3') and A32 (5'-ccagtgGGTCTCAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaAGAGACGacagCTCGAGgccg-3' and 5'-cggcCTCGAGctgtCGTCTCTttttttttttttttttttttttttttttttTGAGACCcactgg-3') were obtained from Thermo Fisher Scientific. For annealing, 10 pmol of the antiparallel oligonucleotides were mixed in a final volume of 100 μl. The reaction mixture was heated to 95°C for 5 min, incubated for 10 min at 55°C, and finally cooled down to room temperature. The annealed oligonucleotides were digested with the respective restriction endonucleases (NEB) according to the manufacturer's protocol. Next, the DNA was precipitated by addition of 1/10 volume 3 M sodium acetate pH 5.2 and 2.5 volumes of 100% (v/v) ethanol (p.a.), followed by incubation at -20°C for at least 12 h. The DNA was recovered by centrifugation at 16000 × g for 30 min at 4°C and the DNA pellets were washed with 70% (v/v) ethanol. The DNA pellets were dried at room temperature and subsequently resuspended in 100 μl sterile water. Aliquots of the double-stranded digested oligonucleotides were stored at -20°C and used for the subsequent cloning cycles to elongate the repetitive sequences. Between 0.1 μl and 5 μl of the digested oligonucleotides were ligated with 30-70 ng linearized and dephosphorylated plasmid DNA and T4 DNA ligase (Fermentas) in a 20 μl reaction mixture according to the manufacturer's instructions. The ligations were carried out at 23°C for 2 h and transformed into 100 μl RbCl2-competent E. coli cells (strain: DH5αZ1).
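Because the two strands of each oligonucleotide pair must anneal over their full length, it is worth checking before ordering that one strand is the exact reverse complement of the other. The short Python sketch below does this for the Q-block pair listed above; the helper names are illustrative and not part of the original protocol.

```python
# Check that two oligonucleotides form a fully complementary antiparallel pair.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, preserving upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def is_antiparallel_pair(forward: str, reverse: str) -> bool:
    """True if `reverse` is the reverse complement of `forward` (case-insensitive)."""
    return reverse_complement(forward).upper() == reverse.upper()


# Q-block oligonucleotides as given in the Methods (5'->3').
Q_FWD = "ccagtgGGTCTCaCAGCAACAGCAACAGCAGCAGCAACAGCAGcagagacgGAGCTCgatc"
Q_REV = "gatcGAGCTCcgtctctgCTGCTGTTGCTGCTGCTGTTGCTGTTGCTGtGAGACCcactgg"

if __name__ == "__main__":
    print("Q-block pair complementary:", is_antiparallel_pair(Q_FWD, Q_REV))
    # Positions of the recognition sites on the forward strand (0-based).
    for name, site in (("BsaI", "GGTCTC"), ("BsmBI (reverse)", "GAGACG"), ("SacI", "GAGCTC")):
        print(name, Q_FWD.upper().find(site))
```

The same check can be applied to the poly(A) pairs, where a miscounted homopolymer run is easy to overlook by eye.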
Plasmid preparation, gel extraction, and DNA purification kits were obtained from Qiagen. All DNA fragments originating from plasmid DNA were separated by agarose gel electrophoresis and extracted using the QIAquick Gel Extraction Kit (Qiagen). All plasmids generated during this study were verified by DNA sequencing (GATC, Germany).
To generate the poly(A) constructs the protocol was adapted due to the lower melting temperature of the poly(A) oligonucleotides. The antiparallel oligonucleotides were mixed as described above, heated to 95°C, and annealed by a continuous temperature gradient of 0.1°C per second to 4°C. To avoid any denaturation of poly(A) stretches the annealed oligonucleotides as well as vectors containing poly(A) regions were digested with the Type IIS restriction enzymes overnight at 37°C. Restriction enzymes were removed by phenol/chloroform extraction according to standard protocols and the DNA was recovered by ethanol precipitation as described above. The resulting poly(A) constructs were analyzed by restriction digestions and the lengths of the poly(A) stretches up to 100 base pairs could be determined precisely by sequencing (GATC, Germany).
Expression and purification of Poly-Q proteins
The pLANA-Qn constructs (Figure 6A) were transformed into the E. coli strain BL21(DE3). A two liter LB-culture was inoculated with 20 ml of a stationary overnight culture and grown to an OD600 nm of 0.6-0.8 at 30°C. Expression was induced by the addition of 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). After 5 h cells were harvested by centrifugation and flash-frozen in liquid nitrogen. The cell pellets were resuspended in lysis buffer (30 mM HEPES-KOH pH 7.4, 500 mM potassium acetate, 5 mM magnesium chloride, 5% (v/v) glycerol, 1 mM β-mercaptoethanol, 1 mM phenylmethylsulphonyl fluoride (PMSF), 1 × protease inhibitor cocktail (Roche)) and lysed by French press. After centrifugation at 30000 × g for 30 min at 4°C the pellet was resuspended in denaturation buffer 1 (6 M guanidine hydrochloride (GdnHCl), 30 mM HEPES-KOH pH 7.4) and incubated with 2.5 g Ni-IDA silica matrix (Protino; Macherey-Nagel) for 30 min at 4°C while rotating. The matrix was washed twice with 40 ml denaturation buffer 1, twice with 40 ml denaturation buffer 2 (4 M GdnHCl, 30 mM HEPES-KOH pH 7.4), twice with 40 ml denaturation buffer 3 (2 M GdnHCl, 30 mM HEPES-KOH pH 7.4), twice with 40 ml high salt buffer (30 mM HEPES-KOH pH 7.4, 1 M potassium acetate, 5 mM magnesium chloride, 5% (v/v) glycerol, 1 mM β-mercaptoethanol), and finally four times with 40 ml low salt buffer (30 mM HEPES-KOH pH 7.4, 50 mM potassium acetate, 5 mM magnesium chloride, 5% (v/v) glycerol, 1 mM β-mercaptoethanol). Proteins were eluted four times with 10 ml elution buffer (30 mM HEPES-KOH pH 7.4, 50 mM potassium acetate, 5 mM magnesium chloride, 5% (v/v) glycerol, 1 mM β-mercaptoethanol, 250 mM imidazole-HCl pH 8.0). The purest fractions were dialyzed against 5 l low salt buffer for 3 h. To avoid aggregation the purified proteins were diluted, frozen in liquid nitrogen, and stored at -80°C.
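For planning buffer preparation, the on-column wash series can be written down as data. The following Python sketch only restates the volumes and GdnHCl concentrations given above; the class and function names are illustrative, and the volume used to resuspend the pellet is not included because it is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashStep:
    buffer: str
    gdnhcl_molar: float  # GdnHCl concentration in mol/l
    repeats: int
    volume_ml: float


# Wash series as described in the purification protocol.
WASH_SERIES = [
    WashStep("denaturation buffer 1", 6.0, 2, 40.0),
    WashStep("denaturation buffer 2", 4.0, 2, 40.0),
    WashStep("denaturation buffer 3", 2.0, 2, 40.0),
    WashStep("high salt buffer", 0.0, 2, 40.0),
    WashStep("low salt buffer", 0.0, 4, 40.0),
]


def volume_per_buffer(steps):
    """Total wash volume (ml) needed of each buffer."""
    return {s.buffer: s.repeats * s.volume_ml for s in steps}


def denaturant_never_increases(steps):
    """True if the GdnHCl concentration only decreases or stays constant."""
    conc = [s.gdnhcl_molar for s in steps]
    return all(a >= b for a, b in zip(conc, conc[1:]))


if __name__ == "__main__":
    volumes = volume_per_buffer(WASH_SERIES)
    for name, ml in volumes.items():
        print(f"{name}: {ml:.0f} ml")
    print("total wash volume:", sum(volumes.values()), "ml")
```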
The purified proteins were analyzed by SDS-PAGE and Coomassie staining as well as immunoblotting using FLAG antibodies (ANTI-FLAG, Sigma).
Filter retardation assay
The Q-containing fusion proteins were thawed on ice and diluted with low salt buffer (30 mM HEPES-KOH pH 7.4, 50 mM potassium acetate, 5 mM magnesium chloride, 1 mM β-mercaptoethanol, 10% (v/v) glycerol) to a final concentration of 0.02 mg/ml. Insoluble material was removed by centrifugation at 16000 × g for 5 min at 4°C. The supernatant was divided into two samples, which were directly used for the filter retardation assay. The SUMO protease was added to one of the samples to a final concentration of 10 μg per mg substrate protein. The control reaction contained BSA instead of protease. The reactions were incubated at 22°C in a final volume of 350 μl. At the indicated time points 50 μl samples (1 μg fusion protein) were taken and mixed 1:1 with stop solution (4% (w/v) SDS, 100 mM dithiothreitol) followed by incubation at 95°C for 5 min. Under these conditions monomeric Poly-Q proteins are denatured whereas Poly-Q aggregates remain intact. Subsequently, the samples were applied to a Dot-blot filtration unit and filtered through a cellulose-acetate membrane (0.2 μm pore size) equilibrated with 2% (w/v) SDS in TBS (100 mM Tris-HCl pH 8.0, 150 mM sodium chloride). After washing with 0.1% (w/v) SDS in TBS the membrane was blocked in TBS-T (100 mM Tris-HCl pH 8.0, 150 mM sodium chloride, 0.05% (v/v) Tween-20) containing 5% (w/v) non-fat dried milk. The aggregates were detected by immunoblotting using primary FLAG-antibodies and alkaline-phosphatase coupled to anti-rabbit IgG (1:10000, Sigma).
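The quantities in this assay follow from a few simple conversions, sketched below in Python. The calculation assumes that the substrate is still at 0.02 mg/ml in the 350 μl reaction, i.e. that the volume of added protease or BSA and any loss during centrifugation are negligible; the function names are illustrative.

```python
def mass_ug(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Protein mass in µg; 1 mg/ml is numerically equal to 1 µg/µl."""
    return conc_mg_per_ml * volume_ul


def protease_ug(substrate_ug: float, ratio_ug_per_mg: float = 10.0) -> float:
    """Protease mass for a given substrate mass at `ratio_ug_per_mg` µg per mg."""
    return substrate_ug / 1000.0 * ratio_ug_per_mg


def after_one_to_one_mix(stock_concentration: float) -> float:
    """Concentration after mixing a stock 1:1 with a sample lacking the component."""
    return stock_concentration / 2.0


SUBSTRATE_MG_PER_ML = 0.02
REACTION_UL = 350.0
SAMPLE_UL = 50.0

sample_protein_ug = mass_ug(SUBSTRATE_MG_PER_ML, SAMPLE_UL)      # per time point
reaction_protein_ug = mass_ug(SUBSTRATE_MG_PER_ML, REACTION_UL)  # per reaction
reaction_protease_ug = protease_ug(reaction_protein_ug)
final_sds_percent = after_one_to_one_mix(4.0)   # stop solution: 4% (w/v) SDS
final_dtt_mm = after_one_to_one_mix(100.0)      # stop solution: 100 mM DTT
max_time_points = int(REACTION_UL // SAMPLE_UL)

if __name__ == "__main__":
    print(f"protein per sample: {sample_protein_ug:.2f} µg")
    print(f"protease per reaction: {reaction_protease_ug * 1000:.0f} ng")
    print(f"SDS after stop solution: {final_sds_percent:.1f}% (w/v)")
    print(f"DTT after stop solution: {final_dtt_mm:.0f} mM")
    print(f"samples available per reaction: {max_time_points}")
```

Under these assumptions each 50 μl sample contains 1 μg fusion protein, as stated in the protocol, and the 1:1 dilution of the stop solution gives the 2% SDS on which the filter retardation assay relies.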
We thank Janine Kirstein-Miles, Rainer Nikolay, and Christina Schlatterer for proofreading the manuscript. This work was supported by fellowships of the Konstanz Research School Chemical Biology to M. K., A. S. and S. P. and a grant of the DFG (DE-783) to E. D.
1. Scherzinger E, Lurz R, Turmaine M, Mangiarini L, Hollenbach B, Hasenbank R, Bates GP, Davies SW, Lehrach H, Wanker EE: Huntingtin-encoded polyglutamine expansions form amyloid-like protein aggregates in vitro and in vivo. Cell. 1997, 90 (3): 549-558. 10.1016/S0092-8674(00)80514-0.
7. Ordway JM, Detloff PJ: In vitro synthesis and cloning of long CAG repeats. Biotechniques. 1996, 21 (4): 609-610, 612.
11. Michalik A, Kazantsev A, Van Broeckhoven C: Method to introduce stable, expanded, polyglutamine-encoding CAG/CAA trinucleotide repeats into CAG repeat-containing genes. Biotechniques. 2001, 31 (2): 250-252, 254.
12. Dorsman JC, Bremmer-Bout M, Pepers B, van Ommen GJ, Den Dunnen JT: Interruption of perfect CAG repeats by CAA triplets improves the stability of glutamine-encoding repeat sequences. Biotechniques. 2002, 33 (5): 976-978.
24. Sambrook J, Russell DW: Molecular cloning: a laboratory manual. 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press, 3rd edition.
25. Roberts RJ, Belfort M, Bestor T, Bhagwat AS, Bickle TA, Bitinaite J, Blumenthal RM, Degtyarev S, Dryden DT, Dybvig K, et al: A nomenclature for restriction enzymes, DNA methyltransferases, homing endonucleases and their genes. Nucleic Acids Res. 2003, 31 (7): 1805-1812. 10.1093/nar/gkg274.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.