Dam mutants provide improved sensitivity and spatial resolution for profiling transcription factor binding
DamID, in which a protein of interest is fused to Dam methylase, enables mapping of protein-DNA binding through readout of adenine methylation in genomic DNA. DamID offers a compelling alternative to chromatin immunoprecipitation sequencing (ChIP-seq), particularly in cases where cell number or antibody availability is limiting. This comes at a cost, however, of high non-specific signal and a lowered spatial resolution of several kb, limiting its application to transcription factor-DNA binding. Here we show that mutations in Dam, when fused to the transcription factor Tcf7l2, greatly reduce non-specific methylation. Combined with a simplified DamID sequencing protocol, we find that these Dam mutants allow for accurate detection of transcription factor binding at a sensitivity and spatial resolution closely matching that seen in ChIP-seq.
Keywords: Dam, DamID, Tcf7l2, Transcription factor
DamID is an enzymatic assay for detecting the location of protein-DNA interactions across the genome. This technique uses the bacterial enzyme DNA adenine methyltransferase (Dam), which methylates the adenine within the sequence G–A–T–C. In E. coli, methylation by Dam marks the original genome, directing mismatch repair to newly synthesised copies instead of the original, and provides a layer of transcriptional control. DamID takes advantage of the absence of any detectable adenine methylation, or functional consequences thereof, in mammals (for evidence in other eukaryotes see [3, 4, 5]) to repurpose Dam into marking sites of protein-DNA interactions. Dam is tethered to a protein of interest such that, wherever the protein binds, any nearby GATCs will be methylated. Since methylation is a stable, covalent modification, it persists throughout DNA extraction and can be detected at any time afterwards by cleavage with adenine methylation-specific restriction enzymes: DpnI cleaves methylated GATCs and DpnII cleaves unmethylated GATCs. The protocol is completed by ligation of an adapter onto these cleaved methylation sites, amplification, and identification by next-generation sequencing (NGS) or hybridisation.
A major use of DamID is to profile protein-DNA binding under conditions unsuitable for the more widely used chromatin immunoprecipitation (ChIP). DamID’s use of a restriction digest followed by ligation and amplification, in the absence of any lossy wash steps, means it requires less starting material: a few thousand cells suffice instead of the many millions required for ChIP-seq, and the protocol is otherwise quicker and more straightforward. Detection of methylation, however, is limited by the presence of GATCs, which occur on average at 2.6 sites per kb in the mouse genome. Similarly, DamID’s use of fusion proteins avoids the need for antibodies, which are expensive to make, unavailable for many proteins, and often non-specific. (This is of interest since closely related transcription factors often bind to different locations.) The downside of requiring fusion proteins is that these are typically expressed ectopically, which can lead to aberrant protein-DNA binding due to abnormally low or high expression and limits applicability to cases where transgenic cell lines or animals can be made. Thus, DamID is a compelling alternative to ChIP-seq, especially in cases where cell number or antibody availability is limiting.
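Because GATC density sets the sampling limit described above, it can be useful to check it for a region of interest. The following is a minimal Python sketch, not part of the published protocol; the function names and the toy sequence are hypothetical. For comparison, a sequence with equal base frequencies would be expected to contain one GATC per 256 bp (about 3.9 per kb), somewhat more than the 2.6 per kb quoted for the mouse genome.

```python
import re


def gatc_positions(seq: str) -> list[int]:
    """Return 0-based start positions of every GATC in seq (case-insensitive).

    GATC is its own reverse complement, so scanning one strand is enough.
    """
    return [m.start() for m in re.finditer("GATC", seq.upper())]


def gatc_per_kb(seq: str) -> float:
    """Number of GATC sites per 1000 bp of sequence."""
    if not seq:
        return 0.0
    return 1000 * len(gatc_positions(seq)) / len(seq)


# Toy sequence for illustration only (not a real locus).
toy = "ttGATCaaaaGATCggggccccGATCtt"
print(gatc_positions(toy))         # [2, 10, 22]
print(round(gatc_per_kb(toy), 1))  # 107.1
```

On real data the same functions could be applied to a chromosome string loaded from a FASTA file.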
Despite these benefits, DamID has seen limited usage. This is due to its substantial drawbacks of high background noise and low spatial resolution, likely stemming from Dam’s high enzymatic activity. In E. coli, Dam methylates most of the genome despite being expressed at low levels. Even when fused with a DNA-binding protein, Dam still methylates many off-target sites throughout the genome, and if expressed for long enough or at a high enough level it will completely saturate the genome with methylation. The high methylation rate has made it necessary to use very low expression of the Dam fusion protein, most commonly with a leaky uninduced heat shock promoter or, more recently, through translation reinitiation [8, 10]. Greater control over the expression of Dam constructs has also been achieved using inducible systems, allowing expression within specific cells and avoiding the toxicity from high methylation in whole organisms.
Even at low expression levels, there is still substantial off-target methylation resulting in a high correlation with unfused Dam [10, 12]. The usual solution is to subtract the methylation pattern of the unfused Dam control. Any interaction effects are ignored by this strategy: processivity, competition between Dam and protein binding, and different diffusion/methylation rates of unfused Dam could all skew this normalisation. Since this background methylation occurs more strongly within open chromatin, where the majority of transcription factors bind, any non-perfect control runs the risk of removing actual transcription factor binding signal. After normalisation with unfused Dam, binding profiles obtained by DamID only modestly correlate to ChIP signal and provide lowered spatial resolution due to the spread of methylation to several kb around binding sites [8, 10]. Indeed, the most successful use of DamID avoids these limitations entirely by studying nuclear lamin associated domains, which are much larger than the spatial resolution of DamID and whose heterochromatic organisation is negatively correlated with background Dam methylation [14, 15].
Dam methylates quite quickly; strong interactions with the DNA backbone lead it to remain bound afterwards, allowing it to processively methylate several GATCs at a stretch (including the reverse complement GATC) [16, 17, 18]. Coffin et al. studied the structural basis of Dam processivity by mutating several basic residues of Dam that contact phosphates outside the active site. These mutations were found to change the balance between enzyme kinetics and DNA release, such that the rate of methylation became the slower, rate-limiting step. This has the effect of making the enzyme more likely to dissociate and diffuse away instead of continuing on to methylate nearby sites.
We hypothesised that the features of these mutants—slower methylation rate, reduced DNA binding, or less processivity—could reduce the non-specific background methylation seen in DamID. Here we screened the effect of combinations of such mutations on DamID for the transcription factor Tcf7l2 and find that in general they greatly reduce the amount of non-specific methylation. The sparser methylation required altering the existing DamID-seq protocol to detect single methylation events instead of broader regions. The end result is DamID that gives a much cleaner signal for transcription factor binding, with sensitivity and spatial resolution comparable to levels seen with ChIP-seq.
Mutant Dam protein maintains methylation sensitivity and increases specificity
Genome-wide DamID-seq protocol
Mutant Dam protein reduces background methylation and improves spatial resolution
Across the whole genome, methylation by the Dam-Tcf7l2 mutants colocalises much more strongly with Tcf7l2 ChIP-seq signal (Fig. 5). Normalising to unfused Dam controls does not improve the colocalisation of wild-type or mutant Dam-Tcf7l2 signal with Tcf7l2 ChIP-seq, instead reducing it in all constructs (Additional file 1: Fig. S1). Similarly, the two Dam-Tcf7l2 mutants also exhibit higher sensitivity for ChIP-seq signal, with far less methylation away from binding sites (Fig. 6). Dam-Tcf7l2 mutant methylation decays to half very quickly from the midpoint of the ChIP-seq peak, at 120 bp (R95A) and 160 bp (N126A), and falls to 2% of peak methylation at 1 kb away. Wild-type Dam-Tcf7l2 reaches half methylation at 580 bp away from the midpoint, and at 1 kb still methylates on average 25% as much. Since this measure is specific to the ChIP-seq sites and could be confounded by higher background methylation or differences in what actual signal it detects, we also checked whether this pattern appears in the autocorrelation of the methylation signal, that is, how similar it is between nearby segments (averaged across 100 bp bins) and hence how fast the signal varies. This supports an increase in spatial resolution with the mutants, with wild-type Dam-Tcf7l2 signal still correlated across 1–2 kb, by which point the mutant signal is uncorrelated (Additional file 2: Fig. S2). Lastly, the reduced background methylation and increased spatial resolution of the mutants put them in the range of the distribution of GATCs throughout the genome. This results in Tcf7l2 bound sites that are captured by only one methylated site and hence would be missed with the classic DamID protocol (Additional files 3 and 4: Fig. S3 and Fig. S4).
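The autocorrelation check described above can be sketched as follows. This is an illustrative Python outline using NumPy rather than the analysis code used in the study: the function names are hypothetical, bins without any GATC are set to zero as a simplifying assumption, and the two Gaussian test signals merely stand in for a broad (wild-type-like) and a focal (mutant-like) profile.

```python
import numpy as np


def bin_signal(positions, values, bin_size=100, length=None):
    """Average per-site values within fixed-width bins (empty bins become 0)."""
    positions = np.asarray(positions, dtype=int)
    values = np.asarray(values, dtype=float)
    if length is None:
        length = int(positions.max()) + 1
    n_bins = -(-length // bin_size)  # ceiling division
    idx = positions // bin_size
    sums = np.bincount(idx, weights=values, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)


def autocorrelation(signal, max_lag):
    """Pearson correlation of the signal with itself shifted by 0..max_lag bins."""
    signal = np.asarray(signal, dtype=float)
    result = []
    for lag in range(max_lag + 1):
        a = signal[: len(signal) - lag]
        b = signal[lag:]
        result.append(np.corrcoef(a, b)[0, 1])
    return np.array(result)


# Synthetic binned profiles: one broad peak and one focal peak.
x = np.arange(200)
broad = np.exp(-((x - 100) / 20.0) ** 2)
sharp = np.exp(-((x - 100) / 1.0) ** 2)

# At a lag of 5 bins the broad profile is still highly self-correlated,
# whereas the focal profile is essentially uncorrelated.
print(autocorrelation(broad, 5)[5])
print(autocorrelation(sharp, 5)[5])
```

With 100 bp bins, a lag of 10 to 20 bins corresponds to the 1–2 kb range discussed in the text.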
Here we show that four mutants of Dam (R95A, R116A, N126A, and N132A) each reduce the noise seen in DamID for the transcription factor Tcf7l2 substantially, and for two of these (R95A and N126A) we confirm that this is the case across the whole genome, resulting in less background methylation and higher spatial resolution. We strongly suspect that these conclusions will also apply to the other two mutants.
We are not sure precisely what causes the background methylation observed with wild-type Dam, and hence why these mutants show an increased signal-to-noise ratio. Based on previously reported observations of such mutations, it could be a combination of reduced methylation rate leading to only longer-lived interactions being recorded, lower processivity preventing spreading methylation, or reduced DNA binding preventing Dam from dragging its linked transcription factor to a new location. The observation that unfused Dam mutants closely resemble the wild-type Dam-Tcf7l2 favours the last of these: that wild-type Dam binds DNA strongly enough to drag Tcf7l2 to locations that Dam normally prefers. If the improved signal was instead due to disrupted processivity, then the correlation between wild-type Dam and Dam-Tcf7l2 should be stronger than that between mutant Dam and wild-type Dam-Tcf7l2. Alternatively, if the cause was a reduced methylation rate only capturing longer-lived interactions, then one would expect the mutant Dam-only samples to show less total methylation than the corresponding Dam-Tcf7l2; the opposite was observed.
A caveat to our results is that these Dam constructs were expressed from a dox-inducible promoter at a high level, in contrast to the recommended method of using low expression from a leaky uninduced promoter. It is possible that there exists a lower concentration and duration of Dam-Tcf7l2 with similar signal-to-noise properties as the N126A and R95A variants. During our initial tests of wild-type Dam-Tcf7l2, however, we found no concentration or duration of dox exposure that further improved the enrichment at Tcf7l2 bound sites (by qPCR), nor did the uninduced promoter provide detectable signal (these observations may be specific to the quick replication of mESCs diluting away methylation that is produced too slowly). Furthermore, previous studies all show low spatial resolution and high correlation between unfused Dam and fusions with transcription factors despite attempts to maintain low levels of expression.
Out of these, the most comparable is a recent DamID experiment by Cheetham et al. profiling Oct4 binding in mESCs, due to the explicit comparison to ChIP-seq and the similarly focal DNA binding of Tcf7l2 and Oct4, with ChIP-seq peaks of ~100 bp. Despite maintaining very low expression of the Dam-Oct4 fusion through translation reinitiation, a comparison with Oct4 ChIP-seq shows methylation at many disparate sites and a low spatial resolution similar to what we observe for wild-type Dam-Tcf7l2 (50% decay at > 500 bp). While a portion of these may be true Oct4 binding events, the high specificity of ChIP-seq for transcription factor binding, combined with the high correlation (median of 0.77) to unfused Dam, indicates that this is mostly driven by Dam-specific effects. This matches our observations for Tcf7l2 fused to wild-type Dam, which is more strongly correlated with unfused Dam than the N126A or R95A Dam-Tcf7l2. Thus, it seems unlikely that the increase in signal-to-noise seen with the Dam mutants is achievable through further optimisation of Dam fusion expression. More generally, the strong DNA binding and processivity of wild-type Dam [16, 17, 18] indicate that for any protein with similar or weaker DNA affinity, fusing it to Dam will result in off-target methylation regardless of the level of total methylation. Very low expression also adds a further source of cell-to-cell variability due to the stochastic fluctuations inherent with few mRNAs.
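The 50% decay distances used in this comparison can be read off an averaged methylation profile around ChIP-seq peak midpoints. The sketch below shows one way to do so in Python, assuming the profile has already been averaged over peaks and is given at increasing distances starting from the midpoint; the function name, the linear interpolation, and the example numbers are illustrative assumptions, not taken from the study.

```python
import numpy as np


def half_decay_distance(distances, profile):
    """Distance at which the profile first falls to half its value at the midpoint.

    distances: increasing distances from the peak midpoint (first entry = midpoint)
    profile:   mean methylation signal at each distance
    Returns None if the profile never reaches half of its starting value.
    """
    distances = np.asarray(distances, dtype=float)
    profile = np.asarray(profile, dtype=float)
    half = profile[0] / 2
    below = np.nonzero(profile <= half)[0]
    if below.size == 0:
        return None
    i = int(below[0])
    if i == 0:
        return float(distances[0])
    # Linear interpolation between the last sample above half and the first at or below it.
    d0, d1 = distances[i - 1], distances[i]
    p0, p1 = profile[i - 1], profile[i]
    return float(d0 + (p0 - half) * (d1 - d0) / (p0 - p1))


# Made-up profile for illustration only.
print(half_decay_distance([0, 100, 200, 300], [1.0, 0.75, 0.25, 0.1]))  # 150.0
```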
The spreading methylation of wild-type Dam spans multiple GATCs. The more localised methylation by mutant Dam, however, makes the frequency of GATCs the new limit for spatial resolution. We addressed this by developing a DamID-seq protocol that captures individual methylated sites, rather than reading out the correlation between adjacent pairs, which increases how frequently methylation is sampled across the genome; several Tcf7l2 binding sites were detected by only a single GATC. Additionally, this protocol reduces the number of steps required by using the initial ligated adapter directly for sequencing, instead of separating amplification of methylated fragments from later sequencing library preparation (as in ), and produces a more interpretable output of read count at each GATC instead of a signal smeared out into a peak. The frequency with which binding can be detected could be increased further by combining these Dam mutants with K9A, which allows Dam to methylate at sequences other than GATC, and detecting the resulting methylation by immunoprecipitation [22, 23].
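The per-GATC read count output described above can be illustrated with a short Python sketch. It assumes that each aligned read has already been reduced to the genomic coordinate of the DpnI cut site it starts from; that preprocessing step, the coordinate convention, and the function name are assumptions made for illustration and are not specified in the text.

```python
from collections import Counter


def count_reads_per_gatc(gatc_sites, read_cut_positions):
    """Count reads at each known GATC site.

    gatc_sites:         iterable of GATC coordinates on one chromosome
    read_cut_positions: cut-site coordinate inferred for each read
    Reads that do not coincide with a known GATC are ignored.
    Every GATC is reported, including those with zero reads.
    """
    site_set = set(gatc_sites)
    counts = Counter(p for p in read_cut_positions if p in site_set)
    return {site: counts.get(site, 0) for site in sorted(site_set)}


# Made-up coordinates for illustration only.
sites = [100, 350, 900, 1400]
reads = [100, 100, 350, 400, 900, 900, 900]
print(count_reads_per_gatc(sites, reads))
# {100: 2, 350: 1, 900: 3, 1400: 0}
```

Reporting zero-count sites explicitly keeps unmethylated GATCs distinguishable from regions that simply contain no GATC.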
The recommended method for dealing with background activity is to express an unfused Dam control and hope that it recapitulates the off-target methylation of the fusion construct. Interestingly, when we tried this we instead got a decrease in signal with respect to Tcf7l2 ChIP-seq. Since both background Dam methylation and transcription factor binding tend to occur within open chromatin regions, the unfused Dam control is already partially predictive of binding sites. Confounding factors, such as differences in background methylation rates between unfused Dam and Dam-Tcf7l2 due to higher diffusion of the small unfused Dam, would result in normalisation creating false negatives.
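A toy calculation shows how this kind of normalisation can produce false negatives. The Python sketch below uses a log2 ratio with a pseudocount as the normalisation; this is one common choice assumed here for illustration, and the counts are invented. It also assumes both inputs are already scaled to comparable sequencing depth.

```python
import numpy as np


def log2_ratio(fusion_counts, dam_counts, pseudocount=1.0):
    """Per-site log2 ratio of Dam-fusion signal over unfused Dam signal."""
    fusion = np.asarray(fusion_counts, dtype=float)
    dam = np.asarray(dam_counts, dtype=float)
    return np.log2((fusion + pseudocount) / (dam + pseudocount))


# Four hypothetical GATC sites:
#   0: bound site in open chromatin (high unfused Dam background)
#   1: unbound site with low background
#   2: bound site with low background
#   3: unbound site with low background
fusion = [50, 5, 50, 5]
dam = [50, 5, 5, 5]

print(log2_ratio(fusion, dam))
```

Sites 0 and 2 carry the same fusion signal, yet after normalisation site 0 has a ratio of zero and is indistinguishable from the unbound sites, while site 2 remains clearly enriched.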
A previous paper has proposed the Dam mutant L122A to increase the signal-to-noise of DamID. However, the authors report a higher correlation (~0.7) between unfused Dam and the Dam transcription factor fusion than we observe (Fig. 4) and provide no evidence for the claimed increase in signal-to-noise with the L122A mutant. Additionally, this mutant was reported to show a preference for methylating already hemimethylated sites [20, 24]. While of interest as a possible way to maintain Dam methylation through DNA replication, preferential propagation of existing methylation throughout cell division would abolish independence between individual methylation events, confounding any statistical inference.
In this study, we focused on a specific transcription factor, Tcf7l2, and showed that mutations in Dam improved detection of its binding to DNA. Owing to the absence of any unique features of Tcf7l2 (it neither binds particularly strongly nor has easy-to-predict binding), we would expect that these benefits should apply generally to other transcription factors. Since the correlation between unfused Dam mutants and wild-type Dam-Tcf7l2 suggests that off-target effects are due to strong DNA binding of Dam, rather than processivity or kinetics, DamID generally would be most reliable for strongly binding proteins, such as CTCF, pioneer factors, or Cas9, while mutant Dam would have the most benefit for more transiently binding proteins.
With the growing appreciation of cellular heterogeneity, it is of interest to study transcription factor binding in finer resolution than the bulk cell cultures or tissues that are required by ChIP-seq. DamID provides unique benefits for measuring protein-DNA interactions in such situations, as the construct can be expressed in response to certain perturbations or in specific cell types—including within a whole organism—and easily isolated later due to the persistence of adenine methylation throughout further experimental processing. Due to the presence of artefacts in ChIP-seq, DamID is also of use in independently verifying binding sites, particularly those lacking a clear motif to explain binding. Since the noisiness of DamID has been a constant barrier to applying it more broadly, we hope that these improvements to its specificity and sensitivity for transcription factor binding will aid in the development of such experiments.
Materials and methods
Embryonic stem cell culture
All experiments were done in 129P2/OlaHsd mouse embryonic stem cells (mESC), which were cultured according to previously published protocols. mESCs were maintained feeder-free on gelatin-coated plates in mESC media composed of Knockout DMEM (Life Technologies) supplemented with 15% defined foetal bovine serum (FBS) (HyClone), 0.1 mM nonessential amino acids (NEAA) (Life Technologies), Glutamax (GM) (Life Technologies), 0.55 mM 2-mercaptoethanol (β-ME) (Sigma), 1X ESGRO LIF (Millipore), 5 nM GSK-3 inhibitor XV and 500 nM UO126. Cells were regularly tested for mycoplasma.
Dam Tcf7l2 fusion constructs
Constructs were made by fusing Dam to the N-terminus of Tcf7l2 with a short flexible linker. Dam-Tcf7l2 and unfused Dam containing plasmids were integrated at one copy into mouse embryonic stem cells using a previously established p2Lox system. This puts the Dam constructs under control of a tet-responsive promoter, along with integrating a neomycin resistance gene that is selected for by culturing the cells in G418 (300 µg/mL) for 1 week.
Mutant versions of Dam and Dam-Tcf7l2 were created by electroporating in plasmids coding for Cas9 and an sgRNA targeting the middle of Dam, along with a template oligo containing the Dam sequence with each possible combination of R95A, R116A, N126A, N132A, K139A/K140A. This template also contains several noncoding mutations that disrupt the sgRNA site, such that the template is integrated by homologous recombination following a CRISPR/Cas9-induced cut within the Dam coding sequence, but is not itself cut afterwards. Individual clones were isolated by flow cytometric sorting of single cells into a few 96-well plates. After growing for a week, a portion of the cells was taken and the relevant portion of Dam amplified with primers containing a unique combination of barcodes for each well. This was sequenced on an Illumina MiSeq sequencer (paired end 150 + 150 bp) to identify which well contained which mutation.
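To give a sense of how many template variants "each possible combination" implies, the substitutions can be enumerated with a few lines of Python. This is only an illustration: it assumes K139A/K140A is treated as a single unit, as the slash in the list suggests, and it excludes the wild-type (empty) combination.

```python
from itertools import combinations

# Substitutions as listed in the text; K139A/K140A is assumed to be one unit.
SUBSTITUTIONS = ["R95A", "R116A", "N126A", "N132A", "K139A/K140A"]


def all_combinations(items):
    """Every non-empty subset of items, smallest subsets first."""
    return [
        combo
        for k in range(1, len(items) + 1)
        for combo in combinations(items, k)
    ]


variants = all_combinations(SUBSTITUTIONS)
print(len(variants))  # 31, i.e. 2**5 - 1
print(variants[:3])   # [('R95A',), ('R116A',), ('N126A',)]
```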
Dam constructs were expressed by the addition of doxycycline (500 ng/µl). Wild-type constructs were expressed for 8 h, as longer expression resulted in saturating methylation and no signal. All mutant constructs showed lower overall methylation rates and were expressed for 24 h. Beyond this, there was no further increase in methylation, presumably due to it reaching a steady state with dilution during cell division.
Genomic DNA was extracted using the PureLink kit (Invitrogen #K182001). Methylated sites were digested by DpnI (20 µl reaction, 10 U DpnI, 2 µl CutSmart buffer, 500 ng genomic DNA). Similarly, unmethylated sites were digested by DpnII (20 µl reaction, 25 U DpnII, 2 µl DpnII buffer, 500 ng genomic DNA).
Locus-specific DamID qPCR
Location of and primers for sites that are positive or negative (yet still in open chromatin) for Tcf7l2 binding
Only one barcoded adapter (i7) is included for amplification instead of both. Due to suppression PCR, only fragments with both a ligated adapter and a Nextera adapter are amplified.
Nine cycles are used for amplification instead of five (since fewer fragments are being amplified).
A higher AMPure bead concentration is used (1.6× instead of 0.6×) to ensure we capture the smaller size distribution of our fragments, which stems from the transposase’s preference for DNA ends.
TS carried out the experiments and analysis in this study and wrote the manuscript. JWKH and RS supervised the research and revised the manuscript. All authors read and approved the final manuscript.
This work is supported in part by a Human Frontier Science Program Young Investigator Grant (RGC0084/2014), the National Health and Medical Research Council of Australia (1105271), National Heart Foundation of Australia (100848), National Institute of Diabetes and Digestive and Kidney Diseases (1K01DK101684) and the National Human Genome Research Institute (1R01HG008363).
Consent for publication
The authors declare that they have no competing interests.
- 15. Kind J, Pagie L, de Vries SS, Nahidiazar L, Dey SS, Bienko M, Zhan Y, Lajoie B, de Graaf CA, Amendola M, Fudenberg G, Imakaev M, Mirny LA, Jalink K, Dekker J, van Oudenaarden A, van Steensel B. Genome-wide maps of nuclear lamina interactions in single human cells. Cell. 2015;163(1):134–47.
- 17. Horton JR, Liebert K, Hattman S, Jeltsch A. Transition from nonspecific to specific DNA Interactions along the substrate-recognition pathway of Dam methyltransferase. New York. 2009;121(3):349–61.
- 25. ENCODE Project Consortium. An Integrated Encyclopedia of DNA Elements in the Human Genome. Nature. 2012;489(7414):57–74.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.