Expanding the diversity of DNA base modifications with N6-methyldeoxyadenosine
Vertebrate DNA is subject to epigenetic base modifications that have been thought to be limited to methylated and other modified forms of cytidine. A recent study shows that methylation of adenine to form N6-methyladenine is a rare but readily detectable modification that can be mapped to distinct genomic sites in vertebrates.
Keywords: m6dA site, lower organism, demethylation pathway, cytidine nucleotide, m6dA function
Epigenetic modifications expand the information content of DNA and have long been known to exist in the genomes of diverse organisms. The best-studied base modification in vertebrates and other higher eukaryotes is methylation of the C-5 position of cytosine residues, which forms the 5-methylcytidine (m5C) nucleotide. m5C is often detected at CpG dinucleotides, where it influences transcription by recruiting repressive m5C-binding proteins or preventing the binding of transcription factors. Importantly, cytidine methylation is a reversible modification, and both methylation and demethylation pathways contribute to dynamic regulation of m5C signatures, which control both developmentally regulated and activity-dependent gene expression programs. Demethylation of m5C residues has been shown to involve the formation of oxidized intermediates, such as 5-formylcytidine and 5-hydroxymethylcytidine, which also contribute to gene expression control. Thus, the cytidine nucleotide has long been considered to be the major site of DNA base modifications contributing to transcriptional regulation in vertebrates and other eukaryotes.
Unlike higher eukaryotes, many bacteria are known to contain additional DNA base modifications. These include C-4-methylated cytosine (4mC), as well as methylated adenine (N6-methyladenine, or 6mA). The methylated base, 6mA, or the corresponding nucleotide N6-methyldeoxyadenosine (m6dA), is an important component of the restriction/modification system used to defend against bacteriophage invasion. This system uses the methylation of host cell DNA to protect it against cleavage by restriction endonucleases, while enabling invading, unmethylated genomic material to be cleaved. In addition to its role in host defense pathways, m6dA is also an important regulator of DNA replication, repair, and transcriptional control in prokaryotes.
A new addition to the vertebrate epigenome
Although m6dA is a readily detectable feature of bacterial genomes, it has been more difficult to definitively establish its presence in the genomes of higher organisms. This contrasts with mRNA, where mapping of the ribonucleotide equivalent of m6dA, N6-methyladenosine (m6A), has identified m6A sites in thousands of mammalian mRNAs. Early studies aimed at detecting m6dA in the DNA of higher organisms failed to uncover evidence for the existence of this modification. Although such studies formed the basis for the belief that mammalian DNA is devoid of m6dA, they were done over 40 years ago and were hampered by low sensitivity (limit of detection approximately 0.01 %). In contrast, more recent studies have challenged the notion that eukaryotic DNA does not contain m6dA. In 2006, Wion and colleagues used liquid chromatography coupled with mass spectrometry to interrogate the mouse genome for the presence of m6dA and detected very low levels of m6dA (fewer than 1 m6dA per 10^6 nucleotides). Although highly sensitive, such approaches can also lead to artifacts, since trace amounts of bacterial contamination in genomic preparations can result in the detection of m6dA owing to the high levels of m6dA in bacterial genomes. Indeed, bacterial contamination is commonplace in mammalian culture systems, and bacteria are either commensals or part of the diet of lower organisms such as Caenorhabditis elegans and Drosophila.
A major advance came with the development of m6dA mapping techniques in non-vertebrate organisms. Three recent studies used global m6dA profiling methods to identify m6dA in the genomes of Chlamydomonas reinhardtii, Drosophila melanogaster, and C. elegans, with levels of m6dA ranging from 0.4 % to 0.001 % of total adenine residues within these genomes [6, 7, 8]. Since these methods identified m6dA within a genomic sequence, the m6dA could be definitively assigned to the genome of the organism under study rather than attributed to bacterial contamination. These studies have been an important advance in our understanding of the repertoire of epigenetic modifications in higher organisms.
The outstanding question was whether m6dA is also found in vertebrates, including humans. A recent study from Gurdon and colleagues provides the first analysis of vertebrate m6dA residues genome-wide. The study, published in Nature Structural & Molecular Biology, examined the genomes of frogs, mice, and human cells using ultra-high-performance liquid chromatography with tandem mass spectrometry and revealed low levels of m6dA in all three genomes (approximately 1 m6dA for every 1.2 × 10^6 deoxyadenosine residues, or 0.00009 % of deoxyadenosine residues).
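To put these abundances on a common scale, the short Python sketch below converts a "1 modified base per N residues" figure into a percentage and compares it with the approximately 0.01 % detection limit of the early studies mentioned above. It is only an illustrative calculation: the function and variable names are hypothetical, and the inputs are the approximate values quoted in the text.

```python
def per_residue_to_percent(residues_per_mark: float) -> float:
    """Convert '1 modified base per N residues' into a percentage of residues."""
    return 100.0 / residues_per_mark


# Approximate vertebrate level quoted in the text:
# 1 m6dA per 1.2 x 10^6 deoxyadenosine residues.
vertebrate_pct = per_residue_to_percent(1.2e6)

# Approximate limit of detection of the early studies, in percent.
historical_lod_pct = 0.01

# How far below that detection limit the vertebrate level lies.
fold_below_lod = historical_lod_pct / vertebrate_pct

print(f"vertebrate m6dA: {vertebrate_pct:.7f} % of deoxyadenosine residues")
print(f"about {fold_below_lod:.0f}-fold below a {historical_lod_pct} % detection limit")
```

The converted value is of the same order as the roughly 0.00009 % quoted above and sits about two orders of magnitude below the older detection limit, which is consistent with the early studies having missed the modification.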
The authors then went on to globally profile m6dA distribution within Xenopus and mouse genomes using an m6dA antibody-based DNA immunoprecipitation (DIP) method. In brief, this method involves immunoprecipitating DNA fragments that contain m6dA using an m6dA-specific antibody and then sequencing the recovered fragments (DIP-seq). Their analysis revealed a large number of reads that cluster to form m6dA peaks in these genomes (approximately 20,000–50,000 peaks depending on the tissue). Notably, only a small proportion of m6dA peaks were located within genes (approximately 7–21 %). Peaks within genes were largely excluded from exons, but a higher number were located within intronic regions. In addition, the authors observed a relative paucity of m6dA sites immediately after transcription start sites (TSSs), which contrasts with the marked enrichment of m6dA within this region in Ch. reinhardtii.
The authors validated their results by repeating the global mapping studies in Xenopus using two additional m6dA antibodies. There was a high degree of overlap of individual m6dA peaks as well as overall m6dA distribution among all three antibodies tested, indicating that the DIP-seq mapping technique is likely detecting valid m6dA sites. When comparing m6dA sites across different Xenopus tissues, there were subsets of m6dA sites that were common to two or more tissues, but others that were unique. This may indicate tissue-specific m6dA patterns, but warrants further investigation to rule out sample-to-sample variability. Similarly, comparison of m6dA peaks in frog and mouse genomes revealed some peaks that overlapped and others that were distinct, which may be due to different tissues examined in each species (testes, fat, and oviduct in frog and kidney in mouse) or to species-specific methylation patterns. Additional detailed analyses of m6dA distribution across various tissues and in diverse species will likely lend further insights into the tissue-/cell type-specific distribution of m6dA as well as the consistency of m6dA sites across species.
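The peak comparisons described above, whether between antibodies, tissues, or species, come down to asking what fraction of the peaks in one set overlap a peak in another. A minimal Python sketch of that calculation follows. It is not the authors' pipeline: the peak coordinates are invented for illustration, the names are hypothetical, and a real analysis would use a peak caller and a dedicated interval tool rather than this quadratic comparison.

```python
from typing import List, Tuple

# A peak is (chromosome, start, end) with a half-open interval [start, end).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Peak], reference: List[Peak]) -> float:
    """Fraction of query peaks that overlap at least one reference peak."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Hypothetical peak sets, e.g. from two different antibodies (made-up coordinates).
antibody_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
antibody_b = [("chr1", 150, 250), ("chr2", 140, 160), ("chr3", 10, 20)]

shared_fraction = fraction_overlapping(antibody_a, antibody_b)
print(f"{shared_fraction:.2f} of antibody A peaks overlap an antibody B peak")
```

The measure is asymmetric, so it is usually reported in both directions. Applied to tissues or species instead of antibodies, the same calculation separates shared peaks from unique ones, although, as noted above, it cannot by itself distinguish true tissue specificity from sample-to-sample variability.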
The overall pattern of m6dA localization observed in the Gurdon study, as well as in the other recent m6dA profiling studies, provides the first insights into the potential function of m6dA. Although Gurdon and colleagues observe a slight enrichment of m6dA peaks just upstream of TSSs, the most distinctive feature of this region is an absence of m6dA just after the TSS. In contrast, Greer et al. report no clear distribution of m6dA near genes in C. elegans, whereas m6dA is enriched at and just after TSSs in Ch. reinhardtii. These distinct distribution patterns might indicate different roles for m6dA in vertebrates than in lower organisms. The fact that m6dA exhibits variable localization around TSSs further suggests potential functions for this modification in transcriptional regulation. Indeed, m6dA in Ch. reinhardtii is associated with actively transcribed genes. However, further studies will be necessary to determine whether m6dA is repressive or permissive for transcription in vertebrates. Such analyses will also help to establish how relevant functional studies in lower organisms are likely to be for understanding m6dA in vertebrates.
Moving forward: m6dA regulatory pathways and functional insights
A major current priority for m6dA research is to uncover the function of m6dA, a task which will be facilitated by a more detailed understanding of the readers and writers of this mark. A putative m6dA methyltransferase, DAMT-1, has been identified in C. elegans, although the closest vertebrate homolog of this protein (METTL4) has not yet been explored for m6dA-forming potential. Notably, Gurdon and colleagues identified an AG-rich motif that was enriched in m6dA peaks in Xenopus. Two AG-containing motifs were also detected at m6dA sites in C. elegans, suggesting that the m6dA methyltransferase machinery in these organisms might share a similar recognition sequence. However, efforts to identify m6dA motifs in other higher eukaryotes, such as mice and flies, have failed to identify consistent consensus sequences [7, 9], suggesting the existence of other mechanisms for m6dA formation in addition to simple sequence recognition elements.
Another possibility is that m6dA functions to destabilize DNA duplexes. Although m6dA forms standard Watson-Crick base pairs with thymidine, the base pairs between m6dA and thymidine are less stable than canonical adenosine:thymidine base pairs. Thus, m6dA may facilitate DNA unwinding or the open state of DNA needed for transcription initiation and other processes (Fig. 1).
An intriguing feature of m6dA across diverse organisms is that its abundance is markedly lower in higher organisms. Could this be due to the evolution of active m6dA demethylation pathways? In other words, m6dA could occur frequently, but its rapid removal may account for its low overall abundance in the genome. NMAD-1, a homolog of the m6A RNA demethylase ALKBH5, has shown evidence of m6dA demethylation in C. elegans. Interestingly, the Drosophila homolog of Tet (called DMAD) was identified as an m6dA demethylase in flies, whereas Tet proteins normally function as m5C demethylases in vertebrates. Whether Tet proteins also act as m6dA demethylases in vertebrates, however, remains to be tested.
In addition to uncovering m6dA demethylases in vertebrates, it will be important to determine whether these enzymes produce demethylation intermediates, as is seen with the Tet-catalyzed production of 5-hydroxymethylcytidine (hm5C) from m5C residues in eukaryotic DNA. Thus, there remains the possibility that the repertoire of modified nucleotides expands beyond m6dA. Such intermediate forms of modified adenine, if present, might also perform unique functions in regulating gene expression and thus represent important areas of future exploration.
This work was supported by NIH grants K99MH104712 (KDM) and R01CA186702 (SRJ).
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.