Methylated site display (MSD)-AFLP, a sensitive and affordable method for analysis of CpG methylation profiles
It has been pointed out that environmental factors or chemicals can cause diseases that are developmental in origin. To detect abnormal epigenetic alterations in DNA methylation, convenient and cost-effective methods are required for such research, in which multiple samples are processed simultaneously. We here present methylated site display (MSD), a unique technique for the preparation of DNA libraries. By combining it with amplified fragment length polymorphism (AFLP) analysis, we developed a new method, MSD-AFLP.
Methylated site display libraries consist of only DNAs derived from DNA fragments that are CpG methylated at the 5′ end in the original genomic DNA sample. To test the effectiveness of this method, CpG methylation levels in liver, kidney, and hippocampal tissues of mice were compared to examine if MSD-AFLP can detect subtle differences in the levels of tissue-specific differentially methylated CpGs. As a result, many CpG sites suspected to be tissue-specific differentially methylated were detected. Nucleotide sequences adjacent to these methyl-CpG sites were identified and we determined the methylation level by methylation-sensitive restriction endonuclease (MSRE)-PCR analysis to confirm the accuracy of AFLP analysis. The differences of the methylation level among tissues were almost identical among these methods. By MSD-AFLP analysis, we detected many CpGs showing less than 5% statistically significant tissue-specific difference and less than 10% degree of variability. Additionally, MSD-AFLP analysis could be used to identify CpG methylation sites in other organisms including humans.
MSD-AFLP analysis can potentially be used to measure slight changes in CpG methylation level. Given the remarkable precision, sensitivity, and throughput of MSD-AFLP analysis, this method will be advantageous in a variety of epigenetics-based research.
Keywords: DNA methylation profiling, AFLP, Epigenetics
MSD: methylated site display
AFLP: amplified fragment length polymorphism
PCR: polymerase chain reaction
MSRE: methylation-sensitive restriction enzyme
In recent years, CpG methylation analyses have become a major focus of epigenetics, allowing researchers to quantitatively assess important markers of differential gene expression. In particular, analyses by next-generation sequencing (NGS) provide extremely high-coverage genome-wide methylome data with all CpG methylation levels precisely measured [1, 2]. However, some whole-genome analyses are considered insufficient in terms of quantitative performance. Moreover, whole-genome methods remain unsuitable for investigations with large sample sizes owing to high costs. Nevertheless, a few genome-wide methods that can be performed at a relatively low cost per sample are available. For example, the Infinium Beadchip system, which is based on microarray technology and sodium bisulfite treatment, has recently been used for a large set of human blood DNA samples in massive cohort projects. However, a major limitation is that the Infinium platform is designed only for CpG islands of the human genome [5, 6]. Therefore, alternative methods that can be better applied to large sample sizes should be developed. Such a method should be convenient, cost-effective, and capable of processing multiple samples simultaneously, allowing small variations to be detected with adequate accuracy.
In this study we developed a technique, methylated site display (MSD), which displays only DNA fragments that are CpG-methylated at the 5′ end in the original genomic DNA sample. In combination with amplified fragment length polymorphism (AFLP) analysis [7, 8, 9, 10, 11], we designed MSD-AFLP analysis for obtaining methylated-CpG profiles at a relatively low cost. By MSD-AFLP analysis, we compared the DNA methylation levels in three tissues from C57BL/6J mice to evaluate the precision and sensitivity of this method.
Conceptual design of MSD-AFLP
A total of 1,594,127 HpaII sites are found in the mouse reference genome. To obtain reliable and high-resolution AFLP electropherograms, it is necessary to keep sufficient spacing between signal peaks. When fragments are separated in a capillary sequencer, the number of peaks in one run should preferably be below 1000. Using the search capabilities of the Genome DNA Fragment Database (GFDB), we found that three primary restriction enzymes, SbfI, PacI, and SwaI, provide desirable peak numbers. In this study, therefore, we chose SbfI as the primary restriction enzyme. We then used GFDB to calculate the number of SbfI-HpaII fragments as well as the distribution of fragment sizes in the mouse reference genome sequence to assess AFLP resolution (Additional file 1: Figure S2). Peak data become harder to interpret when fragment lengths overlap. Nonetheless, we found that 40,386 of the 47,315 fragments (85.4%) do not overlap in size and are predicted to display a single peak on an AFLP chart. Although one analysis covers only 0.22% of all CpGs in the reference genome (21,342,779 CpGs), this technique seems to have sufficient profiling capabilities. In addition, when we examined the distribution of the sites detectable by this method, intragenic CpG sites accounted for 55.3% of the sites detectable by MSD-AFLP across the whole genome.
We then extended GFDB to other organisms, i.e., human (Additional file 1: Figure S2), zebrafish, and Neurospora crassa. The number of SbfI-HpaII fragments as well as the distribution of fragment sizes in the human, zebrafish, and N. crassa reference genome sequences were used to assess AFLP resolution in the same way as for the mouse genome sequence. We found that 47,315 of the 56,799 fragments (75.0%) in humans and 20,006 of the 22,113 fragments (89.4%) in zebrafish do not overlap in size and are predicted to display a single peak on an AFLP chart. However, in the case of N. crassa, only approximately 1000 SbfI-HpaII fragments were found, suggesting that SbfI cuts N. crassa DNA much less frequently than it cuts the DNA of the other three organisms. Therefore, alternative restriction enzymes such as NcoI, AseI, or BspHI should be used. We found that 18,139 of the 19,995 NcoI-HpaII fragments (90.7%) do not overlap in size on an AFLP chart of N. crassa.
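The single-peak criterion and the coverage figure quoted for the mouse genome can be illustrated with a short calculation. The following is a minimal Python sketch, not the GFDB implementation: the function names and the toy length list are hypothetical, and it assumes that a fragment yields a single peak when no other fragment on the same chart has the same length.

```python
from collections import Counter


def count_unique_size_fragments(lengths):
    """Return (n_unique, n_total) for one AFLP chart.

    A fragment is counted as unique when no other fragment in
    `lengths` has the same length (assumed single-peak criterion).
    """
    counts = Counter(lengths)
    n_unique = sum(1 for n in counts.values() if n == 1)
    return n_unique, len(lengths)


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Toy chart (hypothetical lengths in bp): 135 and 342 are shared.
toy_lengths = [120, 135, 135, 210, 342, 342, 342, 500]
print(count_unique_size_fragments(toy_lengths))  # (3, 8)

# Figures quoted in the text for the mouse reference genome.
print(round(percent(40_386, 47_315), 1))       # 85.4
print(round(percent(47_315, 21_342_779), 2))   # 0.22
```

The last two lines reproduce the 85.4% single-peak fraction and the 0.22% CpG coverage directly from the fragment and CpG counts given above.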
Reproducibility of MSD-AFLP
Accuracy of MSD-AFLP
Percent methylation level from MSD-AFLP peak charts
To further verify the percent methylation levels of the MSD-AFLP peak charts, we randomly selected two Peak IDs, 44 and 59, for bisulfite genomic sequencing for methylation analysis. Our results showed that the percent methylation levels obtained by MSD-AFLP analysis were highly consistent with those obtained by bisulfite genomic sequencing in the three tissues, as well as those by MSRE-PCR analysis (Additional file 1: Figure S3).
Finally, the percent methylation levels of all 2449 CpGs in the three tissues were analyzed by hierarchical clustering analysis and principal component analysis (PCA) (Additional file 1: Figure S4). Significant clusters were found for every tissue, highlighting the capability of MSD-AFLP analysis to detect unique and contrasting methylation patterns between tissues. Moreover, the tissues were clearly separated from one another along the principal components in PCA.
Sensitivity of MSD-AFLP analysis
In this study, we developed a unique method, MSD-AFLP analysis, for determining CpG methylation level profiles with high sensitivity and accuracy. Although MSD-AFLP analysis covers only 0.22% of the CpG sites in the whole genome, it can provide methylation level profiles of a multitude of CpGs (approximately 40,000) in a single analysis with almost the same precision as MSRE-PCR analysis, a quantitative PCR method, and at a relatively low cost compared with other current array-based or NGS-based genome-wide DNA methylation analyses.
The widespread use of NGS technology has led to a number of methods for analyzing CpG methylation levels within the whole genome. Of these, whole-genome bisulfite sequencing is the most powerful technique, providing extremely high-coverage genome-wide methylome data with all CpG methylation levels precisely measured [1, 2, 13]. Similarly, methylated DNA immunoprecipitation-seq and HpaII tiny fragment enrichment by ligation-mediated PCR-tagging [15, 16] analyses were developed by incorporating NGS. However, these methods remain unsuitable for investigations with large sample sizes because of their cost, and they do not offer satisfactory quantitative performance even when more expensive measures are taken to obtain sufficient depth. Reduced representation bisulfite sequencing can provide quantitative values for numerous CpG methylation sites; however, even in analyses utilizing machines such as SOLiD (Thermo Fisher Scientific, Inc., San Diego, CA, USA) and HiSeq 2000 (Illumina, Inc., Waltham, MA, USA), the average depth of coverage is usually only approximately 30–100 reads [18, 19]. Out of all current NGS technologies, only the Roche 454 sequencing system (Roche Diagnostics), which is capable of obtaining relatively long sequences in one read, can provide such a high rate of mapping. Even so, with the Roche 454 system, more than 1000 reads are required to detect a 5% methylation level difference in the sequence of one sample [20, 21]. In contrast, the MSD-AFLP analysis established in this study was capable of easily detecting significant differences of less than 5% in methylation level (Fig. 6). In current methylation studies, huge numbers of samples containing various cell types are usually required to obtain significant data.
Since multiple samples can be processed simultaneously in MSD-AFLP analysis, allowing small variations to be detected with adequate accuracy at a low cost, this method will be advantageous for a variety of epigenetics-based research studies.
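To illustrate why read depth limits the sensitivity of sequencing-based estimates discussed above, the sampling error of a methylation fraction estimated from n reads can be approximated with a binomial model. This is a minimal sketch under stated assumptions (independent reads, normal approximation, a true methylation level of 50%); it is not the calculation used in the cited studies, and the function name is hypothetical.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Approximate 95% confidence half-width of a methylation
    fraction p estimated from n independent reads (normal
    approximation to the binomial; an assumption of this sketch)."""
    return z * math.sqrt(p * (1.0 - p) / n)


for n_reads in (30, 100, 1000):
    width = 100.0 * ci_half_width(0.5, n_reads)
    print(n_reads, round(width, 1))
# 30 17.9
# 100 9.8
# 1000 3.1
```

Under these assumptions, a depth of 30 to 100 reads leaves an uncertainty of roughly 10 to 18 percentage points on a single CpG, whereas about 1000 reads bring it to about 3 points, in line with the order of magnitude quoted above for detecting a 5% difference.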
Regarding the cost-benefit of current genome wide analyses, Infinium® assay (HumanMethylation450 Beadchip) has become the preferred choice, which can be used to analyze the methylation levels of approximately 450,000 CpGs [5, 6]. At present, however, this platform is designed only for the human genome and is biased towards CpG islands in the promoter region. In contrast, MSD-AFLP analysis can be used for any kind of organism.
In the research fields of hygiene and environmental toxicology, it has been pointed out that environmental chemicals and pollutants can cause diseases that are developmental in origin, possibly resulting from abnormal epigenetic alterations such as those in DNA methylation. Several genome-wide DNA methylation analyses showed that environmental chemicals, such as vinclozolin and bisphenol-A, can cause changes in CpG methylation level, which can be transmitted to next-generation offspring [23, 24, 25, 26]. These inheritable DNA methylation changes were measured using sperm nuclear DNA; however, the reliability and reproducibility of these studies are still controversial [27]. For verifying the accuracy of previous reports, our MSD-AFLP analysis will be useful for analyzing such subtle changes in the CpG methylation pattern that are induced by environmental factors and transmitted to later generations.
With regard to other applications, MSD-AFLP analysis will also be a useful tool in clinical cancer research. Investigating the epigenetic markers of cancer stem cells in a multitude of clinical samples is of significant interest [28, 29, 30, 31]. Although the genome coverage of MSD-AFLP is 0.22% out of all CpG sites in the whole genome, MSD-AFLP analysis can be used to screen a large number of clinical samples with relatively low cost.
MSD-AFLP analysis can be potentially used to measure slight changes in CpG methylation level. On the basis of our results regarding the remarkable precision, sensitivity, and throughput of MSD-AFLP, we conclude that this method will be advantageous in a variety of epigenetics-based studies.
The reagents and materials used in this study were purchased from the manufacturers indicated in parentheses: CpG methyltransferase (M.SssI), T4 DNA ligase, and restriction enzymes HpaII, MspI, SbfI, and StuI (New England Biolabs, MA, USA; the manufacturer guarantees that the efficiency of its restriction enzymes is almost 100% and that CpG methylation blocks the HpaII digestion reaction); EpiTect Bisulfite Kit and AllPrep DNA/RNA Mini Kit (Qiagen, Hilden, Germany); oligonucleotides (Operon, Alameda, CA, USA); magnetic beads coated with streptavidin (Dynabeads® M-280 Streptavidin) (Dynal, Oslo, Norway); TITANIUM Taq DNA polymerase (Takara Bio, Kusatsu, Japan); GenElute™ Agarose Spin Columns (Sigma-Aldrich, St. Louis, MO, USA); Ligation Convenience Kit (Nippon Gene, Tokyo, Japan); pGEM®-T Easy Vector (Promega, Madison, WI, USA); Competent Cell DH5α and Insert Check-Ready (Toyobo, Osaka, Japan); LightCycler® 480 SYBR Green I Master (Roche Diagnostics GmbH, Mannheim, Germany); POP-7™ Polymer, GeneScan™ 500 LIZ® Size Standard, and BigDye® Terminator v3.1 Cycle Sequencing Kit (ThermoFisher Scientific Inc., San Diego, CA, USA).
Animals and tissues
Thirteen-week-old male C57BL/6J mice (n = 3) purchased from CLEA Japan Inc. (Tokyo, Japan) were sacrificed by cervical dislocation to collect liver, kidney, and hippocampus samples.
Artificially CpG-methylated genomic DNA
Genomic DNA was purified with the AllPrep DNA/RNA Mini Kit. To generate DNA that was artificially methylated at all CpG sites, 2 μg of mouse kidney genomic DNA was incubated with S-adenosylmethionine and M.SssI at 37 °C for 1 h and subsequently incubated at 65 °C for 20 min. The treated DNA was again purified with the AllPrep DNA/RNA Mini Kit. We confirmed the quality of the artificially methylated DNA by MSRE-PCR targeting three randomly selected CpGs. The methylation levels of these CpGs were over 97%.
A flowchart of the MSD-library preparation steps is shown in Fig. 1. First, genomic DNA (100 ng) digested with SbfI was ligated with a biotinylated adaptor (Adaptor A) using 400 units of T4 DNA ligase. Next, the ligated products were digested for 1 h with 100 units of MspI, a methylation-insensitive isoschizomer of the methylation-sensitive HpaII that recognizes and digests CCGG sequences. The resulting DNA fragments were captured using Dynabeads® M-280 Streptavidin and washed three times with washing buffer (10 mM Tris-HCl, 1 mM EDTA, 2 M NaCl, pH 7.5) and TE (1 mM Tris-HCl, 0.1 mM EDTA, pH 7.5). The DNA fragments were then ligated with Adaptor B. After another wash in the same manner, the products were digested with HpaII on the magnetic beads. While remaining on the beads, the HpaII-digested DNA fragments were then amplified with the Pre-PCR primers under the following conditions: 25 cycles of denaturation at 95 °C for 20 s, annealing at 58 °C for 20 s, and extension at 72 °C for 90 s. The resulting solution containing the MSD library was used as a template for selective PCR. All adaptors and primers used in MSD-library construction are listed in Additional file 1: Table S1.
Selective-PCR and electrophoresis
The selective-PCR step in MSD-AFLP analysis is based on the original report on AFLP. The set of selective-PCR primers is shown in Additional file 1: Table S1. We prepared 16 sequences each of the MspI-NN primer and SbfI-NN primer. The 5′ end of the MspI-NN primer was labeled with 6-carboxyfluorescein (6-FAM). PCR was performed in a 10 μL solution containing 10 pmol of the MspI-NN primer, 10 pmol of the SbfI-NN primer, 40 nmol of dNTPs, and 0.2 μL of TITANIUM Taq DNA polymerase in accordance with the manufacturer’s instructions. The cycling conditions were as follows: initial denaturation at 95 °C for 1 min, followed by 28 cycles of denaturation at 95 °C for 20 s, annealing at 66 °C for 30 s, and extension at 72 °C for 2 min. The resultant PCR products were electrophoresed using an Applied Biosystems 3730xl DNA Analyzer (ThermoFisher Scientific). Data were analyzed using GeneMapper® ID Software v3.7 (ThermoFisher Scientific) and HiAL version 5.2 software developed by Maze Inc. (Tokyo, Japan).
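The figure of 16 primer sequences follows from the two selective nucleotides (NN) at the 3′ end of each primer. As a minimal sketch, the extensions and their pairings can be enumerated as follows; only the NN extensions are generated, because the primer core sequences are given in Table S1 and are not reproduced here. The function name is hypothetical, and running every MspI-NN/SbfI-NN pairing is an assumption of this illustration rather than a statement of the protocol.

```python
from itertools import product

BASES = "ACGT"


def selective_extensions(n=2):
    """All possible selective 3' extensions of length n (NN for n=2)."""
    return ["".join(p) for p in product(BASES, repeat=n)]


msp_extensions = selective_extensions()  # MspI-NN side
sbf_extensions = selective_extensions()  # SbfI-NN side
primer_pairs = list(product(msp_extensions, sbf_extensions))

print(len(msp_extensions))  # 16
print(len(primer_pairs))    # 256
```

If the 47,315 mouse SbfI-HpaII fragments predicted by GFDB were spread evenly over 256 pairings, each run would contain about 185 fragments, well below the limit of 1000 peaks per capillary run mentioned earlier; the real distribution is not expected to be perfectly even.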
DNA isolation and sequencing
The DNA of fragments was sequenced as follows. An aliquot of 1 μL of MSD-AFLP analysis product was separated on a denaturing polyacrylamide gel containing 7.0 M urea. Fluorescence from this product was detected using Typhoon 9210 Molecular Imager (Amersham Biosciences, Piscataway, NJ, USA) and slices of gel containing the DNA fragments were cut out. The gel slices were suspended in 50 μL of TE buffer with 1 μL of the suspension being used for PCR with MspI-universal and SbfI-universal primers (Additional file 1: Table S1). The DNA sequence of the PCR product was determined using the MspI-universal primer and BigDye® Terminator v3.1 Cycle Sequencing Kit.
Methylation-sensitive restriction enzyme dependent PCR (MSRE-PCR) was performed as follows. All locus-specific primers used in this experiment were designed to amplify the target DNA which has HpaII-CpG (Additional file 1: Table S2). Purified genomic DNA (100 ng) was divided into two portions. One aliquot was digested with methylation-sensitive restriction enzyme HpaII while the other aliquot was digested with StuI. StuI was selected as a restriction enzyme that does not cut any of the 11 target DNAs. The HpaII- and StuI-digested DNAs were subjected to quantitative-PCR using a LightCycler® 480. PCR was performed under the following conditions: 95 °C for 5 min and 50 cycles of 95 °C for 10 s, 63 °C for 20 s, and 72 °C for 10 s, followed by determination of the melting curve at 95 °C for 5 s, 65 °C for 1 min, and 97 °C for continuous hold. The methylation levels (expressed as % methylation) of HpaII-CpG sites are presented here as a ratio of the target copy number from the HpaII-digested DNA to that from the StuI-digested DNA.
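The ratio described above can be written as a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the conversion from quantification cycle (Cq) values assumes an amplification efficiency of 2 per cycle, which is an assumption of this example and not a value stated in the protocol.

```python
def percent_methylation(copies_hpaii, copies_stui):
    """% methylation = copies surviving HpaII digestion divided by
    copies in the StuI-digested (uncut reference) aliquot."""
    if copies_stui <= 0:
        raise ValueError("reference copy number must be positive")
    return 100.0 * copies_hpaii / copies_stui


def copy_ratio_from_cq(cq_hpaii, cq_stui, efficiency=2.0):
    """Copy-number ratio (HpaII / StuI) from Cq values, assuming the
    same per-cycle efficiency in both reactions (assumed 2.0)."""
    return efficiency ** (cq_stui - cq_hpaii)


# Copy numbers from a standard curve (hypothetical values).
print(percent_methylation(500, 1000))            # 50.0

# One extra cycle for the HpaII aliquot means half the template.
print(100.0 * copy_ratio_from_cq(26.0, 25.0))    # 50.0
```

Because HpaII cuts only unmethylated CCGG, the template that still amplifies after HpaII digestion represents the methylated fraction, and the StuI aliquot provides the total.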
Bisulfite genomic sequencing
Sodium bisulfite conversion and purification were performed using the EpiTect Bisulfite Kit. The bisulfite-treated DNA was amplified and purified using SIGMA GenElute. The purified DNA was cloned using the pGEM®-T Easy Vector with the Ligation Convenience Kit and transformed into DH5α. Colony PCR was performed to identify positive clones. Sequences were then determined using the BigDye® Terminator v3.1 Cycle Sequencing Kit and the M13 reverse primer, GCGGATAACAATTTCACACAG. All primers used in this step are listed in Additional file 1: Table S3.
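The percent methylation at a single CpG is then obtained by counting clones. As a minimal sketch, assuming that each clone contributes one base call at the cytosine of interest, read on the converted strand, where C indicates a methylated (protected) cytosine and T an unmethylated (converted) one; the function name and the clone calls are hypothetical.

```python
def percent_methylated(calls):
    """Percent methylation at one CpG from per-clone base calls.

    'C' = methylated (protected from conversion), 'T' = unmethylated
    (converted). Other calls (e.g. 'N') are ignored as uninformative.
    """
    informative = [c.upper() for c in calls if c.upper() in ("C", "T")]
    if not informative:
        raise ValueError("no informative clones")
    return 100.0 * informative.count("C") / len(informative)


# Ten hypothetical clones: eight retain C, two read T.
clone_calls = list("CCCTCTCCCC")
print(percent_methylated(clone_calls))  # 80.0
```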
Prediction of genomic position from AFLP peak charts
In order to predict the genomic position of methylated CpGs from AFLP peak charts, we developed the GFDB (Additional file 1: Figure S1, http://gfdb.maze.co.jp/). GFDB is composed of a versatile search interface and a virtual AFLP data generation system based on input reference genome sequences. GFDB can simulate the MSD-AFLP procedure of genomic DNA cleavage with any restriction enzyme or any selective PCR. Under a given condition, it shows the number of DNA fragments produced by selecting a combination of restriction enzymes, fragment length range, and two selective nucleotides adjacent to each desired recognition sequence (Additional file 1: Figure S1).
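The idea of a virtual AFLP can be sketched in a few lines. The following Python example is not the GFDB implementation; it is a simplified illustration that pairs each SbfI site with the nearest CCGG (HpaII/MspI) site on either side and reports the distance between recognition-site start positions. It ignores the exact cut positions within each site, adaptor and primer lengths, and selective nucleotides, and the toy sequence and function names are hypothetical.

```python
import re
from bisect import bisect_left

SBFI_SITE = "CCTGCAGG"  # SbfI recognition sequence
HPAII_SITE = "CCGG"     # HpaII / MspI recognition sequence


def site_positions(seq, site):
    """Start positions of every occurrence of `site` (overlaps allowed)."""
    return [m.start() for m in re.finditer(f"(?={site})", seq)]


def virtual_fragments(seq):
    """Return (sbfi_pos, ccgg_pos, length) for each SbfI site paired
    with its nearest CCGG neighbour downstream and upstream, provided
    no other SbfI site lies in between."""
    seq = seq.upper()
    sbf = site_positions(seq, SBFI_SITE)
    ccgg = site_positions(seq, HPAII_SITE)
    fragments = []
    for i, s in enumerate(sbf):
        k = bisect_left(ccgg, s)
        if k < len(ccgg):  # nearest CCGG downstream of this SbfI site
            c = ccgg[k]
            if i + 1 == len(sbf) or c < sbf[i + 1]:
                fragments.append((s, c, c - s))
        if k > 0:          # nearest CCGG upstream of this SbfI site
            c = ccgg[k - 1]
            if i == 0 or c > sbf[i - 1]:
                fragments.append((s, c, s - c))
    return fragments


toy = "AACCGGTTTTCCTGCAGGAAAAACCGGTT"
print(virtual_fragments(toy))  # [(10, 23, 13), (10, 2, 8)]
```

Applied to a reference genome, the list of lengths produced this way is the kind of input from which fragment counts and size distributions, as described for GFDB, could be derived.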
Differences in methylation levels between the tissues were analyzed by one-way ANOVA followed by the post hoc Tukey test using R statistical software (http://cran.r-project.org/). Multiple-comparison-adjusted p-values were computed using the Benjamini–Hochberg (BH) correction [32]. An FDR ≤ 0.05 was considered significant. Using R, we normalized CpG methylation levels to z-scores and subjected them to PCA and to hierarchical clustering analysis of methylation patterns using Euclidean distance and the unweighted pair-group method with arithmetic mean (UPGMA). Finally, an approximation formula derived from the Hill equation was developed using GraphPad Prism (GraphPad Software, La Jolla, CA, USA).
TA, TS, HK, and SO conceived the method and designed the experiments; SS and HY built the GFDB system; TA and AH performed the experiments; TA and TM analyzed the data with help from WF; TA, TS, and SO wrote the manuscript with help from CT. All authors read and approved the final manuscript.
The authors thank Mr. Emir Turkes, an American citizen and graduate of Boston University, for his kind help with writing the manuscript in English.
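The analysis itself was carried out in R. Purely as an illustration of two of the steps named above, the sketch below implements the Benjamini–Hochberg adjustment and z-score normalization in plain Python. The function names and example values are hypothetical; the BH routine is intended to mirror what R's `p.adjust(p, method = "BH")` computes, and the z-score uses the sample standard deviation.

```python
import statistics


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def zscores(values):
    """Centre on the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


raw_p = [0.01, 0.04, 0.03, 0.20]          # hypothetical ANOVA p-values
adj_p = bh_adjust(raw_p)
significant = [q <= 0.05 for q in adj_p]  # FDR threshold from the text
print([round(q, 4) for q in adj_p])       # [0.04, 0.0533, 0.0533, 0.2]
print(significant)                        # [True, False, False, False]
print(zscores([10.0, 20.0, 30.0]))        # [-1.0, 0.0, 1.0]
```

In this toy example only the first CpG passes the FDR ≤ 0.05 threshold, although three of the four raw p-values are below 0.05.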
The authors declare that they have no competing interests.
Availability of data and materials
The original data of the MSD-AFLP charts will be available upon request.
Ethics approval and consent to participate
The experiment was approved by the Animal Care and Use Committee of the University of Tokyo (Committee’s reference number, Med-P11-015). Consent to participate is not applicable.
This work was supported by the Japan Society for the Promotion of Science [KAKENHI Grant No 23310044 to S.O., 20380168 to S.O.].
27. Inawaka K, Kawabe M, Takahashi S, Doi Y, Tomigahara Y, Tarui H, et al. Maternal exposure to anti-androgenic compounds, vinclozolin, flutamide and procymidone, has no effects on spermatogenesis and DNA methylation in male rats of subsequent generations. Toxicol Appl Pharmacol. 2009;237:178–87.
32. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995;57:289–300.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.