Abstract
DNA methylation is a covalent modification of DNA that plays important roles in processes such as the regulation of gene expression, transcription factor binding, and suppression of transposable elements. Whole genome bisulfite sequencing (WGBS) enables the genome-wide identification and quantification of DNA methylation patterns at single-base resolution and is the gold standard for analysis of DNA methylation. Computational analysis of WGBS data can be particularly challenging because it requires many computationally intensive steps. Here, we outline a step-by-step approach for the analysis and interpretation of WGBS data. First, sequencing reads must be trimmed, quality checked, and aligned to the genome. Second, DNA methylation levels are estimated at each cytosine position using the aligned sequence reads of the bisulfite-treated DNA. Third, regions of differential cytosine methylation between samples can be identified. Finally, these data need to be visualized and interpreted in the context of the biological question at hand.
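The second step, estimating a methylation level at each cytosine, can be illustrated with a minimal sketch. It assumes that per-position read counts have already been extracted from the alignments, and it is not the protocol's own implementation: the `CytosineCounts` container, the function names, and the `min_coverage` parameter are all hypothetical. In bisulfite-treated DNA, unmethylated cytosines are read as T while methylated cytosines remain C, so a simple estimate is the fraction of reads that report C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CytosineCounts:
    """Read counts at one cytosine position (hypothetical container)."""
    chrom: str
    pos: int
    methylated: int    # reads reporting C (protected from conversion)
    unmethylated: int  # reads reporting T (converted by bisulfite)


def methylation_level(site, min_coverage=1):
    """Return methylated / total reads, or None if coverage is too low."""
    total = site.methylated + site.unmethylated
    if total < min_coverage:
        return None
    return site.methylated / total


def weighted_region_level(sites):
    """Pool counts over a region: sum(methylated) / sum(total reads)."""
    meth = sum(s.methylated for s in sites)
    total = sum(s.methylated + s.unmethylated for s in sites)
    return meth / total if total else None


# Illustrative counts only, not real data.
example_sites = [
    CytosineCounts("chr1", 100, methylated=8, unmethylated=2),
    CytosineCounts("chr1", 105, methylated=1, unmethylated=9),
]
per_site = [methylation_level(s) for s in example_sites]
region = weighted_region_level(example_sites)
```

Pooling counts across a region, rather than averaging the per-site fractions, gives more weight to well-covered positions. That is one common design choice; dedicated tools for differential methylation use statistical models that also account for coverage and biological variation.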
© 2018 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Stuart, T., Buckberry, S., Lister, R. (2018). Approaches for the Analysis and Interpretation of Whole Genome Bisulfite Sequencing Data. In: Jeltsch, A., Rots, M. (eds) Epigenome Editing. Methods in Molecular Biology, vol 1767. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7774-1_17
DOI: https://doi.org/10.1007/978-1-4939-7774-1_17
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7773-4
Online ISBN: 978-1-4939-7774-1
eBook Packages: Springer Protocols