Generative Models for Quantification of DNA Modifications

  • Protocol
Data Mining for Systems Biology

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1807))

Abstract

Several chemical modifications of cytosine are important to the regulation and, ultimately, the functional expression of the genome. To date, no single experiment can capture these separate modifications, so integrative experimental designs are needed to fully characterize cytosine methylation and its chemical modifications. This chapter describes Lux, a generative probabilistic model for the integrative analysis of cytosine methylation and its oxidized variants. Lux simultaneously analyzes partially orthogonal bisulfite sequencing data sets and, by integrating across experimental designs composed of multiple parallel destructive genomic measurements, estimates the proportions of the different cytosine modifications in a single sample. Lux also accounts for the variation in measurements introduced by imperfect experimental steps. This experimental variation can be quantified using appropriate spike-in controls, which allows Lux to deconvolve the measurements and accurately recover the underlying signal.
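To make the idea of deconvolving imperfect measurements concrete, the following is a minimal sketch of a forward model for two bisulfite-based assays (BS-seq and oxBS-seq). It is not the Lux implementation. The function names, the parameter names (`bs_eff`, `bs_star`, `ox_eff`, `seq_err`), the restriction to three cytosine states, and the assumption that oxidized 5hmC converts with the same efficiency as unmodified cytosine are all illustrative assumptions, not taken from the chapter.

```python
import math


def p_read_c(conversion_prob, seq_err):
    """Probability of sequencing a 'C' for a base that is converted
    (and would read as 'T') with probability `conversion_prob`,
    allowing for a symmetric sequencing error rate `seq_err`."""
    return (1.0 - conversion_prob) * (1.0 - seq_err) + conversion_prob * seq_err


def read_c_probabilities(theta, bs_eff, bs_star, ox_eff, seq_err):
    """Return (p_bs, p_oxbs): the probability of observing a 'C' read
    in BS-seq and in oxBS-seq at one cytosine.

    theta   -- (p_C, p_5mC, p_5hmC), proportions summing to 1 (assumed)
    bs_eff  -- bisulfite conversion efficiency for unmodified C (assumed name)
    bs_star -- rate of unwanted conversion of protected bases (assumed name)
    ox_eff  -- oxidation efficiency of 5hmC in oxBS-seq (assumed name)
    seq_err -- sequencing error rate (assumed name)
    """
    p_c, p_5mc, p_5hmc = theta
    unmodified = p_read_c(bs_eff, seq_err)   # C is converted by bisulfite
    protected = p_read_c(bs_star, seq_err)   # 5mC and 5hmC resist conversion
    # In oxBS-seq, oxidized 5hmC behaves like unmodified C (simplifying
    # assumption); unoxidized 5hmC stays protected.
    hmc_oxbs = ox_eff * unmodified + (1.0 - ox_eff) * protected
    p_bs = p_c * unmodified + (p_5mc + p_5hmc) * protected
    p_oxbs = p_c * unmodified + p_5mc * protected + p_5hmc * hmc_oxbs
    return p_bs, p_oxbs


def binomial_log_likelihood(n_c, n_total, p):
    """Log-probability of n_c 'C' reads out of n_total, for 0 < p < 1."""
    log_coeff = (math.lgamma(n_total + 1) - math.lgamma(n_c + 1)
                 - math.lgamma(n_total - n_c + 1))
    return log_coeff + n_c * math.log(p) + (n_total - n_c) * math.log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical proportions and experimental parameters, for illustration.
    theta = (0.5, 0.3, 0.2)
    p_bs, p_oxbs = read_c_probabilities(theta, bs_eff=0.99, bs_star=0.01,
                                        ox_eff=0.95, seq_err=0.001)
    # Joint log-likelihood of hypothetical read counts from both assays.
    total = (binomial_log_likelihood(48, 100, p_bs)
             + binomial_log_likelihood(31, 100, p_oxbs))
    print(p_bs, p_oxbs, total)
```

With perfect experimental steps the two read-out probabilities reduce to p_5mC + p_5hmC for BS-seq and p_5mC for oxBS-seq, which is why the two assays together can separate the modifications. In a full Bayesian treatment of this kind, the efficiency parameters would be informed by spike-in controls and the proportions would be given a prior and inferred by sampling rather than fixed by hand.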

Author information

Corresponding author

Correspondence to Harri Lähdesmäki.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Äijö, T., Bonneau, R., Lähdesmäki, H. (2018). Generative Models for Quantification of DNA Modifications. In: Mamitsuka, H. (eds) Data Mining for Systems Biology. Methods in Molecular Biology, vol 1807. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8561-6_4

  • DOI: https://doi.org/10.1007/978-1-4939-8561-6_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8560-9

  • Online ISBN: 978-1-4939-8561-6

  • eBook Packages: Springer Protocols
