Computational and Statistical Analysis of Array-Based DNA Methylation Data

  • Protocol
Cancer Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1878))

Abstract

The characterization of aberrant DNA methylation is emerging as a key part of the study of cancer development and phenotype. Technical advances and decreasing costs of high-throughput DNA methylation profiling have generated considerable interest in using these methods in disease association studies. Here we discuss the principles of DNA methylation analysis using data from the Infinium DNA methylation BeadChip assays and describe the computational steps and statistical considerations involved, from processing the raw array data to analyzing differential methylation. Moreover, we provide detailed guidelines on how to perform tumor subtype classification based on DNA methylation signatures.

Author information

Corresponding author

Correspondence to Jessica Nordlund.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Nordlund, J., Bäcklin, C., Raine, A. (2019). Computational and Statistical Analysis of Array-Based DNA Methylation Data. In: Krasnitz, A. (ed) Cancer Bioinformatics. Methods in Molecular Biology, vol 1878. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8868-6_10

  • DOI: https://doi.org/10.1007/978-1-4939-8868-6_10

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8866-2

  • Online ISBN: 978-1-4939-8868-6

  • eBook Packages: Springer Protocols
