Abstract
The characterization of aberrant DNA methylation is emerging as a key part of the study of cancer development and phenotype. Technical advances and the decreasing cost of high-throughput DNA methylation profiling have generated considerable interest in using these methods in disease association studies. Here we discuss the principles of DNA methylation analysis using data from the Infinium DNA methylation BeadChip assays and describe the computational steps and statistical considerations involved, from processing the raw array data to analyzing differential methylation. We also provide detailed guidelines on how to perform tumor subtype classification based on DNA methylation signatures.
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Nordlund, J., Bäcklin, C., Raine, A. (2019). Computational and Statistical Analysis of Array-Based DNA Methylation Data. In: Krasnitz, A. (eds) Cancer Bioinformatics. Methods in Molecular Biology, vol 1878. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8868-6_10
DOI: https://doi.org/10.1007/978-1-4939-8868-6_10
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8866-2
Online ISBN: 978-1-4939-8868-6
eBook Packages: Springer Protocols