Analysis of DNA methylation in a population context has the potential to uncover novel gene-environment interactions as well as markers of health and disease. To find such associations, it is important to control for factors that may mask or alter DNA methylation signatures. Because tissue of origin and the accompanying cell type composition are major contributors to DNA methylation patterns and can easily confound important findings, it is vital to adjust DNA methylation data for such differences across individuals. Here we describe the use of a regression method to adjust for cell type composition in DNA methylation data. We specifically discuss what information is required to adjust for cell type composition and then provide detailed instructions on how to perform cell type adjustment on high-dimensional DNA methylation data. This method has been applied mainly to Illumina 450K data, but it can also be adapted to pyrosequencing or genome-wide bisulfite sequencing data.
Keywords: DNA methylation; Illumina Infinium HumanMethylation450 BeadChip; Cell type; Statistical adjustment; R statistical software
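The abstract refers to a regression method for cell type adjustment without giving its form. As a rough illustration only, the sketch below shows one common form of regression-based adjustment: for a single CpG site, methylation values are regressed on per-sample cell type proportions, and the residuals are re-centered on the original mean. This is not necessarily the exact procedure the chapter describes, and it is written in Python rather than the R software the chapter uses. The function names, the sample values, and the choice to add the mean back to the residuals are all assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjust_for_cell_type(values, proportions):
    """Regress one CpG's values on cell type proportions (hypothetical helper).

    values:      methylation values for one CpG, one per sample
    proportions: per-sample tuples of cell type proportions; because
                 proportions sum to 1, leave one cell type out so the
                 design matrix with an intercept is not collinear
    Returns the residuals with the original mean added back (an assumption).
    """
    n = len(values)
    design = [[1.0] + list(p) for p in proportions]  # intercept + proportions
    k = len(design[0])
    # Ordinary least squares via the normal equations (X'X) b = X'y.
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n))
            for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * values[i] for i in range(n)) for a in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coef, row)) for row in design]
    mean_value = sum(values) / n
    return [v - f + mean_value for v, f in zip(values, fitted)]


# Made-up data: six samples, proportions of two of three cell types.
proportions = [(0.60, 0.30), (0.50, 0.40), (0.70, 0.20),
               (0.40, 0.40), (0.55, 0.25), (0.65, 0.30)]
beta_values = [0.62, 0.58, 0.71, 0.55, 0.66, 0.69]

adjusted = adjust_for_cell_type(beta_values, proportions)
print([round(v, 4) for v in adjusted])
```

For array-scale data the same fit would be repeated across every CpG site, which in practice calls for a vectorized linear model rather than this per-site loop.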