Adjusting for Cell Type Composition in DNA Methylation Data Using a Regression-Based Approach

  • Meaghan J. Jones
  • Sumaiya A. Islam
  • Rachel D. Edgar
  • Michael S. Kobor
Part of the Methods in Molecular Biology book series (MIMB, volume 1589)


Analysis of DNA methylation in a population context has the potential to uncover novel gene and environment interactions as well as markers of health and disease. To find such associations, it is important to control for factors that may mask or alter DNA methylation signatures. Because tissue of origin and the accompanying cell type composition are major contributors to DNA methylation patterns, and can easily confound important findings, it is vital to adjust DNA methylation data for differences in cell type composition across individuals. Here we describe the use of a regression method to adjust for cell type composition in DNA methylation data. We first discuss what information is required to adjust for cell type composition and then provide detailed instructions on how to perform cell type adjustment on high-dimensional DNA methylation data. This method has been applied mainly to Illumina 450K data, but can also be adapted to pyrosequencing or genome-wide bisulfite sequencing data.
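
As a rough illustration of what a regression-based adjustment can look like, the sketch below fits, for a single probe, an ordinary least squares model of methylation on cell type proportions and returns the residuals added back to the probe mean. This is only a minimal sketch under stated assumptions, not necessarily the exact procedure of the chapter: the function names (solve_linear_system, adjust_probe), the residuals-plus-mean form of the adjustment, and the example numbers are all hypothetical and chosen for illustration. It also assumes that one cell type is left out of the model, since proportions that sum to one across all cell types would be collinear with the intercept. In practice this would more likely be done in R or with a numerical library; plain Python is used here only to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the originals are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjust_probe(values, proportions):
    """Adjust one probe's methylation values for cell type composition.

    values:      one methylation value per sample.
    proportions: one tuple per sample of cell type proportions, with one
                 reference cell type omitted (assumption, avoids collinearity).
    Returns the regression residuals added back to the probe mean.
    """
    design = [[1.0] + list(p) for p in proportions]  # intercept + cell types
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, values)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coef, row)) for row in design]
    mean_value = sum(values) / len(values)
    return [mean_value + (y - f) for y, f in zip(values, fitted)]


# Hypothetical example: six samples, two estimated cell type proportions.
granulocytes = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75]
lymphocytes = [0.40, 0.30, 0.32, 0.25, 0.22, 0.15]
methylation = [0.52, 0.54, 0.58, 0.59, 0.63, 0.64]

adjusted = adjust_probe(methylation, list(zip(granulocytes, lymphocytes)))
print([round(v, 4) for v in adjusted])
```

With an intercept in the model, the adjusted values keep the original probe mean and are uncorrelated with the included cell type proportions, so any remaining variation is what a downstream association test would examine. Adjusted values are not guaranteed to stay within the 0 to 1 range of the original measurements.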


Keywords: DNA methylation; Illumina Infinium HumanMethylation450 BeadChip; Cell type; Statistical adjustment; R statistical software



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Meaghan J. Jones (1)
  • Sumaiya A. Islam (1)
  • Rachel D. Edgar (1)
  • Michael S. Kobor (1, corresponding author)

  1. Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, Canada
