
Current Environmental Health Reports, Volume 6, Issue 1, pp 38–51

Mendelian Randomization and the Environmental Epigenetics of Health: a Systematic Review

  • Maria Grau-Perez
  • Golareh Agha
  • Yuanjie Pang
  • Jose D. Bermudez
  • Maria Tellez-Plaza
Environmental Epigenetics (A Baccarelli and A Cardenas, Section Editors)
Part of the following topical collections:
  1. Topical Collection on Environmental Epigenetics

Abstract

Purpose of Review

Epigenetic modifications are environmentally responsive and may play a mechanistic role in the development of disease. Mendelian randomization uses genetic variation to assess the causal effect of modifiable exposures on health outcomes. We conducted a systematic review of Mendelian randomization studies evaluating the causal role of DNA methylation (DNAm) changes in the development of health states, with emphasis on studies that formally evaluate exposure-DNAm, in addition to DNAm-outcome, causal associations.

Recent Findings

We identified 15 articles, 4 of which evaluated an environmental determinant of DNAm: self-reported tobacco smoke exposure, in utero tobacco smoke exposure, measured vitamin B12, and glycemia.

Summary

Selected articles suggest a causal association of DNAm with some cardiometabolic endpoints. DNAm seemed to partly explain the association of postnatal and prenatal exposure to tobacco smoke and vitamin B12 with inflammation biomarkers, birth weight, and cognitive outcomes, respectively. However, the current evidence is not sufficient to infer causality. Additional Mendelian randomization studies from large epidemiologic samples are needed to support the causal role of environmental factors as determinants of health-related epigenetic modifications.

Keywords

Systematic review; Mendelian randomization; DNA methylation; Environmental; Health outcomes

Introduction

With the advancement and increasing feasibility of microarray and sequencing technologies, population research on molecular mechanisms of health and disease has rapidly expanded beyond genetics research. Recognizing that genetic polymorphisms alone explain only a small proportion of disease risk, attention has increasingly shifted to epigenetic studies in order to further elucidate the underlying etiology of health and disease [1]. Epigenetics refers to the array of molecular mechanisms that contribute to the regulation of genes and cell state and function, without directly changing the underlying DNA sequence [2, 3]. There is mounting evidence that epigenetic modifications are environmentally responsive and may play a mechanistic role in the development of disease [4, 5]. DNA methylation (DNAm), the most widely studied epigenetic modification to date, is known to affect gene expression and has been associated with a range of metabolic and disease phenotypes [6, 7, 8, 9]. Because epigenetic modifications are environmentally responsive and dynamic over time, epigenetic studies are subject to confounding and reverse causation just as any other study of modifiable exposures [10•]. While associations of DNAm with various phenotypic traits are now routinely reported, there is always the possibility that these findings are correlational, without robust evidence of causality.

Instrumental variable analysis is a statistical method used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. This methodology has been established for decades in the econometrics and statistics fields [11, 12]. In brief, an instrumental variable is a variable other than the exposure or the outcome that allows controlling for confounding (both measured and unmeasured) and measurement error. The Mendelian randomization (MR) method, an increasingly popular approach, uses genetic variants (i.e., sections of the genetic code that differ between individuals) as instrumental variables to assess the causal effect of modifiable exposures on health outcomes using observational data. Instrumental variable analysis based on genetic instruments to address questions related to human health was first described in the early 2000s. It is called “Mendelian randomization” because it exploits the principle of random assortment in genetic inheritance, first observed and characterized by Gregor Mendel [13], which states that the inheritance of one trait is independent of the inheritance of other traits. Under this principle, genetic variants are thought to be randomly distributed in the population and, as a consequence, to be independent of environmental factors and other variables. MR can thus be understood as analogous to a randomized controlled trial in which the genotypes of one or several genetic variants separate individuals into random subgroups. To be considered an instrumental variable, a genetic variant must be associated with the exposure of interest (association assumption), must not be associated with any confounder of the exposure-outcome association (independence assumption), and must be associated with the outcome only via the exposure of interest (exclusion restriction assumption). Figure 1 shows a more detailed description of the MR methods and corresponding assumptions.
Fig. 1

Overview of Mendelian randomization methods. In order to be considered an instrumental variable, a genetic variant (G) has to be associated with the exposure of interest (E) (association assumption), must not be associated with any confounder (C) of the exposure-outcome association (independence assumption), and must not be associated with the outcome (O) other than via the exposure (exclusion restriction assumption). Given that genetic variants are fixed at conception and cannot be modified, any association between a genetic variant and a modifiable exposure or a disease is not susceptible to reverse causation. Therefore, if the genetic variant meets the three instrumental variable assumptions, the association established between the genetic variant and the outcome is assumed not to be due to confounding and can be used to infer a causal effect, including quantitative estimates and confidence intervals, of the exposure on the outcome. Typically, Mendelian randomization studies use single nucleotide polymorphisms (SNPs) within genes encoding proteins as instrumental variables
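The logic of Fig. 1 can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not an analysis from any reviewed study: the effect sizes, allele frequency, sample size, and the helper names `covariance` and `simulate` are all hypothetical. It generates a genotype G that satisfies the three assumptions, an unmeasured confounder C, an exposure E, and an outcome O, and then compares the confounded regression slope of O on E with the ratio (Wald) instrumental variable estimate.

```python
import random

def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def simulate(n=20000, true_effect=0.5, seed=1):
    """Simulate genotype G, exposure E and outcome O with a hidden confounder C."""
    rng = random.Random(seed)
    g, e, o = [], [], []
    for _ in range(n):
        gi = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0, 1 or 2
        ci = rng.gauss(0, 1)                    # unmeasured confounder, independent of G
        ei = 0.8 * gi + ci + rng.gauss(0, 1)    # G is associated with E
        oi = true_effect * ei + ci + rng.gauss(0, 1)  # G reaches O only through E
        g.append(gi)
        e.append(ei)
        o.append(oi)
    return g, e, o

g, e, o = simulate()
naive = covariance(e, o) / covariance(e, e)  # regression slope of O on E, biased by C
wald = covariance(g, o) / covariance(g, e)   # ratio (Wald) instrumental variable estimate
print(f"naive slope: {naive:.2f}, IV estimate: {wald:.2f}, simulated true effect: 0.50")
```

In this simulated setting the naive slope overstates the effect because C raises both E and O, whereas the ratio estimate stays close to the simulated causal effect.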

Examples of successful applications of MR include studies on cardiovascular and cardiometabolic outcomes [14, 15, 16, 17, 18, 19, 20, 21, 22]. The focus of many of these initial studies has been to assess whether there is causal evidence for associations between circulating cardiovascular biomarkers and subsequent cardiovascular outcomes, associations that have been established in many observational studies [14, 16, 22, 23]. For instance, Davey-Smith et al. observed associations of high sensitivity C-reactive protein (CRP) levels with blood pressure and hypertension among > 3500 British women, an association repeatedly seen in prior observational studies. They used an MR approach to show that a genetic polymorphism in the human CRP gene was associated with a robust difference in CRP levels but not with blood pressure and hypertension [14]. Thus, a causal association could not be inferred. Several subsequent studies have used common CRP gene polymorphisms to show that observed associations between elevated CRP levels and ischemic heart disease are not supported by causal evidence from MR analyses [22, 23]. Conversely, polymorphisms in the lipoprotein A (LPA) [16] and apolipoprotein E (APOE) [22] genes have been used to demonstrate that genetically elevated cholesterol levels are causally related to increased risk of ischemic heart disease.

Application of MR to epigenetic studies is expanding rapidly in the era of epigenome-wide association studies (EWAS). EWAS allow for the systematic assessment of many CpG sites across the genome in relation to a phenotype or disease of interest [24]. Microarray-based methods such as the Illumina Infinium Human Methylation 450 and the MethylationEPIC BeadChips [25] have been used in the majority of EWAS to date, allowing for the measurement of more than 450,000 CpG sites across the genome. Building on this, studies are now integrating the rich genetic information available from genome-wide association studies (GWAS) into EWAS in order to uncover epigenetic-phenotype associations and then use MR to assess the causal evidence for these associations.

Our objective was to conduct a systematic review of MR studies evaluating the causal role of environmentally responsive DNAm changes on the development of health states. We subsequently narrowed our review to elucidate whether epigenetic modifications are part of the mediating pathways by which exposure leads to disease or an outcome of interest. Thus, we especially focused on studies that formally evaluate causality of both exposure → DNAm as well as DNAm → outcome associations. Please refer to Table 1 for a comprehensive list of MR designs and corresponding terminologies.
Table 1

Mendelian randomization designs and corresponding terminology

| Design | Causal associations under evaluation | Associations evaluated and study population used | Study population A = B | Study population A ≠ B |
|---|---|---|---|---|
| 1-step MR | Exposure → outcome | SNP → exposure (study population A); SNP → outcome (study population B) | One-sample MR | Two-sample MR |
| 2-step MR | 1st step: exposure → mediator | SNP1 → exposure (study population A1); SNP1 → mediator (study population B1) | One-sample MR | Two-sample MR |
| 2-step MR | 2nd step: mediator → outcome | SNP2 → mediator (study population A2); SNP2 → outcome (study population B2) | One-sample MR | Two-sample MR |

Methods

Search Strategy and Study Selection

We searched PubMed for relevant studies published through April 8, 2018 using the following search strategy: (“Mendelian Randomization Analysis”[Mesh] OR “Mendelian Randomization” OR “Instrumental variable”) AND (“DNA methylation”[Mesh] OR “epigenomics”[Mesh] OR ((“DNA”[Mesh] OR “Deoxyribonucleic acid”) AND (“methylation”[Mesh] OR “methylation”)) OR “DNA methylation” OR “epigenomics” OR “epigenetics”). The search strategy retrieved a total of 41 citations. We included all articles that explicitly reported an interest in evaluating potentially “causal” associations of DNAm with health endpoints. The search had no language restrictions. Two investigators (M.G-P. and G.A.) independently reviewed all abstracts and excluded 25 papers based on the following exclusion criteria (Fig. 2): (a) not a Mendelian randomization study, (b) no original research (i.e., reviews, editorials, non-research letters), (c) non-human study, and (d) no DNAm measures. As a second layer of exclusion, we excluded three further studies without health endpoints, including one study evaluating DNAm as the outcome [26] and two studies evaluating epigenetic age and age acceleration, respectively, as the outcome [27, 28]. Any discrepancies were resolved by consensus and, if necessary, an additional reviewer (M.T-P.) was involved. A native speaker reviewed the full text of any non-English article that could not be included or excluded based on the initial abstract review. The reference lists of selected articles were checked for other potentially relevant articles, identifying no additional studies. We retrieved two additional articles by hand search. We included 15 articles in the final review, among which 4 explicitly included an environmental component as a determinant of DNAm [29••, 30, 31••, 32] (Fig. 2).
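The selection counts reported above can be reconciled with a short tally. This is only an arithmetic check of the numbers stated in the text; the variable names are illustrative.

```python
retrieved = 41           # citations returned by the PubMed search
excluded_abstract = 25   # excluded on criteria (a) to (d)
excluded_no_health = 3   # excluded for lacking a health endpoint
hand_search = 2          # articles added by hand search

included = retrieved - excluded_abstract - excluded_no_health + hand_search
with_environment = 4     # studies evaluating an environmental determinant of DNAm
without_environment = included - with_environment

print(included, without_environment)  # 15 11
```

The two resulting totals correspond to the 15 included articles and to the 11 and 4 studies summarized in Tables 2 and 3, respectively.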
We collected the following data for each study: first author, year of publication, study design, MR design, sample size and source, gender and age characteristics of participants, DNAm assessment, environmental determinant of DNAm (if any) and its assessment, health outcome assessment, DNAm quality control methods, genetic variant determination, summary of findings, and adjustment covariates (Table 2).
Fig. 2

Flow diagram of the study selection process. Summary of inclusion and exclusion criteria used in this systematic review of studies investigating the association between DNA methylation levels and health outcomes, 18 April 2018. Caramaschi et al. [29••], Richmond et al. [35], Richardson et al. [36], and Arathimos et al. [37] used data from the ALSPAC cohort (UK); Morales et al. [30] used data from the INMA cohort (Spain); Gao et al. [43] used data from the Chinese National Twin Registry cohort (China); Jhun et al. [31••] used data from the GENOA cohort (USA); Allard et al. [32] used data from the Gen3G cohort (Canada); Richard et al. [40] and Mendelson et al. [41•] used data from the Framingham Heart Study (USA); Nano et al. [38] used data from the Rotterdam Health Study (The Netherlands); Dekkers et al. [39] used data from the Bios Consortium (The Netherlands); Hannon et al. [42] used data from the University College of London (UK) and Aberdeen (Scotland) case-control samples and the Monozygotic Twins cohort (The Netherlands); Truong et al. [44] used data from the French-Canadian Family Study; and Wahl et al. [45••] used data from the EPICOR, KORA, and LOLIPOP studies

Table 2

Table of studies applying Mendelian randomization (MR) to evaluate the causal effect of DNA methylation on an outcome, not including a formal evaluation of DNA methylation environmental determinants

| Study | MR design | MR equation | Study sample | Demo. | DNAm | Health outcome | IV | Finding | Causal evidence | Model adjustments |
|---|---|---|---|---|---|---|---|---|---|---|
| Dekkers et al. [39] | 1-step (bidirectional); 1-sample | DNAm → lipid levels (direction of interest) | N = up to 3296 (Bios Consortium, 6 Dutch cohorts) | 43% men; mean 53 years | Whole blood; Illumina 450k; cg00574958, cg17058475 (CPT1A); cg11024682 (SREBF1); cg27168858 (DHCR24); cg27243685, cg06500161 (ABCG1) | TG, LDL-C, HDL-C | cis-meQTL of CpGs associated with lipid levels | No evidence of an effect of DNAm on lipid levels | No (but yes in the other direction) | Gender, age, cell counts, and batch effects |
| Richmond et al. [35] | 1-step (bidirectional); 1-sample | DNAm → BMI (direction of interest) | N = 845 (ALSPAC-ARIES) | 49% men; mean 17.1 years | Peripheral blood; Illumina 450k | BMI | 2 cis-SNPs (rs8102595 and rs3826795) associated with HIF3A methylation | The effect of SNPs on BMI was different from that expected if methylation at HIF3A had a causal effect on BMI | No (but yes in the other direction) | Bisulfite conversion batch |
| Arathimos et al. [37] | 1-step (bidirectional); 2-sample | DNAm → asthma and wheeze (direction of interest) | N = 781 (ALSPAC-ARIES) for SNPs-DNAm; N = up to 26,700 (GABRIEL consortium) for DNAm-asthma | 49% men; mean 7.5 years | Whole blood; Illumina 450k | Asthma | cis-meQTL available for 31 CpG sites at 7.5 years | Evidence for a causal association at cg11938718 but not after adjustment for multiple testing | No after adjustment for multiple testing | EWAS: sex, maternal age and education, pregnancy smoking, parity, and unknown confounders by surrogate variables |
| Gao et al. [43] | 1-step; 1-sample | DNAm → BMI | N = 469 (Chinese National Twin Registry) | 66% men; mean 44.8 years | Peripheral blood; Illumina 450k; cg15053022 (ATP4A) | BMI | cis-SNP rs748212 associated with the CpG site cg15053022 | Lower methylation levels at cg15053022 associated with higher BMI | Yes | Sex, age, zygosity, smoking, alcohol, region, physical activity, diet, and education |
| Hannon et al. [42] | 1-step summarized data MR; 1-sample | DNAm → 43 phenotypes | Unknown N; 3 cohorts together: University College London, Aberdeen, and Monozygotic twins | Not specified | Whole blood; Illumina 450k; cg25258033; cg05901451 | 43 different outcomes | meQTL SNPs | Several CpG sites associated with overlapping phenotypes | Yes (for some traits) | Age, sex, cell counts, body mass index, smoking, and batch |
| Mendelson et al. [41•] | 1-step (bidirectional); 2-sample | DNAm → BMI (direction of interest) | N = 2377 (FHS offspring cohort) for cis-meQTL SNP selection; N = up to 339,224 (GIANT consortium) for SNPs-BMI | 39–60% men; mean 67–79 years | Whole blood; Illumina 450k; 83 CpG sites related with BMI in EWAS | BMI | cis-meQTL SNPs | DNAm at cg11024682 (SREBF1) and cg07730360 (not annotated) showed causal effect on BMI | Yes (causal association in the other direction as well) | Age and sex |
| Nano et al. [38] | 1-step (bidirectional); 2-sample | DNAm → liver enzymes (direction of interest) | N = 1450 (Rotterdam) for CpG-enzymes; N = 4034 (Wahl et al.) for SNP selection | 45% men; mean 63.5 years | Whole blood; Illumina 450k | Liver enzymes (serum levels) | cis-SNP for each CpG site influencing DNAm levels (from the study of Wahl et al.) | Methylation at SLC7A11 might be a cause of GGT but not for SLC43A1 and PHGDH | Yes | Not reported |
| Richard et al. [40] | 1-step (bidirectional); 2-sample | DNAm → blood pressure (direction of interest) | N = 4513 (ARIC, FHS, RS, and WHI-EMPC) for meQTL SNP selection; unknown N (1000 Genomes data) for SNPs-blood pressure | Unknown % men; mean 56–76 years | Whole blood; Illumina 450k | Blood pressure | meQTL SNPs | Methylation at cg08035323 (TAF1B-YWHAQ) influences BP | Yes | Age, sex, cell counts, body mass index, smoking, ancestry, and batch effects |
| Richardson et al. [36] | 1-step; 1-sample | DNAm → CV traits (BMI, IL-6, CRP, TC, Apo A1, Apo B, leptin) | N = 646 to 856 depending on trait (ALSPAC-ARIES) | 49% men; mean 7.5 years | Whole blood; Illumina 450k | Cardiovascular traits: BMI, SBP, DBP, IL-6, leptin, TC, HDL, LDL, CRP, triglycerides, adiponectin, Apo B, Apo A1 | 37,812 meQTL SNPs associated with a nearby CpG site (cis-meQTLs) | Cardiovascular traits might be influenced by altered DNAm levels at the ABO, ADCY3, ADIPOQ, APOA1, APOB, and IL6R regions | Yes | Age, sex, bisulfite conversion batch, top 10 ancestry principal components, and cell counts |
| Truong et al. [44] | 1-step (bidirectional); 1-sample | DNAm → triglycerides (direction of interest) | N = 199 (French-Canadian Family Study) (replication of findings in N = 324 independent individuals from MARTHA study) | 47% men; mean 40 years | Peripheral blood; Illumina 450k | Serum levels of triglycerides (TG) (spectrophotometry) | 6 cis-SNPs influencing methylation at cg14476101 (PHGDH) | The 6 cis-SNPs were not associated with TG; thus, DNAm at cg14476101 is not a causal factor for TG | No (but yes in the other direction) | Age, sex, cell type proportion, and batch effects |
| Wahl et al. [45••] | 1-step (bidirectional); 1-sample | DNAm → BMI (direction of interest) | N = 4034 (KORA and LOLIPOP) | ~ 60% men; mean 54 years | Whole blood; Illumina 450k | BMI | cis-SNPs influencing DNAm in blood (rs11150675 near NFATC2IP) | Methylation at the CpG site cg26663590 at NFATC2IP showed causal association with increased BMI after Bonferroni correction | Only for 1 CpG site; authors conclude no causal association of DNAm on BMI | Age, gender, smoking status, physical activity index, alcohol consumption, white blood cell PC |

N = 11

To assess study quality (Supplemental Material, Supplemental Table 1), we adapted the criteria used by Longnecker et al. [33] for observational studies and the criteria by Boef et al. for MR studies [34]. We organized the presentation of the results by the absence or presence of an environmental determinant of DNAm being explicitly evaluated.

Current Perspectives and Results

Studies Without Formal Evaluation of DNA Methylation Environmental Determinants

We retrieved 11 articles that applied MR to evaluate the causal role of DNAm on health outcomes but did not focus on an environmental determinant of DNAm (Table 2). The studies were conducted among individuals from the UK [35, 36, 37], the Netherlands [38, 39], the USA [40, 41•], several European countries [42], China [43], and Canada [44], and one study combined European and Asian populations [45••]. Most studies conducted traditional instrumental variable analysis with genetic instruments, henceforth referred to as “one-step” MR (see Table 1). In the “one-sample” approach (see Table 1), the SNP-DNAm and the SNP-outcome associations are studied in a single study population [46], whereas in the so-called two-sample MR design, the SNP-DNAm and SNP-outcome associations are estimated in two non-overlapping study populations [46]. For a more detailed discussion of one-sample and two-sample MR designs, see a methodological review by Davey-Smith et al. [47]. Four of the 11 studies employed the two-sample design by using published data from international consortia of epidemiologic cohorts with genetic data [37, 38, 40, 41•].
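For a single genetic instrument, the two-sample design reduces to a ratio of two summary estimates. The sketch below is a minimal illustration, assuming one SNP, uncorrelated estimates from non-overlapping samples, and a first-order delta-method standard error; the function name `wald_ratio` and all numbers are hypothetical and are not taken from any reviewed study.

```python
import math

def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Ratio estimate of the DNAm -> outcome effect from summary statistics.

    beta_gx, se_gx: SNP-DNAm association and its standard error (sample A)
    beta_gy, se_gy: SNP-outcome association and its standard error (sample B)
    The standard error uses a first-order delta-method approximation and
    assumes the two estimates are uncorrelated (non-overlapping samples).
    """
    estimate = beta_gy / beta_gx
    variance = se_gy**2 / beta_gx**2 + (beta_gy**2 * se_gx**2) / beta_gx**4
    return estimate, math.sqrt(variance)

# Hypothetical summary statistics, for illustration only.
estimate, se = wald_ratio(beta_gx=0.40, se_gx=0.05, beta_gy=0.10, se_gy=0.03)
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"causal estimate = {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The same function applies to a one-sample design, although the independence assumption behind the standard error is then less defensible because both associations come from the same individuals.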

Special mention needs to be made here of bidirectional MR, a sensitivity analysis intended to disentangle the direction of the association between an exposure and an outcome [47]. Here, the MR approach is conducted in both directions (DNAm → outcome and outcome → DNAm). Thus, genetic instruments for both DNAm and outcome are required, and the method will be valid only under the condition that the two genetic instruments are independent of each other [47]. Among the abovementioned studies, eight conducted a bidirectional analysis [35, 37, 38, 39, 40, 41•, 44, 45••].
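The bidirectional design can be sketched with simulated data. The example below is a hedged illustration, not a reanalysis of any cited study: it assumes two independent instruments, one for DNAm and one for a trait such as BMI, and a hypothetical data-generating model in which the trait alters DNAm but not the reverse. All effect sizes, allele frequencies, and helper names are assumptions.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def allele_count(rng, freq):
    """Number of effect alleles (0, 1 or 2) drawn at the given frequency."""
    return int(rng.random() < freq) + int(rng.random() < freq)

rng = random.Random(7)
g_trait, g_dnam, trait, dnam = [], [], [], []
for _ in range(20000):
    gt = allele_count(rng, 0.4)   # instrument for the trait
    gm = allele_count(rng, 0.3)   # independent instrument for DNAm (e.g., a cis-meQTL)
    u = rng.gauss(0, 1)           # shared unmeasured confounder
    t = 0.7 * gt + u + rng.gauss(0, 1)
    m = 0.9 * gm + 0.4 * t + u + rng.gauss(0, 1)  # the trait alters DNAm, not the reverse
    g_trait.append(gt)
    g_dnam.append(gm)
    trait.append(t)
    dnam.append(m)

forward = cov(g_dnam, trait) / cov(g_dnam, dnam)    # DNAm -> trait, simulated effect 0
reverse = cov(g_trait, dnam) / cov(g_trait, trait)  # trait -> DNAm, simulated effect 0.4
print(f"DNAm -> trait: {forward:.2f}; trait -> DNAm: {reverse:.2f}")
```

Under these assumptions the forward estimate is close to zero and the reverse estimate is close to the simulated effect, which is the pattern several of the studies below describe as DNAm being a consequence rather than a cause.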

For example, Wahl et al. [45••] first conducted an EWAS and identified CpG sites associated with BMI (N = 5387 for discovery and N = 4851 for replication). They then performed a bidirectional analysis to assess whether DNAm is a cause or a consequence of BMI levels using a Mendelian randomization framework. They first used SNPs associated with methylation levels (i.e., cis-methylation quantitative trait loci (QTL)) to assess the direction DNAm → BMI and found that only the CpG site cg26663590 at NFATC2IP showed a causal association with increased BMI. They then conducted an MR analysis in the reverse direction, BMI → DNAm, using SNPs previously reported to be associated with BMI and found several causal associations of BMI with DNAm changes. The authors concluded that alterations in DNAm are predominantly a consequence of adiposity rather than a cause. Richmond et al. [35] also found more evidence for causality of BMI on methylation levels in HIF3A than in the opposite direction using data from the ALSPAC-ARIES cohort (N = 845) with a bidirectional MR design. Similarly, Dekkers et al. [39] and Truong et al. [44] concluded that differential methylation is the consequence of interindividual variation in blood lipid levels (triglycerides, LDL-cholesterol, and HDL-cholesterol) and not the cause (N = 3296 and 199 in Dekkers et al. and Truong et al., respectively). Mendelson and co-authors also conducted a bidirectional MR analysis to identify the link between adiposity and differential methylation using data from the Framingham Heart Study (FHS) (N for the SNPs-DNAm association = 2377) and the GIANT consortium (N for the SNPs-BMI association = 339,224) (two-sample study) [41•]. Interestingly, this study found that BMI is likely both a consequence and a cause of DNAm. In particular, while methylation at the CpG site cg11024682 in the SREBF1 region was causally associated with decreased BMI (causal beta estimate = −2.51 kg/m² per standard deviation change in DNAm levels, p value = 0.02), 16 CpG sites were found to be differentially methylated as a consequence of BMI (using a nominal p value of 0.05). Other studies using a bidirectional MR design revealed causal associations of DNAm with liver enzyme levels [38] and blood pressure [40] but not with asthma and wheeze [37].

In the one-direction MR framework, Gao et al. concluded that increased DNAm at ATP4A was causally associated with lower BMI in the Chinese National Twin Registry cohort (N = 469, causal estimate = −0.197 kg/m² per 10% increase in DNAm levels, p value < 0.001) [43]. Another study in the ALSPAC-ARIES cohort by Richardson and co-authors [36] found that DNAm changes at different gene regions were causally associated with a list of cardiometabolic traits (N ranged from 646 to 856 depending on the trait). Causal estimates per SD change in DNAm were reported for adiponectin with cg05578595 at ADIPOQ (−0.846 ng/mL, p value = 5.93 × 10^−7), IL-6 with cg21160290 at ABO (−0.293 pg/mL, p value = 1.77 × 10^−6) and with cg02856953 at IL6R (0.468 pg/mL, p value = 0.008), CRP levels with cg04111102 at LEPR (−0.265 mg/L, p value = 0.001), apolipoprotein A1 with cg04087571 at APOA1 (−0.301 g/L, p value = 2.68 × 10^−4), apolipoprotein B with cg25035485 at APOB (0.298 g/L, p value = 0.009) and with cg00908766 at SORT1 (0.271 g/L, p value = 2.74 × 10^−5), total cholesterol with cg19610905 at FADS1 (−0.363 mmol/L, p value = 0.003), and BMI with cg01884057 at ADCY3 (0.106 kg/m², p value = 0.028). A study from a consortium combining international studies found that epigenetic changes at the CpG site cg08035323 between TAF1B and YWHAQ were positively and causally associated with systolic (causal estimate 20.9 mmHg per 1% change in DNAm, p value = 0.0091) and diastolic blood pressure (causal estimate 15.1 mmHg per 1% change in DNAm, p value = 0.0111) [40]. DNAm was not causally related to childhood asthma status after multiple comparison correction, according to a study using data from the ALSPAC-ARIES cohort (N for the SNPs-DNAm association = 781) and the GABRIEL consortium (N for the SNPs-asthma association = up to 26,700) [37].
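Several of the findings above depend on whether an association survives correction for multiple comparisons. As a minimal sketch, assuming a Bonferroni correction (the reviewed studies may have used other procedures), the adjusted threshold is the nominal significance level divided by the number of tests. The function name and the p values below are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the Bonferroni threshold and, per test, whether it passes."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# Hypothetical p values for five CpG sites, for illustration only.
p_values = [0.0004, 0.012, 0.03, 0.2, 0.6]
threshold, passes = bonferroni(p_values)
nominal = [p < 0.05 for p in p_values]
print(threshold, nominal, passes)
```

In this made-up example three sites are nominally significant at 0.05, but only the first remains significant after correction.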

Studies with Formal Evaluation of DNA Methylation Environmental Determinants

We retrieved four articles assessing the role of DNAm as a mediator between an exposure and a health outcome (Table 3). These studies were conducted in the USA [31••], Canada [32], UK [29••], and Spain [30]. In this regard, Relton and Davey-Smith proposed a two-step epigenetic MR approach [10•], which consists of two separate MR analyses to assess causal evidence for (1) the association between the exposure and the epigenetic mediator (i.e., first step: exposure → DNAm) and (2) the association between the epigenetic mediator and the outcome of interest (i.e., second step: DNAm → outcome) (see Fig. 3). The main difficulty of the two-step technique lies in finding separate genetic variants for the exposure (first step) and the epigenetic mediator (second step), especially if the exposure and the epigenetic marker are closely related.
Table 3

Table of studies applying Mendelian randomization (MR) to evaluate the causal effect of DNA methylation on an outcome, including a formal evaluation of DNA methylation environmental determinants

| Study | MR design | MR equation | Study sample | Demo. | DNAm | Environ. determinant | Health outcome | IV | Finding | Causal evidence | Model adjustments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Allard et al. [32] | 2-step; 1st step: 2-sample; 2nd step: 1-sample | Maternal glycemia → DNAm → neonatal leptin | 1st step: N = 166 mother-child (Gen3G cohort) for SNPs-DNAm; N = 467 (Gen3G cohort) for SNP selection. 2nd step: N = 170 mother-child (Gen3G cohort) | Mothers: 28 years old; offspring: 52% men, neonatal | Cord blood; Illumina 450k; cg12083122 (LEP) | Maternal glycemia at 2nd trimester of pregnancy | Neonatal leptin levels (quest.) | 1st step: GRS based on 10 known glycemic SNPs; 2nd step: no adequate IV | 1st step: higher GRS10 (high maternal glycemia) associated with lower DNAm at cg12083122; 2nd step: not conducted | 1st step: yes; 2nd step: not conducted | Not reported |
| Morales et al. [30] | 1-step; 2-sample | Mother’s DNAm at CpGs related with smoking → neonatal birth weight | Previous EWAS: N = 179 mothers (INMA cohort) for smoking-related CpGs. 2-sample MR: N = 136 mother-child (INMA cohort) for SNPs-DNAm; N = 26,836 (EGG consortium) for SNPs-birth weight | Mothers: 0% men, mean 31 years; offspring: 52% men, neonatal | Placenta; Illumina 450k; cg27402634 (between LINC00086 and LEKR1); cg20340720 (WBP1L); cg25585967 and cg12294026 (TRIO) | Smoking during pregnancy (quest. in weeks 12 and 32) | Birth weight Z score | meQTLs for cg27402634, between LINC00086 and LEKR1 (the specific SNPs are not stated) | EWAS: smoking decreases DNAm at cg27402634; 2-sample MR: decreases in DNAm at cg27402634 decrease birth weight | Yes | EWAS: study area, gestational age, sex, maternal age and social class, parity, batch, father smoking; 2-sample: not reported |
| Caramaschi et al. [29••] | 2-step; 1st step: 2-sample; 2nd step: 2-sample | Maternal B12 → offspring DNAm → children IQ score | 1st step: N = 641 mother-child (ALSPAC-ARIES) for SNPs-DNAm; unknown N (previous literature) for SNP selection. 2nd step: N = 3354 to 3843 (ALSPAC NON-ARIES) for cis-SNPs-IQ score; unknown N (ALSPAC-ARIES) for cis-SNP selection | 1st step: mothers, mean 29 years; offspring, 47% men, neonatal. 2nd step: offspring, 49% men, neonatal at baseline, mean 8 years old at follow-up | Cord blood; Illumina 450k; cg10543947 (APOL2); cg15676719 (RCSD1) | Maternal in utero vitamin B12 levels | IQ score (test) | 1st step: rs492602 and rs104778 in FUT2; 2nd step: one cis-SNP to each CpG site identified in 1st step | 1st step: maternal B12 increases DNAm at APOL2 and RCSD1; 2nd step: DNAm at APOL2 increases IQ; DNAm at RCSD1 decreases IQ | 1st step: yes; 2nd step: associations in opposite direction for each CpG, thus no conclusion could be made | 1st step: batch, cell count, child’s genotype, mother age, BMI, education, pregnancy smoking, and parity; 2nd step: age of testing |
| Jhun et al. [31••] | 2-step; 1st step: 1-sample; 2nd step: 1-sample | Cigarette smoking → DNAm → inflammation markers | 1st step: N = 822 (GENOA study); 2nd step: N = 822 (GENOA study) | 28% men; mean 66.6 years | Peripheral blood leucocytes; Illumina 27k BeadXpress; cg03636183 (F2RL3); cg19859270 (GPR15) | Cigarette smoking (quest.) | Inflammation markers (fasting blood) | 1st step: 210 SNPs identified from GWAS; 2nd step: 12 cis-meQTLs for cg03636183 (F2RL3) | 1st step: smoking decreases DNAm in F2RL3 and GPR15; 2nd step: F2RL3 DNAm associated with lower IL-18 | 1st step: yes; 2nd step: yes | 1st step: age, sex, plate, cell proportions and family; 2nd step: same as above |

N = 4

Fig. 3

Two-step Mendelian randomization framework considering epigenetic changes as a mediator of the association between a given exposure and a given outcome. a Overall two-step Mendelian randomization framework. b First Mendelian randomization step to evaluate the causal effect of the exposure on DNA methylation changes. In this step, the genetic variant(s) is (are) used as a proxy of the exposure. c Second Mendelian randomization step to evaluate the causal effect of DNA methylation changes on the outcome. In this step, a different genetic variant(s) is (are) used as a proxy of the DNA methylation. Dotted arrows indicate the causal association that is tested in each step. Figure adapted from Relton and Davey-Smith [10•]
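One way to combine the two steps in Fig. 3 into a single mediated effect is the product of coefficients. The sketch below assumes linear effects, independent first-step and second-step estimates, and a first-order delta-method standard error; it is not necessarily how the reviewed studies combined their results, and the function name `two_step_indirect` and all values are hypothetical.

```python
import math

def two_step_indirect(beta1, se1, beta2, se2):
    """Indirect (mediated) effect of the exposure on the outcome through DNAm.

    beta1, se1: first-step MR estimate (exposure -> DNAm)
    beta2, se2: second-step MR estimate (DNAm -> outcome)
    Assumes linear effects and independent estimates from the two steps.
    """
    indirect = beta1 * beta2
    se = math.sqrt(beta1**2 * se2**2 + beta2**2 * se1**2)
    return indirect, se

# Hypothetical estimates: the exposure lowers DNAm, and lower DNAm raises the outcome.
indirect, se = two_step_indirect(beta1=-0.30, se1=0.10, beta2=-0.50, se2=0.20)
print(f"indirect effect = {indirect:.3f} (SE {se:.3f})")
```

With two negative steps the product is positive, mirroring a scenario in which an exposure decreases methylation and decreased methylation increases the outcome.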

For example, Jhun et al. used two-step epigenetic MR analysis among 822 African Americans participating in the GENOA study to investigate whether DNAm mediates the association between cigarette smoking and inflammation [31••], which was assessed as the levels of CRP, interleukin-6 (IL-6), interleukin-18 (IL-18), and fibrinogen from fasting blood samples. Each step was a standard one-sample MR approach. For the first step (cigarette smoking → DNAm causal association), they identified several SNPs associated with cigarette smoking in their study sample among a group of candidate SNPs that were previously associated with cigarette smoking in a GWAS. Each of the SNPs and a genetic risk score were used as instrumental variables to assess whether there was evidence of a causal association between current cigarette smoking and DNAm. They observed that current smoking status was causally associated with DNAm levels at cg03636183 in the F2RL3 gene and at cg19859270 in the GPR15 gene. In the second step (DNAm → inflammation markers), they searched for suitable instrumental variables for the CpG sites that were significant in the first step. They observed that the DNAm of cg03636183 in F2RL3 was associated with IL-18 levels. Combining the results from the first and second steps, the authors concluded that each additional coded allele of rs4074134 for current smokers was associated with a 0.26-unit decrease in the DNAm of cg03636183, which resulted in a 3% increase in serum IL-18 levels. Thus, as the log odds of smoking increased, DNAm levels of cg03636183 decreased, resulting in an increase in serum IL-18.
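A genetic risk score of the kind used as an instrument above can be sketched as a sum of risk-allele counts, optionally weighted by per-allele effects. This is a generic illustration; the scoring rule actually used by Jhun et al. is not described here, and the function name, genotypes, and weights are hypothetical.

```python
def genetic_risk_score(allele_counts, weights=None):
    """Sum of risk-allele counts (0, 1 or 2 per SNP), optionally weighted."""
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("one weight per SNP is required")
    return sum(w * g for w, g in zip(weights, allele_counts))

# Hypothetical genotypes for three SNPs in two participants.
participants = {"id_1": [0, 1, 2], "id_2": [2, 2, 1]}
weights = [0.10, 0.25, 0.05]  # hypothetical per-allele effects on the exposure

for pid, genotype in participants.items():
    unweighted = genetic_risk_score(genotype)
    weighted = genetic_risk_score(genotype, weights)
    print(pid, unweighted, round(weighted, 3))
```

Combining several variants into one score generally yields a stronger instrument than any single SNP, at the cost of requiring every contributing variant to satisfy the instrumental variable assumptions.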

Other environmental determinants of DNAm changes considered among the retrieved articles were maternal glycemia [32], maternal circulating B12 levels [29••], and maternal smoking during pregnancy [30]. Caramaschi et al. applied a two-step MR approach to assess the mediation by cord blood DNAm changes in the association between maternal vitamin B12 levels and offspring cognition at the age of 8 [29••]. The study used data from the ALSPAC cohort, including the ARIES (N = 641) and NON-ARIES (N = 3354 to 3843) subcohorts separately, as well as information from previously published studies reporting genotypes in FUT2 associated with vitamin B12 levels (sample size not reported). The results from the first step suggested small DNAm increases at APOL2 and RCSD1 due to increased maternal vitamin B12. However, for the second step, evaluating the causal estimate of vitamin B12-responsive DNAm on offspring intelligence, the causal estimates were positive for the APOL2 CpG site but inverse for the RCSD1 CpG site, thus providing insufficient evidence for a positive effect of in utero vitamin B12 exposure on offspring cognition. Allard et al. investigated whether cord blood DNAm changes near the LEP locus mediate the relation between maternal glycemia and neonatal leptin levels [32]. The study used data from the Gen3G cohort from Canada (N = 170 to 467, depending on the association being evaluated). The authors concluded from the first step (glycemia → DNAm) that maternal glycemia leads to epigenetic adaptations in the offspring LEP region, but because adequate genetic instruments for the second step (DNAm → leptin levels) were lacking, they could not confirm a causal association between maternal glycemia and neonatal leptin levels.

Finally, Morales et al. considered the mediating role of DNAm in the association between maternal cigarette smoking during pregnancy and offspring birth weight [30]. However, they did not conduct a formal two-step MR analysis. Instead, they first conducted an EWAS of placenta DNAm in relation to maternal cigarette smoking in order to identify candidate CpG sites that are associated with cigarette smoking during pregnancy in the INMA cohort from Spain (N = 179). They later conducted a single-step MR analysis to evaluate the causal link between epigenetic changes at those specific sites and birth weight (DNAm → birth weight). They used data on 136 mother-child pairs from the INMA cohort and data from the Early Growth Genetics (EGG) consortium (N = 26,836). The authors concluded that a 1% increase in methylation levels at the CpG site cg27402634, located between LINC00086 and LEKR1, which had been associated with decreased exposure to tobacco smoke during pregnancy in the previous EWAS, leads to an increase in birth weight of 3.36 g.

General Discussion

Epidemiologic evidence from distinct study populations based on MR methods, which aim to disentangle cause and effect from correlational associations, suggests an overall trend for a causal association between DNAm and cardiometabolic endpoints. For any given specific cardiometabolic endpoint, however, the evidence is not sufficient given the low number of studies. Moreover, given the available evidence, it cannot be ruled out that, for some of the evaluated traits, DNAm modifications can be both cause and consequence of interindividual differences. The quality of the retrieved studies was overall good (Supplemental Material, Supplemental Table 1). While the low number of studies evaluating environmental determinants of epigenetic changes on health endpoints and the heterogeneity of the endpoints evaluated limit the conclusions of this review, the evidence accrued so far supports the importance of exposure to tobacco smoke in modulating the epigenome and its potential health consequences.

Main Challenges and Methodological Considerations in Mendelian Randomization Analyses

Despite their increasing popularity and applicability, MR methods have limitations [12, 48]. The main challenge of this approach lies in the strength and validity of the genetic variants used as instrumental variables. A genetic variant is considered to be a weak instrument for the trait of interest if it explains only a small proportion of the variation in that trait. While Staiger and Stock proposed a first and simple rule of thumb that designates instruments as weak when the F statistic of the SNP-trait association is below 10 [49], more complex criteria have been developed subsequently [50]. Most of the retrieved articles addressed the strength of the genetic instruments used by reporting the partial F statistic or p values, with the exception of four studies [29••, 36, 37, 42]. In practice, weak instrument bias can be alleviated by increasing the sample size, for instance using publicly available GWAS data [51], by increasing the number of instrumental variables used at a time [47], or by applying Bayesian modeling, which is more robust to instrumental weakness [52•].
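The rule of thumb can be made concrete with a short sketch. It assumes a first-stage regression of the trait on k genetic instruments with no additional covariates, for which the F statistic can be written in terms of the variance explained, F = (R² / k) × (n − k − 1) / (1 − R²). The function names and the example values are hypothetical.

```python
def first_stage_f(r_squared: float, n: int, k: int = 1) -> float:
    """F statistic of a regression of the trait on k instruments (no covariates)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    if k < 1 or n <= k + 1:
        raise ValueError("need k >= 1 instruments and n > k + 1 observations")
    return (r_squared / k) * (n - k - 1) / (1.0 - r_squared)


def is_weak_instrument(f_statistic: float, threshold: float = 10.0) -> bool:
    """Apply the F < 10 rule of thumb; the threshold is a convention, not a test."""
    return f_statistic < threshold


# A variant explaining 1% of trait variance, at two hypothetical sample sizes.
for n in (500, 5000):
    f_stat = first_stage_f(r_squared=0.01, n=n)
    print(f"n={n}: F={f_stat:.1f}, weak={is_weak_instrument(f_stat)}")
```

The same instrument moves from below to above the threshold as the sample grows, which is why larger samples are one practical remedy for weak instrument bias.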

In addition, the independence and restriction assumptions required for a genetic variant to qualify as an instrumental variable may be violated in practice due to biological reasons (e.g., pleiotropy), non-Mendelian inheritance (e.g., linkage disequilibrium, LD), or population characteristics (e.g., population stratification). Pleiotropy occurs when the genetic instrument is associated not only with the modifiable exposure but also with other risk factors of the outcome of interest [53]. As GWAS have become popular, associations of genetic variation with several phenotypes have been observed, raising concerns about pleiotropic effects. Among the retrieved studies, pleiotropy was assessed by applying specific analytic techniques such as “Egger” regression [30, 40] or by increasing the number of variants considered [29••, 30, 32, 40, 42]. Pleiotropy can also be interrogated by reviewing previous GWAS, by working with variants located in genes with a known function, and by conducting principal components or canonical correlation analysis [52•, 54, 55, 56]. The phenomenon of LD, i.e., several genetic variants distributed in the population in a non-independent manner, can be addressed by using similar procedures as for pleiotropy, especially by better understanding the biological function of the genetic variants/genes included in the analysis [51].
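The idea behind Egger regression can be sketched with summary statistics. The example below assumes uncorrelated variants, first-order inverse-variance weights based on the SNP-outcome standard errors, and point estimates only (no standard errors or tests); it illustrates the general technique and is not the procedure used in the cited studies. All names and numbers are hypothetical.

```python
from typing import List, Tuple


def ivw_estimate(beta_x: List[float], beta_y: List[float], se_y: List[float]) -> float:
    """Inverse-variance weighted slope, with the regression line forced through the origin."""
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_x))
    return num / den


def mr_egger(beta_x: List[float], beta_y: List[float], se_y: List[float]) -> Tuple[float, float]:
    """Weighted regression with a free intercept; returns (intercept, slope).

    An intercept away from zero suggests directional pleiotropy.
    """
    # Orient each variant so that its association with the exposure is positive.
    xs = [abs(bx) for bx in beta_x]
    ys = [by if bx >= 0 else -by for bx, by in zip(beta_x, beta_y)]
    weights = [1.0 / s ** 2 for s in se_y]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: true slope 0.5 plus a constant pleiotropic effect of 0.02.
beta_x = [0.1, 0.2, 0.3, 0.4]
beta_y = [0.07, 0.12, 0.17, 0.22]
se_y = [0.01, 0.01, 0.01, 0.01]

intercept, slope = mr_egger(beta_x, beta_y, se_y)
print(f"IVW slope:       {ivw_estimate(beta_x, beta_y, se_y):.3f}")
print(f"Egger slope:     {slope:.3f}")
print(f"Egger intercept: {intercept:.3f}")
```

In this constructed example the intercept absorbs the constant pleiotropic effect, whereas the slope through the origin is pulled away from the value used to generate the data.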

Special mention must be made of “horizontal pleiotropy,” in which the genetic variant affects both the exposure and the outcome (or a factor causally related to the outcome) through two independent biological pathways. Distinguishing mediation from horizontal pleiotropy is of particular concern in studies that assess the causal role of DNAm on health endpoints by using SNPs in cis regions as proxies for DNAm, which are close to genes known to be relevant for the health endpoint under investigation. This scenario can be tested with the use of multiple genetic variants as instruments. In fact, Richardson et al. extensively discussed this particular challenge but could not rule it out because of the use of single-variant instruments in their study [36].

Finally, population stratification occurs in the presence of systematic genetic differences between individuals, commonly (but not necessarily) due to a mixture of subjects from distinct genetic backgrounds. This limitation can result in invalid associations if not appropriately corrected. In practice, population stratification can be alleviated by restricting the analysis to a subsample of individuals with the same genetic background [51]. For GWAS including large samples from different geographic sites and ancestries, a common approach is to account for population structure in genetic data using principal components analysis [57], although other alternatives such as the use of mixed models have also been proposed [58, 59].
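The principal components approach can be illustrated with a toy sketch. It assumes a small genotype matrix coded as allele counts (0, 1, 2) and extracts only the first component by power iteration in plain Python; real analyses rely on dedicated software applied to genome-wide data. The function name and the data are hypothetical.

```python
import random
from typing import List


def first_pc_scores(genotypes: List[List[float]], iterations: int = 200, seed: int = 0) -> List[float]:
    """Score each individual (row) on the first principal component of the genotypes."""
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP (column) at its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    rng = random.Random(seed)
    loadings = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for _ in range(iterations):
        # One power-iteration step: loadings <- X^T (X loadings), then normalize.
        scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
        loadings = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(w * w for w in loadings) ** 0.5
        loadings = [w / norm for w in loadings]
    return [sum(x * w for x, w in zip(row, loadings)) for row in centered]


# Hypothetical sample: three individuals from each of two groups that differ
# in allele frequency at all four SNPs.
genotypes = [
    [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1],  # group A
    [2, 2, 1, 2], [2, 1, 2, 2], [1, 2, 2, 2],  # group B
]
for label, score in zip("AAABBB", first_pc_scores(genotypes)):
    print(label, round(score, 2))
```

The first component separates the two groups; in an association or MR model, such scores would typically enter as covariates so that ancestry differences are not mistaken for genetic effects.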

Seven of the 15 studies included in this review formally discussed the plausibility of the independence and restriction assumptions for instrumental variables [31••, 35, 36, 39, 40, 42, 45••], and two of them also evaluated the association of the genetic variants with measured confounders [31••, 35]. Sex and age are important sociodemographic factors that must be considered as potential confounders, since they have also been related to differences in DNAm [60, 61] and are commonly related to health outcomes. Only one study did not address confounding by sex or age [35], and in two other studies the model adjustments were not specified [32, 38]. Additional limitations for the application and interpretation of MR studies have been described elsewhere [20, 34, 47, 51].

Other Methodological Considerations

While the drawbacks of two-sample MR methods are the lack of consistent confounder information among the participating cohorts, potential population stratification and, in general, heterogeneity between the two samples used for the analysis, the increasing availability of published GWAS makes the two-sample methodology very flexible. First, GWAS can be used as a source of genetic instruments. Second, two-sample MR usually takes advantage of SNP-outcome associations that benefit from the large sample sizes typical of publicly available consortia. Third, two-sample MR can serve as external validation of SNP-outcome results in the setting of traditional one-sample MR. A particular challenge of the two-sample MR design is the difficulty in interpreting the temporality of the exposure-outcome associations because the two study cohorts have different study designs [30, 32, 37, 38, 40, 41•]. We identified only one study that could clearly be classified as prospective [29••].
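A minimal sketch of how summary statistics from two samples are combined is given below. It assumes a single variant, independent samples (so the two estimates are uncorrelated), a delta-method standard error, and a normal-approximation 95% confidence interval; the function name and the numbers are hypothetical.

```python
from typing import Tuple


def two_sample_wald(
    beta_gx: float, se_gx: float,  # SNP-exposure association, sample 1
    beta_gy: float, se_gy: float,  # SNP-outcome association, sample 2
) -> Tuple[float, float, Tuple[float, float]]:
    """Return (causal estimate, standard error, 95% confidence interval)."""
    if beta_gx == 0:
        raise ValueError("the instrument is not associated with the exposure")
    estimate = beta_gy / beta_gx
    # Delta method for a ratio of two independent estimates.
    variance = se_gy ** 2 / beta_gx ** 2 + (beta_gy ** 2) * (se_gx ** 2) / beta_gx ** 4
    se = variance ** 0.5
    return estimate, se, (estimate - 1.96 * se, estimate + 1.96 * se)


# Hypothetical summary statistics, e.g., SNP-exposure from one cohort and
# SNP-outcome from a large public consortium.
estimate, se, (low, high) = two_sample_wald(
    beta_gx=0.10, se_gx=0.01, beta_gy=0.02, se_gy=0.005
)
print(f"estimate={estimate:.3f}, se={se:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

Because the standard error shrinks with the precision of the SNP-outcome estimate, drawing that estimate from a large consortium is the main statistical advantage of the design.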

Different protocols for DNA isolation, processing, and methylation assessment could be an additional source of artifactual variation in MR studies evaluating DNAm-outcome associations. In our systematic review, all the studies assessed DNAm levels using Illumina Infinium methylation microarray technologies. Additional compelling issues that must be addressed and reported are differential tissue-type heterogeneity [61, 62] and potential batch effects [63, 64]. Among the studies included in this review, DNAm was measured in whole blood [36, 37, 38, 39, 40, 41•, 42, 45••], cord blood [29••, 32], placenta [30], and peripheral blood [31••, 35, 43, 44]. All the studies generally reported, controlled, or discussed the possibility of tissue-specific cell heterogeneity in DNAm determinations. Tissue specificity, however, may be especially problematic in Mendelian randomization studies, and it should possibly be considered at the stage of genetic instrument selection. For instance, none of the reviewed studies formally evaluated to what extent the selected genetic instruments were associated with genes that have a role in the main biological pathways in tissues relevant to the causal questions under study (i.e., both the tissue used for the DNAm measurement and the target tissues that can be relevant for the endpoint). On the other hand, most of the retrieved studies reported evaluating potential batch effects, with some exceptions [35, 41•], and almost all the studies validated targeted CpG regions in independent study samples or by examining previously published data, with the exception of three studies [32, 37, 39].

A Call for Increased Mendelian Randomization Research in Environmental Health Studies

The field of environmental health sciences can benefit greatly from MR studies, as it is ethically questionable and often impossible to randomly assign environmental toxicants to study participants. Thus, large randomized clinical trials, the gold standard of scientific evidence, are typically not feasible in this field, with few exceptions such as ongoing chelation trials for metal removal to assess metals-related cardiovascular effects [65]. Important considerations for future epigenetic MR research include the need for larger studies with an environmental component, including environmental toxicants, especially based on established biomarkers of exposure with reduced exposure misclassification. Other pending questions include whether validation and replication of findings are actually needed in the case of MR studies benefiting from large consortia; the relevance of recent epigenetic markers such as DNA hydroxymethylation, which could also play a role in epigenetic regulation of gene expression and be associated with disease susceptibility [66, 67]; and the systematic evaluation of potentially non-linear dose-response relationships of environmental determinants of DNAm and epigenetic-related health endpoints. Indeed, an interesting area of research is the development of MR methods to address potentially non-linear causal effects. An important question is whether MR studies evaluating the hypothesis that DNAm is a mediator between a given environmental exposure and health outcome should discuss violations of assumptions that are required to assess causality in traditional mediation frameworks [68]. Moreover, there is a debate on whether MR is a proper tool, compared with traditional mediation analysis, to evaluate potential mediation by DNAm [69, 70, 71]. On one side, MR can provide powerful evidence regarding the existence and the direction of the causal effect, thus differentiating between mediation effects, reverse causation, and confounding [71], but only if MR-specific assumptions hold. On the other side, there is a concern that Mendelian randomization might miss mediation compared to regular statistical mediation methods given the fact that DNAm changes may be embedded in a complex gene regulation context [70]. Nevertheless, as large cohorts with available measurements of environmental chemicals and genome-wide DNAm data become increasingly available, collaborative meta-analyses will better elucidate the role of environmental toxicants as determinants of DNAm and further test the hypothesis that genomic DNAm may mediate toxicant-related health effects.

Conclusion

The reviewed MR studies suggest that the association between DNAm at specific CpG sites and endpoints, including BMI, adiponectin, IL-6, IL-18, CRP, Apo A1, Apo B, total cholesterol, and blood pressure, may be causal. However, studies evaluating the association of DNAm with BMI and triglycerides using a bidirectional MR design point to the possibility that DNAm changes may be a consequence of BMI or triglyceride levels, rather than the cause. In addition, DNAm may partly explain the presumably causal association of postnatal exposure to tobacco smoke, prenatal exposure to tobacco smoke, and vitamin B12 exposure with inflammation markers, birth weight, and cognitive outcomes, respectively. Nonetheless, more MR studies in large epidemiologic samples are needed to assess the causal role of environmental factors as determinants of epigenetic changes and subsequent health outcomes and to formally test the possibility of reverse causation.

Notes

Acknowledgments

M.G.P. was supported by the AstraZeneca Foundation, Spain (“III Premio Jóvenes Investigadores, Programa de Fomento de los Jóvenes Científicos Españoles,” Principal Investigator: M.T.P.). The findings and conclusions in this article are those of the authors and do not necessarily reflect the views of the Carlos III Health Institutes, Madrid.

Compliance with Ethical Standards

Conflict of Interest

Maria Grau-Perez, Golareh Agha, Yuanjie Pang, José Bermudez, and Maria Tellez-Plaza declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Supplementary material

ESM 1: 40572_2019_226_MOESM1_ESM.docx (DOCX, 60 kb)

References

Papers of particular interest, published recently, have been highlighted as: • Of importance •• Of major importance

  1. Feinberg AP. Epigenetics at the epicenter of modern medicine. JAMA. 2008;299:1345–50.
  2. Bernstein BE, Meissner A, Lander ES. The mammalian epigenome. Cell. 2007;128:669–81.
  3. Jaenisch R, Bird A. Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet. 2003;33(Suppl):245–54.
  4. Baccarelli A, Ghosh S. Environmental exposures, epigenetics and cardiovascular disease. Curr Opin Clin Nutr Metab Care. 2012;15:323–9.
  5. Baccarelli A, Rienstra M, Benjamin EJ. Cardiovascular epigenetics: basic concepts and results from animal and human studies. Circ Cardiovasc Genet. 2010;3:567–73.
  6. Dick KJ, Nelson CP, Tsaprouni L, Sandling JK, Aissi D, Wahl S, et al. DNA methylation and body-mass index: a genome-wide analysis. Lancet. 2014;383:1990–8.
  7. Joehanes R, Just AC, Marioni RE, Pilling LC, Reynolds LM, Mandaviya PR, et al. Epigenetic signatures of cigarette smoking. Circ Cardiovasc Genet. 2016;9:436–47.
  8. Joubert BR, Felix JF, Yousefi P, Bakulski KM, Just AC, Breton C, et al. DNA methylation in newborns and maternal smoking in pregnancy: genome-wide consortium meta-analysis. Am J Hum Genet. 2016;98:680–96.
  9. Ligthart S, Marzi C, Aslibekyan S, Mendelson MM, Conneely KN, Tanaka T, et al. DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. Genome Biol. 2016;17:255.
  10. • Relton CL, Davey Smith G. Two-step epigenetic Mendelian randomization: a strategy for establishing the causal role of epigenetic processes in pathways to disease. Int J Epidemiol. 2012;41:161–76. This paper explains the rationale, methodology, advantages and limitations of the two-step Mendelian randomization technique.
  11. Stock JH, Trebbi F. Retrospectives: who invented instrumental variable regression? J Econ Perspect. 2003;17:177–94.
  12. Smith GD, Ebrahim S. “Mendelian randomization”: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32:1–22.
  13. Mendel G, Bateson W. Experiments in plant-hybridisation. Cambridge: Harvard University Press; 1938.
  14. Davey Smith G, Lawlor DA, Harbord R, Timpson N, Rumley A, Lowe GD, et al. Association of C-reactive protein with blood pressure and hypertension: life course confounding and Mendelian randomization tests of causality. Arterioscler Thromb Vasc Biol. 2005;25:1051–6.
  15. Ding EL, Song Y, Manson JE, Hunter DJ, Lee CC, Rifai N, et al. Sex hormone-binding globulin and risk of type 2 diabetes in women and men. N Engl J Med. 2009;361:1152–63.
  16. Kamstrup PR, Tybjaerg-Hansen A, Steffensen R, Nordestgaard BG. Genetically elevated lipoprotein(a) and increased risk of myocardial infarction. JAMA. 2009;301:2331–9.
  17. Keavney B, Danesh J, Parish S, Palmer A, Clark S, Youngman L, et al. Fibrinogen and coronary heart disease: test of causality by “Mendelian randomization”. Int J Epidemiol. 2006;35:935–43.
  18. Larsson SC, Burgess S, Michaelsson K. Association of genetic variants related to serum calcium levels with coronary artery disease and myocardial infarction. JAMA. 2017;318:371–80.
  19. Liao JK. Genetically elevated C-reactive protein and ischemic vascular disease. Curr Atheroscler Rep. 2009;11:245.
  20. Mokry LE, Ross S, Timpson NJ, Sawcer S, Davey Smith G, Richards JB. Obesity and multiple sclerosis: a Mendelian randomization study. PLoS Med. 2016;13:e1002053.
  21. Timpson NJ, Lawlor DA, Harbord RM, Gaunt TR, Day IN, Palmer LJ, et al. C-reactive protein and its role in metabolic syndrome: mendelian randomisation study. Lancet. 2005;366:1954–9.
  22. Zacho J, Tybjaerg-Hansen A, Jensen JS, Grande P, Sillesen H, Nordestgaard BG. Genetically elevated C-reactive protein and ischemic vascular disease. N Engl J Med. 2008;359:1897–908.
  23. Casas JP, Shah T, Cooper J, Hawe E, McMahon AD, Gaffney D, et al. Insight into the nature of the CRP-coronary event association using Mendelian randomization. Int J Epidemiol. 2006;35:922–31.
  24. Zhong J, Agha G, Baccarelli AA. The role of DNA methylation in cardiovascular risk and disease: methodological aspects, study design, and data analysis for epidemiological studies. Circ Res. 2016;118:119–31.
  25. Sandoval J, Heyn H, Moran S, Serra-Musach J, Pujana MA, Bibikova M, et al. Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics. 2011;6:692–702.
  26. Binder AM, Michels KB. The causal effect of red blood cell folate on genome-wide methylation in cord blood: a Mendelian randomization approach. BMC Bioinformatics. 2013;14:353.
  27. Astle WJ, Elding H, Jiang T, Allen D, Ruklisa D, Mann AL, et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell. 2016;167:1415–1429.e19.
  28. Lu AT, Xue L, Salfati EL, Chen BH, Ferrucci L, Levy D, et al. GWAS of epigenetic aging rates in blood reveals a critical role for TERT. Nat Commun. 2018;9:387.
  29. •• Caramaschi D, Sharp GC, Nohr EA, Berryman K, Lewis SJ, Davey Smith G, et al. Exploring a causal role of DNA methylation in the relationship between maternal vitamin B12 during pregnancy and child’s IQ at age 8, cognitive performance and educational attainment: a two-step Mendelian randomization study. Hum Mol Genet. 2017;26:3001–13. This well-conducted prospective two-step Mendelian randomization study found that DNA methylation can have a role as mediator in the causal relationship between maternal B12 levels and offspring intelligence at the age of 8. This study is an example of a two-step Mendelian randomization study in which each step is conducted under the 2-sample scenario.
  30. Morales E, Vilahur N, Salas LA, Motta V, Fernandez MF, Murcia M, et al. Genome-wide DNA methylation study in human placenta identifies novel loci associated with maternal smoking during pregnancy. Int J Epidemiol. 2016;45:1644–55.
  31. •• Jhun MA, Smith JA, Ware EB, Kardia SLR, Mosley TH, Turner ST, et al. Modeling the causal role of DNA methylation in the association between cigarette smoking and inflammation in African Americans: a 2-step epigenetic Mendelian randomization study. Am J Epidemiol. 2017;186:1149–58. This well-conducted two-step Mendelian randomization study evaluated the mediator role of DNA methylation changes in the causal association between cigarette smoking and several inflammation markers. They found that smoking decreased methylation levels in F2RL3 and GPR15, which resulted in increased serum IL-18 levels.
  32. Allard C, Desgagné V, Patenaude J, Lacroix M, Guillemette L, Battista MC, et al. Mendelian randomization supports causality between maternal hyperglycemia and epigenetic regulation of leptin gene in newborns. Epigenetics. 2015;10:342–51.
  33. Longnecker MP, Berlin JA, Orza MJ, Chalmers TC. A meta-analysis of alcohol consumption in relation to risk of breast cancer. JAMA. 1988;260:652–6.
  34. Boef AGC, Dekkers OM, le Cessie S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int J Epidemiol. 2015;44:496–511.
  35. Richmond RC, Sharp GC, Ward ME, Fraser A, Lyttleton O, McArdle WL, et al. DNA methylation and BMI: investigating identified methylation sites at HIF3A in a causal framework. Diabetes. 2016;65:1231–44.
  36. Richardson TG, Zheng J, Davey Smith G, Timpson NJ, Gaunt TR, Relton CL, et al. Mendelian randomization analysis identifies CpG sites as putative mediators for genetic influences on cardiovascular disease risk. Am J Hum Genet. 2017;101:590–602.
  37. Arathimos R, Suderman M, Sharp GC, Burrows K, Granell R, Tilling K, et al. Epigenome-wide association study of asthma and wheeze in childhood and adolescence. Clin Epigenetics. 2017;9:112.
  38. Nano J, Ghanbari M, Wang W, de Vries PS, Dhana K, Muka T, et al. Epigenome-wide association study identifies methylation sites associated with liver enzymes and hepatic steatosis. Gastroenterology. 2017;153:1096–1106.e2.
  39. Dekkers KF, van Iterson M, Slieker RC, Moed MH, Bonder MJ, van Galen M, et al. Blood lipids influence DNA methylation in circulating cells. Genome Biol. 2016;17:138.
  40. Richard MA, Huan T, Ligthart S, Gondalia R, Jhun MA, Brody JA, et al. DNA methylation analysis identifies loci for blood pressure regulation. Am J Hum Genet. 2017;101:888–902.
  41. • Mendelson MM, Marioni RE, Joehanes R, Liu C, Hedman ÅK, Aslibekyan S, et al. Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med. 2017;14:e1002215. This well-conducted study evaluated the causal relationship between DNA methylation levels and body mass index using a bidirectional two-sample Mendelian randomization approach. They found that increased DNA methylation levels at the region of SREBF1 were causally associated with decreased BMI. In addition, they showed that BMI is also a cause of DNA methylation changes in other CpG sites.
  42. Hannon E, Weedon M, Bray N, O’Donovan M, Mill J. Pleiotropic effects of trait-associated genetic variation on DNA methylation: utility for refining GWAS loci. Am J Hum Genet. 2017;100:954–9.
  43. Gao Y, Wang BQ, Gao WJ, Cao WH, Yu CQ, Lyu J, et al. Mendelian randomization analysis of the relationship between obesity and DNA methylation. Zhonghua Yu Fang Yi Xue Za Zhi. 2017;51:137–42.
  44. Truong V, Huang S, Dennis J, Lemire M, Zwingerman N, Aïssi D, et al. Blood triglyceride levels are associated with DNA methylation at the serine metabolism gene PHGDH. Sci Rep. 2017;7:11207.
  45. •• Wahl S, Drong A, Lehne B, Loh M, Scott WR, Kunze S, et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature. 2017;541:81–6. This well-conducted study evaluated the causal relationship between DNA methylation levels and body mass index using a bidirectional Mendelian randomization design. The authors concluded that alterations in DNA methylation are predominantly the consequence of adiposity, rather than the cause.
  46. Relton CL, Davey Smith G. Mendelian randomization: applications and limitations in epigenetic studies. Epigenomics. 2015;7:1239–43.
  47. Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23:R89–98.
  48. Vanderweele TJ, Tchetgen EJT, Kraft P. Methodological challenges in Mendelian randomization. Epidemiology. 2015;25:427–35.
  49. Staiger D, Stock JH. Instrumental variables regression with weak instruments. Econometrica. 1997;65:557.
  50. Stock J, Yogo M. Testing for weak instruments in linear IV regression. In: Andrews DWK, editor. Identification and inference for econometric model. New York: Cambridge University Press; 2005. p. 80–108.
  51. Zheng J, Baird D, Borges M-C, Bowden J, Hemani G, Haycock P, et al. Recent developments in Mendelian randomization studies. Curr Epidemiol Rep. 2017;4:330–45.
  52. • Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2017;26:2333–55. This review explains and compares the methodology, advantages and limitations of several approaches for instrumental variable estimation. It also provides techniques for obtaining confidence intervals of the causal estimators and a guide for dealing with weak instruments.
  53. Paaby AB, Rockman MV. The many faces of pleiotropy. Trends Genet. 2013;29:66–73.
  54. Ferreira MAR, Purcell SM. A multivariate test of association. Bioinformatics. 2009;25:132–3.
  55. Solovieff N, Cotsapas C, Lee PH, Purcell SM, Smoller JW. Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet. 2013;14:483–95.
  56. Schaid DJ, Tong X, Larrabee B, Kennedy RB, Poland GA, Sinnwell JP. Statistical methods for testing genetic pleiotropy. Genetics. 2016;204:483–97.
  57. Hellwege JN, Keaton JM, Giri A, Gao X, Velez Edwards DR, Edwards TL. Population stratification in genetic association studies. Curr Protoc Hum Genet. Hoboken, NJ, USA: John Wiley & Sons, Inc.; 2017. p. 1.22.1–1.22.23.
  58. Price AL, Zaitlen NA, Reich D, Patterson N. New approaches to population stratification in genome-wide association studies. Nat Rev Genet. 2010;11:459–63.
  59. Li M, Reilly MP, Rader DJ, Wang L-S. Correcting population stratification in genetic association studies using a phylogenetic approach. Bioinformatics. 2010;26:798–806.
  60. Shah S, McRae AF, Marioni RE, Harris SE, Gibson J, Henders AK, et al. Genetic and environmental exposures constrain epigenetic drift over the human life course. Genome Res. 2014;24:1725–33.
  61. 61.
    Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15:R31.CrossRefPubMedPubMedCentralGoogle Scholar
  62. 62.
    Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86.CrossRefPubMedPubMedCentralGoogle Scholar
  63. 63.
    Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet. 2010;11:733–9.CrossRefPubMedGoogle Scholar
  64. 64.
    Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–9.CrossRefPubMedPubMedCentralGoogle Scholar
  65. 65.
    Lamas GA, Navas-Acien A, Mark DB, Lee KL. Heavy metals, cardiovascular disease, and the unexpected benefits of edetate chelation therapy. J Am Coll Cardiol. 2016;67:2411–8.CrossRefPubMedPubMedCentralGoogle Scholar
  66. 66.
    Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A. The behaviour of 5-hydroxymethylcytosine in bisulfite sequencing. Liu J, editor. PLoS One. 2010;5:e8888.Google Scholar
  67. 67.
    Booth MJ, Branco MR, Ficz G, Oxley D, Krueger F, Reik W, et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science. 2012;336:934–7.CrossRefPubMedGoogle Scholar
  68. 68.
    Lange T, Rasmussen M, Thygesen LC. Assessing natural direct and indirect effects through multiple pathways. Am J Epidemiol. 2014;179:513–8.CrossRefPubMedGoogle Scholar
  69.
    Tobi EW, Slieker RC, Luijk R, Dekkers KF, Stein AD, Xu KM, et al. DNA methylation as a mediator of the association between prenatal adversity and risk factors for metabolic disease in adulthood. Sci Adv. 2018;4:eaao4364.
  70.
    Tobi EW, van Zwet EW, Lumey L, Heijmans BT. Why mediation analysis trumps Mendelian randomization in population epigenomics studies of the Dutch Famine. bioRxiv. 2018;362392.
  71.
    Richmond RC, Relton CL, Smith GD. RE: what evidence is required to suggest that DNA methylation mediates the association between prenatal famine exposure and adulthood disease? Sci Adv. 2018.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Maria Grau-Perez (1, 2, 3; corresponding author)
  • Golareh Agha (2)
  • Yuanjie Pang (4)
  • Jose D. Bermudez (3)
  • Maria Tellez-Plaza (1, 5, 6)

  1. Area of Cardiometabolic and Renal Risk, Biomedical Research Institute Hospital Clinic of Valencia (INCLIVA), Valencia, Spain
  2. Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, USA
  3. Department of Statistics and Operational Research, University of Valencia, Valencia, Spain
  4. Clinical Trial Service Unit & Epidemiological Studies (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
  5. Department of Chronic Diseases Epidemiology, National Center for Epidemiology, National Institutes for Health Carlos III, Madrid, Spain
  6. Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
