Background

The mortality and morbidity of Chronic Obstructive Pulmonary Disease (COPD) have been increasing over the past decades, and the disease is a fundamental medical and economic problem in Western societies [1]. A genetic predisposition is thought to play a crucial role in the onset of COPD, and the heritability of the lung function loss that precedes COPD development has been clearly established [2, 3]. Several polymorphisms have been reported in association with the level of lung function, but subsequent studies have failed to replicate these associations [4, 5]. So far, only a small subset of polymorphisms has been consistently associated with COPD development or lung function decline across independent studies or populations [6–11].

Nuclear Factor (Erythroid-derived 2)-Like 2 (NFE2L2 or NRF2) regulates the transcription of numerous antioxidant enzymes in response to oxidant injury, via direct binding to the antioxidant responsive element in the target gene [12–15]. It is therefore a strong candidate gene for excess lung function loss and COPD development.

Kelch-like ECH-associated protein-1 (KEAP1) is a cytosolic repressor of NFE2L2. Oxidative stress causes disruption of the KEAP1-NFE2L2 complex, translocation of NFE2L2 to the nucleus and subsequent induction of the expression of antioxidant genes [16]. It has been shown that Nfe2l2 protects mice against elastase-induced [17] and tobacco-induced [18] emphysema. Additionally, the expression pattern of both KEAP1 and NFE2L2 differs between COPD patients and healthy never- or former-smokers [19, 20], and the expression of NFE2L2-regulated antioxidant genes is lower in COPD subjects than in non-diseased controls [21]. Three new polymorphisms have been discovered in the promoter region of NFE2L2, but these were not associated with COPD in a Japanese population [22]. One study showed that one of these polymorphisms decreases NFE2L2 expression in vitro and is associated with the development of acute lung injury in a Caucasian population [23]. So far, no studies have investigated the role of NFE2L2 or KEAP1 polymorphisms in relation to the longitudinal course of lung function in the general population.

Therefore, in the current study we investigated whether NFE2L2 or KEAP1 polymorphisms affect the level and longitudinal course of FEV1 (Forced Expiratory Volume in 1 second), both being important risks for COPD [24]. In order to assure consistency of results, we performed this study in two prospective and independent population-based cohorts.

Methods

Subjects

Subjects from the Doetinchem cohort study [25], a prospective part of the MORGEN study [26], were included. A sub-sample (n = 1,152 subjects with 3,115 FEV1 measurements during 3 surveys: 1993–1997 (n = 1,152), 1998–2002 (n = 1,152) and 2003–2007 (n = 811); table 1) was randomly selected from the total cohort with spirometry tests and DNA available, as described previously [27]. FEV1 was measured three times at 5-year intervals (maneuver performed with a heated pneumotachograph (Jaeger, Germany)) according to the European Respiratory Society (ERS) guidelines [28].

Table 1 Characteristics of Doetinchem cohort and Vlagtwedde-Vlaardingen cohort

An independent cohort (Vlagtwedde-Vlaardingen; n = 1,390 subjects with 8,159 FEV1 measurements during 8 surveys, table 1) was additionally studied. This cohort was prospectively followed for 25 years with FEV1 measurements (maneuver performed with a water-sealed spirometer (Lode Instruments, the Netherlands)) every 3 years (following ERS guidelines) [29].

The study protocols were approved by local medical ethics committees and all participants gave their written informed consent.

Selection/genotyping of Single Nucleotide Polymorphisms (SNPs)

We pairwise tagged NFE2L2 and KEAP1 with five and three SNPs, respectively, according to the HapMap CEU genotype data (23a), with an r² threshold of 0.8 and a Minor Allele Frequency (MAF) > 5%. We additionally included three novel NFE2L2 polymorphisms [22] with MAF > 5%: G(-686)A (rs35652124), C(-650)A (rs6721961) and a Trinucleotide CCG Repeat (TNR). SNPs were genotyped by K-Bioscience Ltd (UK) using their patent-protected competitive allele-specific PCR system (KASPar). Additional file 1 contains details on SNP selection and NFE2L2 TNR genotyping.
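To make the tagging criterion concrete, the following minimal sketch computes the pairwise r² between two biallelic SNPs from phased haplotypes and applies the 0.8 threshold. The function names and the 0/1 allele coding are illustrative assumptions; the actual tag selection in this study was based on HapMap CEU data and is described in additional file 1.

```python
R2_THRESHOLD = 0.8  # tagging threshold stated in the text


def r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per chromosome, with each
    allele coded 0 or 1 (hypothetical input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n           # allele 1 frequency, SNP A
    p_b = sum(b for _, b in haplotypes) / n           # allele 1 frequency, SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def is_tagged(haplotypes, threshold=R2_THRESHOLD):
    """True if one SNP can serve as a tag for the other at the threshold."""
    return r_squared(haplotypes) >= threshold


if __name__ == "__main__":
    perfect_ld = [(0, 0)] * 6 + [(1, 1)] * 4
    print(r_squared(perfect_ld), is_tagged(perfect_ld))
```

In practice r² is estimated from unphased genotypes by software such as Haploview; the sketch only illustrates the quantity that the threshold refers to.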

Statistics

SNPs in NFE2L2 and KEAP1 and level of FEV1

We used Linear Mixed Effect (LME) models to study the effects of SNPs and haplotypes (additive genetic model; coded: 0 = homozygote wild type, 1 = heterozygote, 2 = homozygote mutant) on the level of FEV1 in both cohorts separately, using all available FEV1 measurements across all surveys. This analysis was adjusted for age (defined with a natural cubic spline with 4 degrees of freedom in order to take into account varying effects of age on the level of FEV1 throughout lifetime), sex, packyears smoked and height, and for the correlation of FEV1 measurements within each subject (random effect assigned to the intercept).
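One way to write the model described above is sketched below. The notation is ours and is an assumption, not taken from the original analysis code: i indexes subjects, j indexes surveys, G_i is the additive genotype code (0, 1, 2), f is the natural cubic spline of age with 4 degrees of freedom, and b_{0i} is the subject-specific random intercept.

```latex
\begin{aligned}
\mathrm{FEV1}_{ij} &= \beta_0 + \beta_G\, G_i + f(\mathrm{age}_{ij})
  + \beta_S\, \mathrm{sex}_i + \beta_P\, \mathrm{packyears}_{ij}
  + \beta_H\, \mathrm{height}_{i} + b_{0i} + \varepsilon_{ij}, \\
b_{0i} &\sim \mathcal{N}(0, \sigma_b^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Under this formulation, \beta_G is the mean difference in FEV1 level per copy of the mutant allele. Whether packyears and height were treated as time-varying or as baseline covariates is not stated in the text; the indices shown are one plausible choice.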

SNPs in NFE2L2 and KEAP1 and course of FEV1

We studied the effect of SNPs on course of FEV1 by introducing the interaction term of SNP × time (defined in relation to the first FEV1 measurement and with random effect assigned) into the primary analysis model described above (see additional file 1 for details).
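As a sketch of how the interaction term enters the model (again in our own, assumed notation), let t_{ij} be the time since the first FEV1 measurement of subject i, and let b_{1i} be a random slope on time:

```latex
\begin{aligned}
\mathrm{FEV1}_{ij} &= \beta_0 + \beta_G\, G_i + \beta_T\, t_{ij}
  + \beta_{GT}\, (G_i \times t_{ij}) + \text{(covariates as in the level model)} \\
  &\quad + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij}.
\end{aligned}
```

Here \beta_{GT} is the quantity of interest: the difference in annual FEV1 change per copy of the mutant allele. The exact random-effects structure used is given in additional file 1.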

Analysis on the pooled cohorts

Finally, we pooled both cohorts and analyzed the level and course of FEV1 with additional adjustment for cohort. We also studied two other genetic models (recessive: mutant homozygotes compared with all other genotypes; dominant: wild type homozygotes compared with all other genotypes), which are reported only if they showed significant effects in the pooled cohort analysis. Similarly, we investigated whether there was a significant interaction between KEAP1 and NFE2L2 genotypes in relation to the level of FEV1, using the two-way combinations of genetic effects with the highest statistical power, i.e. dominant and additive.
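The three genetic models differ only in how the number of mutant alleles is coded as a predictor. The helper below is an illustrative sketch (the function name is hypothetical) of the coding implied by the definitions above.

```python
def code_genotype(n_mutant_alleles, model="additive"):
    """Code a genotype (0, 1 or 2 mutant alleles) under a genetic model.

    additive : 0 / 1 / 2 (effect per mutant allele)
    dominant : wild type homozygotes (0) versus all other genotypes (1)
    recessive: mutant homozygotes (1) versus all other genotypes (0)
    """
    if n_mutant_alleles not in (0, 1, 2):
        raise ValueError("genotype must carry 0, 1 or 2 mutant alleles")
    if model == "additive":
        return n_mutant_alleles
    if model == "dominant":
        return 1 if n_mutant_alleles >= 1 else 0
    if model == "recessive":
        return 1 if n_mutant_alleles == 2 else 0
    raise ValueError(f"unknown genetic model: {model}")


if __name__ == "__main__":
    for model in ("additive", "dominant", "recessive"):
        print(model, [code_genotype(g, model) for g in (0, 1, 2)])
```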

Interaction with smoking

Gene by smoking interaction analysis in relation to the level of FEV1 was performed on the pooled cohorts using data from single surveys (i.e. the second in the Doetinchem cohort and the last in the Vlagtwedde-Vlaardingen cohort) in order to ensure the highest cumulative exposure to tobacco smoke and the highest number of subjects analyzed. The following interaction terms were analyzed in two regression models:

  1. SNP by ever/never smoking status in the total population with adjustment for ever-smoking status and genotypes and no adjustment for packyears smoked

  2. SNP by packyears smoked within ever smokers with adjustment for packyears smoked and genotypes

P values < 0.05 were considered to be statistically significant (tested 2-sided).

Software

LME models were run using S-PLUS (version 7.0). Linkage Disequilibrium (LD) plots were generated and Hardy-Weinberg Equilibrium (HWE) tests were performed with Haploview (version 4.1) [30]. We identified, with a probability > 95%, subjects carrying no, one or two copies of a specific haplotype, using the *.out_pairs output file from PHASE software (version 2.1) [31, 32]. We used MIX software (version 1.7) [33, 34] to meta-analyze results from the Doetinchem, Vlagtwedde-Vlaardingen and British 1958 Birth cohorts [35].
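The haplotype copy-number assignment can be sketched as follows. This is a minimal illustration under stated assumptions: the input is a hypothetical list of (haplotype, haplotype, probability) tuples for one subject, not the actual PHASE file format, and summing the probabilities of all pairs that give the same copy number is our reading of the 95% rule, not a documented procedure.

```python
def haplotype_copies(pairs, target, threshold=0.95):
    """Return 0, 1 or 2 copies of `target`, or None if the call is uncertain.

    pairs: list of (hap_a, hap_b, probability) for one subject, where the
    probabilities of all listed pairs sum to about 1 (hypothetical format).
    """
    prob_by_copies = {0: 0.0, 1: 0.0, 2: 0.0}
    for hap_a, hap_b, prob in pairs:
        copies = (hap_a == target) + (hap_b == target)
        prob_by_copies[copies] += prob
    best = max(prob_by_copies, key=prob_by_copies.get)
    # Keep the call only when its total probability exceeds the threshold.
    return best if prob_by_copies[best] > threshold else None


if __name__ == "__main__":
    subject = [("A", "B", 0.97), ("A", "C", 0.03)]
    print(haplotype_copies(subject, "A"))  # both pairs carry one copy of A
    print(haplotype_copies([("A", "B", 0.6), ("C", "B", 0.4)], "A"))
```

Subjects for whom no copy number reaches the threshold would be treated as missing for that haplotype in the association analysis.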

Results

Genetic structure of NFE2L2 and KEAP1

There was an excess of KEAP1 rs1048290 SNP heterozygotes in the Vlagtwedde-Vlaardingen cohort, which caused a significant deviation (p = 0.01) from HWE (table 2). To exclude genotyping error as the underlying cause, we additionally genotyped the KEAP1 rs9676881 SNP, which is in complete LD with rs1048290 (based on HapMap; distance between the two SNPs = 3.7 kb). This SNP also showed a significant deviation from HWE (p = 0.01; frequency of 50.6% and 12.4% for heterozygotes and homozygote mutants, respectively) in the Vlagtwedde-Vlaardingen cohort.
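For readers unfamiliar with HWE testing, the sketch below implements the textbook one-degree-of-freedom chi-square test on genotype counts. It is an illustration only: Haploview applies its own HWE test, so p values from this sketch need not match those reported above, and the counts in the example are hypothetical.

```python
import math


def hwe_chi_square(n_wt, n_het, n_mut):
    """Chi-square test (1 df) for Hardy-Weinberg Equilibrium.

    Arguments are observed counts of wild type homozygotes, heterozygotes
    and mutant homozygotes. Returns (chi2, p_value).
    Assumes the SNP is polymorphic (all expected counts > 0).
    """
    n = n_wt + n_het + n_mut
    q = (2 * n_mut + n_het) / (2 * n)   # mutant allele frequency
    p = 1 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt, n_het, n_mut)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts with an excess of heterozygotes.
    print(hwe_chi_square(300, 600, 100))
```

An excess of heterozygotes, as described above, inflates the middle term of the statistic relative to its expectation of 2npq.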

Table 2 Characteristics of NFE2L2 and KEAP1 genotypes in the Doetinchem cohort and Vlagtwedde-Vlaardingen cohort

Five NFE2L2 TNR alleles, including three alleles not observed previously [22] i.e. 2, 6 and 7 CCG repeats, were identified in the Doetinchem cohort. These three novel alleles occurred with a total cumulative frequency of 0.4% (see additional file 1 for details).

The NFE2L2 G(-686)A (rs35652124) SNP, CCG TNR and rs2364723 SNP were in high LD, as were the NFE2L2 C(-650)A (rs6721961) and rs4243387 SNPs (r² ≥ 0.96, figure 1). We observed 5 prevalent (>5% frequency) haplotypes in NFE2L2 and 4 prevalent haplotypes in KEAP1 in both cohorts (table 3). Two haplotypes in NFE2L2 (haplotypes C and D) were unique, i.e. they were not tagged by a single allele of any SNP (table 3). Similarly, 2 haplotypes in KEAP1 (haplotypes A and B) were unique (table 3).

Table 3 Characteristics of NFE2L2 and KEAP1 haplotypes occurring with >5% frequency in the two cohorts studied
Figure 1 NFE2L2 and KEAP1 linkage disequilibrium plots (100·r²) in the Doetinchem cohort (n = 1,152). *Given for the wild type (5 CCG repeats) and the mutant (4 CCG repeats) allele. NFE2L2 = Nuclear Factor Erythroid 2-Like 2. KEAP1 = Kelch-like ECH-associated protein-1.

NFE2L2 and KEAP1 variations and level of FEV1

SNP rs2364723 in NFE2L2 showed a borderline association (p = 0.06) with a lower FEV1 level, and SNP rs11085735 in KEAP1 was significantly associated with a higher FEV1 level in the Vlagtwedde-Vlaardingen cohort (table 4). Similar but non-significant trends for an additive effect were observed in the Doetinchem cohort, resulting in significant effects in the pooled cohort analysis (table 4).

Table 4 Additive effects of genetic variations in NFE2L2 and KEAP1 on the level of FEV1

Heterozygote subjects for rs2364723 SNP had a significantly lower FEV1 level as compared to homozygote wild type subjects (figure 2), while for the rs11085735 SNP all between-genotypes differences were significant in the pooled cohort analysis (figure 3).

Figure 2 Mean adjusted FEV1 level for heterozygote and homozygote mutant genotypes of the NFE2L2 rs2364723 SNP as compared to wild type. Mean adjusted effects (squares) and corresponding 95% Confidence Intervals (bars) are presented. *p < 0.05 as compared to wild type. NFE2L2 = Nuclear Factor Erythroid 2-Like 2. FEV1 = Forced Expiratory Volume in 1 second.

Figure 3 Mean adjusted FEV1 level for heterozygote and homozygote mutant genotypes of the KEAP1 rs11085735 SNP as compared to wild type. Mean adjusted effects (squares) and corresponding 95% Confidence Intervals (bars) are presented. * p < 0.05 for homozygote mutant genotype as compared to wild type or heterozygotes. p < 0.05 for heterozygote genotype as compared to homozygote wild type or homozygote mutant. p < 0.05 for all between-genotype comparisons. KEAP1 = Kelch-like ECH-associated protein-1. FEV1 = Forced Expiratory Volume in 1 second.

Haplotype C in NFE2L2 was associated with higher FEV1 levels using an additive model in the pooled cohort analysis exclusively (table 4). Haplotype A in KEAP1 was associated with higher FEV1 level in a recessive model in the pooled cohort analysis (table 4). No additional consistent associations were observed for other SNPs or other genetic models (data not shown).

Interaction between SNPs in NFE2L2 and KEAP1

There was no significant interaction between SNPs in KEAP1 and NFE2L2 (using combinations of dominant and/or additive effects) in relation to the level of FEV1 in the pooled cohort analysis (data not shown).

Interaction between smoking and NFE2L2 and KEAP1 variations and level of FEV1

We observed no significant interaction between ever/never smoking status and variations in NFE2L2 or KEAP1 in relation to the level of FEV1. Yet the effect of rs11085735 in KEAP1 was significant only in never smokers, while the effect of rs2364723 and haplotype C in NFE2L2 was significant only in ever smokers (table 5). In the pooled cohort analysis we observed significant interactions between packyears smoked and two linked variations in KEAP1, i.e. rs1048290 (B_INT = 1.9 ml/(packyear × allele number), SE_INT = 0.9, p = 0.03) and haplotype B (B_INT = 1.9 ml/(packyear × allele number), SE_INT = 0.9, p = 0.04). In the single cohort analyses these interaction terms were not significant (p > 0.10 for both cohorts).
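The relation between a reported interaction coefficient, its standard error and a two-sided p value can be illustrated with a Wald test. The sketch below uses a normal approximation, which is our assumption; the software used for the analysis may rely on a t distribution, and the reported p values were computed from unrounded estimates.

```python
import math


def two_sided_p(beta, se):
    """Two-sided p value for a coefficient under a normal approximation."""
    z = abs(beta / se)
    # 2 * (1 - Phi(z)) written with the complementary error function.
    return math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Rounded values reported for the packyears interaction terms.
    print(round(two_sided_p(1.9, 0.9), 3))
```

With the rounded values B_INT = 1.9 and SE_INT = 0.9 the approximation gives a p value between 0.03 and 0.04, in line with the two values reported for rs1048290 and haplotype B.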

Table 5 Additive effects of NFE2L2 and KEAP1 SNPs on the level of FEV1 in never- and ever-smokers

SNPs in NFE2L2 and KEAP1 and course of FEV1

We did not observe any significant effect of SNPs in NFE2L2 and/or KEAP1 on the course of FEV1 in either cohort or in the pooled cohort analysis for any genetic model tested (see table 6 for additive effects).

Table 6 Additive effects of genetic variations in NFE2L2 and KEAP1 on the longitudinal course of FEV1

Discussion

The current study shows that polymorphisms in antioxidant transcription factor NFE2L2 and its repressor KEAP1 affect the level of FEV1 in the general population.

NFE2L2 is required for the transcription initiation of many antioxidant-related genes, including candidate genes for excess lung function loss and COPD development such as Heme Oxygenase 1 and Glutamate Cysteine Ligase [11, 27, 36]. Moreover, murine models have shown that Nfe2l2 depletion in vivo results in elastase-induced [17] and cigarette smoke-induced [18] emphysema development. Thus a functional genetic impairment of NFE2L2 and/or its cytosolic repressor KEAP1 would likely have detrimental consequences in vivo.

It has been shown that lung function is genetically determined [2, 3]. However, so far only low-prevalent polymorphisms have been consistently associated with COPD development across independent studies, i.e. the Glu342Lys substitution in SERPINA1 (frequency 1%–3% in Caucasians) that leads to α1-antitrypsin deficiency [6–8] and the Arg213Gly substitution in Superoxide Dismutase 3 (frequency 1%–2% in Caucasians) [9, 10], suggesting that low-prevalent SNPs are important contributors to COPD development. Detecting the effect of such low-prevalent SNPs often requires large sample sizes, even when the effect size is substantial. Similarly, small genetic effects of highly prevalent variations, such as those genotyped in the current study, need to be assessed in large samples. Therefore, we used all available FEV1 measurements in both cohorts in order to achieve the highest possible statistical power. Moreover, we additionally performed analyses on the pooled cohorts, including over 2,500 subjects with over 11,000 FEV1 measurements.

In our opinion, the most convincing association shown in the current study is that of the rs11085735 SNP in KEAP1, which was significantly associated with higher FEV1 levels in the pooled cohort as well as in both cohorts analyzed separately, albeit under different genetic models. This SNP is located in intron 3 of KEAP1, relatively close (73 bp) to exon 3 of this gene, and thus it might have functional consequences, e.g. by affecting KEAP1 mRNA splicing. Haplotype A in KEAP1 was associated with a higher FEV1 level in the Doetinchem cohort and in the pooled cohort analysis using a recessive model only. Since this haplotype does not tag any SNP that was investigated in the current study, it may be in linkage disequilibrium with another functional SNP that is either not yet known or is located outside the region that was selected for tagging.

SNP rs2364723 and haplotype C in NFE2L2 were associated with the level of FEV1 in the pooled cohort analysis, driven by similar though non-significant trends present in both cohorts. SNP rs2364723 is in almost complete LD with the recently described promoter polymorphisms, i.e. G(-686)A (rs35652124) and the CCG Trinucleotide repeat (figure 1) [22], implicating a role in the regulation of NFE2L2 transcription. We found no evidence for an association of another previously identified functional NFE2L2 SNP (i.e. C(-650)A (rs6721961), tagged by us with the rs4243387 SNP) [23].

None of the analyzed genetic variations showed a significantly different effect on the level of FEV1 between never and ever smokers, yet the effects of the NFE2L2 rs2364723 SNP and haplotype C were more prominent in ever smokers, while the effect of the KEAP1 rs11085735 SNP was significant in never smokers exclusively. Interestingly, another variation in KEAP1 (i.e. rs1048290, linked with haplotype B) showed a protective effect on the level of FEV1 in interaction with packyears smoked within ever smokers. The observed association between the level of FEV1 and the interaction of the rs1048290 SNP with smoking may be somewhat weakened by the deviation from HWE observed for this SNP in one of the cohorts studied. Since a common cause of such a deviation is genotyping error, we genotyped another, completely correlated SNP, rs9676881, which also showed a significant deviation from HWE. This suggests that genotyping error was not the cause of the observed deviation from HWE. Significant results obtained in the analysis stratified by smoking status (ever and never smokers) or in the gene by packyears interaction analysis did not reach significance in either of the cohorts analyzed separately. Since this could be due to the insufficient power of the single cohorts, subsequent studies are warranted.

Using publicly available data on the British 1958 Birth cohort [35], we checked whether our results on the significant association of SNPs with the level of FEV1 could be replicated in this independent population. The additive effects of rs11085735 in KEAP1 and rs2364723 in NFE2L2 were not significant, the p values being 0.11 and 0.59–0.70 (depending on the genotyping method), respectively. However, both associations were in the same direction as found in our two Dutch cohorts, i.e. positive for rs11085735 in KEAP1 (B = 52.7 ml/allele, 95% Confidence Interval (CI) = -12.6 to 118.0) and negative for rs2364723 in NFE2L2 (B = -7.3 ml/allele, 95% CI = -44.3 to 29.6, representing the higher p value). A subsequent meta-analysis of the Doetinchem, Vlagtwedde-Vlaardingen and British 1958 Birth cohorts showed a more significant protective effect of the KEAP1 SNP on the level of FEV1 (p = 0.0008) than the pooled analysis in the two Dutch cohorts (p = 0.003, table 4). The p value of the additive and detrimental effect of the rs2364723 SNP was significant as well (0.036–0.046, depending on the genotyping technology in the British 1958 Birth Cohort), yet higher than the p value of the pooled analysis in the two Dutch cohorts (i.e. p = 0.026, table 4).
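The logic of combining cohort-specific estimates can be sketched with a fixed-effect, inverse-variance meta-analysis. This is an assumption for illustration: the model and settings actually used in MIX are not specified here. The standard error for the British cohort is derived from the 95% CI given above; the Dutch estimates in the example are hypothetical placeholders, since the actual values are in table 4.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect pooling.

    estimates: list of (beta, se) tuples, one per cohort.
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    p = math.erfc(abs(pooled / pooled_se) / math.sqrt(2))
    return pooled, pooled_se, p


if __name__ == "__main__":
    british = (52.7, se_from_ci(-12.6, 118.0))   # rs11085735, from the text
    dutch_a = (60.0, 35.0)                       # hypothetical placeholder
    dutch_b = (70.0, 30.0)                       # hypothetical placeholder
    print(fixed_effect_meta([british]))
    print(fixed_effect_meta([british, dutch_a, dutch_b]))
```

Applied to the British estimate alone, the sketch gives a two-sided p value of about 0.11, consistent with the value reported above. Adding cohorts with effects in the same direction narrows the pooled standard error, which is why a non-significant replication can still strengthen the combined evidence.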

Conclusion

Our study, performed in two independent Dutch cohorts, shows that genetic variations in KEAP1 and NFE2L2 affect the level, but not the longitudinal course, of FEV1 in the general population. It therefore remains to be determined whether these SNPs play a role in the development or growth of the lung. Given the importance of both genes in the regulation of oxidative stress in the lung, further studies focusing on the NFE2L2-KEAP1 pathway are warranted.