
Introduction

The prevalence of type 2 diabetes has reached epidemic levels, with major economic and public health consequences [1]. If current trends continue, the prevalence of type 2 diabetes in the USA is projected to increase from approximately one in eight to one in three adults by 2050 [2]. Dietary intake is a major, modifiable risk factor for type 2 diabetes, and the current body of evidence suggests that diets incorporating higher intakes of fruits, vegetables, whole grains, legumes, nuts and seeds and lower intakes of red meat and processed meat, refined grains and sugar-sweetened beverages are associated with lower type 2 diabetes risk [3, 4].

The Scientific Report of the 2015 Dietary Guidelines Advisory Committee, the basis for the 2015 Dietary Guidelines for Americans (DGA) and related policy, reflects this evidence base, but there is minimal direct evidence on how a dietary pattern reflecting these guidelines relates to type 2 diabetes risk. The report also recommends limiting consumption of foods high in ‘empty calories’ (ECs) (i.e. foods or beverages that contribute few or no nutrients), which are primarily composed of solid fats and added sugars and also include refined starches and alcohol. Such foods have been identified as leading contributors to excess energy intake in the USA [5]. Additionally, there is limited evidence on the relationship between other dietary patterns increasingly adopted by Americans (such as a Palaeolithic [Palaeo] diet) and type 2 diabetes risk.

To address this knowledge gap, we examined the association between dietary pattern scores constructed to reflect the 2015 DGA Scientific Report, a modern-day Palaeo diet and a high EC intake, and risk of type 2 diabetes in a cohort of young black and white men and women with repeated assessments of diet over 30 years. We also examined the relationship between the A Priori Diet Quality Score (APDQS, cohort reference) and type 2 diabetes risk, as this score largely aligns with the 2015 DGA and has been shown to inversely associate with cardiovascular risk [6,7,8,9].

Methods

Study population

The Coronary Artery Risk Development in Young Adults (CARDIA) study is a multicentre, longitudinal investigation of the evolution of cardiovascular disease risk starting in young adulthood. Briefly, 5115 black and white men and women, aged 18–30 years, were recruited in 1985–1986 from four cities in the USA: Birmingham, Alabama; Chicago, Illinois; Minneapolis, Minnesota and Oakland, California. Details of participant enrolment and examination have been published elsewhere [10]. Re-examination occurred 2, 5, 7, 10, 15, 20, 25 and 30 years after baseline. The CARDIA study was approved by the institutional review board at each field centre and informed consent was obtained from all participants prior to enrolment [10].

Participants with a diagnosis of diabetes at baseline (n = 34), missing baseline diabetes status (n = 88) or baseline dietary data (n = 4) were excluded from this analysis. Individuals without follow-up data were also excluded (n = 152). Individuals who reported extreme energy intakes (<2510 kJ/day or >25,104 kJ/day for women [n = 53] and <3347 kJ/day or >33,472 kJ/day for men [n = 64]) were excluded. The proportion of missing data for other pertinent covariates was low (<1%). Missing values were imputed by multiple imputation, which had no impact on the results. The final study sample for this analysis comprised 4719 young adults.
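The energy-intake exclusion rule above can be written as a short check. This is a minimal sketch only: the function and constant names are hypothetical, and the note that the thresholds correspond to 600/6000 kcal/day for women and 800/8000 kcal/day for men is simply the stated kJ values divided by 4.184.

```python
# Sex-specific plausible ranges in kJ/day, as stated in the text.
# (Dividing by 4.184 kJ/kcal gives roughly 600-6000 kcal for women
# and 800-8000 kcal for men.)
ENERGY_LIMITS_KJ = {
    "female": (2510.0, 25104.0),
    "male": (3347.0, 33472.0),
}


def has_extreme_energy_intake(sex: str, energy_kj_per_day: float) -> bool:
    """Return True if reported intake falls outside the sex-specific range."""
    low, high = ENERGY_LIMITS_KJ[sex]
    return energy_kj_per_day < low or energy_kj_per_day > high
```

For example, `has_extreme_energy_intake("female", 2000.0)` flags the record for exclusion, whereas a man reporting 10,000 kJ/day is retained.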

Dietary intake assessment

Dietary intake was assessed at study years 0, 7 and 20 using the validated, interviewer-administered CARDIA Diet History, as previously described [11, 12]. Briefly, interviewers asked participants open-ended questions about dietary consumption during the past month within 100 food categories, referencing 1609 separate food items in years 0 and 7 and several thousand food items in year 20. Follow-up questions addressed serving size, frequency of consumption and common additions to foods. Diet history data were coded using the University of Minnesota Nutrition Coordinating Center (NCC) system, and foods were placed into 166 food groups using the food grouping system developed by the NCC. Food group intake was calculated as the total number of servings per day [11].

The creation of dietary pattern scores was modelled after the APDQS used in previous CARDIA studies [6, 13]. From the 166 food groups created by the NCC system, 44 condensed groups were formed based on similar nutrient characteristics and comparability with food groups previously defined [6, 13,14,15,16]. The study population was ranked into quintiles of intake (or, when the non-consumer group was large, a group of non-consumers and quartiles among consumers) for each of the 44 food groups. Dietary pattern scores were created by classifying each food group as beneficial (+), adverse (−) or neutral (0; not scored because the evidence was weak, conflicting or neutral) in terms of the recommendations of the specific diet (see electronic supplementary materials [ESM] Table 1). For the 2015 DGA and Palaeo scores, a moderate (+/−) classification was added for foods with recommended moderate consumption. The scores for all food groups were relative to the distribution of consumption in the study population (not absolute quantities), as described below.

Food groups considered beneficial by the specified dietary pattern were scored 0–4 (the highest quintile of intake received a score of 4 and the lowest a score of 0) and those considered adverse were reverse scored (highest quintile scored 0, lowest 4). Food groups with recommended moderate consumption were scored 0, 2, 4, 2, 0, with the middle quintile receiving the highest score. Dietary pattern scores were calculated by summing an individual’s scores for each food group. Neutral foods were not included in the final score. Higher scores reflect higher dietary alignment with the pre-specified patterns. The APDQS was calculated as described in previous CARDIA studies (ESM Table 2) [6, 8].
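The scoring rules above can be illustrated in a few lines of Python. This is a sketch under stated assumptions rather than the code used in the study (analyses were run in SAS): the function names are hypothetical, intake categories are assumed to be coded 0 (lowest quintile, or non-consumers) to 4 (highest), and the food groups and classifications in the usage note are illustrative only.

```python
from typing import Dict

BENEFICIAL, ADVERSE, MODERATE, NEUTRAL = "+", "-", "+/-", "0"


def food_group_points(category: int, classification: str) -> int:
    """Points for one food group; category is 0 (lowest intake) to 4 (highest)."""
    if not 0 <= category <= 4:
        raise ValueError("category must be between 0 and 4")
    if classification == BENEFICIAL:
        return category                    # highest intake scores 4
    if classification == ADVERSE:
        return 4 - category                # reverse scored
    if classification == MODERATE:
        return (0, 2, 4, 2, 0)[category]   # middle category scores highest
    if classification == NEUTRAL:
        return 0                           # neutral groups do not contribute
    raise ValueError(f"unknown classification: {classification}")


def pattern_score(categories: Dict[str, int], classifications: Dict[str, str]) -> int:
    """Sum the points over all classified food groups for one participant."""
    return sum(
        food_group_points(categories[group], classifications[group])
        for group in classifications
    )
```

With the illustrative classifications `{"whole grains": "+", "processed meat": "-", "alcohol": "+/-", "eggs": "0"}` and intake categories `{"whole grains": 4, "processed meat": 1, "alcohol": 2, "eggs": 3}`, the participant scores 4 + 3 + 4 + 0 = 11.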

Previous investigators created the APDQS historically used in CARDIA dietary studies by classifying food groups in terms of hypothesised health effects and relationship with disease to characterise diet quality [6, 15]. The 2015 Scientific Report concluded that diets associated with better health are characterised as follows: higher levels of vegetables, fruits, whole grains, low- or non-fat dairy products, seafood, legumes and nuts; moderate intake of alcohol (among adults); lower intake of red and processed meat and low intake of sugar-sweetened foods and drinks, and refined grains [5]. Foods and food groups not explicitly mentioned in the Advisory Committee’s Consensus statement were not included in the score. The modern-day Palaeo dietary pattern was modelled after indices used in previous studies [17, 18] and was based on foods assumed to have been available to humans prior to the establishment of agriculture (mainly wild-animal and uncultivated-plant sources of foods), including meats, fish, vegetables, roots, eggs and nuts and excluding grains, legumes, dairy, salt, refined sugar and processed oils [19]. To test the concept of the Scientific Report definition of EC foods, we created an EC score based on 13 major food groups predominantly comprised of ECs [5], including alcohol, butter, margarine, chocolate, dairy dessert, fried foods, fried potatoes, fruit juice, grain dessert, refined grains, salty snacks, sugar-sweetened beverages and a sweet extra category. Intake of these food groups was measured as servings per 4184 kJ (1000 kcal). For comparison purposes, a score was also calculated using absolute servings of the 13 food groups. Higher EC scores reflect higher EC intake.
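Expressing EC food-group intake per 4184 kJ is a simple energy-density conversion. The sketch below shows the arithmetic; the function name is hypothetical, and it assumes intake is available as servings per day alongside total energy intake in kJ per day.

```python
def servings_per_4184_kj(servings_per_day: float, energy_kj_per_day: float) -> float:
    """Convert absolute servings/day to servings per 4184 kJ (1000 kcal)."""
    if energy_kj_per_day <= 0:
        raise ValueError("energy intake must be positive")
    return servings_per_day * 4184.0 / energy_kj_per_day
```

For instance, 6 servings/day of an EC food group at a total intake of 8368 kJ/day corresponds to 3 servings per 4184 kJ.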

To examine the association of the dietary patterns with type 2 diabetes risk, we calculated cumulative average scores for each pattern for each participant. For participants with follow-up time ≤7 years, baseline scores were used. For participants with follow-up time >7 and ≤20 years, the average of scores from year 0 and year 7 was used. For participants with repeated measures of diet and follow-up time >20 years, the average of dietary pattern scores at years 0, 7 and 20 was used. Cumulative averages were calculated based on available data; individuals without repeated measures of diet were assigned their baseline dietary pattern score.
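The cumulative averaging rule can be sketched as follows. The function name is hypothetical, and one detail is an assumption not spelled out in the text: when an intermediate assessment is missing (e.g. year 7), the average is taken over whichever eligible assessments are available.

```python
from typing import Dict, Optional


def cumulative_average_score(
    scores_by_year: Dict[int, Optional[float]], follow_up_years: float
) -> float:
    """Average the diet scores measured up to the participant's follow-up time."""
    if follow_up_years <= 7:
        eligible = (0,)
    elif follow_up_years <= 20:
        eligible = (0, 7)
    else:
        eligible = (0, 7, 20)
    # Assumption: missing assessments are skipped and the rest are averaged.
    available = [
        scores_by_year[year]
        for year in eligible
        if scores_by_year.get(year) is not None
    ]
    if not available:
        raise ValueError("a baseline (year 0) score is required")
    return sum(available) / len(available)
```

A participant with scores of 60, 66 and 72 at years 0, 7 and 20 would be assigned 60 if followed for 5 years, 63 if followed for 15 years and 66 if followed for 30 years.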

Type 2 diabetes status

Diabetes status was assessed clinically at examination years 0, 7, 10, 15, 20, 25 and 30. Incident type 2 diabetes was defined as use of diabetes medication, a fasting blood glucose level of ≥7 mmol/l (126 mg/dl), 2 h post-challenge glucose ≥11.1 mmol/l (200 mg/dl) and/or HbA1c ≥48 mmol/mol (6.5%). The 2 h glucose was measured at years 10, 20 and 25, while HbA1c was measured at years 20, 25 and 30. Details on blood collection and laboratory procedures have been published elsewhere [10]. In the CARDIA study there was no differentiation between type 1 and type 2 diabetes; however, it is likely that most incident cases identified during follow-up are type 2 diabetes given the age of the cohort. To avoid potential misclassification, a sensitivity analysis was performed excluding participants who developed diabetes before the age of 30 years.
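The case definition above amounts to an 'any criterion met' rule applied to the measurements available at a given examination. A minimal sketch, with hypothetical function and parameter names and with unmeasured values passed as `None`:

```python
from typing import Optional


def meets_diabetes_criteria(
    on_diabetes_medication: bool,
    fasting_glucose_mmol_l: Optional[float] = None,
    two_hour_glucose_mmol_l: Optional[float] = None,
    hba1c_mmol_mol: Optional[float] = None,
) -> bool:
    """True if any criterion from the text is met; None means 'not measured'."""
    return (
        on_diabetes_medication
        or (fasting_glucose_mmol_l is not None and fasting_glucose_mmol_l >= 7.0)
        or (two_hour_glucose_mmol_l is not None and two_hour_glucose_mmol_l >= 11.1)
        or (hba1c_mmol_mol is not None and hba1c_mmol_mol >= 48.0)
    )
```

Because 2 h glucose and HbA1c were measured only at some examinations, the function treats missing values as uninformative rather than as negative results.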

Covariates

At all CARDIA examinations, participants completed questionnaires on behaviours and sociodemographic, psychosocial and medical background [10]. Physical activity was assessed using the CARDIA physical activity questionnaire, a validated interviewer-based self-report of duration and intensity of participation in 13 categories of exercise over the past year [20]. Physical activity was reported in exercise units (EU), where 300 EU is approximately equal to 150 min of moderate-intensity physical activity per week or 30 min of moderate-intensity activity 5 days/week [21]. Body weight was measured with light clothing to the nearest 0.2 kg and height was measured without shoes to the nearest 0.5 cm. BMI was calculated as kg/m² [10, 22]. Total energy intake was calculated from the CARDIA Diet History. Smoking status was assessed at all years using an interviewer-administered questionnaire. Participants were classified as current, former or never smokers. Participants who reported regular cigarette smoking (at least five cigarettes per week almost every week for at least 3 months) at the time of an examination were classified as current smokers. Former smokers were those who reported previously using cigarettes but not currently smoking. Never smokers reported no history of cigarette smoking. Self-report of cigarette smoking in CARDIA was validated at baseline against a biochemical marker of nicotine uptake (serum cotinine) and misclassification was found to be low [23].

Statistical analysis

Participants’ demographic, lifestyle and clinical characteristics were described by quartile of cumulative average dietary pattern score, using means with SD for continuous variables and frequencies with percentages for categorical variables. To compare characteristics between quartiles, ANOVA and χ² tests were performed for continuous variables and categorical variables, respectively. Survival analysis using multivariable Cox proportional hazards models was used to estimate the HRs and corresponding 95% CIs for incident diabetes. Separate models were fit for the 2015 DGA, Palaeo, EC and APDQS scores. Follow-up time was calculated as the time (years) from baseline to the first CARDIA study examination where diabetes was identified. Otherwise, participants were censored at the last diabetes status examination before death, loss to follow-up or end of cohort surveillance, whichever came first.
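The follow-up and censoring rule can be sketched as below. The function name and input format are hypothetical; nominal examination years are used as a simple stand-in for elapsed time, whereas the study's exact time calculation is not described beyond the text above.

```python
from typing import List, Optional, Tuple


def follow_up(exams: List[Tuple[int, Optional[bool]]]) -> Tuple[int, bool]:
    """Return (time_in_years, event_observed) for one participant.

    exams holds (examination_year, diabetes_status) pairs, where the status
    is True, False, or None when diabetes status was not assessed.
    """
    last_assessed_year: Optional[int] = None
    for year, has_diabetes in sorted(exams, key=lambda exam: exam[0]):
        if has_diabetes is None:
            continue                      # examination missed or status unknown
        if has_diabetes:
            return year, True             # event at first exam with diabetes
        last_assessed_year = year
    if last_assessed_year is None:
        raise ValueError("no examination with known diabetes status")
    return last_assessed_year, False      # censored at last assessed exam
```

For example, a participant who was diabetes-free at years 0, 7 and 10 and met the criteria at year 15 contributes an event at 15 years, whereas one last assessed (diabetes-free) at year 20 is censored at 20 years.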

For the analyses, participants were ranked into quartiles of cumulative average dietary pattern score; the lowest quartile for each score served as the reference group. Multivariable models adjusted for preselected potential demographic and lifestyle confounders were used. The base model (model 1) adjusted for age, race, sex and CARDIA study field centre. Model 2 included model 1 covariates plus sociodemographic and lifestyle confounders (smoking, education, energy intake and physical activity). We used the most recent smoking status and education level prior to diabetes diagnosis and the cumulative average of years 0, 7 and 20 energy intake and physical activity data to account for repeated measures of these variables. A third model adjusted for model 2 covariates plus cumulative average BMI. Cumulative averages were calculated for energy intake, physical activity and BMI in the same way as dietary pattern scores, using data from years 0, 7 and 20 or until censoring. Since family history of diabetes was missing for 698 participants (15%), we repeated the analysis using both models in the subset of the population who had these data (n = 4021). Prior to adjustment for BMI, EC models were additionally adjusted for cumulative average fruit, vegetable, whole grain, red meat, fish and dairy intake. Dietary pattern scores were also examined individually as continuous variables to test for linear trend, with the HRs calculated per SD of the score.
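For readers less familiar with the notation, the quartile and per-SD models described above take the standard Cox form shown below. The symbols are introduced here for illustration and do not appear in the original text: $Q_{iq}$ indicates membership of participant $i$ in score quartile $q$ (quartile 1 is the reference), $\mathbf{x}_i$ collects the adjustment covariates of the chosen model, $s_i$ is the cumulative average score and $z_i$ its standardised value.

```latex
\[
h_i(t) = h_0(t)\,\exp\!\Big(\sum_{q=2}^{4} \beta_q\, Q_{iq}
        + \boldsymbol{\gamma}^{\top}\mathbf{x}_i\Big),
\qquad \mathrm{HR}_q = e^{\beta_q}
\]
\[
z_i = \frac{s_i - \bar{s}}{\mathrm{SD}(s)}, \qquad
h_i(t) = h_0(t)\,\exp\!\big(\beta_z z_i
        + \boldsymbol{\gamma}^{\top}\mathbf{x}_i\big),
\qquad \mathrm{HR}_{\text{per SD}} = e^{\beta_z}
\]
```

Under this parameterisation, an HR of 0.86 per SD corresponds to a log-hazard coefficient of about −0.15 for each 1 SD higher score.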

We tested for effect modification by sex, race, education, BMI and smoking using models that included an interaction term for the variable of interest and each dietary pattern score separately and by stratification. To evaluate how a change in diet quality over time might impact type 2 diabetes risk, we conducted a sensitivity analysis in individuals with baseline and year 20 diet data who were free of diabetes through to year 20. An individual’s diet was categorised as stable low or stable high if their baseline score was less than or greater than the median, respectively, and the difference between their year 20 and baseline scores was within 1 SD of the population baseline score. An increase or decrease in diet quality was defined as a year 20 score more than 1 SD above or below the baseline score, respectively. As a secondary analysis to inform the interpretation of the EC score, we examined the relationship between quartiles of per cent of total energy intake from added sugar and saturated fats at the year 20 dietary assessment and type 2 diabetes risk in a subgroup of individuals with available dietary data who were free of type 2 diabetes through year 20 (n = 2436). To avoid potential misclassification, a sensitivity analysis was performed excluding participants who developed diabetes before the age of 30 years (n = 8). The proportional hazards assumption was tested by including an interaction term with natural log-transformed time for each covariate. There was no evidence that the assumption was violated in any of the models. All analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC, USA).
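The four change categories used in this sensitivity analysis can be expressed as a small decision rule. The sketch below uses hypothetical names and makes one assumption the text leaves open: a baseline score exactly equal to the median is assigned to the stable low group.

```python
def change_category(
    baseline_score: float,
    year20_score: float,
    baseline_median: float,
    baseline_sd: float,
) -> str:
    """Classify 20 year change in a diet score relative to the baseline SD."""
    difference = year20_score - baseline_score
    if difference > baseline_sd:
        return "increase"
    if difference < -baseline_sd:
        return "decrease"
    # Change within 1 SD: classify by baseline position relative to the median.
    # Assumption: a score equal to the median is treated as stable low.
    return "stable high" if baseline_score > baseline_median else "stable low"
```

With an illustrative median of 55 and an SD of 12, a participant moving from 50 to 65 is classified as 'increase', one moving from 60 to 62 as 'stable high' and one moving from 70 to 50 as 'decrease'.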

Results

Demographic, socioeconomic, lifestyle and clinical cardiometabolic characteristics by quartiles of dietary pattern scores are presented in Tables 1, 2, 3 and 4. The 2015 DGA, APDQS and Palaeo scores were positively correlated and the EC score was negatively correlated with all other dietary pattern scores (Table 5). Descriptive dietary characteristics of the scores are presented in ESM Tables 3–6. ESM Table 7 displays 20 year changes in the 2015 DGA and APDQS score in participants who had baseline and year 20 diet data and who were free from diabetes through year 20.

Table 1 Baseline characteristics of CARDIA study participants according to year 0, 7 and 20 cumulative average 2015 DGA scientific report diet score quartile
Table 2 Baseline characteristics of CARDIA study participants according to year 0, 7 and 20 cumulative average Palaeo diet score quartile
Table 3 Baseline characteristics of CARDIA study participants according to year 0, 7 and 20 cumulative average APDQS quartile
Table 4 Characteristics of CARDIA study participants according to year 0, 7 and 20 cumulative average empty calorie score quartile
Table 5 Pearson’s correlations between dietary pattern scores

A total of 680 incident cases of diabetes occurred during follow-up (mean [SD] 25.3 [8.3] years). As presented in Table 6, there was no association between the 2015 DGA, Palaeo and EC scores and type 2 diabetes risk in either multivariable model but there was a strong inverse association between higher APDQS score and type 2 diabetes risk.

Table 6 Association between year 0, 7 and 20 cumulative average dietary patterns and 30 year diabetes risk in young adult men and women from the CARDIA study, 1985–2015

The results for the EC score were consistent using both the score calculated per 4184 kJ of total energy intake and absolute total servings of foods high in ECs (data not presented). In a secondary analysis, a higher percentage of total energy intake from added sugar and saturated fat at year 20 was associated with an increased risk for type 2 diabetes (HR 1.16 [95% CI 1.04, 1.30]; p < 0.01) but the association was attenuated by adjustment for lifestyle factors, diet quality and BMI (ESM Table 8).

Repeated measures of the dietary pattern scores were correlated over time (r = 0.34–0.65) (ESM Table 9). The association between 20 year change in dietary pattern score and type 2 diabetes risk is presented in ESM Table 10. Individuals whose DGA score increased between year 0 and year 20 by more than one SD of year 0 scores (12.0) had a 44% lower risk of type 2 diabetes compared with those with stable DGA scores below the population median. Individuals with stable high (>median) Palaeo scores had a 41% lower risk of type 2 diabetes than those with stable low scores (per SD, HR 0.59 [95% CI 0.39, 0.88]). There was no association between changes in APDQS and EC scores as formulated in this analysis and type 2 diabetes risk.

Sensitivity analyses provided evidence for potential effect measure modification by smoking status in the DGA analysis (p for interaction = 0.007). When the DGA score was stratified on smoking status (current smokers vs non-smokers) with HRs calculated per SD of diet score, the 2015 DGA score was inversely associated with type 2 diabetes risk in non-smokers (n = 2716) (HR 0.86 [95% CI 0.74, 0.99]; p = 0.03) but no association was observed for current smokers (n = 1368) (HR 0.92 [95% CI 0.77, 1.10]; p = 0.36). In stratified analysis by education level, we observed an inverse association in participants with a college degree or higher (per SD, HR 0.75 [95% CI 0.61, 0.92]) but no association in those without a degree. There was no evidence that the association between cumulative average Palaeo, APDQS or EC scores and type 2 diabetes risk differed by smoking status. Stratified analyses, as well as formal tests for interaction between dietary pattern scores and race, sex, BMI and family history of diabetes, provided no evidence of effect modification by these factors. For the EC score, there was an inverse association in white women (HR 0.76 [95% CI 0.60, 0.96]) but null associations in all other groups. Results excluding incident diabetes cases documented before age 30 years (n = 8) were not materially different from the main analysis.

Discussion

In the CARDIA study, a cumulative average 2015 DGA score was not associated with type 2 diabetes in the main models, but in pre-specified stratified analyses there was an inverse association in participants who did not smoke and in participants with a college degree, as well as in a subgroup of participants who increased their score over 20 years. The Palaeo score was not associated with type 2 diabetes in the main models, but in a sensitivity analysis participants with a consistently high score over 20 years had a lower risk of type 2 diabetes. The EC score was not associated with type 2 diabetes in the main models, but we did observe an inverse association between higher EC scores and type 2 diabetes risk in white women in the CARDIA study, contrary to our hypothesis. Last, the cohort reference APDQS score was strongly inversely associated with type 2 diabetes.

We are not aware of published research examining the relationship between a dietary pattern score based on the 2015 Scientific Report and incident type 2 diabetes. Previous observational studies of dietary pattern scores created to reflect earlier versions of the DGA have produced inconclusive results. Zamora et al. found no association between the 2005 Diet Quality Index and 20 year type 2 diabetes risk in the CARDIA study [24]. Other prospective studies also found no association between Healthy Eating Index (HEI) scores, reflecting both the 2005 and 2010 DGAs, and type 2 diabetes risk [25, 26]. However, the Alternate Healthy Eating Index (AHEI), which reflects the prior versions of the DGA and also includes foods and nutrients predictive of chronic diseases [27], has been associated with a decreased risk for type 2 diabetes in some study populations [25, 26, 28] but not others [29]. The heterogeneity in the overall body of evidence related to testing the DGA recommendations, including this study, suggests the need to account for the context of the population under study and the known limitations of observational dietary research.

Research examining the relationship between a modern-day Palaeo dietary pattern and incident type 2 diabetes is also lacking. Proponents of the Palaeo diet assert that humans are genetically adapted to foods available prior to changes in the food supply with the establishment of agriculture and that core metabolic processes central to common chronic disease aetiology are misaligned with these dietary changes [19, 30]. There was no association between a cumulative average modern-day Palaeo dietary pattern score and type 2 diabetes risk in the main models. However, in a subgroup analysis, participants with stable high scores (above the median at baseline and changing by no more than 1 SD by year 20) had a lower risk of type 2 diabetes compared with participants with stable low scores. This suggests that a diet more closely aligned with a Palaeo dietary pattern may be beneficial when maintained over time, but it is necessary to consider whether this finding was due to potential selection biases. This pattern differs from most dietary recommendations by eschewing food groups with evidence of being protective for type 2 diabetes risk (legumes, whole grains, dairy) [31,32,33] and emphasising animal protein, which in the context of a typical western diet is largely considered a dietary factor that increases type 2 diabetes risk [34].

The objective of examining an EC score was to investigate a largely untested concept. We hypothesised that there would be a positive association between higher EC intake and diabetes risk but there was no association in the main models. Indeed, there was an unexpected inverse association in white women in the CARDIA study, suggesting potential confounding. There is a scientific basis for the recommendation to avoid an EC dietary pattern but the definition is nebulous and encompasses heterogeneous foods and nutrients. Evidence informing the concept is largely reductionist in nature or extrapolated, which makes rigorous research on the concept difficult to conduct and interpret.

Although the 2015 DGA and APDQS were highly correlated, it is important to contextualise their differences since imprecision surrounding the components due to the limitations of self-reported dietary data may have impacted the strength of the associations detectable in this cohort. First, alcohol was scored moderately in the 2015 DGA score and positively in the APDQS. Foods scored positively in the APDQS were not included in the 2015 DGA score (e.g. oil, poultry, coffee and tea). In addition, the APDQS negatively scored butter, fried foods and whole-fat dairy products, which were not included in the 2015 DGA score, while fruit juice, margarine and refined grains were negatively scored in the 2015 DGA score but were not included in the APDQS. The similarity between the scores included positive scoring of fruits, vegetables, avocado, nuts and seeds, legumes, fish, low-fat dairy products and whole grains and negative scoring of red and processed meats, sugar-sweetened foods and beverages, fried potatoes and salty snacks. Individuals in the highest quartile of the APDQS consumed approximately 3 g fibre/day more than those in the highest quartile of the DGA score (ESM Tables 3, 5). The inverse association between dietary fibre intake and type 2 diabetes risk is well established [35]; however, the data used in this study are not suitable for further inference on this observation. The study of dietary patterns as a whole, as opposed to the reductionist approach of examining single foods or food groups in isolation, has been increasingly recognised as the optimal approach to study diet–disease relationships because patterns are better suited to account for the correlated nature of dietary data and theoretical synergistic effects of total diet on health [36].

Smoking has been identified as a strong, independent risk factor for diabetes [37] and previous studies suggest that smoking may blunt the association between diet and diabetes risk [38,39,40]. Multiple pathophysiological pathways have been implicated in the association between smoking and type 2 diabetes [41,42,43]. Smoking may also influence the detection of diabetes through its effect on oral and intravenous glucose tolerance tests [44]. The association in smokers may further be confounded by the clustering of unhealthful behaviours and socioeconomic status [37]. These potential mechanisms may help explain the differential association in non-smokers vs smokers for the 2015 DGA.

Although the statistical models were comprehensive, residual confounding occurs to some extent in all diet–disease observational studies. Self-reported dietary data are also subject to recall and other biases that may alter estimates, potentially resulting in false-positive or false-negative associations. The validity and reliability of the CARDIA Diet History have been demonstrated [11, 12]. Any null findings were not due to a lack of statistical power as this study was adequately powered to detect an association between dietary intake and type 2 diabetes, though a larger sample size could improve the precision of the estimates given these limitations [45].

In conclusion, adherence to contemporary dietary recommendations (2015 DGA) was inversely associated with risk of type 2 diabetes in non-smokers and individuals with a college degree. A modern-day Palaeo dietary pattern score was not associated with risk of type 2 diabetes except in a subgroup of participants who maintained a high score over 20 years. A score created to reflect high EC intake was not associated with type 2 diabetes in main models but there was a surprising inverse association between empty calorie intake and type 2 diabetes in white women in the cohort. The cohort reference APDQS, which was highly correlated with the 2015 DGA and similarly scored, was strongly inversely associated with type 2 diabetes. Overall, examination of dietary recommendations and trends in the CARDIA cohort and their relationship with risk for type 2 diabetes produced nuanced results that should be considered in the context of different potential biases.