Background

Major depressive disorder is a chronic and recurring condition with an estimated lifetime prevalence of 14.6% in high-income countries and 11.1% in low- and middle-income countries [1, 2]. In addition, major depressive disorder is a leading source of disability worldwide [3, 4], and is associated with diminished quality of life and medical morbidity [2, 4, 5]. An accumulating body of evidence also indicates that major depressive disorder may confer a higher risk for several non-communicable diseases (for example, diabetes [6], obesity [7], stroke [8], acute myocardial infarction [9], dementia [10], and physical health multimorbidity [11]), while these chronic health conditions appear to increase the likelihood of developing depression [7, 12,13,14,15].

It has long been suggested that depression is associated with elevated all-cause mortality [16, 17], and is an established risk factor for completed suicide [18]. In addition, depression has been associated with higher mortality rates across several settings and populations, including community samples, inpatients/outpatients, and patients with specific medical conditions (for example, stroke, diabetes, and coronary heart disease) [9, 16, 19, 20]. However, consistent evidence has not shown that specific interventions targeting depression may increase survival in both community and clinical samples. Furthermore, several confounding variables may account for the observed associations between depression and survival, namely sociodemographic variables [21], physical inactivity [22, 23], higher smoking rates [24], follow-up duration of studies [16], and co-occurring medical and psychiatric conditions [5, 25].

Several individual systematic reviews and meta-analyses have investigated the association between depression and mortality across distinct populations (for example, in community samples as well as in samples with specific chronic diseases) [16, 20, 26,27,28]. To synthesize and evaluate the available evidence we conducted an umbrella review of systematic reviews and meta-analyses that assessed the association of depression and all-cause and cause-specific mortality. The strength of the evidence supporting these associations and hints of bias were evaluated using standardized approaches [8, 29,30,31].

Methods

Literature search

We conducted an umbrella review, which is the systematic collection and assessment of multiple systematic reviews and meta-analyses on a specific research topic [29]. The PubMed/MEDLINE, EMBASE, and PsycINFO databases were searched from inception up to January 20, 2018, for systematic reviews and meta-analyses of observational studies that examined the association of depression and all-cause or cause-specific mortality. A pre-defined search strategy was used (Additional file 1).

Eligibility criteria

We included systematic reviews and meta-analyses of observational epidemiological studies performed in humans that assessed the impact of depression on all-cause or cause-specific mortality in any specific population (for example, community samples, samples with a specific medical condition, inpatients, etc.). Systematic reviews and meta-analyses that solely investigated the association of depression and suicide-related deaths were not considered; this was not an aim of the current effort as depression is an established risk factor for completed suicide [18]. However, suicide-related deaths were considered in meta-analyses that estimated the association of depression and all-cause mortality across different populations. No language restrictions were considered for the selection of systematic reviews and meta-analyses for this umbrella review. We included unique observational studies derived from all available systematic reviews and meta-analyses on a specific topic. Whenever a meta-analysis included a lower number of component studies compared to another meta-analysis on the same topic, the former was excluded only if all its individual datasets were included in the larger meta-analysis. Otherwise, we also extracted data from non-overlapping datasets included only in the meta-analysis with fewer studies. This approach aimed to synthesize the largest possible body of evidence derived from available systematic reviews and meta-analyses. Across each eligible systematic review and/or meta-analysis we considered studies in which the case definition of depression was based on either International Classification of Diseases [32] (ICD), Diagnostic and Statistical Manual of Mental Disorders [33] (DSM), or other consensus-based acceptable criteria (e.g., the Research Diagnostic Criteria [34]).
We also included studies where depression was assessed by means of a screening instrument with a specific cutoff score (e.g., the Patient Health Questionnaire-9 and the Beck Depression Inventory). We excluded individual studies from eligible systematic reviews and meta-analyses according to the following criteria: (1) reported an association only for depressive symptoms (i.e., the association was reported for an increase in scores of a depression rating scale instead of a possible diagnosis of depression based on a screening tool with a cutoff point); (2) considered other mental disorders (e.g., dysthymia) in the mortality outcome assessment unless data for depression, as defined above, was provided separately; (3) a diagnosis of depression was based only on clinical evaluation without any specification of the diagnostic criteria; (4) a diagnosis of depression was based only on the use of antidepressants or otherwise on a self-reported (or record-based) history of depression; (5) the association was reported considering other outcomes in addition to mortality (e.g., recurrence); and (6) studies that provided results based on controls that were not included in the original sample (for example, studies that estimated the associations of depression and mortality through standardized mortality ratios compared with general population data external to the study sample).

Two authors (MOM and NV) independently screened the titles and abstracts of retrieved references for eligibility. The full texts of potentially eligible articles were then independently scrutinized in detail by two investigators (MOM and NV). Disagreements were resolved through consensus or discussion with a third investigator (CAK or AFC).

Data extraction

Data extraction was done independently by two investigators (MOM and NV) and, in case of discrepancies, a third investigator (CAK or AFC) made the final decision. For each eligible reference, we recorded the first author, year, journal of publication, specific populations evaluated and the number of included studies. If a quantitative synthesis was performed, we also extracted the most fully adjusted study-specific risk estimates (relative risk, odds ratio, hazard ratio, or incident risk ratio) and corresponding 95% confidence intervals (CIs). When available, we also extracted the following variables from each study: number of cases (number of death events in participants with depression), sample size, follow-up time, covariates included in multivariable models, method used to define depression (i.e., structured diagnostic interview or screening instrument), study design (case-control, prospective cohort, or retrospective cohort), specific population, as well as the setting and country where the study was conducted. Whenever studies used several control groups, we considered data from healthy controls as the control group. For studies with no quantitative synthesis, the authors’ main interpretations about their findings and reasons why a meta-analysis was not conducted were recorded.

Statistical analysis and methodological quality appraisal

We based our analysis on the largest meta-analysis that evaluated the association of depression and all-cause or cause-specific mortality. Furthermore, all datasets from similar meta-analyses that were not included in the largest available one were also considered (i.e., we included all datasets from the smaller meta-analysis that did not overlap with the larger one). We then estimated effect sizes (ES) and 95% CIs through both fixed and random effects models [35]. We also estimated the 95% prediction interval, which further accounts for between-study heterogeneity, and evaluates the uncertainty of the effect that would be expected in a new study addressing the same association [36, 37]. For the largest dataset of each meta-analysis, we calculated the standard error of the ES. If the standard error is < 0.1, then the 95% CI extends < 0.20 on either side of the point estimate (i.e., less than the magnitude of a small ES). We calculated the I² metric to quantify between-study heterogeneity. Values ≥ 50% indicate large heterogeneity, and values ≥ 75% are indicative of very large heterogeneity [38, 39]. To assess evidence for small-study effects we used the asymmetry test developed by Egger et al. [40]. A P value < 0.10 in the Egger’s test and the ES of the largest study being more conservative than the summary random effects ES of the meta-analysis were considered indicative of small-study effects [41]. Finally, evidence of an excess of significance was assessed by the Ioannidis test [42]. Briefly, this test estimates whether the number of studies with nominally significant results (i.e., P < 0.05) among those included in a meta-analysis is too large considering their power to detect significant effects at an alpha level of 0.05. First, the power of each study is estimated with a non-central t distribution. The sum of all power estimates provides the expected (E) number of datasets with nominal statistical significance.
The actual observed (O) number of statistically significant datasets is then compared to the E number using a χ²-based test [42]. Since the true ES of a meta-analysis cannot be precisely determined, we considered the ES of the largest dataset as the plausible true ES. This decision was based on the fact that simulations indicate that the most appropriate assumption is the ES of the largest dataset included in the meta-analysis [43]. Excess significance for a single meta-analysis was considered if P < 0.10 in Ioannidis’s test and O > E. We graded the credibility of each association with standard approaches on the following categories [31, 44]: convincing (class I), highly suggestive (class II), suggestive (class III), weak evidence, and non-significant associations (Table 1).
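The fixed and random effects pooling described above can be illustrated with a minimal Python sketch. It assumes that effect sizes are pooled on the log scale with inverse-variance weights and that the between-study variance is obtained with the DerSimonian-Laird moment estimator; the text does not name the estimator, so that choice is an assumption, and the function name pool_effects and the example numbers are hypothetical. It is not the STATA code used for the analyses.

```python
import math


def pool_effects(log_es, se):
    """Pool log effect sizes with fixed- and random-effects models.

    log_es: study effect sizes on the log scale (e.g. log hazard ratios)
    se:     their standard errors
    The random-effects step uses the DerSimonian-Laird estimator,
    which is an assumption of this sketch.
    """
    k = len(log_es)
    w = [1.0 / s ** 2 for s in se]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_es)) / sw

    # Cochran's Q: weighted squared deviations from the fixed-effect summary
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_es))

    # Moment estimate of the between-study variance, truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    sw_re = sum(w_re)
    random_es = sum(wi * yi for wi, yi in zip(w_re, log_es)) / sw_re

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sw),
        "random": random_es,
        "se_random": math.sqrt(1.0 / sw_re),
        "q": q,
        "tau2": tau2,
    }


# Hypothetical hazard ratios and standard errors of their logarithms
hazard_ratios = [1.20, 1.55, 1.10, 1.80]
standard_errors = [0.08, 0.15, 0.05, 0.25]
result = pool_effects([math.log(hr) for hr in hazard_ratios], standard_errors)
low = math.exp(result["random"] - 1.96 * result["se_random"])
high = math.exp(result["random"] + 1.96 * result["se_random"])
print(f"random-effects summary: {math.exp(result['random']):.2f} "
      f"(95% CI {low:.2f} to {high:.2f})")
```

When the studies agree closely, the between-study variance is truncated at zero and the two models coincide; otherwise the random effects interval is wider.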

Table 1 Criteria for classification of the credibility of the evidence (adapted from reference [31])

For associations supported by either class I or II evidence, we conducted additional analyses. First, grading of the evidence was re-assessed through sensitivity analyses (when at least three independent datasets were available for each subgroup). The following analyses were considered: (1) prospective cohort studies; (2) studies in which the ascertainment of depression was performed by means of a structured diagnostic interview; (3) studies that provided estimates adjusted for potential confounding variables through multivariable models; (4) studies from which estimates were adjusted at least for sex and age; (5) studies that adjusted for characteristics of the underlying somatic disease (i.e., whenever the association of depression and mortality was assessed in a population with a specific somatic condition); (6) studies that adjusted estimates for the presence of co-morbid diseases (including mental and/or somatic conditions); (7) settings from which samples were derived (community, primary care, outpatient samples, or inpatient samples); and (8) studies in which the follow-up time was longer than 5 years. Finally, we used credibility ceilings, a method of sensitivity analysis that accounts for potential methodological limitations of observational studies that might lead to spurious precision of combined effect estimates. In brief, this method assumes that every observational study has a probability c (credibility ceiling) that the true effect size is in a different direction from the one suggested by the point estimate [45, 46]. The pooled effect sizes were re-estimated considering a wide range of credibility ceiling values [30, 45]. All analyses were conducted in STATA/MP 14.0 (StataCorp, USA) with the metan package.
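As a rough illustration of the credibility ceiling idea, the Python sketch below widens the standard error of any study whose one-sided probability of an effect in the opposite direction falls below c, so that this probability equals c, and then re-pools the studies. It assumes normally distributed log effect sizes and uses a simple inverse-variance (fixed effect) pooling step for brevity; both are simplifications rather than the published procedure, and the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def apply_credibility_ceiling(log_es, se, c):
    """Return standard errors inflated to respect a credibility ceiling c.

    After inflation, each study leaves a probability of at least c that
    the true effect lies on the other side of the null (normal assumption).
    """
    if not 0.0 <= c < 0.5:
        raise ValueError("c must lie in [0, 0.5)")
    if c == 0.0:
        return list(se)                       # no ceiling: nothing changes
    z_c = _STD_NORMAL.inv_cdf(1.0 - c)        # z-score leaving c in one tail
    # The tail probability is Phi(-|y| / s); it equals c when s = |y| / z_c
    return [max(s, abs(y) / z_c) for y, s in zip(log_es, se)]


def pooled_z(log_es, se):
    """Inverse-variance pooled estimate divided by its standard error."""
    w = [1.0 / s ** 2 for s in se]
    estimate = sum(wi * yi for wi, yi in zip(w, log_es)) / sum(w)
    return estimate / math.sqrt(1.0 / sum(w))


# Hypothetical log effect sizes and standard errors
effects = [0.5, 0.4, 0.6]
errors = [0.1, 0.1, 0.1]
for ceiling in (0.0, 0.1, 0.2):
    inflated = apply_credibility_ceiling(effects, errors, ceiling)
    print(f"c = {ceiling:.1f}: pooled z = {pooled_z(effects, inflated):.2f}")
```

Larger ceilings inflate more of the study variances, so the pooled estimate loses precision and may stop being nominally significant.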

Two investigators (MOM and NV) independently rated the methodological quality of included systematic reviews and meta-analyses with the Assessment of Multiple Systematic Reviews (AMSTAR) instrument, which has been validated for this purpose [47,48,49]. Scores range from 0 to 11 with higher scores indicating greater quality. The AMSTAR tool involves dichotomous scoring (i.e., 0 or 1) of 11 related items to assess methodological rigor of systematic reviews and meta-analyses (e.g., comprehensive search strategy, publication bias assessment). AMSTAR scores are graded as high (8–11), medium (4–7), and low quality (0–3) [47].
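The AMSTAR scoring and grading rule stated above can be written as a short Python sketch. The function name amstar_grade and the example ratings are hypothetical; the cutoffs are those given in the text.

```python
def amstar_grade(items):
    """Sum 11 dichotomous AMSTAR items and grade the total score.

    items: sequence of eleven 0/1 ratings, one per AMSTAR item.
    Returns (score, grade) with grade in {"high", "medium", "low"}.
    """
    if len(items) != 11 or any(item not in (0, 1) for item in items):
        raise ValueError("expected eleven ratings, each 0 or 1")
    score = sum(items)
    if score >= 8:
        grade = "high"        # 8 to 11
    elif score >= 4:
        grade = "medium"      # 4 to 7
    else:
        grade = "low"         # 0 to 3
    return score, grade


# Hypothetical rating of one review: six items fulfilled, five not
print(amstar_grade([1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]))
```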

Results

Overall, the titles and abstracts of 4983 references were screened for eligibility. The full texts of 52 references were then scrutinized in detail, of which 19 were excluded with reasons (Additional file 1: Table S1), while 26 references met inclusion criteria (Fig. 1). Overall, 24 references provided quantitative synthesis of evidence [16, 19, 20, 26,27,28, 50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67], and 2 references were qualitative systematic reviews [68, 69]. This umbrella review included 238 prospective studies and 8 retrospective cohort studies and comprised data from 3,825,380 participants, including 293,073 participants with depression and 282,732 death events, which were grouped in 17 meta-analytic estimates (Additional file 1: Table S2). Overall, 246 eligible studies were derived from included meta-analyses, while 667 component studies were excluded from eligible meta-analyses due to the following reasons: datasets were included in more than one meta-analysis (k = 375); other mental disorders (e.g., dysthymia) were considered in the association between depression and mortality (k = 14); a diagnosis of depression was based only on clinical evaluation without any specification of the diagnostic criteria (k = 7); a diagnosis of depression was based only on the use of antidepressants (k = 5); the association included other outcomes besides mortality (e.g., recurrence) (k = 5); overlapping samples (k = 20); did not provide data for ES estimation (k = 12); a diagnosis of depression was not established according to inclusion criteria (k = 223); and assessed the impact of depression on mortality considering standardized mortality ratios against general population data external to the study (k = 6).
Overall, 165 studies (67.1%) provided adjusted association metrics, with a median number of 5 (IQR 3–8) covariates controlled for in multivariable models (see Additional file 1: Table S3 for the list of factors that were considered in multivariable models in studies derived from eligible meta-analyses). The median follow-up time of included studies was 4.5 years (IQR 2–7.5). The median AMSTAR score of eligible systematic reviews and meta-analyses was 6 (IQR 5–7.5). Scores of each domain of the AMSTAR instrument are provided in Additional file 1: Table S4.

Fig. 1 Study flowchart

Evidence from qualitative systematic reviews

A systematic review that included 3 studies suggested that depression could be associated with reduced long-term survival in patients with head and neck cancers [68]. In addition, a systematic review that included 11 studies that assessed the association of depression and mortality in chronic obstructive pulmonary disease (COPD) met inclusion criteria. The authors concluded that depression could be associated with an increase in early mortality in patients with COPD [69].

Summary effect sizes

At a threshold of P < 0.05, summary ESs were significant for all 17 (100%) meta-analytic estimates in both fixed and random effects models (Additional file 1: Table S2). At a more conservative threshold of P < 0.001, 16 (94.1%) and 9 (52.9%) estimates were significant in fixed and random effects models, respectively. At a threshold of P < 10⁻⁶, 12 (70.6%) and 5 (29.4%) meta-analyses were statistically significant in fixed and random effects models, respectively.

Heterogeneity between studies

Six meta-analyses (35.3%) showed large heterogeneity (I² = 50–75%) and 5 (29.4%) exhibited very large heterogeneity (I² > 75%) (Additional file 1: Table S5). We further assessed the uncertainty of the summary effects by calculating their 95% prediction intervals; the null value was excluded in only 3 associations, namely in all-cause mortality in coronary artery bypass graft patients, coronary heart disease patients, and COPD patients.
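For readers unfamiliar with these two quantities, the following Python sketch shows how the heterogeneity metric and a 95% prediction interval are commonly computed from a random effects meta-analysis. It assumes the usual definition of the heterogeneity metric from Cochran's Q and a prediction interval based on a t distribution with k − 2 degrees of freedom; the critical value is passed in by the caller (for example from statistical tables), and all function names and numbers are hypothetical.

```python
import math


def i_squared(q, k):
    """Percentage of total variation attributable to heterogeneity.

    q: Cochran's Q statistic; k: number of datasets.
    """
    df = k - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2):
    """Apply the thresholds stated in the Methods (>= 50%, >= 75%)."""
    if i2 >= 75.0:
        return "very large"
    if i2 >= 50.0:
        return "large"
    return "not large"


def prediction_interval(mu, se_mu, tau2, t_crit):
    """95% prediction interval for the effect in a new study.

    mu, se_mu: random-effects summary (log scale) and its standard error
    tau2:      between-study variance
    t_crit:    0.975 quantile of a t distribution with k - 2 degrees
               of freedom, supplied by the caller
    """
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return mu - half_width, mu + half_width


# Hypothetical meta-analysis of k = 11 datasets (t quantile for 9 df)
i2 = i_squared(q=20.0, k=11)
low, high = prediction_interval(mu=0.30, se_mu=0.05, tau2=0.04, t_crit=2.262)
print(f"heterogeneity: {i2:.1f}% ({heterogeneity_label(i2)})")
print(f"prediction interval on the log scale: {low:.2f} to {high:.2f}")
print("null value excluded:", not (low <= 0.0 <= high))
```

A summary effect can be nominally significant while its prediction interval still includes the null value, which is the pattern reported for most associations here.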

Small-study effects

Evidence of small-study effects was verified in 13 meta-analyses, including associations of depression and all-cause mortality in patients after coronary artery bypass grafting, with acute coronary syndrome or coronary heart disease, after stroke, post-transplant patients, and people with HIV, chronic kidney disease, heart failure, COPD, diabetes mellitus, and mixed settings, as well as associations of depression with fatal stroke and with cardiovascular mortality after acute myocardial infarction (Additional file 1: Table S5) [51].
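The small-study effects criterion can be sketched in Python as follows. The sketch assumes the classical form of Egger's regression (the standardized effect regressed on precision, with the intercept tested against zero using a t distribution with k − 2 degrees of freedom) and evaluates the t distribution by numerical integration so that only the standard library is needed. The function names and data are hypothetical, and at least three datasets with differing standard errors are required.

```python
import math


def t_two_sided_p(t_stat, df, n=2000):
    """Two-sided P value for a t statistic, by Simpson's rule on the t density."""
    a = abs(t_stat)
    if a == 0.0:
        return 1.0
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = a / n                                  # n is even, as Simpson requires
    total = pdf(0.0) + pdf(a)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                       # integral of the density on [0, |t|]
    return max(0.0, 1.0 - 2.0 * area)


def egger_test(log_es, se):
    """Regress y/se on 1/se and test whether the intercept differs from zero."""
    k = len(log_es)
    x = [1.0 / s for s in se]                  # precision
    y = [e / s for e, s in zip(log_es, se)]    # standardized effect
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    if k < 3 or sxx == 0.0:
        raise ValueError("need at least 3 datasets with differing standard errors")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(sse / (k - 2) * (1.0 / k + mx ** 2 / sxx))
    if se_int == 0.0:
        raise ValueError("perfect fit: intercept test is undefined")
    t_stat = intercept / se_int
    return {"intercept": intercept, "t": t_stat, "df": k - 2,
            "p": t_two_sided_p(t_stat, k - 2)}


def small_study_effects(p_egger, largest_study_es, random_summary_es):
    """Criterion from the Methods: P < 0.10 and a more conservative largest study."""
    return p_egger < 0.10 and abs(largest_study_es) < abs(random_summary_es)


# Hypothetical log effect sizes; less precise studies report larger effects
res = egger_test([0.30, 0.35, 0.25, 0.40, 0.50], [0.05, 0.10, 0.15, 0.20, 0.25])
print(f"intercept = {res['intercept']:.2f}, P = {res['p']:.3f}")
```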

Excess significance

We assessed excess of significance bias (i.e., the likelihood that the observed number of nominally significant studies could exceed the expected number of ‘positive’ studies for a given estimate). Eleven (64.7%) meta-analyses had evidence of excess significance bias, namely those investigating the associations of all-cause mortality and cancer, heart failure, mixed settings, coronary heart disease, acute coronary syndrome, stroke, post-transplant patients, chronic kidney disease, as well as associations of depression and fatal stroke, cardiovascular mortality in patients with diabetes mellitus, and cardiovascular mortality in mixed settings (Additional file 1: Table S5).
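The comparison of observed and expected numbers of nominally significant datasets can be illustrated with a simplified Python sketch. It takes the ES of the largest dataset (here, the one with the smallest standard error) as the plausible true effect, as described in the Methods, but approximates each study's power with a normal distribution instead of the non-central t distribution used in the analyses; that substitution, the function name, and the data are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def excess_significance(log_es, se, alpha=0.05):
    """Compare observed (O) and expected (E) numbers of significant datasets.

    Power is approximated with a two-sided z test, which is a
    simplification of the non-central t calculation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    k = len(log_es)

    # Plausible true effect: ES of the most precise (largest) dataset
    theta = log_es[min(range(k), key=lambda i: se[i])]

    # Expected count: sum of each study's power to detect theta
    expected = sum(
        std_normal.cdf(theta / s - z_crit) + std_normal.cdf(-theta / s - z_crit)
        for s in se
    )
    # Observed count: datasets that are nominally significant
    observed = sum(1 for y, s in zip(log_es, se) if abs(y / s) > z_crit)

    # Chi-square statistic with 1 degree of freedom and its P value
    a = ((observed - expected) ** 2 / expected
         + (observed - expected) ** 2 / (k - expected))
    p_value = math.erfc(math.sqrt(a / 2.0))

    return {"O": observed, "E": expected, "p": p_value,
            "excess": p_value < 0.10 and observed > expected}


# Hypothetical data: the largest dataset shows a small effect,
# while several smaller datasets are nominally significant
out = excess_significance([0.05, 0.50, 0.60, 0.45, 0.10],
                          [0.04, 0.20, 0.22, 0.18, 0.15])
print(f"O = {out['O']}, E = {out['E']:.2f}, P = {out['p']:.4f}, "
      f"excess significance: {out['excess']}")
```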

Grading of the evidence

We explored whether the nominally significant associations between mortality and depression were supported by convincing, highly suggestive, suggestive, or weak evidence (Table 2). Overall, no association was supported by convincing evidence, while associations of depression and all-cause mortality among patients with cancer, patients after acute myocardial infarction, patients with heart failure, and mixed settings (including inpatients, outpatients, and community as well as primary care samples) were supported by highly suggestive evidence. Furthermore, associations between depression and all-cause mortality in patients with coronary heart disease and diabetes mellitus were supported by suggestive evidence. Finally, the remaining 11 (64.7%) associations were supported by weak evidence (Table 2).

Table 2 Details of evidence grading for meta-analyses investigating associations of depression and mortality

Sensitivity analyses

Sensitivity analyses were performed for the four associations supported by highly suggestive evidence as per our protocol (Table 3). It is worth noting that, when studies that employed structured/semi-structured diagnostic interviews were considered, associations of depression and all-cause mortality in cancer as well as post-acute myocardial infarction became supported by weak evidence, while the association of depression and all-cause mortality in mixed settings dropped to suggestive evidence. Furthermore, when only studies that provided adjusted estimates were considered, associations of depression and all-cause mortality in cancer and post-acute myocardial infarction dropped to suggestive evidence. Moreover, the association of depression and all-cause mortality in cancer was supported by suggestive evidence only when studies that adjusted at least for age and sex were assessed in analysis.

Table 3 Sensitivity analyses for associations of depression and all-cause mortality supported by highly suggestive (class II) evidence

Sensitivity analyses through credibility ceilings were also conducted for the four associations supported by highly suggestive evidence (Additional file 1: Table S6). All associations remained significant when 10% credibility ceilings were considered, while no associations were nominally significant when 20% credibility ceilings were considered.

Discussion

The associations between mental disorders and mortality have been investigated for more than 150 years [70, 71]. The associations between depression and all-cause and cause-specific mortality have been particularly investigated across different types of settings and populations. All meta-analyses have obtained nominally statistically significant results for a higher risk of mortality in almost all the tested populations. However, no associations met criteria for convincing evidence, while only four associations, namely those of depression and all-cause mortality in cancer, heart failure, mixed settings as well as among patients after acute myocardial infarction, were supported by highly suggestive evidence. Nevertheless, our sensitivity analyses indicate that differences in case ascertainment of depression as well as the lack of proper adjustment for confounding variables and other major risk factors could reduce the level of evidence supporting several associations. Therefore, the current work suggests that causal inferences between depression and all-cause mortality across distinct populations do not appear to be as conclusive as once thought [16, 21, 72].

Several variables and mechanisms may contribute to the observed associations of depression and all-cause mortality. Some effects may be direct. For example, it has been suggested that depression activates several pathophysiological mechanisms that could contribute to the emergence of chronic somatic diseases that are consistently related to lowered survival. For instance, it has been claimed that depression is associated with peripheral inflammation [73] and oxidative stress [74], mechanisms which may contribute to the association of depression and obesity and cardio-metabolic conditions [66, 75,76,77]. However, depression may also exert indirect effects on survival. For example, a large body of evidence suggests that depression alters illness behavior [78], leading to a meaningful decrease in treatment adherence across several conditions [79, 80] as well as unhealthy lifestyles (e.g., sedentary behavior, higher prevalence of smoking, and non-salutary diet) [23, 73, 81, 82]. Depression also often co-exists with other mental health conditions that may also be associated with elevated mortality rates [25, 72]. Multivariable adjustment has varied across included studies, and only approximately 40% of included studies controlled their results at least for age and sex. Mortality analyses that do not account for at least these two major determinants of death risk are problematic. We observed that, when only studies that controlled for age and sex were considered, the association of depression and all-cause mortality in cancer was no longer supported by highly suggestive evidence. Furthermore, no association was supported by highly suggestive evidence when only studies that employed structured/semi-structured diagnostic interviews were considered. 
This is a relevant finding since recent evidence suggests that the selective use of different cutoff points may bias accuracy estimates of screening instruments for depression, even if these instruments are considered to be validated, whilst this type of bias does not apparently occur in gold-standard structured diagnostic interviews [83]. It is worth noting, however, that the association between depression and all-cause mortality among patients with heart failure remained supported by highly suggestive evidence when only studies that provided adjusted estimates, or only studies that adjusted for age and sex, were considered. Due to the lack of available datasets, sensitivity analyses considering studies that used structured/semi-structured diagnostic interviews could not be performed for this association. Therefore, further studies should be conducted to evaluate this association.

Comparison with other studies

Cuijpers et al. [51] performed the largest meta-analysis to date assessing the impact of depression on mortality. Although this previous meta-analysis concluded that depression is associated with all-cause mortality, fewer studies were available when that study was conducted. In addition, the inclusion criteria differed from ours. For example, Cuijpers et al. [51] included studies in which a diagnosis of depression was based on previous exposure to antidepressants, which are drugs used for several other medical and psychiatric indications, whilst we limited our inclusion criteria to investigations in which depression was assessed by either a structured/unstructured diagnostic interview or a screening instrument with a cut-off score, and also large-scale studies that used a coded diagnosis of depression based on well-established criteria. In addition, we estimated the credibility of the evidence in different settings and populations with state-of-the-art statistical methods used in previous umbrella reviews [8, 30].

A previous meta-review investigated the associations between severe mental disorders (including depression) and all-cause and suicide-related mortality [72]. Although the authors concluded that depression was associated with an excess of all-cause mortality, only three references were included and the credibility of the evidence was not quantitatively assessed. Finally, a recent study pooled evidence from 15 systematic reviews and meta-analyses and observed that evidence that depression is associated with all-cause mortality remains inconclusive [84]. This previous effort is the most comprehensive assessment of the impact of depression on mortality conducted to date. The inclusion criteria differed from ours. Furthermore, in the current effort, an attempt to demarcate the putative impact of depression on survival in different populations was performed. In addition, we assessed several hints of biases in this literature. Our findings provide further quantitative evidence that the causality of associations between depression and elevated all-cause mortality across different populations and settings remains to be proven.

Strengths and limitations

Our umbrella review might have missed some available evidence, e.g., recently published studies that had not been included in the prior meta-analyses [29]. However, in this effort, we assessed all available systematic reviews and meta-analyses, and all unique datasets which met inclusion criteria were synthesized for each estimate from all available meta-analyses and most considered meta-analyses were very recent. Although several hints of bias were found to be prevalent in this literature, it is relevant to mention that this finding does not exclude the presence of genuine (i.e., true) heterogeneity in this field. Moreover, the Ioannidis test has relatively low power in a context of high heterogeneity [42], while the assumption that the largest study could approximate the underlying ‘true’ effect size of a meta-analysis may be less straightforward for observational studies than for randomized controlled trials. Depression is a heterogeneous phenotype with different symptomatic dimensions and subtypes [85]. For example, a model has proposed that the duration and specific dimensions of depression (i.e., ‘cognitive/affective’ versus ‘somatic/affective’) may have a differential impact on the progression of coronary artery disease after acute coronary syndrome [86]. This framework was supported by a previous meta-analysis that has shown that somatic/affective symptoms of depression may exert a stronger deleterious effect upon mortality compared to cognitive/affective symptoms in patients with heart disease [87]. In addition, a recent individual-patient meta-analysis suggested that, following proper adjustment for cardiovascular factors, the association between depression and all-cause mortality is notably attenuated in patients after an acute myocardial infarction [67]. 
This finding underscores that the extent of proper or suboptimal adjustment for clinical and sociodemographic variables may render the association between depression and mortality less consistent across populations with chronic diseases. Although we conducted several sensitivity analyses, the reporting of, and multivariable adjustment for, potential confounders were not consistent across included studies, thus limiting the quality of available evidence. It is possible that more studies adjusted their results at least for age and sex but considered it so trivial that they did not even report on this. Therefore, more thorough reporting of model specification and adjustment is needed in future studies.

Finally, depression may manifest differently in samples with chronic somatic conditions. For example, the diagnosis of depression in cancer patients has been a matter of debate, and depression in this population may also be described as a spectrum of syndromes [88, 89], some of which may not be properly captured by conventional diagnostic criteria (e.g., DSM-5 or ICD-10) [88]. Furthermore, there is a spectrum related to the timing of the appearance of symptoms. In some circumstances, depression may either antedate or be considered an initial manifestation of chronic somatic diseases [78, 90], whilst in other circumstances depression may occur after the onset of the medical condition [78], and also as a result of treatment and its complications. The current effort could not elucidate how the temporal relationship between depression and the respective chronic medical condition could potentially influence mortality rates.

Implications

Our findings suggest that available evidence does not consistently allow the establishment of causal inferences linking depression to all-cause and cause-specific mortality across different settings and populations. Yet, the association of depression and all-cause mortality appears to be complex, and may be influenced by several sociodemographic and clinical variables. Moreover, we do not question the association between depression and suicide where the evidence is unquestionable [18, 91, 92]. However, suicides appear to account for a relatively smaller fraction of deaths compared to natural causes of death among people with depression [93,94,95].

The current data may also reconcile some controversies in existing literature. For example, although previous evidence has suggested that post-acute myocardial infarction depression might be associated with diminished survival, no conclusive evidence indicated that the treatment of depression translates to an increased survival in this specific population [96, 97]. Therefore, findings from this umbrella review of observational studies and data from intervention studies conducted to date appear to concur in that associations between depression and all-cause and cause-specific mortality are unlikely to be causal.

For other conditions, such as cancer, it remains unclear if prevention and treatment of depression may increase overall survival. Management of depression is worthwhile for various other reasons, e.g., improvement of quality of life, but not with the expectation that death risk will decrease. Furthermore, interventions aiming to promote a healthy lifestyle as well as the proper care of co-occurring somatic conditions in those with depression may also lead to a decrease in all-cause mortality [25]. However, the impact of those interventions at the individual, societal, and health system levels upon all-cause survival warrants further investigation.

Conclusions

The associations between depression and all-cause and specific natural cause mortality have been extensively investigated in a wide range of populations and settings. However, this umbrella review of observational studies indicates that the evidence for causal associations of depression and all-cause mortality remains inconclusive. To draw firmer conclusions, further prospective and collaborative studies with transparent a priori-defined protocols and proper multivariable adjustment for confounders and other important risk determinants for mortality are warranted.