Introduction

Approximately 75% of all breast cancer patients have estrogen receptor (ER)-positive tumours, and are candidates for adjuvant endocrine treatment with either an aromatase inhibitor (AI) or the selective estrogen receptor modulator (SERM) tamoxifen. Among other studies, the phase III Intergroup Exemestane Study (IES) randomized 4724 postmenopausal patients with early-stage breast cancer, after 2–3 years of tamoxifen therapy, between continuing on tamoxifen and switching to exemestane to complete 5 years of endocrine therapy. It showed a significantly improved disease-free survival (DFS) in favour of the switch to exemestane, compared to 5 years of tamoxifen monotherapy [1,2,3,4]. A second study, the phase III Tamoxifen Exemestane Adjuvant Multinational (TEAM) trial, was performed to assess the benefit of 5 years of exemestane monotherapy over the switch scheme, and showed no statistically significant differences in survival between the two groups [5].

Classic prognostic factors like TNM-stage, tumour grade and expressional status of hormone receptors or the human epidermal growth factor receptor 2 (HER2) do not predict which adjuvant endocrine treatment is the best one for which patient [5]. One of the factors that could act as a new prognostic or predictive biomarker may be derived from the immune system. The importance of the local immune system, in particular, the role of tumour-infiltrating lymphocytes (TILs) on the outcome of (neo)adjuvant treatment of breast cancer has recently been validated [6,7,8,9,10,11,12,13]. Cytotoxic (CD8-positive) T-cells appear to play a major role in this phenomenon [7, 9]. Most of the studies reported a clinical benefit for tumours with a higher infiltration of TILs, although this effect seems to be isolated to rapidly proliferating, ER-negative tumours [7,8,9,10,11,12]. Especially in triple-negative tumours, TILs are a promising biomarker for the success of (neo)adjuvant chemotherapy [6, 13]. However, no data are available which assess the predictive value of TILs for endocrine treatment.

The aim of the current study was to determine the prognostic value of CD8-positive TILs in ER-positive breast cancer, and the predictive value of CD8-positive TILs for the outcome of endocrine therapy with either tamoxifen or exemestane, in two independent cohorts. For this, we evaluated the number of CD8-positive TILs in the Dutch subsets of the IES and TEAM trials, and used this number in a stratified survival analysis of tumour recurrence and survival time in patients treated with either exemestane or tamoxifen.

Materials and methods

Patients and tumour tissues

IES trial

In the IES trial, 4724 patients who had been treated with surgery for early breast cancer and who were disease free after 2–3 years of adjuvant tamoxifen were randomized, between 1998 and 2003, to either continue tamoxifen for up to 5 years or switch to exemestane to complete 5 years of therapy. The details on inclusion and exclusion criteria were described before [3]. For the Dutch fraction of this cohort (n = 236), formalin-fixed paraffin-embedded (FFPE) tumour tissue was collected and was separately converted into a tissue microarray (TMA). This TMA was created as described earlier [14]. In brief, two 0.6-mm-core needle punches were obtained from the FFPE tumour blocks, and transplanted into an empty recipient block. Follow-up for disease-free survival (DFS, defined as any local, regional or distant recurrence, new contralateral breast cancer or death due to any cause) and overall survival (OS) started at randomization after 2–3 years of tamoxifen treatment. This analysis used the follow-up data described earlier [4].

TEAM trial

The TEAM trial included 9779 patients who were randomized, between 2001 and 2006, between the switch scheme (2.5 years of tamoxifen followed by 2.5 years of exemestane) and 5 years of exemestane. The details of this trial were described earlier [5, 15]. FFPE tumour tissue was collected for the Dutch part of this trial (n = 2596), and embedded in triplicate on a TMA with 0.6-mm punches. Since both randomization arms were similar after the moment of switch, we censored the follow-up at 2.75 years (the midpoint between 2.5 and 3 years, the timeframe in which patients in the switch group switched to exemestane) in order to solely compare the differential effects of exemestane and tamoxifen. Beyond these 2.75 years, both treatment groups were treated with exemestane, which could interfere with the marker-by-treatment interaction. Because censoring at 2.75 years did not allow sufficient time for an effect on mortality outcomes, only the recurrence-free survival (RFS), defined as any breast cancer recurrence or death due to breast cancer if no recurrence was reported before death, was used as a parameter of clinical outcome in this cohort. All samples of both cohorts were handled in a coded fashion, according to national ethical guidelines (“Code for Proper Secondary Use of Human Tissue”, Dutch Federation of Medical Scientific Societies).
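The censoring rule described for the TEAM cohort can be illustrated with a minimal sketch. The function name, the data layout (times in years, event flag 1 for recurrence and 0 for censored) and the choice to keep an event that falls exactly on the horizon are assumptions made for illustration; the original analysis was run in SPSS.

```python
def censor_at(times, events, horizon=2.75):
    """Administratively censor follow-up at `horizon` years.

    times  : follow-up time in years for each patient
    events : 1 if a recurrence was observed at that time, 0 if censored
    Returns a list of (time, event) pairs after censoring.
    """
    censored = []
    for t, e in zip(times, events):
        if t > horizon:
            # Anything observed after the horizon is ignored: the patient
            # counts as event-free (censored) at the horizon.
            censored.append((horizon, 0))
        else:
            # Assumption: an event exactly at the horizon is still counted.
            censored.append((t, e))
    return censored


# Hypothetical follow-up data, not taken from the trial
example = censor_at([1.0, 3.5, 2.75, 4.0], [1, 1, 1, 0])
```

In this hypothetical example, the recurrence at 3.5 years is no longer counted as an event, which is why censoring in this way reduces the number of events available for analysis.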

Immunohistochemical staining

The immunohistochemical staining procedures have been described before by our group in multiple different cohorts [8, 16]. In short, 4 µm sections from FFPE TMA blocks were deparaffinised in xylene and subsequently hydrated using graded alcohol washes, before endogenous peroxidase was blocked using hydrogen peroxide. Antigen retrieval was performed at 95 degrees Celsius for 10 min in a low-pH target retrieval solution (DAKO, Glostrup, Denmark). The sections were incubated overnight at room temperature with primary antibodies against CD8 (clone 144B, Abcam, Cambridge, UK) at a predetermined optimal dilution, using proper positive and negative controls. After washing, the sections were incubated with the horseradish peroxidase-labelled Envision + System-HRP (DAKO) for 30 min, before they were stained using 3,3′-diaminobenzidine (DAB) solution (DAKO). Subsequently, the slides were counterstained for 30 s in haematoxylin, dehydrated using inverse-graded alcohol washes and xylene, and mounted in Pertex before they were dried and stored until further analysis.

Evaluation of immunohistochemical staining

Slides were scanned using an automated scanner (Philips, Eindhoven, Netherlands), and the digital images obtained were stored on an internal server until later analysis. Each punch in which tumour cells covered at least 30% of the total area was individually assessed for the number of CD8-positive cells by an investigator who had been trained by a pathologist. Results from duplicate (IES) or triplicate (TEAM) punches were then combined in order to determine the average score per patient. The median cohort value was used as a cut-off for dichotomous analysis of infiltrating cells. Since the evaluation in the TEAM trial was intended as a proof of principle and not as a formal validation, the median value of the TEAM cohort was assessed separately, and used as the cut-off for this cohort. One-third of all measurements were scored by an independent second observer, and in case of disagreement about the dichotomous classification, the punch was reviewed and discussed by both observers until agreement was reached.
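The scoring steps above (average the evaluable punches per patient, then split the cohort at its own median) can be sketched as follows. The patient identifiers and counts are hypothetical, and assigning patients whose score equals the median to the "low" group is an assumption; the text does not state how such ties were handled.

```python
from statistics import mean, median


def dichotomize_by_median(punch_counts):
    """Average punch counts per patient and split at the cohort median.

    punch_counts : dict mapping patient id -> list of CD8-positive cell
                   counts, one per evaluable punch
    Returns (scores, cutoff, groups).
    """
    # Average the evaluable punches for each patient; skip patients
    # without any evaluable punch.
    scores = {pid: mean(c) for pid, c in punch_counts.items() if c}
    cutoff = median(scores.values())
    # Assumption: a score equal to the median is classified as "low".
    groups = {pid: ("high" if s > cutoff else "low")
              for pid, s in scores.items()}
    return scores, cutoff, groups


# Hypothetical counts, not taken from either cohort
example = {"P1": [2, 4], "P2": [10, 6], "P3": [0, 1], "P4": [5, 5], "P5": [3]}
scores, cutoff, groups = dichotomize_by_median(example)
```

Because the cut-off is recomputed from whichever cohort is passed in, calling the function separately for each cohort mirrors the cohort-specific medians described above.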

Statistical analysis

The study was a non-planned, retrospective, explorative project, for which all available cases were used without a predefined sample size calculation to detect a specific effect size or reach a certain level of power. ANOVA and post hoc Bonferroni tests (corrected for multiple testing) were used to compare the mean number of CD8-positive TILs between subgroups. The kappa measurement for overall inter-observer agreement was used to assess the inter-observer variation for the dichotomized scores in one-third of all cases. Cox regression modelling was used to assess DFS and OS in the IES cohort and RFS in the TEAM cohort, to correct for possible confounders, and to perform a treatment-by-marker interaction test. Missing values were included in the models for variables in which more than 10% of cases were missing. Kaplan–Meier curves and the corresponding Log-rank tests were used to visualize these survival effects. Reverse Kaplan–Meier was used to determine the median follow-up duration. Furthermore, a post hoc analysis was performed in which every threshold was tested to determine which cut-off point would lead to the most discriminating HR for interaction. All statistical analyses were performed using SPSS version 23 (IBM).
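As an illustration of the reverse Kaplan–Meier approach mentioned above, the following minimal sketch estimates median follow-up by treating censoring as the event of interest and the original event as censoring. It is a plain-Python product-limit estimator written for this example, with hypothetical data, and is not the SPSS procedure used in the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns [(time, survival)] at event times."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed at time t
        n_leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


def median_time(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


def reverse_km_median_followup(times, events):
    # Reverse Kaplan-Meier: flip the indicator so that censoring
    # becomes the "event" and observed events become censored.
    flipped = [1 - e for e in events]
    return median_time(kaplan_meier(times, flipped))


# Hypothetical data: times in years, 1 = event observed, 0 = censored
times = [1, 2, 3, 4, 5, 6]
events = [1, 0, 0, 1, 0, 0]
median_followup = reverse_km_median_followup(times, events)
```

The advantage over simply taking the median of all observed times is that patients who had an early event do not pull the follow-up estimate downwards.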

Results

The Dutch IES cohort consisted of 236 postmenopausal patients with early breast cancer (Fig. 1a). After creating the TMA, cores containing sufficient tumour tissue (> 30%) were available from 190 patients. Patient and tumour characteristics are shown in Table 1. The median age was 64 years (range 30–96 years). The median follow-up was 10.1 years (range 0.49–11.34 years). No significant differences in the number of CD8-positive TILs were observed between clinicopathological subgroups (Table 1). The median number of CD8-positive cells per punch was 4, which is equivalent to 14 cells/mm².

Fig. 1 Flowcharts of the cohorts used in this study. The Dutch part of the Intergroup Exemestane Study (IES) (a) and the Dutch part of the international TEAM trial (b) were assessed for the number of CD8-positive TILs and its predictive value for endocrine therapy

Table 1 The clinicopathological features of both cohorts, including the mean number of CD8-positive TILs per punch for each subgroup

In the TEAM cohort, tumour tissues of 2596 patients were stained and scored for the presence of CD8-positive TILs. Sufficient cores (a minimum of 2 cores, containing at least 30% tumour tissue) were available for 2345 patients (90%). Punches showing artefacts or lacking tumour cells were excluded from analysis. The median follow-up, as determined by reverse Kaplan–Meier analysis, was 2.75 years (range 0–2.75). The distribution of clinicopathological subtypes was comparable to the IES cohort (Table 1). A number of significant differences in the number of CD8-positive TILs were observed between subgroups; patients above the age of 70 had a lower number of CD8-positive TILs compared to patients aged either 50–59 or 60–69 (Table 1). Furthermore, there was a significant association with tumour grade (more CD8-positive TILs with higher grade) and with HER2 expression (more CD8-positive TILs in HER2-positive tumours). The median number of CD8-positive TILs in this cohort was 6 per punch, which is equivalent to 20 cells/mm². The overall kappa measure for inter-observer agreement in both cohorts was 0.65, after which each discordant case was discussed until consensus was reached.
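For readers unfamiliar with the kappa statistic reported above, the following minimal sketch computes Cohen's kappa for two observers who each assign a dichotomous label per case. The labels are hypothetical and only illustrate the calculation; they are not the study's scoring data.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    rater_a, rater_b = list(rater_a), list(rater_b)
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of cases with identical labels
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies
    p_expected = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n)
                     for lab in labels)
    if p_expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical dichotomized scores for ten punches
observer_1 = ["low", "low", "high", "high", "low",
              "high", "low", "high", "low", "high"]
observer_2 = ["low", "low", "high", "high", "low",
              "high", "high", "high", "low", "low"]
kappa = cohen_kappa(observer_1, observer_2)
```

In this made-up example the observers agree on 8 of 10 punches, while chance alone would give agreement on half of them, so kappa is about 0.6.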

In the IES cohort, the number of CD8-positive TILs had no prognostic value for either DFS or OS in the full population (Fig. 2a, b). One of the aims of this study was to determine the predictive value of the number of CD8-positive TILs. Therefore, we stratified the survival analysis on the number of CD8-positive TILs (Table 2). Patients with a below-median number of CD8-positive TILs had a significantly better DFS when treated with exemestane after earlier tamoxifen compared to tamoxifen monotherapy (Fig. 3a). Among the 97 patients with a below-median number of CD8-positive TILs, 10 out of 45 patients on exemestane experienced a DFS-event, whereas 31 out of 52 patients allocated to tamoxifen encountered a DFS-event. Univariate Cox regression showed a hazard ratio (HR) for DFS of 0.27 (95% CI 0.13–0.55, p < 0.001) in favour of exemestane treatment in these patients, with an adjusted HR (corrected for age, histological subtype, tumour size, lymph node status, tumour grade and PgR status) of 0.35 (95% CI 0.16–0.78) (Table 2). In contrast, in patients with above-median numbers of CD8-positive TILs, there was no significant difference in benefit between the two therapies (events: 23 out of 49 patients on exemestane, 17 out of 44 patients on tamoxifen), with a HR of 1.34 (95% CI 0.71–2.50, p = 0.36) and an adjusted HR of 1.21 (95% CI 0.58–2.51, p = 0.97) (Fig. 3b). The HR for treatment-by-marker interaction between these groups was 5.02 (95% CI 1.93–13.02, p = 0.001), showing that the difference in treatment effects between the two marker groups was statistically significant. Although underpowered due to the small cohort size and relatively low numbers of events, the adjusted HR for interaction was 3.34 (95% CI 1.17–9.56, p = 0.02) when corrected for age, histological subtype, tumour size, lymph node status, tumour grade and PgR status.
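The size of the unadjusted interaction can be checked approximately from the two subgroup hazard ratios alone. The sketch below takes the ratio of the reported HRs and recovers standard errors from the reported 95% confidence intervals, assuming Wald-type intervals on the log scale and independent subgroups. This is a back-of-the-envelope check, not the Cox interaction model fitted in the study, so small differences from the reported values are expected.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(HR), recovered from a 95% CI (Wald assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ratio_of_hazard_ratios(hr_a, ci_a, hr_b, ci_b, z=1.96):
    """Ratio hr_a / hr_b with an approximate 95% CI.

    Assumes the two subgroups are independent, so the variances of the
    log hazard ratios add.
    """
    log_ratio = math.log(hr_a) - math.log(hr_b)
    se = math.sqrt(se_from_ci(*ci_a, z=z) ** 2 + se_from_ci(*ci_b, z=z) ** 2)
    return (math.exp(log_ratio),
            math.exp(log_ratio - z * se),
            math.exp(log_ratio + z * se))


# Unadjusted DFS hazard ratios reported for the IES cohort:
# above-median TILs: 1.34 (0.71-2.50); below-median TILs: 0.27 (0.13-0.55)
ratio, lower, upper = ratio_of_hazard_ratios(
    1.34, (0.71, 2.50), 0.27, (0.13, 0.55))
```

Under these assumptions the ratio comes out at about 4.96 with an approximate interval of about 1.9 to 12.9, which is close to the reported interaction HR of 5.02 (95% CI 1.93–13.02).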

Fig. 2 The general prognostic effect of CD8-positive TILs on either DFS (left) or OS (right) for all (ER-positive) patients using Kaplan–Meier survival analysis. The number of CD8-positive TILs was stratified as low (below median) and high (above median) relative to the median value. Event rates are provided in the graph, and numbers at risk below the graph. p-values were determined using a Log-rank test

Table 2 Results of the stratified survival analysis for disease-free (DFS, in IES cohort), recurrence-free (RFS, in TEAM cohort) and overall survival (OS)

Fig. 3 The predictive value of CD8-positive TILs on endocrine therapy in the IES cohort using Kaplan–Meier survival analysis. Patients with below-median (low) numbers of CD8-positive TILs are shown in the left-side graphs (DFS above OS), patients with above-median (high) numbers of CD8-positive TILs in the right-side graphs. Event rates are provided in the graph, numbers at risk below the graph. p-values were determined using a Log-rank test

Similar results were shown for overall survival, where a statistically significant benefit was shown for patients with a below-median number of CD8-positive TILs when treated with the switch scheme. In 97 patients with a below-median number of CD8-positive TILs, 9 out of 45 patients on exemestane had died at the end of follow-up, whereas 23 out of 52 patients allocated to tamoxifen were not alive at the end of follow-up (HR 0.38, 95% CI 0.17–0.82, p = 0.014; adjusted HR 0.48, 95% CI 0.19–1.18, p = 0.15). In patients with an above-median number of CD8-positive TILs, there was no difference (HR 1.13, 95% CI 0.56–2.30, p = 0.73; adjusted HR 1.07, 95% CI 0.46–2.49, p = 0.78), with 17 out of 49 patients having died on exemestane and 14 out of 44 patients having died on tamoxifen (Fig. 3c, d). Also for overall survival, a significant treatment-by-marker interaction was observed (HR for interaction 3.01, 95% CI 1.05–8.58, p = 0.04). The (underpowered) adjusted HR for interaction was 2.43 (95% CI 0.75–7.88, p = 0.14).

In a post hoc analysis, it was established that the median value of 4 cells per punch (14 cells/mm²) was close to the optimal threshold level of 3 cells per punch (11 cells/mm²), which would have resulted in the highest predictive effect of CD8-positive TILs (Supplemental Fig. 1) in the Dutch IES cohort.
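The per-punch counts and the densities quoted above are consistent with dividing the count by the circular cross-section of a 0.6-mm core. The text does not state how the conversion was done, so the sketch below is an assumption: it treats the whole punch area as evaluable and ignores any tissue loss.

```python
import math

CORE_DIAMETER_MM = 0.6  # punch diameter given in the methods


def cells_per_mm2(cells_per_punch, core_diameter_mm=CORE_DIAMETER_MM):
    """Convert a per-punch cell count to a density in cells per mm^2.

    Assumption: the evaluated area equals the full circular
    cross-section of the core.
    """
    area_mm2 = math.pi * (core_diameter_mm / 2) ** 2
    return cells_per_punch / area_mm2


median_density = cells_per_mm2(4)   # median cut-off in the IES cohort
optimal_density = cells_per_mm2(3)  # post hoc optimal threshold
```

Rounded to whole cells, these two values agree with the 14 and 11 cells/mm² given in the text.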

In order to further explore the observed interaction between the outcome of endocrine therapy and the number of CD8-positive TILs, a similar analysis was performed in the Dutch TEAM cohort. Only the first 2.75 years of follow-up were considered for survival analysis, since after this timepoint, patients in both groups received exemestane which would diminish any biological interaction.

Also in this cohort, the number of CD8-positive TILs had no prognostic effect on recurrence, either censored at 2.75 years (HR 0.91, 95% CI 0.69–1.19, p = 0.47) or at full length of follow-up (HR 1.0, 95% CI 0.85–1.18, p = 0.97). With regard to the predictive value, patients with a below-median number of CD8-positive TILs had a HR for tumour recurrence of 0.67 (95% CI 0.45–0.99, p = 0.048) in favour of exemestane treatment, whereas patients with above-median numbers of CD8-positive TILs had a HR of 0.86 (95% CI 0.59–1.26, p = 0.44), which was similar to the findings of the first cohort (Fig. 4a, b). The adjusted HRs were not significant in either the CD8-low or CD8-high group (low numbers of CD8-positive TILs: 0.71, 95% CI 0.47–1.07, p = 0.10; high numbers of CD8-positive TILs: 0.82, 95% CI 0.56–1.21, p = 0.32). The treatment-by-marker interaction was not significant in this cohort (HR for interaction 1.29, 95% CI 0.75–2.22, p = 0.36; adjusted HR for interaction 1.20, 95% CI 0.68–2.11, p = 0.52).

Fig. 4 The predictive value of CD8-positive TILs on endocrine therapy in the TEAM cohort using Kaplan–Meier survival analysis, stratified based on the median number of CD8-positive TILs. Patients with below-median (low) numbers of CD8-positive TILs are shown on the left, and patients with above-median (high) numbers of TILs on the right. Inserts show a more detailed graph with a range of 80–100% survival. Event rates are provided in the graph, numbers at risk below the graph. p-values were determined using a Log-rank test

Discussion

This study is the first to investigate CD8-positive TILs as a predictive biomarker for the type of adjuvant endocrine therapy in postmenopausal patients with early breast cancer. In the first IES cohort, patients with a low number of CD8-positive TILs had significantly greater treatment benefit from aromatase inhibitors (AIs) than from tamoxifen, whereas the type of therapy did not make any difference in patients with high numbers of CD8-positive TILs. The treatment-by-marker interaction, comparing the clinical benefit in both subgroups, was significant despite the low number of events in this analysis, suggesting a predictive capacity of CD8-positive TILs for endocrine therapy. In the second TEAM cohort, it was similarly suggested that patients with low levels of CD8-positive TILs had greater treatment benefit from exemestane. However, the treatment-by-marker interaction in this cohort was not significant, indicating that the benefit of exemestane in the CD8-low group was not significantly different from the benefit in the CD8-high subgroup.

The difference in significance between the two cohorts can be explained by several factors. First, the IES cohort was smaller, and thereby underpowered for definite conclusions, since a small cohort is more sensitive to random variation and artefactual findings. Secondly, all patients in the IES cohort were pre-treated with 2–3 years of tamoxifen, whereas the TEAM patients were treatment-naïve at the time of randomization. This pre-treatment, and the subsequent carry-over effect known from tamoxifen, could have influenced the differences between the cohorts. Finally, in the TEAM cohort, the follow-up was censored at 2.75 years, which limited the number of events and therefore reduced the power of the survival and interaction analyses. Furthermore, late recurrences are a major issue in ER-positive disease, which limits the possibility to draw conclusions from these short follow-up data. In contrast, the analysis in the IES cohort started at 2–3 years after diagnosis, and was continued up to almost 12 years post diagnosis. This difference in follow-up periods could have influenced the comparison between the cohorts as well.

At baseline, we showed in the TEAM cohort that there are statistically significant differences in the numbers of CD8-positive TILs between some subgroups. However, the absolute differences are small, and will most likely have no clinical relevance.

Earlier studies showed that TILs have no prognostic value in ER-positive disease [9, 12]. We confirmed these findings, showing that the number of CD8-positive TILs by itself had no prognostic value in either of our ER-positive cohorts. Interestingly, the suggestion that treatment with exemestane could be particularly beneficial for patients with a low number of infiltrating CD8-positive T-cells, as suggested by some of our results, has never been shown before in a trial-based translational study.

The mechanism behind the possibly greater beneficial effect of aromatase inhibitors in case of low levels of CD8-positive cells is not yet known. Various hypotheses can be proposed. One earlier study has suggested that the effect of AIs is dependent on immune suppression rather than activation [17]. In that study, 81 paired samples were obtained before and after 2 weeks of neoadjuvant anastrozole, and multigene expression profiling was performed on these samples. In total, 1327 genes were differentially expressed. It was observed that a higher baseline expression of proinflammatory genes correlated with a poor therapeutic effect of anastrozole, and lymphocytic infiltration correlated with a poorer therapeutic response to AIs, which was similarly observed by others [17, 18]. Gao et al. validated these findings by showing that a high expression of genes associated with immune reaction predicted a poor response to endocrine therapy [19].

Aromatase inhibitors might also play a role in modulating the local immune response. For example, according to the study of Generali et al., aromatase inhibitors are capable of lowering the number of tumour-infiltrating regulatory T-cells, and thereby may improve treatment outcome [20]. Similar results were shown by Chan et al., who studied the ratio of cytotoxic T-cells and regulatory T-cells during neoadjuvant endocrine treatment and observed a significant increase of this ratio in responders, as opposed to non-responders [21]. Moreover, aromatase inhibitors have been shown to enhance cytokine excretion and the severity of experimental polyarthritis in murine models, indicating an activation of the immune system [22]. Furthermore, auto-immune conditions have been suggested as a contributing factor to the often-reported arthralgia [23]. Based on these findings, it could be hypothesized that aromatase inhibitors exert part of their function by activating both the systemic and the local immune responses. Under this hypothesis, patients with a weaker local immune response at baseline would benefit more from AIs, since the immunomodulation would have a greater effect in those patients than in patients who already have a strong local immune response. However, these studies were performed in the neoadjuvant setting with the primary tumour still in situ, so they are not directly comparable to our adjuvant trials, in which the primary tumour was removed before endocrine treatment.

Another theory for explaining the possible differential effects of AIs and tamoxifen between TIL-rich and TIL-poor tumours is that the number of infiltrating CD8-positive TILs is a proxy variable for another tumour characteristic, which might be the mutational load. Earlier, it was established that the mutational load in the tumour, and therefore the number of neo-epitopes, is associated with the local immune response [24]. Furthermore, it has been shown that more aggressive Luminal B-type tumours, which are generally considered less responsive to endocrine therapy, have a higher mutational load compared with the more responsive Luminal A subtype [25, 26]. Hypothetically, tumours with a lower mutational load might be more dependent on ER-pathway signalling, since they are less likely to acquire activating mutations in other oncogenes, whereas tumours with a higher mutational load have activated other growth-stimulating pathways and are therefore less dependent on ER-signalling for their survival. This hypothesis would imply that AIs are the optimal strategy for strongly ER-dependent (lower mutational load) tumours, whereas tamoxifen and AIs are equally good for less ER-dependent tumours.

A weakness of the current study is the fact that the Dutch fraction of IES-trial patients is only a small fraction of the full IES population (5%). Although randomization was stratified on individual centres, and therefore our cohort is still balanced between the treatment arms, it might still be that our cohort is biased compared to the full trial population. Furthermore, we evaluated the number of CD8-positive TILs on a TMA using immunohistochemistry, while the evaluation of TILs in the regular clinical setting is usually performed on full H&E stained slides. Therefore, our results need to be validated using this approach before clinical implementation might become feasible.

In summary, the current study provides the first suggestion that the number of CD8-positive TILs could be used as a predictive marker in the endocrine treatment of breast cancer. Our results suggest that patients with low numbers of CD8-positive TILs could derive more benefit from AIs than from tamoxifen, whereas patients with a strong infiltration of CD8-positive TILs have a similar outcome on both treatment strategies. This requires further validation in a trial with a design similar to that of IES, in which tamoxifen monotherapy is compared to an AI-containing regimen. Future studies will be directed towards validation of these findings for other aromatase inhibitors, to show whether the results observed for exemestane can be extrapolated to letrozole or anastrozole as well. Our findings might contribute to a more optimized treatment of hormone-receptor-positive breast cancer using the local immune system as a predictive biomarker for adjuvant endocrine therapy.