Key Points

Scoring methods for stromal tumor-infiltrating lymphocytes (sTILs) as recommended by the International TILs Working Group were feasible in clinical practice.

Higher baseline sTILs in pre-NAC core needle biopsy were associated with higher pCR rate and better prognosis in HER2-positive breast cancers.

A 20% threshold for sTILs may be feasible to predict pCR to NAC and prognosis in HER2-positive breast cancers.

1 Background

Human epidermal growth factor receptor 2 (HER2)-positive breast cancer is defined as hormone receptor (HR)-negative and HER2-positive tumors. HR−/HER2+ breast cancer usually has a higher histological grade and is associated with more frequent recurrence and metastasis, but it also has a higher pathological complete response (pCR) rate to neoadjuvant chemotherapy (NAC) compared with HR+ tumors [1]. pCR rates are heterogeneous among different subtypes of breast cancer treated with NAC, and the association between pCR and long-term outcomes is strongest in patients with highly proliferative and more aggressive subtypes, such as triple-negative breast cancer (TNBC) and HER2+ tumors [2]. Currently, anti-HER2-targeted NAC is widely used to treat HER2-positive breast cancers [3], and achievement of pCR following trastuzumab-based NAC in HER2-positive breast cancers has a favorable prognostic impact. Consequently, pCR has been proposed as a surrogate endpoint for long-term survival.

The presence of tumor-infiltrating lymphocytes (TILs) in the microenvironment of breast tumors has been proposed to reflect the efficacy of immune therapy and to predict the prognosis of breast cancers [4,5,6]. The number of TILs is higher in triple-negative and HER2-positive breast tumors than in other breast cancers, and TILs are more likely associated with these two subtypes of breast cancer [7]. The prognostic role of TILs in HER2+ patients has been demonstrated in previous studies [8]. In the neoadjuvant setting, TILs may be used to predict pCR in all molecular subtypes of breast cancers and may be associated with a survival benefit in HER2+ breast cancer, while an increased number of TILs has been associated with shorter overall survival in luminal-HER2-negative tumors [5]. Thus, further research investigating the interaction of the immune system with different types of breast cancer is needed.

Owing to the clinical significance of TILs in breast tumors, the methodologies of TIL evaluation need to be standardized to improve the consistency and reproducibility of evaluating TILs in research and clinical practice. The International TILs Working Group issued consensus recommendations for the pathological assessment methods of TILs in 2014 [9]. Compared to stromal TILs (sTILs), intra-tumoral TILs (iTILs) are more difficult to evaluate and do not provide additional predictive/prognostic information; consequently, current guidelines recommend that sTILs be evaluated in routine clinical practice. However, a biomarker cannot be recommended for routine utilization until a standardized approach has been validated in multiple settings. Furthermore, no formal recommendation for a clinically relevant TIL threshold(s) was given in the International TILs Working Group recommendations [9].

In this study, we performed a retrospective analysis of sTILs in 143 core needle biopsy specimens of HER2-positive breast cancers obtained from Chinese patients treated with NAC. Baseline sTILs in the pre-NAC specimens were scored using the method recommended by the International TILs Working Group 2014 [9]. The aim of our study was to examine the predictive and prognostic values of sTILs in HER2-positive breast cancers treated with neoadjuvant therapy in order to explore the optimal thresholds, and to evaluate the feasibility of the scoring methods in clinical practice.

2 Patients and Methods

2.1 Patients and Samples

A total of 143 core needle biopsy specimens of HER2-positive invasive breast cancers obtained from patients who had been treated with NAC followed by surgery between 2009 and 2015 were extracted from the pathology database of Fudan University Shanghai Cancer Center. The inclusion criteria were: primary invasive HER2-positive breast cancers; neoadjuvant therapy before surgical operation; available clinicopathologic data (age, tumor size, lymph node status, histological type, histological grade, lymphovascular invasion, Miller–Payne grade, estrogen receptor [ER], progesterone receptor [PR], and HER2 status, and the Ki-67 index [a marker for cell proliferation]). All patients were treated with eight cycles of trastuzumab-based NAC (paclitaxel + carboplatin + trastuzumab) and underwent modified radical mastectomy. The median follow-up was 53 (range 12–102) months.

The study was approved by the Ethics Institutional Review Board (IRB) of Fudan University Shanghai Cancer Center (IRB approval number: 1612167–10). All procedures performed involving human participants were in accordance with the ethical standards of the Ethics IRB of Fudan University Shanghai Cancer Center and with the 1964 Helsinki declaration and its later amendments. Written informed consent was obtained from all patients of the study to allow the use of their biological material, donated to our Biobank, for scientific projects and for publication purposes.

2.2 Pathological Assessment

All specimens were fixed with 10% neutral phosphate-buffered formalin and embedded in paraffin. Representative tumor blocks were then cut into 4-μm-thick sections and the sections stained with hematoxylin and eosin. Immunohistochemistry (IHC) studies, including assays for the ER, PR, HER2, and Ki-67, were performed on the core needle biopsies. The ER and PR were each assessed as positive if ≥ 1% of tumor cells showed nuclear staining [10]. HER2-positive status was defined as a 3+ score based on IHC studies or HER2 gene amplification based on fluorescence in situ hybridization [11]. Tumors were defined as HER2-positive breast cancers when assessed as being ER and PR negative and HER2 positive (HER2 protein overexpression or gene amplification). The Ki-67 score was defined as the percentage of cells with nuclear staining among at least 1000 tumor cells counted. All core needle biopsy specimens and surgical slices were reviewed by two experienced breast pathologists (RS and WY) to confirm the histological type, according to the 2012 World Health Organization Classification of Tumors of the Breast [12]. The histological grade of tumor was evaluated in pre-NAC core needle biopsy specimens using the Nottingham grading system [13, 14]. Lymphovascular invasion (LVI) was observed in the postoperative slices. The Miller–Payne grading system was used to evaluate the pathological response in surgical specimens [15]. pCR was defined as the absence of residual invasive tumor cells in the breast and axillary lymph nodes (ypT0/is + ypN0), as determined microscopically, in surgical specimens.

2.3 sTIL Evaluation

Evaluation of sTILs on the core needle biopsy specimens was performed by two breast pathologists (XY and JR) who had studied the evaluation criteria recommended by the International TILs Working Group 2014; each pathologist scored each case independently in a blind manner. sTILs were defined as the percentage of mononuclear infiltrating cells, including lymphocytes and plasma cells, in the tumor stroma area. Carcinomas in situ, normal lobules, necrosis, hyalinization, and crush artifacts were excluded. sTILs were scored as an average value throughout full sections rather than at hotspots, and the results were scored in increments of 10, with 0 defined as < 5% TILs, 10 defined as 5–10% TILs, 20 defined as 11–20% TILs; all other scores were rounded up to the next highest decile.
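The scoring rule described above can be sketched as a small function. This is only an illustration of the stated binning (0 for < 5%, then rounding up to the next decile); the function name `stil_score` and the input validation are assumptions made for the example, not part of the study's workflow.

```python
import math


def stil_score(percent: float) -> int:
    """Map an estimated stromal TIL percentage to a score in increments of 10.

    Hypothetical helper: < 5% -> 0, 5-10% -> 10, 11-20% -> 20,
    and every higher value is rounded up to the next decile.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    if percent < 5:
        return 0
    # Rounding up to the next multiple of 10 reproduces the 5-10 and 11-20 bins.
    return math.ceil(percent / 10) * 10


if __name__ == "__main__":
    for estimate in (3, 5, 10, 11, 20, 21, 80):
        print(estimate, "->", stil_score(estimate))
```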

2.4 Statistical Analysis

The associations between sTILs and clinicopathologic characteristics, pCR, and prognosis were analyzed. sTILs were analyzed as a continuous variable (per 10% increment) and as a binary variable dichotomized by using 20% as a threshold.

The associations between continuous variables (Miller–Payne grade and Ki-67 index) and sTILs were evaluated with Spearman’s rank correlation analysis (r). The associations between categorical variables (age, tumor size, lymph node status, histological grade, lymphovascular invasion, pCR) and sTILs were evaluated using the Mann–Whitney or Kruskal–Wallis tests. The Chi-square test or Fisher’s exact test was employed for comparisons of categorical variables. The intraclass correlation coefficient analysis was used to evaluate the interobserver agreement of sTILs scores.

Receiver operating characteristics (ROC) curve analysis was conducted to detect the optimal thresholds of sTILs and the predictive model to predict pCR. The maximum Youden’s Index (J = sensitivity + specificity − 1) was calculated to define the optimal thresholds, then univariate and multivariate regression analyses were used to evaluate the predictive value of sTILs as continuous and binary variables for pCR. Variables which were significant in the univariate analysis were selected for inclusion in the multivariate analysis.
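As an illustration of threshold selection by the maximum Youden's index, the following minimal sketch scans candidate cut-offs and keeps the one with the largest J = sensitivity + specificity − 1. It is not the SPSS procedure used in the study; the function name, the rule that a case is test-positive when its score is at or above the threshold, and the example data are all assumptions made for the example.

```python
def youden_optimal_threshold(scores, outcomes, thresholds):
    """Return (threshold, J, sensitivity, specificity) with the largest Youden's J.

    scores:     sTIL scores per case (hypothetical values)
    outcomes:   1 if the case reached pCR, 0 otherwise
    thresholds: candidate cut-offs; a case is test-positive when score >= cut-off
    Ties are resolved in favor of the first threshold examined.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None
    for t in thresholds:
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y and s >= t)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if not y and s < t)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (t, j, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Made-up toy data, not taken from the study cohort.
    toy_scores = [0, 10, 10, 20, 20, 30, 40, 50]
    toy_pcr = [0, 0, 0, 1, 0, 1, 1, 1]
    print(youden_optimal_threshold(toy_scores, toy_pcr, [10, 20, 30, 40, 50]))
```

Because the study's scores are recorded in deciles, the candidate thresholds in such a scan would naturally be the decile values themselves.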

The survival endpoints were disease-free survival (DFS) and overall survival (OS). DFS was defined as the time from surgery to events such as ipsilateral breast tumor recurrence or the appearance of a second primary cancer, or death resulting from any cause (whichever occurred first). OS was defined as the time from surgery to death from any cause. Univariate and multivariate Cox proportional hazards regression model analyses were performed to evaluate the associations of sTILs (as continuous and binary variables) with DFS and OS. Variables which were significant in the univariate analysis were selected for inclusion in the multivariate analysis. The Kaplan–Meier curves and the log-rank test were used to assess the DFS and OS functions for comparisons of survival curves.
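For readers less familiar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. This is a generic textbook implementation under stated assumptions (follow-up in months, event flag 1 for an observed event and 0 for censoring, censored cases at a tied time still counted as at risk); the function name and the example data are hypothetical and do not reproduce the study's SPSS analysis.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time per patient (e.g. months from surgery)
    events: 1 if the endpoint was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events occurring exactly at time t
        leaving = 0    # everyone (events + censored) leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Made-up follow-up data for five patients.
    print(kaplan_meier([12, 24, 24, 36, 60], [1, 0, 1, 0, 1]))
```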

All statistical tests were two-sided, and the statistical significance was defined as p < 0.05. All statistical analyses were carried out using SPSS statistical software (version 20.0; IBM Corp., Armonk, NY, USA). All figures were depicted using GraphPad Prism (GraphPad Software Inc., La Jolla, CA, USA).

3 Results

3.1 Associations Between sTILs and Clinicopathologic Characteristics

The clinicopathologic characteristics of the 143 patients with HER2-positive breast cancer are listed in Table 1. All patients were female. The median age was 50 (range 25–78) years. The pre-NAC tumor size was assessed based on radiology findings. Lymph node involvement was found in 86 (60.1%) patients by ultrasound-guided fine needle aspiration prior to the NAC. Histological type, tumor grade, ER, PR, HER2 status, and Ki-67 index were estimated in the pre-NAC specimens of the core needle biopsy. Invasive carcinoma with special subtypes was found in all specimens, with 65 (45.5%) and 78 (54.5%) of specimens showing histological grade 2 and grade 3 breast cancer, respectively. LVI was observed in the postoperative sections of 34 (23.8%) specimens. pCR was observed for 86 (60.1%) cases, and non-pCR was observed for 57 (39.9%) cases.

Table 1 Correlations between stromal tumor-infiltrating lymphocytes and clinicopathologic characteristics in HER2-positive breast cancers

The median baseline sTIL level in the core needle biopsy was 18.3%, ranging from 0 to 80% (Fig. 1). The intraclass correlation coefficient (ICC) analysis showed that the interobserver agreement of sTIL assessment was excellent (sTILs: ICC 0.95, 95% confidence interval [CI] 0.89–0.98, p = 0.001). Among the histopathological characteristics examined in the cohort, the levels of sTILs as a continuous variable were positively associated with the absence of lymph node metastasis (p < 0.001), the absence of lymphovascular invasion (p = 0.009), Miller–Payne grade (p = 0.002), and pCR (p < 0.001) (Table 1; Fig. 2). The level of sTILs was higher in the pCR subgroup than in the non-pCR subgroup (median 22.21 vs. 12.28%, respectively; p < 0.001).

Fig. 1

Different stromal tumor-infiltrating lymphocyte (sTIL) scores in core needle biopsy. a 10%, b 30%, c 50%, d 80%. Original magnification (a–d) 200×

Fig. 2

Correlations between sTIL levels and histopathological parameters in human epidermal growth factor receptor 2 (HER2)-positive breast cancers. Y-axis represents the scores of sTILs; X-axis represents lymphovascular invasion (LVI; negative vs. positive) (a), lymph node status (LN; negative vs. positive) (b), Miller–Payne grade (MP grade; 1–5) (c), and pathological complete response (pCR; non-pCR vs. pCR) (d). Dots correspond to the sTIL scores, and whiskers correspond to the standard error of the respective scores. The association between sTILs and MP grade was analyzed by the Spearman correlation test, and the associations between sTILs and LVI, LN, and pCR, respectively, were evaluated with the Mann–Whitney test

3.2 Correlations between sTILs and pCR

3.2.1 The Optimal Thresholds of sTILs to Predict pCR

An ROC curve analysis was performed to determine the threshold of sTIL level that best discriminated pCR (Fig. 3). The area under the ROC curve was 0.683 (95% CI 0.600–0.758, p < 0.001) and the best threshold to discriminate the pCR and non-pCR subgroups was found to be 20%. The sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and Youden’s index of the 20% threshold were 61.83%, 70.18%, 75.71%, 54.79%, 65.03%, and 0.318, respectively (Table 2). The level of sTILs was < 20% in 72 (50.3%) specimens and ≥ 20% in 71 (49.7%) specimens.

Fig. 3

Receiver operating characteristics curve analysis for the thresholds of sTILs to predict pCR in neoadjuvant treated HER2-positive breast cancers. Black dots indicate the different thresholds of sTILs. The optimal threshold, sensitivity, specificity, p value, and area under the curve (AUC) are shown

Table 2 Comparisons with different thresholds of stromal tumor-infiltrating lymphocytes

3.2.2 Correlation Between sTILs as a Continuous Variable and pCR

The relationship between sTILs scored as a continuous variable (per 10% increment) and pCR was analyzed by logistic regression analysis (Table 3). Higher sTILs scores indicated a higher pCR rate. Univariate analysis showed that sTILs (per 10% sTILs: odds ratio [OR] 1.05, 95% CI 1.02–1.08, p = 0.001) were significantly correlated with pCR. Multivariate analysis confirmed that sTILs (per 10% sTILs: OR 1.04, 95% CI 1.00–1.07, p = 0.034) were an independent predictor for pCR, irrespective of other clinicopathologic factors.
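To help interpret a per-increment odds ratio, the sketch below shows how it compounds under a logistic model that is linear in the log-odds: the odds ratio for a difference of k increments is the per-increment value raised to the power k. The function name is hypothetical, and the example simply takes the reported per-10% scaling at face value; it does not re-analyze the study data.

```python
def odds_ratio_for_difference(or_per_increment: float, n_increments: float) -> float:
    """Odds ratio implied for a difference of n increments.

    Assumes a logistic model that is linear in the log-odds, so that
    log(OR) scales linearly with the size of the difference.
    """
    if or_per_increment <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_increment ** n_increments


if __name__ == "__main__":
    # Example: a per-increment OR of 1.05 compared across three increments
    # (e.g. a tumor scored 40 versus one scored 10).
    print(round(odds_ratio_for_difference(1.05, 3), 3))
```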

Table 3 Correlation between stromal tumor-infiltrating lymphocytes and pathological complete response in HER2-positive breast cancers

3.2.3 Correlation Between sTILs as a Binary Variable and pCR

Logistic regression analyses were performed to evaluate the predictive value of sTILs as a binary variable (Table 3). A level of 20% sTILs (odds ratio [OR] 0.25, 95% CI 0.12–0.52, p < 0.001) was significantly associated with a higher pCR rate in the univariate analysis. Multivariate analysis confirmed that a 20% threshold of sTILs (OR 0.35, 95% CI 0.14–0.87, p = 0.024) was an independent predictive factor for pCR.

3.3 Prognostic Value of sTILs

3.3.1 Association of sTILs as a Continuous Variable with DFS and OS

Survival data were available for all patients. Over a median follow-up of 53 (range 12–102) months, 43 patients (30.1%) had local recurrence or distant metastasis, and 19 patients (13.3%) died. Cox proportional hazards regression analyses were performed to evaluate the prognostic value of sTILs as a continuous variable (Table 4). Higher sTIL scores were significantly associated with a better prognosis. In the univariate analysis, sTILs were significantly associated with DFS (hazard ratio [HR] 0.91, 95% CI 0.88–0.95, p < 0.001) and OS (HR 0.88, 95% CI 0.83–0.94, p < 0.001). Multivariate analysis that included prognostic variables confirmed that the sTILs score was an independent predictor of DFS (HR 0.93, 95% CI 0.90–0.97, p < 0.001) and OS (HR 0.92, 95% CI 0.86–0.98, p = 0.009).

Table 4 Association of stromal tumor-infiltrating lymphocytes with disease-free survival and overall survival in HER2-positive breast cancers

3.3.2 Association of sTILs as a Binary Variable with DFS and OS

The variable sTILs was analyzed as a binary variable using 20% as the threshold (Table 4). Patients with ≥ 20% sTILs had a significantly better prognosis than patients with < 20% sTILs. sTILs values dichotomized by the 20% cutoff were significantly associated with DFS (HR 6.60, 95% CI 2.91–14.95, p < 0.001) and OS (HR 10.29, 95% CI 2.37–44.66, p = 0.002) in the univariate analysis. The multivariate analysis indicated that sTIL values dichotomized by 20% were independent predictors of DFS (HR 3.87, 95% CI 1.65–9.12, p = 0.002) and OS (HR 4.74, 95% CI 1.02–22.01, p = 0.047).

The Kaplan–Meier curves and log-rank test showed that patients with ≥ 20% sTILs had longer DFS and OS than did those with < 20% sTILs (Fig. 4a, b). In the pCR subgroup, sTIL levels had no significant influence on DFS and OS (log-rank test, p = 0.054 and p = 0.436, respectively) (Fig. 4c, d). In the non-pCR subgroup, patients with ≥ 20% sTILs had better DFS and OS than those with < 20% sTILs (log-rank test, p = 0.001 and p = 0.045, respectively) (Fig. 4e, f).

Fig. 4

Kaplan–Meier curves depicting associations of sTILs with disease-free survival (DFS) and overall survival (OS) in HER2-positive breast cancers. a, b Comparison of DFS (a) and OS (b) between patients with ≥ 20% sTILs and patients with < 20% sTILs in the entire population. c, d DFS (c) and OS (d) between patients with ≥ 20% sTILs and patients with < 20% sTILs in the pCR subgroup. e, f DFS (e) and OS (f) between patients with ≥ 20% sTILs and patients with < 20% sTILs in the non-pCR subgroup. Log-rank p values are shown

4 Discussion

Tumor-infiltrating lymphocytes have been investigated as predictive and prognostic factors in breast cancers for a long time. In ER-positive/node-negative breast cancer, a higher proportion of TILs has been found to be significantly correlated with a high Oncotype DX Breast Recurrence Score [16]. In HER2-positive breast cancers, TILs have become one of the research hotspots in recent years. TILs have been found to be strongly associated with higher pCR rates and better prognosis in HER2-positive early breast cancer [17], and increased levels of sTILs have been used to predict pCR in HER2-positive breast cancer patients receiving neoadjuvant therapy [18]. Similarly, high levels of sTILs were observed to increase the benefit from trastuzumab and decrease distant recurrence in HER2-positive breast cancers [19]. Moreover, it has been suggested that TILs not only play an important clinical role in invasive HER2-positive breast cancer, but that there is also a strong correlation with HER2+ ductal carcinoma in situ [20].

Given the presumed relevant role of TILs in breast cancer, it is important that standardized methodologies for assessing TIL levels be developed that enable the evaluation and application of TILs as a prognostic marker in clinical practice. In 2014, the International TILs Working Group formulated recommendations for the pathologic evaluation of TILs in breast cancer [9]. The recommendations need to be further validated in multiple laboratories before being applied in routine clinical practice. Since iTILs do not provide additional predictive/prognostic information compared to sTILs and need further methodological research [21], the recommendation is to evaluate sTILs in clinical practice. We conducted a retrospective analysis of sTILs in 143 core needle biopsy specimens obtained from patients with HER2-positive breast cancers treated with NAC. Our results indicate that sTILs scored by the recommendations of the TILs Working Group in pre-NAC core needle biopsies were significantly correlated with pCR and prognosis in HER2-positive breast cancers. There was excellent interobserver agreement on the evaluations of TILs in the specimens in our study, which supports the feasibility of the method in routine practice.

Several studies have demonstrated the value of baseline sTILs for predicting pCR in HER2-positive breast cancers treated with neoadjuvant therapy [22,23,24]. Results from the CherLOB phase II study, which integrated an evaluation of PAM50 subtypes and immune modulation of pCR in HER2-positive breast cancer patients, showed that sTILs were significantly associated with pCR and that the HER2-positive subtype had the highest pCR rate [25]. A meta-analysis of randomized controlled trials showed that high TIL levels were associated with a significantly increased pCR rate, irrespective of neoadjuvant anti-HER2 agent(s) and chemotherapy regimens [26]. Our study showed that the level of sTILs as a continuous variable by 10% increments in pre-NAC core needle biopsies was significantly positively correlated with pCR in the univariate and multivariate analyses in HER2-positive breast cancers.

Several studies have confirmed that high levels of sTILs are associated with better DFS and OS in patients with HER2-positive breast cancers. Ignatiadis et al. used TILs as a continuous parameter and reported that for every increase of 10% there was a 25% reduction in the hazard for an event-free survival (EFS) event [27]. Dieci et al. suggested that TILs are strong prognostic factors for OS in patients with HER2-positive breast cancers [8]. In their study, the 10-year OS rate was 78% in the high-TIL group and 57% in the low-TIL group. In our study, baseline sTIL levels in pre-NAC core needle biopsy specimens of HER2-positive breast cancers, analyzed as a continuous variable, were an independent positive predictive marker for DFS and OS in both the univariate and multivariate analyses. Interestingly, Dieci et al. demonstrated that sTILs are of more prognostic relevance than pCR in TNBC [28], while pCR has been widely accepted as a strong surrogate marker for favorable DFS and OS in HER2-positive breast cancers. Our study shows that baseline sTIL values in HER2-positive tumors were a significant predictor for both DFS and OS, while pCR was associated with improved DFS but not with OS, possibly due to the relatively short duration of follow-up in our study.

No formal recommendation for a clinically relevant sTIL threshold(s) was given by the TILs Working Group [9]. In the study of Liu et al., a 30% threshold best discriminated pCR from non-pCR subgroups and was optimally associated with improved EFS in patients with HER2-positive breast cancers treated with trastuzumab-based NAC [23]. In our study, we performed a ROC curve analysis to determine the optimal threshold of sTILs to predict pCR and found that a 20% threshold best discriminated the pCR subgroup from the non-pCR subgroup. Our study also showed that a 20% threshold for sTILs could be used to predict prognosis in HER2-positive breast cancers. However, the optimal thresholds need to be validated further in larger cohorts.

Recent studies have focused on the prognostic significance of post-NAC levels of sTILs in residual disease. Increased sTIL levels in residual cancer have been found to be associated with improved recurrence-free survival [29], and a combination of sTILs and tumor cellularity within the post-NAC surgical specimens has been observed to be a positive predictor of pCR in HER2+ breast cancer [30]. Hamy et al. suggested that sTILs after NAC have a negative impact on DFS in patients with HER2+ breast cancer [31]. In our study, we attempted to evaluate the levels of sTILs in the residual disease of non-pCR patients according to the evaluation recommendations of the International Immunooncology Biomarkers Working Group [32]. However, the need to include the evaluation area surrounding residual tumor cells/nests in the sTIL score was challenging in some cases, especially in samples with decreased tumor cellularity and no concentric shrinkage and in cases with residual tumor only in metastatic lymph nodes. Standardized methodologies for TIL assessment in the post-NAC residual disease setting with acceptable repeatability still need to be explored.

The limitations of our analysis include its retrospective nature and its reliance on archived tissues from a single center. The optimal thresholds of sTILs to predict pCR and prognosis in HER2-positive tumors were explored in our study; however, the thresholds need to be validated in larger cohorts. Further large-scale prospective studies and retrospective analyses of independent randomized trials are needed to evaluate the clinical value of sTILs in HER2-positive breast cancers.

5 Conclusion

In summary, the results of our study indicate that baseline sTIL levels scored according to the recommendations of the International TILs Working Group in pre-NAC core needle biopsy specimens were significantly correlated with pCR and prognosis in patients with HER2-positive breast cancers. Higher sTIL scores predicted a higher pCR rate and better prognosis. A 20% threshold for sTILs may be feasible to predict pCR to NAC and prognosis in patients with HER2-positive breast cancers.