Background

Gastric cancer is one of the most common cancers worldwide. Owing to the high rate of local and systemic recurrence, survival of patients with gastric cancer, especially those with advanced-stage disease, remains poor [1]. During the last decade, neoadjuvant chemotherapy (NACT) has been recommended as a means of improving the prognosis of gastric cancer patients by the National Comprehensive Cancer Network (NCCN) [2], the European Society for Medical Oncology (ESMO) [3] and the Japanese Gastric Cancer Treatment Guidelines [4].

The Tumor, Node, and Metastasis (TNM) staging system is the most important tool for evaluating the prognosis of gastric cancer patients. The system has been revised several times since its first edition in 1976. The American Joint Committee on Cancer (AJCC) released the latest (8th) edition of TNM staging in October 2016. Compared with the 7th edition, one highlight of the new edition is that it takes into account the increasing use of NACT and provides a new pathological classification, post-neoadjuvant therapy TNM (ypTNM), for staging patients who have received NACT [5, 6]. Conventionally, pathological stage is considered the best prognostic indicator for gastric cancer patients, but this conclusion was drawn before the introduction of ypTNM. Moreover, although ypTNM uses the same stage groups as pTNM (i.e. stage I, II, III, IV), a patient’s ypTNM stage may differ considerably from the pTNM stage he or she would have had without preoperative NACT, because of the downstaging effect of NACT. Two questions therefore arise: does the new ypTNM stage have the same prognostic implication as the pTNM stage? In clinical practice, should clinicians treat patients with and without NACT the same way when they share the same pathological stage? Given the scarcity of previous evidence on this question, we conducted a retrospective analysis of a prospectively collected cohort to investigate the difference in prognostic value between ypTNM and pTNM stage using propensity score matching.

Methods

Study population

After obtaining approval from the Peking University Ethics Committee, we retrospectively selected eligible patients from a prospectively maintained database of all patients diagnosed with and treated for gastric cancer at the Gastrointestinal Cancer Center of Peking University Cancer Hospital & Institute from January 1st 2007 to January 1st 2015. Consent to participate was obtained from every participant. Clinicopathological information was recorded in this database from each patient’s first treatment at the hospital. Follow-up was conducted by telephone every three months during the first three years after discharge, every six months in years 4 and 5, and every six months to one year thereafter. If the patient (or contact) could not be reached after three attempts, the patient was classified as lost to follow-up. The last follow-up was carried out in February 2016.

A patient was included in the current study if he or she met all of the following criteria: 1. Preoperative pathological diagnosis (via biopsy) of gastric adenocarcinoma; 2. Complete clinical and postoperative pathological data; 3. No distant metastasis at diagnosis; 4. Radical D1+/D2 gastrectomy performed in our center. The exclusion criteria were: 1. Gastrointestinal stromal tumor, lymphoma, neuroendocrine tumor, carcinoid tumor, soft tissue tumor or other non-adenocarcinoma gastric tumor diagnosed prior to treatment; 2. Perioperative death within one month; 3. Chemotherapy for other neoplasms within six months before gastric cancer surgery; 4. Neoadjuvant radiotherapy or targeted therapy before surgery; 5. Remnant stomach cancer; 6. Prophylactic intraperitoneal chemotherapy.

Exposure, outcome, and confounders

In accordance with the NCCN guideline, NACT was recommended to gastric cancer patients with stage cT2-4NanyM0 disease. Radical D1+/D2 gastrectomy was performed on all patients. Lymph nodes were identified and dissected by experienced surgeons during surgery. The metastasis status of resected nodes was determined by one pathologist and subsequently reviewed by another. Because ypTNM was first proposed in 2016, pTNM had been used for all patients in the clinical records. We restaged the patients who received preoperative NACT using the ypTNM staging system of the 8th AJCC Cancer Staging Manual [6]. For patients without NACT, however, we used the pTNM stage described in the 7th AJCC Cancer Staging Manual, considering that the 7th edition is still widely used and the 8th edition has yet to be validated in clinical settings. Furthermore, the main difference between the 7th and 8th editions that could affect our results is that patients with T1N3b disease are classified as stage II in the 7th edition but as stage III in the 8th. Only one patient in our study was affected by this change in classification. For the analysis, we created an indicator variable denoting whether ypTNM or pTNM was used.

The outcome of interest was overall survival, defined as the interval from the start of initial therapy to the date of all-cause death or the last follow-up. For patients receiving NACT, the interval started at the first cycle of chemotherapy after the diagnosis of gastric cancer. For patients not receiving NACT, it started at the date of radical surgery.
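
The definition above can be sketched in code. This is only an illustration under stated assumptions: the record fields and the function name are hypothetical (the study database schema is not described in the text), and the conversion of days to months by dividing by 30.4375 is an assumed convention, not one reported by the authors.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    """Hypothetical record layout, for illustration only."""
    received_nact: bool
    first_chemo_date: Optional[date]  # first NACT cycle, None if no NACT
    surgery_date: date
    death_date: Optional[date]        # None if alive at last contact
    last_followup_date: date


DAYS_PER_MONTH = 30.4375  # assumed conversion (365.25 / 12)


def overall_survival(rec: PatientRecord) -> Tuple[float, bool]:
    """Return (survival time in months, True if death was observed)."""
    # Time origin depends on the first treatment actually received.
    start = rec.first_chemo_date if rec.received_nact else rec.surgery_date
    if start is None:
        raise ValueError("NACT patient without a first chemotherapy date")
    died = rec.death_date is not None
    # Patients without an observed death are censored at last follow-up.
    end = rec.death_date if died else rec.last_followup_date
    return (end - start).days / DAYS_PER_MONTH, died
```

The returned pair (time, event indicator) is the usual input for Kaplan-Meier and Cox analyses.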

The following demographic, clinical and pathological variables were also extracted from the database to serve as potential confounders or predictors of survival: age, gender, BMI, Eastern Cooperative Oncology Group (ECOG) score, American Society of Anesthesiologists (ASA) score, family history of cancer, operation duration, laparoscopic surgery or not, range of gastric resection, digestive reconstruction, combination with multi-organ resection, blood loss, postoperative hospitalization duration, tumor location, total number of resected lymph nodes, total number of metastatic lymph nodes, pathological type, differentiation grade, tumor diameter (in long and short axis), and the presence of vascular cancer embolus.

Statistical analysis

Descriptive statistics are presented as frequencies for categorical variables and mean ± standard deviation for continuous variables. Pearson’s χ2 or Fisher’s exact tests were used to compare categorical variables. Continuous variables were compared using Student’s t tests if normally distributed or Mann-Whitney U tests otherwise. Overall survival was estimated using the Kaplan-Meier method. We used univariate logistic regression and univariate Cox regression to identify covariates associated with the use of ypTNM and with overall survival, respectively.
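
For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. This is a minimal illustration in plain Python, not the Stata or R routines used in the study; the function name and the toy data are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of S(t).

    times  : follow-up time of each patient
    events : True if death was observed, False if censored
    Returns (t, S(t)) at each distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every patient whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Toy data: five patients, two of them censored (at 8 and 20 months).
toy_curve = kaplan_meier([5, 8, 12, 12, 20], [True, False, True, True, False])
```

Censored patients contribute to the risk set until their last follow-up but never lower the curve themselves, which is why differing follow-up lengths can be handled without discarding patients.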

Propensity score matching is widely used to reduce bias due to confounding in non-randomized studies. In this study, we used propensity score matching for two purposes: to minimize confounding, and to make the distributions of T stage and N stage comparable between the ypTNM and pTNM groups, so that the prognostic implications of ypTNM and pTNM staging could be compared when the nominal stage was the same. For instance, we asked whether a patient staged ypT1a, ypN3a, M0 had the same overall survival as a patient staged pT1a, pN3a, M0.

We calculated the propensity score using a logistic regression model that included variables significantly associated with the use of ypTNM or with overall survival. Quadratic terms of continuous variables were added to the propensity score model to account for non-linearity where appropriate. Because the propensity score was not normally distributed, we matched the sample on the logit of the propensity score. Greedy nearest neighbor matching without replacement was used at a 1:1 ratio. We used a caliper of 0.2 standard deviations of the logit of the propensity score, because prior Monte Carlo simulations found that this width reduced bias most effectively among the caliper widths commonly used in clinical research [7,8,9]. Mann-Whitney U and Pearson’s χ2 tests were used to check whether clinicopathological characteristics were balanced between the two groups. After propensity score matching, overall survival in the matched ypTNM and pTNM patients was compared using unconditional Cox regression.
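
The matching step described above can be illustrated with a short sketch. It assumes the propensity scores have already been estimated, and it makes two choices that the text does not specify: treated patients are processed in input order, and the caliper is based on the standard deviation of the logit over both groups combined. The function names and toy scores are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Tuple


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def greedy_match(treated_ps: Dict[str, float],
                 control_ps: Dict[str, float],
                 caliper_sd: float = 0.2) -> List[Tuple[str, str]]:
    """1:1 greedy nearest-neighbor matching without replacement.

    Matching is done on the logit of the propensity score. A pair is
    accepted only if the two logits differ by at most
    caliper_sd * SD(logit), with the SD taken over both groups combined
    (an assumption of this sketch).
    """
    t_logit = {k: logit(v) for k, v in treated_ps.items()}
    c_logit = {k: logit(v) for k, v in control_ps.items()}
    pooled = list(t_logit.values()) + list(c_logit.values())
    caliper = caliper_sd * statistics.stdev(pooled)

    available = dict(c_logit)  # controls not yet matched
    pairs = []
    for t_id, t_val in t_logit.items():  # input order (assumption)
        if not available:
            break
        # Closest remaining control on the logit scale.
        c_id = min(available, key=lambda k: abs(available[k] - t_val))
        if abs(available[c_id] - t_val) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


# Toy scores: t2 has no control within the caliper and stays unmatched.
toy_pairs = greedy_match({"t1": 0.50, "t2": 0.90},
                         {"c1": 0.52, "c2": 0.30, "c3": 0.10})
```

Greedy matching never revisits an accepted pair, so the result can depend on processing order; unmatched patients are simply dropped, which is consistent with a matched sample smaller than the full cohort.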

Conventional multivariate analysis was then used to verify the results from propensity score matching. All variables in the propensity score model were included in this multivariate Cox model. This method can likewise achieve the two purposes of propensity score matching mentioned above. Furthermore, a subgroup analysis by pathological stage (i.e. I, II, III) was performed using the multivariate models to assess whether the difference in prognostic implication between ypTNM and pTNM varied by stage.

The proportional hazards assumption was examined using Schoenfeld residuals. All data analyses were performed with Stata software version 14 (College Station, TX: StataCorp LP) and RStudio version 1.1.419 (RStudio, Inc., Boston, MA), with a two-sided p < 0.05 defined as statistically significant.

Results

Patient characteristics of the unmatched cohort

A total of 1487 patients were eligible for this study, of whom 46 were excluded, leaving a sample of 1441 for analysis (Fig. 1). Table 1 shows the clinicopathological characteristics of the unmatched sample. The mean age was 59.2 ± 11.4 years, and 397 patients (27.55%) were female. More than half of the patients had tumors located in the lower stomach (54.41%). Most patients had adenocarcinoma (75.57%). The most common categories were moderate differentiation grade (46.22%), ypT or pT stage T4a (47.95%), and ypN or pN stage N0 (42.05%). Open gastrectomy was performed in 87.09% of the sample. The median follow-up for all patients, the ypTNM group, and the pTNM group was 37 (range = 2–106), 36 (range = 3–106), and 37 months (range = 2–106), respectively. The 3-year overall survival for all patients, the ypTNM group, and the pTNM group was 72.12, 62.40, and 76.92%, respectively.

Fig. 1 Study design and exclusion criteria

Table 1 Baseline clinical and pathological characteristics of patients using ypTNM stage or pTNM stage in Peking University Cancer Hospital & Institute, 2007–2015 (n = 1441)

Characteristics of the propensity score-matched sample

To better control for confounding and achieve comparable distributions of T and N stage between the two groups, patients were matched 1:1 on factors associated with the likelihood of using ypTNM staging or with survival hazard in the unmatched cohort (Additional file 1: Table S1). The propensity score-matched sample comprised 756 patients (378 in each group). Table 2 displays the covariate differences between the ypTNM and pTNM groups after matching. All previously observed covariate imbalances between the two groups were no longer significant after matching. Moreover, matching balanced the distribution of factors associated with the hazard of all-cause mortality. Matching was therefore considered effective. The clinicopathological characteristics of the matched sample were broadly similar to those of the unmatched cohort. The majority of patients had locally advanced gastric cancer (stage III [53.84%], node positivity [64.29%]).

Table 2 Comparison of Clinicopathological Variables between the ypTNM Group and the pTNM Group (n = 756) after Propensity Score-matching Using the Greedy Nearest Neighbor Algorithm without Replacement (Caliper Width 0.2 × Standard Deviation of Logit Propensity Score)

Prognostic implication of ypTNM versus pTNM for survival

Unconditional Cox regression on the matched sample yielded a hazard ratio of 1.34 comparing the ypTNM group with the pTNM group (95%CI = 1.05–1.72, P = 0.019; Fig. 2). We calculated Harrell’s c-index to compare the prognostic prediction ability of ypTNM staging and pTNM staging. The model with ypTNM as the staging system achieved a c-statistic of 0.67, slightly lower than that of the model with pTNM (0.69).
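
To make the c-statistic concrete, Harrell’s c-index can be sketched as the share of comparable patient pairs in which the patient with the higher predicted risk died first. The version below is a simplified illustration: it treats a pair as comparable only when the earlier time is an observed death, counts tied risk scores as half-concordant, and ignores pairs with tied times. The function name and toy data are assumptions for the example.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[bool],
              risk_scores: Sequence[float]) -> float:
    """Simplified Harrell's c-index.

    times       : follow-up time of each patient
    events      : True if death was observed, False if censored
    risk_scores : higher score means worse predicted prognosis
                  (e.g. a numerically coded stage)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            # Comparable pair: patient i died strictly before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, stage coded 1 to 3, one censored at 6 months.
toy_c = harrell_c([2, 4, 6, 8], [True, True, False, True], [3, 2, 2, 1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so values of 0.67 and 0.69 both indicate moderate discrimination.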

Fig. 2 Comparative prognostic implication of ypTNM vs pTNM for overall survival in propensity-matched cohorts of patients with gastric cancer. *OS is three-year overall survival

Conventional multivariate analysis was used to corroborate the results from propensity score matching. After adjusting for all variables included in the propensity score model (Additional file 1: Table S1), patients with a given ypTNM stage had a 1.35-fold higher hazard of death than patients with the same pTNM stage (95%CI = 1.09–1.67, P = 0.006). Further subgroup analysis using multivariate Cox models indicated that this survival difference between the ypTNM and pTNM groups varied by TNM stage (Fig. 3). For patients in stage I and III, the adjusted hazard ratio comparing ypTNM with pTNM was 3.44 (95%CI = 1.06–11.18, P = 0.040) and 1.28 (95%CI = 1.00–1.62, P = 0.048), respectively, whereas for patients in stage II, no significant difference was observed between the prognostic implications of ypTNM and pTNM stage (adjusted HR = 1.37, 95%CI = 0.78–2.38, P = 0.27). The Kaplan-Meier curves by detailed TNM stage (e.g. Ia, Ib, etc.) are shown in Additional file 1: Figure S1.
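
As a reading aid, the relationship between a hazard ratio, its 95% confidence interval and its P value can be sketched as follows. The sketch assumes a Wald-type interval that is symmetric on the log scale, which the text does not state explicitly, and it works from the rounded figures reported above, so only approximate agreement with the published P values should be expected. The function name is hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def wald_from_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float]:
    """Recover (SE of log HR, z statistic, two-sided P) from HR and 95% CI.

    Assumes the interval is exp(log(HR) +/- Z_95 * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    z = math.log(hr) / se
    # Two-sided normal P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Adjusted estimate from the multivariate model reported above.
se_adj, z_adj, p_adj = wald_from_ci(1.35, 1.09, 1.67)

# Estimate from the propensity score-matched sample.
se_psm, z_psm, p_psm = wald_from_ci(1.34, 1.05, 1.72)
```

Under these assumptions both reconstructed P values fall close to the reported 0.006 and 0.019; small discrepancies are expected because the inputs are rounded to two decimals.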

Fig. 3 Adjusted comparative prognostic implication of ypTNM vs pTNM for overall survival in unmatched cohorts of patients with gastric cancer, stratified by TNM stage. *OS is three-year overall survival

Overall survival for pT0 or ypT0

Four pT0 and 31 ypT0 patients were not included in the analysis above because of the small group size. The four pT0 patients had one-bite gastric cancer. Twenty-eight of the ypT0 patients had a pathological complete response and the remaining three had ypT0N1 disease. The 3-year overall survival for the pT0 and ypT0 patients was 100 and 96.3%, respectively. The 5-year overall survival for the pT0 and ypT0 patients was 100 and 89.9%, respectively.

Discussion

The latest edition of the AJCC Cancer Staging Manual presents a comprehensive TNM staging system for clinicians to use in different situations, including clinical TNM (cTNM), pathological TNM (pTNM) and post-neoadjuvant therapy TNM (ypTNM). cTNM stage is considered of great value in determining clinical intervention, but its value in survival prediction is limited compared with traditional pTNM. Several studies have compared the prognostic value of cTNM with that of ypTNM in gastric cancer patients [10,11,12], but none has focused on the comparison between ypTNM and pTNM. The current study investigated whether the same pathological stage defined by ypTNM and pTNM predicted similar survival in patients with gastric cancer. To the best of our knowledge, this study is the first of its kind.

We found that patients with a given ypTNM stage had a worse prognosis than those at the same pathological stage defined by pTNM. This result is reasonable: if two patients share the same pathological stage, one defined by ypTNM and the other by pTNM, then because of the downstaging effect of NACT, the patient staged by ypTNM would have had a higher pTNM stage had he or she not received NACT, so it is logical that this patient’s prognosis is worse.

Our subgroup analysis further indicated that the difference in prognostic value between ypTNM and pTNM varied by pathological stage. Interestingly, the hazard ratio comparing ypTNM with pTNM was largest in stage I, followed by stage III, and was not significant in stage II. Potential explanations for this phenomenon are as follows. Because patients with stage I disease do not normally receive NACT, the majority of patients in ypStage I are likely to be those who responded well to NACT and were thus downstaged from stage II or III [13]. These patients would therefore have a much worse prognosis than those in pTNM stage I, because they would have had a higher pTNM stage had they been staged without NACT. As patients with distant metastasis (i.e. stage IV) were excluded from our analysis, patients in ypStage III were those who did not respond to NACT and remained in stage III after NACT. These patients had a worse prognosis than patients in pStage III because their tumors were insensitive to chemotherapy, but the difference was smaller than that observed in stage I. Patients in ypStage II might consist of two groups: those downstaged by NACT from stage III, and those who did not respond well to NACT. The mixture of these two groups resulted in a non-significant hazard ratio.

To verify these explanations, we would need the counterfactual pTNM stage of the patients who underwent NACT, which cannot be observed. One alternative, if not the only one, is to use cTNM as a surrogate for pTNM. When doing so, however, one should consider the strength of the correlation between cTNM and pTNM stage. Previous studies have found that fewer than 60–70% of cTNM stages agreed with the pTNM stage [14, 15]. According to a more recent study conducted by the Shizuoka Cancer Center in Japan, this disagreement varied by stage [16]. Nonetheless, despite the imperfect correlation between cTNM and pTNM, it would still be worthwhile to use cTNM as an alternative, given that pTNM is unavailable for patients who receive NACT.

Our results not only suggest that ypTNM and pTNM differ in prognostic value, but also indicate that treatment for patients with a given ypTNM stage should be more intensive than that for patients with the same pathological stage defined by pTNM. According to our subgroup analysis, this difference in treatment should also vary by TNM stage. For instance, as recommended by the NCCN guideline (5th edition, 2017), postoperative follow-up is sufficient for patients in pStage I, whereas adjuvant chemotherapy is needed for patients in ypStage I [17]. However, the guideline does not provide evidence for this recommendation, and our findings offer evidence to support it.

Furthermore, we want to clarify that our results do not indicate that NACT worsens survival. What we found was that, given the same pathological stage, patients with NACT had worse survival than patients without it. This phenomenon is mainly due to the downstaging effect of NACT [18, 19]. Without the premise of the same pathological stage, multiple previous studies have found a favorable association between NACT and overall survival in gastric cancer patients [13, 20, 21].

Our study has two main limitations. Firstly, because information on cTNM was unavailable for this study, we cannot confirm the proposed explanations for our results. However, this does not affect the conclusion of the current study, as we only considered the prognostic value of pTNM and ypTNM stage. Secondly, Borrmann type is considered an important prognostic factor in gastric cancer and is one of the indications for NACT in countries such as Japan. Owing to a lack of information, this factor was not included in the propensity score model or the conventional multivariate analysis. However, given that NACT indications in China do not include Borrmann type, omitting this factor should not introduce bias into the HR estimation.

To sum up, the prognostic implication of ypTNM stage differs from that of pTNM stage. In particular, patients with a given ypTNM stage had a worse prognosis than those at the same pathological stage defined by pTNM. This difference varied by TNM stage. Our findings should be taken into account when using ypTNM to predict prognosis and to decide on postoperative treatment for patients who have received NACT.