Background

Allogeneic hematopoietic stem cell transplantation (allo-HCT) is the only potentially curative post-remission consolidation treatment for high-risk acute myeloid leukemia (AML) patients [1, 2]. However, preparative regimen-related toxicities and graft-versus-host disease (GVHD) have limited its widespread use. In particular, chronic GVHD (cGVHD) remains the leading cause of late non-relapse mortality (NRM) and morbidity after allo-HCT. Furthermore, the increasing use of G-CSF mobilized peripheral blood stem cells (PBSC) [3], a well-identified risk factor for chronic GVHD [4, 5], is associated with an increased incidence of cGVHD [3, 6]. Therefore, identification of the most effective prevention of GVHD is required to improve patients’ outcome after allo-HCT, particularly in the setting of PBSC transplantation.

In vivo graft manipulation with antithymocyte globulin (ATG) [7,8,9,10,11,12,13] or alemtuzumab [14] and ex vivo graft manipulation with CD34 selection and T cell depletion [15,16,17,18,19] are strategies that are associated with lower rates of chronic GVHD. Since 2000, five phase III randomized trials have investigated the efficacy of rabbit ATG for GVHD prophylaxis in patients who received myeloablative (MAC) allo-HCT from unrelated or HLA-identical matched donors [7,8,9,10,11,12,13]. In all these studies, the use of ATG was associated with a protective effect against cGVHD and in all but one study [12], overall survival (OS) and progression-free survival were not significantly affected [13]. Therefore, in vivo T cell depletion using ATG is now considered a standard for GVHD prevention after PBSC transplantation using related HLA-identical or unrelated donors in many centers. On the other hand, several studies have shown that the use of ex vivo T cell-depleted (TCD) grafts combined with ATG significantly reduces the risk of GVHD without the need for post-transplant immunosuppression [16, 20, 21]. Although several different approaches for T cell depletion of the allograft have been used over the years, more recently, removal of T cells from the graft has routinely been performed through positive selection of CD34+ cells using immunomagnetic beads [22]. To date, no study has compared outcomes in AML patients after myeloablative allo-HCT with ex vivo TCD using CD34 selection or in vivo TCD using ATG. To compare the efficacy of both approaches, we retrospectively evaluated the outcomes of patients with intermediate or high-risk AML in first complete remission (CR1) who underwent myeloablative allo-HCT with either in vivo TCD with ATG within the European group for Blood and Marrow Transplantation (EBMT) centers or ex vivo TCD CD34 selected (CD34+) graft at the Memorial Sloan Kettering Cancer Center (MSKCC).

Methods

Study design and data collection

This retrospective multicenter analysis was performed and approved by the Acute Leukemia Working Party (ALWP) of the EBMT group registry and the institutional review board of the MSKCC. A list of the EBMT participating centers is available online (Additional file 1). The study included all adult patients (age > 18 years) with AML, with intermediate or high-risk cytogenetic, in first morphological CR, who received an in vivo or ex vivo T cell-depleted myeloablative allo-HCT from an HLA matched related (MRD) or unrelated (UD) donor using a peripheral blood stem cell graft between 2005 and 2015. Cytogenetics were classified according to the European Leukemia Net [23]. All allografts were obtained from HLA-A-, HLA-B-, HLA-C-, and HLA-DRB1-matched donors. All patients underwent myeloablative conditioning. Patients at MSKCC received ex vivo TCD graft (CD34+ group, n = 162) after conditioning with one of the following preparative regimens as previously reported: (1) i.v. busulfan (Bu) 0.8 mg/kg/dose for 10 or 12 doses over a 4-day period, melphalan 70 mg/m2/day for 2 days, and i.v. fludarabine (Flu) 25 mg/m2/day for 5 days (n = 107); (2) hyperfractionated total body irradiation (TBI) 13.75 Gy over 4 days followed by i.v. thiotepa 5 mg/kg/day for 2 days and i.v. cyclophosphamide (Cy) 60 mg/kg/day for 2 days (n = 45) or i.v. Flu 25 mg/m2/day for 5 days (n = 10). Peripheral blood grafts underwent CD34 cell selection using the ISOLEX 300i magnetic cell selection system (Baxter, Deerfield, IL), followed by sheep red blood cell-rosette depletion (Isolex-E, n = 53); or CD34+ selection using the CliniMACS CD34 Reagent System (Miltenyi Biotech, Gladbach, Germany) (n = 109). The two approaches provide a similar level of T cell depletion (log10 5.3 with Isolex-E vs. 5.1 with CliniMACS, Jakubowski et al., in preparation). All patients received equine ATG (30 mg/kg total dose, n = 12) or rabbit ATG (thymoglobulin 5 mg/kg total dose, n = 143) to prevent graft rejection, except for those patients receiving a transplant from an HLA-matched related donor and conditioned with hyperfractionated TBI, thiotepa, and Flu (n = 7). No GVHD prophylaxis was administered post-transplantation.

Within the EBMT centers, patients received unmodified grafts and in vivo T cell depletion using rabbit ATG (group ATG, n = 363) after one of the following preparative regimens: (1) Bu 9.6–12.8 mg/kg total dose and i.v. fludarabine (n = 173), (2) Bu 9.6–12.8 mg/kg total dose and i.v. Cy 100–120 mg/kg total dose (n = 129), or (3) high-dose TBI and i.v. Cy 100–120 mg/kg total dose (n = 61). Patients received either thymoglobulin (n = 233) or grafalon (formerly ATG-fresenius, n = 130) for prevention of graft rejection and of GVHD. These patients received post-HCT GVHD prophylaxis consisting of cyclosporine alone (n = 62) or in combination with methotrexate (n = 213) or mycophenolate mofetil (n = 60), tacrolimus in combination with methotrexate (n = 2) or sirolimus (n = 10) or other combinations (n = 16).

Supportive care and antimicrobial prophylaxis were administered according to standard guidelines and include infection prophylaxis for Pneumocystis jirovecii and herpes virus. All patients were assessed at least once per week for cytomegalovirus (CMV) and Epstein-Barr virus (EBV) reactivation in the peripheral blood by polymerase chain reaction, to initiate preemptive therapy [24].

Statistical analysis

Endpoints included OS, leukemia-free survival (LFS), cumulative incidence of relapse, NRM, acute and chronic GVHD, and refined GVHD-free relapse-free survival (GRFS) [25]. All outcomes were measured from the time of allo-HCT. OS was based on death, regardless of the cause. LFS was defined as survival with no evidence of relapse. OS and LFS rates were calculated by the Kaplan-Meier estimator. Cumulative incidence functions were used to estimate the probabilities of acute and chronic GVHD, NRM, relapse, and GRFS to accommodate competing risks. NRM and relapse were the competing risks for each other. Patients alive without relapse were censored at the time of last contact. For acute and chronic GVHD, the competing risk was death without the event. For refined GRFS, the events were relapse, grade III–IV acute GVHD, or extensive cGVHD and the competing risk was death without the events [25]. Acute GVHD was defined according to the standard criteria [26]. Due to the retrospective nature of this analysis, NIH cGVHD classification [27] was not available for EBMT centers; therefore, Shulman et al. classification (limited versus extensive) [28] was used for all patients.

Patients’ characteristics were compared between the CD34+ and the ATG groups using the chi-square test or the Fisher exact test for categorical variables and the Mann-Whitney test for continuous data. Univariate analyses were performed using the log-rank test for OS and LFS and Gray’s test for cumulative incidences. Chronic GVHD was analyzed as a time-dependent variable. For multivariate regression, a Cox proportional hazards model was built. Impact of age was analyzed per decade. Results were expressed as hazard ratio (HR) with 95% confidence interval (CI). All tests were two-sided and the type-1 error rate was fixed at 0.05. Statistical analyses were performed with SPSS 19 (SPSS Inc. /IBM, Armonk, NY) and R 3.0.1 (R Development Core Team, Vienna, Austria) software packages.

Results

Patient and donor characteristics

Patients and transplant characteristics are summarized in Table 1. Patients, in the CD34+ group who were significantly older, were more likely to have a matched related donor and a Karnofsky performance status < 90% compared to the ATG group. There were no statistically significant differences between groups regarding donor and patient gender, CMV serological status, and cytogenetic risk factor. The median follow-up among surviving patients was 35.4 (range, 2–139) months and was significantly longer in the CD34+ group, 58 (range, 6–139) months, compared to that in the ATG group, 24.5 (range, 2–131) months (p < 0.001). As a result, all patients were censored at 2 years for the comparison between groups.

Table 1 Study population characteristics

Engraftment

Engraftment was achieved in 362/363 patients (99.7%) in the ATG and 159/162 (98.1%) in the CD34+ group (p = 0.06). The median time to neutrophil recovery was significantly longer in the ATG group: 16 (range, 6–34) days compared with 10 (range, 8–20) days in the CD34+ group (p < 0.0001).

Overall survival and leukemia-free survival

Univariate and multivariate analyses of transplantation-related events are summarized in Tables 2 and 3, respectively. The Kaplan-Meier estimate of OS at 2 years was 63.9% (95% CI, 58.5–69.4) in the ATG group and 67.6% (95% CI, 60.3–74.9) in the CD34+ group (p = 0.31, Fig. 1a). In multivariate analysis, there was no significant difference in OS between the TBI group and the Bu group (HR, 1.43; 95% CI, 0.97–2.11; p = 0.07). The only parameters with a significant impact on OS in multivariate analysis were patients’ age and cytogenetic status. OS was significantly lower in patients with poor as compared to intermediate-risk cytogenetics (HR, 1.16; 95% CI, 1.11–2.19; p = 0.009) and in older patients (HR, 1.15; 95% CI, 1.00–1.34; p = 0.049). The Kaplan-Meier estimate of LFS at 2 years was 57.9% (95% CI, 52.4–63.4) in the ATG group and 61.0% (95% CI, 53.4–68.5) in the CD34+ group (p = 0.29, Fig. 1b). In multivariate analysis, there was no significant difference in LFS between the ATG and the CD34+ groups (HR, 1.25; 95% CI, 0.88–1.78; p = 0.21). The only parameter with a significant impact on LFS in multivariate analysis was cytogenetic status. LFS was significantly lower in patients with poor compared to those with intermediate risk cytogenetics (HR, 1.70; 95% CI, 1.26–2.31).

Table 2 Transplant-related events univariate analysis
Table 3 Transplant-related events multivariate analysis
Fig. 1
figure 1

Outcome after allo-HCT. Overall survival (a), leukemia-free survival (b), cumulative incidence of relapse (c), cumulative incidence of GVHD-free and relapse-free survival (d), and cumulative incidence of non-relapse mortality (e). OS, overall survival; LFS, leukemia-free survival; RI, relapse incidence, rGRFS, refined GVHD-free relapse-free survival; NRM, non-relapse mortality

Relapse

At 2 years, the cumulative incidence of relapse was 30.0% (95% CI, 24.9–35.2) in the ATG group and 21.6% (95% CI, 15.6–28.3) in the CD34+ group (p = 0.03, Fig. 1c). In multivariate analysis, there was a trend toward a higher cumulative incidence of relapse in the ATG group (HR, 1.52; 95% CI, 0.96–2.42; p = 0.07). The only parameter with a significant impact on relapse incidence in multivariate analysis was cytogenetic risk status: relapse was significantly increased in patients with poor compared to intermediate risk cytogenetics (HR, 2.10; 95% CI, 1.46–3.03; p < 0.0001). Therefore, a subgroup analysis was performed to separately analyze patients with intermediate and high-risk cytogenetic. In patients with intermediate risk cytogenetic, the 2-year cumulative incidence of relapse was significantly higher in the ATG group compare to the CD34+ group [25.0% (95% CI, 19.5–30.6) versus 11.6% (95% CI, 6.5–18.4)]. In multivariate analysis, TCD approach was the only parameter with an impact on relapse with a significantly higher cumulative incidence of relapse in the ATG group (HR, 2.42; 95% CI, 1.22–4.78; p = 0.01). In contrast, in the subgroup of patients with high-risk cytogenetic, TCD approach has no impact on the cumulative incidence of relapse [42.7% (95% CI, 32.1–53.0) in the ATG group versus 44.0% (95% CI, 29.9–57.3)].

Graft-versus-host disease

The day-100 cumulative incidence of grade II–IV and grade III–IV aGVHD were significantly higher in the ATG group, being 21.2% (95% CI, 17.1–25.6) and 6.2% (95% CI, 4–9.1), versus 11.3% (95% CI, 7.0–16.8) and 1.3% (95% CI, 0.2–4.1), respectively, in the CD34+ group (p = 0.006 and p = 0.01). In multivariate analysis, TCD approach was the only parameter with an impact on aGVHD. The cumulative incidence of grade II–IV aGVHD was significantly higher in the ATG group compared with the CD34+ group (HR, 2.05; 95% CI, 1.12–3.73; p = 0.02). At 1 year, the cumulative incidence of cGVHD and extensive cGVHD were significantly higher in the ATG group, being 27.6% and 11.3%, versus 2.5% and 2.5%, respectively, in the CD34+ group (p < 0.0001 and p = 0.001). In multivariate analysis, TCD approach was the only parameter with an impact on cGVHD. The cumulative incidence of cGVHD was significantly higher in the ATG group compared with the CD34+ group (HR, 15.07; 95% CI, 5.38–42.26; p < 0.0001).

Graft-versus-host disease-free relapse-free survival

At 2 years, the cumulative incidence of GRFS was significantly lower in the ATG group, being 47.0% (95% CI, 41.4–52.5) versus 59.1% (95% CI, 53.5–68.5) in the CD34+ groups (p = 0.003, Fig. 1d). In multivariate analysis, GRFS was significantly higher in the CD34+ group compared to the ATG group (HR, 1.60; 95% CI, 1.14–2.23; p = 0.006). In addition, cytogenetic status and type of donor were also associated with lower GRFS in multivariate analysis.

Non-relapse mortality

At 2 years, the cumulative incidence of NRM was 12.1% (95% CI, 8.9–15.9) in the ATG group and 17.4% (95% CI, 12.0–23.6) in the CD34+ group (p = 0.16, Fig. 1e). In multivariate analysis, there was no significant difference in NRM between the ATG and CD34+ groups (HR, 0.96; 95% CI, 0.54–1.74; p = 0.90). The only parameter with a significant impact on NRM incidence in multivariate analysis was patient age: NRM significantly increased in older patients (HR, 1.32; 95% CI, 1.03–1.68; p = 0.02). NRM was related mainly to infection (n = 35) and GVHD (n = 19), others causes being hemorrhage (n = 2), sinusoidal obstruction syndrome (SOS, n = 4), cardiac toxicity (n = 1), graft failure (n = 2), secondary malignancy (n = 2), others (n = 13), unknown (n = 3). Deaths related to infectious complication were significantly more frequent in the CD34+ group (16/28 versus 19/53 in ATG group, p = 0.045), while there was no difference between groups regarding death related to GVHD (5/28 in CD34+ versus 14/53 in ATG group, p = 0.41).

Donor lymphocyte infusion

Similar proportions of patients received DLI post-transplant in both cohorts. In the ATG T cell depletion cohort, 38 (10.5%) patients received DLI for the following indications: mixed chimerism in 8 patients, relapse in 19 patients, and prophylaxis in 11 patients. In the CD34+ cohort, 19 (11.7%) patients received DLI. The indications were mixed chimerism in 9 patients, relapse in 7 patients, and infection in 3 patients (2 EBV and 1 HHV-6).

Discussion

The optimal goal of allo-HCT is to mediate graft-versus-tumor effect to achieve cure, while sparing patients from severe comorbidity. Over the last decade, patients’ outcomes have improved significantly with an improved OS and decreased NRM [3]. However, despite this progress, there has been an increase in the incidence of cGVHD, the leading cause of late NRM and morbidity after allo-HCT [3, 6], associated with the use of unrelated donors and of PBSC grafts [4, 5]. Identification of the best strategy for cGVHD prevention is of the utmost importance, and while TCD may be a potential strategy, there remains a concern that it may be at the expense of an increased incidence of relapse. The primary objective of this retrospective study was to assess the relative efficacy of two TCD approaches, in vitro TCD using ATG, and ex vivo TCD with CD34 selection, and showed high OS and LFS with both approaches in patients with AML in CR1 transplanted with matched donors after a MAC regimen.

Furthermore, our study confirms that TCD is associated with a low incidence of severe acute and chronic GVHD, with a day 100 cumulative incidence of grade III–IV aGVHD of 6.2% and 1.3% and a 1-year cumulative incidence of extensive cGVHD of 11.3% and 2.5% in the ATG and the CD34+ groups, respectively. The results in the ATG groups are in accordance with previously published prospective randomized trials, with an incidence of grade III–IV aGVHD ranging from 2.4 to 11.7% and an incidence of extensive or moderate to severe cGVHD between 5 and 13% [8,9,10, 12]. Similarly, in the CD34+ group, the cumulative incidence of severe grade III–IV aGVHD and extensive cGVHD compares favorably to data available from clinical trials. In a prospective randomized trial of ex vivo TCD, Wagner et al. reported a cumulative incidence of grade III–IV aGVHD and overall cGVHD of 19% and 29% [29]; however, TCD in that study was limited to 1 to 2 logs depletion. More recently, with the TCD techniques utilized by the MSKCC group that allow 3 to 4 logs of T cell depletion, Devine et al. reported a cumulative incidence of grade III–IV aGVHD and extensive cGVHD of 4.5% and 6.8%, respectively, in a multicenter prospective phase 2 trial [15]. Overall progress in ex vivo TCD techniques have achieved a very low incidence of GVHD, leading to a significantly lower cumulative incidence of both acute and chronic GVHD in the CD34+ group, translating in a higher GRFS in those patients. It should be noted that the use of CD34 selection incorporates ATG, which is primarily used to promote engraftment by abrogating host T cells, but likely has an effect on residual donor T cells in the CD34-selected graft. This potentially contributes to the low risk of GVHD as well as delayed immune recovery [30]. Finally, use of a matched unrelated donor remains associated with an increase incidence of grade II–IV aGVHD, while there was a trend toward a higher incidence of cGVHD in multivariate analysis, highlighting that TCD, either ex vivo or in vivo, does not completely overcome HLA barrier.

The increased incidence of GVHD seen in the ATG group does not translate in a higher cumulative incidence in NRM in those patients. In fact, there was no difference in the incidence of death related to GVHD between groups (12% in the ATG versus 10% in the CD34 groups), while there was more death related to infections in the CD34 group: 30% versus 17% in the ATG group. Delayed immune recovery after CD34-selected TCD allo-HCT contributes to an increase incidence of infectious complications, which may result in a lack of improvement in OS despite the low incidence of GVHD, compared to unmodified grafts [18, 30,31,32,33,34,35,36]. However, the median age in the CD34+ group was over a decade older than the ATG group. Age is a known risk factor for GVHD, NRM and delayed immune reconstitution, and the results of the study need to be considered in the context of this significant age difference between the cohorts.

While use of TCD was thought to be associated with an increase incidence of relapse, relapse incidence remains limited in our study, and no difference between groups was observed in multivariate analysis. This observation is probably related to the intensity of the conditioning regimen administered. Indeed, while in four out of six studies that include exclusively RIC regimen, ATG was associated with an increased risk of relapse, in studies that include MAC or MAC and RIC regimens together, no increase in relapse risk was reported [37]. Similarly, the MSKCC group recently reported a lower incidence of relapse using CD34-selected TCD allo-HCT with MAC compared to RIC [38]. In our study, while there were some differences in conditioning regimen between the two groups, all patients received a MAC regimen. Therefore, the incidences of relapse observed in our study are in accordance with previously published studies, despite the inclusion of one third of patients with poor risk cytogenetic status. Unfortunately, due to its retrospective nature being registry-based study, molecular characteristics were not available for all patients and we were not able to refine our analysis based on these parameters. In addition, the inhomogeneity of ATG doses used in our study may have interfered with patients’ outcome. Indeed, while lower ATG exposure is thought to be associated with a higher risk of GVHD, high exposure may produce excessive T cell depletion, leading to delayed immune reconstitution with increased risk of relapse, but also higher NRM, mostly as a result of infections [13]. Therefore, Ayuk et al. reported that use of lower doses of grafalon (30 versus 60 mg/kg) was associated with a lower NRM but had no impact on chronic GVHD and relapse rate after MAC allo-HCT [39]. Furthermore, Locatelli et al. evaluated even lower doses of grafalon (15 versus 30 mg/kg) in a phase III trial and found that they were associated with an improved OS and progression-free survival without significantly increasing the incidence of GVHD [40]. Finally, ATG pharmacokinetics and pharmacodynamics may also have a significant impact on HCT outcomes. For example, while ATG dose is based on patients’ weight, Admiraal et al. recently developed a pharmacokinetic model where absolute lymphocyte count at time of ATG administration was the only relevant predictor for ATG pharmacokinetics in adults [41]. Development of such approaches may help to further improve ATG safety and tolerability.

Our study does have some limitations, due in part to its retrospective nature and potential differences in supportive care. One limitation of our study is that all patients from the CD34+ group were treated in a single center highly experienced in this approach, while the ATG group was constituted from a multicenter-based registry. However, inclusion criteria were defined in order to constitute a cohort of patient as homogeneous as possible, CR1 AML patients with a matched donor that received rabbit ATG and a Bu-Cy, Bu-Flu, or Cy-TBI MAC regimen, in order to overcome this limitation and to reduce bias due to disease status and donor type on disease outcome.

Another potential limitation is the broader applicability of CD34 selection given the fact that all patients in the CD34 group were from a single center highly experienced in this approach. However, this approach has previously shown similar results in a multicenter phase 2 trial [15]. Furthermore, a randomized multicenter phase 3 trial is currently comparing this approach to post-transplant cyclophosphamide or tacrolimus and methotrexate in patients with acute leukemia and MDS receiving a MAC transplant from an 8/8 HLA-matched related or unrelated donor (NCT02345850).

Conclusions

Overall, our study shows that NRM, OS, and LFS are similar after ex vivo CD34-selected and in vivo ATG TCD MAC allo-HCT from related/unrelated donors in patients with AML in CR1 and intermediate/high-risk cytogenetic. Notably, the cumulative incidence of acute (total and severe) and chronic GVHD was higher after allo-HCT with ATG, leading to a lower GRFS in those patients. In contrast, stronger immunosuppression in the CD34+ group leads to a higher incidence of infectious related death. Given the high OS seen in patients with AML in CR1 with both approaches, they should be compared in a prospective randomized trial.