Introduction

Metastatic melanoma is a serious disease with a poor survival outcome; the 5-year survival rate for patients with distant metastases at diagnosis is 15–20% [1, 2]. Until recently, therapies for unresectable metastatic melanoma had no confirmed survival benefit and the aim of treatment was palliation [3]. However, the number of treatment options has increased rapidly since 2011. Immunotherapies and targeted agents have demonstrated survival benefits in randomized controlled phase 3 trials and are now recommended for first- and second-line treatment of metastatic melanoma [4, 5]. The choice of treatment may be guided by the patient’s BRAF mutation status: the BRAF kinase inhibitors vemurafenib and dabrafenib are used to treat mutated-BRAF disease, which accounts for approximately 40–50% of melanoma cases, whereas immunotherapies such as ipilimumab, pembrolizumab, and nivolumab, and the oncolytic virus therapy talimogene laherparepvec, can be used regardless of BRAF mutation status [4, 5, 6].

Many of these new treatment options were evaluated in phase 3 trials against older agents. Each new treatment was compared with the agent or regimen considered appropriate for that trial setting at the time the trial was designed, which led to the use of a wide range of control agents in pivotal trials. For example, ipilimumab was compared with glycoprotein 100 peptide vaccine (gp100) in previously treated patients [7] and with dacarbazine in the first-line setting [8]. Dacarbazine was also the control agent in trials of vemurafenib and dabrafenib monotherapy [9, 10], whereas granulocyte-macrophage colony-stimulating factor (GM-CSF) was the control agent in the pivotal trial of talimogene laherparepvec [11].

As the treatment landscape continues to develop, it is important to compare currently used therapies to guide the treatment decision-making process. However, there are challenges in estimating the relative treatment effects for these newer therapies, at least in part because they were compared to older agents in phase 3 trials and the relative efficacy of these older agents has never been established. Systematic evaluations of new treatment options (health technology assessments [12]) have not discounted the possibility of equipoise between gp100 and placebo; however, there is no precedent for making the same assumption for dacarbazine. By better understanding the relative efficacy of older agents, it may be possible to strengthen the networks of evidence and indirect treatment comparisons of the newer therapies.

Here we report the outcome of an indirect treatment comparison of survival with GM-CSF, dacarbazine, and gp100 in phase 3 randomized controlled trials in metastatic melanoma. The indirect treatment comparison took the form of a treatment-specific meta-analysis of absolute treatment effect, with adjustment for heterogeneity in prognostic factors for survival. Given that survival rates vary according to stage of disease [13], we have also compared relative survival with these agents in three populations: stage IIIB–IV M1c melanoma, that is, all patients in the trials; stage IIIB–IV M1a melanoma (early metastatic melanoma), that is, patients without bone, brain, lung, or other visceral metastases; and stage IV M1b/c melanoma (late metastatic melanoma), that is, patients with visceral metastases.

Methods

Systematic Literature Review

A systematic literature review was carried out to identify relevant trials relating to treatments for melanoma; the methodology has been described in full elsewhere [14]. The systematic review followed Cochrane and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Briefly, clinical trials published between January 1990 and September 2015 that evaluated the efficacy and safety of treatments for metastatic melanoma were identified from searches of the following databases: MEDLINE, including MEDLINE In-Process Citations and Daily Update (PubMed; OvidSP); Embase (OvidSP); and the Cochrane Library, including the Cochrane Database of Systematic Reviews (CDSR), Database of Abstracts of Reviews of Effects (DARE), Cochrane Central Register of Controlled Trials (CENTRAL), Health Technology Assessment (HTA) Database, and the NHS Economic Evaluation Database (NHSEED). To identify relevant studies presented at conferences, the following meeting proceedings were also searched: American Society of Clinical Oncology (ASCO), European Association of Dermato-Oncology (EADO), European Cancer Congress (ECC), European Society for Medical Oncology (ESMO), and International Society for Pharmacoeconomics and Outcomes Research (ISPOR; European and international conferences).

The trials identified in the review were assessed against pre-specified inclusion and exclusion criteria; trials that met these criteria and that investigated the agents of interest were then assessed for bias and study quality using the Grades of Recommendation, Assessment, Development and Evaluation (GRADE) criteria [15].

Establishing the Feasibility of a Valid Network Meta-Analysis

The feasibility of conducting a valid network meta-analysis of GM-CSF, dacarbazine, and gp100 was assessed using a process developed and published by Cope and colleagues [16]. The network of evidence for previously untreated or treated melanoma is shown in Fig. 1, including both current treatments and the control agents included in the trials. The network had few trials and generally one trial per connection; aside from dacarbazine, each of the treatments of interest occurred in only a single trial. There were no head-to-head data for the treatments of interest, and patient characteristics (e.g., stage of disease, previous treatment) differed between trial populations, meaning that a valid network of evidence could not be established [17].

Fig. 1

Network of evidence for previously untreated or treated melanoma. a Trial ongoing but data are reported; b Non-standard dose, administration, or setting; c Previously treated patients; d NCT identifier not available, data published (Ravaud et al. 2001). GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, T-VEC talimogene laherparepvec

Indirect Treatment Comparisons

The use of treatment-specific meta-analysis of absolute treatment effect as an alternative approach to indirect treatment comparison has been previously reported when a valid network of evidence could not be established [14]. We used the same approach here, which involved analysis of survival data for GM-CSF, dacarbazine, and gp100 from metastatic melanoma trials in which these agents were studied as monotherapy.

Survival outcomes were adjusted for heterogeneity in prognostic factors to permit a valid comparison between agents. Adjustments were based on a predictive model for survival developed by Korn and colleagues [18]; a modified version of the model has been accepted by the UK National Institute for Health and Care Excellence (NICE) [19]. The modified Korn model produces a hazard ratio (HR) based on five prognostic factors for survival: sex, Eastern Cooperative Oncology Group performance status, presence of visceral metastases, presence of brain metastases, and lactate dehydrogenase (LDH) level. Survival was adjusted using the HR as the modifier, thereby reflecting the impact of the difference in patient characteristics between trial arms for each agent.
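As an illustration of how an HR modifier of this kind could be computed for a trial arm, the following Python sketch assumes a log-linear form in which the proportion of patients with each prognostic factor is multiplied by a log-hazard coefficient and the sum is exponentiated. The function name `hr_modifier`, the factor labels, and all coefficient values are hypothetical placeholders; they are not the published modified Korn estimates [18, 19], and the functional form itself is an assumption.

```python
import math

# Hypothetical log-hazard coefficients, for illustration only.
# These are NOT the published modified Korn model estimates.
ILLUSTRATIVE_LOG_HR = {
    "female": -0.2,         # sex
    "ecog_ge_1": 0.5,       # ECOG performance status of 1 or more
    "visceral_mets": 0.4,   # presence of visceral metastases
    "brain_mets": 0.6,      # presence of brain metastases
    "elevated_ldh": 0.7,    # LDH above the upper limit of normal
}


def hr_modifier(proportions, log_hr=ILLUSTRATIVE_LOG_HR):
    """Return exp(sum(beta_k * p_k)) for one trial arm.

    `proportions` maps each prognostic factor to the fraction of
    patients in the arm (between 0 and 1) who have that factor.
    """
    missing = set(log_hr) - set(proportions)
    if missing:
        raise ValueError(f"missing prognostic factors: {sorted(missing)}")
    for name in log_hr:
        if not 0.0 <= proportions[name] <= 1.0:
            raise ValueError(f"proportion for {name!r} must lie in [0, 1]")
    linear_predictor = sum(log_hr[k] * proportions[k] for k in log_hr)
    return math.exp(linear_predictor)
```

With these placeholder coefficients, an arm with a larger share of patients with elevated LDH or visceral metastases yields a larger modifier, that is, a poorer expected prognosis.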

GM-CSF was considered the reference agent for the indirect treatment comparison since this permitted additional analyses of survival rates according to stage of disease (early and late metastatic melanoma). To test the robustness of the indirect treatment comparison, survival outcomes were also analyzed using dacarbazine as the reference, as this control agent had the largest evidence base.

Survival outcomes were analyzed separately for each comparator. Where survival outcomes for a particular comparator were available from more than one trial, estimates of overall survival (OS) from each trial were adjusted separately using the modified Korn model and then pooled using the Mantel–Haenszel method [20, 21]. A detailed description of the application of the modified Korn model and Mantel–Haenszel method is published elsewhere [14].
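To make the pooling step concrete, the sketch below combines adjusted survival curves from several trials of the same agent by taking a sample-size-weighted mean at each time point. This is a deliberately simplified stand-in and should not be read as the Mantel–Haenszel procedure applied in the analysis, which is described in [14]; the function name `pool_survival` and the weighting scheme are assumptions made for illustration.

```python
def pool_survival(curves, sample_sizes):
    """Pool survival curves by a sample-size-weighted mean per time point.

    `curves` is a list of survival curves (lists of proportions) sampled
    on the same time grid; `sample_sizes` gives the number of patients
    behind each curve. Simplified illustration only.
    """
    if not curves or len(curves) != len(sample_sizes):
        raise ValueError("need one sample size per curve")
    n_points = len(curves[0])
    if any(len(curve) != n_points for curve in curves):
        raise ValueError("all curves must share the same time grid")
    if any(n <= 0 for n in sample_sizes):
        raise ValueError("sample sizes must be positive")
    total = sum(sample_sizes)
    return [
        sum(n * curve[i] for curve, n in zip(curves, sample_sizes)) / total
        for i in range(n_points)
    ]
```

Under this weighting, larger trials contribute proportionally more to the pooled estimate at every time point.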

Survival outcomes with GM-CSF were available according to stage of disease—that is, for patients without bone, brain, lung, or other visceral metastases (stage IIIB–IV M1a) and patients with visceral metastases (stage IV M1b/c). These data were not available for dacarbazine and gp100, so adjustment factors for dacarbazine and gp100 did not include this specificity.

For the analyses using dacarbazine as the reference agent, adjusted survival outcomes with GM-CSF and gp100 were determined relative to dacarbazine for all patients (stage IIIB–IV M1c disease). As survival outcomes with dacarbazine were not available according to stage of disease, it was not possible to conduct the indirect comparison of treatments on the subgroups of patients with early or late metastatic melanoma using dacarbazine as the reference agent.

Extracting Survival Data for Analysis

DigitizeIt version 2.0.3 software was used to extract and digitize Kaplan–Meier curves [22]. This provided survival estimates at consecutive half-month intervals for the relevant arm of each trial. Median survival estimates from the digitized datasets were compared with the published median survival estimates to establish the quality of the outputs.
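The quality check described above can be sketched in Python as follows: the median is read off a digitized curve as the first time point at which survival falls to 50% or below, and is then compared with the published value. The function names and the default tolerance of one half-month grid step are hypothetical choices for illustration, not values taken from the analysis.

```python
def median_survival(times, survival):
    """Return the first time at which survival is 0.5 or lower.

    `times` and `survival` are parallel lists from a digitized
    Kaplan-Meier curve. Returns None if the median is not reached.
    """
    if len(times) != len(survival):
        raise ValueError("times and survival must have the same length")
    for t, s in zip(times, survival):
        if s <= 0.5:
            return t
    return None


def median_matches(digitized, published, tolerance=0.5):
    """Check that the digitized median lies within `tolerance` months
    of the published median (tolerance is an illustrative assumption)."""
    if digitized is None:
        return False
    return abs(digitized - published) <= tolerance
```

For example, a curve sampled at 0, 0.5, 1.0, 1.5, and 2.0 months with survival 1.0, 0.9, 0.6, 0.5, and 0.4 would give a digitized median of 1.5 months under this definition.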

Compliance with Ethics Guidelines

This article is based on previously conducted studies and does not involve any new studies of human or animal subjects performed by any of the authors.

Results

Systematic Review

Figure 2 shows the PRISMA flow diagram illustrating the identification of studies in the systematic review. Trials that had evaluated GM-CSF, dacarbazine, or gp100 as control agents were identified for inclusion in the meta-analysis on the basis of the following criteria: they had to be randomized phase 3 trials published from 2010 onwards, and they had to report both OS curves and the patient baseline characteristics relevant to the modified Korn model.

Fig. 2

PRISMA flow diagram for the systematic review. GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses

Trials Included in the Indirect Treatment Comparison

In total, six trials were identified for inclusion in the indirect treatment comparison: one trial with GM-CSF, four trials with dacarbazine, and one trial with gp100. All of the trials were designed to meet regulatory requirements and were therefore considered of sufficient quality to include in the analyses; it was not considered necessary to use formal assessments of potential bias (e.g., using GRADE criteria) to further refine the list of trials included in the analyses.

The baseline characteristics of patients enrolled in these trials are listed in Table 1. Sex, a prognostic factor for survival, was generally balanced across trials but there were important differences in some other prognostic factors: patients enrolled in the GM-CSF trial seemed to have a better prognosis than those in the dacarbazine trials and the gp100 trial; for example, a higher percentage of patients in the GM-CSF trial had stage IIIB–IV M1a disease and a higher percentage had normal LDH levels.

Table 1 Patient baseline characteristics for randomized controlled phase 3 trials of GM-CSF, dacarbazine, and gp100

The heterogeneity in patient baseline characteristics across the trials confirmed the need for adjustment before comparing the survival outcomes for the three agents, although heterogeneity in outcomes could not be assessed quantitatively because of the small number of trials and the variation in sample sizes.

The modified Korn model was applied to adjust for differences in the five baseline prognostic factors, and an HR modifier was calculated for each trial. Adjustment factors were then determined from the ratio of the HR modifiers for GM-CSF (or dacarbazine) and the comparator agents.
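The ratio described above, and one way it could be applied to a digitized survival curve, can be sketched in Python as follows. The power transformation S_adj(t) = S(t)^AF is an assumption that follows from proportional hazards; the exact procedure is described in [14] and may differ. The function names are illustrative.

```python
def adjustment_factor(hr_reference, hr_comparator):
    """Ratio of the reference agent's HR modifier to the comparator's.

    A value below 1 indicates that the comparator trial population
    had a poorer prognosis than the reference trial population.
    """
    if hr_reference <= 0 or hr_comparator <= 0:
        raise ValueError("HR modifiers must be positive")
    return hr_reference / hr_comparator


def adjust_survival_curve(survival, factor):
    """Apply S_adj(t) = S(t) ** factor to each point of a curve.

    Assumes proportional hazards (an illustrative assumption).
    """
    if factor <= 0:
        raise ValueError("adjustment factor must be positive")
    for s in survival:
        if not 0.0 <= s <= 1.0:
            raise ValueError("survival proportions must lie in [0, 1]")
    return [s ** factor for s in survival]
```

Under this form, a factor below 1 raises every survival proportion strictly between 0 and 1, which matches the direction of adjustment expected when the comparator population has the poorer prognosis.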

Relative Survival in All Patients (Stage IIIB–IV M1c)

GM-CSF as the Reference Agent

Table 2 lists the adjustment factors applied to determine the relative survival effect of dacarbazine and gp100 compared with GM-CSF in all patients (stage IIIB–IV M1c disease). All the adjustment factors are less than 1, reflecting the poorer survival prognosis for patients in the dacarbazine trials and gp100 trial compared with those in the GM-CSF trial.

Table 2 Overall survival curve adjustment: HR and adjustment factor for all patients (stage IIIB–IV M1c)

The observed median OS estimate for GM-CSF and the unadjusted and adjusted median OS estimates for dacarbazine and gp100 in all patients are presented in Table 3. For both dacarbazine and gp100, adjustment for patient baseline characteristics substantially increased adjusted median OS compared with the unadjusted estimates. Again, this reflects the poorer original survival prognosis driven by the patient characteristics in the dacarbazine trials and the gp100 trial. Median OS with GM-CSF was estimated to be longer than with dacarbazine or gp100, even after adjusting for heterogeneity in patient baseline characteristics.

Table 3 Median overall survival in months: all patients (stage IIIB–IV M1c)

The observed OS curve for GM-CSF and unadjusted OS curves for dacarbazine and gp100 are shown in Fig. 3. Adjustment for differences in baseline prognostic factors improved OS for both dacarbazine (Fig. 4) and gp100 (Fig. 5) at each time point along the curve. The observed OS for GM-CSF was greater than the unadjusted OS for dacarbazine and gp100 at each time point. The observed OS for GM-CSF was also greater than the adjusted OS for dacarbazine at every time point (Fig. 4); however, the observed OS for GM-CSF tracked close to or just below the upper bound of the 95% confidence interval (CI) of the adjusted OS curve for dacarbazine, which suggested that the survival estimates for the two agents might be similar. By contrast, the observed OS for GM-CSF tracked above the upper bound of the 95% CI of the adjusted OS curve for gp100 at every time point (Fig. 5), even though differences in the trial populations that might affect survival outcomes had been accounted for. This suggests greater efficacy for GM-CSF throughout the survival curve.

Fig. 3

Unadjusted Kaplan–Meier OS curves for dacarbazine and gp100 vs observed OS curve for GM-CSF, all patients (stage IIIB–IV M1c). GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, OS overall survival

Fig. 4

Unadjusted and adjusted Kaplan–Meier OS curves for dacarbazine vs observed OS curve for GM-CSF, all patients (stage IIIB–IV M1c). CI confidence interval, GM-CSF granulocyte-macrophage colony-stimulating factor, OS overall survival

Fig. 5

Unadjusted and adjusted Kaplan–Meier OS curves for gp100 vs observed OS curve for GM-CSF, all patients (stage IIIB–IV M1c). CI confidence interval, GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, OS overall survival

Dacarbazine as the Reference Agent

Table 4 lists the adjustment factors applied to determine the relative survival effect of GM-CSF and gp100 compared with dacarbazine in all patients (stage IIIB–IV M1c disease). As in the original analysis, median OS with GM-CSF was longer than with dacarbazine and gp100, even after adjusting for heterogeneity in patient baseline characteristics (Table 5).

Table 4 Overall survival curve adjustment to dacarbazine reference: HR and adjustment factor for all patients (stage IIIB–IV M1c)
Table 5 Median overall survival in months: all patients (stage IIIB–IV M1c)

Although adjustment for differences in baseline prognostic factors reduced OS for GM-CSF at each time point compared with the observed data, the adjusted GM-CSF curve still dominated the observed OS curve for dacarbazine (Fig. 6). Moreover, the lower bound of the 95% CI for GM-CSF tracked above or on the observed OS curve for dacarbazine, suggesting that the survival estimates may be similar and supporting the outcome of the original analysis. The adjusted OS for gp100 tracked below the observed OS for dacarbazine but with some overlap with the upper bound of the 95% CI (Fig. 7), again suggesting that the survival estimates for the two treatments are similar.

Fig. 6

Unadjusted and adjusted Kaplan–Meier OS curves for GM-CSF vs observed OS curve for dacarbazine, all patients (stage IIIB–IV M1c). CI confidence interval, GM-CSF granulocyte-macrophage colony-stimulating factor, OS overall survival

Fig. 7

Unadjusted and adjusted Kaplan–Meier OS curves for gp100 vs observed OS curve for dacarbazine, all patients (stage IIIB–IV M1c). CI confidence interval, gp100 glycoprotein 100, OS overall survival

Relative Survival in Patients with Early Metastatic Melanoma (Stage IIIB–IV M1a)

Stage IIIB–IV M1a disease, in which bone, brain, lung, and other visceral metastases are absent, can be considered an early stage of metastatic melanoma. Table 6 lists the adjustment factors applied to determine the relative survival effect of dacarbazine and gp100 compared with GM-CSF in these patients. As stated previously, the adjustment factors for GM-CSF were specific to the stage of disease (i.e., to stage IIIB–IV M1a), but those for dacarbazine and gp100 were not.

Table 6 Overall survival curve adjustment: HR and adjustment factor for patients with early metastatic melanoma (stage IIIB–IV M1a)

The absence of brain or visceral metastases is recognized as a positive prognostic factor for OS and so, as would be expected, adjustment factors for this group were lower than those for the corresponding total patient populations across all trials, resulting in a greater impact on survival outcomes.

The observed median OS estimate for GM-CSF and the unadjusted and adjusted median OS estimates for dacarbazine and gp100 are shown in Table 7. For both dacarbazine and gp100, the adjusted median OS for this population was longer than for the overall patient population, as would be expected given the better survival prognosis. Median OS with GM-CSF was longer than with dacarbazine or gp100 even after adjusting for heterogeneity in patient baseline characteristics.

Table 7 Median overall survival in months: patients with early metastatic melanoma (stage IIIB–IV M1a)

As in the overall patient population, the observed OS for GM-CSF was greater than the adjusted OS for both dacarbazine and gp100 at every time point (Fig. 8). Again, the observed OS for GM-CSF tracked close to or on the upper bound of the 95% CI of the adjusted OS curve for dacarbazine and above the upper bound of the 95% CI of the adjusted OS curve for gp100.

Fig. 8

Adjusted Kaplan–Meier OS curves for dacarbazine and gp100 vs observed OS curve for GM-CSF, patients with early metastatic melanoma (stage IIIB–IV M1a). CI confidence interval, GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, OS overall survival

Relative Survival in Patients with Late Metastatic Melanoma (Stage IV M1b/c)

Stage IV M1b/c disease, where visceral metastases are present, can be considered a late stage of metastatic melanoma. Table 8 lists the adjustment factors applied to determine the relative survival effect of dacarbazine and gp100 compared with GM-CSF in this population. The adjustment factors for this group were greater than those for the corresponding total patient populations across all trials, reflecting the poorer survival prognosis as determined by the GM-CSF data specific to this patient population.

Table 8 Overall survival curve adjustment: HR and adjustment factor for patients with late metastatic melanoma (stage IV M1b/c)

For dacarbazine and gp100, the unadjusted and adjusted median OS estimates for these patients are shown in Table 9, alongside the observed median OS estimate for GM-CSF. The adjusted median OS for each agent was shorter for these patients than for the overall patient population, again reflecting the poorer survival prognosis. Median OS with GM-CSF was longer than with dacarbazine or gp100, even after adjusting for heterogeneity in patient baseline characteristics.

Table 9 Median overall survival in months: patients with late metastatic melanoma (stage IV M1b/c)

The observed OS for GM-CSF was similar to the adjusted OS for dacarbazine, tracking on or close to the OS curve for dacarbazine at every time point (Fig. 9). The observed OS for GM-CSF was greater than the adjusted OS for gp100 at every time point, and tracked above the upper bound of the 95% CI of the gp100 OS curve.

Fig. 9

Adjusted Kaplan–Meier OS curves for dacarbazine and gp100 vs observed OS curve for GM-CSF, patients with late metastatic melanoma (stage IV M1b/c). CI confidence interval, GM-CSF granulocyte-macrophage colony-stimulating factor, gp100 glycoprotein 100, OS overall survival

Discussion

The treatment landscape for metastatic melanoma has evolved rapidly in recent years. Evidence of significant survival benefits from randomized controlled phase 3 trials means that immunotherapies and targeted agents are now treatment options for a patient population that historically has been considered difficult to treat [4, 5, 23]. There are challenges in estimating the relative treatment effects of new therapies, however, because they were compared to different older agents in pivotal phase 3 trials and the relative efficacy of these older agents has never been established. By better understanding the relative efficacy of older agents, it may be possible to strengthen the networks of evidence and indirect treatment comparisons of the newer therapies.

In this indirect treatment comparison, the choice of agents for the analysis was driven by the choice of control agents in the pivotal trials of newer immunotherapies and targeted agents. GM-CSF was the control in the pivotal trial of talimogene laherparepvec [11], gp100 in the registration trial of ipilimumab [7], and dacarbazine in the trials of vemurafenib and dabrafenib monotherapies in the first-line setting [9, 10] and also in another trial of ipilimumab [8]. Dacarbazine is the most commonly used older systemic chemotherapy to treat metastatic melanoma. Guidelines from the European Society for Medical Oncology consider dacarbazine or temozolomide an option for the treatment of stage IV metastatic disease in situations where a clinical trial, immunotherapy, or targeted therapy is not available [4].

The current analysis showed that median OS was longer in patients who received GM-CSF than in those who received dacarbazine, even after adjusting for differences in baseline prognostic factors for survival. This was also consistently observed in early-stage disease (without bone, brain, lung, or other visceral metastases) and in late-stage disease (with visceral metastases). The survival estimates over time for GM-CSF were similar to the adjusted survival estimates for dacarbazine; this was also the case when dacarbazine was considered the reference agent. Median OS was longer in patients who received GM-CSF than in those who received gp100 after adjusting for differences in baseline prognostic factors for survival. Again, this was consistently observed in early and late metastatic melanoma. Survival rates for GM-CSF were higher than those for gp100 at each time point evaluated, tracking above the upper bound of the 95% CI of the adjusted gp100 OS curve. This suggests that GM-CSF has greater efficacy than gp100 throughout the distribution of survival.

This analysis has some limitations that are common to previous analyses of treatments for melanoma. There was only a small number of qualifying trials, and the adjustments made to the survival curves were based on a limited set of observed prognostic factors. It is possible that other unobserved factors also influenced the trial results; however, the factors included were previously identified on the basis of data from a large number of metastatic melanoma trials [18, 19] and have been used multiple times in adjustments for differences between studies in melanoma [14, 24]. Any additional bias left unadjusted should be minimal.

The network of evidence was such that it could not establish similarity or consistency in studies or treatment effects. This was not surprising, as trials of these agents in melanoma are sporadic, date back to 2000, and include various treatment settings. In the absence of such a network, adjustment for baseline prognostic factors was undertaken using the modified Korn model; however, some uncertainty may remain. For example, the single study that addressed the efficacy of GM-CSF was the OPTiM trial, which had some identifiably different patient characteristics, as well as better-specified data on disease stage (more patients without visceral disease were included in the trial). Some of the differences in survival, therefore, could be ascribed to patient characteristics. However, this was mitigated to a degree by the inclusion of visceral disease in the modified Korn model, and further mitigated by separate analyses of disease with and without bone, brain, lung, or other visceral metastases. The consistent findings from all patients (stage IIIB–IV M1c), patients with early metastatic melanoma (stage IIIB–IV M1a), and patients with late metastatic melanoma (stage IV M1b/c) increase the confidence in the results.

To our knowledge, this is the first study to compare the relative treatment effects of the control agents used in phase 3 trials of newer immunotherapies and targeted agents for metastatic melanoma. By adjusting for observed differences in patient characteristics across the trials, we have reliably estimated the relative survival associated with each agent. Findings from this study may help strengthen the networks of evidence in metastatic melanoma, and help future studies establish the relative value of treatments for metastatic melanoma.

Conclusions

In this indirect treatment comparison, using data from randomized controlled phase 3 trials in metastatic melanoma, the relative treatment effect of GM-CSF, dacarbazine, and gp100 has been reliably estimated by adjusting for differences in prognostic factors for survival between patient populations. After adjustment, GM-CSF had survival estimates that were at least as good as those for dacarbazine over time, a finding supported by sensitivity analysis with dacarbazine as the reference agent. Compared with gp100, GM-CSF had higher rates of survival at each time point and longer median OS, suggesting greater efficacy throughout the distribution of survival. Importantly, comparing relative treatment effects according to stage of disease consistently yielded the same pattern. Given the role of these agents as controls in pivotal phase 3 trials of new immunotherapies and targeted agents, the results of this analysis can be used to contextualize the efficacy of these new therapies.