Key Points for Decision Makers

A network meta-analysis showed that nintedanib was significantly better than placebo in acute exacerbations and lung function decline related to idiopathic pulmonary fibrosis. Pirfenidone reached statistical significance for lung function decline. There was uncertainty about the overall survival benefits of active treatments compared with placebo.

The analysis of the trial data showed a logical trend in the association of resource use estimates and lung function as well as between EQ-5D and lung function, i.e. increasing resource use (mainly hospitalisation) and decreasing EQ-5D scores with lung function decline.

In the base-case analysis, nintedanib and pirfenidone were largely equivalent in estimated costs and benefits; the results were driven mainly by the risk of acute exacerbations.

N-Acetylcysteine was dominated by the reference strategy (best supportive care) due to a worse survival profile.

1 Introduction

Idiopathic pulmonary fibrosis (IPF) is a rare, chronic, progressive and fatal lung disease of unknown origin characterised by irreversible lung function decline [1]. The prevalence of IPF in the UK is estimated to be between 15 and 25 cases per 100,000 people [2].

Treatment of IPF focuses on managing symptoms and slowing disease progression. The majority of patients receive best supportive care (BSC), which consists of smoking cessation, oxygen therapy, pulmonary rehabilitation, opiates, anti-reflux therapy, low-dose corticosteroids and palliative care [1]. A minority of patients are eligible for lung transplant [3]. Few pharmacological treatments are available to treat IPF. Triple therapy with prednisone, azathioprine and N-acetylcysteine was once widely used, but has been shown to result in an increased risk of death and serious adverse events (AEs) [4]. Although N-acetylcysteine monotherapy may be used, it has shown little benefit compared with placebo [5]. In 2011, the European Medicines Agency (EMA) approved pirfenidone for the treatment of IPF. In 2015, the EMA approved nintedanib (OFEV®; Boehringer Ingelheim, Ingelheim, Germany) for this indication.

International guidelines recommend nintedanib and pirfenidone as treatments for IPF, thus providing physicians and their patients with genuinely effective therapeutic options [6]. Both nintedanib and pirfenidone are approved for reimbursement by the National Institute for Health and Care Excellence (NICE) [7, 8] and the Scottish Medicines Consortium (SMC) [9, 10] under confidential patient access scheme (PAS) discounts and restricted market access conditions. The objective of this study was to assess the cost effectiveness of nintedanib for the treatment of IPF against established treatments in the UK. We provide an overview of the analysis and model that was submitted to NICE and the SMC in 2015 and discuss its strengths and limitations.

2 Methods

A Markov model was designed to capture the changes in the condition of adults with IPF. To determine the structure of the model, we reviewed a cost-effectiveness evaluation focused on the non-pharmacological treatment of IPF that was available at the time of the analysis [1], and identified three outcomes that describe a patient's absolute condition: overall survival (OS), acute exacerbations and disease progression, defined as lung function decline. Several methods were explored using patient-level clinical trial data [11, 12] to examine the interdependencies of the three outcomes.

Clinical outcomes that could impact disease progression and clinical deterioration were considered for the definition of model health states. A literature review identified studies that assessed a single parameter [13–20] and those using risk scoring systems with multiple parameters [21, 22]. Forced vital capacity (FVC) was the most commonly reported measure in the literature and in clinical trials [7, 8, 10, 17], and was selected as the main factor determining disease progression. FVC percent predicted (FVC %pred) was reported across the majority of published clinical trials and was therefore preferred to raw (i.e. absolute) FVC values. FVC %pred is adjusted for the age, sex and height of the patient, thus removing some of the heterogeneity of the health-state members; it also adheres to Markov model conventions. Our choice of the optimal FVC %pred range was informed by several exploratory analyses of its impact on the model results during the conceptualisation phase. After consultation with clinical experts (GJ and TM) and consideration of the evidence from the INPULSIS trials [12] and the literature [23, 24], it was decided that a 10-point categorisation of FVC %pred was the most clinically appropriate and methodologically feasible value for use in this analysis.

A number of key health states were used to represent IPF disease progression and possible transitions between them (Fig. 1). The cohort entered the model with different levels of FVC %pred and without a history of acute exacerbation (see Electronic Supplementary Material Online Resource 1). Patients who progressed to a lower FVC %pred could not regress back to health states with better lung function. History of an acute exacerbation was assumed to influence the health status of patients. We assumed that death could occur (a) at any point in the model (and from any health state); or (b) at the point that patients drop below a level of FVC %pred of 40%, which was assumed to be an unsustainable level of lung function [1].

Fig. 1 Model structure
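The state space described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the submitted model: the text fixes the 10-point band width and the 40% floor, but the top band (110 and above), the state labels and the assumption that exacerbation history is never lost are hypothetical choices made here.

```python
from itertools import product

# Assumed band edges: the text fixes only the 10-point width and the 40% floor;
# the top band (110 and above) is a hypothetical choice for this sketch.
FVC_BAND_LOWER_BOUNDS = [110, 100, 90, 80, 70, 60, 50, 40]
EXACERBATION_HISTORY = [False, True]
DEATH = "death"


def build_states():
    """Return all living states (FVC band x exacerbation history) plus death."""
    living = list(product(FVC_BAND_LOWER_BOUNDS, EXACERBATION_HISTORY))
    return living + [DEATH]


def is_allowed(src, dst):
    """Check a transition against the structural rules described in the text."""
    if src == DEATH:
        return dst == DEATH  # death is absorbing
    if dst == DEATH:
        return True  # death can occur from any living state
    src_band, src_exa = src
    dst_band, dst_exa = dst
    no_regression = dst_band <= src_band  # lung function cannot improve
    # Assumption of this sketch: once recorded, exacerbation history is kept.
    history_kept = dst_exa or not src_exa
    return no_regression and history_kept


if __name__ == "__main__":
    states = build_states()
    print(len(states), "states")
    print(is_allowed((80, False), (70, True)))
```

A patient falling below the lowest band (FVC %pred under 40%) would move to the death state, in line with rule (b) above.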

2.1 Treatment Efficacy

The model used evidence from three randomised controlled trials (RCTs) for nintedanib: the phase II TOMORROW (To Improve Pulmonary Fibrosis with BIBF 1120) trial and two phase III INPULSIS trials (INPULSIS-1 and INPULSIS-2) [11, 12]. Data for pirfenidone and N-acetylcysteine were either extracted from the main pirfenidone and N-acetylcysteine publications [25, 26] or were obtained from a network meta-analysis (NMA) comparing the active comparator treatments.

In the absence of head-to-head data for all comparators, an NMA was developed based on evidence from nine studies [11, 12, 25–31]. Key efficacy parameters, such as OS, acute exacerbations and lung function decline, were assessed. Other efficacy outcomes analysed in the NMA, but not included in the cost-effectiveness model, were the 6-min walk test and progression-free survival. A more detailed description of the NMA methodology and results is available in Online Resource 2.

The model captured three types of transition related to treatment efficacy: OS, acute exacerbations and lung function decline (see Online Resource 3). To define the baseline mortality risk, a survival analysis was conducted on patient-level data from the TOMORROW and INPULSIS trials [11, 12]. Five regression models were assessed for goodness of fit: exponential, Gompertz, log-logistic, log-normal and Weibull. The log-logistic, Weibull and Gompertz parametric models returned the lowest Akaike Information Criterion values and were compared with data from observational studies in patients with IPF (Fig. 2) [5, 19]. The log-logistic model showed the best fit with these and was therefore used for the base-case analysis, while the alternatives were used in sensitivity analyses. It was assumed that following an acute exacerbation, patients would experience an increased risk of death, which was implemented as a hazard ratio of 1.40 per cycle [5].
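To illustrate how a fitted log-logistic curve can feed a 3-month Markov cycle, the sketch below converts the survival function into a per-cycle death probability and then applies the post-exacerbation hazard ratio of 1.40. The scale and shape values are hypothetical placeholders (the fitted parameters are not reported in this section), and applying the hazard ratio through the proportional-hazards transformation of the cycle probability is an assumption of this sketch rather than a documented feature of the model.

```python
def loglogistic_survival(t_years, scale, shape):
    """Log-logistic survival function S(t) = 1 / (1 + (t / scale) ** shape)."""
    return 1.0 / (1.0 + (t_years / scale) ** shape)


def cycle_death_probability(t_years, scale, shape, cycle_years=0.25):
    """Probability of dying during the cycle starting at t, given alive at t."""
    s_now = loglogistic_survival(t_years, scale, shape)
    s_next = loglogistic_survival(t_years + cycle_years, scale, shape)
    return 1.0 - s_next / s_now


def apply_hazard_ratio(probability, hazard_ratio):
    """Scale a per-cycle probability by a hazard ratio (proportional hazards)."""
    return 1.0 - (1.0 - probability) ** hazard_ratio


# Hypothetical parameters, for illustration only.
SCALE_YEARS, SHAPE = 4.0, 1.5

p_base = cycle_death_probability(1.0, SCALE_YEARS, SHAPE)
p_post_exacerbation = apply_hazard_ratio(p_base, 1.40)

if __name__ == "__main__":
    print(f"baseline cycle risk: {p_base:.4f}")
    print(f"after exacerbation:  {p_post_exacerbation:.4f}")
```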

Fig. 2 Comparison of overall survival of the model best supportive care arm with observational data [5, 19]. BSC best supportive care

Data on acute exacerbations from the placebo arms of the INPULSIS trials were used to estimate the baseline risk. Time to first acute exacerbation was recorded in two ways in the INPULSIS trials: (a) based on investigator-reported events; and (b) based on events adjudicated as confirmed or suspected acute exacerbations by a blinded adjudication committee [12]. The exponential model was judged to be the best fit; the 3-month acute exacerbation risks were 1.97 and 1.47% for investigator-reported and adjudicated-confirmed/suspected exacerbations, respectively. The economic model used the investigator-reported estimate in the base-case analysis. The same risk value was assumed for recurrent events due to a lack of other evidence.
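Under an exponential model the hazard is constant, so a 3-month risk can be converted to a rate and back to a risk over any other interval. The short sketch below does this for the two reported 3-month risks; the function names and the annual horizon are illustrative choices, not part of the original analysis.

```python
import math


def rate_from_risk(risk, interval_years):
    """Constant hazard rate implied by a risk observed over an interval."""
    return -math.log(1.0 - risk) / interval_years


def risk_from_rate(rate, interval_years):
    """Risk over an interval implied by a constant hazard rate."""
    return 1.0 - math.exp(-rate * interval_years)


CYCLE_YEARS = 0.25  # 3-month model cycle

REPORTED_CYCLE_RISKS = {
    "investigator-reported": 0.0197,
    "adjudicated confirmed/suspected": 0.0147,
}

if __name__ == "__main__":
    for label, cycle_risk in REPORTED_CYCLE_RISKS.items():
        rate = rate_from_risk(cycle_risk, CYCLE_YEARS)
        annual_risk = risk_from_rate(rate, 1.0)
        print(f"{label}: rate {rate:.4f} per year, annual risk {annual_risk:.4f}")
```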

The interdependency between exacerbation and the baseline lung function risk was explored in the economic model using data from the INPULSIS trials [12]. A logistic model was fitted with the current FVC %pred state as a covariate (Model a), and a second version added patients’ acute exacerbation status, i.e. progression before and after an exacerbation (Model b). The exacerbation covariate was not statistically significant (p = 0.445), so Model a was used in the base-case analysis.

$$\text{Model a}: \text{LF}_{t_1} = -4.180 + 0.016 \times \text{FVC \%pred}_{t_0}$$
$$\text{Model b}: \text{LF}_{t_1} = -4.180 + 0.016 \times \text{FVC \%pred}_{t_0} + 0.814 \times \text{Exa}$$

where $\text{LF}_{t_1}$ is the lung function at the end of the interval (time $t_1$), $\text{FVC \%pred}_{t_0}$ is the value of the FVC %pred at the start of the interval (time $t_0$) and Exa is the exacerbation covariate (whether an exacerbation occurred during the previous cycle).
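As a rough illustration, the two fitted models can be evaluated as follows. The sketch assumes that the left-hand side of each equation is on the log-odds scale (the text describes a logistic model) and that the outcome is progression to a lower FVC %pred category within one cycle; both readings are assumptions of this sketch, and the function name is hypothetical.

```python
import math

# Coefficients as reported for Model a and Model b.
INTERCEPT = -4.180
COEF_FVC = 0.016
COEF_EXA = 0.814


def progression_probability(fvc_pred, exacerbation=None):
    """Inverse-logit of the linear predictor.

    exacerbation=None evaluates Model a (no exacerbation covariate);
    exacerbation=True/False evaluates Model b.
    """
    linear = INTERCEPT + COEF_FVC * fvc_pred
    if exacerbation is not None:
        linear += COEF_EXA * (1 if exacerbation else 0)
    return 1.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    for fvc in (50, 80, 110):
        p_a = progression_probability(fvc)
        p_b = progression_probability(fvc, exacerbation=True)
        print(f"FVC %pred {fvc}: Model a {p_a:.4f}, Model b with Exa {p_b:.4f}")
```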

The relative effectiveness of nintedanib, pirfenidone and N-acetylcysteine for OS, lung function decline and acute exacerbations against the baseline risk was calculated using odds ratios (ORs) obtained in the NMA (Table 1).

Table 1 Results of the network meta-analysis
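An odds ratio is applied to a baseline probability by converting the probability to odds, multiplying, and converting back. The sketch below uses the baseline 3-month exacerbation risk of 1.97% and the acute exacerbation ORs quoted in the Results (0.56 for nintedanib, 1.10 for pirfenidone); the function name is illustrative.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability implied by applying an OR to a baseline risk."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    treated_odds = baseline_odds * odds_ratio
    return treated_odds / (1.0 + treated_odds)


BASELINE_EXACERBATION_RISK = 0.0197  # per 3-month cycle, placebo arms

if __name__ == "__main__":
    for treatment, odds_ratio in [("nintedanib", 0.56), ("pirfenidone", 1.10)]:
        risk = apply_odds_ratio(BASELINE_EXACERBATION_RISK, odds_ratio)
        print(f"{treatment}: cycle exacerbation risk {risk:.4f}")
```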

2.2 Treatment Safety and Tolerability

The analysis assumed that patients were at risk of AEs for as long as they received treatment. To ensure comparability with the comparator evidence and homogeneity of the AEs, a number of criteria were assessed when considering AEs for inclusion in the model (see Online Resource 4 for details on the selection criteria). Two serious AEs were common across any two comparators (serious cardiac events and serious gastrointestinal events), and were included in the NMA. Treatment tolerability was considered using data on discontinuation due to AEs and overall discontinuation, and was also included in the NMA (Online Resource 2).

The baseline risk for these events was calculated from the placebo arms of the INPULSIS trials [12] (Table 2). ORs for nintedanib, pirfenidone and N-acetylcysteine versus placebo were obtained from the NMA (Table 1). The list of included AEs was reviewed by a clinical expert and was supplemented with other clinically important AEs identified for nintedanib and pirfenidone (Table 3).

Table 2 Incidence and risk of serious adverse events in the placebo arm of the INPULSIS trials (n = 423) [12]
Table 3 Incidence and risk of clinically important adverse events for the nintedanib [12, 53] and pirfenidone [26] arms of the model

Regarding discontinuation, both nintedanib and pirfenidone are novel treatments with limited real-world evidence. We analysed data from the INPULSIS trials [12] to determine a baseline risk (placebo arm: 5.5% per cycle) and used the NMA to reflect the relative tolerability of the active comparators (Table 1). We assumed that BSC is the minimum care patients would receive and therefore there would be no discontinuation.

2.3 Health-Related Quality of Life Inputs

An analysis of patient-level data from the INPULSIS trials [12] provided EQ-5D evidence on categories by FVC %pred status (Table 4). This served as the baseline utility dependent on the patient condition. A separate analysis of data from the INPULSIS trials provided estimates for utility decrements for acute exacerbations and serious gastrointestinal events [12]. Disutility estimates for serious cardiac events, skin disorders and gastrointestinal perforation were obtained from a retrospective analysis of a UK database [32].

Table 4 Health-related quality of life and cost inputs for the model

2.4 Cost Inputs

The cost inputs considered in the cost-effectiveness analysis were drug acquisition, treatment-related AEs, monitoring tests (liver panel tests), background follow-up, oxygen use, acute exacerbation costs and end-of-life (EoL) palliative care costs. The cost inputs were synthesised using unit cost information from the UK [33–35].

The list price of nintedanib was assumed at parity with the published list price of pirfenidone in the UK, i.e. £71.7 per day [34] when pirfenidone was administered at a dose of 2403 mg/day [36]. The assumed nintedanib dose was 300 mg/day (150 mg twice daily) [12], with no dose reduction allowed. Due to the likely overlap of background follow-up and BSC, it was assumed that a similar level of pharmacological costs would apply to active treatments and control (placebo arm of the INPULSIS trials [12]). We assumed the N-acetylcysteine cost per mg was £0.001125 [34] and the recommended dose was 600 mg three times daily. The model allowed dose escalation for the N-acetylcysteine arm up to 3143.52 mg/day from week 39 onwards, as described in the PANTHER-IPF (Prednisone, Azathioprine, and N-acetylcysteine: a study THat Evaluates Response in Idiopathic Pulmonary Fibrosis) trial [25]. AE-related costs were obtained from the National Health Service (NHS) reference costs for 2012/2013 [35].
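The drug acquisition inputs above translate into per-cycle costs with simple arithmetic. The sketch below assumes a cycle of 365.25/4 days, which is a convention chosen for this illustration and is not stated in the text.

```python
DAYS_PER_CYCLE = 365.25 / 4  # assumed length of a 3-month cycle

ANTIFIBROTIC_COST_PER_DAY = 71.7  # GBP, pirfenidone list price; nintedanib at parity
NAC_COST_PER_MG = 0.001125  # GBP
NAC_STANDARD_DOSE_MG = 3 * 600  # 600 mg three times daily
NAC_ESCALATED_DOSE_MG = 3143.52  # from week 39 onwards


def cycle_cost(cost_per_day, days=DAYS_PER_CYCLE):
    """Drug acquisition cost for one model cycle."""
    return cost_per_day * days


nac_standard_per_day = NAC_STANDARD_DOSE_MG * NAC_COST_PER_MG
nac_escalated_per_day = NAC_ESCALATED_DOSE_MG * NAC_COST_PER_MG

if __name__ == "__main__":
    print(f"antifibrotic per cycle: GBP {cycle_cost(ANTIFIBROTIC_COST_PER_DAY):.2f}")
    print(f"NAC standard per cycle:  GBP {cycle_cost(nac_standard_per_day):.2f}")
    print(f"NAC escalated per cycle: GBP {cycle_cost(nac_escalated_per_day):.2f}")
```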

The costs of background follow-up and acute exacerbations were compiled using patient-level data from the INPULSIS trials, which recorded resource use related to hospitalisation, emergency room visits and medical procedures, and UK unit costs (Table 4). A detailed description of the background follow-up and acute exacerbation cost calculation is available in Online Resource 5.

Some patients on nintedanib and pirfenidone had elevated hepatic enzyme values [36, 37]. Liver panel tests were assumed to be routinely performed on patients receiving nintedanib and pirfenidone. The cost of a liver panel test was estimated at £3.01 [35], and was assumed to be incurred by all patients on active treatment at a quarterly frequency (i.e. every cycle). The model assumed that patients who dropped below a level of FVC %pred of 80% would require oxygen supplementation [1], assumed to cost £418 per cycle [38] (value inflated from 2010/2011 to 2012/2013 using the most recent inflation indices at the time of the analysis [33]). The model assumed that patients receive EoL palliative care in the last year of life, costing £3921 per cycle [39] (value inflated from 2007/2008 to 2012/2013 [33]). All model inputs and assumptions were validated by clinicians (co-author T.M. Maher and advisor G. Jenkins).
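The monitoring, oxygen and EoL rules in this paragraph can be combined into a single per-cycle cost function, as sketched below. The function and argument names are hypothetical, and background follow-up, AE and acute exacerbation costs are deliberately left out because their values are reported elsewhere (Table 4).

```python
LIVER_PANEL_COST = 3.01  # GBP per cycle, active treatment only
OXYGEN_COST = 418.0  # GBP per cycle when FVC %pred is below the threshold
EOL_COST = 3921.0  # GBP per cycle during the last year of life
OXYGEN_FVC_THRESHOLD = 80.0  # FVC %pred


def monitoring_and_support_cost(fvc_pred, on_active_treatment, in_last_year_of_life):
    """Per-cycle cost of liver tests, oxygen and end-of-life care."""
    cost = 0.0
    if on_active_treatment:
        cost += LIVER_PANEL_COST
    if fvc_pred < OXYGEN_FVC_THRESHOLD:
        cost += OXYGEN_COST
    if in_last_year_of_life:
        cost += EOL_COST
    return cost


if __name__ == "__main__":
    print(monitoring_and_support_cost(85.0, True, False))
    print(monitoring_and_support_cost(70.0, True, True))
```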

2.5 Analysis

The cost-effectiveness of nintedanib compared with pirfenidone, BSC and N-acetylcysteine was estimated with the incremental cost-effectiveness ratio (ICER), which synthesises quality-adjusted life-years (QALYs) and healthcare costs. A comparison with triple therapy was not considered, since after the recent results of PANTHER-IPF [4] clinicians were urged to avoid it due to the excess number of deaths, hospitalisations and serious AEs [40].
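For readers less familiar with the ICER, the sketch below shows the calculation together with the dominance cases referred to in the Results. The inputs in the usage example are hypothetical round numbers, not model outputs.

```python
def compare(cost_new, qaly_new, cost_ref, qaly_ref):
    """Classify a pairwise comparison and return (label, icer_or_None)."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost == 0 and d_qaly == 0:
        return ("equivalent", None)
    if d_cost <= 0 and d_qaly >= 0:
        return ("dominant", None)  # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return ("dominated", None)  # no less costly and no more effective
    # Remaining cases: more costly and more effective, or cheaper and less effective.
    return ("icer", d_cost / d_qaly)


if __name__ == "__main__":
    # Hypothetical totals: (cost in GBP, QALYs) for a new strategy vs a reference.
    print(compare(80_000.0, 3.5, 20_000.0, 3.0))
    print(compare(19_000.0, 3.1, 20_000.0, 3.0))
```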

The base-case analysis was based on INPULSIS patient characteristics (see Online Resource 1). The analysis was conducted from a UK NHS and Personal Social Services perspective. Costs and QALYs were discounted at the standard annual rate of 3.5% [41] and half-cycle correction was incorporated. Outcomes and transitions were estimated over the cohort lifetime and were evaluated every 3 months, consistent with the duration between observations in the clinical trials used to estimate baseline transition probabilities [12].
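Discounting and half-cycle correction over 3-month cycles can be sketched as follows. The sketch assumes that discounting is applied at the start of each cycle and that half-cycle correction is implemented by averaging cohort membership at the start and end of each cycle; the text does not specify either implementation detail, and the trace in the usage example is hypothetical.

```python
ANNUAL_DISCOUNT_RATE = 0.035
CYCLE_YEARS = 0.25


def discount_factor(cycle_index, rate=ANNUAL_DISCOUNT_RATE, cycle_years=CYCLE_YEARS):
    """Discount factor at the start of a cycle (cycle 0 is undiscounted)."""
    return 1.0 / (1.0 + rate) ** (cycle_index * cycle_years)


def discounted_qalys(alive_trace, utility, rate=ANNUAL_DISCOUNT_RATE,
                     cycle_years=CYCLE_YEARS):
    """Discounted QALYs from a trace of the proportion alive at cycle boundaries."""
    total = 0.0
    for i in range(len(alive_trace) - 1):
        # Half-cycle correction: average membership over the cycle.
        corrected = 0.5 * (alive_trace[i] + alive_trace[i + 1])
        total += corrected * utility * cycle_years * discount_factor(i, rate, cycle_years)
    return total


if __name__ == "__main__":
    # Hypothetical trace: 5% of those alive die in each cycle, over 10 years.
    trace = [0.95 ** i for i in range(41)]
    print(f"{discounted_qalys(trace, utility=0.75):.3f} QALYs")
```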

We also conducted a subgroup analysis of a population with an “increased risk of progression” as defined in the ASCEND (Assessment of Pirfenidone to Confirm Efficacy and Safety in Idiopathic Pulmonary Fibrosis) clinical trial [29]. Here the survival analysis and individual patient data analysis of the INPULSIS population were restricted to mirror as much as possible the ASCEND selection criteria: IPF diagnosed at least 0.5 years before visit 2, FVC 50–90% predicted, and forced expiratory volume in 1 s (FEV1)/FVC ≥0.8. Table 5 reports the differences between the base-case and the subgroup analysis.

Table 5 Differences between the base-case and the subgroup analysis for the ASCEND population

Extensive one-way sensitivity analysis and probabilistic sensitivity analysis (1000 samples) were performed. Details on the PSA parameters and distributions are presented in Online Resource 6. External validation of the model assumptions by leading UK clinical experts, and internal validation of the OS, acute exacerbation and FVC %pred distribution are presented in Online Resource 7.
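As an indication of how a probabilistic input might be sampled, the sketch below draws 1000 odds ratios from a log-normal distribution parameterised from a point estimate and its 95% CI. The choice of a log-normal distribution is a common convention assumed here; the distributions actually used are documented in Online Resource 6 and are not reproduced in this section. The example values are the nintedanib acute exacerbation OR quoted in the Results.

```python
import math
import random
import statistics


def lognormal_params_from_ci(point, lower, upper, z=1.96):
    """Log-scale mean and standard deviation implied by an OR and its 95% CI."""
    mu = math.log(point)
    sigma = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return mu, sigma


def sample_odds_ratios(point, lower, upper, n_samples=1000, seed=0):
    """Draw odds ratios from the assumed log-normal distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    mu, sigma = lognormal_params_from_ci(point, lower, upper)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_samples)]


if __name__ == "__main__":
    draws = sample_odds_ratios(0.56, 0.35, 0.89)
    print(f"median of draws: {statistics.median(draws):.3f}")
```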

Internal model verification was conducted by the model developers. The same cost-effectiveness model was audited by independent analysts during the NICE and SMC technology appraisals. Extreme value analyses were also conducted to stress test the model results. The executable file of the model was made available to the journal for peer review.

3 Results

The base-case deterministic results showed that nintedanib dominated pirfenidone, with lower costs and more QALYs gained. This trend was attributed to the modelled acute exacerbation events, which were fewer in patients treated with nintedanib than in patients treated with pirfenidone. The NMA results showed that the OR for acute exacerbations versus placebo was 0.56 (95% confidence interval [CI] 0.35–0.89; statistically significant) for nintedanib and 1.10 (95% CI 0.43–2.85; not statistically significant) for pirfenidone (Table 1).

Compared with BSC, the ICER for nintedanib and pirfenidone was over £100,000 per QALY gained (£145,310 per QALY gained for nintedanib and £172,198 per QALY gained for pirfenidone), due to the high incremental cost difference between the active treatments and BSC (approximately £60,000; base year of currency values 2014) (Table 6). N-Acetylcysteine was dominated by BSC, with higher total costs and fewer QALYs gained; since it was an inferior therapy, results are not shown for N-acetylcysteine. For both nintedanib and pirfenidone, the increase in costs was due to the drug acquisition costs. For nintedanib, drug acquisition costs were 74% of the total value, while background follow-up and oxygen use accounted for 13% and EoL palliative care costs for 11% (Table 6; percentages not shown). Due to their low frequency, acute exacerbations accounted for only 1% of the total costs. Finally, AE-related costs and liver panel tests accounted for less than 1% of the nintedanib costs. Note that these results are based on the list prices of nintedanib and pirfenidone.

Table 6 Incremental cost-effectiveness ratios for pirfenidone and nintedanib versus best supportive care

A series of sensitivity analyses (14 scenarios) were performed on the range of 95% CIs of the main model parameters; the results are shown as a tornado diagram for nintedanib versus BSC (Fig. 3). Additional analyses (36 scenarios) were undertaken on model inputs to test model assumptions and values used, as well as structural uncertainty (see Online Resource 8). The analysis versus BSC was sensitive to the mortality probabilities and assumptions. The nintedanib versus pirfenidone comparison was sensitive to the acute exacerbation parameters; results ranged from nintedanib being dominant (nintedanib cost less and was more effective than pirfenidone) to having ICER values of over £100,000 per QALY gained.

Fig. 3 Tornado diagram for nintedanib vs. best supportive care

The result of the probabilistic sensitivity analysis (1000 samples) is presented in Fig. 4. The scatter plot indicates that nintedanib and pirfenidone are broadly equivalent, with samples for both treatments overlapping. The cost-effectiveness acceptability curve is presented in Fig. 5 and shows that nintedanib dominates pirfenidone at any threshold level.

Fig. 4 Cost-effectiveness scatter plot. BSC best supportive care, PSA probabilistic sensitivity analysis, QALYs quality-adjusted life-years

Fig. 5 Multiple cost-effectiveness acceptability curve. BSC best supportive care
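A multiple cost-effectiveness acceptability curve is usually derived from PSA draws by counting, at each willingness-to-pay threshold, how often each strategy has the highest net monetary benefit. The sketch below shows that calculation on two hypothetical draws; the data structure and strategy labels are illustrative and the numbers are not model outputs.

```python
def net_monetary_benefit(cost, qalys, threshold):
    """Net monetary benefit at a given willingness-to-pay threshold."""
    return threshold * qalys - cost


def probability_cost_effective(draws, threshold):
    """Share of PSA draws in which each strategy has the highest NMB.

    draws: list of dicts mapping strategy name -> (cost, qalys) for one draw.
    """
    wins = {name: 0 for name in draws[0]}
    for draw in draws:
        best = max(draw, key=lambda name: net_monetary_benefit(*draw[name], threshold))
        wins[best] += 1
    return {name: count / len(draws) for name, count in wins.items()}


if __name__ == "__main__":
    # Two hypothetical draws for two hypothetical strategies.
    toy_draws = [
        {"A": (1000.0, 1.00), "B": (0.0, 0.90)},
        {"A": (1000.0, 1.00), "B": (0.0, 0.99)},
    ]
    for threshold in (0, 20_000, 30_000):
        print(threshold, probability_cost_effective(toy_draws, threshold))
```

Repeating the calculation over a grid of thresholds gives one curve per strategy.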

4 Discussion

The overall structure of the model used in this analysis has similarities with the 2013 NICE clinical guideline model comparing the cost effectiveness of a pulmonary rehabilitation course to a strategy offering no pulmonary rehabilitation in IPF patients [1]. Similar to the NICE model, our analysis reflected disease progression as a change in FVC %pred and also considered the impact of acute exacerbations; one difference was the range of FVC %pred category considered (1% in the NICE model vs. 10% in this model).

In a different study, Loveman et al. [42] used IPF “unprogressed” and “progressed” (decline of 10% in FVC %pred) health states. Although this approach simplifies the disease progression model input, it assumed that a Markov health state is defined by a change in the cohort condition (a drop in FVC %pred). A change in condition is typically included in a Markov model as a health state transition and not as a health state per se. The consequence of using a change as a health state is that, depending on the disease variation of the cohort at the start of the model, the resulting Markov states are heterogeneous and not mutually exclusive. Nevertheless, despite the structural differences and distinct assumptions (e.g. the nintedanib price), the results of the Loveman et al. [42] analysis are similar to ours; nintedanib dominates pirfenidone in the deterministic analysis, both treatments have ICERs over £100,000 per QALY gained versus BSC, and treatment effect parameters are the strongest drivers of the model results (in particular, OS).

A recent economic evaluation, conducted alongside an RCT, reported on the cost utility of antibiotic medicines in IPF treatment in the UK [43]. The study used 12-month data from the RCT [44] to produce costs and utility estimates for co-trimoxazole and placebo. The study by Wilson et al. [43] estimated the 12-month costs for baseline care (placebo) to be around £1500 per patient (excluding prescription medicines). Our estimates were around £1900–2000 per patient for routine monitoring and oxygen use, management of AEs and acute exacerbations, depending on whether it was active treatment or BSC. The difference may be attributed to our resource use estimates, being based on the results of a multi-country RCT compared to a UK-only trial [43]. The difference may also be random given the sample size in Wilson et al. [43] (65–70 patients) compared with the INPULSIS data (over 1000 patients) [12].

We also note that the Wilson et al. [43] study found increased QALY benefits for the co-trimoxazole strategy compared with placebo (lowest incremental QALY estimates of 0.032 increasing to 0.057). This is an important difference to our findings in which our lifetime incremental QALYs do not exceed the 0.5 mark; crudely, assuming a 5-year average survival for our cohort this means a 0.01 QALY gained per patient per 12 months. As in the case of the cost estimates, the difference may be random, given the small sample size of the study, or it could be attributed to a small survival benefit observed in the co-trimoxazole strategy.

Our economic analysis was based on evidence collected from large international clinical trials and followed the NICE reference case and international guidelines for best practice in economic modelling [41, 45]. Its strengths include the synthesis of EQ-5D and resource use data obtained from the same source that provided the clinical evidence for one of the comparators [12]. The clinical inputs of the economic analysis were based on an NMA conducted after a systematic review of the literature. The NMA results were similar to those of three other studies [42, 46, 47]. The OS parametric extrapolation estimates used in the base-case analysis were validated with external observational data [5, 19].

The model has several limitations. First, due to lack of head-to-head data for nintedanib versus pirfenidone, summary statistics were used to calculate efficacy, safety and tolerability for patients receiving pirfenidone. This introduced practical difficulties in synthesising the evidence and uncertainty around the relative efficacy and safety of the two treatments, which needs to be considered when interpreting the evidence. Second, too few acute exacerbations were recorded to allow statistically robust exploration of their effect on mortality and disease progression. Consequently, the impact of acute exacerbations may be underestimated. Third, since both nintedanib and pirfenidone are new treatments for patients with IPF, there is a lack of evidence on treatment tolerability and discontinuation in real life. In our analysis, discontinuation rates were based on those observed in clinical trials, which are likely to differ from real-world observations. Nevertheless, the model results were robust to changes in the assumptions on treatment discontinuation. Fourth, the model assumed that oxygen supplementation costs are incurred only by IPF patients with an FVC %pred below 80%, which is likely to overestimate the true cost associated with this resource. However, the results were not sensitive to this cost parameter.

Overall, N-acetylcysteine was found to have similar costs to BSC and worse effectiveness, driven by the survival deficit in the NMA comparison. The nintedanib strategy had fewer acute exacerbations and, consequently, fewer costs and more QALYs than pirfenidone, but with considerable uncertainty around the point estimates. In the UK currently, a new healthcare intervention is considered for reimbursement by the NHS if its ICER is below or within £20,000–30,000 per QALY. Given the high incremental cost difference between nintedanib and pirfenidone versus BSC, the ICERs for both drugs were over £100,000 per QALY gained. This situation is frequently encountered with drugs for rare medical conditions, and it has been suggested that considerations of prevalence, budget impact, disease severity and treatment options should be weighed against cost-effectiveness parameters to arrive at a reimbursement price that is fair to both healthcare systems and drug manufacturers [48–50].

NICE has recently published a decision on pirfenidone and nintedanib for reimbursement in the NHS of England and Wales [7, 8]. For both treatments, the committee considered a confidential PAS, and approved them with limited access based on two criteria: (a) initiation limited to individuals with an absolute level of lung function—FVC %pred between 50 and 80%; and (b) discontinuation mandated for individuals exhibiting ≥10% annual decline in FVC %pred. The origin for both these criteria appears to be the appraisal of pirfenidone [8], during which NICE concluded that pirfenidone offered an acceptable cost-effectiveness estimate in the subgroup of patients with an FVC %pred of 80% or less, and that this was the most appropriate population for evaluation. During that appraisal NICE also heard from clinical specialists that there is a “consensus that a decline in FVC of 10% or more from a baseline pre-treatment value represents progressive disease” [8]. Our estimates suggest that the cost-effectiveness conclusion is not very different when considering a subgroup of patients with “increased risk of progression”, when considering patients with 80% or lower FVC %pred or when patients discontinue after progression.

Given the similar efficacy between nintedanib and pirfenidone, clinicians, policy makers and individuals with IPF will need to consider drug costs, pill burden, and safety and tolerability in making treatment choices [40]. Because IPF occurs in an older population who are likely to have co-morbid conditions, treatment administration (i.e. number of pills taken per day) and adverse effects related to each drug may be important considerations for patient satisfaction and treatment adherence. Ultimately, the choice between nintedanib and pirfenidone may depend on a wide range of factors such as patient lifestyle, co-morbidities, ability to tolerate treatment [51], and even personal values, aversion to risk, willingness to take medicines [52] and accuracy of information provided by the clinicians [40].

Future clinical trials and long-term follow-up studies should include survival and protection from acute exacerbations as outcomes, since they represent the ultimate goal of therapy and are likely to have a large impact on cost effectiveness. Prospective studies designed to capture the real-world impact of treatment tolerability and discontinuation are also needed, although our model was not particularly sensitive to changes in discontinuation.

5 Conclusion

Compared with placebo, nintedanib was statistically significantly better in protection from acute exacerbations and in delaying lung function decline. In the same comparison, pirfenidone was significantly better than placebo in lung function decline. All comparators were estimated to have similar projected survival. Based on these efficacy outcomes, over a patient’s lifetime, nintedanib and pirfenidone accrued 0.5 QALYs more than BSC (placebo). Given the high incremental cost difference between nintedanib, pirfenidone and BSC, the ICER was over £100,000 per QALY gained. Nintedanib and pirfenidone were largely equivalent in estimated costs and health-related quality of life benefits in a pairwise comparison, and N-acetylcysteine was dominated by the reference treatment (BSC). The uncertainty around the results was driven mainly by the lack of statistically significant differences in the OS of the active treatments.