Key Points for Decision Makers

Although surrogate endpoints enable faster trials and therefore faster access to treatment, they increase the uncertainty of coverage decisions on health technologies

Our survey shows that many international health technology assessment (HTA) agencies currently lack detailed guidance for the evaluation of health technologies that rely on surrogate endpoint evidence

HTA agencies need to provide more detailed and prescriptive guidelines for the consistent qualification and incorporation of surrogate endpoint evidence in the decision processes where the evidence on patient-relevant endpoints is lacking

Current best knowledge suggests that adequate approaches include evidence hierarchy frameworks, meta-regression analytical techniques and economic modelling methods that explicitly explore the uncertainty in the surrogate-to-final endpoint relationship

1 Background

A key issue in the increasing move towards early access to new and innovative healthcare technologies is the use of surrogate endpoints to support licensing and coverage decisions of such technologies. Within this context [1, 2], a surrogate endpoint is defined as a biomarker (e.g. blood pressure) or an intermediate outcome (e.g. exercise capacity) that can substitute for a final patient-relevant outcome, such as mortality or health-related quality of life [3]. Disease areas with a strong tradition of surrogate endpoints include oncology (e.g. tumour response for overall survival) and cardiovascular disease (e.g. blood pressure for cardiovascular mortality or morbidity). In clinical areas (e.g. dermatology or acute disease) where patient-relevant outcomes are relatively quickly accrued, the need for surrogate endpoints is much lower.

Many regulatory decisions across the world rely on surrogate endpoint evidence. Surrogate endpoints are the primary endpoints in almost half of the studies submitted to the US FDA for marketing approval of medicines [4, 5]. Recently, to inform the development pathways of medicines, the FDA published a list of accepted surrogate endpoints and disease areas that were the basis of approval or licensing of a medicinal or a biological product under both the accelerated and the traditional approval pathways [6].

Whilst surrogate endpoints enable faster outcome accrual and therefore shorter clinical trials [7], reliance on such endpoints can be problematic if they fail to fully capture the complete risk–benefit profile of a health technology [8]. Surrogate endpoints have been shown to overestimate intervention effects [9] and, in some cases, lead to increased risk of harm [10, 11].

As the use of surrogate endpoints has become more common in the licensing of new health technologies [12], health technology assessment (HTA) agencies [12, 13] are under increasing pressure to utilise such evidence in their recommendations that inform the coverage and funding of medicines and medical devices. Whether surrogate endpoint evidence is used to interpret clinical effect in the context of insufficient final patient-relevant endpoint information [14] or is translated to a different outcome (such as quality-adjusted life-years [QALYs]) within an economic model [15], there is a need to ensure that the choice of surrogate is adequate. Therefore, it has been recommended that the use of surrogate endpoints be limited to those that have been validated appropriately [1, 12, 16]. Such validation ideally requires (1) experimental evidence that demonstrates (2) an acceptable association between the treatment-induced change in the surrogate endpoint and the treatment-induced change in the final patient-relevant endpoint and (3) a quantification of the treatment-induced change in the final patient-relevant endpoint based on the observed treatment-induced change in the surrogate endpoint [1].
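As an illustration of the second and third validation requirements, the following minimal sketch regresses the treatment effect on a final endpoint against the treatment effect on a surrogate endpoint across trials and reports a trial-level R². The trial effects and sample sizes are invented for illustration only, the function name `weighted_linear_fit` is hypothetical, and weighting by trial size is an assumption; a full validation would also account for the estimation error in each trial's effects (e.g. with a bivariate meta-analytic model).

```python
# Illustrative sketch only: all numbers below are hypothetical, not data from any study.

def weighted_linear_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared), where r_squared is the
    weighted coefficient of determination.
    """
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical log hazard ratios, one pair per randomised trial:
# effect on the surrogate (e.g. progression-free survival) and on the final endpoint.
surrogate_effects = [-0.60, -0.45, -0.30, -0.20, -0.10, 0.05]
final_effects = [-0.35, -0.30, -0.15, -0.12, -0.02, 0.03]
trial_sizes = [400, 250, 600, 150, 300, 200]  # assumed weights

intercept, slope, r2_trial = weighted_linear_fit(
    surrogate_effects, final_effects, trial_sizes
)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, trial-level R2={r2_trial:.3f}")

# Requirement (3): predicted effect on the final endpoint for a new trial
# that observed a surrogate log hazard ratio of -0.40 (hypothetical value).
predicted_final_effect = intercept + slope * (-0.40)
print(f"predicted final-endpoint effect={predicted_final_effect:.3f}")
```

The slope and intercept quantify how an observed effect on the surrogate would translate into an expected effect on the final endpoint, while the trial-level R² summarises the strength of the association.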

In 2009, Velasco Garrido and Mangiapane [17] published a survey of methodological guidance across international HTA agencies. Although 20 of 34 methods guidelines were reported to include surrogate endpoints, the depth and breadth varied considerably between documents. The authors concluded that “the role of surrogate outcomes in HTA is very limited”, with many agencies accepting surrogate endpoint evidence, in the absence of definitive final endpoint data, only in exceptional cases and only when the validity of the surrogate endpoint has been proven. However, few agencies provided details on how such ‘validity’ would actually be assessed.

Given recent developments in accelerated and adaptive licensing pathways, this study undertook an updated survey to gain a contemporary picture of methods for the handling of surrogate endpoints by international HTA agencies. As this study was conducted within the European Union-funded COMED (Pushing the boundaries of Cost and Outcome analysis of Medical Technologies) project [18], we also sought to assess whether these methods for handling surrogate endpoints included specific provision for medical device technologies.

2 Methods

We sought to identify recommendations on approaching surrogate endpoint evidence in HTA as reflected by current public guidelines and technical documents from relevant HTA bodies.

2.1 Identification of Health Technology Assessment (HTA) Agencies

We updated the listing of European HTA agencies from the previous 2009 survey of surrogate endpoints [17] to include all organisations currently listed as members of three major HTA networks (as of March 2018): Health Technology Assessment International, the European network for Health Technology Assessment (EUnetHTA) and the International Network of Agencies for Health Technology Assessment. We included all HTA agencies except patient organisations, organisations whose members/stakeholders were the industry, and university centres, hospitals and professional organisations that were only involved in the production of HTA reports and not in policy guidance or methods development. Additionally, we included Australian and Canadian HTA agencies as they have been established for many years and therefore reflect ‘mature HTA settings’, i.e. the Australian Pharmaceutical Benefits Advisory Committee (PBAC), the Australian Medical Services Advisory Committee (MSAC) and the Canadian Agency for Drugs and Technologies in Health (CADTH). For each HTA agency, we checked for publicly available methodological guidance (in any language) either as guidelines or as methodological advisory documents (such as those of the UK National Institute for Health and Care Excellence [NICE] Decision Support Unit [19]). Agencies without available methods guidance were excluded.

2.2 Document Review and Data Extraction

We assessed the availability and level of detail of the guidance on the use of surrogate endpoint evidence in HTA processes that was provided by the included HTA organisations. Assessment included (1) terminology (including definitions) on the use of surrogate endpoints, (2) methods of surrogate validation and (3) methodological practices recommended in guidance documents.

2.3 Stage 1: Identification of HTA Agency Methods Guidance on the Use of Surrogate Endpoints

Websites of all included HTA agencies were screened using a combination of search terms (HTA, guidelines, methods, resources, publications, surrogate, intermediate and endpoints) to identify methods guidance availability and relevant methods documents. This was supplemented by hand searching of the relevant link categories on the websites. Where necessary, agencies were also contacted directly to enquire about relevant documents. For each included agency, the following data were extracted: (1) name of agency and country, (2) name and website location of the methods document, (3) language of the guideline, (4) text detailing use of surrogate endpoints (including location within the document and any citations referenced) and (5) assessment of whether the guidance was specific to pharmaceuticals or medical devices or both and to certain disease areas (e.g. cancer). A data extraction form was developed and piloted by two authors (BG and OC) on a sample including documents in English, French and Italian to test the feasibility of the process and to ensure that captured data were appropriate and sufficient for the study’s objective. The revised extraction form was then used by a single reviewer with the relevant language skills for each agency (OC, CF, BG, MM, SR, FD, KS, SdG, AZ) between April and July 2018. A random sample of 20 documents was then checked by a second reviewer (OC, BG, SR, SdG). For the purposes of presentation in this report, all text was translated into English.

2.4 Stage 2: Detailed Analysis of Surrogate Methodological Advice

For each agency identified in stage 1 as including advice on the use of surrogate endpoints in their methods guidance, a more detailed data analysis framework was applied (see Table 1).

Table 1 Domains used to extract data from methods documents in stage 2

2.5 Data Analysis and Presentation

The findings of this survey are presented descriptively and in detailed summary results tables.

3 Results

3.1 Selection of HTA Agencies

A total of 73 HTA agencies met the inclusion criteria (see Fig. 1; Table 2); 29 were excluded because they had no published methodological guidance. Of the remaining 44 agencies, 29 (66%) included consideration of the handling of surrogate endpoints in their methods guidance. These 29 agencies included 18 European countries (Austria, Belgium, Bulgaria, Croatia, Germany, Spain, France, Germany, Hungary, Ireland, Italy, Netherlands, Norway, Poland, Portugal, Sweden, Slovakia, United Kingdom), the EUnetHTA network of agencies, and the agencies of PBAC, MSAC and CADTH. In total, 45 methodological guidance documents outlining the use of surrogate endpoints were included for analysis. Sources of these documents are presented in Table S1 in the electronic supplementary material (ESM).

Fig. 1 Summary of agencies and documents selection. HTA health technology assessment

Table 2 Included HTA agencies and summary of the availability of methodological guidance

3.2 Consideration of Surrogate Endpoints

The extent to which methodological guidelines provided specific consideration on the use of surrogate endpoints varied greatly between agencies. The guidance documents of three (10%) HTA agencies (the Agency for Quality and Accreditation in Health Care and Social Welfare in Croatia (AAZ), the Galician Agency for Health Technology Assessment in Spain and the Norwegian Institute of Public Health) only mentioned surrogate endpoints in general terms and provided no specific methods guidance on their use.

Reflective of the collaborative partnership in the EUnetHTA project, methods guidance of many agencies was based on the guidance on surrogate endpoint methods published by the EUnetHTA in November 2015 [20]. Table 3 provides a summary of key aspects of the EUnetHTA guidance. Whilst the EUnetHTA guidelines state a preference for using final patient-relevant outcomes rather than surrogate outcomes, they also recognise the need to use surrogate/intermediate outcomes. For example, when evidence of the direct effect of the intervention on patient-relevant outcomes (such as mortality or health-related quality of life) is not available, the EUnetHTA guidelines propose criteria for acceptability of a surrogate endpoint: (1) a biological/clinical plausibility for the endpoint, (2) evidence of an association with the final patient-relevant endpoint and (3) consideration of wider risk–benefit and/or public health implications [21].

Table 3 Overview of EUnetHTA guidelines for surrogate endpoints

3.2.1 Definition for Surrogate Endpoints

In total, 13 methods documents (29%) provided explicit definitions for surrogate endpoints, many of which were consistent with the EUnetHTA guideline definition, “biomarkers and intermediate endpoints” that can “substitute for a clinically meaningful (final) endpoint”. Guidelines from PBAC [22], MSAC [23] and CADTH [24] use similar definitions. For instance, surrogate outcomes are considered by CADTH as “a subset of intermediate outcomes” and are defined as “a laboratory measurement or a physical sign used as a substitute for a clinically meaningful end point that measures directly how a patient feels, functions, or survives” [24]. In its glossary of terms, PBAC defines a surrogate outcome as “a variable that is suspected, but not necessarily demonstrated, to occur on the causal pathway from a clinical management or factor to the clinically relevant final outcome” and recommends the justification and validation of any surrogate outcome used in the analysis [22].

3.2.2 Example of Surrogate Endpoints

In total, 18 documents (40%) provided specific examples of surrogate endpoints (e.g. “Blood pressure as a surrogate endpoint for cardiovascular mortality; bone mineral density as a surrogate for bone fracture; HIV1-RNA viral load as an indicator of viral suppression”, Health Information and Quality Authority, Ireland). A support document from the German Institute for Medical Documentation and Information (DIMDI) [25] also provides examples where surrogate endpoints have been proven not to be good surrogates (e.g. increased bone density following treatment of osteoporosis with sodium fluoride did not result in an observed decrease in fractures). A total of 44 documents (98%) included some consideration of the use of surrogates in the analysis (e.g. “only use a surrogate outcome if it has a well-established link (i.e., validated) with one of (final) outcomes”, CADTH). While some guidelines seemed to implicitly consider the surrogate endpoints in a cost-effectiveness context (NICE, PBAC, CADTH), most did not seem to differentiate the interpretation of surrogate endpoints according to the domain (e.g. clinical efficacy, cost effectiveness, etc.). Only four guidelines (from the Austrian Ludwig Boltzmann Institute for Health Technology Assessment, the AAZ, the Polish Agency for Health Technology Assessment and Tariff System (AOTMiT) and CADTH) mentioned the use of surrogate outcomes for safety.

3.2.3 Acceptability of Surrogate Endpoints

In total, 26 guidelines (52%) provided discussion on the acceptability of surrogate endpoints [e.g. “If there is data that validates a surrogate, then these will be assessed in terms of their relevancy and their credibility”, German Institute for Quality and Efficiency in Health Care (IQWiG)]. Nine (18%) clearly refer to the association between surrogate endpoint and final outcome.

3.3 Detailed Methodological Guidance on Surrogate Endpoints

In addition to the EUnetHTA guidelines, seven (15%) HTA agencies had methods guidance that included detailed methodological consideration of surrogate endpoints: IQWiG (two documents), NICE, AOTMiT, the Portuguese National Authority of Medicines and Health Products (INFARMED), PBAC, MSAC and CADTH. These documents included recommendations of methods to be used for the validation of surrogate endpoints and, in two cases, cut-offs for the acceptance of surrogates according to their validation.

3.3.1 Methods for Validation of Surrogate Endpoints

Specific methods recommendations are listed in Table 4. EUnetHTA [20] and IQWiG [26] guidelines are the most detailed and prescriptive European guidelines, providing suggestions of methods for the validation of surrogate outcomes and defining necessary correlation levels for the association between surrogate and clinically relevant outcomes [27]. In contrast, NICE technology appraisal guidelines [28] focus on the decision uncertainty associated with the evidence and how this is reflected in the economic modelling of a technology, and recommend that “in all cases, the uncertainty associated with the relationship between the end point and health-related quality of life or survival should be explored and quantified” [29]. PBAC guidance [22] contains a supplementary appendix that outlines a prescriptive four-step approach to validating surrogate endpoints for decision modelling: (1) identify the surrogate endpoints and the corresponding final outcome; (2) establish the biological plausibility of the relationship between the two, and present epidemiological evidence to support it; (3) present randomised trial evidence to support the nature of the relationship; (4) translate the treatment effect on the surrogate endpoints to an estimate of the comparative treatment effect for the final outcome [22].
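The translation step and the exploration of uncertainty described above can be sketched as a simple Monte Carlo simulation of the kind used in probabilistic sensitivity analysis. This is a minimal illustration under stated assumptions, not the method prescribed by any agency: the linear surrogate-to-final relationship, the normal distributions, the independence between parameters, the function names and all input values are hypothetical.

```python
# Illustrative sketch only: every input value below is hypothetical.
import random
import statistics

def simulate_final_effect(surr_mean, surr_se, slope_mean, slope_se,
                          intercept_mean, intercept_se, n_draws=10000, seed=1):
    """Draw predicted treatment effects on the final outcome.

    Each draw samples the observed surrogate effect and the parameters of an
    assumed linear surrogate-to-final relationship from independent normal
    distributions, then combines them as intercept + slope * surrogate effect.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(n_draws):
        surrogate_effect = rng.gauss(surr_mean, surr_se)
        slope = rng.gauss(slope_mean, slope_se)
        intercept = rng.gauss(intercept_mean, intercept_se)
        draws.append(intercept + slope * surrogate_effect)
    return draws

def summarise(draws):
    """Return the mean and an empirical 95% interval of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(0.025 * n)]
    upper = ordered[int(0.975 * n) - 1]
    return statistics.mean(ordered), lower, upper

# Hypothetical inputs: surrogate log hazard ratio of -0.40 (SE 0.10) and a
# surrogate-to-final slope of 0.60 (SE 0.15) with intercept 0.00 (SE 0.03).
draws = simulate_final_effect(-0.40, 0.10, 0.60, 0.15, 0.00, 0.03)
mean_effect, lower, upper = summarise(draws)
print(f"predicted final effect: {mean_effect:.3f} (95% interval {lower:.3f} to {upper:.3f})")
```

In an economic model, each draw would feed one iteration of the probabilistic analysis, so that uncertainty in the surrogate relationship carries through to the cost-effectiveness results.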

Table 4 HTA agencies with detailed methods for the handling of surrogate endpoints

3.3.2 Specific Guidance for Disease Areas

In three cases, specific guidance on the use of surrogate endpoints in oncology was available: NICE [30] analysed the suitability of particular surrogate endpoints (such as progression-free survival for overall survival in cancer), and IQWiG [27] provided a detailed discussion on the potential use of surrogate outcomes in oncology. In CADTH guidance, a document dedicated to the evaluation of oncology therapies [31] contained detailed discussion of acceptability of surrogate outcomes according to their correlation with patient outcomes and the treatment intent (curative, adjuvant or palliative).

3.3.3 Specific Guidance for Medical Devices

Of the 45 methods documents analysed, 15 (33%) were exclusively intended for pharmaceuticals, and only three (7%) were intended exclusively for the evaluation of medical devices (NICE Medical Technology Evaluation Programme (MTEP), the State Institute for Drug Control in the Czech Republic and MSAC). Table 5 provides a comparison of the methods guidelines across HTA programmes aimed at evaluating either general health technologies or pharmaceuticals versus those for evaluating medical devices in the UK (NICE technology appraisal vs. MTEP) and Australia (PBAC vs. MSAC). Guidelines for medical devices appeared less specific and did not include any specific methodological recommendations beyond a general need to provide supporting evidence for surrogate endpoints (Table 5).

Table 5 Surrogate endpoint guidance in medical device-specific HTA programmes compared with pharmaceuticals programmes

4 Discussion

Our updated international survey included 74 HTA agencies, of which 29 (39%) had methodological guidance documents that included consideration of surrogate endpoints. Many of the European agencies’ methods guidelines appear to have been revised to reflect the principles of the EUnetHTA guidelines on surrogate endpoints published in 2015 [20]. The EUnetHTA guidelines state a preference for evidence from final patient-relevant outcomes (such as mortality and health-related quality of life) and advise cautious consideration when surrogate endpoints are used, i.e. use of ‘validated’ surrogate endpoints. However, although the EUnetHTA guidelines are a useful development, they do not provide any explicit criteria to establish whether or not a surrogate endpoint is valid. Furthermore, none of the HTA guidelines in our survey included a list of ‘accepted’ surrogate endpoints, i.e. surrogate endpoints for which the future use in an evaluation would not require justification.

We identified only five HTA agencies (IQWiG, DIMDI, NICE, PBAC and CADTH) with guidelines providing specific prescriptive methodological advice on the statistical methods that should be used for the validation and assessment of acceptability of surrogate endpoints. Whilst there was a recognition across these guidelines of the lack of methodological consensus around the level of evidence necessary for the validation of surrogates, consensus was strong on the need for randomised trial data to support the association in the treatment effect between surrogate and final endpoints, including the use of meta-regression analysis methods. However, only an IQWiG document currently discusses numerical values for an acceptable level of association (e.g. trial-level R² > 0.49) [27]. Our results showed little difference in guidance between the use of surrogate endpoints for clinical effectiveness and for incorporation into economic models, with the exception of the NICE technical guidance approach, which focuses on the exploration of uncertainty in the surrogate-to-final-outcome relationship as part of the probabilistic sensitivity analysis. Since our study was conducted, the NICE Decision Support Unit has published another technical document [32]. This report focused on the use of multivariate meta-analytic methods for combining data from multiple correlated outcomes for the purpose of surrogate endpoint evaluation. It suggested that, instead of applying criteria about the correlation, it is important to look at predicted estimates and their uncertainty, because the strength (or weakness) of the surrogate relationship will manifest itself in the width of the predicted interval of the treatment effect on the final outcome.
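The link between the strength of the surrogate relationship and the width of the predicted interval can be shown with a small sketch. It uses ordinary least squares across trials and a normal approximation (multiplier 1.96) for the prediction interval, which is a simplification of the multivariate meta-analytic methods discussed in the technical document; the function name `prediction_interval` and both data sets are hypothetical and chosen only to contrast a strong and a weak association.

```python
# Illustrative sketch only: both data sets are invented for the comparison.
import math

def prediction_interval(x, y, x_new, z=1.96):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (prediction, lower, upper, r_squared) for a new trial with
    surrogate effect x_new, using a normal approximation for the interval.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    r_squared = 1 - residual_ss / syy
    # Standard error for predicting the effect in a single new trial
    se_pred = math.sqrt(residual_var * (1 + 1 / n + (x_new - mean_x) ** 2 / sxx))
    prediction = intercept + slope * x_new
    return prediction, prediction - z * se_pred, prediction + z * se_pred, r_squared

# Hypothetical treatment effects on the surrogate, one per trial
surrogate_effects = [-0.60, -0.45, -0.30, -0.20, -0.10, 0.05]
# Hypothetical treatment effects on the final outcome: strong vs weak association
final_strong = [-0.35, -0.30, -0.15, -0.12, -0.02, 0.03]
final_weak = [-0.10, -0.40, 0.05, -0.30, 0.02, -0.12]

for label, final_effects in (("strong", final_strong), ("weak", final_weak)):
    pred, low, high, r2 = prediction_interval(surrogate_effects, final_effects, -0.40)
    print(f"{label}: R2={r2:.2f}, prediction={pred:.3f}, interval width={high - low:.3f}")
```

With these invented inputs, the weakly associated data set yields a markedly wider interval for the same observed surrogate effect, which is the behaviour the report describes.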

The majority of methodological documents on surrogate endpoints identified in our study were intended to be applied across health technologies (medicine, medical device or others) and across medical conditions. Given that the development and use of surrogate endpoints has become particularly common in oncology [33, 34], NICE, IQWiG and CADTH have published specific support documents for the use of surrogates in this clinical area [27, 30, 31]. Commonly used surrogate endpoints for the final outcome of overall survival in cancer include progression-free survival, disease-free survival and tumour response.

Pharmaceuticals and medical devices traditionally have different regulatory and evidence-generation pathways [35]. Given that various countries/agencies have separate HTA processes for the evaluation of medicines and medical devices, we could compare their methodological approaches to the consideration of surrogate endpoints [36]. The NICE technology appraisal is applicable to all medical technologies, whereas the NICE MTEP specifically considers medical devices and diagnostics. Similarly, in Australia, PBAC assesses pharmaceuticals and MSAC assesses medical devices. The PBAC and MSAC guidance on surrogate endpoints was similar, but we found more of a difference within NICE programmes. The NICE technology appraisal programme was much more detailed and directive in its guidance than the MTEP, reflecting the traditionally greater evidence requirements for medicines than for devices. Whilst it might be expected that the evidence requirements for the use and validation of surrogate endpoints should not necessarily differ between health technologies and across disease areas, we recognise there may be challenges in application. For example, given the current regulatory requirements for specific medical devices, evidence from randomised controlled trials (RCTs), and sometimes even from non-randomised studies, may not be available at the time of HTA appraisal or even afterwards [35, 37]. It is likely that the requirement of ‘several RCTs’ for good surrogate validation studies will never be satisfied for many indications requiring medical device-based procedures. When confronted with this challenge, there is a temptation to extrapolate validated surrogate endpoints from RCTs of medicines (e.g. the use of systolic blood pressure from RCTs of antihypertensive medicines) to medical-device-based therapies (e.g. renal denervation therapy). However, we caution against this approach, given that different modes of action and classes of therapies are known to affect the surrogate-to-final-outcome relationship.

Our study provides a comprehensive contemporary review of methods guidance across international HTA agencies on the use of surrogate endpoints. We explored a larger sample of agencies and documents than did the previous survey [17]. However, available resources (particularly time and linguistic access) limited the inclusion of non-European agencies to those of Australia and Canada. Furthermore, this survey only looked at publicly available documents and not at internal documentation that may be circulated within HTA agencies. As described in Sect. 2, methodological advisory documents [27, 30] were also considered because, in our opinion, they constitute important material to inform and complement methods practice.

5 Conclusion

This updated survey of international HTA agencies demonstrates an increase in the methodological guidance for the use of surrogate endpoints over the last decade, largely based on the adoption of EUnetHTA guidance on surrogates published in 2015. Nevertheless, we found considerable differences in the depth of this guidance, with only a few agencies currently having guidelines that provide detailed methodological advice on the statistical methods and metrics for surrogate validation that are deemed acceptable. Further methodological and policy research in the harmonisation of approaches to surrogate outcomes evidence in healthcare decision making is warranted. The recent EU proposal of joint HTA clinical assessment [38] may provide the opportunity for implementation of a harmonised approach to the validation and handling of surrogate endpoints across Europe. Our study also suggests an almost exclusive consideration of surrogate endpoints from a clinical efficacy/effectiveness perspective. Opportunities therefore remain to further clarify the effective and consistent use of surrogate endpoints in other HTA domains, especially safety and cost effectiveness.