Background

Ulcerative colitis and apheresis

Ulcerative colitis (UC) is a chronic disease of the colonic mucosa that is commonly treated with corticosteroid therapy to achieve clinical remission. Corticosteroids are used empirically in patients with moderate-to-severe UC despite the fact that relapse in patients who initially responded to these drugs is common. In addition, steroid therapy is associated with frequent side effects, especially when used for a long time [1]. In the last few years, a new non-pharmacological treatment, termed apheresis, has been reported to produce similar results to those obtained with corticosteroids in terms of disease remission [2–6].

Apheresis devices lower the elevated blood leukocyte and platelet levels found in active UC resulting from the activation behaviour and increased survival time of these blood components [7]. Leukocytapheresis (LCAP) and granulocytapheresis (GCAP) are the most frequently used apheresis treatments [8], which usually involve five sessions (one per week), although one or two sessions per week can be used for a period of time ranging from five to ten weeks [1, 2, 4, 5, 9, 10]. However, the number of sessions can vary depending on the severity of the disease or the response to corticosteroid treatment, thus making the comparison of different studies somewhat difficult [11]. Hanai et al. reported that patients with severe active UC who were corticosteroid-naïve responded readily to granulocyte-monocyte apheresis (GMA), thereby avoiding steroid therapy [3]. These observations indicate that GMA might be a substitute for corticosteroid treatment in these patients, thereby allowing them to avoid the possible side effects of these drugs.

It has been reported that approximately 20% of patients with UC have a chronic active disease that often requires several courses of systemic steroids to achieve clinical remission. However, this treatment regime is often followed by relapse of symptoms during steroid tapering (continuous reduction of the dosage of corticosteroids once the initial high dosage has produced significant clinical improvement) or soon after their discontinuation.

Multiple studies have suggested that selective apheresis may be effective as a steroid-sparing treatment [12] because the resulting reduction in the peripheral levels of granulocytes and monocytes produced by GCAP might mitigate inflammation and delay relapse during steroid tapering in steroid-dependent patients [1].

Recommendation development

When assessing a health technology, many methodologies have been used to establish recommendations based on existing systematic reviews or other study designs, including SIGN (Scottish Intercollegiate Guidelines Network) [13] and The Oxford Centre for Evidence-Based Medicine [14]. The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach has been developed by an informal collaboration group of guideline developers, clinicians, and methodologists with the aim of developing and disseminating a sensible and transparent approach to grading quality of evidence and strength of recommendations [15–17]. This approach is based on an assessment of other systems, including SIGN, and involves members from numerous international organizations. It was created to assess the quality of evidence and elaborate recommendations in clinical guidelines [15, 18–22]; the application of this methodology would therefore be of interest for health technology assessment (HTA) reports.

Study objective

The objective of the present study was to use GRADE to develop recommendations regarding the use of apheresis devices for the treatment of UC, and to evaluate the strengths, weaknesses, opportunities, and threats found when using GRADE in this context in comparison with those found previously using the SIGN method.

Methods

Definition of the clinical questions

We selected two of the possible questions concerning the use of apheresis devices to treat UC using the PICO model (Patients, Intervention, Comparison, and Outcomes) on the basis of two previously published documents [11, 23]. These questions were as follows:

Question 1: Should apheresis devices be used to treat non-steroid-dependent or non-steroid-refractory UC patients to achieve clinical remission of the disease rather than standard corticosteroid treatment?

Question 2: Should apheresis devices be used as an adjunct treatment with corticosteroids to treat steroid-dependent UC patients with the aim of sparing or withdrawing corticosteroids rather than standard corticosteroid treatment?

Definition and assessment of all relevant outcomes

Five researchers (NI-R, IG-I, RR-I, ML-A and ER-R) defined the outcomes of interest for each question based on prior work concerning the development of a monitoring system for measuring the effectiveness and safety of apheresis devices in UC patients [11]. The outcomes defined for the first question were: clinical remission one month after treatment (defined as Mayo Index ≤ 2) [24]; endoscopic remission one month after treatment (Endoscopic Mayo Subindex ≤ 1); and clinical remission 12 months after treatment. The following variables were defined to evaluate the safety of the treatment: percentage of patients with mild adverse events (those requiring continuing observation but no specific therapy) and percentage of patients with moderate to severe adverse events (with moderate events being defined as those requiring transient therapeutic countermeasures but not interruption of therapy, and severe events as those resulting in sequelae or increased risk of death or requiring discontinuation of UC trial therapy). The outcomes defined for the second question were as follows: percentage of patients who do not require corticosteroids one month after the last apheresis session, mean reduction of corticosteroid dose one month after treatment, clinical remission one month after treatment (Mayo Index ≤ 2 and no corticosteroids), endoscopic remission one month after treatment (endoscopic subindex ≤ 1 and no corticosteroids), improvement of quality of life (as measured by the Inflammatory Bowel Disease Questionnaire, or IBDQ, which is able to distinguish between active UC disease and remission stage), colectomy rate during follow-up, percentage of patients with long-term side effects of both treatments, and clinical remission maintained 12 months after treatment.

Each group member scored all defined outcomes from 1 to 9 (from least to most important). If major differences between individual scores were obtained, the relevance of that particular outcome was discussed to reach consensus. Critical outcomes were defined as those with a final score of between 7 and 9, and important outcomes as those with a final score of between 4 and 6 (See Table 1). Those outcomes scoring less than 4 points were not considered further.
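The scoring thresholds described above can be sketched in a few lines of Python. This is only an illustration of the cut-offs stated in the text; the function name `classify_outcome` and the example scores are hypothetical and do not reproduce the values in Table 1.

```python
def classify_outcome(final_score: int) -> str:
    """Map a consensus score (1 to 9) to the importance category used in the text."""
    if not 1 <= final_score <= 9:
        raise ValueError("score must be between 1 and 9")
    if final_score >= 7:
        return "critical"        # final score of 7 to 9
    if final_score >= 4:
        return "important"       # final score of 4 to 6
    return "not considered"      # final score below 4


# Hypothetical consensus scores, for illustration only (not taken from Table 1).
example_scores = {"outcome A": 9, "outcome B": 5, "outcome C": 2}
categories = {name: classify_outcome(score) for name, score in example_scores.items()}
```

Under these assumptions, only outcomes labelled "critical" or "important" would be carried forward into the evidence profile.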

Table 1 Assessment of the importance of the defined outcomes

Literature search and study selection

A previous systematic review [25] was used to assess the use of apheresis devices in the treatment of UC. This research was updated by searching the following databases (up to May 2008): MEDLINE, Cochrane, EuroScan, INAHTA, ISI, Current Controlled Trials, National Guidelines Clearinghouse, New Zealand Guidelines group, SIGN, Fisterra, Lilacs, GETECCU, and the Cochrane-IBD (Inflammatory Bowel Disease) group. Boolean operators were used to combine free text such as 'inflammatory bowel disease', 'ulcerative colitis', 'Crohn's disease', 'apheresis', 'immunomodulation', 'leukocytapheresis', 'granulocytapheresis' and 'lymphocytapheresis' with controlled vocabulary. The results of this search were refined using the Cochrane Collaboration's search filters to identify, preferably, randomized controlled clinical trials. We included studies if: the effectiveness of apheresis was assessed compared to conventional therapy; the safety of apheresis was evaluated; or the cost-effectiveness of the treatment was analysed. Case series with fewer than ten patients and studies with no control group were excluded [11].

Assessment of the outcomes

The overall quality of the evidence for each outcome was assessed according to the considerations defined by the GRADE approach: study design limitations that may bias the estimates of the treatment effect; inconsistent results due to unexplained heterogeneity; indirectness of evidence because of indirect comparisons or indirect population, intervention, comparator, or outcomes; and imprecision of the included studies due to small sample size, number of events, or wide 95% confidence intervals. When possible, meta-analysis procedures using the RevMan v.5 program were used to pool the data found for the outcomes of interest. The information obtained for each proposed question was summarised using the GRADE profiler (GRADE Pro) v.3.2 program [26].
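The way these considerations lower the quality rating of an outcome can be sketched as follows. The sketch assumes the usual GRADE conventions, which are not spelled out in this section: evidence from randomized trials starts as "high" and observational evidence as "low", and each consideration can downgrade by one level (serious) or two levels (very serious). Only the four considerations listed above are modelled, and the function name `grade_quality` is hypothetical.

```python
from typing import Dict

# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]
CONSIDERATIONS = {"limitations", "inconsistency", "indirectness", "imprecision"}


def grade_quality(randomized: bool, downgrades: Dict[str, int]) -> str:
    """Return the quality level for one outcome after applying downgrades.

    downgrades maps a consideration to 0 (no concern), 1 (serious)
    or 2 (very serious). Assumed convention, not taken from the text.
    """
    unknown = set(downgrades) - CONSIDERATIONS
    if unknown:
        raise ValueError(f"unknown considerations: {sorted(unknown)}")
    if any(value not in (0, 1, 2) for value in downgrades.values()):
        raise ValueError("each downgrade must be 0, 1 or 2")
    start = LEVELS.index("high") if randomized else LEVELS.index("low")
    # The rating cannot fall below "very low".
    return LEVELS[max(0, start - sum(downgrades.values()))]
```

For example, a randomized comparison with serious design limitations and serious imprecision would be rated "low" under these assumptions.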

Agreeing on recommendations

After assessing the evidence found for each outcome, the overall quality for each question was evaluated. The balance between risks and benefits was discussed, and the costs and patient values were taken into consideration when available. Finally, the recommendations provided to the decision makers were graded and defined by the group on the basis of all judgements made.

The overall process was reviewed by one of the members of the GRADE working group (HJS), who supervised and resolved any doubts concerning the methodological aspects of the process. JLC-N, a gastroenterologist and expert in inflammatory bowel disease (IBD) who had previous experience with the assessed treatment, reviewed the PICO questions and possible outcomes for each defined question.

SWOT analysis to evaluate the use of the GRADE approach for assessing new technologies

A SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis was performed to evaluate our use of the GRADE approach to establish recommendations concerning this new technology. Strengths were defined as those attributes of the GRADE approach that were helpful for achieving the objective, and weaknesses as those attributes considered detrimental for this purpose. Opportunities were defined as those external conditions considered helpful for achieving the objective, and threats as those external conditions which could be detrimental to the objective.

The group of researchers that developed this work informally discussed the likely strengths, opportunities, weaknesses, and threats found when using the GRADE approach in this context. HJS did not participate in this activity because of his role in the development of GRADE. An evaluation was performed by each researcher involved (NI-R, IG-I, RR-I, ML-A, ER-R), and all the issues identified were summarized and discussed to develop common themes. Each researcher scored all of the items from 1 (least important) to 9 (most important). Finally, the total and median scores for each issue identified were calculated and used to order these issues by importance.
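The ranking step described above can be illustrated with a short Python sketch. The text does not state how the total and the median were combined, so ordering by total score first and using the median to break ties is an assumption; the issue labels and scores below are hypothetical and are not the values in Table 4.

```python
from statistics import median
from typing import Dict, List, Tuple


def rank_issues(scores_by_issue: Dict[str, List[int]]) -> List[Tuple[str, int, float]]:
    """Return (issue, total, median) rows, most important first.

    Assumed ordering: total score, then median as a tie-breaker.
    """
    rows = [
        (issue, sum(scores), median(scores))
        for issue, scores in scores_by_issue.items()
    ]
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


# Hypothetical scores from five researchers (1 = least, 9 = most important).
example = {
    "hypothetical issue A": [9, 8, 7, 9, 8],
    "hypothetical issue B": [5, 6, 4, 5, 6],
    "hypothetical issue C": [9, 9, 3, 9, 8],
}
ranking = rank_issues(example)
```

Reporting the median alongside the total makes it easier to see when a single divergent score drives the total, as with issue C in this made-up example.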

Results

Results for the first question

'Should apheresis devices be used to treat non-steroid-dependent or non-steroid-refractory UC patients to achieve clinical remission of the disease rather than standard corticosteroid treatment?'

The consensus reached concerning the relative importance of the outcomes defined for this question is presented in Table 1. The controlled clinical studies reported by Nishioka [5], Hanai [4], and Bresci [2] were included. The patients studied by Sands [27] were not defined in terms of the clinical scenarios considered in this study; therefore this trial was not included in the analysis.

Table 2 summarizes the information found for each outcome in a GRADE profile. When two or more studies were found for the same outcome, the data were meta-analyzed (Figure 1A).
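As an illustration of how a dichotomous outcome such as clinical remission can be pooled across studies, the sketch below computes a Mantel-Haenszel fixed-effect risk ratio. The text does not state which pooling model was used, so the choice of method is an assumption, and the function name and any counts passed to it are hypothetical rather than data from the included trials.

```python
from typing import List, Tuple

# One study: (events in treatment arm, size of treatment arm,
#             events in control arm,   size of control arm)
Study = Tuple[int, int, int, int]


def pooled_risk_ratio_mh(studies: List[Study]) -> float:
    """Mantel-Haenszel pooled risk ratio across studies (fixed effect)."""
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_c, n_c in studies:
        total = n_t + n_c
        numerator += events_t * n_c / total
        denominator += events_c * n_t / total
    if denominator == 0:
        raise ValueError("no control-arm events: risk ratio is undefined")
    return numerator / denominator
```

A pooled risk ratio close to 1 would correspond to no apparent difference between the two arms for that outcome; a full analysis would also report a confidence interval and a heterogeneity measure, which this sketch omits.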

Table 2 GRADE evidence Profile for the first clinical question

Figure 1 Meta-analysis performed for the outcomes related to each proposed clinical question.

Some factors, such as the different definitions of clinical remission in the studies selected, complicated the analysis of the results. Although we used the Mayo Clinical Index to define clinical and endoscopic remission, a large number of different indices and definitions can be used for the same purpose [28], as was the case in some of these studies [1, 2, 5]. We found that the Rachmilewitz Endoscopic Index (EI) was most often used to define endoscopic remission, although this outcome was not measured in the included studies. Some of these studies did, however, report the mean EI before and after treatment; we therefore used this outcome as an indirect measure of endoscopic remission (Table 2). We judged this indirectness to be serious enough to merit a further downgrade.

The findings for the first question can be summarised as follows:

Balance between risks and benefits: there appears to be no difference in efficacy between apheresis devices and corticosteroids, both of which induce clinical remission in non-steroid-dependent and non-steroid-refractory patients one month after treatment, although apheresis acts more slowly than corticosteroids in patients who respond to the latter. The incidence of adverse effects with LCAP seems to be significantly lower than with high-dose corticosteroids, and the effects that do occur are generally transient and, in most cases, disappear during or shortly after the LCAP sessions [29]. The adverse effects of a short course of corticosteroids do not appear to be important.

Remarks: the balance between risks and benefits is uncertain, although, in contrast to corticosteroids, apheresis treatment appears to be associated with more benefits than risks. The general quality of the evidence found to answer the clinical question was very low (see Table 2), although this treatment appears to have similar remission rates to corticosteroid therapy. Apheresis devices do not, however, seem to offer a sufficient net benefit over corticosteroids (in patients who respond to them) in this clinical context, given that they are more costly and act more slowly. The cost of an acute course and tapering of prednisone was estimated at 218.3 euros, and that of Adacolumn® treatment at 6,500 euros [30].

In conclusion, in light of the limited adverse effects of a two-month course of corticosteroids and their faster induction of remission and notably lower price, apheresis devices are unlikely to be of greater benefit than corticosteroid treatment in this context.

The panel therefore agreed on the following recommendation:

'For non-steroid-dependent and non-steroid-refractory UC patients, we recommend administration of corticosteroids rather than apheresis devices (GCAP or LCAP); weak treatment recommendation, very low quality of evidence.'

Results for the second question

'Should apheresis devices be used as an adjunct treatment with corticosteroids to treat steroid-dependent UC patients with the aim of sparing or withdrawing corticosteroids rather than standard corticosteroid treatment?'

Table 1 shows the scores for the relative importance for each defined outcome. The studies by Hanai [1] and Sawada [10, 29] were selected for this question. The retrospective study of Jo et al. [31] was excluded because the authors stated that compared groups were probably different (apheresis treatment was more likely to have been applied to patients resistant to or dependent on prednisolone).

Table 3 shows the GRADE profile obtained for the second question. A meta-analysis of the data was performed for the following outcomes: clinical and endoscopic remission, and the reduction of the dose of corticosteroids before and after treatment (Figure 1B).

Table 3 GRADE evidence Profile for the second clinical question.

The length of follow-up, the different indices used to define clinical and endoscopic remission, and the lack of results for some of the selected outcomes complicated the assessment. Nevertheless, no differences in terms of clinical remission when using different protocols have been described in the literature [9, 32].

The findings for the second question can therefore be summarised as follows:

Balance between risks and benefits: Apheresis devices appear to be associated with more benefits than risks. As apheresis could mean that many patients with moderately active UC are spared corticosteroid therapy [1], the apparent risks of apheresis should be compared with the risks of receiving continuous corticosteroid treatment.

Remarks: In this case, the balance between risks and benefits is uncertain and only very low-quality evidence was available to answer the question. Indeed, we were only able to find a single study assessing the cost of moderate-to-severe UC in two scenarios: traditional treatment versus alternative treatment incorporating GCAP [30]. This study showed that the incorporation of GCAP into the therapeutic management of moderate-to-severe steroid-dependent UC patients is cost-effective and results in savings related to the reduction of adverse effects derived from corticosteroid use and a decreased number of surgical interventions. With regard to the patients' values and preferences, we found that some UC patients refused to be treated with corticosteroids [29, 33]. Moreover, a recent study of Crohn's disease patients' preferences found a systematic preference for treatments that, amongst other issues, avoided the need for steroids [34].

Thus, the panel agreed to make the following recommendation:

'We recommend that patients with steroid-dependent UC should be treated with apheresis devices (GCAP or LCAP) together with corticosteroids to help them reduce or withdraw continuous corticosteroid intake (weak treatment recommendation, very low quality of evidence).'

SWOT analysis to evaluate whether the GRADE approach is appropriate for assessing new technologies

The SWOT analysis results regarding the suitability of the GRADE approach for assessing new technologies are presented in Table 4.

Table 4 SWOT analysis results

The most relevant strength found was that the elaboration and grading of the quality of evidence and recommendations starts at the very beginning of the process with the definition and importance rating of the outcomes for the proposed clinical question. The GRADE approach also takes into account the patients' values and avoids the influence of any outcomes reported in the literature. In contrast, application of the GRADE approach was considered to be more time-consuming than other methods such as SIGN because information has to be sought for all defined outcomes. Despite this, we consider that using the GRADE approach in HTA, including new technologies, could be beneficial due to its transparency and systematic methodology.

Discussion

The development of recommendations in healthcare has always been problematic, and many different methods have been used [13, 14]. Over the last two years, our group has been working on the introduction of the GRADE approach in the Spanish context because this approach incorporates the advantages of prior methods and continues to integrate new developments in health research methodology [15, 18–22, 35, 36]. The aim of the current study was to apply this approach in a different context from the development of typical clinical practice guidelines, specifically the assessment of new and emerging health technologies, and for this purpose we chose the case of apheresis devices in UC treatment.

As a limitation of this study, we should note that we did not perform a controlled study comparing the GRADE approach with another method for evaluating the quality of evidence and strength of recommendations, which would have been of interest in order to confirm our results in an objective manner. Nevertheless, to learn from this study and draw conclusions about our experience, we performed a SWOT analysis of the strengths, weaknesses, opportunities, and threats found when using the GRADE approach in this context.

We should also point out that our results may be influenced by our relatively limited experience with using the GRADE approach. Indeed, our interpretation may have been influenced by the impression of the participants at several workshops we have run concerning the correct use of the GRADE approach, who declared it to be a more complicated method, particularly for clinicians, and more time consuming than other systems commonly used to elaborate clinical guidelines [37]. However, using the software and support material provided by GRADE may facilitate the production of evidence profiles and enhance transparency when formulating recommendations, as pointed out in our SWOT analysis (Table 4, opportunity 3).

The inclusion of only one clinical expert could also be a limitation of this study, as having only 'one point of view' could bias our work. However, this study was based on a previous one undertaken in collaboration with four experts in IBD, so the role of the clinical expert in the current study was simply to resolve any doubts that may have arisen related to this disease. For a future controlled trial, it would be advisable to include more clinical experts to cover possible different points of view.

Another limitation of this study, which is not specific to GRADE, concerns the difficulties encountered in finding data for some relevant outcomes that were not measured or reported. This was the case for the outcome 'improvement of UC patients' quality of life', for which the IBDQ questionnaire is frequently used. This outcome was defined as critical for the second question, although it was not measured in any of the included studies. Similarly, despite recent reports showing that GMA seems to be effective long-term [38, 39], no direct data were available for the outcome 'clinical remission after 12 months follow-up'. A similar situation was found for the definition of clinical remission in steroid-dependency, with some experts considering that this should be accompanied by complete withdrawal of steroids [33, 38, 40, 41]. Our inability to locate these data made the assessment of the evidence more challenging. However, in the case of new technologies, the conclusions obtained upon application of the GRADE approach should help to ensure the correct definition of the outcomes of interest, which should then be evaluated in future research (Table 4, opportunity 2).

With regard to the strengths of this study, previous work by GETECCU (The Spanish Group for the study of Crohn's disease and UC) group members facilitated the definition of the outcomes of interest, which could facilitate the acceptance of final recommendations by clinicians (Table 4, strength 6). A qualitative study performed after a training course concerning the GRADE approach in Spain found that this approach was perceived to be more sensitive to the issues faced by professionals in practice [37] because the relevant outcomes are defined taking into account those outcomes considered to be important by both professionals and patients rather than on the basis of literature findings (Table 4, strength 2). As a consequence, the elaboration of recommendations starts at the very onset of the process on the basis of patients' values and important outcomes (Table 4, strength 1). We also attempted to take patients' values and preferences into account, which is a key strength of this method (Table 4, strength 3).

With regard to the clinical questions selected, the literature studies found indicate that selective leukocyte apheresis effectively removes activated granulocytes and monocytes/macrophages from the peripheral blood of UC patients while maintaining an excellent safety profile [42]. Indeed, some studies have proposed the use of apheresis devices as a first-line treatment for UC patients rather than corticosteroid therapy [3], and others have produced evidence regarding the efficacy of selective apheresis as a steroid-sparing treatment [12], which explains why these particular clinical questions were formulated. Other questions related to the use of apheresis devices in the treatment of UC could be proposed, such as the possible use of apheresis treatment for paediatric patients or patients with toxicity to corticosteroids.

The most challenging part of this study was the assessment of the evidence found for each outcome, partly due to the context of the disease and the characteristics of the new technology being assessed (Table 4, threat 1). Whereas the SWOT analysis suggested that the method was time consuming (Table 4, weakness 1) and required some academic training (Table 4, weakness 3), both of which could be considered a limitation for its use (Table 4, threat 2), evidence assessment is, in general, complicated irrespective of whether GRADE or other methods are used. It is therefore possible that other methods could be more time consuming if they are expected to produce similarly transparent results. Moreover, the GRADE approach offers the possibility of making explicit judgements about the consistency, indirectness, and precision of the results, which is considered to be beneficial when applied to new and emerging technologies (Table 4, opportunity 1 and strength 4).

We considered that the overall quality of the evidence for each question should be based on the critical outcome with the lowest quality of evidence. In our case, this quality was very low for both questions. As we have stated in the SWOT analysis, the GRADE approach judges the relative importance of different outcomes and their trade-offs, as well as the quality of evidence, explicitly rather than implicitly [35], which in our opinion facilitates the discussion and clarification of these judgements.
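The rule stated above, that the overall quality for a question is set by the critical outcome with the lowest quality, can be sketched as follows. The function name `overall_quality` and the example outcome ratings are hypothetical and are not taken from Tables 2 or 3.

```python
from typing import List, Tuple

# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def overall_quality(outcomes: List[Tuple[str, str]]) -> str:
    """Return the lowest quality level among the critical outcomes.

    outcomes is a list of (importance, quality) pairs; outcomes that are
    not rated "critical" do not affect the overall rating.
    """
    critical = [quality for importance, quality in outcomes if importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


# Hypothetical ratings, for illustration only.
example = [("critical", "low"), ("critical", "very low"), ("important", "moderate")]
result = overall_quality(example)
```

In this made-up example the result is "very low", because one critical outcome carries that rating even though the others are rated higher.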

As we have mentioned before, although we consider that the information obtained from the SWOT analysis concerning the feasibility of using the GRADE approach in this context is useful, we also think that a controlled trial should be designed to study whether the recommendations made differ when using different methodologies for this purpose. This would give more detailed information regarding the utility of the GRADE approach in this context.

Summary

Our study suggests that the GRADE approach could be an appropriate means of making the recommendation-formulation stage a more transparent part of the overall process of producing HTA reports. Such reports are especially relevant in the case of new technologies, although we expect that most such assessments would lead to weak recommendations due to the lack of information that accompanies the introduction of new health technologies. However, we also consider that this approach would help to determine what future research should take into account when new technologies are assessed. Furthermore, more studies should be conducted to develop the best approaches to making recommendations about new health technologies.