Background

Although systematic reviews are a valuable tool in the synthesis of evidence, they should be interpreted with caution [1]. The sharp rise in the number of systematic reviews published over the past decades has led to a concomitant increase in discordant results and conclusions between reviews on the same research question [2–5]. This has caused disputes between researchers and created difficulties for decision-makers in selecting appropriate health care interventions. Among other things, discordance between reviews may be caused by differences in primary study selection [6] due to variations in literature search strategies, selection criteria, and the application of selection criteria [2].

The Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen, IQWiG) conducted a systematic review on the effectiveness and safety of negative pressure wound therapy (NPWT) versus conventional wound therapy in patients with acute or chronic wounds. The NPWT technique aims to accelerate wound healing by placing a foam dressing in the wound and applying controlled subatmospheric pressure [7]. The German-language full review and a rapid report on studies subsequently published are available on the IQWiG website [8, 9]. In addition, an English-language journal article has been published [10].

An additional retrospective analysis was conducted to compare different systematic reviews on NPWT with regard to their agreement in primary study selection. The review methodologies were also compared.

Methods

For the IQWiG review and rapid report, 4 bibliographic databases (MEDLINE, EMBASE, The Cochrane Library, and CINAHL) were searched to identify systematic reviews and primary studies on NPWT versus conventional wound therapy in patients with acute or chronic wounds. All databases were searched from inception to May 2005 (review) and between May 2005 and December 2006 (rapid report).

The multi-source search strategy and literature screening are described in detail elsewhere [8]. Eligible primary studies were randomised controlled trials (RCTs), as well as non-randomised controlled trials (non-RCTs) with a concurrent control group. Studies were classified as non-randomised if allocation concealment was viewed as inadequate [11]. Quasi-randomised studies were therefore classified as non-randomised. The intervention was categorised as NPWT if a medical device system identical or comparable to the vacuum-assisted closure (V.A.C.®) system was used. Studies were considered to be eligible only if publicly accessible full-text articles or other comprehensive study information (e.g. clinical study reports provided by manufacturers) were available.
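The eligibility and design rules described above can be read as a small decision rule. The following Python sketch is one illustrative reading of the text, not the procedure actually implemented for the IQWiG review; the `Study` record, its field names, and the `classify_design` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Study:
    """Hypothetical record of the study features named in the text."""
    randomised: bool                  # allocation described as random
    concealment_adequate: bool        # allocation concealment judged adequate
    concurrent_control: bool          # control group treated in the same period
    comparable_device: bool           # device identical or comparable to V.A.C.
    full_information_available: bool  # full text or clinical study report


def classify_design(study: Study) -> Optional[str]:
    """Return "RCT", "non-RCT", or None if the study is not eligible."""
    # Both general preconditions must hold before design is considered.
    if not (study.comparable_device and study.full_information_available):
        return None
    if study.randomised and study.concealment_adequate:
        return "RCT"
    # Quasi-randomised or inadequately concealed studies are treated as
    # non-randomised; these still need a concurrent control group.
    if study.concurrent_control:
        return "non-RCT"
    return None
```

Under this reading, a quasi-randomised trial with a concurrent control group is returned as "non-RCT", while a study without a concurrent control group (e.g. one with historical controls) is returned as ineligible.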

For the present analysis, an identical and sufficiently large primary study pool, i.e. the pool of studies that could potentially be identified by all reviews, was required to ensure comparability between reviews. As a preliminary analysis showed that early reviews included only 2 to 4 primary studies, only reviews published in or after December 2004 were considered.

Eligible reviews had to include data from completed primary studies on NPWT. Reviews were classified as systematic reviews (as opposed to narrative reviews) if multiple sources were searched (at least MEDLINE and The Cochrane Library), and the search strategy (including the search date) was documented [12].
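As a minimal sketch of the classification rule above, the following Python function checks the two stated conditions. The function name, parameter names, and the exact spelling of the database labels are assumptions made for illustration; the original analysis was not necessarily carried out programmatically.

```python
from typing import Iterable

# The two sources named in the text as the minimum requirement.
REQUIRED_SOURCES = frozenset({"MEDLINE", "The Cochrane Library"})


def is_systematic_review(
    sources_searched: Iterable[str],
    strategy_documented: bool,
    search_date_documented: bool,
) -> bool:
    """Return True if a review meets the stated criteria for a systematic review."""
    searched = set(sources_searched)
    # Requiring both named sources also guarantees that multiple sources were searched.
    return (
        REQUIRED_SOURCES <= searched
        and strategy_documented
        and search_date_documented
    )
```

For example, a review that searched MEDLINE, EMBASE, and The Cochrane Library and documented its search strategy and search date would be classified as systematic, whereas a review that searched MEDLINE alone would not.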

Primary studies were eligible for inclusion only if they had been published before June 2004 and if the entry date of a study in a database preceded the date of the literature search of any systematic review analysed.

The methodology and primary study selection between reviews were compared, and the overall agreement in study selection between reviews was reported.

Only a summary of the reviews' quality assessment of primary studies and their conclusions on the effectiveness of NPWT is presented here, as the main focus of this paper was to compare the agreement in primary study selection between reviews.

Results

The flow charts of the selection of systematic reviews and primary studies are presented in Figures 1 and 2. Sixteen primary studies published before June 2004 were assessed in the present analysis [13–28]. A total of 5 eligible systematic reviews (the IQWiG review and 4 other systematic reviews) published between December 2004 and July 2006 were analysed [29–32]. Details on all reviews identified are shown in Table 1; the main reason for exclusion was failure to qualify as a systematic review.

Table 1 Identified pool of potentially relevant reviews
Figure 1 Flow chart of the review selection.
Figure 2 Flow chart of the study selection.

The methods applied in the reviews included are presented in Tables 2 and 3. Regarding bibliographic databases, all reviews used MEDLINE, EMBASE, and The Cochrane Library, but the nursing database CINAHL was used only by IQWiG. The search terms applied varied between reviews. Regarding study design, the IQWiG review [8], as well as the reviews by Costa 2005 [30] and Pham 2006 [31] considered both RCTs and non-RCTs, while the reviews by Samson 2004 [29] and OHTAC 2006 [32] took only RCTs into account.

Table 2 Systematic reviews on NPWT: Requirements for primary studies and publications
Table 3 Systematic reviews on NPWT: Search strategies

As the comparison of systematic reviews based on published information showed numerous inconsistencies, we decided to contact the authors of the other reviews for clarification (this was not initially planned). We received responses from all authors approached (or from other researchers at the publishing institutions). After reviewing the responses, it became clear that reporting styles for excluded studies differed between reviews. For example, the response by OHTAC stated that "it must be noted that we do not routinely cite or analyse studies that have been excluded from our EBAs (evidence-based analyses)" [personal communication]. It consequently became apparent that some studies we had initially classified as "not identified by other reviews" had actually been identified but excluded, and subsequently not reported. We therefore changed the classification of studies not cited in reviews to "not reported". In addition, the authors of reviews corrected or clarified published information (their comments are included in Tables 4, 5, and 6); we thank them for generously providing this information.

Table 4 Overview of primary study selection: comparison of trials included as RCTs by IQWiG
Table 5 Overview of primary study selection: comparison of trials included as non-RCTs by IQWiG
Table 6 Overview of primary study selection: comparison of trials excluded by IQWiG but included by at least one other review

Details of the primary study selection are presented according to the study classification by IQWiG in Tables 4 (5 RCTs), 5 (7 non-RCTs), and 6 (3 non-RCTs and 1 RCT excluded by IQWiG, but included by at least one other review).

The reviews included between 4 and 13 eligible primary studies published before June 2004. With regard to RCTs, the overall agreement in primary study selection between reviews was 96% (24 of 25 options) (Table 4).

More variations were noted concerning the selection of non-RCTs; the agreement between reviews considering both RCTs and non-RCTs was 57% (12 of 21 options). Of the 9 mismatches, according to published information and the information provided by authors or institutions, 7 were due to different inclusion criteria (e.g. language criteria), and 2 were due to variations in study classification (Table 5).
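The agreement figures above are simple proportions. The short Python sketch below reproduces the arithmetic; the function name `percent_agreement` is hypothetical, and the reading that the number of "options" equals the number of studies multiplied by the number of reviews (5 RCTs in 5 reviews; 7 non-RCTs in the 3 reviews that considered non-RCTs) is an inference from the reported counts rather than a definition stated in the text.

```python
def percent_agreement(matching: int, options: int) -> float:
    """Percentage of study-by-review options on which the reviews agreed."""
    if options <= 0:
        raise ValueError("options must be a positive integer")
    if not 0 <= matching <= options:
        raise ValueError("matching must lie between 0 and options")
    return 100.0 * matching / options


# Assumed decomposition of the reported denominators (see lead-in).
rct_options = 5 * 5      # 5 RCTs x 5 reviews = 25
non_rct_options = 7 * 3  # 7 non-RCTs x 3 reviews = 21

print(round(percent_agreement(24, rct_options)))      # 96
print(round(percent_agreement(12, non_rct_options)))  # 57
```

The 9 mismatches for non-RCTs correspond to the 21 options minus the 12 matching selections.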

Four studies (3 non-RCTs and 1 RCT) were excluded by IQWiG but included by at least one other review. The reasons for exclusion were as follows: the study included historical controls (2 non-RCTs [13, 26]); the intervention applied was not comparable to the NPWT technique (1 non-RCT [14]); or an additional intervention was applied that may have affected the study outcomes (1 RCT [19]) (Table 6). Substantial variations in study selection were shown between reviews.

Only the IQWiG review included a meta-analysis (changes in wound size), which indicated an advantage in favour of NPWT. However, only a few trials with small sample sizes were analysed.

The overall quality of the primary studies was assessed in 3 of 5 reviews, and was in general classified as poor. All reviews concluded that the evidence base on NPWT was insufficient (Table 7).

Table 7 Overall quality assessment of primary studies and main conclusions of systematic reviews on negative pressure wound therapy

Discussion

An analysis of 5 systematic reviews on NPWT showed differences (which mainly concerned non-RCTs) in the citation and selection of primary studies.

We would like to emphasize that by presenting these differences, we are not implying that the 4 other reviews identified were of inferior quality compared with the IQWiG review. Variations in the number of primary studies identified and selected are not surprising, as the reviews used different search strategies, literature sources, and inclusion criteria. After correspondence with the authors of the other reviews, many differences regarding the citation of primary studies could be attributed to different reporting styles (citation or non-citation) for excluded studies, not to the non-detection of studies in the literature searches.

Most differences in study selection resulted from variations in inclusion and exclusion criteria. For example, due to language restrictions, studies published in German were selected by IQWiG, but not by other reviews. Opinions on the relevance of language bias differ; a study published in 1997 comparing English and German-language publications concluded that English-language bias may be introduced in systematic reviews if they include only trials reported in English [33]. In contrast, a more recent publication noted that, for conventional medicinal interventions, language restrictions did not appear to bias estimates of effectiveness [34]. Moreover, for German-language publications on RCTs, it has been reported that German medical journals no longer play a role in the dissemination of trial results [35].

The inclusion criteria for primary study design were also inconsistent; 3 reviews (including the IQWiG review) considered both RCTs and non-RCTs, and 2 reviews considered only RCTs. The non-RCTs included in our analysis were non-randomised controlled intervention studies. However, there are many different study types that can be seen as non-RCTs (e.g., classical observational studies). The inclusion of non-RCTs in systematic reviews is inconsistent and controversial [36–40]. The validity of systematic reviews including non-RCTs may be affected by the differing susceptibility of RCTs and non-RCTs to selection bias [39], although it has been suggested that under certain conditions, estimates of effectiveness of non-RCTs may be valid if confounding is controlled for [40].

RCTs with adequately concealed allocation prevent selection bias and consequent distortions of treatment effects [41], and systematic reviews including RCTs represent the highest level of evidence for therapeutic interventions [42]. However, the quality and quantity of RCTs in surgical research is limited [43], and it has therefore been proposed not to base this type of research on RCTs alone [36, 44]. Indeed, for some topics, non-RCTs are the only evidence available [45].

As for NPWT, although this treatment is widely applied in clinical practice, particularly in chronic wounds, only a few RCTs were available at the time the IQWiG systematic review on NPWT was being planned; moreover, these were of poor quality [29]. However, the number of published RCTs has recently increased, and as several further RCTs are ongoing, more publications can be expected in the near future. One HTA agency has already changed its policy from including both RCTs and non-RCTs in systematic reviews on NPWT to one of including solely RCTs [32]. We agree with other researchers that non-RCTs should only be performed when RCTs are infeasible or unethical [38], and that systematic reviews including non-RCTs should only be conducted when RCTs are not available [39]. However, we emphasize that this should not be generalized to recommend excluding all kinds of non-randomised studies from systematic reviews on any topic and for any outcome of interest.

The type of non-RCT considered also differed: IQWiG's precondition for inclusion was the existence of a concurrent control group; studies with a historical control group were excluded, as systematic bias may arise from time trends in the outcomes of study participants [38].

Moreover, variations in the classification of study design were noted between reviews. For example, David Samson, one of the other review authors, stated: "In general, our definition of randomized trials was probably more inclusive than yours. We decided to be inclusive due to the small number of potentially relevant studies available at that time. Our goal was to evaluate the quality of a larger pool of included studies rather than exclude more studies, based on quality concerns, to create a smaller pool of included studies" [personal communication].

As subjective factors are involved in the preparation of systematic reviews, inter-author variation is inevitable [46]. The evaluation of inter-author variation has shown that differences particularly affect the classification of study design [46, 47]. One study showed that this was the case even when specific instructions and definitions were provided [47]. However, a recent analysis of the reproducibility of systematic reviews showed that, where authors were provided with guidelines for review preparation (including an algorithm to ensure that study designs were defined in a standardised manner), the overall reproducibility between reviews was good [48]. This finding emphasizes the relevance of standard reporting guidelines. The CONSORT statement on improving the quality of reporting for RCTs has been available for over a decade [49], and a revised version was published in 2001 [50]. In contrast, guidelines for non-RCTs are more recent [51, 52]. The introduction of uniform reporting standards for non-RCTs may improve the future quality of reporting and lead to a closer agreement in the primary study citation and selection of systematic reviews.

Even though the reviews analysed included different numbers and types of studies, all reviews reached similar conclusions. This may be explained by the fact that the overall quality of the data on NPWT is poor.

Conclusion

The citation and selection of primary studies differ between systematic reviews on NPWT, primarily with regard to non-RCTs. These differences arise from variations in review methodology and inter-author classification of study design, as well as from different reporting styles for excluded studies. Uniform methodological and reporting standards need to be applied to ensure comparability between reviews as well as the validity of their conclusions.