Introduction

Anterior cervical discectomy with fusion (ACDF) is considered the standard surgical treatment for cervical radiculopathy. Decompressing the nerve root aims to diminish radicular complaints, and adding a cage to the intervertebral space aims to maintain foraminal height and cervical alignment [1,2,3]. In the past three decades, the use of a disc prosthesis (ACDA) has been investigated as an alternative treatment for patients with symptomatic cervical radiculopathy caused by cervical disc herniation. The rationale for the use of a prosthesis is to avoid loss of motion at the target level, which is a consequence of treating radiculopathy with ACDF. It is hypothesized that loss of motion causes neck disability and increased mechanical stress at the adjacent levels, possibly accelerating degeneration at these adjacent segments (adjacent segment degeneration; ASD) [4, 5].

The results of ACDF and ACDA have been compared before in systematic reviews and meta-analyses. An overview by Bartels et al. (2017) considered 21 meta-analyses in which the included studies tended to conclude that ACDA gave a better outcome, but the differences were small and not clinically relevant [6]. However, the meta-analyses mainly considered randomized controlled trials (RCTs) that were performed on mixed patient populations: patients suffering primarily from radiculopathy and patients suffering primarily from myelopathy.

Myelopathy is presumed to occur generally in a different type of patient, namely the patient with a more degenerated spine. Implanting a prosthesis in these patients may not achieve the effect that the prosthesis is intended to have. Comparing fusion with prosthesis in patients who suffer primarily from myelopathy may therefore yield a different outcome than the same comparison in patients who suffer primarily from radiculopathy.

In this review, only studies that discuss clinical findings in patients with primary complaints of radiculopathy, excluding those suffering primarily from myelopathy, are evaluated. Additionally, outcome of these findings will be compared to clinical outcome reported in the articles considered in the meta-analyses that evaluate mixed patient populations.

Materials and methods

Literature search strategy

The initial literature search strategy was performed in PubMed, EMBASE, Web of Science, COCHRANE, CENTRAL and CINAHL on 2 August 2016, and all English- and Chinese-language publications on the comparison of ACDF and ACDA were retrieved. Two of the authors separately evaluated the articles by title, abstract or full text, when necessary, to select the studies that met the predefined selection criteria. One author translated two relevant articles from Chinese to English. The search strategies used in the different databases were based on the search string as shown in Fig. 1.

Fig. 1

Search strategy. Search strategy that was used to perform the literature search on 2 August 2016

Article selection was based upon the following criteria:

  • The study compares ACDF to ACDA in one-level anterior discectomy.

  • The study includes at least twenty patients in each treatment arm.

  • The study provides follow-up data for at least 2 years.

  • The study measures primary or secondary outcome in either the Neck Disability Index (NDI) or Visual Analogue Scale neck pain (VAS neck pain).

  • The study only includes patients suffering from radiculopathy, excluding patients suffering from myelopathy.

  • The article is not a meeting abstract.

Any discrepancy in selection between the reviewers was resolved in open discussion, and, if needed, a third reviewer was asked to make a final decision. Reference screening and citation tracking were performed on the identified articles (Fig. 2).

Fig. 2

Flow chart of article selection process radiculopathy articles. Flow chart describing the search process for the articles exclusively including patients suffering from cervical radiculopathy

When the literature search was repeated in August 2017, a meta-analysis by Bartels et al. was found [6].

In this study, 21 meta-analyses were evaluated that focused on the outcomes of one-level arthroplasty. The included meta-analyses primarily described studies that allowed inclusion of patients suffering from cervical myelopathy. In order to be complete in our overview, the studies described in the meta-analyses were evaluated additionally in separate mixed group tables. This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses: the PRISMA Statement [7] (Fig. 3).

Fig. 3

Flow chart of article selection process mixed group articles. Flow chart describing the search process for the articles including patients suffering from cervical myelopathy with or without radiculopathy

Quality assessment

The methodological quality of all studies (including those from the RCTs describing mixed populations) was assessed by three independent reviewers (XY, TJ, CG), using an adjusted version of the checklist for cohort studies of the Dutch Cochrane Center [8]. If there was no consensus about the assessment, a fourth reviewer (CVL) was consulted.

Studies could be awarded a maximum of 9 points. Studies were then divided into low (7–9 points), intermediate (5–6 points) or high (4 or fewer points) risk of bias.
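The point thresholds above can be sketched as a small lookup. This is only an illustration of the stated cut-offs; the function name `risk_of_bias` is hypothetical and not part of the review's methods.

```python
def risk_of_bias(points: int) -> str:
    """Map a quality score (0-9 points) to a risk-of-bias category.

    Thresholds follow the text: 7-9 low, 5-6 intermediate, 4 or fewer high.
    The function name is an illustrative assumption.
    """
    if not 0 <= points <= 9:
        raise ValueError("quality score must be between 0 and 9 points")
    if points >= 7:
        return "low"
    if points >= 5:
        return "intermediate"
    return "high"
```

A higher score therefore means a lower risk of bias.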

Outcome measures

For matters of comparison, the most frequently used outcome parameters were extracted in this systematic review: the Neck Disability Index (NDI) and the Visual Analogue Scale (VAS) for neck pain. In addition, data on reoperations and complications were collected.

The NDI is a ten-item scaled questionnaire on three different aspects of neck complaints: pain intensity, daily-work-related and non-work-related activities. Each item is scored from 0 to 5, and the raw total score ranges from 0 (best score) to 50 (worst score) [10]. Several studies indicate a minimal clinically important difference (MCID) for the NDI of 20 points on a 100-point scale [11, 12].

As many authors choose to present NDI scores on a 100-point scale, the outcome scores in this article were converted to that scale.

The Visual Analogue Scale (VAS) is the most commonly used tool to assess pain intensity. A score of 0 mm indicates ‘no pain’ and a score of 100 mm indicates the ‘worst pain imaginable’. According to the literature, the minimal clinically important difference (MCID) is approximately 20 mm, or 2.0 on a ten-point scale [9]. As most articles presented the VAS scores on a scale from 0 to 10, we chose to convert all VAS scores to that scale in order to properly analyse and compare the data. Articles that reported Numeric Rating Scale (NRS) scores for neck pain instead of the VAS were nevertheless considered eligible for inclusion because the two scales are very similar.
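The scale conversions described above amount to simple rescaling. The sketch below is illustrative only: the function and constant names are hypothetical, and treating a difference equal to the MCID as clinically relevant is an assumption rather than something the text specifies.

```python
NDI_MCID_100 = 20.0  # MCID on the 100-point NDI scale, as cited in the text
VAS_MCID_10 = 2.0    # MCID on the ten-point VAS scale, as cited in the text


def ndi_raw_to_100(raw_score: float) -> float:
    """Convert a raw NDI total (0-50) to the 100-point scale."""
    if not 0 <= raw_score <= 50:
        raise ValueError("raw NDI score must be between 0 and 50")
    return raw_score * 2.0


def vas_mm_to_10(vas_mm: float) -> float:
    """Convert a VAS score in millimetres (0-100) to the ten-point scale."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS score must be between 0 and 100 mm")
    return vas_mm / 10.0


def exceeds_mcid(difference: float, mcid: float) -> bool:
    """Return True if the absolute difference reaches the MCID.

    Using >= (rather than >) is an assumption made for this sketch.
    """
    return abs(difference) >= mcid
```

For example, a raw NDI of 25 corresponds to 50 on the 100-point scale, and a between-group VAS difference of 1.5 on the ten-point scale would not reach the MCID of 2.0.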

After the systematic literature review was performed, it was evaluated whether data could be pooled and whether heterogeneity (using I²) could be calculated.
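For readers unfamiliar with the I² statistic, the sketch below shows how it is commonly derived from Cochran's Q using inverse-variance weights. It is a generic illustration, not the procedure of this review; the function name and any input values passed to it are hypothetical.

```python
def i_squared(effects, std_errors):
    """Return I² (in percent) from per-study effect estimates and standard errors.

    Uses inverse-variance weights and Cochran's Q:
        Q  = sum(w_i * (y_i - pooled)^2),  w_i = 1 / se_i^2
        I² = max(0, (Q - df) / Q) * 100,   df = k - 1
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0
```

The calculation requires a standard error for every study estimate, which is why missing standard deviations prevent it from being carried out.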

Level of evidence

The quality of evidence for all outcome parameters was evaluated using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach according to Atkins [13] and adapted from Furlan [14].
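The downgrading logic of GRADE can be sketched as stepping down an ordered list of evidence levels. This is a minimal illustration under the assumption that the starting level is "high", as GRADE assigns to evidence from randomized trials; the names `GRADE_LEVELS` and `downgrade` are hypothetical.

```python
GRADE_LEVELS = ["high", "moderate", "low", "very low"]


def downgrade(levels_down: int, start: str = "high") -> str:
    """Lower a GRADE evidence level by a number of steps.

    The default start of "high" is an assumption for this sketch.
    The result never drops below "very low".
    """
    if levels_down < 0:
        raise ValueError("levels_down must be non-negative")
    index = GRADE_LEVELS.index(start) + levels_down
    return GRADE_LEVELS[min(index, len(GRADE_LEVELS) - 1)]
```

Under this assumption, lowering by one, two or three levels gives moderate, low and very low evidence, respectively.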

Results

Search results and study selection of studies describing radiculopathy patients

A total of 603 articles were identified, of which 357 original articles remained after removing duplicates. Titles and abstracts were screened, resulting in 42 eligible articles. These articles were read full-text, and 14 studies met all inclusion criteria. Six articles were additionally excluded after meticulously investigating the literature. The article from Burkus et al. [15] had to be excluded because it also contained patients suffering from myelopathy. The article reports on the 7-year results of a study comparing ACDF versus prosthesis. The study population seemed to consist of patients with only radiculopathy. However, while searching for earlier follow-up results from this study, the article describing the 2-year follow-up results of this population was found [16]. From that particular article, it was clear that the study population was a mixed one, also including patients with myelopathy and therefore Burkus’ article was excluded.

Five of the remaining 12 studies concerned the same RCT comparing Prodisc-C versus ACDF (autograft bone and plate). Therefore, the four studies with shorter follow-up time were excluded (one with 2-year, one with 4-year, one with 5-year and one with 7-year follow-up results) [17,18,19,20]. We decided to only include the article describing the 7-year results (the longest follow-up period) without the continued access group [21].

Additionally, one more study was excluded since it described the 1-year follow-up results [22], while the 3-year follow-up results [23] were also available; eight studies remained that fulfilled all inclusion criteria.

Two RCTs described results on the Mobi-C prosthesis in comparison with ACDF methods using autograft alone [24] and securing with a plate [25]. Additionally, there was one retrospective study (Mobi-C vs PEEK cage without plate) [26] and one prospective cohort study comparing different types of prostheses (Prestige ST, Bryan, Prodisc-C) [27] versus ACDF (PEEK cage without plate). Two other RCTs compared the Prodisc-C prosthesis to ACDF with plate fixation [21, 23], and one RCT compared the Bryan prosthesis or Kineflex|C to ACDF with plate [28]. Lastly, one article described the comparison between the Discover prosthesis and a PEEK cage without plate [29].

The mean number of patients per group in the 8 included trials was 48. The mean age of the patients was 44.7 (ACDA) and 45.4 (ACDF) years, and the percentage of male patients was 46.1% (ACDA) and 49.0% (ACDF) (Table 1).

Table 1 Study demographics radiculopathy studies

Search results and study selection of studies describing mixed patient groups

From the 21 meta-analyses retrieved from Bartels’ article, 172 articles were found eligible for screening and after duplicates were removed 46 remained for full-text assessment. One article was selected for analysis in the radiculopathy group, as it described a population of patients from which myelopathy patients were excluded. Ten other articles were excluded because they did not report on the relevant clinical outcome measures or solely on radiological outcome parameters. Lastly, from the 35 articles that matched all inclusion criteria six articles had to be removed because they reported on the same RCTs, in which case we chose to include the article describing the longest follow-up period. Finally, 29 articles were found eligible for the mixed group overview, all reporting on the comparison between ACDA and ACDF in patients suffering from myelopathy with or without radiculopathy [15, 30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57].

In the 29 included articles, the mean follow-up period was 3 years, mean number of patients per group was 90 (ACDA) and 78 (ACDF), and the mean age 45 (ACDA) and 46 (ACDF) years. Study characteristics for each individual study can be found in Table 2.

Table 2 Study demographics mixed population studies

Quality assessment

Quality assessment in radiculopathy studies

Only one article scored 7 out of 9 points, indicating a low risk of bias [29]; four articles scored 5 points [21, 24, 27, 28] and one scored 4 points [23], all indicating an intermediate risk of bias. The two remaining articles scored 3 points, indicating a high risk of bias [25, 26] (Table 3).

Table 3 Risk of bias for the radiculopathy studies

Quality assessment in mixed studies

From the 29 studies, there were two with a low risk of bias [36, 41], 21 with an intermediate risk of bias [15, 30, 31, 33,34,35, 38,39,40, 42,43,44,45,46,47,48,49,50, 54, 55, 57] and six with a high risk of bias [32, 37, 51,52,53, 56] (Table 4).

Table 4 Risk of bias for mixed group studies

Clinical outcome

Neck Disability Index (NDI)

Disability in articles describing exclusively radiculopathy patients

Six articles use the NDI as a scale to report on functionality [21, 24,25,26, 28, 29] (Table 5). All articles show a significant improvement in post-operative functionality compared to baseline, for both treatment groups. However, only one article shows a significant difference in NDI between the two treatment groups after 2 years. Though the reported statistically significant difference in that article is not clinically relevant, it shows a more favourable outcome for fusion, as compared to the prosthesis [29].

Table 5 NDI and VAS outcome tables for radiculopathy studies

Level of evidence

The level of evidence is lowered by two levels, since most studies have an intermediate to high risk of bias. Furthermore, the findings are inconsistent as only one article presented a significant difference between the two groups, while the 5 other articles did not. Additionally, only one article succeeded in precisely stating the standard deviation (SD), but only for the baseline NDI estimate [21]. Three other articles provided information from which the SD could be calculated [24, 25, 29], while the remaining four did not [23, 26,27,28]. Therefore, the level of evidence that there is no difference in NDI improvement after 2-year follow-up in radiculopathy patients is low.

Functionality in articles describing mixed patient populations

Twenty-six articles use the NDI as an outcome parameter at baseline and after 2 years; three articles do not [31, 35, 42]. Five articles report a statistically significant difference between ACDF and ACDA 2 years after surgery in favour of the prosthesis; however, the difference never exceeds the MCID of 20 points. In contrast to the vast majority of articles, three studies show a small difference in favour of fusion, though not statistically significant [44, 55, 56].

Level of evidence

The level of evidence is lowered by 3 levels. Findings are inconsistent, risk of bias is intermediate to high, and data are not reported with sufficient precision. Additionally, the vast majority of studies received industry sponsoring and authors reported extensive disclosures, which increases the probability of reporting bias. Therefore, the level of evidence that there is no difference in NDI improvement after 2-year follow-up in mixed population patients is very low.

Visual Analogue Scale (VAS) neck pain

VAS neck pain in articles describing exclusively radiculopathy patients

Seven of the eight articles used the VAS to grade neck pain, and one article used the NRS score [29]. All articles showed that post-operative pain improved compared to baseline (Table 5). None of the articles demonstrated a statistically significant difference between the ACDA and ACDF group after 2 years.

Level of evidence

The level of evidence is lowered by 1 level, since most studies have an intermediate to high risk of bias. Moreover, only one study reports the exact standard deviations with every estimate [23]. Therefore, the level of evidence is moderate that there is no difference in neck pain improvement after implanting a cage or a prosthesis in cervical radiculopathy patients.

VAS neck pain in articles describing mixed patient populations

Twenty-four of the 29 articles use VAS neck pain as an outcome measure. All articles showed that neck pain improved post-operatively in comparison with baseline, in both treatment groups. Four articles report a statistically significant difference between the prosthesis and fusion in favour of the prosthesis [46, 50, 51, 57] (Table 6).

Table 6 NDI and VAS outcome tables for mixed group studies

Level of evidence

The level of evidence is lowered by three levels. All articles have an intermediate to high risk of bias, the vast majority of studies received industry sponsoring, and authors reported extensive disclosures, which increases the probability of reporting bias. Furthermore, findings are inconsistent and estimates of effect are not sufficiently precise, as not all articles state the exact data. Therefore, the level of evidence that there is no difference in neck pain improvement after implanting a cage or a prosthesis in mixed population patients is very low.

Reoperations

Reoperation rate in articles describing exclusively radiculopathy patients

Seven of the eight articles reported reoperation rates, of which two report statistically significant differences in the rates. One study reports more reoperations in the fusion group [24], and the other reports higher rates in the prosthesis group [29]. Outcome reporting on the level of reoperation is rather heterogeneous and incomplete; however, the results suggest that reoperations are most frequent at the adjacent level for the ACDF group and at the index level for the ACDA group (Table 7).

Table 7 Number of reoperations and ASD incidence

Reoperation rate in articles describing mixed patient populations

The majority of the articles report on “subsequent surgical interventions”, which include revisions, removals and supplemental fixations. Three of the 29 studies report statistically significant differences between the groups in terms of reoperation rates, all in favour of the arthroplasty group [40, 54].

Complications

Complications in articles describing exclusively radiculopathy patients

The most common complications, apart from reoperations, included adjacent segment disease (ASD), trauma, ongoing neck and/or arm pain, dysphagia, hoarseness, musculoskeletal pain and infections. Complications were seldom permanent. Four articles described adjacent segment disease [21, 23,24,25], of which only one described a significantly higher incidence of ASD in ACDF patients [24]. No other statistically significant differences in complication rates were described between the treatment groups.

Complications in articles describing mixed patient populations

Three articles report statistically significant differences in the incidence of complications: the first study found a higher incidence of device-related complications in the ACDF group [48], the second study reported a higher rate of overall adverse events in the ACDA group [38], and the third article found more severe adjacent-level radiographic changes in the ACDF group [33]. Two other articles studied ASD very specifically but could not find statistically significant differences between the two treatment strategies [37, 56]. However, it should be noted that, in both the radiculopathy and the mixed group articles, the articles that report on ASD use different diagnostic criteria for ASD. This could influence the reliability of the results and thus the comparability between articles that describe results on ASD.

Heterogeneity

Pooling results from the eight radiculopathy articles was considered; however, the results were reported too heterogeneously to do so. The number of studies was small, standard deviations were scarcely reported, and p-values were mostly provided for the within-group comparison between baseline and 2 years post-operatively rather than for the comparison between the treatment groups. Pooling the data would therefore require statistical imputation for the majority of the standard deviations and p-values. Articles were also clinically heterogeneous, as NDI and VAS scores were expressed on different scales and some articles reported the exact values after 2 years, while others reported the decline from baseline to 2 years or the difference between ACDA and ACDF at 2 years. Pooling results in mixed group articles has been done previously and is therefore unlikely to lead to new insights [58,59,60,61]. Consequently, heterogeneity tests, such as the I², were not performed, as data were not pooled.

Discussion

Meticulous literature research reveals that pain and disability scores were comparable in patients after 2 years and not dependent on receiving either a cage or a prosthesis, after anterior cervical discectomy for radiculopathy. Likewise, no difference in outcome scores was found between these surgical interventions in mixed patient populations. The same was true for the reoperation rates and the incidence of adjacent segment degeneration (ASD). After using the GRADE approach, the level of evidence for absence of a difference in neck pain and disability in radiculopathy patients is higher than the level of evidence in the mixed patient population; however, the overall level of evidence was low. This conclusion is in line with a meta-analysis by Bartels from 2010, which demonstrates that most studies comparing ACDF and ACDA are not blinded and that a clinical benefit for the prosthesis is not proven [58].

Several other meta-analyses comparing ACDA and ACDF have been published [59,60,61]. These meta-analyses included mainly studies that did not exclude myelopathy patients. These patients are prone to have more severely degenerated cervical spines and to perform differently on outcome scales. It is therefore most striking that in this systematic review the mean NDI 2 years post-operatively is lower (better) in the ACDA group with both myelopathy and radiculopathy patients than in the ACDA group with radiculopathy patients only. This phenomenon might suggest the presence of bias due to industry sponsoring or lack of blinding. An alternative hypothesis could be that patients with more degenerated cervical spines are used to a certain amount of pain and are therefore more likely to report a better disability or pain score.

This review was set up as a counterweight to the 21 meta-analyses retrieved from Bartels’ article on studies comparing ACDF and ACDA, which concluded that outcome in patients who received a prosthesis was slightly better than in patients who underwent cervical fusion, although the difference was neither statistically significant nor clinically relevant. It was hypothesized that outcome could be more convincingly favourable for the prosthesis if only radiculopathy patients were considered. However, the opposite conclusion had to be drawn. Not only did the results in radiculopathy patients not differ between ACDA and ACDF, but careful analysis of the literature on mixed patient populations demonstrated that results were comparable in that patient population too. The suggestion offered by most of the existing articles, that the prosthesis is clinically superior to fusion, is therefore most likely too optimistic.

Another argument that is often used in favour of the prosthesis is the claim of superior radiological results in terms of ASD and range of motion (ROM). However, a recent systematic review shows no convincing radiological evidence for superiority of the prosthesis in ASD [62]. Additionally, the authors stress the absence of solid evidence for a correlation between the increased incidence of ASD and worse clinical outcome.

A limitation of the majority of the analysed studies is the use of a combined success score to define which treatment arm performs better. Apart from including an improvement in NDI or VAS score, most of these success scores added ‘neurological success’. This means that an evaluation conducted by the investigator for muscle strength, sensory assessments and reflex assessments was included. These investigator-conducted evaluations are prone to bias as the articles do not mention whether or not the investigator was blinded to the treatment the patient received. When these combined success scores are not taken into account, but only the plain clinical outcome measures and their statistical significance and clinical relevance, the inevitable conclusion is that ACDA is not superior to ACDF.

Based on clinical outcome measures, the literature indicates that the results of ACDF and ACDA do not differ in the treatment of cervical radiculopathy. The results are not prominently different in patients suffering from myelopathy with or without radiculopathy. Further research should have more statistical power, should apply specific inclusion criteria to increase the external validity to specific groups of patients, should blind both the patients and the outcome assessor, and should report long-term follow-up results in order to draw definitive conclusions on the clinical relevance of the prosthesis. With the increase in power, the possibility of performing additional subgroup analyses should be considered to identify possible subgroups that might benefit more from receiving a prosthesis.