Introduction

Artificial disk replacement (ADR) is frequently used in reconstruction after anterior decompression for cervical radiculopathy, to preserve motion and possibly prevent adjacent segment pathology (ASP). Heterotopic ossification (HO) is an unintended event that may lead to spontaneous fusion of the ADR, thus failing to preserve motion [1, 2]. The prevalence of HO (ranging in the literature from 0 to 100%) [3, 4], its progression over time [5, 6] and predisposing factors [7, 8, 9], as well as its effect on clinical outcomes [4, 10], are not fully understood. This study aimed to evaluate the prevalence of HO over a 5-year postoperative period, and to determine why and how late it may occur, what may cause it to progress, and whether and how it affects clinical outcomes.

Patients and methods

In a previously published RCT on cervical radiculopathy due to degenerative disk disease, 83 patients were operated with ADR (Discover™, DePuy Spine, Johnson and Johnson, Raynham, Massachusetts, U.S.A.) at one or two levels [11, 12]. Patients who had undergone revision surgery or who lacked sufficiently high-quality radiographs at 5 years of follow-up were excluded, leaving 59 patients (79 implants) for analysis of the presence and extent of HO at 2 and 5 years. The patients had been treated with NSAID (ketorolac 25 mg/day) for 10 days postoperatively. Seven patients with two-level ADRs in whom one was mobile and the other stiff were excluded, leaving 52 patients for the outcome analysis. The flowchart is shown in Fig. 1. Baseline data are presented in Table 1.

Fig. 1 Flowchart. ACDF, anterior cervical decompression and fusion; ADR, artificial disk replacement; FU, follow-up; HO, heterotopic ossification; N, number

Table 1 Patient characteristics at baseline

Radiology and grading

Plain radiographs in flexion, extension and neutral positions were obtained preoperatively, and at 2 and 5 years of follow-up. CT scans were obtained postoperatively and at 2 years. MRI scans were obtained preoperatively and at 5 years. HO was evaluated on plain films and graded in 5 grades according to Mehren/Suchomel [13] (Fig. 2). The postoperative CT scans were evaluated for facet joint degeneration according to Walraevens [14] and for the positioning of the ADR, dichotomized as either centered or off-centered. Disk degeneration on adjacent segments was graded on MRI scans according to Miyazaki [15].

Fig. 2 McAfee classification of heterotopic ossification (modified by Mehren/Suchomel). Grade 0: No HO present; Grade 1: HO detectable but not reaching the intervertebral space; Grade 2: HO reaching the intervertebral space; Grade 3: Bridging ossifications still allowing some movement at the segment; Grade 4: Complete fusion of the segment, without detectable movement. HO, heterotopic ossification

Clinical outcomes

The primary outcome measure was the Neck Disability Index (NDI) [16], a patient-reported function score that ranges from 0 to 100 with higher values indicating more severe symptoms. The minimum clinically important difference (MCID) for NDI is 13.4–17.3 [17, 18]. Using baseline NDI and NDI after 5 years, the influence of HO on the clinical outcome was analyzed, dividing patients into two groups: (1) arthroplasties with light or no HO (HO 0–2), i.e., moving arthroplasties, and (2) severe HO (HO 3–4), i.e., stiff or barely moving arthroplasties. Seven patients treated at 2 segments had one moving and one stiff segment. Those were excluded from the outcome analysis.
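The grouping rule described above can be sketched in a few lines. This is only an illustration in Python (the study itself used R); the function names `ho_group` and `patient_group` are hypothetical and not taken from the study.

```python
from typing import Optional, Sequence


def ho_group(grade: int) -> str:
    """Map a Mehren/Suchomel grade (0-4) to one of the two outcome groups."""
    if grade not in range(5):
        raise ValueError(f"HO grade must be 0-4, got {grade}")
    # Grades 3-4: stiff or barely moving; grades 0-2: moving arthroplasty.
    return "severe" if grade >= 3 else "light_or_none"


def patient_group(grades: Sequence[int]) -> Optional[str]:
    """Return the patient's group, or None when the operated segments
    fall into different groups (one moving, one stiff), in which case
    the patient is excluded from the outcome analysis."""
    if not grades:
        raise ValueError("at least one operated segment is required")
    groups = {ho_group(g) for g in grades}
    return groups.pop() if len(groups) == 1 else None
```

Under this sketch, a two-level patient graded 2 and 3 returns `None` and would be left out of the outcome analysis, whereas grades 3 and 4 both count as severe.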

Statistics

Statistical analyses were performed in R, version 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria). Missing values were replaced by multiple imputation using chained equations [19], as implemented in the R package mice, generating 100 imputed data sets. The imputation models used were: predictive mean matching for numerical variables, logistic regression for dichotomous variables, and ordinal regression for ordinal variables. Imputed values of HO at 2 years were required to be equal to or lower than HO at 5 years.

For the analysis of predisposing factors that may influence the occurrence of HO, both univariate and multivariate analyses were done. The univariate analyses included age, sex, smoking, body mass index (BMI; < 25; 25–30; > 30), facet arthropathy, preoperative adjacent disk degeneration, and centering of the ADR. The multivariate analysis used sex, age, smoking and BMI as potential factors. As the predisposing factors are patient-related variables, the analysis was patient-based and not ADR-based. In cases with two surgically treated segments in the same patient and different grades of HO in the two segments, the worse grade was included in the analysis. When the grade of HO was the same, the level to include in the analysis was chosen at random. The number and location of operated segments could therefore not be assessed as potential predisposing factors in the risk analysis. For ordinal predictors, such as adjacent disk degeneration, we fitted an additional model using equidistant scores. This corresponds to converting the variable to a numerical one with values 1, 2, 3, etc., to gain power and interpretability by estimating a single odds ratio (OR) instead of one for each level. To analyze the impact of surgeon as a predisposing factor, a simple significance test was performed, comparing HO to surgeon, treating the former as ordinal and the latter as nominal. The test was done as a permutation test, using the R package coin.
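The reduction from implant-level to patient-level data, and the equidistant scoring of ordinal predictors, can be illustrated as follows. This is a minimal Python sketch, not the authors' R code; `patient_level_grade` and `equidistant_scores` are hypothetical names, and the level labels in the usage note are placeholders.

```python
import random
from typing import Dict, Optional, Sequence, Tuple


def patient_level_grade(
    grades: Sequence[int], rng: Optional[random.Random] = None
) -> Tuple[int, int]:
    """Pick one operated segment per patient.

    Returns (segment_index, grade): the worse grade when the segments
    differ, and a randomly chosen segment when the grades are equal.
    """
    if not grades:
        raise ValueError("at least one operated segment is required")
    rng = rng or random.Random()
    worst = max(grades)
    candidates = [i for i, g in enumerate(grades) if g == worst]
    return rng.choice(candidates), worst


def equidistant_scores(ordered_levels: Sequence[str]) -> Dict[str, int]:
    """Assign scores 1, 2, 3, ... to the levels of an ordinal variable,
    so that a single odds ratio per one-level step can be estimated."""
    return {level: score for score, level in enumerate(ordered_levels, start=1)}
```

For example, `patient_level_grade([2, 4])` selects the second segment with grade 4, and `equidistant_scores(["mild", "moderate", "severe"])` maps the three placeholder levels to 1, 2 and 3.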

To analyze predictors affecting the progression of HO, univariate analyses (logistic regression) were done, using sex, age, smoking, preoperative adjacent disk degeneration, facet arthropathy at the index level, and centering of the ADR. For patients with arthroplasties at 2 segments, HO was considered to have progressed if the grade had increased in at least one of the operated segments; in cases where there were different degrees of progression in the two operated segments, the one that had progressed more was considered in the analysis. A multivariate analysis was then done, mutually adjusting the variables sex, age, smoking and BMI, to study their independent predictive value.
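The patient-level definition of progression can be written as a short helper. This is a sketch in Python under the stated rule only; the names `patient_progression` and `has_progressed` are hypothetical.

```python
from typing import Sequence


def patient_progression(ho_2y: Sequence[int], ho_5y: Sequence[int]) -> int:
    """Largest per-segment change in HO grade between 2 and 5 years.

    Segments are matched by position; for two-level patients the segment
    that progressed more determines the patient's value.
    """
    if not ho_2y or len(ho_2y) != len(ho_5y):
        raise ValueError("need matching, non-empty grade lists")
    return max(late - early for early, late in zip(ho_2y, ho_5y))


def has_progressed(ho_2y: Sequence[int], ho_5y: Sequence[int]) -> bool:
    """True if the HO grade increased in at least one operated segment."""
    return patient_progression(ho_2y, ho_5y) > 0
```

A patient with grades 1 and 2 at 2 years and grades 1 and 4 at 5 years would be counted as progressed, with a progression of 2 grades.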

For the clinical outcome analysis, a linear regression model was fitted on imputed data, with NDI at 5 years of follow-up as the outcome, and dichotomized HO and baseline NDI as independent variables. The regression coefficient for HO was extracted, together with its 95% confidence interval (CI) and the p value for the null hypothesis of no predictive effect on outcome. The regression coefficient can be interpreted as the difference in 5-year NDI between the two HO groups, adjusted for baseline NDI. Thus, a positive value would correspond to larger NDI values in the severe HO group compared with the light or no HO group.
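To make the interpretation of the HO coefficient concrete, the following Python sketch fits the same model form (5-year NDI on an intercept, dichotomized HO and baseline NDI) by ordinary least squares. It is an illustration only: the data are synthetic and constructed to follow an exact linear rule, the function names are hypothetical, and the sketch omits the multiple imputation and pooling used in the study.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ndi_model(
    ndi_5y: Sequence[float], severe_ho: Sequence[int], ndi_baseline: Sequence[float]
) -> List[float]:
    """Least-squares fit; returns [intercept, coef_severe_ho, coef_baseline]."""
    rows = [[1.0, float(h), float(b)] for h, b in zip(severe_ho, ndi_baseline)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ndi_5y)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free data (NOT study data): 5-year NDI is built as
# 5 + 0.5 * baseline - 4 * severe_ho, so the HO coefficient should be -4.
baseline = [40.0, 50.0, 60.0, 30.0, 70.0, 55.0]
severe = [0, 1, 0, 1, 0, 1]
ndi5 = [5 + 0.5 * b - 4 * s for b, s in zip(baseline, severe)]
coefs = fit_ndi_model(ndi5, severe, baseline)
```

In this toy data set the second coefficient recovers the built-in group difference: at equal baseline NDI, the severe HO group scores 4 points lower at 5 years.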

Results

Missing values

There were no missing values for HO at the 5-year follow-up. For HO at the 2-year follow-up, there were missing values for 24/79 ADRs. Three NDI values at 5 years were missing.

Prevalence of heterotopic ossification (based on implants)

Prevalence at 2 years of follow-up

Of the 55 arthroplasties where it was possible to evaluate HO at 2 years of follow-up, HO was present in 84%. Severe HO (grade 3 or 4) occurred in 49%, and complete fusion (grade 4) in 16% (Table 2).

Table 2 Prevalence of HO at 2 and 5 years of follow-up

Prevalence at 5 years of follow-up

Heterotopic ossification was present in 92% of the 79 arthroplasties at 5 years of follow-up. Severe HO (grade 3 or 4) existed in 71%, and complete fusion (grade 4) in 27% (Table 2).
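The three prevalence figures reported at each time point (any HO, severe HO, complete fusion) are simple proportions of the evaluable arthroplasties. A minimal Python sketch of that calculation follows; the function name and the list of grades are hypothetical and do not reproduce Table 2.

```python
from typing import Dict, Sequence


def ho_prevalence(grades: Sequence[int]) -> Dict[str, float]:
    """Proportions of arthroplasties with any HO (grade >= 1),
    severe HO (grade 3 or 4) and complete fusion (grade 4)."""
    if not grades:
        raise ValueError("no evaluable arthroplasties")
    n = len(grades)
    return {
        "any_ho": sum(g >= 1 for g in grades) / n,
        "severe": sum(g >= 3 for g in grades) / n,
        "fused": sum(g == 4 for g in grades) / n,
    }


# Made-up grades for ten implants, for illustration only.
example = ho_prevalence([0, 1, 2, 3, 4, 4, 3, 2, 0, 3])
```

Only arthroplasties with an evaluable radiograph enter the denominator, which is why the 2-year figures are based on 55 implants and the 5-year figures on 79.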

Progression of heterotopic ossification (2 to 5 years of follow-up) (based on implants)

Slightly over 1/3 of the arthroplasties (36%) increased in HO grade between 2 and 5 years of follow-up. Severe progression (increase in grade by 2 or more) occurred in 7% of arthroplasties.

Predisposing factors to heterotopic ossification (at 5 years of follow-up) (based on patients)

In the univariate analysis, male sex was a strong predictor for HO (OR = 9.86, p < 0.001). Age was a predictor (OR = 2.14, p = 0.041) and BMI > 30 was at the threshold (OR = 4.27, p = 0.051). Neither smoking, preoperative ASP, facet arthropathy at the index level, nor centering of the ADR device had predictive value. No correlation was found between surgeon and HO. When stratifying HO by the number of operated segments, more severe HO was found in patients with two operated segments (p = 0.032).

In the multivariate analysis, sex remained a strong predictor for HO (OR = 8.87, p < 0.001), while age and BMI lost significance as predisposing factors (Table 3).

Table 3 Multivariate analysis of predisposing factors to HO

Predisposing factors to progression of heterotopic ossification (based on patients)

In the univariate analysis, no variable was identified as having predictive value. In the multivariate analysis no significant predisposing factors were identified in those patients whose HO had worsened, though smoking was at the threshold, possibly representing a trend (odds ratio = 3.95, p = 0.055).

Clinical outcomes

No difference in NDI was found between patients with severe HO and those with light or no HO. The regression coefficient was −9.3, with 95% CI −21.1 to 2.5 and p = 0.12.
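The reported coefficient, confidence interval and p value can be checked against one another. The Python sketch below back-calculates the standard error from the 95% CI and derives a two-sided p value under a normal approximation. This is an assumption made for illustration: pooled estimates from multiple imputation are usually tested against a t distribution, so small deviations from the published p value are to be expected. The function name `p_from_ci` is hypothetical.

```python
import math
from typing import Tuple


def p_from_ci(
    estimate: float, lower: float, upper: float, z_crit: float = 1.959964
) -> Tuple[float, float, float]:
    """Return (standard error, z statistic, two-sided p value) implied by
    a symmetric 95% confidence interval, using a normal approximation."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Values reported for the effect of severe HO on 5-year NDI.
se, z, p = p_from_ci(-9.3, -21.1, 2.5)
```

With the reported values, the implied standard error is about 6.0 NDI points and the approximate p value is about 0.12, in agreement with the published figure.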

Discussion

Heterotopic ossification is an unintended phenomenon that occurs in cervical ADR and may lead to fusion. The term heterotopic ossification is, however, misleading. Heterotopic ossification is by definition the formation of bone outside the skeletal system [20]. In the case of cervical arthroplasty, HO refers to the formation of bone spurs at the disk level, i.e., within the skeletal system. Such ossification could therefore represent a progression of an intrinsic degenerative process of natural aging, rather than a complication of surgery itself. The term has, however, fallen into general use in the specialized literature, and we therefore use it to refer to bone formation in connection with an arthroplasty device.

Motion preservation and ASP

The loss of segmental motion did not affect the condition of adjacent levels, i.e., ASP, in this cohort of patients, as previously published [12].

Prevalence of HO

There is a wide range of HO prevalence reported, ranging from 0% in a study of active duty military subjects in the USA [3] to 100% in a multicenter study performed in Norway [4], both studies with 2 years of follow-up. In a meta-analysis by Kong et al. [21], HO ranges from 16 to 86%, with follow-ups from 1 to 10 years.

It is not clear why there is such a wide range of prevalence. One factor, as Kong suggests, could be interobserver error: when images are reviewed and HO is graded, detection sensitivity is likely to differ among authors at different institutions.

Even in high-quality randomized controlled studies, bias may lead to unreproducible results. Research findings are less likely to be true in studies conducted in a small field, with greater flexibility in designs, definitions, outcomes, analytic modes and when there are greater financial and other interests involved [22]. In a meta-analysis published by Yang et al. [23], only 3/38 included studies had low risk of bias, and all others showed intermediate or high risk of bias.

Our multicenter RCT was conducted outside IDE (investigational device exemption) conditions and was blinded until the moment of implantation. The results did not favor the “new” device. The sponsors were not involved in the study design, conduct of the trial, data analysis, interpretation of the results or the writing of the manuscript. There is therefore a lower risk of publication, external validity, confirmation or financial conflict of interest bias, the four sources of bias defined by Radcliff et al. [24].

We found a prevalence of HO of 84% at 2 years and 92% at 5 years of follow-up. Patients whose ADRs had been removed due to loosening/subsidence were excluded, as we unfortunately could not evaluate HO in those patients. It is likely that patients with loosening/subsidence have less tendency to ossify; excluding them concentrates the patients with ossification among those remaining, which may have led to an overestimation of HO.

Progression of HO

Progression of HO presented in the literature focuses on the increase in prevalence [5, 6]. In our study, we analyzed increase in grade of HO, i.e., aggravation at a given operated segment. HO progressed between 2 and 5 years of follow-up in about 1/3 of the ADRs.

Predisposing factors to occurrence of HO

Significantly more HO, and of higher grades, was found in men. One possible explanation is that the same hormonal factors that contribute to a higher rate of osteoporosis [25] in women might protect them from excessive ossification.

No other significant predisposing factors to HO were found in our study.

Age and BMI seemed to be predisposing factors for HO, but lost significance in a multivariate analysis. The odds ratio for age is similar in the uni- and multivariate analyses. It is possible that with a larger sample size the p value for age would remain significant after adjusting. Age merits, therefore, further investigation as a potential predisposing factor to HO.

It is also plausible to argue that age leads to more degenerated levels, and that two treated levels carry a higher risk of HO than a single level. Two-level surgery might therefore simply covary with age. Another possible explanation for two-level surgery being a positive predictor is that patients with a tendency to have degenerative disease at several segments also have a tendency to grow bony spurs, i.e., develop more HO. The HO would therefore be related to the patient’s biology and not to the number of levels operated. Another factor leading to more HO in patients with ADR at 2 segments may be a disharmony between the motion of the operated segments and the motion of the patient’s total cervical spine, which is probably more rigid than the operated segments, due to generalized degenerative disk disease.

Zhou et al. found a correlation between preoperative spondylosis and postoperative ossification [9], which was not the case in our study, where neither facet arthropathy at the index level nor degeneration at adjacent levels were identified as predisposing factors.

Yi et al., in a retrospective study including 170 patients and 3 different arthroplasty devices, identified male sex and type of artificial disk device as predisposing factors to HO [8]. In our study, all patients received the same type of device, which precludes an analysis of type of arthroplasty device as a potential risk factor to HO.

In our study, the centering of the device was not a predisposing factor to HO. Yang et al. found that the amount of residual exposed endplate (> 2 mm) is a risk factor for HO (OR = 4.5), suggesting that maximizing the implant–endplate interface might diminish the risk for ossification [7].

Predisposing factors to progression of HO

No factors were identified that could explain a tendency toward the progression (i.e., aggravation) of HO. Smoking is related to disk degeneration [26], and may therefore be expected to contribute to progression of HO. Although smoking did not, in our study, reach significance as a predisposing factor to progression of HO (odds ratio = 3.95, p = 0.055), that might be explained by a small sample size, and therefore we believe that it merits further investigation.

Clinical outcomes

In our study, clinical outcomes 5 years after surgery were not affected by the presence of severe HO. This is consistent with previous reports [4, 10], and to date no publications suggest that preserved movement or unintended fusion affects clinical outcomes. The fact that an ADR is fused, thus not fulfilling its role as a motion device, does not have any clinical impact.

Limitations

The inability to analyze those patients whose ADRs had been removed may have introduced bias and possibly an overestimation of HO in our study, as we analyzed the per-protocol ADR group and not the intention-to-treat ADR group.

It is possible that, with a larger sample, factors such as age or smoking might have reached significance as risk factors for the occurrence or progression of HO.

All patients received the same type of device; therefore, type of arthroplasty device could not be analyzed as a potential risk factor for HO.

Conclusion

Almost all ADR implants in our study had HO at 5 years of follow-up. Male sex is a clear risk factor. Prevalence increases slightly, but in most arthroplasties HO does not progress after 2 years. Severe HO does not affect clinical outcome. Thus, having a functioning arthroplasty does not have a clinical impact.