Introduction

Surgical site infection (SSI) is one of the most serious complications after spine surgery, with potentially devastating consequences such as failure of fixation, osteomyelitis, pseudarthrosis, increased length of hospital stay, mortality, unfavorable surgical outcome and associated health care costs [1,2,3,4,5]. Within the field of orthopedic surgery, a relatively high incidence of SSIs is observed after spine surgery: up to 12% depending on diagnosis, surgical approach, use of spinal instrumentation and the complexity of the procedure [6,7,8]. SSIs can be both difficult to diagnose, as there is no pathognomonic sign or symptom that accurately indicates their presence, and difficult to treat. One or more operative debridements combined with prolonged antibiotic treatment may be necessary to treat the infection [1, 9, 10]. With the rise in prevalence of antibiotic-resistant organisms, the treatment of SSI has become even more difficult, and therefore, the prevention of SSI is a matter of utmost importance [11].

Prior research has identified several factors associated with an increased risk of SSI after spine surgery, e.g., advanced age, revision surgery, obesity, diabetes, smoking, high amount of intraoperative blood loss and prolonged duration of surgery [6, 7, 12,13,14]. These risk factors are usually reported as relative risks (RR) or odds ratios (OR). However, these RRs and ORs are measures of association and are not sufficient to estimate an individual’s personal risk of SSI given a combination of these factors.

A prediction model that combines risk factors is an appropriate tool for preoperative patient counseling when evaluating the individual risk of SSI after spinal surgery. Estimating an individual’s risk of SSI may help identify high-risk patients, thus optimizing patient selection and possibly preventing the devastating consequences and associated outcomes of an SSI after surgery [1].

Lee et al. developed a prediction model for SSI after spine surgery based on the patient’s comorbidity profile and invasiveness of surgery by using a prospectively collected registry for all surgical spine patients at University of Washington and Harborview Medical Center consisting of 1532 patients having instrumented and non-instrumented spinal surgery [12]. However, external validation of the prediction model showed poor predictive performance in a large cohort of patients undergoing instrumented thoracolumbar spine surgery in an academic spine setting [15].

The aim of this study was to develop and internally validate a multivariable prediction model for accurate prediction of SSI after instrumented spine surgery using a large cohort from a Western European academic center.

Methods

Patient population

This was a retrospective cohort study of all instrumented spinal surgery procedures of the thoracic, lumbar and thoracolumbar spine performed in adult patients (≥ 18 years) in an academic referral center for spinal pathology from January 1, 1999, up to January 1, 2016. Patients diagnosed with an infection after instrumented spinal surgery elsewhere were excluded, as were patients whose medical files were not available for at least 1 year after surgery.

All operations were performed by 3 experienced orthopedic surgeons specialized in spine surgery. In select cases when neurological decompression was needed, neurosurgeons participated in the operation. All patients underwent an instrumented posterior (posterolateral or interbody) spinal fusion of the thoracic, lumbar and thoracolumbar spine, with or without an additional procedure (anterior fusion or release, spinal decompression, the removal of instrumentation, tumor resection or corpectomy/osteotomy).

Patients were followed for a minimum of 1 year after the index operation to monitor all complications and instances of revision surgery. All complications, extensive demographics, comorbidity and surgical details were recorded by collecting data from all available electronic and paper records of the patients. The primary outcome of interest was the occurrence of SSI. The diagnosis of SSI was based on the criteria of the Centers for Disease Control and Prevention (CDC) [16] and the Dutch national PREZIES network (prevention of hospital infections through surveillance) [17]. An SSI was considered to be deep if it presented at the site of the operation with involvement of the subfascial tissues.

Predictor variables

An often-used rule of thumb states that at least 10 events (i.e., occurrences of SSI) are needed per predictor variable that is tested in the prediction model development step [18]. When more predictor variables are added to the model, the probability of overfitting (i.e., the model performing considerably worse in patients who were not part of the derivation cohort) increases. We therefore pre-selected, from all baseline characteristics, those that we thought most likely to result in an accurate prediction model. The pre-selection was based on what was already known from other studies, the distribution of the predictor in our sample, and experience in our own hospital. Using this method, we reduced the initial set of potential predictor variables to 8, i.e., age, body mass index (BMI, kg/m²), smoking status, diagnosis, revision surgery, ASA (American Society of Anesthesiologists) physical status, surgical invasiveness index (SII) [19] and the preoperative use of nonsteroidal anti-inflammatory drugs (NSAIDs).

Smoking status was dichotomized into currently smoking yes or no, independent of the volume and tobacco product used. All passive smokers and ex-smokers were regarded as non-smokers. ASA physical status, a classification to assess the fitness of the patient before surgery, was coded according to the five-category physical status classification system of the American Society of Anesthesiologists from 1963 (1 = healthy person, 2 = mild systemic disease, 3 = severe systemic disease, 4 = severe systemic disease that is a constant threat to life, 5 = a moribund person who is not expected to survive without the operation) [20]. The surgical invasiveness index is a validated instrument with a range from 0 to 48 points, containing the sum of the following six weighted surgical components: the number of levels anteriorly decompressed, the number of levels anteriorly fused, the number of levels anteriorly instrumented, the number of levels posteriorly decompressed, the number of levels posteriorly fused and the number of levels posteriorly instrumented. The weight of each component represents the number of vertebral levels at which that component has been performed [19]. A higher score means higher invasiveness. For example, in an L4–L5 posterior fusion and decompression with the use of an intervertebral cage and posterior instrumentation, the score would be 9 (anterior fusion = 2, anterior instrumentation = 2, posterior decompression = 1, posterior fusion = 2, posterior instrumentation = 2).
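
The scoring described above can be illustrated with a short sketch. It assumes only what the text states, namely that the index is the sum of the level counts of the six components; the class and function names are hypothetical and not taken from the original instrument [19].

```python
from dataclasses import dataclass


@dataclass
class SurgicalComponents:
    """Number of vertebral levels at which each of the six components was performed."""
    anterior_decompression: int = 0
    anterior_fusion: int = 0
    anterior_instrumentation: int = 0
    posterior_decompression: int = 0
    posterior_fusion: int = 0
    posterior_instrumentation: int = 0


def surgical_invasiveness_index(c: SurgicalComponents) -> int:
    """Sum the level counts of all six components (hypothetical helper)."""
    return (
        c.anterior_decompression
        + c.anterior_fusion
        + c.anterior_instrumentation
        + c.posterior_decompression
        + c.posterior_fusion
        + c.posterior_instrumentation
    )


# The L4-L5 example from the text: cage plus posterior fusion,
# decompression and instrumentation.
example = SurgicalComponents(
    anterior_fusion=2,
    anterior_instrumentation=2,
    posterior_decompression=1,
    posterior_fusion=2,
    posterior_instrumentation=2,
)
print(surgical_invasiveness_index(example))  # 9, matching the score in the text
```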

Diagnosis of the included patients was divided into 4 subgroups, i.e., one- or two-level degenerative disorders (with or without neurologic compromise), failed back syndrome (patients who had already undergone previous spine surgery on the same level), trauma (unstable vertebral fractures with or without neurological compression) and other (adult spinal deformity, spinal metastases/malignancy, spondylodiscitis). NSAID use was defined as the daily use of NSAIDs before surgery for more than 1 week and still in use at the time of surgery.

Model development

Incomplete patient records were imputed using stochastic regression imputation, to prevent a potentially considerable loss of statistical precision and to decrease the probability of biased results when compared to using only complete patient records (see Table 1). We used predictive mean matching to generate the imputed values. After imputation, we included all potential predictor variables in a logistic regression analysis. Using stepwise backward elimination on the hypothesized predictor variables, we excluded nonsignificant predictors to arrive at a more parsimonious model. As suggested by prediction modeling guidelines, we used a less strict alpha for eliminating variables from the model to prevent premature deletion of potentially important predictor variables [21]. We chose an alpha of 0.10 instead of the conventional 0.05.

Table 1 Baseline characteristics of all patients included in the study

For continuous variables, the association with the outcome was initially assumed to be linear. Possible nonlinear effects were visualized using plots and formally tested using restricted cubic splines, a regression technique that can be used to test for deviations from linearity [21]. In case of significant evidence of a nonlinear relation, the continuous variable was categorized into clinically meaningful categories.

The model’s performance was quantified using measures of discriminative ability and measures of calibration. We assessed the model’s ability to discriminate between those who developed SSI and those who did not by computing the area under the receiver operating characteristic (ROC) curve (AUC). This AUC can range from 0.5 (no discriminative ability) to 1.0 (perfect discriminative ability). Model calibration (i.e., agreement between predicted and observed probabilities) was evaluated by visual inspection of the calibration plot and by computing the Hosmer and Lemeshow goodness-of-fit test (HL test). A significant HL test indicates evidence against good model fit.
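
As an illustration of the discrimination measure, the AUC can be read as the probability that a randomly chosen patient with SSI receives a higher predicted risk than a randomly chosen patient without SSI. The sketch below computes it by pairwise comparison; the function name and the toy data are hypothetical and do not come from the study cohort, whose analysis was performed in R.

```python
def auc(predicted, observed):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    predicted: predicted probabilities; observed: 1 for SSI, 0 for no SSI.
    Tied predictions count as half a concordant pair.
    """
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    non_cases = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_non in non_cases:
            if p_case > p_non:
                concordant += 1.0
            elif p_case == p_non:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy data (hypothetical): six patients, two of whom developed SSI.
toy_predicted = [0.02, 0.05, 0.10, 0.20, 0.30, 0.08]
toy_observed = [0, 0, 1, 0, 1, 0]
print(auc(toy_predicted, toy_observed))  # 7 of 8 pairs concordant -> 0.875
```

A model that assigns the same probability to every patient yields 0.5 under this definition, which corresponds to the lower bound mentioned above.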

Internal validation

We internally validated the initial prediction model using standard bootstrapping techniques. Using results from the bootstrap procedure, we penalized the model’s regression coefficients, so future predictions will be less extreme (to counter the effect of overfitting) by multiplying them with a shrinkage factor, and re-estimating the model intercept. Also, we computed the estimated optimism in the AUC. This is a measure of the likely difference in AUC when the model is applied to future patients. All analyses were performed using R version 3.3.3.

Results

A total of 898 participants were available for the development of the prediction model. Sixty (6.7%) were subsequently diagnosed with SSI.

Table 1 shows a summary of baseline variables including all potential predictor variables of the whole cohort and separately for those who developed SSI and those who did not.

The restricted cubic spline regression revealed evidence of a U-shaped association between BMI and SSI instead of a linear one. Therefore, we categorized BMI into three clinically relevant subgroups: normal weight (BMI up to 25), overweight (BMI between 25 and 30) and obese (BMI over 30). The backward stepwise elimination yielded the following predictor variables: age, BMI category, ASA physical status, diagnosis (degenerative or revision versus trauma and other) and the use of NSAIDs. All other potential predictor variables were eliminated from the model because their p value was higher than 0.10.
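
The BMI categorization can be written down as a small helper. This is a sketch only: the text does not state to which group a BMI of exactly 25 or exactly 30 belongs, so the boundary handling below (25 counted as normal weight, 30 as overweight) is an assumption, as are the function names.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Map BMI to the three subgroups used in the model.

    Assumption: exactly 25 is 'normal weight' and exactly 30 is 'overweight';
    the source only says 'up to 25', 'between 25 and 30' and 'over 30'.
    """
    if bmi <= 25:
        return "normal weight"
    if bmi <= 30:
        return "overweight"
    return "obese"


print(bmi_category(body_mass_index(80.0, 1.80)))  # BMI of about 24.7
```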

The ROC curve of the prediction model is shown in Fig. 1. The AUC was 0.72 (95% confidence interval [CI] 0.65–0.79), indicating reasonable discriminative ability. The calibration plot is shown in Fig. 2. It shows the model is well calibrated for the whole range of predicted probabilities, as it lies close to the 45-degree line of perfect fit.

Fig. 1 Receiver operating characteristic curve of the prediction model for surgical site infection

Fig. 2 Calibration plot of the prediction model for surgical site infection

The internal validation step yielded a shrinkage factor of 0.87. All regression coefficients were multiplied by this factor to shrink them closer to 0 to produce less extreme predictions for future patients, to counteract the effect of model overfitting.
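
The shrinkage step amounts to a simple multiplication, sketched below. Only the factor 0.87 comes from the text; the coefficient names and values are hypothetical placeholders, not the fitted coefficients of the model. The intercept is deliberately left out because, as described in the Methods, it is re-estimated after shrinkage rather than multiplied by the factor.

```python
SHRINKAGE_FACTOR = 0.87  # reported by the internal validation step


def shrink_coefficients(coefficients, factor=SHRINKAGE_FACTOR):
    """Multiply each regression coefficient (log-odds scale) by the shrinkage factor."""
    return {name: beta * factor for name, beta in coefficients.items()}


# Hypothetical coefficients for illustration only.
raw_coefficients = {"predictor_a": 0.50, "predictor_b": -0.20}
shrunk_coefficients = shrink_coefficients(raw_coefficients)
print(shrunk_coefficients)
```

Because every coefficient moves toward 0, the predicted log-odds vary less between patients, which is what makes future predictions less extreme.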

Prediction model

Table 2 shows the coefficients of the resulting prediction model. An individual’s risk of an SSI can be calculated from these coefficients.

Table 2 Prediction model for the occurrence of surgical site infection
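
For a logistic regression model, an individual risk is obtained by summing the intercept and the coefficient-weighted predictor values into a linear predictor and applying the inverse logit. The sketch below shows this general calculation; the intercept, coefficient names and values are hypothetical and must be replaced by the values from Table 2 before any real use.

```python
import math


def predicted_risk(intercept, coefficients, patient):
    """Predicted probability from a logistic model.

    coefficients: mapping of predictor name to coefficient (log-odds scale).
    patient: mapping of predictor name to value (1/0 for categorical indicators);
    predictors missing from `patient` are treated as 0, i.e. the reference level.
    """
    linear_predictor = intercept + sum(
        beta * patient.get(name, 0.0) for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical numbers for illustration only, NOT the published coefficients.
hypothetical_intercept = -3.0
hypothetical_coefficients = {"predictor_a": 0.5, "predictor_b": -0.2}
hypothetical_patient = {"predictor_a": 1, "predictor_b": 1}

risk = predicted_risk(hypothetical_intercept, hypothetical_coefficients, hypothetical_patient)
print(f"Predicted risk: {risk:.1%}")
```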

Discussion

This manuscript presents an internally validated predictive model to estimate the risk of SSI after instrumented thoracolumbar spinal fusion. In the literature, risk factors are generally reported as relative risks or odds ratios. Although these measures of association are important in understanding what contributes to an individual’s probability of an SSI, they are difficult to translate into a tool for decision making and cannot be used to calculate an individual’s probability of an SSI. The prediction model presented in this manuscript can be used to predict an individual risk (as a proportion or percentage) for SSI after instrumented spinal fusion.

This model may be helpful in the clinical setting to identify patients at high risk of SSI, to optimize patient selection and possibly to prevent the devastating consequences and associated outcomes of an SSI after surgery through extra preventive measures such as prolonged antibiotic prophylaxis or optimization of nutritional status.

To our knowledge, this is the first prediction model for SSI after instrumented spine surgery procedures. The model has an AUC of 0.72 (95% CI 0.65–0.79). This is considered to be moderate and is comparable to prediction models from other clinical disciplines with an AUC range from 0.54 to 0.73 [22, 23]. Bear in mind that the model is used for prediction of future events, in contrast to diagnostic models that estimate the probability of the presence or absence of an outcome at the present time. Arguably, predicting the future is much more complex; as Niels Bohr said: “Prediction is very difficult, especially about the future.”

Lee et al. presented a model for SSI in 2014 based on 1532 patients [12]. In the model of Lee et al., all spine surgery procedures were included, whereas in our model only instrumented procedures were included. A second difference between the two models was the definition of SSI. Lee et al. defined SSI as an infection requiring return to the operating room for irrigation and debridement without a clear difference between superficial and deep infection. Our definition of SSI was based on the CDC criteria and the Dutch national PREZIES network including only deep infections independent of return to the operating room.

One of the limitations of the model that we developed is the number of patients in our cohort. Although we used a large cohort consisting of 898 patients, more patients (and subsequently more cases of SSI) would have given us the opportunity to study even more potential predictor variables. Remarkably, some potential predictor variables that were important in other studies, like smoking and surgical invasiveness index, were not selected in our modeling procedure [6, 13]. This could be due to a lack of statistical power, also related to the number of patients in our cohort. Other risk factors described in the literature with a very low incidence in our cohort, like Parkinson’s disease and paraplegia, were not selected [24].

Some other associations between predictor variables and SSI were unexpected [25, 26]. We did not observe a linear association between BMI and the log-odds of an SSI. Overweight patients with a BMI between 25 and 30 were at lower risk of SSI than normal-weight patients (BMI up to 25), whereas obese and morbidly obese patients with a BMI of more than 30 were more prone to an SSI after instrumented spine surgery. In most literature, only (morbid) obesity (BMI > 30) is described as a risk factor for SSI after spine surgery, whereas overweight (BMI below 30) is not [7, 13, 27]. A hypothesis for mild overweight as a protective factor could be that these patients have more soft tissue covering the instrumentation after an instrumented spinal procedure.

Most predictor variables were in agreement with the literature. Age, ASA score, and diagnosis were significant risk factors for SSI in our model. In the previous literature, comorbid medical conditions were found to be significantly associated with SSI [7, 28, 29]. Trauma, adult spinal deformity with long segment procedures, spondylodiscitis and malignancy carried a higher risk of SSI than degenerative disorders or failed back surgery syndrome [6, 30,31,32,33]. Older age was also associated with an increased risk of postoperative spinal infection [13, 34].

Although this model can be of great benefit for risk assessment, it would be most valuable if it were generalizable to future patients and patients from different hospitals. Hence, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort.

Conclusion

We presented an internally validated predictive model for SSI after instrumented thoracolumbar spine surgery. This tool can be of substantial value in the preoperative counseling of patients for shared surgical decision making and ultimately improve safety in spine surgery. Identification of patients at risk for postoperative infection allows for individualized patient risk assessment with better patient-specific counseling and may accelerate the implementation of multi-disciplinary strategies for the reduction of SSIs.