Introduction

Patients with ischemic stroke (IS), intracerebral hemorrhage (ICH), and subarachnoid hemorrhage (SAH) often require endotracheal intubation (EI) for airway protection, aspiration, or neurogenic respiratory failure [1]. While tracheostomy (TR) is performed in about 10–15% of patients in the general intensive care unit (ICU), the rate in the neurological ICU and for stroke patients ranges between 15 and 35% [2, 3]. Predicting the need for prolonged EI is challenging, which complicates the choice of optimal timing for TR. Several factors have been demonstrated to predict the need for prolonged EI and TR in stroke patients, including low Glasgow Coma Scale score, midline shift, presence of chronic obstructive pulmonary disease (COPD), ICH volume, thalamic location of the ICH, presence of intraventricular blood, and hydrocephalus [1, 4, 5, 6].

Early TR has been a topic of interest in many ICUs, especially among intubated patients with severe IS and ICH, given the anticipated benefits of shorter ventilation duration (VD), shorter ICU length of stay (LOS), and fewer sedative requirements, as well as a mortality benefit observed in some studies [7, 8]. Multiple scoring systems have been designed to predict the need for prolonged mechanical ventilation (MV) and to identify which subset of patients may benefit from early TR [1, 4, 5, 6].

The Stroke-related Early Tracheostomy vs Prolonged Orotracheal Intubation in Neurocritical care Trial (SETPOINT) is a randomized pilot study that employed a unique scoring system to predict the feasibility, safety, and need for early TR [1, 4, 9]. The SETscore consists of risk factors for prolonged (more than 2 weeks) MV, divided into 3 components: neurological function (maximum 10 points), neurological lesion (maximum 22 points), and general organ function or procedure (maximum 16 points) [4, 9].

In the SETPOINT pilot trial, critically ill neurological patients with a SETscore > 10 were randomized to receive either early TR (within 3 days of intubation) or standard-of-care (TR between day 7 and 14 from intubation if extubation, although planned for, was not possible until then) [1]. The trial demonstrated feasibility and safety of early TR, along with lower use of sedatives, lower ICU mortality, and lower 6-month mortality in the early TR group compared to standard-of-care. No differences in the primary end point (ICU LOS) or other secondary outcomes were noted [1]. The same authors internally validated the SETscore in a separate, independent, monocentric patient cohort and found a SETscore value of 10 to be the optimal cutoff to predict prolonged ICU LOS, VD, and TR need, with a positive predictive value (PPV) of 0.748 (p < 0.001) and 0.799 (p < 0.001) for ICU LOS and VD, respectively [9]. The investigators also concluded that the score needs prospective confirmation in other neurological ICU settings, with larger patient populations, before its true validity, reliability, and predictive potential can be judged [9]. The SETscore is currently employed in the multicenter trial SETPOINT2 (NCT02377167) to explore the potential of early tracheostomy in severely affected ventilated stroke patients [10].

We sought to perform an external validation of the SETscore in our patients with IS, ICH, or SAH in the neurological ICU to test its ability to predict who would receive TR and who would require prolonged EI. We felt that our patient population differs from the originally tested populations, given that it is a minority community of African Americans (AA) with a very high prevalence of strokes, multiple medical comorbidities, and vascular complications. We also reviewed other factors that might improve the accuracy and predictive value of the current score in our patient population.

Methods

Study Population

We conducted a retrospective analysis of a prospectively collected database of consecutive patients admitted to the neurological ICU in our tertiary stroke center from January 1, 2015, to December 31, 2016. Ethics approval was obtained from the university institutional review board, and the need for patient consent was waived. Inclusion criteria were as follows: adult age (> 18 years old) and critical neurological illness due to IS, non-traumatic ICH, or SAH that required EI on arrival to the emergency department or within 48 h of admission to the neurological ICU. Exclusion criteria were as follows: patients who had withdrawal of care without an extubation attempt or an offer for TR, patients who died within 48 h of admission secondary to devastating neurological injury, and patients who were intubated only for a procedure and extubated within 24 h.

Baseline Characteristics and Outcome Measures

We collected all demographic, clinical, and outcome data relevant to the patients by reviewing their medical records and imaging. The SETscore was calculated for individual patients based on reviewing the medical records, vital signs, imaging, and laboratory values during the first 48 h of admission using relevant data within 24 h of intubation, which is consistent with the previously published validation trial [9]. Poor outcome at discharge was defined as modified Rankin Scale score of 5–6.

When calculating the SETscores, we were faced with a lack of clear definitions for some of the scoring items (these were later defined in a prospective study). To enhance the validity and inter-rater reliability of the score across the study team members who collected the data [9], we defined some of the scoring items as follows:

  • Dysphagia: this was scored for those who had documented lower cranial neuropathy, had documented severe dysarthria on presentation or after an extubation trial, needed a nasogastric (NG) tube for feeding even after extubation, had documented failure of swallow testing, or required percutaneous gastric tube placement.

  • Diffuse lesion: we scored this for those patients who had evidence of diffuse brain involvement, involvement of more than 3 lobes of the brain, or involvement of both the supra- and infratentorial compartments. We did not consider hydrocephalus a diffuse lesion because it has separate scoring points.

  • Neurosurgical intervention: we scored this for all those who required an operative neurosurgical procedure within 72 h of admission, including external ventricular drain (EVD) placement. We did not score thrombectomy for ischemic stroke or angiography for aneurysm coiling or vasospasm treatment, as these are usually performed without intubation or the patients are usually extubated immediately after the procedure.

  • Additional respiratory diseases: this included all patients with one of the following confirmed or suspected diagnoses documented during the EI period: COPD, emphysema, asthma, pulmonary hypertension, pulmonary embolism, bronchiectasis, cystic fibrosis, and obesity hypoventilation syndrome.

  • Sepsis: this was scored for those with 2 points on the systemic inflammatory response syndrome (SIRS) criteria and a suspected or documented infection within 48 h of admission, according to the Surviving Sepsis Guideline [11].

These definitions differ slightly from some of those used in the SETscore validation study published by the same group after the SETPOINT trial [9]. The differences reflect the lack of an objective assessment for dysphagia when the patient arrives intubated, and the ongoing controversy over whether EVD placement counts as a neurosurgical intervention, especially at institutions where it is still performed in the operating room.

In the SETPOINT trial, patients were randomized to early TR (within 3 days of intubation) versus standard of care (usually 7–14 days after intubation) [12]. However, it is rarely part of our practice to perform TR that early, i.e., within 3 days of intubation, so we used 7 days as the cutoff for defining early tracheostomy in our cohort.

Statistical Analysis

Characteristics of the tracheostomy and non-tracheostomy groups were presented as means and standard deviations, or counts and percentages, with the two groups compared using the t test, chi-square test, or Fisher exact test. The SETscore was examined using the previously reported cutoff point of 10. The sensitivity, specificity, and accuracy were calculated with 95% confidence intervals estimated using a bootstrap method. The receiver operating characteristics were evaluated by calculating the area under the curve (AUC), with the 95% confidence interval estimated by bootstrap. Density plots were prepared to compare the distributions of the SETscore for the two groups, and for the logistic models described below.
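As an illustration of the cutoff-based metrics and bootstrap intervals described above, the following is a minimal Python sketch rather than the analysis code used in this study (which was run in R). The function names, the toy data, and the choice of a percentile bootstrap are assumptions made for illustration; the text does not state which bootstrap variant was applied.

```python
import random

def classification_metrics(scores, had_tr, cutoff=10):
    """Sensitivity, specificity, accuracy for the rule 'score > cutoff predicts TR'."""
    pairs = list(zip(scores, had_tr))
    tp = sum(1 for s, y in pairs if s > cutoff and y)
    fn = sum(1 for s, y in pairs if s <= cutoff and y)
    tn = sum(1 for s, y in pairs if s <= cutoff and not y)
    fp = sum(1 for s, y in pairs if s > cutoff and not y)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy

def bootstrap_ci(scores, had_tr, metric_index, cutoff=10,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed variant): resample patients with
    replacement and recompute the chosen metric on each resample."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        value = classification_metrics(
            [scores[i] for i in idx], [had_tr[i] for i in idx], cutoff
        )[metric_index]
        if value == value:  # drop NaN from resamples missing one class
            estimates.append(value)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1,
                          int((1 - alpha / 2) * len(estimates)))]
    return lower, upper

# Hypothetical toy data: 5 tracheostomized and 5 extubated patients.
toy_scores = [14, 17, 11, 9, 12, 6, 9, 13, 5, 8]
toy_had_tr = [True] * 5 + [False] * 5
sens, spec, acc = classification_metrics(toy_scores, toy_had_tr)
sens_ci = bootstrap_ci(toy_scores, toy_had_tr, metric_index=0)
```

With a cohort-sized sample the interval narrows considerably; on ten toy patients it is wide and serves only to show the mechanics.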

Logistic regression models were used to create prediction models for tracheostomy. Several models were explored to identify a reasonable model. Variables considered for inclusion in the final prediction model included the SETscore, its component items, and other variables that might contribute to prediction, including body mass index (BMI), race, stroke type and characteristics, positive sputum culture, and intubation on arrival. The variables were assessed using Bayesian model averaging, which assessed the probability that a variable coefficient was not zero when considering all possible combinations of the variables in the logistic regression [13], and random forests, which determined which variables reduced the variability to the largest extent using recursive partitioning [14]. The variables identified by these two methods were then entered into a final logistic regression model. Models were evaluated using both the Akaike information criterion and the AUC from receiver operating characteristics. The final model was then validated using Monte Carlo cross-validation with 70% of samples used for training and 30% for testing [15]. All analyses were done using R version 3.4.3, and significance was defined as p < 0.05 [13, 14, 15]. AUC results were interpreted as follows: 0.90–1 = Excellent, 0.80–0.89 = Good, 0.70–0.79 = Fair, 0.60–0.69 = Poor, 0.50–0.59 = Fail [16].
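The AUC grading scale and the 70/30 Monte Carlo cross-validation scheme described above can be sketched as follows. This is a hedged illustration in Python, not the R code used for the analysis: the function names are hypothetical, the pairwise AUC formula is one standard way to compute the statistic, and the "Below chance" label for values under 0.50 is an assumption added here because the quoted scale does not cover that range.

```python
import random

def auc(pos_scores, neg_scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def auc_grade(value):
    """Map an AUC to the qualitative scale quoted in the text [16]."""
    scale = ((0.90, "Excellent"), (0.80, "Good"), (0.70, "Fair"),
             (0.60, "Poor"), (0.50, "Fail"))
    for threshold, label in scale:
        if value >= threshold:
            return label
    return "Below chance"  # assumption: not part of the quoted scale

def monte_carlo_splits(n_samples, n_repeats, train_fraction=0.7, seed=0):
    """Yield repeated random train/test index splits (Monte Carlo
    cross-validation): each repeat reshuffles all samples independently."""
    rng = random.Random(seed)
    n_train = int(train_fraction * n_samples)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        yield order[:n_train], order[n_train:]

# Example: grade an AUC and draw three 70/30 splits of 245 patients.
grade = auc_grade(0.74)
splits = list(monte_carlo_splits(245, n_repeats=3))
```

In a full analysis, the model would be refit on each training split and its AUC computed on the matching test split, with the test AUCs then summarized across repeats.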

Results

We reviewed 511 consecutive patients admitted to the neurological ICU over a 2-year period. A total of 157 patients had TR, and the remaining 354 patients were extubated. Seventeen tracheostomized patients and 249 patients from the extubation group met exclusion criteria (Supplemental Figure I). The SETscore was calculated for 140 tracheostomized (mean age: 55 ± 13, 49% male) and 105 extubated patients (mean age: 57 ± 14, 50% male). Baseline characteristics for the study population are presented in Table 1. The patients who received TR had longer ICU LOS [days, median (lower–upper quartile): 18 (14–26) vs. 6 (4–10), p < 0.001], longer VD (mean ± SD: 16.7 ± 8.9 vs. 4.1 ± 3.6, p < 0.001), higher BMI (mean ± SD: 29.9 ± 9.2 vs. 27.1 ± 6, p = 0.004), a higher percentage of morbidly obese patients (BMI > 35 kg/m2) (21 vs. 6%, p = 0.001), and higher SETscore [median (lower–upper quartile): 14 (11–17) vs. 9 (6–13), p < 0.001] compared to the group of extubated patients (Table 1). The SETscore differed by almost 5 points between the two groups; however, the distributions of the scores in the two groups overlapped substantially around 10. The score discriminated best at the extremes, with scores > 20 identifying those with TR and scores < 5 identifying those without TR (Fig. 1). Scores less than 10 favored the patients who were successfully extubated, while the tracheostomized group was more likely to have scores above 10. The sensitivity of a score > 10 to predict the need for TR was 81% (95% CI 74–87%), with a specificity of 57% (95% CI 48–67%). The overall accuracy in correctly identifying those requiring TR and those successfully extubated was 71% (95% CI 65–76%), which is moderate, with a fair AUC of 0.74 (95% CI 0.68–0.81) (Fig. 2, dashed line).
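The cohort counts and the overall accuracy reported above are linked by simple arithmetic, which the short Python check below reproduces. It is only an approximate consistency check based on the rounded percentages in the text, not a reanalysis of patient-level data; the variable names are illustrative.

```python
# Cohort flow as reported in the text.
screened = 511
tracheostomized, extubated = 157, 354
excluded_tr, excluded_ext = 17, 249

n_tr = tracheostomized - excluded_tr   # tracheostomized patients analyzed
n_ext = extubated - excluded_ext       # extubated patients analyzed

# Accuracy is the case-weighted combination of sensitivity and specificity.
sensitivity, specificity = 0.81, 0.57  # rounded values from the text
expected_true_pos = sensitivity * n_tr
expected_true_neg = specificity * n_ext
accuracy = (expected_true_pos + expected_true_neg) / (n_tr + n_ext)

print(n_tr, n_ext, round(accuracy, 2))
```

Because the inputs are rounded, the recomputed accuracy can differ from the reported value in the last digit.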

Table 1 Characteristics of the tracheostomized and extubated patients

Fig. 1 Distribution of SETscore values for the tracheostomy and no tracheostomy groups

Fig. 2 Area under receiver operating characteristic curves for discrimination of patients with and without tracheostomy by SETscore and modified SETscore (SETscore+)

Logistic regression models were used to examine whether additional variables would improve the predictive value of the SETscore for the need for TR. A model including the SETscore, BMI, positive sputum culture (a surrogate for developing hospital-acquired and ventilator-associated pneumonia), race, and stroke type improved the sensitivity to 90%, the specificity to 78%, and the overall accuracy to 85%, with a good AUC of 0.89 (95% CI 0.85–0.93) (Table 2; Fig. 2, solid line; Table III in supplementary materials).

Table 2 Logistic models to predict TR using SETscore and other variables found to be significant to predict TR

With cross-validation, the test samples had an AUC of 0.87 (95% CI 0.81–0.93). The model discriminated patients with TR from successfully extubated patients better than the SETscore alone (p < 0.001 for the comparison of AUCs) (Fig. 3).

Fig. 3 Distribution of logistic prediction model using SETscore and other reported predictors (SETscore+) for the tracheostomized and successfully extubated patients

Another outcome of interest is related to the ability of the SETscore to predict the need for prolonged MV (which was defined as > 14 days) as opposed to the need for TR. The sensitivity for the score to predict prolonged MV was 84% (95% CI 74–87%), specificity 43% (95% CI 48–67%), and accuracy 55% (95% CI 65–76%) with AUC 0.66 (95% CI 0.59–0.73). Logistic models demonstrated an improvement over SETscore in the prediction for prolonged MV with consideration of additional variables including BMI, AA race, ICH, and positive sputum cultures. However, the overall accuracy of the score to predict prolonged MV remained low (69%) in comparison with the ability of the score and the model to predict need for TR (Supplemental Table I).

Table 3 compares outcome measures in the early TR (≤ 7 days, n = 44) and late TR (> 7 days, n = 96) groups. The patients in the early TR group had shorter ICU LOS (median [lower–upper quartile]: 15 [10–20] days vs. 20.5 [15.8–27] days, p = 0.002) and shorter VD (mean ± SD: 13.4 ± 9.4 days vs. 18.2 ± 8.3 days, p = 0.005) than the late TR group. These associations remained significant on multivariable analyses adjusting for age, National Institutes of Health Stroke Scale (NIHSS) score on admission, ICH score, race, positive sputum cultures, BMI, and SETscore. Early and late TR groups did not differ in their rates of positive sputum culture (66 vs. 70%, p = 0.820), SETscore [median (lower quartile–upper quartile): 14 (11–17) vs. 13.5 (11–18), p = 0.930], or rates of poor outcome at discharge (82 vs. 76%, p = 0.52).

Table 3 Comparison between early and late tracheostomy for all tracheostomized patients

Finally, we compared the outcome measures in the early (n = 38) and late TR (n = 75) groups, restricted to patients who scored > 10 on the SETscore. We noted a shorter ICU LOS and VD in the early TR group, while observing no differences in rates of positive sputum culture or poor outcome at discharge between the two groups (Supplemental Table II). In our cohort, 83% of those with scores greater than 10 failed extubation at least once.

Discussion

Our study of the external validity of the SETscore for predicting the need for TR and prolonged EI in intubated patients with IS, ICH, and SAH demonstrated moderate validity and accuracy. There was significant overlap in the SETscores in our sample for scores around 10; however, the score discriminated better at the extremes, with scores > 20 identifying those with TR and scores < 5 identifying those without TR. The SETscore was less predictive of the need for prolonged mechanical ventilation (defined as > 14 days), with lower accuracy, sensitivity, and AUC.

We also tested a statistical model, which was developed using 26 variables (Table III in supplementary materials). We examined all of the variables using Bayesian model averaging, random forests, and then logistic regression. Using these three approaches, we observed that adding BMI, AA race, ICH, and positive sputum culture to the SETscore produced better accuracy and AUC in predicting the need for TR than the SETscore alone. However, adding those variables might make the score more complicated, and some of them, e.g., positive sputum culture, might not be available during the first 24 h of intubation.

ICU LOS and VD were shorter in the early TR group (using the standard threshold of 7 days for early TR) than in those receiving late TR in our cohort [3, 12]. We did not expect to find a difference in discharge outcomes in our cohort, considering that critically ill neurological patients usually require 6 months to 1 year of follow-up to detect outcome differences [17].

The SETscore is a simple score that can be clinically helpful in identifying patients who will likely require TR and prolonged EI, and it might be used to guide clinical decisions about performing early TR, which was shown to be clinically feasible in the initial studies introducing this score [1, 4]. However, we found some limiting factors, including the lack of standard definitions for each of its components, which might affect the inter-rater accuracy of the score, especially very early after intubation. Hence, in our review we introduced standard definitions for some of the variables to improve our inter-rater accuracy. We felt that our definition of dysphagia was more clinically relevant and inclusive than the definition used in the SETscore internal validation paper [9], which did not score dysphagia if the patient was already intubated on arrival. Also, whether placement of an EVD in the operating room should count as a neurosurgical intervention continues to be a matter of debate. These differences in definitions can potentially create a bias and could explain the differences when compared to the initial validation paper. In addition, 14.7% (n = 36) of our included patients were morbidly obese (BMI > 35 kg/m2), a group excluded from the initial SETPOINT trial because obesity is considered a relative contraindication for percutaneous dilatational tracheostomy (PDT), with a high risk of reintubation in this patient population.

We noted a shorter ICU LOS in the early TR group compared to the late TR group. Although this association retained its significance after adjustment for other potential confounders that may contribute to ICU LOS, the finding should be interpreted cautiously, since ICU LOS is also influenced by other factors, such as a need for close monitoring or an EVD requirement, irrespective of the ventilator status of patients. Additionally, patients may have been extubated or received a TR but continued to require significant pulmonary toilet, which prolonged their ICU LOS despite not being mechanically ventilated.

Our results could also be influenced by our practice of giving patients a chance for extubation prior to TR, as evidenced by 83% of our tracheostomized cohort failing extubation at least once despite 60% (n = 68) with a SETscore > 10. We could not compare this to the SETPOINT trial, as the failed extubation rate in the standard treatment arm was not reported. Neither could this be compared to the SETscore internal validation paper, which only reported a failed extubation rate of 14.7% for those patients who scored less than 10 [9]. We also had a number of patients in whom we attempted extubation more than once (n = 23) before considering TR. This might be reflected in the high incidence of pneumonia in our tracheostomized population. The frequent failed extubation attempts might be related to a higher incidence of airway and laryngeal edema with stridor, which was a commonly reported cause in our patient cohort.

Early TR has been suggested in multiple retrospective studies to benefit neurological patients, with shorter VD [18], shorter ICU LOS [8, 19], improved outcome [8], and improved mortality [8]. TR has been shown to be feasible early in the course of neurological injury, as in the SETPOINT trial, where it was performed very early (within 3 days of intubation). Part of this feasibility lies in the increasing use of percutaneous techniques at the bedside, in comparison with surgical techniques in the operating room; PDT has been shown to be effective, fast, and safe at the bedside [12, 20]. In our cohort, more than 90% of the tracheostomies were performed at the bedside by the PDT method, which demonstrates the feasibility of performing TR early in the neurological patient population. We only referred patients for surgical tracheostomy if they had a prior TR, were morbidly obese, or had a high risk of bleeding. We had one major complication and no mortality from PDT in this cohort. The complication was a major bleeding event from a thyroid injury requiring surgical exploration.

In addition to the value of the SETscore in predicting the need for TR when applied within the first 24 h of intubation, our results suggest that it is reasonable to factor in the SETscore even when extubation is being considered, especially if the patient has already failed a first extubation attempt. This is particularly relevant for patients with high scores. For example, 83% of those with scores greater than 10 failed extubation at least once before TR in our cohort. It is noteworthy that extubation failure increases LOS and VD and might put the patient at higher risk of morbidity, aspiration pneumonia [21], and even mortality [22].

Limitations

Our study has a number of limitations. First, the study is from a single center, which may limit the generalizability of the findings. Second, its retrospective nature makes it susceptible to selection bias. Third, the lack of strict published definitions of the individual components of the SETscore may have contributed to variability in calculating the score in comparison with the original scoring in the paper or at the institution of origin. Additionally, outcome data for 3 and 6 months after discharge were not available, which limits our ability to comment on the effect of early TR on outcomes in our cohort. Finally, a significant portion of our patients who received TR had previously failed extubation, which may have influenced ICU LOS and VD.

Conclusion

The SETscore is a simple score with moderate accuracy and a fair ability to predict the need for TR after MV for IS, ICH, and SAH. The utility of this score may be improved by including additional variables such as BMI, AA race, ICH, and positive sputum cultures. Our findings are consistent with previously published data suggesting that early tracheostomy may shorten ICU LOS and VD.