Postoperative atrial fibrillation (POAF) is the most common complication after cardiac surgery,1 affecting 27-40% of patients and typically occurring in the first week after surgery.2 Postoperative atrial fibrillation after cardiac surgery is associated with an increased risk for postoperative stroke, recurrent atrial fibrillation, myocardial infarction, longer intensive care unit (ICU) and hospital length of stay (LOS), higher healthcare costs, higher readmission rates, and increased mortality.2,3,4,5 Risk factors for POAF are well established and include advanced age,2,6,7,8,9,10 history of atrial fibrillation (AF),2,6,8,9 valve surgery,2,9 chronic obstructive pulmonary disease (COPD),2,9 left ventricular dysfunction,6,8,9 left atrial enlargement,10 P-wave abnormalities,6,8 and medication withdrawal.2 To characterize individual risk of POAF, a number of risk indices incorporating multiple risk factors have been developed.

The Multicenter Study of Perioperative Ischemia (McSPI) AFRisk index was introduced in 2004,2 and was modified in 2009 to include statin use.11 It has been shown to reliably predict the occurrence of POAF after coronary artery bypass graft (CABG) surgery, but it requires pre-, intra-, and postoperative data. Recently, risk scores that rely only on preoperative data have been proposed as sufficient for predicting POAF. Introduced in 2010 to predict the risk of thromboembolism in patients with atrial fibrillation,12 the CHA2DS2VASc score has shown promise in predicting POAF after cardiac surgery in small- to moderate-size cohorts.13,14,15 Similarly, the recently developed POAF score9 and clinical risk-prediction model developed by Kolek et al.8 have shown predictive capacity in preliminary studies. While the POAF score has recently been found to outperform the CHA2DS2VASc score,16 there is still no consensus on which index best predicts POAF. Determining which index has the best discriminative capacity would allow for improved individual risk stratification for POAF and the potential to better assess risk reduction in interventional studies. Additionally, identification of an accurate risk index would facilitate implementation of prophylactic measures, such as amiodarone administration, to reduce the incidence of POAF after cardiac surgery.1

Because POAF increases morbidity, mortality, and healthcare costs, accurate risk assessment is important for instituting appropriate risk-reduction strategies. As such, we aimed to compare the predictive ability of the modified McSPI AFRisk index with three abbreviated indices containing only preoperative data, hypothesizing that the modified McSPI AFRisk index would have superior predictive capacity.

Methods

After obtaining approval from the Duke University School of Medicine Institutional Review Board (IRB Pro00006745, initially approved 13 March 2008), we retrospectively evaluated patients undergoing CABG ± valvular surgery in the Duke University Health System utilizing cardiopulmonary bypass (CPB) from 1994 to 2007 for whom comprehensive comorbidities, medications, and adverse outcomes had been prospectively collected. All included patients had consented to inclusion in a prospective observational database aimed at investigating phenotypic and genotypic differences that may be associated with adverse outcomes after cardiothoracic surgery. This database was collected with a target sample size of n = 1004 patients undergoing various cardiac surgical procedures and has been detailed in previous manuscripts.17 A STROBE diagram detailing cohort selection is shown in Fig. 1.

Fig. 1 STROBE diagram detailing cohort selection. ECG = electrocardiogram; POAF = postoperative atrial fibrillation

Prior to surgery, all patients were in sinus rhythm and were not being treated with Vaughan Williams class I or III antiarrhythmic agents. In all patients, anesthesia was induced and maintained with midazolam, fentanyl, propofol, and isoflurane or sevoflurane. Cardiopulmonary bypass management was standardized, with mean arterial pressure maintained between 50 and 80 mmHg, temperature during CPB maintained at 30-35°C as per institutional practice, and alpha-stat blood gas management utilized in all cases. Patients initially recovered in the Cardiothoracic Intensive Care Unit prior to transfer to a Stepdown Unit. Throughout the entire length of stay, cardiac rhythm was continuously monitored via telemetry.

Postoperative atrial fibrillation was defined as any episode of atrial fibrillation/flutter lasting > 30 sec detected by continuous telemetry or electrocardiogram (ECG) analysis, or any AF that required treatment or was recorded in daily notes/discharge summaries. The modified McSPI AFRisk index,2,11 CHA2DS2VASc score,13 POAF score,9 and Kolek clinical prediction model8 were calculated for all patients as defined in Table 1. Patients were classified as high risk for POAF if they had a CHA2DS2VASc score > 3, a POAF score ≥ 3, or a Kolek clinical prediction score ≥ 5. The modified McSPI score designates three risk categories: patients were classified as low risk for POAF if their modified McSPI AFRisk score was < 7, medium risk if their score was 7-24, and high risk if their score was > 24.11
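The risk strata described above can be written as a small lookup. The following is a minimal sketch only: the function name `classify_risk` and the index keys are hypothetical, and the thresholds are taken directly from the text (they are not a re-derivation of the scores themselves, whose components are listed in Table 1).

```python
def classify_risk(index, score):
    """Map a numeric score to a risk category using the cut points in the text.

    `index` is one of the hypothetical keys "mcspi", "cha2ds2vasc", "poaf", "kolek".
    """
    if index == "mcspi":
        # Three categories: < 7 low, 7-24 medium, > 24 high.
        if score < 7:
            return "low"
        if score > 24:
            return "high"
        return "medium"

    # Two-category indices: (threshold, whether the threshold itself counts as high).
    rules = {
        "cha2ds2vasc": (3, False),  # high if score > 3
        "poaf": (3, True),          # high if score >= 3
        "kolek": (5, True),         # high if score >= 5
    }
    threshold, inclusive = rules[index]
    is_high = score >= threshold if inclusive else score > threshold
    return "high" if is_high else "low"
```

For example, `classify_risk("mcspi", 24)` falls in the medium category, whereas `classify_risk("poaf", 3)` is already high risk because that cut point is inclusive.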

Table 1 Demographic factors in each risk score

Statistical analysis

The predictive capacity of each risk score was assessed using comparison of receiver-operating characteristic (ROC) curves,18 scaled Brier scores,19 as well as calculation of a net reclassification index (NRI) and integrated discrimination index (IDI).20 For each ROC curve, the area under the curve (AUC) and the confidence intervals were estimated with generalized Mann-Whitney U statistics, and the differences between the AUCs were tested with Chi-squared tests. As point estimates and confidence intervals of risk index performance were of primary interest, the P values for the test of difference in the AUC were not corrected for the multiple pairwise comparisons. Brier scores, a measure of the accuracy of a probabilistic prediction, are defined as the average squared difference between the predicted probability and the patient’s observed outcome.21 Brier scores were calculated for each risk score and then scaled by the maximum possible score, given the observed incidence of POAF in this cohort, to create a scaled Brier score.
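To make these two metrics concrete, the sketch below computes an AUC point estimate as a Mann-Whitney-type statistic and a scaled Brier score. It is illustrative only and is not the SAS or R code used for the analysis; the function names are hypothetical, no confidence intervals are computed, and the scaling shown (1 minus the Brier score divided by incidence × (1 − incidence)) is an assumed, commonly used formulation of "scaled by the maximum possible score".

```python
def auc_mann_whitney(y_true, y_score):
    """AUC as the proportion of (event, non-event) pairs in which the event
    has the higher score; ties count as one half."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def brier_score(y_true, p_pred):
    """Average squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def scaled_brier(y_true, p_pred):
    """Assumed scaling: 1 - Brier / Brier_max, where Brier_max is the score
    obtained by predicting the observed incidence for every patient."""
    incidence = sum(y_true) / len(y_true)
    brier_max = incidence * (1 - incidence)
    return 1 - brier_score(y_true, p_pred) / brier_max
```

Under this scaling, a value of 1 corresponds to perfect prediction and 0 to a model that is no better than assigning every patient the cohort incidence, so higher values indicate better overall performance.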

As each of the risk scores under consideration has defined risk strata,2,8,9,12 we used net reclassification indices, a measure of improvement in risk classification with transition from one model to another,22 to compare risk classification for each score relative to the modified McSPI AFRisk index. The calculation requires defining four probabilities based on reclassification to a higher risk class (moving “up”) and reclassification to a lower risk class (moving “down”) among patients who either did or did not experience POAF. We defined patients as reclassified “up” if they moved from low risk in the comparison index to either medium or high risk by the modified McSPI AFRisk index and reclassified “down” if they moved from high risk to either medium or low risk by the modified McSPI AFRisk index. The NRI is estimated as the difference in proportion of events (patients with POAF) that were reclassified “up” vs “down” minus the difference in proportion of non-events (patients without POAF) reclassified “up” vs “down”. In addition to comparing risk classification, we calculated the IDI to assess the difference in discrimination between models across the possible range of cut points.20 The IDI corresponds to the difference in discrimination slopes between the risk scores being compared and may be interpreted as the average improvement in risk values for events and non-events.23 All analyses were performed using SAS™ version 9.4 (SAS Institute Inc, Cary, NC, USA) or R version 3.1.1. P < 0.05 was considered statistically significant.
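The NRI and IDI definitions above can be sketched in a few lines. This is an illustrative implementation under stated assumptions rather than the analysis code: the function names are hypothetical, risk categories are assumed to be coded as ordinal integers (larger means higher risk), and the predicted probabilities passed to `idi` are assumed to come from models fitted separately for each risk score.

```python
def nri(events, old_cat, new_cat):
    """Categorical net reclassification index.

    events  : 1 for patients with the outcome, 0 otherwise
    old_cat : ordinal risk category under the comparison index
    new_cat : ordinal risk category under the new index
    Returns (total, event component, non-event component).
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for y, old, new in zip(events, old_cat, new_cat):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    nri_events = (up_e - down_e) / n_e
    # For non-events, moving "down" is the correct direction.
    nri_non_events = (down_n - up_n) / n_n
    return nri_events + nri_non_events, nri_events, nri_non_events


def idi(events, p_old, p_new):
    """Integrated discrimination index: difference in discrimination slopes
    (mean predicted risk in events minus mean predicted risk in non-events)."""
    def slope(p):
        ev = [pi for y, pi in zip(events, p) if y == 1]
        ne = [pi for y, pi in zip(events, p) if y == 0]
        return sum(ev) / len(ev) - sum(ne) / len(ne)
    return slope(p_new) - slope(p_old)
```

One coding that reproduces the "up" and "down" definitions in the text is low = 0, medium = 1, high = 2 for the three-category index and low = 0, high = 2 for the two-category indices, so that a move from low to medium or high counts as "up" and a move from high to medium or low counts as "down".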

Results

The initial cohort included 972 patients undergoing cardiac surgery, but due to lack of availability of ECG data used to calculate the Kolek score, this was reduced to 783 patients whose demographics are depicted in Table 2. The overall incidence of POAF was 32.6%, and the incidence of POAF in each risk group is depicted in Table 3. The mean [standard deviation (SD)] modified McSPI AFRisk index was 14.5 (13.0) and was higher in patients with POAF [23.2 (13.0)] than in patients without POAF [10.4 (10.7); P < 0.001]. The mean CHA2DS2VASc score was 2.5 (1.5) and was higher in patients with POAF [2.8 (1.5)] than in patients without POAF [2.4 (1.5); P < 0.001]. The mean POAF score was 1.4 (1.1) and was higher in patients with POAF [1.8 (1.1)] than in patients without POAF [1.2 (1.0); P < 0.001]. The mean Kolek clinical risk prediction score was 3.4 (1.6) and was higher in patients with POAF [4.0 (1.6)] than in patients without POAF [3.1 (1.6); P < 0.001].

Table 2 Demographics of included patients
Table 3 Incidence of POAF in each risk group

Receiver-operating characteristic curves for each index are depicted in Fig. 2. The area under the ROC curve for the modified McSPI AFRisk index (0.77; 95% CI, 0.74 to 0.81) was significantly higher than for the CHA2DS2Vasc score (0.58; 95% CI, 0.54 to 0.62), POAF score (0.66; 95% CI, 0.62 to 0.70), and Kolek clinical risk prediction model (0.66; 95% CI, 0.62 to 0.70; P < 0.001 for all comparisons). In addition, the AUC for the CHA2DS2Vasc score was significantly lower than for the POAF score (P < 0.001) and the Kolek clinical risk prediction model score (P < 0.001). No difference was found between the AUC of the POAF and Kolek score (P = 0.79). A post-hoc power calculation utilizing the observed incidence of POAF (32.6%) and modified McSPI AFRisk AUC of 0.77 revealed 80% power to detect an AUC difference as small as 0.05 and 90% power for an AUC difference of 0.06. In our sample of 783 patients where 32.6% developed POAF, we estimate > 99% power to detect the observed difference between the modified McSPI AFRisk index AUC and the AUC of all other comparator scores.

Fig. 2 Receiver-operating characteristic curves for each risk index. McSPI = Multicenter Study of Perioperative Ischemia; POAF = postoperative atrial fibrillation

Adjusted Brier scores as well as R2 values suggest greater accuracy for the modified McSPI AFRisk index than the CHA2DS2Vasc score, POAF score, or Kolek clinical risk prediction model. In addition, the Hosmer-Lemeshow goodness of fit test did not reveal evidence of model misfit in our cohort (Table 4). Agreement on risk categorization between the modified McSPI AFRisk index, CHA2DS2Vasc score, POAF risk score, and Kolek clinical risk prediction model is depicted graphically in Fig. 3. The modified McSPI AFRisk index and CHA2DS2Vasc score classified 35.2% of patients in the same POAF risk category (8.8% as high risk, 26.4% as low risk). The modified McSPI AFRisk index and POAF risk score classified 36.7% of patients in the same POAF risk category (8.6% as high risk, 28.1% as low risk). The modified McSPI AFRisk index and Kolek clinical prediction risk model classified 36.1% of patients in the same POAF risk category (9.2% as high risk, 27.0% as low risk).
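The agreement percentages reported above are the sum of two concordant groups (both indices high, both indices low). A minimal sketch of that tabulation follows; the function name is hypothetical, and categories are assumed to be the strings "low", "medium", and "high". Patients placed in the medium category by the modified McSPI AFRisk index cannot agree with a two-category index under this definition.

```python
def category_agreement(mcspi_cat, other_cat):
    """Return (overall agreement, both-high fraction, both-low fraction)."""
    n = len(mcspi_cat)
    both_high = sum(a == "high" and b == "high"
                    for a, b in zip(mcspi_cat, other_cat)) / n
    both_low = sum(a == "low" and b == "low"
                   for a, b in zip(mcspi_cat, other_cat)) / n
    return both_high + both_low, both_high, both_low
```

As a consistency check on the reported figures, 8.8% + 26.4% = 35.2% for the CHA2DS2VASc comparison and 8.6% + 28.1% = 36.7% for the POAF score comparison.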

Table 4 Discrimination and calibration values for each risk index
Fig. 3 Scatter plots of comparison risk index predicted probability of postoperative atrial fibrillation (POAF) outcome by Multicenter Study of Perioperative Ischemia (McSPI) predicted probability of POAF outcome. The non-events appear in the left column with blue circles, and events appear in the right column with red crosses. The threshold for risk classification for each risk index is indicated with dashed lines

Reclassification data for the use of the modified McSPI AFRisk index compared with other risk scores are presented in Table 5. Reclassification from the CHA2DS2VASc score to the modified McSPI AFRisk index resulted in a total NRI = 0.209, with the NRI for patients with POAF = 0.492 and the NRI for patients without POAF = -0.283. Reclassification from the POAF score to the modified McSPI AFRisk index resulted in a total NRI = 0.034, with the NRI for patients with POAF = 0.508 and the NRI for patients without POAF = -0.474. Reclassification from the Kolek clinical risk prediction model to the modified McSPI AFRisk index resulted in a total NRI = 0.120, with the NRI for patients with POAF = 0.467 and the NRI for patients without POAF = -0.347. As seen in Table 5, the observed IDI was highest for the modified McSPI AFRisk index, indicating superior discrimination. Judged by AUC difference, NRI, and IDI relative to the modified McSPI AFRisk index, the CHA2DS2VASc score performed worst and the POAF score came closest, although the modified McSPI AFRisk index performed better by every metric.
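The reported totals follow from the two components: the total NRI is the sum of the NRI for patients with POAF and the NRI for patients without POAF. The short check below reproduces that arithmetic from the values quoted in the text; the variable and function names are hypothetical, and the tolerance simply allows for rounding to three decimal places.

```python
# (event NRI, non-event NRI, reported total NRI), as quoted in the text.
reported = {
    "CHA2DS2VASc score": (0.492, -0.283, 0.209),
    "POAF score": (0.508, -0.474, 0.034),
    "Kolek model": (0.467, -0.347, 0.120),
}


def components_match_total(event_nri, non_event_nri, total_nri, tol=0.0015):
    """True if event + non-event components reproduce the reported total."""
    return abs((event_nri + non_event_nri) - total_nri) <= tol


checks = {name: components_match_total(*vals) for name, vals in reported.items()}
```

The negative non-event components indicate that, among patients without POAF, more were moved to a higher risk category than to a lower one, which offsets part of the gain seen among patients with POAF.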

Table 5 Reclassification performance of the McSPI AFRisk Index vs CHA2DS2VASc score, POAF score, and Kolek clinical model

Discussion

In this retrospective study of 783 patients undergoing cardiac surgery with CPB, the modified McSPI AFRisk index showed superior capacity to predict POAF compared with three other indices. The POAF score and Kolek clinical risk prediction model showed moderate predictive ability, while the CHA2DS2VASc score had the least predictive capacity for POAF. Given these results, we encourage investigators and clinicians to use the modified McSPI AFRisk index to predict risk of POAF after cardiac surgery.

The incidence of POAF in our study (32.6%) was similar to that seen in multiple previous cardiac surgical cohorts.4,8,9,10,24 Of note, POAF has previously been reported as a relatively rare phenomenon (13% of patients),13 perhaps related to regional variability in incidence.2 Though not measured in our study, POAF has consistently been associated with increased in-hospital morbidity,2,5,25,26 resource utilization,5,26,27 and short-term mortality2,26 after cardiac surgery. In addition, patients experiencing POAF have an increased long-term risk of chronic atrial fibrillation28 and decreased long-term survival after cardiac surgery.7,29 Importantly, strategies to reduce POAF have been associated with reduced hospital LOS,4 healthcare costs,4,30 and incidence of stroke,31 suggesting the importance of POAF as a major morbid event after cardiac surgery. Nevertheless, strategies to prevent POAF have not appreciably changed the incidence of POAF over the past decade. Better identification of at-risk patients may improve the efficiency and outcomes associated with preventive interventions.

Assessing the predictive capacity of multiple risk prediction models can be complex. The cornerstone of these comparisons is often based on ROC curves, which assess discrimination—the ability to determine who will develop the condition of interest.20 Nevertheless, investigators have also encouraged the use of novel ways to evaluate the predictive capacity of risk models.32 One such technique is the use of the NRI, which sums the correct vs incorrect risk reclassifications in moving from one model to another.22 Based upon the derivation of the formula,23 the NRI is flexible in quantifying differences between two-category (e.g., CHA2DS2Vasc score, Kolek clinical risk prediction model, and POAF score) and three-category (e.g., modified McSPI AFRisk index) risk indices. To address a category-free measure of discrimination improvement, we also present the IDI, which can be used to quantify model improvement that may be missed by comparing AUCs.20 Additionally, the overall performance of a model can be assessed by using Brier scores, which may capture both the discrimination and calibration of a model.21 By combining multiple model performance metrics, we sought to create a more complete comparison of multiple POAF risk models.

In our analysis, the modified McSPI AFRisk index was superior to the CHA2DS2VASc score, Kolek clinical risk prediction model, and POAF score in all metrics, as evidenced by its higher AUC, higher R2 value, higher adjusted Brier score, higher IDI, and NRI consistent with correctly reclassifying high-risk patients. The AUCs of the modified McSPI AFRisk index and Kolek clinical risk prediction model were similar to originally published values,2,8 and AUC values for the CHA2DS2VASc and POAF scores were very similar to those in a recently published comparison of preoperative POAF risk indices.16 There is much interest in identifying a highly predictive preoperative POAF risk index, and these data indicate that adding perioperative medication information significantly improves the predictive capacity of the modified McSPI AFRisk index. The NRI and IDI23 indicate that the modified McSPI AFRisk index frequently reclassifies POAF risk compared with the CHA2DS2VASc score, Kolek clinical risk prediction model, and POAF score. Given that NRIs for POAF events were consistently higher than the NRIs for no POAF events, the modified McSPI AFRisk index improves reclassification by more often designating patients as high risk compared with other indices.

Many of the well-known risk factors for POAF, including age,6,7,8,9,10 left ventricular dysfunction,6,8,9 history of AF,2,6,8,9 valvular surgery,2,9 and COPD,2,9 are present in multiple POAF risk indices. Additional included risk factors, such as a prolonged PR interval8 or end-stage renal disease,9 likely serve as markers of vulnerable atrial substrate6 and increased comorbidity burden, respectively. The modified McSPI AFRisk index is unique in its inclusion of comprehensive medication-related data. The advent of complex electronic medical record systems can facilitate automated risk score derivation and provide timely risk stratification information to healthcare providers. While the literature has been mixed regarding the effects of perioperative angiotensin-converting enzyme (ACE) inhibitors,33 both perioperative beta blockers4 and statins34 protect against the development of POAF. Interestingly, withdrawal of preoperative beta blockers or ACE-inhibitor therapy is associated with POAF,2 a finding replicated in this cohort. It is likely that addition of these relevant medication-based factors improves the discriminative power of the modified McSPI AFRisk index.

The primary strength of this study is the prospective data collection and definition of POAF. Additionally, this represents an external validation of each risk index. Limitations of this study include the use of a relatively homogeneous cohort of primarily Caucasian patients undergoing non-emergent cardiac surgery at a large tertiary medical center. These characteristics may limit the generalizability of our results to other populations. In addition, the overall approach of this study is a retrospective cohort analysis of patients undergoing surgery over a 13-year period, during which time cardiac surgical care may have advanced. Nevertheless, perioperative care as well as data collection strategies were standardized throughout this time period. Moreover, this manuscript does not account for post-discharge POAF, which occurs in a minority of patients after cardiac surgery and is less common than in-hospital POAF.35 While this investigation does not address the negative impact of POAF on postoperative recovery profiles, this link has been well established in previous literature, with POAF associated with increased ICU and hospital LOS, higher costs, more frequent readmissions, higher perioperative and long-term mortality, and higher long-term risks of atrial dysrhythmias.5,26,27 Finally, no single performance metric provides a comprehensive assessment of model performance, and both ROC curves22 and the NRI36 have been criticized for providing incomplete measures of performance. Nevertheless, the lack of overlap between AUC confidence intervals of the modified McSPI AFRisk index vs other indices provides a compelling indicator of superiority. In addition, we combined multiple measures to strengthen our ability to assess the performance of each risk index.

In conclusion, the modified McSPI AFRisk index showed improved ability to predict POAF compared with the CHA2DS2Vasc score, POAF score, and Kolek clinical risk prediction model. This may be due to the inclusion of medication-related information, which is unique to the modified McSPI AFRisk index. In future studies, investigators are encouraged to account for perioperative management of common cardiovascular medications, including beta blockers, ACE inhibitors, statins, and potassium supplementation, as these data may better elucidate the risk of POAF. Moreover, as electronic medical records become increasingly complex, physicians could integrate the modified McSPI AFRisk index into perioperative care of cardiac surgical patients, thereby potentially reducing medication withdrawal and so decreasing the incidence of POAF. Future research is necessary to determine the generalizability of our results to larger, more heterogeneous populations. As cardiovascular care advances and the surgical population continues to include older, frailer patients, prevention of POAF may prove to be an effective strategy to reduce perioperative morbidity and costs. Paramount to prevention is risk stratification or individual risk prediction, and further work should focus on optimizing the performance of these various POAF risk-prediction scores.