To the Editor,

We read with interest the meta-analysis regarding intraoperative cerebral oximetry-based monitoring for maximizing perioperative outcomes by Zorrilla-Vaca et al.1 We note, however, some important discrepancies between the original source literature and the data used in the analysis, which raise concerns. Although the conclusion regarding the primary outcome of cognitive impairment is not affected, the data used to reach it are at times inaccurate, and the statistical significance of some secondary outcomes changes.

A primary example of such a discrepancy is in the studies used for the postoperative delirium (POD) outcome (Fig. 6). Of the six studies analyzed, three make no mention of “postoperative delirium”, “POD”, or “delirium” in the text or supplementary materials.2,3,4 Nor do they mention any instrument typically used to assess POD, such as the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU). Of the remaining studies, the event counts used for Deschamps et al. appear to be the numbers of patients receiving transfusions, not the numbers of delirium cases.5

A further data extraction concern arises in the transfusion analysis in their Fig. 5. The values used for the study by Colak et al. (18 of 94 in the near-infrared spectroscopy (NIRS) group vs 24 of 96 in the control group) are partly derived from values found in Fig. 1 of the source publication.6 Nevertheless, in Table 1 of the subsequent text, these values are noted to be percentages of patients (not absolute numbers of patients) who did not receive transfusions. The real numbers of transfused patients were therefore 77 of 94 for NIRS and 73 of 96 for control, which changes the direction of the effect.
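The reversal can be checked with a short calculation. The sketch below computes a crude, unpooled risk ratio for this single study from the two sets of counts quoted above; the function name `risk_ratio` is ours, and the figures are not the pooled estimates from either analysis.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Crude risk ratio of group A relative to group B."""
    return (events_a / total_a) / (events_b / total_b)


# Counts as entered in the meta-analysis (NIRS vs control).
rr_as_extracted = risk_ratio(18, 94, 24, 96)

# Counts after reading 18% and 24% as the share of patients NOT transfused.
rr_corrected = risk_ratio(77, 94, 73, 96)

print(f"as extracted: {rr_as_extracted:.2f}")
print(f"corrected:    {rr_corrected:.2f}")
```

A ratio below 1 favours NIRS and a ratio above 1 favours control, so the two extractions point in opposite directions for this study.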

The inclusion of the 2010 study by Cohn et al.7 represents a case where discrepancies occur in both data extraction and the application of study inclusion criteria. The data extracted for length of hospital stay were described in the source paper as hospital-free days within the first 30 postoperative days, which is a different measure. Most importantly, Cohn et al. describe thenar placement of the oximetry probes, and we would therefore question the inclusion of this study in a meta-analysis of cerebral NIRS-based management.

Following good practice, Zorrilla-Vaca et al. published their meta-analysis protocol (PROSPERO: CRD42017057293), but some deviations from the protocol were not explained in their text. For example, the protocol describes synthesis of continuous variables using mean differences, which was changed to standardized mean differences in the final text. The latter is required when combining data measured on non-comparable scales, but the combined studies all used readily relatable units. For clinician readers, a mean change in hospital stay in days or hours is much more understandable than a change expressed in units of standard deviation.
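The difference between the two effect measures can be illustrated with a minimal sketch. All input values below are hypothetical and chosen only for illustration; they are not taken from any of the included studies. The standardized mean difference is computed here as Cohen's d with a pooled standard deviation, which is one common definition and not necessarily the one used by the authors.

```python
import math


def mean_difference(mean_a, mean_b):
    """Difference in means, in the original units (e.g. days)."""
    return mean_a - mean_b


def standardized_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d: difference in means divided by the pooled SD (unitless)."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# HYPOTHETICAL hospital stay data: 7.0 (SD 2.0) days vs 8.0 (SD 2.0) days,
# 50 patients per group.
md = mean_difference(7.0, 8.0)
smd = standardized_mean_difference(7.0, 2.0, 50, 8.0, 2.0, 50)

print(f"mean difference: {md:.1f} days")
print(f"standardized mean difference: {smd:.2f} SD units")
```

With these illustrative inputs, the same result reads either as one day less in hospital or as half a standard deviation; only the first is directly interpretable at the bedside.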

The protocol describes subgroup analyses by surgery and device, but the device analysis is not reported. An analysis by intervention protocol was presented, but was not mentioned in the protocol. This is important because the NIRS device provides data, but patient outcomes will only be influenced by the providers’ response to those data. The authors classify seven of the studies as using a Denault-type algorithm8 to correct cerebral desaturation. On closer examination, we suggest that nine of the studies generally follow the principal components of the algorithm. Regarding the subgroup results, the text indicates that for the primary outcome of postoperative cognitive dysfunction (POCD), five of seven trials did not use the Denault algorithm yet yielded a significant effect of lower POCD. In fact, four studies did use the algorithm, and re-calculation suggests that the studies not using a Denault-type algorithm showed only a non-significant trend towards a lower incidence of POCD. This difference is potentially crucial, as the suggestion that outcomes are improved regardless of how one responds to the NIRS data may not be supported.

Following our observations, we have re-analyzed the data from the studies indicated, except for the Cohn paper, which should not have been included. Details are found in the Electronic Supplementary Material eAppendix accompanying this submission, including the values used in the re-analysis and details of any assumptions or transformations of the source data undertaken. Subgroup analyses were performed for surgery class, device, and intervention algorithm.

As mentioned, most conclusions regarding outcomes do not change; NIRS-based monitoring is still significantly associated with a lower risk of postoperative cognitive impairment (P = 0.010) and with shorter ICU length of stay (P < 0.001). Subgroup analyses suggest, though, that other outcomes may be influenced by interactions among surgery type, device, or intervention algorithm. For example, POCD risk reduction may become non-significant if a Denault-type algorithm is not used (three studies; risk ratio, 0.82; 95% confidence interval, 0.60 to 1.13; P = 0.23; intergroup P = 0.08).
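Readers can approximately back-calculate a P value from a reported risk ratio and its confidence interval. The sketch below assumes a normal approximation on the log scale and a symmetric 95% interval; the function name `p_from_ratio_ci` is ours. Because the inputs are rounded to two decimals, the result agrees with the reported P value only approximately.

```python
import math


def p_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Two-sided P value recovered from a ratio and its 95% CI (log scale)."""
    # Standard error of log(ratio), from the width of the log-scale interval.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Subgroup not using a Denault-type algorithm, values as quoted above.
p = p_from_ratio_ci(0.82, 0.60, 1.13)
print(f"approximate P value: {p:.2f}")
```

The recovered value lies well above 0.05, in line with the non-significant result for this subgroup.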

We emphasize that NIRS monitoring itself cannot change patient outcomes; instead, it is the response of the clinical team to the NIRS data, and the application of clinical algorithms incorporating these data, that may result in patient safety benefits. We agree with the caution expressed by Zorrilla-Vaca et al. that results are subject to interpretation: it remains up to the reader to decide whether statistically significant results in such an analysis justify the modification of clinical practice. In any event, decisions should be based on the most transparent and accurate values possible.