Acta Neurochirurgica, Volume 161, Issue 10, pp 2083–2098

Intraoperative fluorescence diagnosis in the brain: a systematic review and suggestions for future standards on reporting diagnostic accuracy and clinical utility

  • Walter Stummer
  • Raphael Koch
  • Ricardo Diez Valle
  • David W. Roberts
  • Nadar Sanai
  • Steve Kalkanis
  • Constantinos G. Hadjipanayis
  • Eric Suero Molina
Open Access
Review Article - Neurosurgical technique evaluation



Surgery for gliomas is often confounded by difficulties in distinguishing tumor from surrounding normal brain. For better discrimination, intraoperative optical imaging methods using fluorescent dyes are currently being explored. Understandably, such methods require the demonstration of a high degree of diagnostic accuracy and clinical benefit. Currently, clinical utility is determined by tissue biopsies which are correlated to optical signals, and quantified using measures such as sensitivity, specificity, positive predictive values, and negative predictive values. In addition, surgical outcomes, such as extent of resection rates and/or survival (progression-free survival (PFS) and overall survival (OS)), have been measured. These assessments, however, potentially involve multiple biases and confounders, which have to be minimized to ensure reproducibility, generalizability, and comparability of test results. Tests should aim for high internal and external validity. The objective of this article is to analyze how diagnostic accuracy and outcomes are utilized in available studies describing intraoperative imaging and, furthermore, to derive recommendations for reliable and reproducible evaluations.


A review of the literature was performed for assessing the use of measures of diagnostic accuracy and outcomes of intraoperative optical imaging methods. From these data, we derive recommendations for designing and reporting future studies.


Available literature indicates that potential confounders and biases for reporting the diagnostic accuracy and usefulness of intraoperative optical imaging methods are seldom accounted for. Furthermore, methods for bias reduction are rarely used or reported.


Detailed, transparent, and uniform reporting on diagnostic accuracy of intraoperative imaging methods is necessary. In the absence of such reporting, studies will not be comparable or reproducible. Future studies should consider some of the recommendations given here.


STARD CNS Glioma Fluorescence guidance Diagnostic accuracy 5-ALA Fluorescein 


During high-grade glioma (HGG) surgery, the infiltrative tumor margin is difficult to visualize. Inadvertent residual enhancing tumor is left behind when the surgeon relies only on differences in tissue color or texture for identifying tumor [5, 85]. For this reason, a number of surgical adjuncts or imaging technologies have been introduced during the last three decades which help the surgeon identify tumor tissue intraoperatively, such as neuronavigation, intraoperative MRI (iMRI) [9], ultrasound [37], and fluorescence guidance.

5-Aminolevulinic acid (5-ALA) is the most widely studied agent used in fluorescence-guided surgery (FGS) of HGGs and is approved in different countries around the globe [6, 31, 61, 79, 84, 85, 100]. Off-label use of fluorescein sodium for FGS has been investigated in patient cohorts [2, 3, 18, 25, 50, 62, 64, 78], in addition to indocyanine green (ICG) [105]. Targeted fluorescence markers are under preclinical development and are slowly translating into the human setting [29, 51, 91].

Effective intraoperative fluorescence imaging relies on the assumptions that highlighted tissues truly represent tumor, that non-highlighted tissue represents normal brain, and that the targeted tissue corresponds to the pathology as delineated by preoperative imaging, e.g., MRI contrast enhancement in HGGs.

Therefore, diagnostic methods or tests proposed for proving intraoperative clinical reliability require precise evaluations to ensure they truly predict the presence or absence of tumor tissue in the brain, and provide the surgeon the information to decide (in conjunction with concerns for safety), whether further tumor resection should be performed. In addition to demonstrating a correlation between the signal of the intraoperative method and histology, the FDA requires a proof of clinical benefit for approval which does not necessarily include proof of improved survival [94]. For this purpose, detailed studies are necessary prior to regulatory approval and marketing of such methods.

At present, no detailed consensus criteria for testing the diagnostic accuracy or clinical benefit of intraoperative fluorescence imaging are available. Such criteria would allow comparability and reproducibility of methods. The comparative performance of such methods would ultimately be of interest for their future development and application.

Available guidelines on diagnostic accuracy, e.g., STARD (Standards for Reporting of Diagnostic Accuracy) [11, 71], do not address the particular requirements for intraoperative testing with its many confounders and inherent clustering of data. Although the basic principles for reporting on the accuracy of diagnostic tests are reflected therein, these guidelines have been constructed for diagnostic tests that give one value per patient.

Both the Food and Drug Administration (FDA) and the European Medicines Agency (EMA) provide guidance on the evaluation of diagnostic tests. The FDA’s “Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests” explicitly does not pertain to testing multiple samples from single patients, which would typically be the case for intraoperative tissue assessments.1

EMA’s “Guideline on Clinical Evaluation of Diagnostic Agents”2 includes recommendations for testing various stains/markers, e.g., stains used intraoperatively in detection of malignant mucosal lesions, recommending test performance to be expressed both in relation to an overall individual (per patient basis) and to lesions detected and/or organs or sites involved (per lesion or per site basis). However, EMA also does not address the numerous confounders and biases that might be encountered during testing for intraoperative fluorescence in the brain (as reviewed in the following), or other organs.

This review will discuss possible pitfalls and biases involved in testing intraoperative fluorescence, will analyze the available literature on how such biases have been handled, and make suggestions on possible guidelines for intraoperative diagnostic testing.

While this review focusses on widefield fluorescence imaging, which is in broad clinical use, it explicitly pertains to other methods of intraoperative tissue diagnosis as well, e.g., Raman spectroscopy, for which ample literature is available [13, 14, 24, 45, 47, 48].

Classical evaluation of diagnostic tests

Diagnostic testing for correctly identifying disease or health prior to treatment decisions is a universal necessity in medicine. The expression “test” signifies any technique for determining whether subjects present a certain physical status or condition, e.g., whether they are afflicted with a certain disease. For evaluating the accuracy of a diagnostic test, studies are used in which test results are compared to a reference or gold standard [11]. Reference standards may be laboratory examinations, imaging, pathological data, or clinical outcomes.

Resulting test values may be binary or dichotomous (with two qualities, e.g., disease or no disease), quantitative (or continuous, e.g., PSA for detecting prostate cancer, or other laboratory values), or semi-quantitative (on an ordinal scale, e.g., test strips for detecting sugar in urine [81]).

Many diagnostic tests are based on laboratory values, which ideally, if present or exceeding a certain level, will unambiguously indicate that a patient has a disease, whereas all other patients are disease-free. Due to the inevitable variability inherent to biological systems, however, such unambiguous tests are rare. Rather, there is usually some degree of overlap between test values in diseased and non-diseased patients based on the distribution of test values in either population.

Thus, with the same value of the diagnostic test, one patient may be afflicted with a condition whereas the other is healthy. Tests are therefore assessed for their diagnostic accuracy, i.e., the amount of agreement between the test, which is being assessed, and the reference standard which unequivocally denotes disease [11, 35].

To characterize diagnostic accuracy different measures or terms have been introduced which give information on the performance of tests, derived from the frequency with which a laboratory test or test value truly or falsely indicates disease, or misses the presence of disease, as summarized in Table 1.
Table 1

Diagnostic decision matrix for diagnostic accuracy

| Test result | True disease state: disease present | True disease state: disease absent |
|---|---|---|
| Positive | True positive (TP): the test is positive and the subject suffers from the disease | False positive (FP): the test is positive whereas the subject is healthy |
| Negative | False negative (FN): the test is negative, yet the patient suffers from the disease | True negative (TN): the test is negative and the subject is healthy |

| Measure | Definition | Formula |
|---|---|---|
| Sensitivity | Probability of a positive test result when a subject has the disease | TP/(TP + FN) |
| Specificity | Probability of a test being negative if a patient does not have the disease | TN/(TN + FP) |
| Positive predictive value | Probability that the patient has the disease if the test is positive | TP/(TP + FP) |
| Negative predictive value | Probability that the patient is disease free if the test is negative | TN/(TN + FN) |

Using common measures of diagnostic accuracy for intraoperative tissue diagnosis in the brain

How can traditional methods of diagnostic testing be used for testing the accuracy of intraoperative optical tissue diagnosis in brain tumor surgery?

The most important difference between studies on diagnostic tests that test for the presence of disease in the conventional sense, and intraoperative tissue diagnosis is that traditional measures were developed with every patient generating a single measurement. Hence, individual measurements, being from individual patients, are independent. On the other hand, studies evaluating the accuracy of intraoperative tumor diagnostics will typically be based on histology, and it will not suffice to take only one tissue sample per patient. Rather, multiple samples (clusters) will be collected per patient relating the signal of the detection method and the reference standard, histology. Multiple samples from single patients will render these clustered samples interdependent, an aspect which requires special consideration when assessing the accuracy of such tests.

The argument that intraoperative optical tissue diagnosis can be assessed in as simple a fashion as a laboratory test for the presence of disease therefore requires careful scrutiny. Simply adapting traditional diagnostic accuracy measurements to biopsies during brain tumor surgery is per se flawed, since samples are all collected in the diseased subject or organ but not from healthy subjects. In addition, biopsies in the brain will never be random, especially if the brain looks normal.

Hypothetically, the entire brain should be sampled with an infinite number of samples and analyzed to determine whether the volume of tissue detected by the optical method coincides with the volume of the tumor, i.e., to determine whether the test detects the entire tumor and the result of the test is truly dichotomous (all tumor detected or not). Needless to say, this is not an option. In practice, the sample volume is restricted by craniotomy and corticotomy to the area of the gross tumor and its immediate surroundings.

In addition, investigators are strongly limited by the number of samples they can collect, especially in normal appearing brain. They will have to rely on a finite number of intraoperative biopsies for histological comparisons. To do so, investigators will take samples from non-highlighted and from tissues highlighted by their diagnostic method and then examine samples histologically. Most biopsies will not be taken from normally appearing brain but rather from irregular brain tissue for obvious ethical reasons.

Thus, in contrast to the single laboratory value for a single patient (e.g., PSA for prostate cancer), trying to establish diagnostic accuracy in the brain from intraoperative tissue samples is compromised by numerous confounders and potential biases (with “bias” being defined as the result of systematic flaws or limitations in the design or conduct of a study, which distort the results [99]).

Another aspect which requires attention in the brain is the fact that gliomas are diffusely infiltrating tumors with cell densities tapering into surrounding brain. Tissue biopsies will not only give dichotomous results (tumor or not tumor). Rather, biopsies will reveal a variable degree of infiltration. The likelihood of finding tumor cells in biopsies, i.e., the prevalence of tumor cells, will depend on the distance of the biopsy from the main tumor mass. Traditional values for diagnostic accuracy will depend on where the biopsies are being taken, resulting in possible biases (Fig. 1a and b). Other biases relate to the way individual tissue samples are dissected for analysis (Fig. 2), the timing of surgery after application of the fluorochrome (Fig. 3), the type of staining used for identifying single tumor cells, or the number of samples taken in a certain tissue region.
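The dependence of predictive values on the local prevalence of tumor cells can be illustrated numerically. The following minimal sketch applies Bayes' theorem under the simplifying assumption that sensitivity and specificity stay fixed while prevalence changes with biopsy location; in practice, sensitivity and specificity may shift with location as well. The function name `predictive_values` and all numbers (sensitivity 0.85, specificity 0.90, and the three prevalence values) are hypothetical and not taken from any study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence using Bayes' theorem."""
    tp = sensitivity * prevalence              # expected true positive fraction
    fn = (1 - sensitivity) * prevalence        # expected false negative fraction
    tn = specificity * (1 - prevalence)        # expected true negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # expected false positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical test characteristics and hypothetical tumor cell prevalences.
SENSITIVITY, SPECIFICITY = 0.85, 0.90
regions = [("tumor center", 0.95), ("tumor margin", 0.50), ("distant brain", 0.05)]

for region, prevalence in regions:
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{region:14s} prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions, the same test yields a high PPV and a low NPV in the tumor center and the reverse far from the tumor, which is why the biopsy location needs to be reported.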
Fig. 1

a Influence of tissue allocation bias type A on the NPV and specificity. Since gliomas are infiltrating tumors and the density of infiltrating cells will decrease rapidly with distance from the tumor bulk, the calculated NPV and specificity will be higher the further away from the tumor samples are collected because of the lower likelihood for falsely negative samples. b Influence of tissue allocation bias type A on PPV and sensitivity. The likelihood for finding falsely positive biopsies will depend on the location of biopsies. If samples are collected predominantly in the main tumor mass, the calculated PPV and sensitivity will be high. If samples are collected at the margins and the diagnostic method unreliably detects tumor, the PPV will be lower

Fig. 2

Tissue allocation bias type C. Intraoperative optical diagnostic information is usually two-dimensional, i.e., only giving superficial information from the exposed tissue. The biopsy, on the other hand, is three-dimensional and assessment of only a part of the biopsy might miss the pathology

Fig. 3

Timing and threshold bias pertinent for fluorochromes that are applied i.v. and either do not have any specific tumor affinity (e.g., fluorescein sodium, Diaz et al. 2015) or are expected to have selective affinity (targeted fluorochromes, e.g., APC analogs, Swanson et al. 2015). This graph illustrates the course of fluorescence in different tissue compartments. (A) After i.v. injection, concentrations will be high in blood vessels and all perfused tissues, and will slowly abate. (B) Due to extravasation through BBB disruption within malignant tumor, pseudo-selectivity will ensue; this effect will also pertain to any areas of surgically induced BBB damage, e.g., the resection margin. (C) Meanwhile, extravasated fluorophore propagates with edema into peritumoral tissue in an unspecific manner. The apparent diagnostic accuracy will strongly depend on the definition of thresholds and on time after injection. (D) For targeted fluorochromes, selective retention can be expected after clearance from edema and plasma. These curves directly affect the signal-to-noise ratio, which changes over time

An example of how biopsy numbers directly affect results of diagnostic testing is given in Table 2. Furthermore, intermixing biopsies from different patients if the numbers per patient vary, will lead to differing results depending on how these are handled (Table 3). In methods with a relevant signal-to-noise ratio that requires the creation of a threshold because of background signal, this threshold will determine the results of the test.
Table 2

How, with a given diagnostic method, differences in the number of biopsies obtained from certain regions, based on the sampling algorithm chosen by investigator A compared to investigator B, will strongly influence the results for the measures of diagnostic accuracy

Regions sampled: tumor center, tumor margin, normal tissue

Investigator A: 6 TP, 1 TP, 2 FP, 3 TN, 2 FN

Investigator B: 2 TP, 1 TP, 2 FP, 3 TN, 2 FN

In this hypothetical example, only the number of truly positive samples from the tumor center was varied, causing a relevant difference in sensitivity and positive predictive value.
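The effect described for Table 2 can be reproduced directly from the biopsy counts listed for the two investigators. The sketch below sums the counts over all regions and applies the formulas of Table 1; the function name `accuracy_measures` is an illustrative choice, not part of the original work.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Classical measures of diagnostic accuracy from a 2x2 decision matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Biopsy counts from Table 2, summed over all sampled regions.
investigators = {
    "Investigator A": accuracy_measures(tp=6 + 1, fp=2, fn=2, tn=3),
    "Investigator B": accuracy_measures(tp=2 + 1, fp=2, fn=2, tn=3),
}

for name, measures in investigators.items():
    summary = ", ".join(f"{key}={value:.2f}" for key, value in measures.items())
    print(f"{name}: {summary}")
```

Only sensitivity and PPV change between the two investigators, because the additional samples from the tumor center enter the calculation solely as true positives.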

Table 3

How pooling samples from different patients influences results

Rows: Patient A; Patient B; Average measures; Pooled biopsies

The two hypothetical assessments differ only in the number of samples taken by investigators per site with a particular method
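How pooling and per-patient averaging can diverge is easiest to see with numbers. The following sketch uses invented counts for two patients (these are not the values of Table 3): in the hypothetical patient A the method works well and many biopsies are taken, whereas in patient B it works poorly and few biopsies are taken. The names `patients` and `sensitivity` are illustrative.

```python
# Hypothetical per-patient biopsy counts, invented for illustration only.
patients = {
    "Patient A": {"tp": 9, "fn": 1},  # method works well, many samples
    "Patient B": {"tp": 1, "fn": 3},  # method works poorly, few samples
}


def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


# Approach 1: calculate sensitivity per patient, then average over patients.
per_patient = {name: sensitivity(**counts) for name, counts in patients.items()}
averaged = sum(per_patient.values()) / len(per_patient)

# Approach 2: pool all biopsies as if they were independent observations.
pooled = sensitivity(
    tp=sum(counts["tp"] for counts in patients.values()),
    fn=sum(counts["fn"] for counts in patients.values()),
)

print(per_patient)
print(f"average of per-patient sensitivities: {averaged:.3f}")
print(f"sensitivity of pooled biopsies:       {pooled:.3f}")
```

With these invented counts, the pooled estimate is dominated by the patient who contributed more samples, whereas the averaged estimate gives both patients equal weight; neither accounts for the interdependence of samples within a patient.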

Table 4 summarizes possible biases involved in assessing the accuracy and efficacy of intraoperative diagnostic methods.
Table 4

Potential biases and confounders in establishing diagnostic accuracy of intraoperative optical diagnostics

Bias type


Tissue allocation bias type A

The location from where samples are collected relative to the signal margin will directly influence the apparent accuracy of the diagnostic test that is being evaluated, i.e., the calculations of specificity and sensitivity, the negative predictive value (NPV), and the positive predictive value (PPV) (Fig. 1a and b).

This is due to the fact that during intraoperative diagnostic testing in a typical, infiltrating brain tumor the prevalence of tumor cells is high in its center, whereas at the infiltrating margin the prevalence of tumor cells is lower and decreases with distance away from the tumor bulk. If samples are taken immediately beyond the margins of the highlighted tumors, the likelihood of finding unmarked, falsely negative tumor cells (FN) will be higher than if the samples are taken at a distance using the same method. This will directly affect NPV and specificity. Conversely, if marked tissue samples are taken only at the center of the tumor, the prevalence of tumor cells will be high and the rate of false positive samples low, and the calculation of PPV or sensitivity will give high values. When samples are taken at the more critical margin using the same method, it is to be expected that the rate of false positive samples will be higher and the values for sensitivity and PPV lower.

However, in practice PPV and sensitivity will not be as susceptible to such strong effects as the NPV and specificity, since invariably the surgeon will primarily target gross tumor, as defined by neuronavigation, ultrasound, or the optical impression under conventional illumination, and (understandably) not adjacent inconspicuous brain.

This bias can be made transparent by describing exactly the position of the biopsy relative to the signal margin.

Tissue allocation bias type B

With methods that provide ambiguous signals, investigators are more likely to sample areas of the tumor that are judged to be abnormal with conventional illumination, e.g., by texture or color, than the inconspicuous margins, which might look like normal brain. Thus, the likelihood for true positive samples may be high with such methods despite the limitations of the method for detecting tumor at the margins. In other words, the distribution of the optical signal or the tissue characteristics under conventional illumination will lead the surgeon away from a truly random biopsy regime, since biopsies will most likely be taken where the signal is seen or where tumor is suspected under conventional illumination. To reduce this type of bias, investigators would have to predefine the biopsy site based on preoperative imaging alone, then approach the predefined region to determine the signal of the optical method, and finally take the biopsy, irrespective of what is seen during surgery.

Tissue allocation bias type C

Depending on the size of the sample, of which usually only the surface is visualized during surgery (due to the optical character of the methods discussed and the 2D signal returned from the tissue surface), the 3D sample might harbor different types of tissue, which in turn might confound the evaluation. This depends on how the biopsy is dissected and which fractions of the biopsy are specifically interrogated histologically (Fig. 2). Tissue allocation bias of this type has been identified as an explanation for why results of OSNA (one-step nucleic acid amplification) for detecting metastasis in sentinel lymph nodes of cancer patients were sometimes discordant with pathological investigations, since different parts of the same lymph nodes were tested by OSNA than by pathology (e.g., Kumagai et al. 2014).

This bias can be minimized by taking small samples or volumetrically interrogating larger samples.

Bias from biopsy frequency

In many studies in the brain, the number of tissue samples typically collected is rather low and the number per patient differs. These samples are then pooled for the final analysis of measures of diagnostic accuracy. These, however, depend on the number of samples taken in a certain brain region and entered into the calculation. Table 2 gives an example of how differences in the numbers of samples in different regions in a single patient will directly influence measures of diagnostic accuracy.

This bias can be minimized by using statistical methods such as generalized linear mixed models or by keeping the number of samples for patients the same.

Pooling samples from different patients

If a method fails to show a signal in one patient, there will be little sampling in the tumor core. In patients in whom the method works well, it is likely that more samples will be taken. Pooling these samples will skew the results and overestimate the diagnostic accuracy for single patients. Pooling biopsies from different patients without taking the dependencies of biopsies within a patient into account will, in consequence, lead to an underestimation of the variability and of the confidence limits. Also, calculating diagnostic measures per patient and then averaging over all patients will lead to biased results (Table 3). Such uncritical pooling also ignores the interdependence of multiple samples in single patients if not statistically accounted for. Effects of such clustering should be made transparent by exactly describing the numbers of samples per patient and region. Multivariable statistical methods accounting for clustered data and potential confounders should be applied and supported by presentation of an additional patient-based analysis.
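One simple way to respect the clustering of biopsies within patients when estimating confidence limits is to resample whole patients rather than individual biopsies. The sketch below shows such a cluster bootstrap for pooled sensitivity. It is offered as a lightweight stand-in for the mixed models recommended here, not as a replacement, and it does not adjust for confounders. The patient counts, the function names, and the parameters `n_boot`, `alpha`, and `seed` are all hypothetical.

```python
import random


def pooled_sensitivity(patients):
    """Sensitivity calculated over all biopsies of the given patients."""
    tp = sum(p["tp"] for p in patients)
    fn = sum(p["fn"] for p in patients)
    return tp / (tp + fn)


def cluster_bootstrap_ci(patients, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval obtained by resampling whole patients."""
    rng = random.Random(seed)
    estimates = sorted(
        pooled_sensitivity([rng.choice(patients) for _ in patients])
        for _ in range(n_boot)
    )
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical counts of tumor-positive biopsies per patient (tp + fn > 0 for all).
patients = [
    {"tp": 9, "fn": 1}, {"tp": 1, "fn": 3}, {"tp": 5, "fn": 1},
    {"tp": 2, "fn": 2}, {"tp": 6, "fn": 0}, {"tp": 3, "fn": 3},
]

estimate = pooled_sensitivity(patients)
lower, upper = cluster_bootstrap_ci(patients)
print(f"pooled sensitivity {estimate:.2f}, cluster bootstrap 95% CI {lower:.2f} to {upper:.2f}")
```

Because each bootstrap replicate keeps all biopsies of a drawn patient together, between-patient variability enters the interval, which tends to make it wider than an interval that treats every biopsy as independent.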

Threshold bias in methods with significant signal-to-noise ratios

For optical methods which do not provide binary or dichotomous information (i.e., signal vs. no signal) but rather provide optical information with a wide range of values (i.e., continuous), including low level signals from normal tissue, that is, a background signal resulting in noise, the apparent discrimination between diseased tissue and normal tissue will depend on the threshold which is selected. A high threshold will decrease the likelihood of false positive samples and thus will increase PPV and specificity. A low threshold will reduce the likelihood of false negatives and will therefore increase NPV and sensitivity. For instance, spectrographic methods will return data on a continuous scale and will be subject to this relationship, as demonstrated for 5-ALA derived porphyrin fluorescence (Valdes et al. 2011, Stummer et al. 2014). Fluorochromes, such as fluorescein sodium, which are injected i.v. and present in the plasma, will lead to a background level within normal and peritumoral tissues (Fig. 3) in a manner non-specific for tumor.

Such thresholds need to be exactly defined to minimize bias and ensure reproducibility. ROC analyses should be considered with methods that return data on a continuous scale, evaluating various thresholds.
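The trade-off between thresholds can be made explicit by sweeping the threshold over a continuous signal, which is the basis of an ROC analysis. The sketch below uses invented fluorescence intensities in arbitrary units, with histology as the assumed reference standard; the data and the function names `sensitivity_specificity` and `roc_auc` are hypothetical. The area under the ROC curve is calculated through its equivalence with the Mann-Whitney statistic.

```python
# Hypothetical signal intensities (arbitrary units), grouped by histology.
tumor_signals = [0.90, 0.80, 0.75, 0.60, 0.40, 0.30]
normal_signals = [0.50, 0.35, 0.20, 0.15, 0.10, 0.05]


def sensitivity_specificity(threshold, tumor, normal):
    """Call a sample positive if its signal is at or above the threshold."""
    true_positives = sum(value >= threshold for value in tumor)
    true_negatives = sum(value < threshold for value in normal)
    return true_positives / len(tumor), true_negatives / len(normal)


def roc_auc(tumor, normal):
    """Probability that a tumor sample has a higher signal than a normal sample."""
    wins = sum((t > n) + 0.5 * (t == n) for t in tumor for n in normal)
    return wins / (len(tumor) * len(normal))


for threshold in (0.1, 0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(threshold, tumor_signals, normal_signals)
    print(f"threshold={threshold:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")

print(f"AUC={roc_auc(tumor_signals, normal_signals):.3f}")
```

Reporting the full set of threshold-dependent values, rather than a single operating point, allows readers to compare methods that were evaluated at different thresholds.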

Timing bias

Many intraoperative imaging methods which rely on dyes reveal time-dependent staining of tumor tissue and of surrounding tissue, with a signal-to-noise ratio that varies over time (e.g., ALA, Stummer et al. 1998, Neira et al. 2016, Schwake et al. 2014) (Fig. 3).

Thus, results of testing for diagnostic accuracy will vary over time after administration and need to be recorded.

Bias from methods for histological assessment

Histological assessments are clearly an important standard of truth (reference standard) for intraoperative optical testing. However, it is difficult even for the experienced neuropathologist to identify individual tumor cells based on conventional stains (e.g., H&E) only. Immunohistochemical approaches might serve to increase the likelihood of detecting tumor cells in samples, e.g., Ki67 staining, p53 or IDH1 staining. Results regarding sensitivity and specificity will vary depending on the sensitivity of neuropathological assessments and the detection of tumor cells in the peritumoral region.

Thus, the histological methods need to be reported in detail.

Test result reproducibility

Multiple technical and human factors will influence the reproducibility of an intraoperative imaging test.

Technical equipment may be sensitive or insensitive in generating, detecting, and conveying the optical signal to the surgeon, and signals may vary over time due to influence by multiple factors. For example, the distance of the microscope from the illuminated cavity will determine the intensity of light reaching the cavity, which in turn will be linearly related to fluorescence intensity and may influence detection sensitivity. Typically, xenon light sources will have fluctuations and light intensity can deteriorate over time, thereby also influencing the strength of the signal. Lasers will fluctuate and will require calibration. In fluorescence, detection filters are sometimes configured to allow background light to pass. Depending on the intensity of the background signal, the test signal might be less easily detected due to background transmission of excitation light. Such effects will reduce contrast. An example for this is the Yellow 560 Zeiss, which allows a strong background signal to pass, thereby reducing the sensitivity for signal visualization. Indocyanine green (ICG) as a near-infrared (NIR) fluorochrome is invisible to the human eye and requires image processing to account for pulsation artifacts or large fluctuations in signal intensity after administration. Ambient light will interfere with tissue fluorescence in 5-ALA-induced fluorescence-guided resection. Photobleaching might play a role with all fluorochromes [86].

Also, interobserver variation will have an impact on the reproducibility when assessments are qualitative and dependent on personal judgment. An extreme example for this would be the difficulty of colorblind surgeons in differentiating red porphyrin fluorescence [67].

Some studies use technical methods for detecting specific signals from tumors, such as multiple channel spectroscopical fluorescence and/or reflectance, and generate algorithms to identify tumors based on these multiple tissue characteristics. For such methodologies, derived from training sets, with processing of multiple characteristics to give a final algorithm for tissue identification, a validation is required, e.g., cross-validation or independent test cohort. The validation is crucial to guarantee the applicability of the algorithms to data sets that differ from the particular data set used to generate the algorithm [15, 23].
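When such algorithms are cross-validated on biopsy data, one consideration that follows from the clustering discussed earlier is that all biopsies from one patient should fall either into the training set or into the test set, never both. The following minimal sketch splits sample indices by patient; the function name `grouped_k_fold` and the patient identifiers are hypothetical, and the patient-level split is an assumption of this sketch rather than a procedure described in the cited studies.

```python
from collections import defaultdict


def grouped_k_fold(patient_ids, k):
    """Yield (train_indices, test_indices) with no patient shared between them."""
    indices_by_patient = defaultdict(list)
    for index, patient in enumerate(patient_ids):
        indices_by_patient[patient].append(index)

    patients = sorted(indices_by_patient)
    for fold in range(k):
        # Every k-th patient, starting at position `fold`, forms the test set.
        test_patients = set(patients[fold::k])
        train, test = [], []
        for patient in patients:
            target = test if patient in test_patients else train
            target.extend(indices_by_patient[patient])
        yield train, test


# Hypothetical data set: one entry per biopsy, labeled with its patient.
biopsy_patients = ["P1", "P1", "P1", "P2", "P2", "P3", "P3", "P3", "P3", "P4"]

for train, test in grouped_k_fold(biopsy_patients, k=2):
    print("train:", train, "test:", test)
```

Splitting at the level of biopsies instead would allow correlated samples from the same patient to appear on both sides of the split, which can make the algorithm appear more accurate than it would be in new patients.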

Alternate reference standards

It is evident that histology is an important reference standard, or standard of truth. On the other hand, histology, even when a number of biopsies are obtained, will not give information about the entire tumor or the entire brain. Thus, an alternate outcome might be the completeness of tumor resection based on the intraoperative optical imaging method, as assessed by postoperative imaging, e.g., in how many cases was “complete” resection of the contrast-enhancing portion of tumor possible? In infiltrating lesions such as gliomas, it is necessary to define what should be considered as resection target on MRI. Traditionally, resection of enhancing tumor is considered as the target in high-grade glioma surgery [75, 86], whereas in low-grade gliomas, it is currently the FLAIR-weighted abnormality [57, 82]. However, tumor resection rates do not only depend on intraoperative optical methods for identifying residual tumors. The extent of resection will also be strongly influenced by patient selection (small, non-eloquent tumors vs. larger, eloquent tumors), the availability of intraoperative mapping/monitoring for safely performing maximal tumor resections, or the experience of the surgeon. Since these factors will differ from center to center and from surgeon to surgeon, single arm, monocentric studies will be confounded due to bias in patient selection, available resection technologies, and the surgeon. Thus, using the completeness of resection as an endpoint for evaluating intraoperative diagnostic methods will require randomized trials or prospective cohort studies, where propensity score matching or multivariable statistical methods should be applied in the analysis.

A similar argument applies to outcome, i.e., survival, progression-free survival, and neurological safety. Survival has been used outside of randomized studies to indicate the benefit of a method [2].

Survival as outcome will not be directly related to the diagnostic method but rather to extent of tumor resection. Completeness of tumor resection will be under some influence of useful intraoperative optical methods, but not exclusively so, since the surgeon, who is aware of brain contrasted by a particular method may not resect tumor due to safety concerns. In the 5-ALA randomized, controlled trial in both study arms surgeons decided not to take residual visible tumor in 30% of cases due to concerns for neurological function [86]. The same limitations apply as stated for postoperative imaging. Outcomes could only be interpreted confidently when studied in prognostically balanced cohorts, which can only be achieved by randomization. However, the effects of the diagnostic method on outcome will be small since many other factors influence survival and resection rates. The outcome advantage would not be conferred by the use of the diagnostic method but “merely” by increasing the rates of more “complete” resections. Complete resections would also be observed in the control arm, and not all patients in the arm with the new diagnostic method would have complete resections for functional reasons. Thus, any effects of the intraoperative diagnostic method on outcome, for example time to progression or overall survival, would be invariably diluted and difficult to detect. The numbers needed to achieve statistical power for adequately detecting improvements of survival would therefore be very high.
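The consequence of a diluted effect for study size can be approximated with Schoenfeld's formula for the number of events needed in a two-arm survival comparison with 1:1 allocation. The sketch below is an illustration under that approximation, not a calculation from the original work; the function name `required_events` and the hazard ratios are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation of events needed (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


# Hypothetical hazard ratios: the closer to 1, the more diluted the effect.
for hazard_ratio in (0.6, 0.7, 0.8, 0.9):
    print(f"hazard ratio {hazard_ratio:.1f}: about {required_events(hazard_ratio)} events")
```

As the hazard ratio approaches 1, the number of required events rises steeply, which illustrates why small, diluted survival effects are difficult to detect.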

The need for a guideline for intraoperative tissue diagnosis

Intraoperative tissue diagnosis is an expanding field. Reviews are being compiled, many of which are citing and pooling accuracy data from various publications, the accuracy data being based on classical definitions of diagnostic accuracy (e.g., sensitivity and specificity) and sometimes outcome (extent of resection and overall survival) without further consideration on how these data were determined in the original studies. Closer scrutiny reveals that rarely are possible confounders and biases accounted for or the methodology transparent enough in the original papers to allow generalization or comparison, i.e., ensuring internal and external validity.

For further elucidation, we reviewed all papers evaluating the use of fluorescence in brain tumor surgery to determine how possible biases, as summarized in Table 4, were accounted for, abiding by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [58]. The MEDLINE/PubMed and Embase databases were interrogated for articles in English published before October 2018 with the following syntax for titles and abstracts using EndNote X7 software (Thomson Reuters, Carlsbad, CA, USA): “glioma” or “gliomas” AND “fluorescence”, “fluorescence-guidance”, “fluorescence guided”, “fluorescence-guided”, “fluorophore”, “fluorochrome”, “ALA”, “5-ALA”, “5-Aminolevulinic acid”, “PPIX”, “fluorescein”, “ICG”, “indocyanin green”, “image-guided”, or “image-guidance”. The initial search delivered 2425 articles. After removing duplicates (n = 1221) and non-English articles (n = 63), all available abstracts were screened for relevance. Only articles describing the clinical use of fluorescence for fluorescence-guided resections of brain tumors were selected and reviewed. A cross-reference check of the citations of each relevant literature review was performed to ensure that no relevant studies were missed by the computerized database search. A total of 62 studies were marked as relevant for this evaluation [1, 3, 4, 7, 8, 10, 12, 16, 17, 18, 20, 21, 22, 26, 27, 28, 32, 33, 34, 36, 38, 39, 40, 41, 42, 43, 44, 49, 50, 52, 53, 56, 59, 60, 61, 62, 63, 66, 67, 68, 70, 72, 73, 74, 77, 80, 85, 88, 89, 90, 92, 93, 95, 96, 97, 98, 100, 102, 103, 105, 106] (PRISMA flow diagram: see Fig. 4).
Fig. 4

PRISMA flow diagram
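To make the search strategy easier to reproduce, the term lists and the first steps of the record flow can be written down explicitly. The Python sketch below is only an illustration: the Boolean grouping (disease terms joined by OR, then AND with the method terms joined by OR) is an assumed reading of the syntax described above, and the function and variable names are hypothetical. The counts are those given in the text.

```python
# Search terms as listed in the text (spelling kept exactly as searched).
DISEASE_TERMS = ["glioma", "gliomas"]
METHOD_TERMS = [
    "fluorescence", "fluorescence-guidance", "fluorescence guided",
    "fluorescence-guided", "fluorophore", "fluorochrome", "ALA", "5-ALA",
    "5-Aminolevulinic acid", "PPIX", "fluorescein", "ICG",
    "indocyanin green", "image-guided", "image-guidance",
]


def build_query(disease_terms, method_terms):
    """Join terms into one Boolean string: (d1 OR d2) AND (m1 OR m2 ...).
    The grouping is an assumption, not the authors' documented syntax."""
    disease = " OR ".join(f'"{term}"' for term in disease_terms)
    method = " OR ".join(f'"{term}"' for term in method_terms)
    return f"({disease}) AND ({method})"


query = build_query(DISEASE_TERMS, METHOD_TERMS)

# Record flow as reported in the text.
records_identified = 2425
duplicates_removed = 1221
non_english_removed = 63
records_remaining = records_identified - duplicates_removed - non_english_removed

print(query)
print(f"records remaining after de-duplication and language filter: {records_remaining}")
```

The remaining-record count is simple arithmetic on the reported numbers; how many abstracts were actually available for screening is not stated beyond "all available abstracts".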

Data extraction

Two authors (ESM and WS) independently extracted the following characteristics from the included studies: detection method used, study type (retrospective, prospective, randomized), tumor types evaluated, outcome measures (measures of diagnostic accuracy, qualitative or quantitative outcome measures), numbers of patients, numbers of biopsies, prespecified biopsy algorithm, and whether the following sources of bias were accounted for: tissue allocation biases A, B, or C, pooling of dependent and independent samples, timing, threshold (signal-to-noise ratios), and types of stains used (Table 4).

Results of literature review

Regarding the various confounders and biases, we were able to determine the following:
  • Tissue allocation bias type A: Only 9 of 31 studies investigating diagnostic accuracy [27, 61, 65, 74, 77, 90, 100, 101] describe biopsy locations based on the intraoperative signal margins in a reproducible way.

  • Tissue allocation bias type B: Only 10 of 31 studies [7, 26, 32, 62, 66, 72, 74, 77, 100, 101] correlate biopsy location with tissue regions on preoperative imaging using neuronavigation.

  • Tissue allocation bias type C: No study reports or accounts for biopsy size.

  • Only two studies accommodate multiple samples per patient by using mixed models with random effects for the individual patient [74, 96] and only one study offers a patient-based and biopsy-based analysis, taking care to collect the same number of biopsies in a sufficient number per patient [87].

  • Only four studies have predefined statistical analysis and sample size calculation plans [3, 85, 87, 88].

  • All studies have “blinded” pathology, but only 10 go beyond simple H&E staining for determining the presence of tumor cells in infiltrating tumor [1, 3, 7, 27, 38, 42, 43, 66, 77, 88], if any information is available at all.

  • Only four studies use objective methods, such as spectrography, for validating the visual (subjective) optical signal in studies with visible fluorophores [62, 90, 97] (Valdés et al. 2011: spectrography; Stummer et al. 2014: spectrography; Neira et al. 2016: video pixel intensities).

  • Predefined sample collection algorithms are described in several studies [90] (e.g., Stummer et al. 2014); however, most of these are not described in enough detail for independent investigators to reproduce them when repeating the study.

  • The numbers of biopsies per patient are surprisingly small in studies featuring correlations between biopsies and signal, which confounds the meaningful calculation of sensitivity or specificity of a diagnostic test (Table 5). Mean values range from 0.83 to 22 (median 4) biopsies per patient.

  • Two studies do not use reproducible reference standards [36, 78]. The comparator is given as “helpful” or “not helpful.”

  • Although administered fluorophores will have a strong time-dependent signal, only one study relates the time point of biopsy collection to the time point of fluorophore administration [62].

  • Only one randomized study compares conventional surgery to surgery with the diagnostic method [85].
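The problem of pooling dependent samples can be made tangible with a small example. The Python sketch below uses entirely hypothetical biopsy data (patient labels, signals, and histology results are invented for illustration) and contrasts a biopsy-level sensitivity, in which all samples are pooled, with a patient-level summary. It is meant only to show why an unequal number of biopsies per patient can distort pooled estimates; it does not replace the mixed models mentioned above.

```python
from collections import defaultdict

# Hypothetical biopsies: (patient_id, fluorescence_positive, tumor_on_histology)
biopsies = [
    ("P1", True, True), ("P1", True, True), ("P1", True, True),
    ("P1", True, True), ("P1", True, True), ("P1", True, True),
    ("P2", False, True), ("P2", True, True),
    ("P3", False, True), ("P3", False, False),
]


def pooled_sensitivity(rows):
    """Biopsy-level sensitivity: every tumor-positive biopsy counts equally,
    so patients with many biopsies dominate the estimate."""
    tumor_signals = [signal for _, signal, is_tumor in rows if is_tumor]
    return sum(tumor_signals) / len(tumor_signals)


def per_patient_sensitivity(rows):
    """Sensitivity computed separately within each patient."""
    by_patient = defaultdict(list)
    for patient_id, signal, is_tumor in rows:
        if is_tumor:
            by_patient[patient_id].append(signal)
    return {pid: sum(s) / len(s) for pid, s in by_patient.items()}


pooled = pooled_sensitivity(biopsies)
per_patient = per_patient_sensitivity(biopsies)
patient_mean = sum(per_patient.values()) / len(per_patient)
biopsies_per_patient = len(biopsies) / len({pid for pid, _, _ in biopsies})

print(f"pooled (biopsy-level) sensitivity: {pooled:.2f}")
print(f"mean of per-patient sensitivities: {patient_mean:.2f}")
print(f"biopsies per patient (n/N): {biopsies_per_patient:.2f}")
```

In this invented data set the single patient with six concordant biopsies pulls the pooled estimate well above the patient-level average, which is the kind of distortion that patient-based analyses or random-effects models are intended to address.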

Table 5

Frequency of patients and biopsies in studies summarized in Table 2 (for studies with biopsies)

[Table body not recoverable; the only surviving row label is “Standard deviation”.]

N number of patients in study, n number of biopsies per study, n/N number of biopsies per patient per study

Together, most of these studies provide only minimal information necessary for reproducing results and enabling comparability or generalizability. For further illustration, we constructed a flow chart demonstrating the design of a protocol that addresses many of the biases and confounders involved in intraoperative assessments (Fig. 5).
Fig. 5

Hypothetical examples of validation algorithms of a new microscope for visualizing fluorescence in a diffusely infiltrating tumor compared to an established method. The questions to be answered are: does the new method have a similar or better diagnostic accuracy, does the new method detect the same low or lower density of infiltrating cells (biological assessment, left part of the diagram), does the new method disclose the same visual margins of fluorescence (visual assessment, right). IHC immunohistochemistry, EvG Elastica van Gieson, IDH isocitrate dehydrogenase, GFAP glial fibrillary acidic protein, MGMT O6-methylguanine DNA methyltransferase

Due to similar concerns regarding studies on the accuracy of classical tests and their “mediocre” quality [11, 71], the STARD (Standards for Reporting of Diagnostic Accuracy) initiative was born in September 2000. Reporting guidelines based on this initiative were subsequently published in several journals as open access (e.g., Bossuyt et al. [11]). It was felt that past publications evaluating diagnostic accuracy often lacked information on important aspects of the design, conduct, and analysis of such studies. It was (understandably) argued that “flaws in study design can lead to biased results” [55], citing a report [55] that found diagnostic studies with specific design features to be associated with biased, optimistic estimates of diagnostic accuracy compared to studies without such deficiencies. The aim of the STARD initiative was “to improve the quality of reporting of studies of diagnostic accuracy” with complete and accurate reporting, allowing “the reader to detect the potential for bias in the study (internal validity) and to assess the generalizability and applicability of the results (external validity)” [11].

The guidance summarized in the STARD guidelines (Electronic Supplementary Material Part 2) is pertinent and should be observed when reporting the evaluation of intraoperative diagnostic tests. However, the STARD guidelines do not address the specific requirements of intraoperative diagnostic imaging in the brain or in other organs, as they were designed for diagnostic tests where one subject gives one test value which is compared to a reference standard, in order to detect a condition of interest in that subject.

More recently, the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) guidelines were devised [19] with a similar intention of improving the reporting of diagnostic models (but also of prognostic models). The TRIPOD guidelines could be pertinent in the present context, for instance if multivariable modeling of PPV and NPV were performed, paying attention to variables influencing these measures, such as tumor size or location of biopsies. However, due to their more general nature, these guidelines are not entirely sufficient for providing guidance in the detailed context of intraoperative imaging in neurosurgery.

Thus, for the novel and expanding field of intraoperative optical diagnosis, there is an evident need for a guideline for designing and reporting diagnostic accuracy studies.

For this purpose, we suggest expansions of the original STARD guidelines (which are available for download), as summarized in Table 6, as well as several considerations and recommendations regarding statistics, which are added in Electronic Supplementary Material 2.
Table 6


1. Introduction

As with the STARD initiative [11], it is the aim of this guideline to help investigators improve the design and the reporting quality of diagnostic accuracy studies. STARD-CNS expands the original STARD guidelines [11] to encompass the area of intraoperative optical diagnostics with special reference to the brain and gives advice for the design of respective studies, which considers the plethora of pitfalls and biases involved in such studies. The authors feel that adherence to these recommendations will reduce the potential for inadvertent bias and promote comparability, reproducibility, and generalizability of results obtained for various intraoperative methods of optical imaging.

These recommendations pertain not only to fluorescence methods but to any methods that relate tissues identified intraoperatively to imaging and/or histology, e.g., other forms of non-optical tumor identification such as navigation per se, intraoperative MRI, and ultrasound, but also to targeted fluorochromes or narrow-field methods such as OCT, Raman, confocal imaging, and others. Also, these suggestions may not only be pertinent for gliomas but might be extended to other tumor entities in the brain as well (e.g., metastases, meningiomas, adenomas) for which intraoperative detection methods are being developed or employed. Furthermore, methods of intraoperative tissue detection are also being explored for the surgery of tumors outside the CNS, where similar considerations regarding the evaluation of such methods are justified, e.g., for mapping of sentinel lymph nodes or identification of solid tumors by near-infrared fluorescence (as reviewed in Schaafsma et al. [76]).

2. Recommendations pertaining to the design of a study

• Consider a protocol with intraoperative neuronavigation and postoperative imaging for assessing the extent of the detection signal and how this relates to MRI morphology.

• Consider addressing a particular tissue area first based on navigation, which relates this area to imaging data, then assessing the detection signal and finally collecting a biopsy.

• In protocols containing neuronavigation for correlating tissue signal to imaging, methods should be described that compensate for the influence of brain shift.

• Consider histological assessment of the complete biopsy (the smallest unit of resection) and not only of a part of the biopsy.

• Consider expanding simple H&E histology by immunohistochemistry for better detection of infiltrating tumor cells.

• Consider focusing on the PPV in conjunction with the NPV (giving an exact description of where samples were taken in relation to neuronavigation MR imaging). PPV is the only accuracy measure that does not require sampling from “normal” brain.

• Consider using objective methods (e.g., spectrography) to validate subjective optical impressions.

• Consider additional reference standards, i.e., extent of resection and outcome (safety, survival), apart from biopsies.

• Define statistical methods for confirmatory endpoints ex ante. Involve a statistician in the planning stage (see Electronic Supplementary Material for recommendations pertaining to statistical analysis and handling of dependent and clustered samples).

• If an equivalent and sufficient number of biopsies per patient cannot be collected, consider appropriate statistical methods to adjust for varying numbers of biopsies (see below).

• Consider randomization to analyze the usefulness of the method for improving resection rates on MRI and outcome to achieve independence from non-therapeutic factors, such as resectability, age etc.

• In studies using a method that identifies tumor by an algorithm based on specific tissue characteristics, such as optical properties (reflection, fluorescence), and that processes multiple inputs to arrive at a final algorithm for tissue identification, a validation cohort is required to rule out that the algorithm is valid only for the particular data set used to generate it (e.g., Butte et al. [15]).
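The last recommendation can be illustrated with a sketch of how biopsies might be separated into a development set and a held-out set. Note the assumptions: the data structure and the function name `split_by_patient` are hypothetical, and a random hold-out split is a weaker stand-in for the truly independent validation cohort recommended above. The point shown is only that the split should be made by patient, so that biopsies from the same patient never appear on both sides.

```python
import random


def split_by_patient(biopsies, validation_fraction=0.3, seed=0):
    """Split biopsies into development and validation sets by patient,
    so that no patient contributes biopsies to both sets."""
    patients = sorted({b["patient_id"] for b in biopsies})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patients)
    n_validation = max(1, round(len(patients) * validation_fraction))
    validation_ids = set(patients[:n_validation])
    development = [b for b in biopsies if b["patient_id"] not in validation_ids]
    validation = [b for b in biopsies if b["patient_id"] in validation_ids]
    return development, validation


# Hypothetical data set: 10 patients with 3 biopsies each.
all_biopsies = [
    {"patient_id": f"P{p:02d}", "biopsy_no": k}
    for p in range(1, 11)
    for k in range(1, 4)
]

development_set, validation_set = split_by_patient(all_biopsies)
print(len(development_set), "development biopsies,",
      len(validation_set), "validation biopsies")
```

Splitting at the biopsy level instead would let correlated samples from one patient inform both algorithm generation and its evaluation, which tends to make performance look better than it is.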


3. Checklist for reporting, expanding the STARD Checklist [11]:

Bias reduction:

• What methods were used to reduce rater bias, e.g., blinded assessments by pathologists or radiologists?

• Were optical signals validated by objective detection technology, i.e., spectrography?

• Did multiple raters address the optical signal independently?

Tissue sampling algorithms:

• Describe exactly where tissue samples are taken and give methods for documenting the location of biopsies (e.g., neuronavigation), including the size of biopsies.

• Were the location and the number of biopsies taken per patient documented?

• With time sensitive methods of detection (e.g., fluorochromes injected i.v.): Are the time points at which biopsies were taken described?

• It is recommended that the same number of samples be taken from similarly defined locations in individual patients. How was this handled?

Signal detection:

• If the methodology employs thresholding, were the thresholds and the rationale for the thresholds exactly described? How was the background signal handled? Was ROC analysis employed for continuous data? Were values transformed?

• If the methodology requires image processing, the exact procedure and settings need to be described in a reproducible way.

• How was the technical equipment tested and maintained?

• What factors confound signal detection and how are these handled?

• Was intraobserver variability accounted for?

• If algorithms for tissue detection are constructed using multiple inputs, was an independent cohort for cross-validation included?
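The questions on thresholding and ROC analysis can be illustrated as follows. This Python sketch uses invented signal intensities and histology labels, and it picks the threshold that maximizes Youden's index (sensitivity + specificity - 1), which is one common criterion but only an assumption here, not a method prescribed by this guideline. It also ignores the clustering of biopsies within patients.

```python
def roc_points(values, labels):
    """Return (threshold, sensitivity, specificity) for every observed value
    used as a threshold; a sample is called positive if value >= threshold."""
    n_positive = sum(labels)
    n_negative = len(labels) - n_positive
    points = []
    for threshold in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y and v >= threshold)
        true_neg = sum(1 for v, y in zip(values, labels) if not y and v < threshold)
        points.append((threshold, true_pos / n_positive, true_neg / n_negative))
    return points


def youden_threshold(values, labels):
    """Pick the threshold with the largest sensitivity + specificity - 1."""
    return max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1)


# Hypothetical continuous signal (arbitrary units) and histology (1 = tumor).
intensities = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
histology = [0, 0, 0, 1, 1, 0, 1, 1]

best_threshold, best_sens, best_spec = youden_threshold(intensities, histology)
print(f"threshold {best_threshold}: sensitivity {best_sens:.2f}, "
      f"specificity {best_spec:.2f}")
```

A threshold chosen this way is tuned to the data it was derived from, so reporting it together with its rationale and checking it in an independent cohort, as asked above, remains necessary.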

Reference standard:

• Describe which types of histological assessment are implemented, e.g., was immunohistochemistry used for identifying tumor cells in low density that infiltrate the brain? Which markers were assessed, e.g., Ki-67/MIB-1 staining, EGFR, GFAP, IDH1, p53, others?

• If other reference standards are used (post-OP imaging, outcome, other optical imaging methods), are these exactly described?

• What methods are used to ensure transparency in non-histological reference standards to allow comparability?

Statistical considerations:

• Was a statistician involved in the planning stage? Was a sample size calculation performed? What are the planned settings (type I error, power, assumed effects)?

• What are the primary endpoints and statistical hypotheses?

• What is the statistical design?

• Were multiple testing procedures used for type I error control?

• Were different diagnostic tools compared? Which statistical method was used?

• Describe exactly how dependent data (biopsy within patient) and independent data (per patient) were handled.

• Are statistical methods applied to account for the clustered data structure and differences in the number of biopsies per patient (e.g., generalized linear mixed models)?

• Describe the applied statistical methods exactly and reproducibly.

• Describe how missing data were handled.

• Report estimates of diagnostic accuracy and measures of statistical uncertainty (e.g., sensitivity, specificity, PPV, NPV, and corresponding 95% confidence intervals). Are CI adjusted for clustered data structure?

• If possible, use dichotomous outcomes for pathology and dichotomous or continuous measures for the diagnostic tool.
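One way to obtain confidence intervals that respect the clustered data structure is sketched below. Note that this is a swapped-in technique: the checklist names generalized linear mixed models as an example, whereas the sketch uses a cluster (patient-level) percentile bootstrap because it can be shown with the standard library alone. All data, names, and settings are hypothetical.

```python
import random


def sensitivity(rows):
    """Proportion of tumor-positive biopsies that showed a positive signal."""
    tumor_signals = [signal for _, signal, is_tumor in rows if is_tumor]
    return sum(tumor_signals) / len(tumor_signals)


def cluster_bootstrap_ci(rows, stat, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI that resamples whole patients, not single
    biopsies, so within-patient correlation is carried into each resample."""
    by_patient = {}
    for row in rows:
        by_patient.setdefault(row[0], []).append(row)
    patients = list(by_patient)
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = []
        for patient_id in rng.choices(patients, k=len(patients)):
            resample.extend(by_patient[patient_id])
        try:
            estimates.append(stat(resample))
        except ZeroDivisionError:
            continue  # resample contained no tumor-positive biopsy
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper


# Hypothetical biopsies: (patient_id, fluorescence_positive, tumor_on_histology)
rows = [
    ("P1", True, True), ("P1", True, True), ("P1", True, True), ("P1", False, False),
    ("P2", True, True), ("P2", False, True), ("P2", False, False),
    ("P3", True, True), ("P3", True, True), ("P3", True, False),
    ("P4", False, True), ("P4", True, True), ("P4", False, False),
    ("P5", True, True), ("P5", True, True), ("P5", False, True),
    ("P6", True, True), ("P6", False, False),
]

point_estimate = sensitivity(rows)
ci_lower, ci_upper = cluster_bootstrap_ci(rows, sensitivity)
print(f"sensitivity {point_estimate:.2f} "
      f"(95% cluster bootstrap CI {ci_lower:.2f} to {ci_upper:.2f})")
```

With only a handful of patients, as in this toy data set, such an interval is itself unstable; the sketch is meant to show the resampling unit, not to suggest that small studies can be rescued this way.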


In conclusion, the biases and confounders involved in reliable and reproducible testing of the diagnostic accuracy of intraoperative imaging methods are many. In this rapidly expanding field, a consensus on reporting standards is becoming necessary. If investigators do not adhere to such or similar standards, different methods or different studies using the same visualization method simply cannot be compared.

The authors propose a guideline to this end, as suggested and elucidated in Table 6 (references cited in Electronic Supplementary Material [30, 46, 54, 69, 83, 104, 107]).



Compliance with ethical standards

Conflict of interest

Walter Stummer received speaker and consultant fees from Medac, Zeiss, Leica, Photonamic, and NXDC. Constantinos Hadjipanayis is a consultant for NXDC and Synaptive Medical Inc. He will receive royalties from NXDC. He has also received speaker fees by Carl Zeiss and Leica.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee (name of institute/committee) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. For this type of expert review, formal consent is not required.

Supplementary material

701_2019_4007_MOESM1_ESM.docx (128 kb)
ESM 1 (DOCX 127 kb)
701_2019_4007_MOESM2_ESM.docx (21 kb)
ESM 2 (DOCX 20 kb)


  1. 1.
    Acerbi F, Broggi M, Broggi G, Ferroli P (2015) What is the best timing for fluorescein injection during surgical removal of high-grade gliomas? Acta Neurochir 157:1377–1378. Google Scholar
  2. 2.
    Acerbi F, Broggi M, Eoli M, Anghileri E, Cavallo C, Boffano C, Cordella R, Cuppini L, Pollo B, Schiariti M, Visintini S, Orsi C, La Corte E, Broggi G, Ferroli P (2014) Is fluorescein-guided technique able to help in resection of high-grade gliomas? Neurosurg Focus 36:E5. Google Scholar
  3. 3.
    Acerbi F, Broggi M, Eoli M, Anghileri E, Cuppini L, Pollo B, Schiariti M, Visintini S, Orsi C, Franzini A, Broggi G, Ferroli P (2013) Fluorescein-guided surgery for grade IV gliomas with a dedicated filter on the surgical microscope: preliminary results in 12 cases. Acta Neurochir 155:1277–1286. Google Scholar
  4. 4.
    Acerbi F, Broggi M, Schebesch KM, Hohne J, Cavallo C, De Laurentis C, Eoli M, Anghileri E, Servida M, Boffano C, Pollo B, Schiariti M, Visintini S, Montomoli C, Bosio L, La Corte E, Broggi G, Brawanski A, Ferroli P (2018) Fluorescein-guided surgery for resection of high-grade gliomas: a multicentric prospective phase II study (FLUOGLIO). Clin Cancer Res 24:52–61. Google Scholar
  5. 5.
    Albert FK, Forsting M, Sartor K, Adams HP, Kunze S (1994) Early postoperative magnetic resonance imaging after resection of malignant glioma: objective evaluation of residual tumor and its influence on regrowth and prognosis. Neurosurgery 34:45–60 discussion 60-41Google Scholar
  6. 6.
    Aldave G, Tejada S, Pay E, Marigil M, Bejarano B, Idoate MA, Diez-Valle R (2013) Prognostic value of residual fluorescent tissue in glioblastoma patients after gross total resection in 5-aminolevulinic acid-guided surgery. Neurosurgery 72:915–920; discussion 920-911. Google Scholar
  7. 7.
    Arita H, Kinoshita M, Kagawa N, Fujimoto Y, Kishima H, Hashimoto N, Yoshimine T (2012) (1)(1)C-methionine uptake and intraoperative 5-aminolevulinic acid-induced fluorescence as separate index markers of cell density in glioma: a stereotactic image-histological analysis. Cancer 118:1619–1627. Google Scholar
  8. 8.
    Belloch JP, Rovira V, Llacer JL, Riesgo PA, Cremades A (2014) Fluorescence-guided surgery in high grade gliomas using an exoscope system. Acta Neurochir 156:653–660. Google Scholar
  9. 9.
    Black PM, Alexander E 3rd, Martin C, Moriarty T, Nabavi A, Wong TZ, Schwartz RB, Jolesz F (1999) Craniotomy for tumor treatment in an intraoperative magnetic resonance imaging unit. Neurosurgery 45:423–431 discussion 431-423Google Scholar
  10. 10.
    Bongetta D, Zoia C, Pugliese R, Adinolfi D, Silvani V, Gaetani P (2016) Low-cost fluorescein detection system for high-grade glioma surgery. World Neurosurg 88:54–58. Google Scholar
  11. 11.
    Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, Lijmer JG, Moher D, Rennie D, de Vet HC, Kressel HY, Rifai N, Golub RM, Altman DG, Hooft L, Korevaar DA, Cohen JF, Group S (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 351:h5527. Google Scholar
  12. 12.
    Bowden SG, Neira JA, Gill BJA, Ung TH, Englander ZK, Zanazzi G, Chang PD, Samanamud J, Grinband J, Sheth SA, McKhann GM 2nd, Sisti MB, Canoll P, D'Amico RS, Bruce JN (2018) Sodium fluorescein facilitates guided sampling of diagnostic tumor tissue in nonenhancing gliomas. Neurosurgery 82:719–727. Google Scholar
  13. 13.
    Broadbent B, Tseng J, Kast R, Noh T, Brusatori M, Kalkanis SN, Auner GW (2016) Shining light on neurosurgery diagnostics using Raman spectroscopy. J Neuro-Oncol 130:1–9. Google Scholar
  14. 14.
    Brusatori M, Auner G, Noh T, Scarpace L, Broadbent B, Kalkanis SN (2017) Intraoperative Raman spectroscopy. Neurosurg Clin N Am 28:633–652. Google Scholar
  15. 15.
    Butte PV, Mamelak AN, Nuno M, Bannykh SI, Black KL, Marcu L (2011) Fluorescence lifetime spectroscopy for guided therapy of brain tumors. Neuroimage 54(Suppl 1):S125–S135. Google Scholar
  16. 16.
    Catapano G, Sgulo FG, Seneca V, Lepore G, Columbano L, di Nuzzo G (2017) Fluorescein-guided surgery for high-grade glioma resection: an intraoperative “contrast-enhancer”. World Neurosurg 104:239–247. Google Scholar
  17. 17.
    Chan DTM, Yi-Pin Sonia H, Poon WS (2018) 5-Aminolevulinic acid fluorescence guided resection of malignant glioma: Hong Kong experience. Asian J Surg 41:467–472. Google Scholar
  18. 18.
    Chen B, Wang H, Ge P, Zhao J, Li W, Gu H, Wang G, Luo Y, Chen D (2012) Gross total resection of glioma with the intraoperative fluorescence-guidance of fluorescein sodium. Int J Med Sci 9:708–714. Google Scholar
  19. 19.
    Collins GS, Reitsma JB, Altman DG, Moons KGM, members of the Tg (2015) Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Eur Urol 67:1142–1151. Google Scholar
  20. 20.
    Cortnum S, Laursen RJ (2012) Fluorescence-guided resection of gliomas. Dan Med J 59:A4460Google Scholar
  21. 21.
    Della Puppa A, Ciccarino P, Lombardi G, Rolma G, Cecchin D, Rossetto M (2014) 5-Aminolevulinic acid fluorescence in high grade glioma surgery: surgical outcome, intraoperative findings, and fluorescence patterns. Biomed Res Int 2014:232561. Google Scholar
  22. 22.
    Della Puppa A, De Pellegrin S, d'Avella E, Gioffre G, Rossetto M, Gerardi A, Lombardi G, Manara R, Munari M, Saladini M, Scienza R (2013) 5-aminolevulinic acid (5-ALA) fluorescence guided surgery of high-grade gliomas in eloquent areas assisted by functional mapping. Our experience and review of the literature. Acta Neurochir 155:965–972; discussion 972. Google Scholar
  23. 23.
    Desroches J, Jermyn M, Mok K, Lemieux-Leduc C, Mercier J, St-Arnaud K, Urmey K, Guiot MC, Marple E, Petrecca K, Leblond F (2015) Characterization of a Raman spectroscopy probe system for intraoperative brain tissue classification. Biomed Opt Express 6:2380–2397. Google Scholar
  24. 24.
    Devpura S, Barton KN, Brown SL, Palyvoda O, Kalkanis S, Naik VM, Siddiqui F, Naik R, Chetty IJ (2014) Vision 20/20: the role of Raman spectroscopy in early stage cancer detection and feasibility for application in radiation therapy response assessment. Med Phys 41:050901. Google Scholar
  25. 25.
    Diaz RJ, Dios RR, Hattab EM, Burrell K, Rakopoulos P, Sabha N, Hawkins C, Zadeh G, Rutka JT, Cohen-Gadol AA (2015) Study of the biodistribution of fluorescein in glioma-infiltrated mouse brain and histopathological correlation of intraoperative findings in high-grade gliomas resected under fluorescein fluorescence guidance. J Neurosurg 122:1360–1369. Google Scholar
  26. 26.
    Ewelt C, Floeth FW, Felsberg J, Steiger HJ, Sabel M, Langen KJ, Stoffels G, Stummer W (2011) Finding the anaplastic focus in diffuse gliomas: the value of Gd-DTPA enhanced MRI, FET-PET, and intraoperative, ALA-derived tissue fluorescence. Clin Neurol Neurosurg 113:541–547. Google Scholar
  27. 27.
    Eyupoglu IY, Hore N, Fan Z, Buslei R, Merkel A, Buchfelder M, Savaskan NE (2015) Intraoperative vascular DIVA surgery reveals angiogenic hotspots in tumor zones of malignant gliomas. Sci Rep 5:7958. Google Scholar
  28. 28.
    Eyupoglu IY, Hore N, Savaskan NE, Grummich P, Roessler K, Buchfelder M, Ganslandt O (2012) Improving the extent of malignant glioma resection by dual intraoperative visualization approach. PLoS One 7:e44885. Google Scholar
  29. 29.
    Fatehi D, Baral TN, Abulrob A (2014) In vivo imaging of brain cancer using epidermal growth factor single domain antibody bioconjugated to near-infrared quantum dots. J Nanosci Nanotechnol 14:5355–5362Google Scholar
  30. 30.
    Fleiss JL, Levin B, Paik M (2003) Statistical methods for rates and proportions, 3rd edn. WileyGoogle Scholar
  31. 31.
    Floeth FW, Pauleit D, Wittsack HJ, Langen KJ, Reifenberger G, Hamacher K, Messing-Junger M, Zilles K, Weber F, Stummer W, Steiger HJ, Woebker G, Muller HW, Coenen H, Sabel M (2005) Multimodal metabolic imaging of cerebral gliomas: positron emission tomography with [18F]fluoroethyl-L-tyrosine and magnetic resonance spectroscopy. J Neurosurg 102:318–327. Google Scholar
  32. 32.
    Floeth FW, Sabel M, Ewelt C, Stummer W, Felsberg J, Reifenberger G, Steiger HJ, Stoffels G, Coenen HH, Langen KJ (2011) Comparison of (18)F-FET PET and 5-ALA fluorescence in cerebral gliomas. Eur J Nucl Med Mol Imaging 38:731–741. Google Scholar
  33. 33.
    Francaviglia N, Iacopino DG, Costantino G, Villa A, Impallaria P, Meli F, Maugeri R (2017) Fluorescein for resection of high-grade gliomas: a safety study control in a single center and review of the literature. Surg Neurol Int 8:145. Google Scholar
  34. 34.
    Gessler F, Forster MT, Duetzmann S, Mittelbronn M, Hattingen E, Franz K, Seifert V, Senft C (2015) Combination of intraoperative magnetic resonance imaging and intraoperative fluorescence to enhance the resection of contrast enhancing gliomas. Neurosurgery 77:16–22; discussion 22. Google Scholar
  35. 35.
    Griner PF, Mayewski RJ, Mushlin AI, Greenland P (1981) Selection and interpretation of diagnostic tests and procedures. Principles and applications. Ann Intern Med 94:557–592Google Scholar
  36. 36.
    Hamamcioglu MK, Akcakaya MO, Goker B, Kasimcan MO, Kiris T (2016) The use of the YELLOW 560 nm surgical microscope filter for sodium fluorescein-guided resection of brain tumors: our preliminary results in a series of 28 patients. Clin Neurol Neurosurg 143:39–45. Google Scholar
  37. 37.
    Hammoud MA, Ligon BL, elSouki R, Shi WM, Schomer DF, Sawaya R (1996) Use of intraoperative ultrasound for localizing tumors and determining the extent of resection: a comparative study with magnetic resonance imaging. J Neurosurg 84:737–741. Google Scholar
  38. 38.
    Hauser SB, Kockro RA, Actor B, Sarnthein J, Bernays RL (2016) Combining 5-aminolevulinic acid fluorescence and intraoperative magnetic resonance imaging in glioblastoma surgery: a histology-based evaluation. Neurosurgery 78:475–483. Google Scholar
  39. 39.
    Hefti M, von Campe G, Moschopulos M, Siegner A, Looser H, Landolt H (2008) 5-aminolevulinic acid induced protoporphyrin IX fluorescence in high-grade glioma surgery: a one-year experience at a single institution. Swiss Med Wkly 138:180–185Google Scholar
  40. 40.
    Hickmann AK, Nadji-Ohl M, Hopf NJ (2015) Feasibility of fluorescence-guided resection of recurrent gliomas using five-aminolevulinic acid: retrospective analysis of surgical and neurological outcome in 58 patients. J Neuro-Oncol 122:151–160. Google Scholar
  41. 41.
    Hong J, Chen B, Yao X, Yang Y (2018) Outcome comparisons of high-grade glioma resection with or without fluorescein sodium-guidance. Curr Probl Cancer.
  42. 42.
    Idoate MA, Diez Valle R, Echeveste J, Tejada S (2011) Pathological characterization of the glioblastoma border as shown during surgery using 5-aminolevulinic acid-induced fluorescence. Neuropathology 31:575–582. Google Scholar
  43. 43.
    Jaber M, Ewelt C, Wolfer J, Brokinkel B, Thomas C, Hasselblatt M, Grauer O, Stummer W (2018) Is visible aminolevulinic acid-induced fluorescence an independent biomarker for prognosis in histologically confirmed (World Health Organization 2016) low-grade gliomas? Neurosurgery.
  44. 44.
    Jaber M, Wolfer J, Ewelt C, Holling M, Hasselblatt M, Niederstadt T, Zoubi T, Weckesser M, Stummer W (2016) The value of 5-aminolevulinic acid in low-grade gliomas and high-grade gliomas lacking glioblastoma imaging features: an analysis based on fluorescence, magnetic resonance imaging, 18F-fluoroethyl tyrosine positron emission tomography, and tumor molecular factors. Neurosurgery 78:401–411; discussion 411. Google Scholar
  45. 45.
    Kalkanis SN, Kast RE, Rosenblum ML, Mikkelsen T, Yurgelevic SM, Nelson KM, Raghunathan A, Poisson LM, Auner GW (2014) Raman spectroscopy to distinguish grey matter, necrosis, and glioblastoma multiforme in frozen tissue sections. J Neuro-Oncol 116:477–485. Google Scholar
  46. 46.
    Karim MR, Zeger SL (1992) Generalized linear models with random effects; salamander mating revisited. Biometrics 48:631–644Google Scholar
  47. 47.
    Kast R, Auner G, Yurgelevic S, Broadbent B, Raghunathan A, Poisson LM, Mikkelsen T, Rosenblum ML, Kalkanis SN (2015) Identification of regions of normal grey matter and white matter from pathologic glioblastoma and necrosis in frozen sections using Raman imaging. J Neuro-Oncol 125:287–295. Google Scholar
  48. 48.
    Kast RE, Auner GW, Rosenblum ML, Mikkelsen T, Yurgelevic SM, Raghunathan A, Poisson LM, Kalkanis SN (2014) Raman molecular imaging of brain frozen tissue sections. J Neuro-Oncol 120:55–62. Google Scholar
  49.
    Kuroiwa T, Kajimoto Y, Ohta T (1998) Development of a fluorescein operative microscope for use during malignant glioma surgery: a technical note and preliminary report. Surg Neurol 50:41–48; discussion 48–49
  50.
    Kuroiwa T, Kajimoto Y, Ohta T (1999) Comparison between operative findings on malignant glioma by a fluorescein surgical microscopy and histological findings. Neurol Res 21:130–134
  51.
    Lanzardo S, Conti L, Brioschi C, Bartolomeo MP, Arosio D, Belvisi L, Manzoni L, Maiocchi A, Maisano F, Forni G (2011) A new optical imaging probe targeting alphaVbeta3 integrin in glioblastoma xenografts. Contrast Media Mol Imaging 6:449–458
  52.
    Lau D, Hervey-Jumper SL, Chang S, Molinaro AM, McDermott MW, Phillips JJ, Berger MS (2016) A prospective phase II clinical trial of 5-aminolevulinic acid to assess the correlation of intraoperative fluorescence intensity and degree of histologic cellularity during resection of high-grade gliomas. J Neurosurg 124:1300–1309
  53.
    Lee JY, Thawani JP, Pierce J, Zeh R, Martinez-Lage M, Chanin M, Venegas O, Nims S, Learned K, Keating J, Singhal S (2016) Intraoperative near-infrared optical imaging can localize gadolinium-enhancing gliomas during surgery. Neurosurgery 79:856–871
  54.
    Leisenring W, Pepe MS, Longton G (1997) A marginal regression modelling framework for evaluating medical diagnostic tests. Stat Med 16:1263–1281
  55.
    Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM (1999) Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 282:1061–1066
  56.
    Liu JG, Yang SF, Liu YH, Wang X, Mao Q (2013) Magnetic resonance diffusion tensor imaging with fluorescein sodium dyeing for surgery of gliomas in brain motor functional areas. Chin Med J 126:2418–2423
  57.
    McGirt MJ, Chaichana KL, Attenello FJ, Weingart JD, Than K, Burger PC, Olivi A, Brem H, Quinones-Hinojosa A (2008) Extent of surgical resection is independently associated with survival in patients with hemispheric infiltrating low-grade gliomas. Neurosurgery 63:700–707; author reply 707–708
  58.
    Moher D, Liberati A, Tetzlaff J, Altman DG, Group P (2010) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Int J Surg 8:336–341
  59.
    Moiyadi A, Shetty P (2014) Navigable intraoperative ultrasound and fluorescence-guided resections are complementary in resection control of malignant gliomas: one size does not fit all. J Neurol Surg A Cent Eur Neurosurg 75:434–441
  60.
    Moiyadi AV, Sridhar E, Shetty P, Madhugiri VS, Solanki S (2018) What you see and what you don’t - utility and pitfalls during fluorescence guided resections of gliomas using 5-aminolevulinic acid. Neurol India 66:1087–1093
  61.
    Nabavi A, Thurm H, Zountsas B, Pietsch T, Lanfermann H, Pichlmeier U, Mehdorn M, Group ALARGS (2009) Five-aminolevulinic acid for fluorescence-guided resection of recurrent malignant gliomas: a phase II study. Neurosurgery 65:1070–1076; discussion 1076–1077
  62.
    Neira JA, Ung TH, Sims JS, Malone HR, Chow DS, Samanamud JL, Zanazzi GJ, Guo X, Bowden SG, Zhao B, Sheth SA, McKhann GM 2nd, Sisti MB, Canoll P, D'Amico RS, Bruce JN (2016) Aggressive resection at the infiltrative margins of glioblastoma facilitated by intraoperative fluorescein guidance. J Neurosurg 1–12
  63.
    Ng WP, Liew BS, Idris Z, Rosman AK (2017) Fluorescence-guided versus conventional surgical resection of high grade glioma: a single-centre, 7-year, comparative effectiveness study. Malays J Med Sci 24:78–86
  64.
    Okuda T, Yoshioka H, Kato A (2012) Fluorescence-guided surgery for glioblastoma multiforme using high-dose fluorescein sodium with excitation and barrier filters. J Clin Neurosci 19:1719–1722
  65.
    Panciani PP, Fontanella M, Garbossa D, Agnoletti A, Ducati A, Lanotte M (2012) 5-aminolevulinic acid and neuronavigation in high-grade glioma surgery: results of a combined approach. Neurocirugia (Astur) 23:23–28
  66.
    Panciani PP, Fontanella M, Schatlo B, Garbossa D, Agnoletti A, Ducati A, Lanotte M (2012) Fluorescence and image guided resection in high grade glioma. Clin Neurol Neurosurg 114:37–41
  67.
    Petterssen M, Eljamel S, Eljamel S (2014) Protoporphyrin-IX fluorescence guided surgical resection in high-grade gliomas: the potential impact of human colour perception. Photodiagn Photodyn Ther 11:351–356
  68.
    Piquer J, Llacer JL, Rovira V, Riesgo P, Rodriguez R, Cremades A (2014) Fluorescence-guided surgery and biopsy in gliomas with an exoscope system. Biomed Res Int 2014:207974
  69.
    Rao JN, Scott AJ (1992) A simple method for the analysis of clustered binary data. Biometrics 48:577–585
  70.
    Rapp M, Kamp M, Steiger HJ, Sabel M (2014) Endoscopic-assisted visualization of 5-aminolevulinic acid-induced fluorescence in malignant glioma surgery: a technical note. World Neurosurg 82:e277–e279
  71.
    Reid MC, Lachs MS, Feinstein AR (1995) Use of methodological standards in diagnostic test research. Getting better but still not good. JAMA 274:645–651
  72.
    Rey-Dios R, Hattab EM, Cohen-Gadol AA (2014) Use of intraoperative fluorescein sodium fluorescence to improve the accuracy of tissue diagnosis during stereotactic needle biopsy of high-grade gliomas. Acta Neurochir 156:1071–1075; discussion 1075
  73.
    Ritz R, Daniels R, Noell S, Feigl GC, Schmidt V, Bornemann A, Ramina K, Mayer D, Dietz K, Strauss WS, Tatagiba M (2012) Hypericin for visualization of high grade gliomas: first clinical experience. Eur J Surg Oncol 38:352–360
  74.
    Roberts DW, Valdes PA, Harris BT, Fontaine KM, Hartov A, Fan X, Ji S, Lollis SS, Pogue BW, Leblond F, Tosteson TD, Wilson BC, Paulsen KD (2011) Coregistered fluorescence-enhanced tumor resection of malignant glioma: relationships between delta-aminolevulinic acid-induced protoporphyrin IX fluorescence, magnetic resonance imaging enhancement, and neuropathological parameters. Clinical article. J Neurosurg 114:595–603
  75.
    Sanai N, Polley MY, McDermott MW, Parsa AT, Berger MS (2011) An extent of resection threshold for newly diagnosed glioblastomas. J Neurosurg 115:3–8
  76.
    Schaafsma BE, Mieog JS, Hutteman M, van der Vorst JR, Kuppen PJ, Lowik CW, Frangioni JV, van de Velde CJ, Vahrmeijer AL (2011) The clinical use of indocyanine green as a near-infrared fluorescent contrast agent for image-guided oncologic surgery. J Surg Oncol 104:323–332
  77.
    Schebesch KM, Brawanski A, Doenitz C, Rosengarth K, Proescholdt M, Riemenschneider MJ, Grosse J, Hellwig D, Hohne J (2018) Fluorescence-guidance in non-gadolinium enhancing, but FET-PET positive gliomas. Clin Neurol Neurosurg 172:177–182
  78.
    Schebesch KM, Proescholdt M, Hohne J, Hohenberger C, Hansen E, Riemenschneider MJ, Ullrich W, Doenitz C, Schlaier J, Lange M, Brawanski A (2013) Sodium fluorescein-guided resection under the YELLOW 560 nm surgical microscope filter in malignant brain tumor surgery--a feasibility study. Acta Neurochir 155:693–699
  79.
    Schucht P, Beck J, Abu-Isa J, Andereggen L, Murek M, Seidel K, Stieglitz L, Raabe A (2012) Gross total resection rates in contemporary glioblastoma surgery: results of an institutional protocol combining 5-aminolevulinic acid intraoperative fluorescence imaging and brain mapping. Neurosurgery 71:927–935; discussion 935–936
  80.
    Shah A, Rangarajan V, Kaswa A, Jain S, Goel A (2016) Indocyanine green as an adjunct for resection of insular gliomas. Asian J Neurosurg 11:276–281
  81.
    Shapiro DE (1999) The interpretation of diagnostic tests. Stat Methods Med Res 8:113–134
  82.
    Smith JS, Chang EF, Lamborn KR, Chang SM, Prados MD, Cha S, Tihan T, Vandenberg S, McDermott MW, Berger MS (2008) Role of extent of resection in the long-term outcome of low-grade hemispheric gliomas. J Clin Oncol 26:1338–1345
  83.
    Smith PJ, Hadgu A (1992) Sensitivity and specificity for correlated observations. Stat Med 11:1503–1509
  84.
    Stummer W, Novotny A, Stepp H, Goetz C, Bise K, Reulen HJ (2000) Fluorescence-guided resection of glioblastoma multiforme by using 5-aminolevulinic acid-induced porphyrins: a prospective study in 52 consecutive patients. J Neurosurg 93:1003–1013
  85.
    Stummer W, Pichlmeier U, Meinel T, Wiestler OD, Zanella F, Reulen HJ, Group AL-GS (2006) Fluorescence-guided surgery with 5-aminolevulinic acid for resection of malignant glioma: a randomised controlled multicentre phase III trial. Lancet Oncol 7:392–401
  86.
    Stummer W, Reulen HJ, Meinel T, Pichlmeier U, Schumacher W, Tonn JC, Rohde V, Oppel F, Turowski B, Woiciechowsky C, Franz K, Pietsch T, Group AL-GS (2008) Extent of resection and survival in glioblastoma multiforme: identification of and adjustment for bias. Neurosurgery 62:564–576; discussion 564–576
  87.
    Stummer W, Rodrigues F, Schucht P, Preuss M, Wiewrodt D, Nestler U, Stein M, Artero JM, Platania N, Skjoth-Rasmussen J, Della Puppa A, Caird J, Cortnum S, Eljamel S, Ewald C, Gonzalez-Garcia L, Martin AJ, Melada A, Peraud A, Brentrup A, Santarius T, Steiner HH, European ALAPBTSG (2014) Predicting the "usefulness" of 5-ALA-derived tumor fluorescence for fluorescence-guided resections in pediatric brain tumors: a European survey. Acta Neurochir 156:2315–2324
  88.
    Stummer W, Stepp H, Wiestler OD, Pichlmeier U (2017) Randomized, prospective double-blinded study comparing 3 different doses of 5-aminolevulinic acid for fluorescence-guided resections of malignant gliomas. Neurosurgery 81:230–239
  89.
    Stummer W, Stocker S, Wagner S, Stepp H, Fritsch C, Goetz C, Goetz AE, Kiefmann R, Reulen HJ (1998) Intraoperative detection of malignant gliomas by 5-aminolevulinic acid-induced porphyrin fluorescence. Neurosurgery 42:518–525; discussion 525–526
  90.
    Stummer W, Tonn JC, Goetz C, Ullrich W, Stepp H, Bink A, Pietsch T, Pichlmeier U (2014) 5-Aminolevulinic acid-derived tumor fluorescence: the diagnostic accuracy of visible fluorescence qualities as corroborated by spectrometry and histology and postoperative imaging. Neurosurgery 74:310–319; discussion 319–320
  91.
    Swanson KI, Clark PA, Zhang RR, Kandela IK, Farhoud M, Weichert JP, Kuo JS (2015) Fluorescent cancer-selective alkylphosphocholine analogs for intraoperative glioma detection. Neurosurgery 76:115–123; discussion 123–124
  92.
    Szmuda T, Sloniewski P, Olijewski W, Springer J, Waszak PM (2015) Colour contrasting between tissues predicts the resection in 5-aminolevulinic acid-guided surgery of malignant gliomas. J Neuro-Oncol 122:575–584
  93.
    Tsugu A, Ishizaka H, Mizokami Y, Osada T, Baba T, Yoshiyama M, Nishiyama J, Matsumae M (2011) Impact of the combination of 5-aminolevulinic acid-induced fluorescence with intraoperative magnetic resonance imaging-guided surgery for glioma. World Neurosurg 76:120–127
  94.
    Tummers WS, Warram JM, Tipirneni KE, Fengler J, Jacobs P, Shankar L, Henderson L, Ballard B, Pogue BW, Weichert JP, Bouvet M, Sorger J, Contag CH, Frangioni JV, Tweedle MF, Basilion JP, Gambhir SS, Rosenthal EL (2017) Regulatory aspects of optical methods and exogenous targets for cancer detection. Cancer Res 77:2197–2206
  95.
    Utsuki S, Oka H, Miyajima Y, Shimizu S, Suzuki S, Fujii K (2008) Auditory alert system for fluorescence-guided resection of gliomas. Neurol Med Chir (Tokyo) 48:95–97; discussion 97–98
  96.
    Valdes PA, Jacobs V, Harris BT, Wilson BC, Leblond F, Paulsen KD, Roberts DW (2015) Quantitative fluorescence using 5-aminolevulinic acid-induced protoporphyrin IX biomarker as a surgical adjunct in low-grade glioma surgery. J Neurosurg 123:771–780
  97.
    Valdes PA, Kim A, Brantsch M, Niu C, Moses ZB, Tosteson TD, Wilson BC, Paulsen KD, Roberts DW, Harris BT (2011) delta-aminolevulinic acid-induced protoporphyrin IX concentration correlates with histopathologic markers of malignancy in human gliomas: the need for quantitative fluorescence-guided resection to identify regions of increasing malignancy. Neuro-Oncology 13:846–856
  98.
    Valdes PA, Kim A, Leblond F, Conde OM, Harris BT, Paulsen KD, Wilson BC, Roberts DW (2011) Combined fluorescence and reflectance spectroscopy for in vivo quantification of cancer biomarkers in low- and high-grade glioma surgery. J Biomed Opt 16:116007
  99.
    Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM, Group Q (2011) QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155:529–536
  100.
    Widhalm G, Kiesel B, Woehrer A, Traub-Weidinger T, Preusser M, Marosi C, Prayer D, Hainfellner JA, Knosp E, Wolfsberger S (2013) 5-Aminolevulinic acid induced fluorescence is a powerful intraoperative marker for precise histopathological grading of gliomas with non-significant contrast-enhancement. PLoS One 8:e76988
  101.
    Widhalm G, Wolfsberger S, Minchev G, Woehrer A, Krssak M, Czech T, Prayer D, Asenbaum S, Hainfellner JA, Knosp E (2010) 5-Aminolevulinic acid is a promising marker for detection of anaplastic foci in diffusely infiltrating gliomas with nonsignificant contrast enhancement. Cancer 116:1545–1552
  102.
    Xiang Y, Zhu XP, Zhao JN, Huang GH, Tang JH, Chen HR, Du L, Zhang D, Tang XF, Yang H, Lv SQ (2018) Blood-brain barrier disruption, sodium fluorescein, and fluorescence-guided surgery of gliomas. Br J Neurosurg 32:141–148
  103.
    Yamada S, Muragaki Y, Maruyama T, Komori T, Okada Y (2015) Role of neurochemical navigation with 5-aminolevulinic acid during intraoperative MRI-guided resection of intracranial malignant gliomas. Clin Neurol Neurosurg 130:134–139
  104.
    Zeger SL, Liang KY (1986) Longitudinal data analysis for discrete and continuous outcomes. Biometrics 42:121–130
  105.
    Zeh R, Sheikh S, Xia L, Pierce J, Newton A, Predina J, Cho S, Nasrallah M, Singhal S, Dorsey J, Lee JYK (2017) The second window ICG technique demonstrates a broad plateau period for near infrared fluorescence tumor contrast in glioblastoma. PLoS One 12:e0182034
  106.
    Zhang N, Tian H, Huang D, Meng X, Guo W, Wang C, Yin X, Zhang H, Jiang B, He Z, Wang Z (2017) Sodium fluorescein-guided resection under the YELLOW 560 nm surgical microscope filter in malignant gliomas: our first 38 cases experience. Biomed Res Int 2017:7865747
  107.
    Zhou X, Obuchowski N, McClish D (2002) Statistical methods in diagnostic medicine, 2nd edn. Wiley

Copyright information

© The Author(s) 2019

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  • Walter Stummer (1, corresponding author)
  • Raphael Koch (2)
  • Ricardo Diez Valle (3)
  • David W. Roberts (4)
  • Nadar Sanai (5)
  • Steve Kalkanis (6)
  • Constantinos G. Hadjipanayis (7)
  • Eric Suero Molina (1)

  1. Department of Neurosurgery, University Hospital of Münster, Münster, Germany
  2. Institute of Biostatistics and Clinical Research, University of Münster, Münster, Germany
  3. Department of Neurosurgery, University Clinic, Navarra, Spain
  4. Department of Neurosurgery, Dartmouth Hitchcock Medical Center, Lebanon, USA
  5. Division of Neurosurgical Oncology, Ivy Brain Tumor Center, Barrow Neurological Institute, Phoenix, USA
  6. Department of Neurosurgery, Henry Ford Health System, Detroit, USA
  7. Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, USA