Current measures of metabolic heterogeneity within cervical cancer do not predict disease outcome
A previous study evaluated the intra-tumoral heterogeneity observed in the uptake of F-18 fluorodeoxyglucose (FDG) in pre-treatment positron emission tomography (PET) scans of cancers of the uterine cervix as an indicator of disease outcome. This was done via a novel statistic which ostensibly measured the spatial variations in intra-tumoral metabolic activity. In this work, we argue that this statistic is intrinsically non-spatial, and that the apparent delineation between unsuccessfully and successfully treated patient groups via that statistic is spurious.
We first offer a straightforward mathematical demonstration of our argument. Next, we recapitulate an assiduous re-analysis of the originally published data which was derived from FDG-PET imagery. Finally, we present the results of a principal component analysis of FDG-PET images similar to those previously analyzed.
We find that the previously published measure of intra-tumoral heterogeneity is intrinsically non-spatial, and actually is only a surrogate for tumor volume. We also find that an optimized linear combination of more canonical heterogeneity quantifiers does not predict disease outcome.
Current measures of intra-tumoral metabolic activity are not predictive of disease outcome as has been claimed previously. The implications of this finding are: clinical categorization of patients based upon these statistics is invalid; more sophisticated, and perhaps innately-geometric, quantifications of metabolic activity are required for predicting disease outcome.
Keywords: Standard Uptake Value; Percent Threshold; Grayscale Intensity; Metabolic Heterogeneity; Predict Disease Outcome
It is believed that cancerous tumors are intrinsically heterogeneous in many ways. Experimentally quantified properties that exhibit significant variation within tumors include gene expression, cell proliferation rate, degree of vascularization, and hypoxia [3, 5]. When properties of tumors are assayed via an imaging technique such as positron emission tomography (PET), the question of quantifying biologically-functional heterogeneity becomes one of quantifying the spatial heterogeneity observed in grayscale images. In this case, one describes the arrangement of the various pixel intensities, with some arrangements subjectively appearing more heterogeneous than others. For example, the smooth gradation of a single bright spot to a darker background is intuitively less heterogeneous than the stark transitions seen by surrounding several clusters of the brightest pixels with only the darkest pixels. The goal of quantifying spatial heterogeneity is to objectively calculate a single statistic that indicates one pattern is a certain percentage more or less heterogeneous than another.
In this work, we re-analyze the identical FDG-PET-derived data used in that previous study and offer an alternative interpretation. Specifically, we argue that the novel measure employed in that work to quantify spatial heterogeneity of the grayscale PET images is intrinsically independent of spatial arrangement, and indeed is a surrogate for tumor volume. As such, it can offer no additional predictive capacity to that of tumor volume. Thus, the delineation of patients into distinct groups of post-treatment survival time via that heterogeneity measure is invalid. Additionally, we examine a similar data set and demonstrate that fundamental, non-spatial measures of heterogeneity applied to the FDG-PET assay of metabolic activity do not predict disease outcome. Finally, we discuss some implications of these results.
Analysis of Previously Published Data
In this work, we first re-analyze the same data originally analyzed in a previous heterogeneity-quantification study. We briefly recapitulate the details of that prospective cohort study here. Patients underwent a pre-treatment, whole-body FDG-PET/CT scan. The pathologic diagnosis and histology were determined by pathologists at Washington University in St. Louis. All patients were treated with concurrent chemotherapy and radiation. A post-therapy FDG-PET/CT scan performed three months after completing the radiation treatment was used to evaluate the response to treatment. For our re-analysis of the 73 total patients, the 14 with persistent disease were combined with the 9 exhibiting new metastases into a single group of those having undergone unsuccessful treatment.
Segmentation of Additional FDG-PET Imagery
The first task of analyzing imaged tumors is to delineate the tumors from the background (referred to as image segmentation). In the case of FDG-PET, the radiopharmaceutical is also taken up and metabolized by noncancerous cells, although to a lesser extent [10, 11]. The typical result is an evidently stronger PET signal (tumor) surrounded by a weaker signal (non-tumor), with the possibility of additional non-tumorous bright spots colocated with the bladder or rectum as undelivered radiopharmaceutical is cleared from the body. As may be seen in Figure 1, the interface between the healthy and tumorous regions may not be stark, but rather nebulous as tumor cells invade healthy tissue in a diffuse fashion. This is seen in the image as a smooth gradation from brighter pixels to dimmer ones. In order to objectively distinguish tumor from background, we employed the rule of thumb that, for a visually-selected, three-dimensional region of interest (ROI), any pixel brighter than 40% of the maximum ROI pixel brightness is to be considered part of the tumor. This 40% rule is based upon the observation that tumors defined as regions of greater than 40% of the maximum standard uptake value (SUV) of FDG both colocate with those independently identified via visual analysis of computed tomography scans and yield volumes consistent with published surgical series. The SUV is a PET intensity measure that first has been converted to proper radiation units, then corrected for both radioactive decay and patient body mass. For each patient, the net result is that every grayscale image pixel is multiplied by a single, positive constant. Because we seek to quantify intra-tumoral variation and since there is some debate as to the usefulness and validity of standard uptake values [14, 15], we apply the 40% rule directly to the grayscale intensities.
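The 40% rule can be illustrated with a minimal sketch. This is not the authors' program; the function names `segment_by_fraction` and `count_voxels`, the nested-list layout, and the toy intensities are assumptions made purely for illustration.

```python
def segment_by_fraction(roi, fraction=0.40):
    """Return a boolean mask: True where a pixel is brighter than
    `fraction` of the maximum intensity found in the ROI.

    `roi` is a nested list indexed [slice][row][col] of grayscale values
    (a hypothetical layout chosen for this sketch).
    """
    peak = max(v for sl in roi for row in sl for v in row)
    cutoff = fraction * peak
    return [[[v > cutoff for v in row] for row in sl] for sl in roi]


def count_voxels(mask):
    """Number of voxels flagged as tumor (True counts as 1)."""
    return sum(v for sl in mask for row in sl for v in row)


# Toy two-slice ROI with 8-bit intensities (illustrative values only).
toy_roi = [
    [[10, 90, 200], [30, 250, 120]],
    [[5, 101, 95], [60, 180, 20]],
]
toy_mask = segment_by_fraction(toy_roi)
toy_volume = count_voxels(toy_mask)  # voxel count; multiply by voxel size for cm^3
```

Because the SUV conversion multiplies every pixel of a patient's image by one positive constant, the mask produced this way is the same whether it is computed on grayscale intensities or on SUVs.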
A computer program to semi-automate the image segmentation process was written in Python v2.6.1 (http://www.python.org/). As is ubiquitous in the field, the raw FDG-PET images are first processed through a white-balance-correcting, back-projection algorithm via the proprietary software native to the PET machine. The resulting DICOM image files are imported into our program via the pydicom library v0.9.3 (http://code.google.com/p/pydicom/) and then converted to 8-bit grayscale images via the Python Imaging Library v1.1.7 (http://www.pythonware.com/products/pil/). No additional image preprocessing was implemented. Our program enables the user to rapidly target a region of the whole-body, trans-axial PET image set. Next, the program applies the 40% segmentation rule to all grayscale pixels in the targeted region (e.g., the pelvic region). A flood-fill algorithm is then applied to every pixel remaining in that region in order to determine the inter-pixel connectivity (or lack thereof). The result of this algorithm is a set of distinctly bounded, contiguous objects. The user can then visually scan the objects and click to remove those few that are obviously (for sound anatomical reasons) not tumors. The typical end result is a stack of 10 to 20 grayscale images representing trans-axial slices of a clearly bounded tumor.
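The flood-fill step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the name `label_objects` is hypothetical, and 6-connectivity (face-sharing neighbours) is an assumption, since the text does not state which connectivity was used.

```python
from collections import deque


def label_objects(mask):
    """Group True voxels of a [slice][row][col] mask into contiguous objects.

    Assumes 6-connectivity; the source does not specify the connectivity
    of its flood fill. Returns a list of objects, each a list of (z, y, x).
    """
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    objects = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill from this unvisited seed voxel.
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                voxels = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    voxels.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                objects.append(voxels)
    return objects


# Toy thresholded region: one 3-voxel object and one 2-voxel object.
toy_mask = [
    [[True, True, False, False],
     [False, False, False, True]],
    [[False, True, False, False],
     [False, False, False, True]],
]
toy_objects = label_objects(toy_mask)
```

Each returned object corresponds to one candidate region that a user could keep or discard on anatomical grounds.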
It is important to point out that in the scheme described above, the numeric value of the slope is independent of spatial arrangement. For example, the set of grayscale values representing the tumor could be rearranged such that each value resides at a new 3D Cartesian coordinate. In other words, it is possible to "draw" various artificial objects by purposefully placing selected grayscale values at desired coordinates. However, the number of each distinct grayscale value remains constant, regardless of where in the object those values may reside. Since the volume of the tumor object ultimately was calculated by counting pixels above a given threshold, that volume does not change even when the tumor object is destroyed via rearrangement. Thus, any measure of heterogeneity given by the slope reflects only the diversity of intensity values, not the spatial arrangement of those values.
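The rearrangement argument can be checked numerically with a small sketch. The helper names `volume_above` and `threshold_slope`, the random toy intensities, and the use of a raw voxel count in place of a physical volume are all assumptions made for illustration.

```python
import random


def volume_above(values, percent):
    """Count voxels brighter than `percent` of the maximum intensity."""
    cutoff = max(values) * percent / 100.0
    return sum(1 for v in values if v > cutoff)


def threshold_slope(values, t_min=40.0):
    """Slope of the volume-versus-threshold line between t_min and 2*t_min."""
    return (volume_above(values, 2 * t_min) - volume_above(values, t_min)) / t_min


rng = random.Random(0)                              # fixed seed for repeatability
tumor = [rng.randint(0, 255) for _ in range(1000)]  # flattened toy "tumor"
scrambled = tumor[:]
rng.shuffle(scrambled)                              # destroy the spatial arrangement

slope_original = threshold_slope(tumor)
slope_scrambled = threshold_slope(scrambled)
```

The two slopes agree exactly, because the calculation only ever sees the multiset of intensities and never the coordinates at which they sit.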
Critique of Previously Published Results
In a stack of trans-axial FDG-PET images, a region of interest fully containing the tumor is first selected by a trained clinician. This region of interest is successively thresholded, and the volume of the region remaining after each thresholding is computed. Let $V_A(T) = V_{A0} e^{-\lambda T}$ approximate a typical, observed volume ($V$) versus percent threshold ($T$) curve for patient A (see Figure 2). At zero percent threshold, $V_A(0) = V_{A0}$, the total volume of the initial target region. It is straightforward to show that the slope of the line between a minimum, tumor-defining threshold $T_m$ and twice that threshold (e.g., 40% and 80%) is $s_A = (V_A(T_m)/T_m) \cdot (V_A(T_m)/V_{A0} - 1)$. We now wish to compare this slope (ostensibly a measure of heterogeneity) to that of a second patient, B, where $V_B(T) = V_{B0} e^{-\mu T}$. From the 73 available $V(T)$ curves, we observed that, save for extremely large tumor volumes (greater than 150 cm³), the total volume of tumor exhibiting pixel intensities greater than 80% of the maximum observed intensity is typically very small (≈3 cm³). Thus, the end points of the linearization are approximately equal for every patient. Therefore, $V_A(2T_m) \approx V_B(2T_m)$, from which it is seen that $V_A(T_m)^2/V_{A0} \approx V_B(T_m)^2/V_{B0}$. Proceeding as before, and employing this approximation, one may show that the change in slope is $\Delta s \equiv |s_A - s_B| = |V_B(T_m) - V_A(T_m)|/T_m \equiv \Delta V(T_m)/T_m$. In words, the previously published measure of intra-tumoral heterogeneity is directly proportional to the pre-treatment tumor volume. It is important to note that this result depends only upon the measured 40% tumor volumes, and in no way depends upon the decay rate or closeness of fit of either exponential curve.
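The algebra above can be verified numerically under the stated exponential model. In this sketch the two patients are hypothetical: their 40% volumes (60 and 25 cm³) are invented for illustration, and each curve is constructed so that the 80% volume equals the ≈3 cm³ quoted in the text. The function names are likewise assumptions.

```python
import math

T_M = 40.0    # tumor-defining percent threshold
V_80 = 3.0    # common volume at 80% threshold, in cm^3


def exp_volume(v0, rate, t):
    """Model curve V(T) = V0 * exp(-rate * T)."""
    return v0 * math.exp(-rate * t)


def secant_slope(v0, rate, t_m):
    """Slope of the line joining (T_m, V(T_m)) and (2*T_m, V(2*T_m))."""
    return (exp_volume(v0, rate, 2 * t_m) - exp_volume(v0, rate, t_m)) / t_m


def closed_form_slope(v0, rate, t_m):
    """Closed form from the text: (V(T_m)/T_m) * (V(T_m)/V0 - 1)."""
    v_m = exp_volume(v0, rate, t_m)
    return (v_m / t_m) * (v_m / v0 - 1.0)


def fit_patient(v_40):
    """Choose V0 and the decay rate so that V(T_M) = v_40 and V(2*T_M) = V_80."""
    rate = math.log(v_40 / V_80) / T_M
    v0 = v_40 ** 2 / V_80
    return v0, rate


a0, lam = fit_patient(60.0)   # hypothetical patient A
b0, mu = fit_patient(25.0)    # hypothetical patient B

delta_s = abs(secant_slope(a0, lam, T_M) - secant_slope(b0, mu, T_M))
delta_v_over_t = abs(60.0 - 25.0) / T_M
```

With identical 80% end points the relation $\Delta s = \Delta V(T_m)/T_m$ holds exactly, regardless of the two decay rates; with merely similar end points it holds approximately.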
In an effort to verify this result, we studied the FDG-PET imagery of 47 recently examined patients who did not appear in the previously published study. The images were again obtained as described in that study but segmented as described in the Methods section. We computed the volume-detrended slopes as before.
Again, we found no distinguishing capacity whatsoever between the successfully treated patients, where the mean slope is 2.20, and the unsuccessfully treated patients where the mean slope is 2.23.
Extended Heterogeneity Analysis
Previous arguments imply that the volume versus threshold slope is sensitive to the distribution of grayscale intensities of the trans-axial image stack. We therefore chose to investigate the relation between these distributions and disease outcome via the fundamental quantifiers of distributions: the standard deviation, skewness and kurtosis. Each of these quantifiers describes a unique quality of non-spatial heterogeneity. The standard deviation indicates the number of unique grayscale values comprising the image stack; that is, the number of different levels of metabolic activity observed. The kurtosis indicates the relative strength of those metabolic levels since a distribution with only a single, sharp peak (higher kurtosis) indicates a favored metabolic activity level. The skewness indicates the pervasiveness of activity levels. For example, an overall brighter distribution (negatively skewed) implies that the majority of tumor volume exhibits relatively higher metabolic activity whereas a skewness of zero indicates equal volumes of activities above and below the mean activity.
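The three quantifiers can be computed directly from the flattened list of tumor voxel intensities. The sketch below is illustrative only: the function name is hypothetical, and the choice of population (biased) moment estimators and of excess kurtosis (zero for a normal distribution) is an assumption, since the text does not state which conventions were used.

```python
import math


def distribution_quantifiers(values):
    """Return (standard deviation, skewness, excess kurtosis) of `values`.

    Population (biased) central moments are used; this convention is an
    assumption and is not taken from the source.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    return std, m3 / std ** 3, m4 / m2 ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]       # equal spread above and below the mean
bright_heavy = [1, 5, 5, 5, 5]    # mostly bright voxels with a tail toward dark

sd_sym, skew_sym, kurt_sym = distribution_quantifiers(symmetric)
sd_bright, skew_bright, kurt_bright = distribution_quantifiers(bright_heavy)
```

The symmetric list yields zero skewness, while the mostly bright list yields negative skewness, matching the interpretation given above for an overall brighter distribution.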
From the corresponding eigenvalues, we compute that ≈98% of the total variation in phase space is represented by the standard deviation alone. This high percentage indicates that more sophisticated, non-spatial measures of heterogeneity--which we assert ultimately are based upon the fundamental quantifiers--are unlikely to improve upon the standard measure of uncertainty. In other words, the standard deviation alone is a reasonable non-spatial measure of the variation in metabolic activity. Thus, we suggest that the textbook usage of the standard deviation as the uncertainty in the mean value is adequate when computing statistics, such as the total glycolytic volume, which are spatially averaged over the entire tumor volume.
A potential concern lies in our definition of patient groups, where the unsuccessfully treated group is the union of those patients having post-treatment persistent cancer with those having post-treatment new metastases. In an effort to avoid any bias due to pre-existing metastases, we repeated both the re-analysis of existing data and our entire principal component analysis. We first eliminated those with new metastases from the unsuccessfully treated group. We then computed the volume-detrended slopes described earlier and again found that the mean value for the successfully treated group (2.28) is nearly identical to that of the unsuccessfully treated group (2.32). Thus, bias due to inclusion of patients with new metastases does not explain the lack of predictive capacity of the previously published measure of heterogeneity. We now explore the potential effect of this bias in our principal component analysis. Proceeding as before, we compute a new ψ variable for the truncated matrix of observations, excluding patients with new metastases. The mean values of ψ for patients undergoing successful or unsuccessful treatment are then 30.4 (p = 0.51) and 31.7 (p = 0.38), respectively. We again see no substantive difference between the mean values for each group and thus conclude that patients with new metastases did not bias our previous result that non-spatial metabolic quantifiers do not predict disease outcome.
It is important that we immediately point out that we are not claiming that intra-tumoral metabolic heterogeneity does not exist. Indeed, we presume that metabolic activity can vary significantly throughout a tumor. In a younger, pre-vascularized tumor, such variations are likely due to a non-constant, diffusion-limited nutrient density. In a mature tumor, these variations could be due to necrosis or even steric constraints imposed by the spatially-randomized, densely-packed nature of newly-formed vascularization networks. In order to measure a genuine heterogeneity in a stack of images, one must be able to distinguish a single volume element (voxel) from another. The minimum detectable inter-voxel difference is determined by the noise intrinsic to the FDG-PET assay. The noise in a typical 3D FDG-PET image reconstructed via filtered back-projection has been estimated to be 1.5 kBq/mL. This is only 3% of the ≈50 kBq/mL mean activity of all tumor voxels defined above 40% intensity threshold in our extended heterogeneity study. This implies that the FDG-PET assay can distinguish relatively small changes in the metabolism of tumor cells averaged over a typical PET image voxel. We therefore conclude that the non-predictive nature of bulk heterogeneity statistics is not due to either a genuine lack of variation in metabolic activity or the poor resolution of this variation.
Instead, our results imply that quantification of tumor composition via FDG-PET remains a challenging, open problem. We maintain that a shift of focus from tumor composition to shape and location offers immediate potential for improved clinical therapy. Consider that the uncertainty in the anatomical placement of brachytherapy radiation sources via a standard gynecological implant is at least several millimeters. This is the same order of spatial uncertainty as in FDG-PET-assayed tumors, where the side length of a cubical voxel is typically ≈4 mm. Also, as the computation of radiation fields is rapidly becoming more accurate and more computationally accessible, it is feasible that more precise, geometric quantification of metabolic variations will directly yield more effective treatment plans. For example, it could be the case that tumors of a particular shape or asymmetry are indicative of disease outcome [22, 23]. These geometric qualities can be quantified readily via the well-known techniques common to image texture analysis or the physics of particle systems.
We have shown that neither the currently accepted measure, nor other reasonable non-spatial measures, of intra-tumoral metabolic heterogeneity within cervical cancer are predictive of disease outcome. This is directly counter to a previously published claim. We have given a brief mathematical explanation of why that claim is erroneous and have supported our argument with the results of both a re-analysis of the originally published data and a fundamental statistical analysis of a similar data set. Our findings have immediate impact upon clinical research and treatment. The use of currently-accepted, non-spatial quantifiers of intra-tumoral metabolic heterogeneity as a means to categorize patients into groups predicted to be successfully or unsuccessfully treated is invalid. Thus, more sophisticated, and perhaps innately-geometric, quantifications of metabolic activity are required for predicting disease outcome.
We would like to thank Scott Brame and Bruce Davis for illuminating discussions and the latter for carefully reviewing the manuscript. This work was supported by the National Institutes of Health under Grant 1R01-CA136931-01A2.
- 5. Picchio M, Beck R, Haubner R, Seidl S, Machulla HJ, Johnson TD, Wester HJ, Reischl G, Schwaiger M, Piert M: Intratumoral spatial distribution of hypoxia and angiogenesis assessed by 18F-FAZA and 125I-Gluco-RGD autoradiography. J Nucl Med 2008, 49(4):597-605. 10.2967/jnumed.107.046870
- 11. Wahl RL: Standardized Uptake Values. In Principles and Practice of PET and PET/CT. 2nd edition. Edited by: Wahl RL. Wolters Kluwer Health; 2009.
- 12. Weinberg RA: The biology of cancer. New York: Garland Science; 2007.
- 16. Izenman AJ: Recent developments in nonparametric density-estimation. Journal of the American Statistical Association 1991, 86(413):205-224. 10.2307/2289732
- 17. Lay DC: Linear algebra and its applications. 3rd edition. Boston: Pearson/Addison-Wesley; 2006.
- 23. Mayr NA, Yuh WTC, Taoka T, Wang JZ, Wu DH, Montebello JF, Meeks SL, Paulino AC, Magnotta VA, Adli M, Sorosky JI, Knopp MV, Buatti JM: Serial therapy-induced changes in tumor shape in cervical cancer and their impact on assessing tumor volume and treatment response. AJR Am J Roentgenol 2006, 187:65-72. 10.2214/AJR.05.0039
- 24. Jähne B: Digital image processing. 6th rev. and ext. edition. Berlin: Springer; 2005.
- 25. Arfken GB, Weber HJ: Mathematical methods for physicists. 6th edition. Boston: Elsevier; 2005.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.