1 Introduction

Image-based quantification of tumor change after chemo-radiotherapy (CRT) is important for evaluating treatment response and for patient follow-up. Standard methods for assessing tumor metabolic response in Positron Emission Tomography (PET) images are qualitative and rely on a discrete categorization of the reduction in Standardized Uptake Value (SUV) or Metabolic Tumor Volume (MTV) [12]. Overall volumetric difference is a global measurement that cannot characterize local, non-uniform changes after therapy [12]. For these reasons, diameter-, SUV- and volume-based measurements are not consistently correlated with important outcomes [12]. Tensor Based Morphometry [8] exploits the spatial gradient of the Deformation Vector Field (DVF), specifically the determinant of its Jacobian matrix, termed the Jacobian map (J), to characterize the voxel-by-voxel volumetric ratio of an object before and after a transformation. J > 1 indicates local volume expansion, J < 1 indicates shrinkage and J = 1 denotes no change. Many studies have used the Jacobian map to evaluate volumetric changes. Fuentes et al. [3] used the Jacobian integral (mean J \(\times \) tumor volume) to measure the local volume change of irradiated whole-brain tissues in Magnetic Resonance Images and showed that the estimated change agreed well with ground truth segmentation. In our previous work [8] we showed that Jacobian features in Computed Tomography (CT) images could predict pathologic tumor response with high accuracy (94%) in esophageal cancer patients.

However, structural change in CT is affected by daily anatomical variations, and therapy response is mostly seen in PET as a change in metabolic activity [12]. Conventionally, metabolic tumor change is measured by deforming the follow-up tumor volume in PET and aligning it to the baseline tumor volume using the transformation obtained from CT-CT Deformable Image Registration (DIR) [11]. However, PET and CT capture different properties of a tumor (metabolic vs. anatomic), so applying the transformation from CT-CT registration is suboptimal. On the other hand, directly registering PET images is problematic because they contain few image features from which to generate an accurate transformation [11].

Some attempts at deformable PET-CT registration using joint maximization of intensities [4] increased the uncertainties because of heterogeneous tumor uptake in PET and the different intensity distributions of the two images. Deep learning methods for estimating the DVF have also been proposed recently. However, the training deformations were generated using existing free-form registrations, so the accuracy could be only as good as that of already available algorithms [6]. Moreover, these algorithms were not tested on multi-modality registrations.

In this work, we used a linear combination of PET and CT images to generate a single grayscale blended PET-CT image using a pixel-level fusion method. Our main goal is to combine anatomic and metabolic information to improve the accuracy of multi-modality PET-CT registration for quantification of tumor change and for prediction of pathologic tumor response. The contributions are as follows:

  1. Local MTV change calculated using Jacobian integral of blended PET-CT image registration achieved higher correlation with the ground truth segmentation (R = 0.88) compared to mono-modality PET-PET (R = 0.80) and CT-CT (R = 0.67) registrations.

  2. Jacobian radiomic features extracted from blended PET-CT registration could better differentiate pathologic tumor response (AUC = 0.85) than mono-modality PET and CT Jacobian and clinical features (AUC = 0.65–0.81) with only one Jacobian co-occurrence texture feature in esophageal cancer patients.

Fig. 1. (a) Main workflow of our method. (b, c) Conceptual illustration of the Jacobian map: (b) the larger sphere simulates the MTV in the baseline image and the smaller follow-up sphere illustrates shrinkage of a tumor; the converging DVF represents a volume loss and generates (c) a Jacobian map that illustrates local shrinkage (blue). (Color figure online)

2 Materials and Methods

Figure 1 shows our workflow and illustrates the concept of Jacobian map using a synthetic sphere that simulates a heterogeneous tumor shrinkage.
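The Jacobian map concept in Fig. 1 can be illustrated numerically. The following is a minimal NumPy sketch, not the registration code used in this work. It assumes the transformation is written as \(\phi (x) = x + u(x)\), with the displacement field u stored as an array of shape (3, Z, Y, X), and it uses a uniform contraction toward the volume center as a hypothetical stand-in for the converging DVF. The function name `jacobian_map` is illustrative.

```python
import numpy as np


def jacobian_map(dvf, spacing=(1.0, 1.0, 1.0)):
    """Voxel-wise determinant of the Jacobian of phi(x) = x + u(x).

    dvf     : array of shape (3, Z, Y, X); dvf[i] is the displacement (mm)
              along array axis i. This layout is an assumption of the sketch.
    spacing : voxel size (mm) along each array axis.
    """
    dvf = np.asarray(dvf, dtype=float)
    jac = np.empty(dvf.shape[1:] + (3, 3))
    for i in range(3):
        # Finite-difference derivatives of component i along every axis.
        grads = np.gradient(dvf[i], *spacing)
        for j in range(3):
            # d(phi_i)/d(x_j) = delta_ij + d(u_i)/d(x_j)
            jac[..., i, j] = grads[j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)


# Hypothetical converging field: every point moves 20% of the way toward
# the center, so each axis is scaled by 0.8 and J = 0.8**3 everywhere.
n = 21
axes = np.arange(n, dtype=float)
zz, yy, xx = np.meshgrid(axes, axes, axes, indexing="ij")
center = (n - 1) / 2.0
dvf = -0.2 * np.stack([zz - center, yy - center, xx - center])

J = jacobian_map(dvf)
print(J.min(), J.max())  # both approximately 0.512, i.e. J < 1 (shrinkage)
```

A zero displacement field gives J = 1 everywhere, matching the "no change" case.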

2.1 Dataset

This study included 61 patients with esophageal cancer who were treated with induction chemotherapy followed by CRT and surgery. All patients underwent baseline, post-induction and post-CRT PET/CT scans. Resolution for PET images was \(4.0\times 4.0\times 4.25\) mm\(^3\) and for CT images was \(0.98\times 0.98\times 4.0\) mm\(^3\). MTV on each PET-CT was segmented using a semi-automatic adaptive region-growing algorithm developed by our group [9]. Segmentations were visually reviewed and, if necessary, manually modified by a nuclear medicine physician. Average percentage of MTV change was 50 ± 30.6% in the cohort. Pathologic tumor response was assessed in surgical specimens and categorized as pathologic complete response (absence of viable tumor cells, 6 patients) or non-response (partial response, progressive or stable disease, 55 patients). Registrations were performed between baseline and post-induction chemotherapy (follow-up) images.

2.2 Generating Blended PET-CT Images

Maximum intensity of CT images was clipped to 750 HU to eliminate the effect of high attenuation metals. PET images were resampled to CT resolution. PET and CT images were normalized to the range of [0, 1]. The normalization bounds used for CT were (−1000, 750) HU and for PET, the range of tumor SUVs in our patient cohort (0, 35) was used. To generate a grayscale blended PET-CT image, a weighted sum of normalized PET (nPET) and CT images (nCT) was formulated (Eq. 1) where \(\alpha \in [0,1]\):

$$\begin{aligned} Blended\; PETCT = \alpha (nCT) + (1-\alpha ) nPET \end{aligned}$$
(1)

\( \alpha \) = 0.2 was found optimal in that it produced similar blending of PET and CT information as when the nuclear medicine physician visually fused PET (window/level = 6/3 SUV) and CT (window/level = 350/40 HU) images. By using blended PET-CT images for registration, high metabolic uptake in the tumor was emphasized in the foreground while anatomic details in surrounding normal tissues were kept in the background (Fig. 3).
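The blending step of Eq. 1 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used in this study: it assumes the PET volume has already been resampled to the CT grid, that CT values below −1000 HU are clipped to the lower normalization bound, and that SUVs above 35 are clipped to the upper bound (the text specifies only the upper CT clip at 750 HU). The function names are hypothetical.

```python
import numpy as np

CT_BOUNDS_HU = (-1000.0, 750.0)   # CT normalization bounds from the text
PET_BOUNDS_SUV = (0.0, 35.0)      # tumor SUV range in the cohort


def normalize(img, lo, hi):
    """Clip to [lo, hi] and rescale linearly to [0, 1]."""
    img = np.clip(np.asarray(img, dtype=float), lo, hi)
    return (img - lo) / (hi - lo)


def blend_pet_ct(ct_hu, pet_suv, alpha=0.2):
    """Eq. 1: alpha * nCT + (1 - alpha) * nPET on a common voxel grid."""
    ct_hu = np.asarray(ct_hu)
    pet_suv = np.asarray(pet_suv)
    if ct_hu.shape != pet_suv.shape:
        raise ValueError("PET must be resampled to the CT grid first")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n_ct = normalize(ct_hu, *CT_BOUNDS_HU)
    n_pet = normalize(pet_suv, *PET_BOUNDS_SUV)
    return alpha * n_ct + (1.0 - alpha) * n_pet


# Toy voxels: air with no uptake, metal with maximal uptake,
# and dense bone (clipped to 750 HU) with half-range uptake.
ct = np.array([-1000.0, 3000.0, 900.0])
pet = np.array([0.0, 35.0, 17.5])
blended = blend_pet_ct(ct, pet, alpha=0.2)
print(blended)  # approximately [0.0, 1.0, 0.6]
```

With \(\alpha \) = 0.2 the PET term carries 80% of the weight, which is why tumor uptake dominates the foreground of the blended image.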

2.3 Registration Methods

B-Spline Regularized Diffeomorphic Registration: To correct for respiration-induced tumor motion, we first aligned follow-up images to baseline images by rigidly registering the tumors, using their centers of geometry as the initial transformation. We then deformably registered the two images using a rigidity penalty term [7] to enforce local rigidity on the tumor and preserve its structure while compromising on differences in the surrounding anatomy. The rigidity penalty was applied only to blended PET-CT and PET-PET registrations; initial alignment of CT images was performed using a rigid registration. We then deployed a B-spline regularized symmetric diffeomorphic registration (BSD) [10] to characterize metabolic volume loss. A diffeomorphic registration estimates the optimized transformation, \(\phi \), parameterized over \(t \in [0,1]\), that maps corresponding points between two images. \(\phi \) is obtained by a symmetrized Large Deformation Diffeomorphic Metric Mapping (LDDMM) algorithm that finds a geodesic solution in the space of diffeomorphisms. A symmetrized LDDMM captures large intra-modality differences and guarantees inverse consistency and a one-to-one mapping in the DVF while minimizing the bias between forward and inverse transformations. By explicitly integrating the B-spline regularization term, a viscous-fluid model is formulated that fits the calculated DVF to a B-spline object after each iteration. This gives free-form elasticity to converging vectors, creating a sink point that is mapped to many points in its vicinity and that represents morphological shrinkage in regions with non-mass-conserving deformations. The optimization cost function is as follows [10]:

$$\begin{aligned} c(\phi (x,t),I_b\circ I_f )&= E_{similarity}^{MI} (\phi (x,1),I_b,I_f) + E_{geodesic}^2 (\phi (x,0),\phi (x,1)) \\&\quad + \rho _{Bspline} (v(\phi (x,t)),B_k) \end{aligned}$$
(2)

where \(E_{similarity}^{MI}\) is a mutual information similarity energy, \(E_{geodesic}\) is a geodesic energy function and \(\rho _{Bspline}\) denotes a B-spline regularizer. The transformation \(\phi (x,t)\) between baseline (\(I_b\)) and follow-up (\(I_f\)) images is characterized by the maps of the shortest path between time points \(t=0\) and \(t=1\).

\(v(\phi (x,t)) = \frac{\partial \phi (x,t)}{\partial t}\) is the temporal gradient (velocity) field that defines the displacement change at any given time point. \(B_k\) is a B-spline function of order k applied to this field. Three levels of multi-resolution registration were implemented, with B-spline mesh sizes of 32 mm, 32 mm and 16 mm at the coarsest level for blended PET-CT, PET-PET and CT-CT registrations, respectively. The mesh size was reduced by a factor of 2 at each subsequent level. The optimization step size was set to 0.15 and the number of iterations to (100, 70, 40) for all modalities. We used the Directly Manipulated Free-Form Deformation optimization scheme [10], which was robust to different parameters, and all registrations were performed in a cropped region extending 5 cm around the MTV.

Registration Evaluation Methods: We considered MTV change measured by the semi-automatic segmentation with physician modification as the ground truth to compare against Jacobian integral for registration evaluation. Correlation and percentage of difference between MTV changes calculated by Jacobian integral and by semi-automatic segmentation (ground-truth) were first assessed. Dice Similarity Coefficient (DSC) was also calculated between baseline MTV and deformed follow-up MTV. We compared BSD results with a Free-Form Deformation Registration algorithm (FFD) regularized with bending energy [5]. The blended PET-CT, PET-PET and CT-CT registrations were separately performed using these two algorithms.
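The two evaluation quantities can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the evaluation code used in this study: the Jacobian map is assumed to be defined on the baseline grid with J equal to the local follow-up-to-baseline volume ratio, the Jacobian integral follows the "mean J \(\times \) tumor volume" form quoted in the Introduction, and expressing the change as a percentage of the baseline MTV is an assumption of the sketch. Function names are illustrative.

```python
import numpy as np


def jacobian_integral(jac, baseline_mask, voxel_volume_mm3):
    """Mean J inside the baseline MTV times the baseline MTV (mm^3)."""
    mask = np.asarray(baseline_mask, dtype=bool)
    if not mask.any():
        raise ValueError("baseline mask is empty")
    baseline_volume = mask.sum() * voxel_volume_mm3
    return float(np.asarray(jac)[mask].mean() * baseline_volume)


def percent_volume_change(jac, baseline_mask, voxel_volume_mm3):
    """Estimated MTV loss as a percentage of baseline (positive = shrinkage)."""
    mask = np.asarray(baseline_mask, dtype=bool)
    baseline_volume = mask.sum() * voxel_volume_mm3
    estimated = jacobian_integral(jac, mask, voxel_volume_mm3)
    return 100.0 * (baseline_volume - estimated) / baseline_volume


def dice(mask_a, mask_b):
    """Dice Similarity Coefficient: 2|A and B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # two empty masks are treated as identical
    return float(2.0 * np.logical_and(a, b).sum() / total)


# Toy example: an 8-voxel "tumor" whose local volume halves everywhere.
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 1:3] = True
jac = np.full(mask.shape, 0.5)
print(jacobian_integral(jac, mask, voxel_volume_mm3=2.0))      # 8.0
print(percent_volume_change(jac, mask, voxel_volume_mm3=2.0))  # 50.0

# Two 4-voxel masks sharing 2 voxels give DSC = 2*2 / (4+4) = 0.5.
a = np.zeros(6, dtype=bool); a[0:4] = True
b = np.zeros(6, dtype=bool); b[2:6] = True
print(dice(a, b))  # 0.5
```

In this form the percentage change reduces to 100 × (1 − mean J), so it can be compared directly with the segmentation-based MTV change.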

Optimal Registration Parameter Estimation: (i) Regularization mesh size (\(\sigma \)) and (ii) optimization step size (\(\gamma \)) were the most sensitive parameters. We experimentally studied the influence of different \(\sigma \) = 16, 32, 64, 128 mm and \(\gamma \) = 0.1, 0.15, 0.2, 0.25 on registration and Jacobian map. The registration results were used as a quantitative benchmark to find the optimal trade-off between the parameters. A parameter set that resulted in the best DSC and the highest correlation between Jacobian integral and segmentation was chosen as the optimal parameters.

2.4 Jacobian Features for Prediction of Tumor Response

We extracted 56 radiomic features quantifying the intensity and texture [13] of a tumor in the Jacobian map. The Jacobian features quantified the spatial patterns of tumor volumetric change. The importance of features in predicting pathologic tumor response was evaluated by both univariate and multivariate analysis. In univariate analysis, the p-value (Wilcoxon rank-sum test) and the Area Under the Receiver Operating Characteristic Curve (AUC) were calculated for each feature. In multivariate analysis, distinctive features were first identified using hierarchical clustering [2]. A Random Forest (RF) model with 200 trees was then constructed using features chosen by Least Absolute Shrinkage and Selection Operator (LASSO) feature selection. All distinctive features were fed to the RF classifier within a 10-fold cross-validation (CV). Within each fold, LASSO was applied to select the ten most important features. We repeated the 10-fold CV ten times to obtain the model accuracy (10\(\,\times \,\)10-fold CV).
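For a single feature, the AUC equals the normalized Mann-Whitney U statistic that underlies the Wilcoxon rank-sum test: the fraction of responder/non-responder pairs in which the responder has the higher feature value, with ties counted as one half. The sketch below illustrates this relationship with a hypothetical function name and toy values; it is not the analysis code used in this study, and the p-value, which could be obtained from a standard statistics package, is not computed here.

```python
import numpy as np


def univariate_auc(feature_values, is_responder):
    """AUC of one feature via the pairwise Mann-Whitney formulation."""
    values = np.asarray(feature_values, dtype=float)
    labels = np.asarray(is_responder, dtype=bool)
    pos, neg = values[labels], values[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    # Compare every responder with every non-responder.
    diff = pos[:, None] - neg[None, :]
    u_stat = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(u_stat / (pos.size * neg.size))


# Toy values only: responders {2, 4}, non-responders {1, 3}.
# Winning pairs: (2,1), (4,1), (4,3) out of 4, so AUC = 0.75.
print(univariate_auc([1, 2, 3, 4], [False, True, False, True]))  # 0.75
```

An AUC below 0.5 indicates that lower feature values are associated with response, in which case 1 − AUC is usually reported.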

3 Results and Discussion

3.1 Quantitative Registration Evaluation

A combination of \(\sigma \) = 32 mm (blended PET-CT), 32 mm (PET-PET), 16 mm (CT-CT) and \(\gamma \) = 0.15 achieved the best DSC and the highest correlation and was therefore selected as the optimal registration parameter set. The larger mesh size in blended PET-CT and PET-PET registrations compared to CT-CT registration produced a more regularized and smoother DVF to compensate for the local irregular deformations caused by non-uniform metabolic uptake and the lack of corresponding points in PET. Figure 2 shows scatter plots with least-squares regression lines (solid red) between MTV change calculated by Jacobian integral and the ground truth segmentation for (a) blended PET-CT, (b) PET-PET and (c) CT-CT BSD registrations, with goodness-of-fit (\(r^2\)) values. Blended PET-CT registration showed the highest \(r^2\) and captured the greatest range of tumor deformations compared to PET-PET and CT-CT registrations. Table 1 shows correlation coefficients and the average percentage difference between Jacobian integral and segmentation using BSD and FFD for each modality.

Fig. 2. Scatter plot showing correlation between MTV change calculated by Jacobian integral and ground truth segmentation for (a) blended PET-CT, (b) PET-PET and (c) CT-CT BSD registrations. Dashed blue line is identity line. (Color figure online)

Mean ± standard deviation DSC values are also presented in Table 1. Blended PET-CT registration showed higher DSC with less variation across the cohort. With blended PET-CT registration, the DVF in and near the tumor region was driven by the metabolic changes, while the DVF outside the tumor region was driven by the anatomical structures surrounding the tumor. The blended PET-CT registration thus leveraged prominent image features from both PET and CT simultaneously, achieving higher DSC and a more accurate estimation of MTV change.

Table 1. Registration results using the optimal parameters comparing correlation and average percentage difference between MTV change estimated by Jacobian integral and segmentation.

3.2 Residual Tumor versus Non-residual Tumor Cases

Figure 3 shows blended PET-CT images of three heterogeneous tumor cases. Tumor shrinkage calculated by blended PET-CT, PET-PET and CT-CT registrations is illustrated using the DVF and Jacobian map for each case (top, middle and bottom). Qualitatively, with blended PET-CT image registration, vectors converged from the boundary toward the center of the baseline and follow-up MTVs (green and blue volumes) and generated a sink point in the center where the Jacobian was much smaller than 1 (shown in blue in the Jacobian map), indicating a large shrinkage. With PET-PET registration, the lack of image features prevented the registration from accurately finding corresponding points, and the DVF converged only at the tumor boundary. With CT-CT registration, because of the smaller structural change and uniform soft-tissue intensity, the DVF magnitude was small and the Jacobian map mostly showed no volume change. The percentage of tumor shrinkage calculated by semi-automatic segmentation (ground truth) is listed in Table 2 for each case, together with the percentages calculated by blended PET-CT, PET-PET and CT-CT registrations using both BSD and FFD. Quantitatively, using BSD, both PET-PET and CT-CT registrations showed inferior results compared to blended PET-CT. For smaller shrinkage, FFD had accuracy similar to BSD, but its accuracy worsened for larger changes. Using FFD, however, PET-PET had the worst results while CT-CT achieved much better accuracy. These results agree with the literature: diffeomorphic algorithms perform better on larger deformations, whereas smaller soft-tissue changes in CT can be better captured using the FFD algorithm [1].

Fig. 3. First column shows baseline and follow-up blended PET-CT images for three tumors in coronal (top, middle) and axial (bottom) views. Red contour is MTV. In the second through last columns, DVFs (left) illustrate the change from baseline MTV (green) to follow-up MTV (blue) and Jacobian maps (right) are overlaid on baseline MTV. Color bar indicates shrinkage (blue) to expansion (red) in Jacobian map. (Color figure online)

Table 2. Tumor shrinkage quantified by blended PET-CT, PET-PET and CT-CT registrations compared with ground truth segmentation for each case in Fig. 3.

Jacobian maps in Fig. 3 illustrate local non-uniform tumor changes. Quantifying change in a non-residual tumor (Fig. 3 bottom) using DIR is challenging due to a large non-correspondence deformation between the two images. Here, we showed that using blended PET-CT image registration we could generate a DVF to quite accurately measure tumor change owing to the dominant metabolic tumor structures in the baseline image and anatomical structures in the follow-up image that guided the registration.

3.3 Pathologic Tumor Response Prediction

Table 3 lists the p-value and AUC for all predictive Jacobian features, compared with clinical features and with a recent esophageal cancer radiomics study, using univariate analysis. Standard Deviation (SD) of Correlation, a texture feature in the Jacobian map of tumors from blended PET-CT BSD registration, achieved a higher AUC (0.85) than the PET radiomic feature analysis performed by Yip et al. [13]. Clinical features in our study were not predictive, and none of the FFD-based Jacobian features were significant in differentiating pathologic response.

Table 3. Important Jacobian and clinical features in univariate analysis.

In multivariate analysis, the RF-LASSO model achieved the highest accuracy with only one texture feature, Mean of Cluster Shade, extracted from the blended PET-CT BSD Jacobian map (Sensitivity = 80.6%, Specificity = 82.6%, Accuracy = 82.3%, AUC = 0.81). Performance worsened when more features were added (Fig. 4(a)). This feature quantified the heterogeneity of the tumor change, and responders showed higher values, meaning more heterogeneous local MTV changes. Figure 4(b) shows the ROC curve of the best model and Fig. 4(c) shows that this feature can differentiate responders from non-responders. Mean of Cluster Shade was selected as the first feature by LASSO, whereas SD of Correlation, which had the highest AUC in univariate analysis, was selected as the third feature in the multivariate model. This may be because LASSO selects the least correlated features and Mean of Cluster Shade had the smallest mean absolute correlation (r = 0.22) among the important distinctive features, compared to SD of Correlation (r = 0.46).
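To make the cluster shade feature concrete, the sketch below computes it from a gray-level co-occurrence matrix (GLCM) using a commonly used definition, \(\sum _{i,j} (i + j - \mu _x - \mu _y)^3 p(i,j)\). This is an illustration under assumptions, not the feature extraction pipeline of this study: it works on a small 2D image that is already quantized to integer gray levels, uses a single horizontal offset with a symmetric GLCM, and does not reproduce how the "Mean" of the feature is aggregated. Function names are hypothetical.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix of a quantized 2D image."""
    image = np.asarray(image, dtype=int)
    d_row, d_col = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels))
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    counts = counts + counts.T  # count each pair in both directions
    total = counts.sum()
    if total == 0:
        raise ValueError("no voxel pairs for this offset")
    return counts / total


def cluster_shade(p):
    """Third moment of (i + j) about its mean: skewness of the GLCM."""
    i, j = np.indices(p.shape)
    mu_x = (i * p).sum()
    mu_y = (j * p).sum()
    return float((((i + j - mu_x - mu_y) ** 3) * p).sum())


# Toy image: mostly level 0 with a single level-1 pixel, so the
# co-occurrence distribution is skewed and cluster shade is positive.
toy = np.array([[0, 0, 0],
                [0, 0, 1]])
p = glcm(toy, levels=2)
print(cluster_shade(p))  # 0.09375
```

A GLCM that is symmetric about its mean, for example one built from a strictly alternating two-level pattern, has zero cluster shade.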

Fig. 4. (a) Model performance with increasing number of features. (b) ROC curve of the best model. (c) Box plot of the Mean of Cluster Shade Jacobian feature.

4 Conclusion and Future Work

We combined PET and CT images into a grayscale blended PET-CT image for quantification of local metabolic tumor change using Jacobian map. We extracted intensity and texture features from the Jacobian map to predict pathologic tumor response in esophageal cancer patients. Jacobian texture features showed the highest accuracy for prediction of pathologic tumor response (accuracy = 82.3%). In the future, we will explore automated optimal weight tuning for PET-CT blending.