
1 Introduction

Medical images acquired at different time points, or originating from different scanners, need to be brought into spatial alignment to assess complementary structural and/or functional information. This process is called image registration and is one of the fundamental medical image analysis procedures [23]. Deformable image registration is particularly important for lung applications where, for example, the different breath-hold levels need to be compensated in the acquired images. Single-modality lung registration, especially Computed Tomography (CT)-CT registration, has been widely studied [4, 17] and dedicated image registration methods have been proposed [7, 10, 11, 25, 28].

While CT-CT lung image registration is a non-trivial task, mainly because of sliding motion between the surfaces of the lungs, the ribcage, and diaphragm [18, 22], multi-modal lung image registration is even more challenging due to more complex deformations and directly incomparable intensities between the acquired scans. Registration between proton density Magnetic Resonance Imaging (pMRI) and CT is one such example, where the difficulty stems from the low proton density in the lungs and susceptibility to acquisition artifacts caused by the interfaces between air and lung tissue. Such registration, however, plays an important role in the analysis of hyperpolarized Xenon MRI (XeMRI) [2]. XeMRI, due to its non-ionizing nature, has received substantial attention in the field for imaging ventilation, perfusion, and gas transfer in the lungs [16]. As XeMRI does not provide structural information, its correspondence to the patient anatomy relies on pMRI, which is acquired during the same imaging session but not within the same breath-hold. Even though patients are provided with bags containing 1 L of gas for both image acquisitions, the images might be acquired at different levels of lung inflation because of the different properties of air and xenon, as well as individual breathing patterns. It is therefore not straightforward to map XeMRI directly to diagnostic lung CT, for instance in the case of patients undergoing radiotherapy treatment. An intermediate registration between pMRI and CT is needed to find this mapping, as shown in Fig. 1. This registration is particularly challenging for a number of reasons, including the lower spatial resolution of pMRI compared with CT, the limited information from lung tissue in pMRI due to its low proton density, and the presence of susceptibility artifacts. For these reasons, the registration can easily result in under- or over-estimation of deformations inside the lungs.

Fig. 1.

To bring XeMRI into alignment with CT, we compose two transformations: transformation T\(_1\) that compensates for a possible initial misalignment between XeMRI and pMRI, and transformation T\(_2\) estimated based on registration between pMRI and CT. The dedicated framework addressing this problem is the main contribution of this work.

An alternative approach to this problem might be the application of lung motion models [15]. For instance, a statistical motion model based on deformations estimated from 4DCT was proposed in [5]. The individual motion models estimated for each subject in the dataset were co-registered to an average shape and intensity model generated from the 4DCT reference frames. This resulted in an average inter-subject model. In [13], after estimating the deformations from 4DCT, a surface point distribution model of the shape of the lungs was constructed. After applying Principal Component Analysis (PCA) to reduce the dimensionality, a statistical model relating the estimated deformations to point-based shape variations was calculated. A similar approach has been presented in [29], with the diaphragm position used as a surrogate of the motion to control the model. Finite Element Analysis (FEA) could also be used to create a lung motion model, as in [9], where a patient-specific bio-mechanical model has been proposed for lung CT registration. However, to achieve satisfactory accuracy, the FEA model-based method requires an additional registration. All of the aforementioned methods have been applied to the CT-to-CT registration problem. In the case of pMRI-to-CT registration, the task may be even more challenging due to the low out-of-plane spatial resolution of pMRI and the lack of direct intensity correspondences.

In this work, we address the insufficient amount of information inside the lungs in pMRI by proposing a personalized 4DCT statistical motion model for supervoxel-based graphical image registration [11, 25]. The main contribution is a dedicated framework that addresses the challenges of XeMRI-to-CT deformable registration in the form of a supervoxel-based method enhanced by a motion model. The evaluation has been performed on a clinical dataset and compared with state-of-the-art image registration methods, showing higher correlation of XeMRI with ventilation maps estimated from 4DCT.

2 Methods

The proposed method consists of three main steps: (1) creation of a personalized lung motion model from 4DCT, (2) lung image clustering and (3) graph-based pMRI-to-CT registration. We introduce these steps in detail in this section. An overview of the proposed method is presented in Fig. 2.

Fig. 2.

Diagram presenting the workflow of the proposed method. We start by registering all the 4DCT volumes for each of the patients to the chosen reference frame. We apply PCA decomposition to the estimated deformation fields to create a motion model. Subsequently, we extract supervoxels from the lungs in the reference CT volume. We create a graph, where every supervoxel is represented by a node and all adjacent supervoxels are connected by an edge. For every supervoxel we find the best set of motion model parameters to bring pMRI into alignment with CT using graph cuts optimization. We apply the estimated deformation field to XeMRI as the ultimate goal of the registration framework.

2.1 Personalized Lung Motion Model from 4DCT

In our work, to create a personalized motion model we use displacements resulting from 4DCT registration to a reference volume. We apply an image registration method dedicated to lung applications [25], which has the potential to more accurately estimate abnormal lung motion. The method has shown good performance in terms of accuracy, plausibility of the resulting deformations for lung CT registration, and the ability to address the sliding motion problem. Subsequently, we apply PCA to the estimated deformations to obtain the major motion patterns for each patient.

In the proposed method, for each patient, all breathing phases from 4DCT are co-registered to a reference volume, which is chosen as the peak inhale breathing phase volume. Our 4DCT data consists of 10 volumes; therefore as a result of the alignment we acquired 9 displacement fields. After the registration, we create for the reference volume vectors comprising all the estimated deformation fields:

$$\begin{aligned} R^p(\mathbf x ) = [V^p_1(\mathbf x ), V^p_2(\mathbf x ) , ... , V^p_n(\mathbf x ) ], \end{aligned}$$
(1)

where p is the direction of the deformations (anterior-posterior, left-to-right, up-to-down), n is the number of volumes co-registered to the reference volume, and \(\mathbf x \) is a voxel location.

After applying PCA, we can reformulate Eq. 1 in terms of eigenvalues and eigenvectors:

$$\begin{aligned} R^p(\mathbf x ) \simeq \mu _d^p + \sum ^{n}_{i=1} \lambda _i^p \mathbf {\nu }_i^p(\mathbf x ) , \end{aligned}$$
(2)

where \(\mu _d^p\) is the mean displacement, \(\mathbf {\nu }_i^p(\mathbf x )\) is the i-th eigenvector and \(\lambda _i^p\) is the corresponding eigenvalue for direction p (anterior-posterior, left-to-right, up-to-down) at the voxel’s spatial location \(\mathbf x \). We restricted the motion model to the first eigenvector, as it covers the main motion pattern observed during the registration (on average for our dataset, 83% in the anterior-posterior, 82% in the left-to-right and 95% in the up-to-down direction). Restricting the model to the first eigenvector makes the optimization more efficient, while still taking advantage of the personalized motion model. Regional variations from the motion pattern are compensated for by the supervoxel-based optimization of the motion model parameters in the registration step.
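The construction of the first-mode model in Eq. 2 can be illustrated with a short sketch. It is not the implementation used in this work: the function name `first_mode_motion_model`, the array layout (one displacement component for n registered phases, stacked along the first axis) and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def first_mode_motion_model(fields):
    """PCA over n displacement fields of one direction p (hypothetical helper).

    fields: array of shape (n, X, Y, Z), one displacement component per
            registered breathing phase.
    Returns the mean field, the first eigenvector (reshaped to the volume)
    and the fraction of variance explained by that eigenvector.
    """
    n = fields.shape[0]
    flat = fields.reshape(n, -1)              # one row per breathing phase
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are the principal directions (eigenvectors of the covariance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (n - 1)
    explained = variances / variances.sum()
    volume_shape = fields.shape[1:]
    return mean.reshape(volume_shape), vt[0].reshape(volume_shape), explained[0]

# Synthetic stand-in for 9 displacement fields: one dominant pattern plus noise.
rng = np.random.default_rng(0)
pattern = rng.normal(size=(4, 5, 6))
amplitudes = np.linspace(0.0, 1.0, 9)
fields = amplitudes[:, None, None, None] * pattern
fields = fields + 0.01 * rng.normal(size=fields.shape)

mean_field, mode_field, ratio = first_mode_motion_model(fields)
print(f"variance explained by the first eigenvector: {ratio:.3f}")
```

In this toy setting the first eigenvector recovers the dominant pattern up to sign and scale, which is the property exploited when only the first mode is retained.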

2.2 Lung Clustering

Image clustering provides a compact image representation, which has the potential to represent anatomically consistent regions in the form of larger structures. The peak inhale breathing phase volume, which has been chosen as a reference frame, is clustered using the well-established Simple Linear Iterative Clustering method [1], which groups spatially and visually close voxels into supervoxels. In this method, a fixed number of seeds for the expected number of supervoxels is uniformly located in the image. Their initial position is corrected by moving the seeds to a position of the lowest gradient in a \(3\times 3\times 3\) neighborhood. This step is performed to avoid placing them on an edge or a noisy voxel. Following that, every voxel in the image is assigned to the closest supervoxel, based on the distance measure: \( D = \sqrt{(d_e)^2 + \left( d_I / S \right) ^2m^2}\), where \(d_e\) is the Euclidean distance of a particular voxel to the supervoxel center, \(d_I\) is a voxel’s intensity distance from the supervoxel average intensity, and m is a compactness parameter. The resultant clustering of a CT image is shown in Fig. 3.

Fig. 3.

The reference CT image in the coronal view and the superpixels estimated for the lungs, superimposed on the image, are shown in the upper row. Below, the estimated motion model for the reference CT volume in the left-to-right, anterior-posterior and up-to-down directions is shown in the coronal view, with superpixels propagated from the CT image. For illustrative purposes, we show superpixels extracted from a 2D image, whereas in our method we use supervoxels extracted from 3D volumes.

2.3 Graph-Based Lung Image Registration

Image registration, as a problem of finding the optimal transformation between two images, can be stated as a Markov Random Field-based optimization and posed on a graph. Graphical methods for deformable image registration [6, 10, 11, 19, 25] have achieved state-of-the-art accuracy and good performance in addressing sliding motion. Therefore, following image clustering, we create a graph where every supervoxel is represented by a node and all nodes corresponding to adjacent supervoxels are connected by an edge. The edge values are uniformly set to 1.
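The graph construction described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work; the helper `supervoxel_edges`, the use of face adjacency (6-connectivity) to decide which supervoxels are neighbors, and the toy label volume are assumptions.

```python
import numpy as np

def supervoxel_edges(labels):
    """Undirected edges (a, b), a < b, between face-adjacent supervoxels.

    labels: integer array in which each voxel stores its supervoxel index.
    """
    edges = set()
    for axis in range(labels.ndim):
        moved = np.moveaxis(labels, axis, 0)
        a = moved[:-1].ravel()                # each voxel ...
        b = moved[1:].ravel()                 # ... and its neighbor along this axis
        differ = a != b
        lo = np.minimum(a[differ], b[differ])
        hi = np.maximum(a[differ], b[differ])
        edges.update(zip(lo.tolist(), hi.tolist()))
    return edges

# Toy label volume with four supervoxels arranged as 2 x 2 blocks.
labels = np.zeros((4, 4, 4), dtype=int)
labels[2:, :, :] = 1
labels[:, 2:, :] += 2

edges = supervoxel_edges(labels)
weights = {edge: 1.0 for edge in edges}       # uniform edge values, as in the text
print(sorted(edges))
```

Supervoxels 0 and 3 (and likewise 1 and 2) touch only along a line in this toy volume, so no edge is created between them under the face-adjacency assumption.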

We apply a similar approach to [25], with graph cuts [3] as the optimization scheme. In the proposed method, we create a predefined set of labels \(l \in L\), where every label l is a set of parameters of the motion model in the form of a vector \([l_x l_y l_z]\). This is one of the main differences compared with the majority of other methods in the field, where labels usually directly represent displacements. The label is applied to the corresponding patch of the motion model; therefore, even if the algorithm assigns the same label to neighboring supervoxels, they may still have different displacements. The displacement inside the patch is not uniform and should mimic the motion of its tissue. Such an approach restricts the possible displacements of the patches to those which have been estimated for the particular regions of the lungs, and therefore results in more anatomically plausible estimated displacements. At the same time, the method still allows for local adjustments to the model by the estimated parameters. The estimation of the motion model parameters in the form of \(l_x\), \(l_y\) and \(l_z\) is one advantage of our approach, as it compensates for the residual differences when ideal rigid alignment of pMRI and CT is difficult to achieve. This alignment is challenging mainly because of the multi-modal nature of the images, differences in position inside the scanner, as well as possible variations in the patient anatomy, for instance due to tumor appearance. The model was created from 4DCT, based on co-registration of images acquired with the patient remaining in the same position in the scanner. Therefore, our approach gives more degrees of freedom than classic model-based approaches to compensate for the misalignments, while at the same time taking advantage of the main motion patterns represented by the motion model.
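To make the role of the labels concrete, the sketch below enumerates a label set and applies one label to two different patches of the motion model. It is an illustration under assumptions: the parameter range (−0.6 to 0.6 in steps of 0.1) is the one reported in the experiments, while the helper `patch_displacement` and the toy model vectors are hypothetical.

```python
import itertools
import numpy as np

# Candidate values for each of l_x, l_y, l_z: -0.6, -0.5, ..., 0.6.
steps = np.round(np.arange(-6, 7) * 0.1, 1)
label_set = list(itertools.product(steps.tolist(), repeat=3))
print(len(label_set))                         # 13 values per direction

def patch_displacement(label, model_patch):
    """Scale the motion model vectors of one patch by a label (hypothetical helper).

    label:       (l_x, l_y, l_z) motion model parameters.
    model_patch: array of shape (num_voxels, 3) with the motion model
                 vectors R(x) for the voxels of one supervoxel.
    """
    return np.asarray(label) * model_patch

# Two neighboring patches with different motion model vectors.
model_a = np.array([[1.0, 0.0, 2.0]])
model_b = np.array([[0.5, 0.0, 4.0]])
label = (0.2, 0.2, 0.2)

disp_a = patch_displacement(label, model_a)
disp_b = patch_displacement(label, model_b)
```

Although both patches receive the same label, `disp_a` and `disp_b` differ because the label scales the local motion model rather than encoding a displacement directly.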

As a similarity measure to find the optimal parameters of the motion model for every supervoxel we have applied the local correlation coefficient (LCC) [12], which is a well established approach for measuring image similarity in medical image registration. The general formulation of the energy to be minimized during the optimization process is:

$$\begin{aligned} E(l) = \underbrace{\sum _p \overline{ LCC(I_{fix}( \mathbf {x}_p),I_{mov}(\mathbf {x}_p + l_p*R(\mathbf {x}_p))) } }_{data \ term} + \alpha \underbrace{ \sum _{p,q \in N} \Vert l_p - l_q \Vert ^2 }_{smoothness \ term}, \end{aligned}$$
(3)

where the data term is formulated as a mean error calculated over all voxels \( \mathbf x \) of the fixed image \(I_{fix}\) and moving image \(I_{mov}\) that are clustered in the supervoxel represented by node p, for the applied motion model R with the parameters represented by a label \(l_p\). The piecewise smoothness term represents the quadratic distance between the labels of neighboring nodes. The influence of the piecewise smoothness term on the energy is controlled by a weighting parameter \(\alpha \). Since no XeMRI ventilation signal is expected to be present outside of the lungs, our registration framework is restricted to estimating deformations inside the volume of the lungs. The lungs are segmented from CT and registration is done only inside the masks. Akin to [11], we use a single resolution level with multiple layers of supervoxels, slightly varying their size and initial location. The estimated deformations are averaged across all layers.
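The structure of the energy in Eq. 3 can be illustrated on a toy graph. The sketch only evaluates the energy of a given labeling and does not perform the graph cuts optimization; the function name `energy`, the precomputed `data_cost` table (standing in for the LCC-based data term, whose exact conversion into a cost is not detailed here) and all numerical values are assumptions.

```python
import numpy as np

def energy(assignment, data_cost, label_vectors, edges, alpha):
    """Evaluate data term + alpha * smoothness term for one labeling.

    assignment:    assignment[p] is the index of the label chosen for node p.
    data_cost:     data_cost[p, k] is the (precomputed) data term of node p
                   under label k.
    label_vectors: array of shape (num_labels, 3) holding [l_x, l_y, l_z].
    edges:         iterable of (p, q) pairs of adjacent supervoxels.
    """
    assignment = np.asarray(assignment)
    nodes = np.arange(len(assignment))
    data = data_cost[nodes, assignment].sum()
    smooth = sum(
        np.sum((label_vectors[assignment[p]] - label_vectors[assignment[q]]) ** 2)
        for p, q in edges
    )
    return data + alpha * smooth

# Toy problem: three supervoxels in a chain, three candidate labels.
label_vectors = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
data_cost = np.array([[0.5, 0.2, 0.9],
                      [0.4, 0.3, 0.8],
                      [0.9, 0.6, 0.1]])
edges = [(0, 1), (1, 2)]

e = energy([1, 1, 2], data_cost, label_vectors, edges, alpha=0.2)
print(round(float(e), 3))
```

For this labeling the data term is 0.2 + 0.3 + 0.1 and the only non-zero smoothness contribution comes from the edge (1, 2), whose labels differ by 0.1 in the first component.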

The displacements estimated for the pMRI-to-CT registration are propagated to XeMRI, as shown in Fig. 1, resulting in their alignment. A visual assessment of the framework is presented in Fig. 4, where we also display the estimated displacement fields for all the methods.

3 Experiments and Results

Our experiments were performed on a dataset of three patients undergoing radiotherapy at Churchill Hospital in Oxford. For each patient, imaging data consisting of 4DCT, pMRI and XeMRI were acquired, with a resolution of \(0.98 \times 0.98 \times 2.5\,[{\text {mm}}^3]\). Each 4DCT consisted of 10 3D CT volumes acquired in the axial plane. A mixture of 129Xe gas (80%) and air was polarized on-site to between 4% and 12% using a commercial polarizer based on rubidium vapor spin-exchange optical pumping. The hyperpolarized gas was delivered to patients during the imaging in 1.0-L bags [14]. The pMRI and XeMRI were acquired on a 1.5 T MR scanner as 3D volumes in the coronal plane, with a resolution of \(1.56 \times 20 \times 1.56\,[{\text {mm}}^3]\).

Following [26], pMRI volumes and reference volume from 4DCT were carefully aligned initially using rigid registration with mutual information as a similarity measure. In our application it is important to achieve good alignments at the apex and upper parts of the lungs. The resulting transformation was propagated to the corresponding XeMRI volumes, bringing them into rigid alignment with the reference CT volume.

Fig. 4.

A coronal view of the CT scan of patient 2 is shown in (a). XeMRI after applying rigid registration (T\(_1\) from Fig. 1) is presented in (b), and ventilation estimated from 4DCT in (c). The remaining panels show XeMRI ventilation images for the slices corresponding to the CT slice. The lung border from CT is superimposed on the ventilation images. The results of the XeMRI ventilation after applying deformable registration are shown only inside the lung mask in the middle row. Possible under-estimation of the motion for B-splines [21] (d) and deeds [10] (e) is indicated by green arrows, and implausible deformations by blue arrows. The results for the proposed method are shown in (f). In the bottom row we show displacements in the up-to-down direction for the corresponding slices for all the methods in (g), (h) and (i). (Color figure online)

We subsequently performed deformable pMRI-to-CT registration and compared the results of our registration method with those of the deeds deformable image registration [10] and free-form deformation (FFD)-based registration using B-splines [21]. The deeds method, originally proposed for lung CT registration, shows good performance in multi-modal image registration due to its image descriptor-based similarity measure, while B-spline FFD with mutual information as a similarity measure is one of the most established approaches for multi-modal image registration.

For the proposed method we extracted supervoxels consisting of approximately 500 voxels each, with the compactness parameter set to 0.1, and used 20 layers of supervoxels. The range of motion model parameters \(l_x, l_y\) and \(l_z\) is set between −0.6 and 0.6, at intervals of 0.1. The weighting parameter \(\alpha \) is 0.2 and the local correlation coefficient was calculated for a patch size of 7\(\,\times \,\)7\(\,\times \,\)7 voxels. Our method has been implemented in Matlab and its running time with the chosen parameter settings is approximately 45 min on an i7 laptop. The running times for deeds and B-splines were approximately 13 min and 25 min, respectively, with C++ implementations. Our method is amenable to further optimization and parallelization, which should result in a significant reduction in running time.
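As an illustration of a correlation coefficient evaluated on a 7 × 7 × 7 patch, consider the sketch below. It is not the Matlab implementation used in this work; the helpers `correlation_coefficient` and `local_cc`, the synthetic volumes, and the assumption that the patch lies fully inside the volume are introduced for illustration only.

```python
import numpy as np

def correlation_coefficient(a, b, eps=1e-12):
    """Pearson correlation of two equally sized patches (0.0 if one is constant)."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > eps else 0.0

def local_cc(fixed, moving, center, size=7):
    """Correlation coefficient on a size^3 patch around `center`.

    Assumes the patch lies fully inside both volumes.
    """
    half = size // 2
    window = tuple(slice(c - half, c + half + 1) for c in center)
    return correlation_coefficient(fixed[window], moving[window])

rng = np.random.default_rng(1)
fixed = rng.normal(size=(15, 15, 15))
rescaled = 3.0 * fixed + 5.0                  # same structure, different intensity scale
inverted = -2.0 * fixed + 5.0                 # same structure, inverted contrast

cc_rescaled = local_cc(fixed, rescaled, center=(7, 7, 7))
cc_inverted = local_cc(fixed, inverted, center=(7, 7, 7))
```

The coefficient is unaffected by a local linear rescaling of intensities and changes sign when the contrast is inverted, which is one reason locally computed correlation is attractive when intensities are not directly comparable.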

Visual inspection of the results in Fig. 4 reveals that the displacements estimated by the proposed method are anatomically more consistent. We decided to compare the XeMRI ventilation images with ventilation maps estimated from 4DCT, which are obtained using image registration of the dynamic sequence to a reference CT volume. To estimate ventilation maps, we used a method based on the changes of the lung intensity expressed in Hounsfield units between the peak inhale and peak exhale breathing phases [27]. Alternative approaches could estimate the ventilation from 4DCT based on the determinant of the Jacobian [8, 20] or with the use of supervoxel tracking [24]. We calculate Spearman’s correlation coefficient between the registered XeMRI ventilation images and the estimated 4DCT-based ventilation maps. Our method resulted in a higher correlation coefficient for patient 1 and patient 2 (0.344 and 0.572) compared to both other image registration methods (0.217 and 0.367 for B-splines and 0.299 and 0.5 for deeds). For patient 3 all methods achieved comparable results, with a slightly higher value for B-splines (0.171). On average our method achieved the best score of 0.359, with deeds being the second highest-scoring method (0.322), while the lowest correlation was calculated for B-splines (0.251). The standard deviation of the determinant of the Jacobian of the deformations, which can be seen as a measure of the complexity of the deformations, was on average 0.35 for our method, compared with 1.15 for B-splines and 0.63 for deeds. The results of the calculated correlations are shown in Fig. 5.
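The two evaluation measures used above, Spearman’s correlation and the standard deviation of the Jacobian determinant, can be sketched as follows. This is an illustration under assumptions and not the evaluation code used in this work: the helpers `average_ranks`, `spearman` and `jacobian_determinant`, the displacement layout (3, X, Y, Z) and the toy inputs are hypothetical, and the Jacobian is taken of the transformation x + u(x) using finite differences.

```python
import numpy as np

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)                  # last rank of each tie group
    starts = ends - counts + 1                # first rank of each tie group
    return ((starts + ends) / 2.0)[inverse]

def spearman(a, b):
    """Spearman's correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])

def jacobian_determinant(disp, spacing=(1.0, 1.0, 1.0)):
    """det(I + grad u) per voxel for a displacement field u of shape (3, X, Y, Z)."""
    grads = np.stack(
        [np.stack(np.gradient(disp[i], *spacing), axis=-1) for i in range(3)],
        axis=-2,
    )                                         # grads[..., i, j] = d u_i / d x_j
    return np.linalg.det(grads + np.eye(3))

# Toy check of the rank correlation on a monotonic relationship.
x = np.array([-2.0, -1.0, 0.0, 1.0, 3.0])
rho = spearman(x, x ** 3)

# Toy displacement: uniform 10% expansion, so det = 1.1^3 everywhere.
grid = np.stack(np.meshgrid(np.arange(5.0), np.arange(5.0), np.arange(5.0),
                            indexing="ij"))
detj = jacobian_determinant(0.1 * grid)
print(rho, float(detj.std()))
```

In practice both measures would be restricted to voxels inside the lung mask; a spatially uniform deformation such as the toy expansion yields a standard deviation of zero, while irregular deformations increase it.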

Fig. 5.

Spearman’s correlation between CT-based estimated ventilation maps and XeMRI ventilation images for different pMRI-to-CT registration approaches.

4 Discussion and Conclusions

In this work, we proposed a personalized model-driven method for pMRI-to-CT lung image registration. The method was evaluated on datasets from three patients undergoing radiation treatment for lung cancer. The visual results presented in Fig. 4, where we show the estimated deformations, suggest that the proposed method better mimics the motion of the lungs. The sudden changes in the direction of the motion estimated by B-splines and deeds, especially for the left lung, are unlikely to be present during breathing. We calculated the correlation between CT-based estimated ventilation and XeMRI brought into alignment with CT by each method. On average, our method outperformed the other image registration approaches in terms of the correlation with ventilation maps estimated from 4DCT. The slightly lower score for patient 3 was possibly caused by the fact that the difference in lung volume between pMRI/XeMRI and CT was the lowest in this case. This observation seems to be supported by the fact that all the methods achieved comparable results for this patient.

Our motion model-based method requires an accurate initial rigid registration. The upper parts of the lungs and apexes should be well aligned initially, or else the motion model-based registration might result in suboptimal performance. Such behavior is imposed by the lung physiology and should not be considered as a limitation of the method.

One of the challenges in our work is the lack of ground truth or landmarks identified in the images of both modalities. The low out-of-plane resolution of pMRI and XeMRI (20 mm) is another factor and, hence, the registration problem is not a trivial one to address. The correlation of XeMRI with 4DCT-based estimated ventilation maps was moderate overall. The reason for that might be different breathing patterns in 4DCT compared to XeMRI/pMRI, related to the physical properties of xenon gas, which is much heavier than air. Ventilation maps based on 4DCT are estimated from the changes in tissue density, which should correspond to the lungs filling with air; in practice, however, they might provide complementary information. Another limitation is that we had access to only one 4DCT scan of each patient. Therefore, our method might be prone to variations between breathing cycles. This issue could potentially be mitigated by including more scans of the same patient, such as diagnostic CT, in the motion model creation step.

The presented method shows promising results for the challenging application of XeMRI-to-CT registration. The application of the Principal Component Analysis-based motion model in the deformable registration step of the framework seems to have the potential to help drive the registration in regions of the lungs with an insufficient amount of information.