Abstract
Diffusion magnetic resonance imaging (dMRI) is currently the only tool for noninvasively imaging the brain’s white matter tracts. The fiber orientation (FO) is a key feature computed from dMRI for tract reconstruction. Because the number of FOs in a voxel is usually small, dictionary-based sparse reconstruction has been used to estimate FOs. However, accurate estimation of complex FO configurations in the presence of noise can still be challenging. In this work we explore the use of a deep network for FO estimation in a dictionary-based framework and propose an algorithm named Fiber Orientation Reconstruction guided by a Deep Network (FORDN). FORDN consists of two steps. First, we use a smaller dictionary encoding coarse basis FOs to represent diffusion signals. To estimate the mixture fractions of the dictionary atoms, a deep network is designed to solve the sparse reconstruction problem. Second, the coarse FOs inform the final FO estimation, where a larger dictionary encoding a dense basis of FOs is used and a weighted \(\ell _{1}\)-norm regularized least squares problem is solved to encourage FOs that are consistent with the network output. FORDN was evaluated and compared with state-of-the-art algorithms that estimate FOs using sparse reconstruction on simulated and typical clinical dMRI data. The results demonstrate the benefit of using a deep network for FO estimation.
1 Introduction
Diffusion magnetic resonance imaging (dMRI) is currently the only tool that enables reconstruction of in vivo white matter tracts [5]. By capturing the anisotropic water diffusion in tissue, dMRI infers information about fiber orientations (FOs), which are crucial features in white matter tract reconstruction [5].
Various methods have been proposed to estimate FOs. In particular, methods based on sparse reconstruction have shown efficacy in reliable FO estimation with a reduced number of gradient directions [8]. The sparsity assumption is typically combined with multi-tensor models [3, 4, 8], which leads to dictionary-based sparse reconstruction of FOs. However, accurate estimation of complex FO configurations in the presence of noise can still be challenging.
In this work, we explore the use of a deep network to improve dictionary-based sparse reconstruction of FOs. We model the diffusion signals using a dictionary, the atoms of which encode a set of basis FOs. Then, FO estimation can be formulated as a sparse reconstruction problem and we seek to solve it with the aid of a deep network. The proposed method is named Fiber Orientation Reconstruction guided by a Deep Network (FORDN), which consists of two steps. First, a deep network that unfolds the conventional iterative estimation process is constructed and its weights are learned from synthesized training samples. To reduce the computational burden of training, this step uses a smaller dictionary that encodes a coarse set of basis FOs, and thus gives approximate estimates of FOs. Second, the final sparse reconstruction of FOs is guided by the FOs produced by the deep network. A larger dictionary encoding dense basis FOs is used, and a weighted \(\ell _{1}\)-norm regularized least squares problem is solved to encourage FOs that are consistent with the network output. Experiments were performed on simulated and typical clinical brain dMRI data, where promising results were observed compared with competing algorithms.
2 Methods
2.1 Background: FO Estimation by Sparse Reconstruction
Diffusion signals can be modeled with a set of fixed prolate tensors, each representing a possible FO by its primary eigenvector (PEV) [4, 8]. Suppose the set of the basis tensors is \(\{\mathbf {D}_{i}\}_{i=1}^{N}\) and their PEVs are \(\{\varvec{v}_{i}\}_{i=1}^{N}\), where N is the number of the basis tensors. In this work we use \(N=289\) [14], which results from tessellating an octahedron. The eigenvalues of the basis tensors can be determined by examining the diffusion tensors in regions occupied by single tracts [8].
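As a concrete sketch (not the authors' code), a prolate basis tensor can be constructed from each basis direction by fixing its eigenvalues; the eigenvalues below are illustrative assumptions, in practice determined from single-tract regions as described above:

```python
import numpy as np

def basis_tensor(v, lam_par=1.7e-3, lam_perp=3.0e-4):
    """Prolate diffusion tensor (mm^2/s) whose primary eigenvector (PEV)
    is the unit direction v: D = (l_par - l_perp) v v^T + l_perp I."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return (lam_par - lam_perp) * np.outer(v, v) + lam_perp * np.eye(3)

# Example: a basis tensor aligned with the x-axis.
D = basis_tensor([1.0, 0.0, 0.0])
```

Each basis FO \(\varvec{v}_{i}\) yields one such tensor \(\mathbf {D}_{i}\), so the dictionary encodes one candidate orientation per atom.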
For each diffusion gradient direction \(\varvec{g}_{k}\) (\(k=1,\ldots ,K\)) associated with a b-value \(b_{k}\), the diffusion weighted signal at each voxel can be represented as [8]

$$S(\varvec{g}_{k}) = S(\varvec{0})\sum _{i=1}^{N} f_{i}\,e^{-b_{k}\varvec{g}_{k}^{T}\mathbf {D}_{i}\varvec{g}_{k}} + n(\varvec{g}_{k}), \quad \quad (1)$$
where \(S(\varvec{0})\) is the baseline signal without diffusion weighting, \(f_{i}\) is the unknown nonnegative mixture fraction (MF) for \(\mathbf {D}_{i}\) (\(\sum _{i=1}^{N} f_{i} = 1\)), and \(n(\varvec{g}_{k})\) represents image noise. By defining \(y(\varvec{g}_{k}) = S(\varvec{g}_{k})/S(\varvec{0})\) and \(\eta (\varvec{g}_{k}) = n(\varvec{g}_{k})/S(\varvec{0})\), we have

$$\varvec{y} = \mathbf {G}\varvec{f} + \varvec{\eta }, \quad \quad (2)$$
where \(\varvec{y}=(y({\varvec{g}_1}),...,y({\varvec{g}_K}))^{T}\), \(\varvec{f}=(f_{1},...,f_{N})^{T}\), \(\varvec{\eta }=(\eta ({\varvec{g}_1}),...,\eta ({\varvec{g}_K}))^{T}\), and \(\mathbf {G}\in \mathbb {R}^{K\times N}\) is a dictionary matrix with \(G_{ki}=e^{-b_{k}\varvec{g}_{k}^{T}\mathbf {D}_{i}\varvec{g}_{k}}\).
Because the number of FOs at a voxel is small compared with the number of gradient directions, FOs can be estimated by solving a sparse reconstruction problem

$$\hat{\varvec{f}} = \mathop {\arg \min }\limits _{\varvec{f}\ge \varvec{0},\,||\varvec{f}||_{1}=1} ||\mathbf {G}\varvec{f}-\varvec{y}||_{2}^{2} + \beta ||\varvec{f}||_{0}. \quad \quad (3)$$
To solve Eq. (3), the constraint \(||\varvec{f}||_{1}=1\) is usually removed [4, 8]. The relaxed problem can then be solved by iterative reweighted \(\ell _{1}\)-norm minimization [4] or by approximating the \(\ell _{0}\)-norm with the \(\ell _{1}\)-norm [8]. The solution is finally normalized so that the MFs sum to one, and basis directions associated with MFs larger than a threshold are taken as FOs [8].
2.2 FO Estimation Using a Deep Network
Consider the general sparse reconstruction problem

$$\hat{\varvec{f}} = \mathop {\arg \min }\limits _{\varvec{f}} ||\mathbf {G}\varvec{f}-\varvec{y}||_{2}^{2} + \beta ||\varvec{f}||_{0}. \quad \quad (4)$$
Using methods such as iterative hard thresholding (IHT) [2], Eq. (4) or its \(\ell _{1}\)-norm relaxed version can be solved by iteratively updating \(\varvec{f}\). At iteration \(t+1\),

$$\varvec{f}^{t+1} = h_{\lambda }\left( \mathbf {W}\varvec{y} + \mathbf {S}\varvec{f}^{t}\right) , \quad \quad (5)$$
where \(\mathbf {W}=\mathbf {G}^{T}\), \(\mathbf {S}=\mathbf {I}-\mathbf {G}^{T}\mathbf {G}\), and \(h_{\lambda }(\cdot )\) is a thresholding operator with a parameter \(\lambda \ge 0\). Motivated by this iterative process, previous works have explored the use of a deep network for solving sparse reconstruction problems. By unfolding and truncating the process in Eq. (5), feed-forward deep network structures can be constructed for sparse reconstruction, where \(\mathbf {W}\) and \(\mathbf {S}\) are learned from training data instead of predetermined by \(\mathbf {G}\) [11, 12]. As demonstrated by [12], these learned layer-wise fixed weights could guarantee successful reconstruction across a wider range of restricted isometry property (RIP) conditions than conventional methods such as IHT.
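The iterative update in Eq. (5) can be sketched in a few lines of NumPy. This is an illustration only: a step size \(\mu \) is folded into \(\mathbf {W}\) and \(\mathbf {S}\) for numerical stability (a standard IHT variant), and the toy dictionary, sparsity level, and threshold are assumptions:

```python
import numpy as np

def iht(G, y, lam=0.05, n_iter=200):
    """Iterative thresholding for nonnegative sparse codes:
    f^{t+1} = h_lam(W y + S f^t), with W = mu G^T and S = I - mu G^T G."""
    mu = 1.0 / np.linalg.norm(G, 2) ** 2  # step size from the largest eigenvalue of G^T G
    f = np.zeros(G.shape[1])
    for _ in range(n_iter):
        a = f + mu * (G.T @ (y - G @ f))  # equals W y + S f
        f = np.where(a >= lam, a, 0.0)    # thresholded ReLU h_lam
    return f

# Toy example: recover a sparse nonnegative code from a random dictionary.
rng = np.random.default_rng(0)
G = rng.standard_normal((30, 50))
G /= np.linalg.norm(G, axis=0)            # unit-norm atoms
f_true = np.zeros(50)
f_true[[3, 17]] = [0.7, 0.3]
y = G @ f_true
f_hat = iht(G, y)
```

Unfolding this loop for a fixed number of iterations and learning \(\mathbf {W}\) and \(\mathbf {S}\) from data is exactly the construction used in the network below.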
In this work, to solve Eq. (3) we construct a deep network as shown in Fig. 1, where the input is the diffusion signal \(\varvec{y}\) at a voxel and the output is the MF \(\varvec{f}\). The layers \(L=1,2,\ldots ,8\) correspond to the unfolded and truncated iterative process in Eq. (5) (assuming \(\varvec{f}^{0}=\varvec{0}\)), where \(\mathbf {W}\) and \(\mathbf {S}\) (shared among layers) are learned. The thresholded rectified linear unit (ReLU) [7] is given by \([h_{\lambda }(\varvec{a})]_{i} = a_{i}\mathbbm {1}_{a_{i} \ge \lambda }\), where \(\mathbbm {1}\) is an indicator function, and is used in each of these layers. The thresholded ReLU yields the thresholding operator in IHT [2] (Eq. (5)). We empirically set \(\lambda =0.01\). Note that because of the nonnegative constraint on \(\varvec{f}\) in Eq. (3), \([h_{\lambda }(\varvec{a})]_{i}\) is always zero when \(a_{i} < 0\). A normalization layer is added before the output to enforce that the entries of \(\varvec{f}\) sum to one. To ensure numerical stability, we use \(\varvec{f}\leftarrow (\varvec{f}+\tau \varvec{1})/||\varvec{f}+\tau \varvec{1}||_{1}\) for the normalization, where \(\tau =10^{-10}\). The network is implemented using the Keras library (http://keras.io/). We use the mean squared error as the loss function and the Adam algorithm [6] as the optimizer; the learning rate is 0.001, the batch size is 64, and the number of epochs is 8, which achieves stable training loss in practice.
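A NumPy sketch of the network's forward pass (eight shared-weight layers with the thresholded ReLU, followed by the normalization layer) is shown below. The weights here are random placeholders; in FORDN, \(\mathbf {W}\) and \(\mathbf {S}\) are learned from the synthesized training data:

```python
import numpy as np

def network_forward(y, W, S, lam=0.01, n_layers=8, tau=1e-10):
    """Forward pass of the unfolded network: n_layers shared-weight layers
    with a thresholded ReLU, then sum-to-one normalization."""
    f = np.zeros(S.shape[0])
    for _ in range(n_layers):
        a = W @ y + S @ f
        f = np.where(a >= lam, a, 0.0)   # thresholded ReLU (zero for a < lam)
    # normalization layer: divide by the l1 norm of f + tau*1 (entries are nonnegative)
    f = (f + tau) / np.sum(f + tau)
    return f

# Placeholder weights; dimensions K=30 gradient directions, N'=73 coarse atoms.
rng = np.random.default_rng(1)
W = 0.1 * rng.standard_normal((73, 30))
S = 0.1 * rng.standard_normal((73, 73))
y = rng.random(30)
f = network_forward(y, W, S)
```

The output is always a nonnegative vector summing to one, matching the MF constraints in Eq. (3).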
If the training data are generated by conventional algorithms, such as IHT, then the network only learns a strategy that approximates these suboptimal solutions [12]. Thus, we adopt the strategy of synthesizing observations [12] according to given FO configurations. However, synthesis of diffusion signals for all combinations is prohibitive. For example, for the cases of three crossing fibers, the total number of FO combinations is \(\binom{N}{3}\approx 4\times 10^{6}\), and each combination requires sufficient training instances with noise sampling and different combinations of MFs. Thus, training the deep network using the full set of basis directions can be computationally intensive. Therefore, we use a two-step strategy to estimate FOs: (1) by using a smaller set of basis FOs, coarse FOs are estimated using the proposed deep network; (2) the final FO estimation is guided by these coarse FOs by solving a weighted \(\ell _{1}\)-norm regularized least squares problem. Details of these two steps are described below.
Coarse FO Estimation Using a Deep Network. A smaller set of basis tensors \(\{\tilde{\mathbf {D}}_{i'}\}_{i'=1}^{N'}\) (\(N'=73\)) is considered for coarse FO estimation using the deep network. As commonly assumed in the literature, we consider cases with three or fewer FOs in synthesizing the training data [4]. The set of FO configurations used for synthesis can be obtained by applying an existing FO estimation method to the subject of interest. In this work we use CFARI [8], which estimates FOs using sparse reconstruction. Note that such a method need not provide accurate FO configurations at every voxel; it only needs to provide a good estimate of the set of FO configurations in the brain or a region of interest.
The original CFARI method can produce multiple close FOs to represent a single FO that is not collinear with any basis direction, which unnecessarily increases the number of FOs; in addition, these FOs may not be collinear with the smaller set of basis directions used by the deep network. We therefore post-process the CFARI FOs by selecting the peak directions in terms of MFs and then mapping them to their closest basis directions in the coarse basis set.
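The mapping step — assigning each selected peak direction to its closest coarse basis direction — can be sketched as follows. Since FOs are antipodally symmetric (\(\varvec{v}\) and \(-\varvec{v}\) represent the same orientation), the absolute dot product measures closeness; the toy directions are assumptions:

```python
import numpy as np

def map_to_coarse(fos, coarse_dirs):
    """Map each FO (unit vector) to the index of its closest coarse basis
    direction, treating v and -v as the same orientation."""
    mapped = set()
    for u in fos:
        dots = np.abs(coarse_dirs @ u)   # |cos(angle)| to every coarse direction
        mapped.add(int(np.argmax(dots)))  # duplicates collapse into one index
    return sorted(mapped)

# Toy example: two nearly x-axis FOs collapse onto one coarse direction.
coarse = np.eye(3)                        # three coarse basis directions
u1 = np.array([0.99, 0.14, 0.0]); u1 /= np.linalg.norm(u1)
u2 = np.array([-0.98, 0.0, 0.2]); u2 /= np.linalg.norm(u2)
idx = map_to_coarse([u1, u2], coarse)
```

Using a set removes the redundant close FOs, so a spurious pair of nearby peaks yields a single coarse basis direction.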
The refined CFARI FOs in the brain or a brain region provide a set of training FO configurations. For each FO configuration with a single or multiple basis directions, diffusion signals were synthesized with a single-tensor or multi-tensor model using the corresponding basis tensors, respectively. For a single basis direction, its MF was set to one; for multiple basis directions, different combinations of their MFs from 0.1 to 0.9 in increments of 0.1 were used for synthesis (note that they should sum to one). Rician noise was added to the synthesized signals, and the signal-to-noise ratio (SNR) can be obtained, for example, by placing bounding boxes in background and white matter areas [13]. For each MF combination, 500 samples were generated for training.
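The synthesis of one training signal can be sketched as below; the gradient scheme, tensor eigenvalues, and SNR are illustrative assumptions, and the signal follows the multi-tensor model of Sect. 2.1 with the baseline signal normalized to one:

```python
import numpy as np

def synthesize_signal(dirs, tensors, mfs, bval=1000.0, snr=20.0, rng=None):
    """Synthesize one noisy diffusion signal (baseline normalized to one)
    from a multi-tensor model, with Rician noise at the given SNR."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.zeros(len(dirs))
    for f, D in zip(mfs, tensors):
        # attenuation exp(-b g^T D g) for every gradient direction g
        y += f * np.exp(-bval * np.einsum('ki,ij,kj->k', dirs, D, dirs))
    sigma = 1.0 / snr                      # noise level relative to the b0 signal
    # Rician noise: magnitude of the signal plus complex Gaussian noise
    return np.abs(y + sigma * rng.standard_normal(len(dirs))
                  + 1j * sigma * rng.standard_normal(len(dirs)))

# Toy example: two crossing tensors with equal mixture fractions.
rng = np.random.default_rng(0)
dirs = rng.standard_normal((30, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
Dx = np.diag([1.7e-3, 3.0e-4, 3.0e-4])    # basis tensor along x
Dy = np.diag([3.0e-4, 1.7e-3, 3.0e-4])    # basis tensor along y
y = synthesize_signal(dirs, [Dx, Dy], [0.5, 0.5], rng=rng)
```

Repeating this for every MF combination of every FO configuration produces the labeled pairs \((\varvec{y}, \varvec{f})\) used to train the network.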
We further reduce the computational cost of training by parcellating the brain into different regions, each containing a small number of FO configurations. This is achieved by registering the EVE template [10] to the subject using the fractional anisotropy (FA) map and the SyN algorithm [1]. A deep network is then constructed for each region using all the FO configurations in that region, and thus each network requires a much smaller number of training samples.
In the test phase, the trained networks estimate the MFs in their corresponding parcellated brain regions. As in [8], the basis directions with MFs larger than a threshold of 0.1 are taken as FOs. The FOs predicted by the deep networks are denoted by \(\mathcal {U}=\{\varvec{u}_{p}\}_{p=1}^{U}\), where U is the cardinality of \(\mathcal {U}\).
FO Estimation Guided by the Deep Network. The coarse FOs given by the deep networks provide only approximate FO estimates due to the low angular resolution of the coarse basis; however, they can guide the final sparse FO reconstruction that uses the larger set of basis directions. Specifically, at each voxel we solve the following weighted \(\ell _{1}\)-norm regularized least squares problem [13], which allows incorporation of prior knowledge of FOs:

$$\hat{\varvec{f}} = \mathop {\arg \min }\limits _{\varvec{f}\ge \varvec{0}} ||\mathbf {G}\varvec{f}-\varvec{y}||_{2}^{2} + \beta ||\mathbf {C}\varvec{f}||_{1}. \quad \quad (6)$$
Here, \(\mathbf {C}\) is a diagonal matrix encoding the guiding FOs predicted by the deep network: basis directions closer to the guiding FOs receive smaller weights and are thus encouraged. The diagonal weights are specified as in [14].
When \(\varvec{v}_{i}\) is close to the guiding FOs, \(C_{i}\) is small and therefore \(f_{i}\) is encouraged to be large and \(\varvec{v}_{i}\) is encouraged to be selected as an FO. Equation (6) can be solved using the strategy in [13]. We set \(\alpha =0.8\) as in [14], and selected \(\beta =0.25\) because the number of diffusion gradients used in this work is about half of that used in [14]. The MFs are normalized so that they sum to one, and the FOs are determined as the basis directions associated with MFs larger than 0.1 [8].
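As an illustration of the weighted \(\ell _{1}\)-norm regularized least squares step, a simple projected proximal-gradient (ISTA-style) solver is sketched below. This is not the exact strategy of [13], and the dictionary and weights are toy assumptions:

```python
import numpy as np

def weighted_l1_ls(G, y, c, beta=0.25, n_iter=500):
    """Minimize ||G f - y||_2^2 + beta ||C f||_1 over f >= 0, C = diag(c),
    by proximal-gradient steps with a nonnegative soft threshold."""
    mu = 1.0 / np.linalg.norm(G, 2) ** 2       # step size from the Lipschitz constant
    f = np.zeros(G.shape[1])
    for _ in range(n_iter):
        g = f + mu * (G.T @ (y - G @ f))       # gradient step on the data term
        f = np.maximum(g - 0.5 * mu * beta * c, 0.0)  # weighted shrinkage + projection
    return f

# Toy example: small weights (strong encouragement) on the "guiding" atoms.
rng = np.random.default_rng(0)
G = rng.standard_normal((30, 50))
G /= np.linalg.norm(G, axis=0)                 # unit-norm atoms
f_true = np.zeros(50)
f_true[[3, 17]] = [0.6, 0.4]
y = G @ f_true
c = np.ones(50)
c[[3, 17]] = 0.1                               # encourage basis directions 3 and 17
f_hat = weighted_l1_ls(G, y, c)
```

Atoms with small weights \(c_{i}\) are shrunk less at each step, which is how the guiding FOs bias the final reconstruction toward consistent basis directions.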
3 Results
3.1 3D Digital Crossing Phantom
A 3D digital crossing phantom was created, where the tract geometries and diffusion parameters in [14] were used. The phantom consists of regions of single tracts, two crossing tracts, and three crossing tracts. Thirty gradient directions were applied with b = 1000 s/mm\(^2\). Rician noise (\(\mathrm {SNR}=20\) on the b0 image) was added to the diffusion weighted images (DWIs).
We quantitatively evaluated the accuracy of FORDN using the error measure in [14]. We compared FORDN with two sparse reconstruction based methods, CFARI [8] and L2L0 [4]. Note that in [14] CFARI has already been compared with techniques that are not based on sparse reconstruction, where they achieve similar estimation performance. We also compared the final FORDN results with the intermediate output from the deep network (DN). The errors in the entire phantom and in each region containing noncrossing, two crossing, or three crossing tracts are shown in Fig. 2(a). In all cases, FORDN achieves more accurate FO reconstruction. In addition, the intermediate DN results already improve FO estimation in regions with crossing tracts compared with CFARI and L2L0.
The FO errors in Fig. 2(a) were also compared between FORDN and CFARI, L2L0, and DN using a paired Student’s t-test. In all cases, the FORDN errors are significantly smaller (\(p<0.001\)), and the effect sizes (Cohen’s d) are shown in Fig. 2(b). The effect sizes are larger in regions with three crossing tracts, indicating greater improvement in regions with more complex fiber structures.
3.2 Brain dMRI
FORDN was next applied to a dMRI scan of a randomly selected subject from the Kirby21 dataset [9].\(^{1}\) DWIs were acquired on a 3T MR scanner (Achieva, Philips, Best, Netherlands) with 32 gradient directions (b = 700 s/mm\(^2\)). The acquired in-plane resolution is 2.2 mm (upsampled by the scanner to 0.828 mm), and the slice thickness is 2.2 mm; we resampled the DWIs to an isotropic resolution of 2.2 mm. The SNR is about 22 on the b0 image.
FOs in a region where the corpus callosum (CC) and the superior longitudinal fasciculus (SLF) cross are shown in Fig. 3 and compared with CFARI and L2L0. FORDN better reconstructs the transverse CC FOs and the anterior–posterior SLF FOs than CFARI and L2L0 (see the highlighted region). Fiber tracking was then performed using the strategy in [15], where seeds were placed in the noncrossing CC (see Fig. 4). The FA threshold is 0.2, the turning angle threshold is \(45^{\circ }\), and the step size is 1 mm. The results are shown in Fig. 4, and each segment is color-coded using the standard color scheme—red: left–right; green: front–back; and blue: up–down. FORDN FOs do not produce the false (green) streamlines going in the anterior–posterior direction that appear in the CFARI and L2L0 results (see the zoomed region). Note that the streamlines tracked by FORDN propagate through multiple regions parcellated by the EVE atlas, which indicates that the consistency of the streamlines is preserved although each region is associated with a different deep network.
4 Conclusion
We have proposed an algorithm for FO estimation guided by a deep network. The diffusion signals are modeled in a dictionary-based framework. A deep network designed for sparse reconstruction provides coarse FO estimates using a smaller set of the dictionary atoms, which then inform the final FO estimation using weighted \(\ell _{1}\)-norm regularization. Experiments on simulated and clinical brain dMRI demonstrate promising results compared with the competing methods.
Notes

1. Additional results can be found at http://arxiv.org/abs/1705.06870.
References
Avants, B.B., Epstein, C.L., Grossman, M., Gee, J.C.: Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Med. Image Anal. 12(1), 26–41 (2008)
Blumensath, T., Davies, M.E.: Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14(5–6), 629–654 (2008)
Chen, G., Zhang, P., Li, K., Wee, C.Y., Wu, Y., Shen, D., Yap, P.T.: Improving estimation of fiber orientations in diffusion MRI using inter-subject information sharing. Sci. Rep. 6, 37847 (2016)
Daducci, A., Van De Ville, D., Thiran, J.P., Wiaux, Y.: Sparse regularization for fiber ODF reconstruction: from the suboptimality of \(\ell _2\) and \(\ell _1\) priors to \(\ell _0\). Med. Image Anal. 18(6), 820–833 (2014)
Johansen-Berg, H., Behrens, T.E.J.: Diffusion MRI: From Quantitative Measurement to In vivo Neuroanatomy. Academic Press, Waltham (2013)
Kingma, D., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Konda, K., Memisevic, R., Krueger, D.: Zero-bias autoencoders and the benefits of co-adapting features. arXiv preprint arXiv:1402.3337 (2014)
Landman, B.A., Bogovic, J.A., Wan, H., ElShahaby, F.E.Z., Bazin, P.L., Prince, J.L.: Resolution of crossing fibers with constrained compressed sensing using diffusion tensor MRI. NeuroImage 59(3), 2175–2186 (2012)
Landman, B.A., Huang, A.J., Gifford, A., Vikram, D.S., Lim, I.A.L., Farrell, J.A., Bogovic, J.A., Hua, J., Chen, M., Jarso, S., Smith, S.A., Joel, S., Mori, S., Pekar, J.J., Barker, P.B., Prince, J.L., van Zijl, P.C.: Multi-parametric neuroimaging reproducibility: a 3-T resource study. NeuroImage 54(4), 2854–2866 (2011)
Oishi, K., Faria, A., Jiang, H., Li, X., Akhter, K., Zhang, J., Hsu, J.T., Miller, M.I., van Zijl, P.C., Albert, M., et al.: Atlas-based whole brain white matter analysis using large deformation diffeomorphic metric mapping: application to normal elderly and Alzheimer’s disease participants. NeuroImage 46(2), 486–499 (2009)
Wang, Z., Ling, Q., Huang, T.S.: Learning deep \(\ell _0\) encoders. In: AAAI Conference on Artificial Intelligence, pp. 2194–2200 (2016)
Xin, B., Wang, Y., Gao, W., Wipf, D.: Maximal sparsity with deep networks? In: Advances in Neural Information Processing Systems, pp. 4340–4348 (2016)
Ye, C., Murano, E., Stone, M., Prince, J.L.: A Bayesian approach to distinguishing interdigitated tongue muscles from limited diffusion magnetic resonance imaging. Comput. Med. Imaging Graph. 45, 63–74 (2015)
Ye, C., Zhuo, J., Gullapalli, R.P., Prince, J.L.: Estimation of fiber orientations using neighborhood information. Med. Image Anal. 32, 243–256 (2016)
Yeh, F.C., Wedeen, V.J., Tseng, W.Y.I.: Generalized \(q\)-sampling imaging. IEEE Trans. Med. Imaging 29(9), 1626–1635 (2010)
Acknowledgement
This work is supported by NSFC 61601461 and NIH 2R01NS056307.
© 2017 Springer International Publishing AG
Ye, C., Prince, J.L. (2017). Fiber Orientation Estimation Guided by a Deep Network. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds) Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. MICCAI 2017. Lecture Notes in Computer Science(), vol 10433. Springer, Cham. https://doi.org/10.1007/978-3-319-66182-7_66