Abstract
We present a modification of the well-known Alternating Direction Method of Multipliers (ADMM) algorithm with additional preconditioning, aimed at solving convex optimisation problems with nonlinear operator constraints. Connections to the recently developed Nonlinear Primal-Dual Hybrid Gradient Method (NL-PDHGM) are presented, and the algorithm is demonstrated on the nonlinear inverse problem of parallel Magnetic Resonance Imaging (MRI).
1 Introduction
Non-smooth regularisation methods are popular tools in the imaging sciences. They make it possible to promote sparsity of inverse problem solutions with respect to specific representations; they can implicitly restrict the null space of the forward operator while at the same time guaranteeing noise suppression. The most prominent representatives of this class are total variation regularisation [19] and \(\ell ^1\)-norm regularisation as in the broader context of compressed sensing [8, 10].
In order to computationally solve convex, non-smooth regularisation problems with linear operator constraints, first-order operator splitting methods have gained increasing interest over the last decade; see [3, 9, 11, 12] to name just a few. Despite some recent extensions to certain types of non-convex problems [7, 14,15,16], to our knowledge only little progress has been made for nonlinear operator constraints [2, 22].
In this paper we are particularly interested in minimising non-smooth, convex functionals with nonlinear operator constraints. This model covers many interesting applications; the one we address in particular is the joint reconstruction of the spin-proton density and the coil sensitivity maps in parallel MRI [13, 21].
The paper is structured as follows: we introduce the generic problem formulation and then address its numerical minimisation via a generalised ADMM method with linearised operator constraints. Subsequently, we show connections to the recently proposed NL-PDHGM method (indicating a local convergence result for the proposed algorithm) and conclude with the joint spin-proton density and coil sensitivity map estimation as a numerical example.
2 Problem Formulation
We consider the following generic constrained minimisation problem:
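The display equation is not reproduced in this version; from the definitions that follow, problem (1) should read

```latex
\min_{u, v} \; H(u) + J(v) \quad \text{subject to} \quad F(u, v) = c. \tag{1}
```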
Here H and J denote proper, convex and lower semi-continuous functionals, F is a nonlinear operator and c a given function. Note that for nonlinear operators of the form \(F(u, v) = G(u) - v\) and \(c = 0\) problem (1) can be written as
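Eliminating \(v = G(u)\), the reformulated problem (2) should read

```latex
\min_{u} \; H(u) + J(G(u)). \tag{2}
```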
In the following we want to propose a strategy for solving (1) that is based on simultaneous linearisation of the nonlinear operator constraint and the solution of an inexact ADMM problem.
3 Alternating Direction Method of Multipliers
We solve (1) by alternating optimisation of the augmented Lagrange function
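The display for the augmented Lagrangian (3) is missing here; with dual variable \(\mu\) and augmentation parameter \(\delta > 0\) it should read

```latex
L_\delta(u, v; \mu) := H(u) + J(v) + \langle \mu, F(u, v) - c \rangle + \frac{\delta}{2} \bigl\Vert F(u, v) - c \bigr\Vert^2. \tag{3}
```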
Alternating minimisation of (3) in u and v, and subsequent maximisation of \(\mu \) via a step of gradient ascent, yields the following nonlinear version of ADMM [11]:
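The update displays (4)–(6) are not reproduced in this version; completing the square in (3), they should read

```latex
u^{k+1} \in \arg\min_u \; H(u) + \frac{\delta}{2} \Bigl\Vert F(u, v^k) - c + \frac{\mu^k}{\delta} \Bigr\Vert^2, \tag{4}
v^{k+1} \in \arg\min_v \; J(v) + \frac{\delta}{2} \Bigl\Vert F(u^{k+1}, v) - c + \frac{\mu^k}{\delta} \Bigr\Vert^2, \tag{5}
\mu^{k+1} = \mu^k + \delta \bigl( F(u^{k+1}, v^{k+1}) - c \bigr). \tag{6}
```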
To avoid having to deal with nonlinear subproblems, we replace \(F(u, v^k)\) and \(F(u^{k + 1}, v)\) by their Taylor linearisations around \(u^k\) and \(v^k\), which yields \(F(u, v^k) \approx F(u^k, v^k) + \partial _u F(u^k, v^k)\left( u - u^k \right) \) and \(F(u^{k + 1}, v) \approx F(u^{k + 1}, v^k) + \partial _v F(u^{k + 1}, v^k)\left( v - v^k \right) \), respectively. The updates (4) and (5) then become
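With the abbreviations defined in the next sentence, the linearised updates (7) and (8) should read

```latex
u^{k+1} \in \arg\min_u \; H(u) + \frac{\delta}{2} \Bigl\Vert A^k u - c_1^k + \frac{\mu^k}{\delta} \Bigr\Vert^2, \tag{7}
v^{k+1} \in \arg\min_v \; J(v) + \frac{\delta}{2} \Bigl\Vert B^k v - c_2^k + \frac{\mu^k}{\delta} \Bigr\Vert^2. \tag{8}
```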
with \(A^k := \partial _u F(u^k, v^k)\), \(B^k := \partial _v F(u^{k + 1}, v^k)\), \(c_1^k := c + A^k u^k - F(u^k, v^k)\) and \(c_2^k := c + B^k v^k - F(u^{k + 1}, v^k)\). Note that the updates (7) and (8) are still implicit, regardless of H and J. In the following, we want to modify the updates such that they become simple proximity operations.
4 Preconditioned ADMM
Based on [23], we modify (7) and (8) by adding the surrogate terms \(\Vert u^{k + 1} - u^k \Vert _{Q^k_1}^2 / 2\) and \(\Vert v^{k + 1} - v^k \Vert _{Q^k_2}^2 / 2\), with \(\Vert w \Vert _Q := \sqrt{\langle Qw, w\rangle }\) (note that if Q is chosen to be positive definite, \(\Vert \cdot \Vert _Q\) becomes a norm). We then obtain
If we choose \(Q_1^k := \frac{1}{\tau_1^k} I - \delta A^k {}^* A^k\) with \(\tau _1^k \delta < 1/\Vert A^k \Vert ^2\) (which guarantees positive definiteness of \(Q_1^k\)) and \(Q_2^k := \frac{1}{\tau_2^k} I - \delta B^k {}^* B^k\) with \(\tau _2^k \delta < 1/\Vert B^k \Vert ^2\), and if we define \(\overline{\mu }^k := 2\mu ^k - \mu ^{k - 1}\), we obtain
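The displays (9) and (10) are not reproduced in this version. Expanding the optimality conditions of (7) and (8) with these surrogate terms, the quadratic terms in \(A^k\) and \(B^k\) cancel and, since \(\mu^k + \delta(A^k u^k - c_1^k) = \overline{\mu}^k\) by (6), a plausible reconstruction is

```latex
u^{k+1} = \bigl( I + \tau_1^k \partial H \bigr)^{-1}\!\Bigl( u^k - \tau_1^k A^{k\,*} \overline{\mu}^k \Bigr), \tag{9}
v^{k+1} = \bigl( I + \tau_2^k \partial J \bigr)^{-1}\!\Bigl( v^k - \tau_2^k B^{k\,*} \bigl( \mu^k + \delta ( B^k v^k - c_2^k ) \bigr) \Bigr). \tag{10}
```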
with \((I + \alpha \partial E)^{-1}(w)\) denoting the proximity or resolvent operator
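The defining display is omitted in this version; the proximity (resolvent) operator of a functional E is the standard

```latex
(I + \alpha \partial E)^{-1}(w) = \arg\min_x \Bigl\{ \frac{1}{2} \Vert x - w \Vert^2 + \alpha E(x) \Bigr\}.
```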
The entire proposed algorithm with updates (9), (10) and (6) reads as
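The algorithm box itself is not reproduced in this extracted version; assembled from (9), (10) and (6), it can be sketched as follows:

```
Algorithm 1 (sketch, assembled from updates (9), (10) and (6))
Input:  u^0, v^0, mu^0 = mu^{-1}, parameters delta > 0, tau_1^k, tau_2^k
for k = 0, 1, 2, ...
    A^k      = d_u F(u^k, v^k)
    mubar^k  = 2 mu^k - mu^{k-1}
    u^{k+1}  = (I + tau_1^k dH)^{-1}( u^k - tau_1^k (A^k)^* mubar^k )          # (9)
    B^k      = d_v F(u^{k+1}, v^k),  c_2^k = c + B^k v^k - F(u^{k+1}, v^k)
    v^{k+1}  = (I + tau_2^k dJ)^{-1}( v^k
                 - tau_2^k (B^k)^* ( mu^k + delta (B^k v^k - c_2^k) ) )        # (10)
    mu^{k+1} = mu^k + delta ( F(u^{k+1}, v^{k+1}) - c )                        # (6)
```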
5 Connection to NL-PDHGM
In the following we want to show how the algorithm simplifies in case the nonlinear operator constraint is nonlinear in only one variable, which is sufficient for problems of the form (2). Without loss of generality we consider constraints of the form \(F(u, v) = G(u) - v\), where G represents a nonlinear operator in u. Then we have \(A^k = {{\mathrm{\mathcal {J}}}}\!G(u^k)\) (with \({{\mathrm{\mathcal {J}}}}\!G(u^k)\) denoting the Jacobian of G at \(u^k\)) and \(B^k = -I\); if we further choose \(\tau _2^k = 1/\delta \) for all k, update (10) reads
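The specialised display is missing here; with \(B^k = -I\), \(\tau_2^k = 1/\delta\) and \(c = 0\) (as in (2)), update (10) should reduce to

```latex
v^{k+1} = \Bigl( I + \frac{1}{\delta} \partial J \Bigr)^{-1}\!\Bigl( G(u^{k+1}) + \frac{\mu^k}{\delta} \Bigr).
```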
Applying Moreau’s identity [18] \(b = \left( I + \frac{1}{\delta } \partial J\right) ^{-1}(b) + \frac{1}{\delta }(I + \delta \partial J^*)^{-1}(\delta b)\) yields
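Substituting \(b = G(u^{k+1}) + \mu^k/\delta\) into Moreau's identity, the resulting display should read

```latex
v^{k+1} = G(u^{k+1}) + \frac{\mu^k}{\delta} - \frac{1}{\delta} \bigl( I + \delta \partial J^* \bigr)^{-1}\!\bigl( \delta\, G(u^{k+1}) + \mu^k \bigr).
```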
If we further change the order of the updates, starting with the update for \(\mu \), the whole algorithm reads
Note that this algorithm is almost the same as the NL-PDHGM proposed in [22] for \(\theta = 1\), except that the extrapolation step is carried out on the dual variable \(\mu \) instead of the primal variable u. In the following we briefly sketch how to prove convergence of this algorithm in analogy to [22]. We define
with \(c^k := G(u^k) - {{\mathrm{\mathcal {J}}}}\!G(u^k)u^k\). Now the algorithm is: find \((\mu ^{k+1}, u^{k+1})\) such that
If we exchange the order of \(\mu \) and u here, i.e., reorder the rows of N, and the rows and columns of \(L^k\), we obtain almost the “linearised” NL-PDHGM of [22]. The difference is that the sign of \({{\mathrm{\mathcal {J}}}}G\) in \(L^k\) is inverted. The only points in [22] where the exact structure of \(L^k\) (\(M_{x^k}\) therein) is used are Lemma 3.1, Lemma 3.6 and Lemma 3.10. The first two go through exactly as before with the negated structure. Reproducing Lemma 3.10 demands bounding the actual step lengths \(\Vert u^k-u^{k+1} \Vert \) and \(\Vert \mu ^k-\mu ^{k+1} \Vert \) from below near a solution, for arbitrary \(\epsilon >0\). A full proof would go beyond the page limit of these proceedings; let us just point out that this can be done, implying that the convergence results of [22] apply to this algorithm as well. This means that under somewhat technical regularity conditions, which for TV-type problems amount to Huber regularisation, local convergence in a neighbourhood of the true solution can be guaranteed.
6 Joint Estimation of the Spin-Proton Density and Coil Sensitivities in Parallel MRI
We want to demonstrate the numerical capabilities of Algorithm 1 by applying it to the nonlinear problem of joint estimation of the spin-proton density and the coil sensitivities in parallel MRI. The discrete problem of joint reconstruction from sub-sampled k-space data on a rectangular grid reads
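The discrete problem display is not reproduced here; consistent with the operators introduced below, it should take the form (the data weights \(\lambda_j\) are only introduced later in the text)

```latex
\min_{u, c_1, \ldots, c_n} \; \sum_{j=1}^{n} \frac{1}{2} \bigl\Vert S \mathcal{F} \, G_j(u, c_j) - f_j \bigr\Vert_2^2 + \alpha_0 R_0(u) + \sum_{j=1}^{n} \alpha_j R_j(c_j).
```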
where \(\mathcal {F}\) is the 2D discrete Fourier transform, \(f_j\) are the k-space measurements for each of the n coils, S is the sub-sampling operator and \(R_j\) denote appropriate regularisation functionals. The nonlinear operator G maps the unknown spin-proton density u and the different coil sensitivities \(c_j\) as follows [21]:
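The display defining G is missing; following the joint-estimation model of [21], it should map to the pointwise products of density and sensitivities,

```latex
G(u, c_1, \ldots, c_n) = \bigl( u \cdot c_1, \; \ldots, \; u \cdot c_n \bigr),
```

where \(u \cdot c_j\) denotes pixel-wise multiplication.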
In order to compensate for sub-sampling artefacts in sub-sampled MRI it is common practice to use total variation as a regulariser [6, 17]. Coil sensitivities are assumed to be smooth, cf. Fig. 1, motivating a reconstruction model similar to the one proposed in [13]. We therefore choose the discrete isotropic total variation, \(R_0(u) = \Vert \nabla u \Vert _{2, 1}\), and the smooth 2-norm of the discretised gradient, i.e. \(R_j(c_j) := \Vert \nabla c_j \Vert _{2, 2}\), for all \(j > 0\), following the notation in [4]. We further introduce regularisation parameters \(\lambda _j\) in front of the data fidelities and rescale all regularisation parameters such that \(\alpha _0 + \frac{1}{n}\left( \sum _{j = 1}^n \lambda _j + \sum _{j = 1}^n \alpha _j \right) = 1\). In order to realise this model via Algorithm 1 we consider the following operator splitting strategy. We define \(F(u_0, \ldots , u_n, v_0, \ldots , v_{2n})\) as
set \(H(u_0, \ldots , u_n) \equiv 0\), and \(J(v_0, \ldots , v_{2n}) = \sum _{j = 0}^{2n} J_j(v_j)\) with \(J_j(v_j) := \frac{\lambda _j}{2}\Vert S\mathcal {F} v_j - f_j \Vert _2^2\) for \(j \in \{0, \ldots , n - 1\}\), \(J_n(v_n) = \alpha _0 \Vert v_n \Vert _{2, 1}\) and \(J_j(v_j) = \alpha _{j - n} \Vert v_j \Vert _{2, 2} \) for \(j \in \{ n + 1, \ldots , 2n\}\). Note that with these choices of functions, all the resolvent operations can be carried out easily. In particular, we obtain
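As an illustration of how simple these resolvents are, consider \(J_n\): its resolvent is pixel-wise vector soft-thresholding of the gradient field. A minimal NumPy sketch (our illustration, not the authors' code; all names are ours):

```python
import numpy as np

def prox_l21(b, alpha, delta):
    """Resolvent (I + (1/delta) * d(alpha * ||.||_{2,1}))^{-1} applied to b.

    b has shape (2, m, n): the two components of a discrete gradient field.
    Each pixel's 2-vector is shrunk towards zero by alpha/delta
    (pixel-wise vector soft-thresholding).
    """
    # Pixel-wise Euclidean norm of the gradient vectors.
    norms = np.sqrt(np.sum(np.abs(b) ** 2, axis=0, keepdims=True))
    # Shrinkage factor; the lower clamp on the norm avoids division by zero.
    factors = np.maximum(0.0, 1.0 - (alpha / delta) / np.maximum(norms, 1e-12))
    return factors * b
```

The resolvents of the quadratic data fidelities \(J_j\), \(j < n\), reduce to linear systems that are diagonalised by the Fourier transform and can be implemented analogously.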
Moreover, as \(B^k = -I\) (and thus \(\Vert B^k \Vert = 1\)) for all k, we can simply eliminate \(\tau _2^k\) by setting it to \(1/\delta \), as in Sect. 5.
6.1 Experimental Setup
We now describe the experimental setup. We want to reconstruct the synthetic brain phantom in Fig. 1a from sub-sampled k-space measurements. The numerical phantom is based on the design in [1] with a matrix size of \(190 \times 190\). It consists of several different tissue types such as cerebrospinal fluid (CSF), gray matter (GM), white matter (WM) and cortical bone. Each pixel is assigned a set of MR tissue properties: relaxation times \(\text {T}_1(x,y)\) and \(\text {T}_2(x,y)\) and spin density \(\rho (x,y)\). These parameters were also selected according to [1]. The MR signal s(x, y) in each pixel was then calculated using the signal equation of a fluid-attenuated inversion recovery (FLAIR) sequence [5]:
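The signal equation display is not reproduced in this version. The standard inversion-recovery form (cf. [5]), which is consistent with the CSF-nulling choice of TI below when the \(e^{-\mathrm{TR}/\mathrm{T}_1}\) recovery term is neglected, reads

```latex
s(x, y) = \rho(x, y)\,\Bigl(1 - 2\,e^{-\mathrm{TI}/\mathrm{T}_1(x, y)} + e^{-\mathrm{TR}/\mathrm{T}_1(x, y)}\Bigr)\, e^{-\mathrm{TE}/\mathrm{T}_2(x, y)}.
```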
The sequence parameters were selected as TR = 10000 ms and TE = 90 ms. TI was set to 1781 ms (\(\text {TI} = \text {T}_1^\text {csf} \log (2)\) with \(\text {T}_1^\text {csf} = 2569\,\text {ms}\)) to achieve signal nulling of the CSF.
In order to generate artificial k-space measurements for each coil, we proceed as follows. First, we produce 8 images of the brain phantom multiplied by the measured coil sensitivity maps shown in Fig. 1c–j. The coil sensitivity maps were generated from measurements of a water bottle with an 8-channel head coil array. Then we produce artificial k-space data by applying the 2D discrete Fourier transform to each of these individual images. Subsequently, we sub-sample only approx. 25% of each k-space dataset via the spiral shown in Fig. 1b. Finally, we add Gaussian noise with standard deviation \(\sigma \) to the sub-sampled data.
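The data generation steps above can be sketched in a few lines of NumPy (our illustration, not the authors' code; all names are ours):

```python
import numpy as np

def simulate_kspace(phantom, coil_maps, mask, sigma, rng):
    """Simulate sub-sampled multi-coil k-space data.

    phantom:   (m, n) complex image
    coil_maps: (n_coils, m, n) coil sensitivity maps
    mask:      (m, n) boolean sub-sampling pattern (e.g. a spiral)
    sigma:     standard deviation of the complex Gaussian noise
    rng:       numpy random Generator
    """
    # Step 1: weight the phantom by each coil sensitivity map.
    coil_images = coil_maps * phantom[None, ...]
    # Step 2: 2D discrete Fourier transform per coil image.
    kspace = np.fft.fft2(coil_images, axes=(-2, -1))
    # Step 3: keep only the sampled k-space locations.
    sampled = mask[None, ...] * kspace
    # Step 4: add complex Gaussian noise to the sub-sampled data.
    noise = sigma * (rng.standard_normal(kspace.shape)
                     + 1j * rng.standard_normal(kspace.shape))
    return sampled + mask[None, ...] * noise
```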
6.2 Computations
For the actual computations we use two noisy versions \(f_j\) of the simulated k-space data; one with a low noise level (\(\sigma = 0.05\)) and one with a high noise level (\(\sigma = 0.95\)). As a stopping criterion we simply choose a fixed number of iterations; for both the low and the high noise level dataset we fixed this number to 1500. The initial values for the algorithm are \(u_j^0 = \mathbf 1 \), with \(\mathbf 1 \in \mathbb {R}^{l \times 1}\) being the constant one-vector, for all \(j \in \{0, \ldots , n\}\). All other initial variables (\(v^0\), \(\mu ^0\), \(\overline{\mu }^0\)) are set to zero.
Low Noise Level. We have computed reconstructions from the noisy data with noise level \(\sigma = 0.05\) via Algorithm 1, with regularisation parameters set to \(\lambda _j = 0.0621\), \(\alpha _0 = 0.0062\) and \(\alpha _j = 0.9317\) for \(j \in \{1, \ldots , n\}\). We have further created a naïve reconstruction by averaging the individual inverse Fourier-transformed images obtained from zero-filling the k-space data. The modulus images of the results are visualised in Fig. 2. The PSNR value for the averaged zero-filled reconstruction is 10.2185, whereas the PSNR of the reconstruction with the proposed method is 24.5572.
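For reference, PSNR values of this kind can be computed along the following lines (the paper does not state its exact peak/normalisation convention, so this is one common choice and ours, not the authors' code):

```python
import numpy as np

def psnr(reconstruction, ground_truth):
    """Peak signal-to-noise ratio in dB.

    Uses the maximum modulus of the ground truth as the peak value;
    other conventions (e.g. a fixed dynamic range) are also common.
    """
    mse = np.mean(np.abs(reconstruction - ground_truth) ** 2)
    peak = np.max(np.abs(ground_truth))
    return 10.0 * np.log10(peak ** 2 / mse)
```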
High Noise Level. We proceeded as in the previous section, but for noisy data with noise level \(\sigma = 0.95\). The regularisation parameters were set to \(\lambda _j = 0.0149\), \(\alpha _0 = 0.0135\) and \(\alpha _j = 0.9716\) for \(j \in \{1, \ldots , n\}\). The modulus images of the results are visualised in Fig. 3. The PSNR value for the averaged zero-filled reconstruction is 9.9621, whereas the PSNR of the reconstruction with the proposed method is 16.672.
7 Conclusions and Outlook
We have presented a novel algorithm that allows computing minimisers of a sum of convex functionals subject to a nonlinear operator constraint. We have shown the connection to the recently proposed NL-PDHGM algorithm, which implies local convergence results in analogy to those derived in [22]. Subsequently, we have demonstrated the computational capabilities of the algorithm by applying it to a nonlinear joint reconstruction problem in parallel MRI.
For future work, the convergence of the algorithm in the general setting has to be verified, and possible extensions that guarantee global convergence have to be studied. Generalisations of stopping criteria, such as a linearised primal-dual gap, will also be of interest. With respect to the presented parallel MRI application, exact conditions for convergence (like the exact norm bounds) have to be verified. The impact of the algorithm parameters as well as the regularisation parameters on the reconstruction has to be analysed, and a rigorous study with artificial and real data would also be desirable. Moreover, future research will focus on alternative regularisation functions, e.g. based on spherical harmonics, motivated by [20]. Last but not least, other applications that can be modelled via (1) should be considered in future research.
References
Aubert-Broche, B., Evans, A.C., Collins, L.: A new improved version of the realistic digital brain phantom. NeuroImage 32(1), 138–145 (2006)
Bachmayr, M., Burger, M.: Iterative total variation schemes for nonlinear inverse problems. Inverse Prob. 25(10), 105004 (2009)
Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)
Benning, M., Gladden, L., Holland, D., Schönlieb, C.-B., Valkonen, T.: Phase reconstruction from velocity-encoded MRI measurements – a survey of sparsity-promoting variational approaches. J. Magn. Reson. 238, 26–43 (2014)
Bernstein, M.A., King, K.F., Zhou, X.J.: Handbook of MRI Pulse Sequences. Elsevier, Amsterdam (2004)
Block, K.T., Uecker, M., Frahm, J.: Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint. Magn. Reson. Med. 57(6), 1086–1098 (2007)
Bonettini, S., Loris, I., Porta, F., Prato, M.: Variable metric inexact line-search based methods for nonsmooth optimization. SIAM J. Optim. 26, 891–921 (2015)
Candes, E.J., et al.: Compressive sampling. In: Proceedings of the International Congress of Mathematicians, vol. 3, Madrid, Spain, pp. 1433–1452 (2006)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006)
Gabay, D.: Applications of the method of multipliers to variational inequalities. Stud. Math. Appl. 15, 299–331 (1983)
Goldstein, T., Osher, S.: The split Bregman method for L1-regularized problems. SIAM J. Imaging Sci. 2(2), 323–343 (2009)
Knoll, F., Clason, C., Bredies, K., Uecker, M., Stollberger, R.: Parallel imaging with nonlinear reconstruction using variational penalties. Magn. Reson. Med. 67(1), 34–41 (2012)
Möllenhoff, T., Strekalovskiy, E., Möller, M., Cremers, D.: The primal-dual hybrid gradient method for semiconvex splittings. SIAM J. Imaging Sci. 8(2), 827–857 (2015)
Möller, M., Benning, M., Schönlieb, C., Cremers, D.: Variational depth from focus reconstruction. IEEE Trans. Image Process. 24(12), 5369–5378 (2015)
Ochs, P., Chen, Y., Brox, T., Pock, T.: iPiano: inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2), 1388–1419 (2014)
Ramani, S., Fessler, J., et al.: Parallel MR image reconstruction using augmented Lagrangian methods. IEEE Trans. Med. Imaging 30(3), 694–706 (2011)
Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series, 46:49. Princeton University Press, Princeton (1970)
Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D: Nonlinear Phenom. 60(1), 259–268 (1992)
Sbrizzi, A., Hoogduin, H., Lagendijk, J.J., Luijten, P., van den Berg, C.A.T.: Robust reconstruction of B1+ maps by projection into a spherical functions space. Magn. Reson. Med. 71(1), 394–401 (2014)
Uecker, M., Hohage, T., Block, K.T., Frahm, J.: Image reconstruction by regularized nonlinear inversion – joint estimation of coil sensitivities and image content. Magn. Reson. Med. 60(3), 674–682 (2008)
Valkonen, T.: A primal-dual hybrid gradient method for nonlinear operators with applications to MRI. Inverse Prob. 30(5), 055012 (2014)
Zhang, X., Burger, M., Osher, S.: A unified primal-dual algorithm framework based on Bregman iteration. J. Sci. Comput. 46(1), 20–46 (2011)
Acknowledgments
MB, CS and TV acknowledge EPSRC grant EP/M00483X/1. FK acknowledges National Institutes of Health grant NIH P41 EB017183.
EPSRC Data Statement: the corresponding code and data are available for download at https://www.repository.cam.ac.uk/handle/1810/256221.
© 2016 IFIP International Federation for Information Processing
Benning, M., Knoll, F., Schönlieb, CB., Valkonen, T. (2016). Preconditioned ADMM with Nonlinear Operator Constraint. In: Bociu, L., Désidéri, JA., Habbal, A. (eds) System Modeling and Optimization. CSMO 2015. IFIP Advances in Information and Communication Technology, vol 494. Springer, Cham. https://doi.org/10.1007/978-3-319-55795-3_10
Print ISBN: 978-3-319-55794-6
Online ISBN: 978-3-319-55795-3