1 Introduction

Laser-induced interstitial thermotherapy (LITT) is a minimally invasive, local therapy used to destroy tumors through thermal ablation. For this, laser radiation is transmitted by an optical fiber to an application system (see Fig. 1) that is inserted into the tumorous tissue. Absorption of the radiation by the tissue results in a temperature increase around the applicator, which destroys the tumor cells through coagulation. The goal of the therapy is to completely destroy the tumor while protecting the healthy tissue. To reach this goal, computer simulations can assist physicians in the planning and monitoring of treatments. However, such simulations can only yield reliable results if all necessary parameters are known. While good measurements are typically available for many tissue parameters, a critical issue is the determination of the blood perfusion rate, which models the cooling effect induced by blood vessels.

Figure 1

Schematic of the setup and the boundary decomposition

The blood perfusion rate is a nonlocal quantity whose magnitude depends on the presence of blood vessels as well as on their shape and size. The induced cooling effect significantly influences the temperature. Knowledge of the location of the blood vessels in the vicinity of tumorous tissue (and, thus, close to the applicator) is crucial for the performance of the therapy as well as for the reliability of a simulation model, as discussed, e.g., in [15]. Unfortunately, the location relative to the applicator varies for each patient and treatment. The location of major vessels can be determined a priori, e.g., with the help of image decomposition techniques. However, this is tedious and may give only a rather poor approximation of the actual perfusion rate.

A more promising approach for the identification of the blood perfusion rate is to use temperature measurements obtained by magnetic resonance (MR) thermometry. MR thermometry methods are based on temperature-dependent parameters that can be measured by MR, such as the longitudinal relaxation time, the diffusion coefficient, or the proton resonance frequency (PRF) of tissue water. The linear temperature dependence of the proton resonance frequency and its near-independence of the tissue type make PRF-based methods the preferred choice for many applications. For a deeper introduction to MR thermometry we refer to the review paper [6].

The idea proposed in this paper is to identify the perfusion rate in a short time period during the beginning of the treatment from MR thermometry data. This information can then be used to simulate the remaining treatment. This is of great benefit for LITT as it can be integrated into an online therapy monitoring and prediction tool, individualized for every patient. In the following we derive a parameter identification approach for the blood perfusion rate from MR thermometry data using techniques from optimal control for partial differential equations. Synthetic measurements are used to demonstrate the approach.

This paper is structured as follows. In Sect. 2 we introduce our mathematical model of LITT. Section 3 gives the details of the parameter identification problem and its formulation as a PDE constrained optimization problem. The numerical solution of the problem is described in Sect. 4. The validation of our method is done in Sect. 5, where we discuss the results of some model problems using synthetic measurements.

2 Mathematical model

For the modeling of LITT we use the same model as in [7], which was originally proposed in [8]. The model is summarized in the following. The liver tissue is denoted by \(\varOmega \subset \mathbb {R}^{3}\). Its boundary Γ consists of the following three parts: the ambient boundary \(\varGamma _{{\mathrm {amb}}}\), i.e., the surface of the liver, as well as two parts corresponding to the interface between tissue and laser applicator (the latter is not part of Ω): \(\varGamma _{{\mathrm {cool}}}\), the part of the applicator boundary that is cooled, and \(\varGamma _{{\mathrm {rad}}}\), the part of the boundary where radiation is emitted. Further, t and τ denote the time [s] and the time horizon of the simulation [s], respectively.

To model the tissue temperature during LITT we use Pennes’ bio-heat equation [9] that is coupled with a P1-approximation for the laser radiation [10] and the Arrhenius law that models tissue damage [8]. The bio-heat equation [9] reads

$$ \textstyle\begin{cases} \rho C_{p}\dot{T} - \nabla \cdot (\kappa \nabla T) + \xi (T- T_{{\mathrm {b}}}) = \mu _{a} \varphi &\text{in } (0, \tau ) \times \varOmega , \\ \kappa \partial _{n}T + \alpha (T- T_{{\mathrm {ext}}}) = 0 &\text{on } (0, \tau ) \times \varGamma , \\ T(0, \cdot ) = T_{0} &\text{in } \varOmega , \end{cases} $$
(1)

where T is the tissue temperature [K], and κ, ρ and \(C_{p}\) denote the liver’s thermal conductivity [W/(m K)], density [kg/m3] and specific heat capacity [J/(kg K)], respectively. The term \(\xi (T- T_{{\mathrm {b}}})\) models the heat transfer between blood vessels and liver tissue, where ξ denotes the perfusion rate [W/(K m3)] and \(T_{{\mathrm {b}}}\) is the blood temperature [K]. As the perfusion rate is typically unknown, we propose a method for identifying this quantity later on. The energy generated by the laser enters the equation as a source term, where \(\mu _{a}\) is the absorption coefficient [1/m] and φ is the radiative energy [W/m2], which we explain in the following paragraph (cf. (2)). The term \(\partial _{n}T\) denotes the normal derivative of T in the direction of the outer unit normal vector n, i.e., \(\partial _{n}T= n\cdot \nabla T\). The heat transfer coefficient [W/(K m2)] is denoted by α and the external temperature [K] by \(T_{{\mathrm {ext}}}\). Note that these two parameters vary over the boundary; in particular, we have \(\alpha = \alpha _{{\mathrm {cool}}}\) and \(T_{{\mathrm {ext}}}= T_{{\mathrm {cool}}}\) on \(\varGamma _{{\mathrm {cool}}}\cup \varGamma _{{\mathrm {rad}}}\) as well as \(\alpha = \alpha _{{\mathrm {amb}}}\) and \(T_{{\mathrm {ext}}}= T_{{\mathrm {amb}}}\) on \(\varGamma _{{\mathrm {amb}}}\). Finally, the initial temperature of the tissue [K] is denoted by \(T_{0}\).

We model the radiative energy φ via the P1-approximation [10, 11]:

$$ \textstyle\begin{cases} - \nabla \cdot (D\nabla \varphi ) + \mu _{a} \varphi = 0 &\text{in } (0, \tau ) \times \varOmega , \\ D \partial _{n}\varphi = q &\text{on } (0, \tau ) \times ( \varGamma _{{\mathrm {cool}}}\cup \varGamma _{{\mathrm {rad}}}), \\ D \partial _{n}\varphi + \frac{1}{2} \varphi = 0 &\text{on } (0, \tau ) \times \varGamma _{{\mathrm {amb}}}, \end{cases} $$
(2)

where D denotes the diffusion coefficient [m] that is defined as

$$ D = \frac{1}{3 ( \mu _{a} + \mu _{s} ( 1 - g ) )}. $$

Here, \(\mu _{s}\) is the scattering coefficient [1/m] and g is the anisotropy factor. Furthermore, q is a boundary source that models the radiation coming from the laser applicator. On \(\varGamma _{{\mathrm {cool}}}\) there is no radiation, so \(q = 0\), and on \(\varGamma _{{\mathrm {rad}}}\) we have \(q = \frac{q_{\mathrm{app}}}{ \lvert \varGamma _{{\mathrm {rad}}}\rvert }\), where \(\lvert \varGamma _{{\mathrm {rad}}}\rvert \) denotes the area of the radiating surface \(\varGamma _{{\mathrm {rad}}}\) and \(q_{\mathrm{app}}\) is the effective laser power. The latter is given by

$$ q_{\mathrm{app}}(t) = \textstyle\begin{cases} (1 - \beta _{q}) \hat{q}_{\mathrm{app}} &\text{if } t_{\mathrm{on}} \leq t \leq t_{\mathrm{off}}, \\ 0 &\text{otherwise}, \end{cases} $$

where \(\hat{q}_{\mathrm{app}}\) is the total power emitted by the laser [W] and \(\beta _{q}\) is the coolant absorption factor that models the absorption of energy by the coolant (cf. [7]). Additionally, \(t_{\mathrm{on}}\) and \(t_{\mathrm{off}}\) denote the times where the laser is switched on and off, respectively.
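To make the source terms concrete, the following minimal Python sketch evaluates the effective laser power \(q_{\mathrm{app}}(t)\) and the P1 diffusion coefficient D from the formulas above; all numerical values (laser power, coolant absorption factor, switching times, surface area) are placeholders rather than the values of Table 1.

```python
def q_app(t, q_hat=30.0, beta_q=0.05, t_on=0.0, t_off=1200.0):
    """Effective laser power [W]; q_hat, beta_q, t_on, t_off are placeholder values."""
    return (1.0 - beta_q) * q_hat if t_on <= t <= t_off else 0.0

def diffusion_coefficient(mu_a, mu_s, g):
    """P1 diffusion coefficient D = 1 / (3 (mu_a + mu_s (1 - g))) [m]."""
    return 1.0 / (3.0 * (mu_a + mu_s * (1.0 - g)))

# Boundary source on the radiating surface Gamma_rad with (placeholder) area A_rad [m^2]
A_rad = 1.0e-4
q_boundary = q_app(60.0) / A_rad
```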

Finally, we consider the coagulation of the tissue molecules using Arrhenius’ law (cf. [8]) which reads

$$ \omega (t, x) = \int _{0}^{t} A\exp \biggl( - \frac{E_{a}}{RT(\theta , x)} \biggr)\,\mathrm{d}\theta , $$
(3)

where A denotes the frequency factor [1/s], \(E_{a}\) is the activation energy [J/mol] and R the universal gas constant [J/(mol K)]. It is important to note that this quantity does not depend only on the temperature at the current time t, but on its entire history. As tissue damage substantially influences the optical parameters, we model the effect of coagulation on them according to [7, 8] by

$$ \textstyle\begin{cases} \mu _{a} = \mu _{an} + (1 - \exp (-\omega ) ) (\mu _{ac} - \mu _{an}), \\ \mu _{s} = \mu _{sn} + (1 - \exp (-\omega ) ) (\mu _{sc} - \mu _{sn}), \\ g = g_{n} + (1 - \exp (-\omega ) ) (g_{c} - g_{n}), \end{cases} $$

where the subscript n denotes the native value of the respective parameter and the subscript c its coagulated value. The parameter values used in this paper are shown in Table 1 (cf. [7, 12–15]). Note that these parameters represent healthy tissue. It is, however, easily possible to account for tumorous tissue through local variations of the corresponding parameters.
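As an illustration of how the damage integral (3) and the parameter blending above can be evaluated along a discrete temperature history, consider the following Python sketch; the frequency factor, activation energy and the native/coagulated parameter values used here are placeholders, not the values of Table 1.

```python
import numpy as np

R = 8.314  # universal gas constant [J/(mol K)]

def arrhenius_increment(T, dt, A=3.1e98, E_a=6.28e5):
    """Rectangle-rule contribution of one time step to omega (A and E_a are placeholders)."""
    return dt * A * np.exp(-E_a / (R * T))

def blend(native, coagulated, omega):
    """Interpolate an optical parameter between its native and coagulated value."""
    return native + (1.0 - np.exp(-omega)) * (coagulated - native)

# Accumulate the damage over a synthetic nodewise temperature history (dt = 1 s)
omega = np.zeros(100)                       # one value per grid node
for T in np.linspace(310.0, 370.0, 60):     # placeholder temperature history [K]
    omega += arrhenius_increment(T, dt=1.0)

mu_a = blend(50.0, 60.0, omega)             # placeholder native/coagulated absorption [1/m]
```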

Table 1 Values for the various parameters of the liver tissue

3 Parameter identification

We now formulate the parameter identification problem and relate it to an optimal control problem, which is subsequently treated with the help of the adjoint approach.

3.1 Problem formulation

Given a temperature measurement \(T_{{\mathrm {meas}}}\) at a certain time we want to identify (or reconstruct) the blood perfusion rate ξ that induced the measured temperature distribution. Without loss of generality we assume that the time of the measurement coincides with the time horizon τ. Later on, we choose a time horizon that is much smaller than the end time of the therapy \(\tau _{{\mathrm {end}}}\) to identify the perfusion rate at the beginning of the treatment. We assume that our state equations (1) and (2) admit a unique solution (cf. [16]). We define the cost functional \(J(T, \xi )\) as

$$ J(T, \xi ) := \frac{1}{2} \int _{\varOmega } \bigl\lvert T(\tau , \cdot ) - T_{{\mathrm {meas}}}\bigr\rvert ^{2}\,\mathrm{d}x + \frac{\lambda }{2} \int _{\varOmega } \lvert \xi \rvert ^{2}\,\mathrm{d}x, $$
(4)

which is then used to model the parameter identification problem described above with the following minimization problem:

$$ \min_{\xi \in \mathcal{A}, T} J(T, \xi ) \quad \text{s.t. (1) and (2)}. $$
(5)

Here, the perfusion rate ξ plays the role of a control that is used to “steer” the state, i.e., the simulated temperature, to the desired state, i.e., the temperature measurement. Further, \(\mathcal{A} \subset L^{2}(\varOmega )\) denotes the set of admissible perfusion rates, which is used to model so-called control constraints; e.g., only nonnegative perfusion rates are physically meaningful. The first term (also known as the observation term) penalizes the difference between \(T(\tau , \cdot )\) and \(T_{{\mathrm {meas}}}\). This means that we try to compute a perfusion rate such that the resulting temperature distribution at time τ is close to the measured temperature. The second term is a so-called Tikhonov regularization with regularization parameter \(\lambda \geq 0\), which is used to stabilize the (possibly) ill-posed problem (cf. [17, 18]).
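In a discrete setting, the cost functional (4) can be evaluated by realizing the \(L^{2}(\varOmega )\) inner products with the finite element mass matrix. The following short Python sketch illustrates this; the mass matrix M and the nodal coefficient vectors are assumed to be given.

```python
def cost_functional(T_tau, T_meas, xi, M, lam):
    """Discrete version of (4): 0.5*||T(tau)-T_meas||_M^2 + 0.5*lam*||xi||_M^2,
    where M is the finite element mass matrix and T_tau, T_meas, xi are nodal vectors."""
    d = T_tau - T_meas
    return 0.5 * d @ (M @ d) + 0.5 * lam * (xi @ (M @ xi))
```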

We reformulate the optimization problem (5) equivalently thanks to our assumption that the state equations admit a unique solution. To do this, we denote by \(T[\xi ]\) the solution of (1) and (2) with blood perfusion rate ξ. We introduce the reduced cost functional Ĵ by

$$ \hat{J}(\xi ) := J\bigl(T[\xi ], \xi \bigr), $$
(6)

and see that (5) is equivalent to the reduced problem

$$ \min_{\xi \in \mathcal{A}}\hat{J}(\xi ) = \frac{1}{2} \int _{ \varOmega } \bigl\lvert T[\xi ](\tau , \cdot ) - T_{{\mathrm {meas}}}\bigr\rvert ^{2}\,\mathrm{d}x + \frac{\lambda }{2} \int _{\varOmega } \lvert \xi \rvert ^{2}\,\mathrm{d}x, $$
(7)

where the PDE constraint is formally eliminated. To solve this minimization problem, we apply techniques from PDE-constrained optimization. In particular, we compute the gradient of the reduced cost functional Ĵ which is then used to solve problem (7) numerically with a gradient descent or a quasi-Newton method (cf. Sect. 4). For a detailed introduction to optimization problems constrained by PDEs and their (numerical) solution we refer, e.g., to [1921].

3.2 Adjoint-based identification

To compute the gradient of Ĵ, we use the formal Lagrange method of [19, Chap. 2.10]. For this purpose, we set up a Lagrangian \(\mathcal{L}(\xi , T, \varphi , p, \psi )\), where p and ψ are used as Lagrange multipliers for the PDE constraints (1) and (2). Then, the first order optimality conditions for a stationary point of the Lagrangian (and, therefore, for a minimizer of (5)) are given by the system

$$\begin{aligned}& \frac{\partial \mathcal{L}}{\partial p}(\xi , T, \varphi , p, \psi )[ \hat{p}] = 0 \quad \text{for all } \hat{p}, \end{aligned}$$
(8)
$$\begin{aligned}& \frac{\partial \mathcal{L}}{\partial \psi }(\xi , T, \varphi , p, \psi )[\hat{\psi }] = 0 \quad \text{for all } \hat{\psi }, \end{aligned}$$
(9)
$$\begin{aligned}& \frac{\partial \mathcal{L}}{\partial T}(\xi , T, \varphi , p, \psi )[ \hat{T}] = 0 \quad \text{for all } \hat{T}, \end{aligned}$$
(10)
$$\begin{aligned}& \frac{\partial \mathcal{L}}{\partial \varphi }(\xi , T, \varphi , p, \psi )[\hat{\varphi }] = 0 \quad \text{for all } \hat{\varphi }, \quad \text{and} \end{aligned}$$
(11)
$$\begin{aligned}& \frac{\partial \mathcal{L}}{\partial \xi }(\xi , T, \varphi , p, \psi )[\hat{\xi } - \xi ] \geq 0 \quad \text{for all } \hat{\xi } \in \mathcal{A}. \end{aligned}$$
(12)

For our problem (5), this Lagrangian is given by

$$ \textstyle\begin{cases} \mathcal{L}(\xi , T, \varphi , p, \psi ) = \frac{1}{2} \int _{ \varOmega } \lvert T(\tau , \cdot ) - T_{{\mathrm {meas}}}\rvert ^{2}\,\mathrm{d}x + \frac{\lambda }{2} \int _{\varOmega } \lvert \xi \rvert ^{2}\,\mathrm{d}x \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{0}^{\tau } \int _{\varOmega } \rho C_{p}\dot{T} p_{1}\,\mathrm{d}x \,\mathrm{d}t + \int _{0}^{\tau } \int _{\varOmega } \nabla \cdot (\kappa \nabla T) p_{1}\,\mathrm{d}x \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{0}^{\tau } \int _{\varOmega } \xi (T- T_{{\mathrm {b}}}) p_{1}\,\mathrm{d}x \,\mathrm{d}t + \int _{0}^{\tau } \int _{\varOmega } \mu _{a} \varphi p_{1}\,\mathrm{d}x \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{0}^{\tau } \int _{\varGamma } \kappa n\cdot \nabla T p_{2}\,\mathrm{d}s \,\mathrm{d}t - \int _{0}^{\tau } \int _{\varGamma } \alpha (T- T_{{\mathrm {ext}}}) p_{2}\,\mathrm{d}s \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{\varOmega } (T(0, \cdot ) - T_{0}) p_{3}\,\mathrm{d}x + \int _{0}^{ \tau } \int _{\varOmega } \nabla \cdot (D \nabla \varphi ) \psi _{1}\,\mathrm{d}x \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{0}^{\tau } \int _{\varOmega } \mu _{a} \varphi \psi _{1}\,\mathrm{d}x \,\mathrm{d}t - \int _{0}^{\tau } \int _{\varGamma _{{\mathrm {rad}}}} D n\cdot \nabla \varphi \psi _{2}\,\mathrm{d}s \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} + \int _{0}^{\tau } \int _{\varGamma _{{\mathrm {rad}}}} \frac{q_{\mathrm{app}}}{ \lvert \varGamma _{{\mathrm {rad}}}\rvert } \psi _{2}\,\mathrm{d}s \,\mathrm{d}t - \int _{0}^{\tau } \int _{ \varGamma _{{\mathrm {cool}}}} D n\cdot \nabla \varphi \psi _{3}\,\mathrm{d}s \,\mathrm{d}t \\ \hphantom{\mathcal{L}(\xi , T, \varphi , p, \psi ) =}{} - \int _{0}^{\tau } \int _{\varGamma _{{\mathrm {amb}}}} D n\cdot \nabla \varphi \psi _{4}\,\mathrm{d}s \,\mathrm{d}t - \int _{0}^{\tau } \int _{\varGamma _{{\mathrm {amb}}}} \frac{1}{2} \varphi \psi _{4}\,\mathrm{d}s \,\mathrm{d}t, \end{cases} $$
(13)

where \(p= [p_{1}, p_{2}, p_{3}]\) and \(\psi = [\psi _{1}, \psi _{2}, \psi _{3}, \psi _{4}]\). As the Lagrangian is linear in p and ψ, we see that equations (8) and (9) of the optimality system are equivalent to the state equations (1) and (2) that constrain the optimization problem (5).

Lengthy calculations show that (10) and (11) give rise to the conditions

$$\begin{aligned} &\psi _{2} = \psi _{1} \quad \text{on } (0, \tau ) \times \varGamma _{{\mathrm {rad}}}, \qquad \psi _{3} = \psi _{1} \quad \text{on } (0, \tau ) \times \varGamma _{{\mathrm {cool}}}, \\ &\psi _{4} = \psi _{1} \quad \text{on } (0, \tau ) \times \varGamma _{{\mathrm {amb}}}, \qquad p_{2} = p_{1} \quad \text{on } (0, \tau ) \times \varGamma , \quad \text{and} \\ &p_{3} = \rho C_{p}p_{1}(0), \end{aligned}$$

such that we only have to consider the multipliers \(p_{1}\) and \(\psi _{1}\). Therefore, in the following we drop the index and only write \(p= p_{1}\) as well as \(\psi = \psi _{1}\). Furthermore, with the above conditions, (10) and (11) are equivalent to the following system of adjoint equations:

$$ \textstyle\begin{cases} - \rho C_{p}\dot{p} - \nabla \cdot (\kappa \nabla p) + \xi p+ F_{1} + F_{2} = 0 &\text{in } (0, \tau ) \times \varOmega , \\ \kappa \partial _{n}p+ \alpha p= 0 &\text{on } (0, \tau ) \times \varGamma , \\ p(\tau , \cdot ) - \frac{1}{\rho C_{p}} ( T(\tau , \cdot ) - T_{{\mathrm {meas}}}) = 0 &\text{in } \varOmega , \end{cases} $$
(14)

where

$$\begin{aligned}& F_{1} = A\exp \biggl( -\frac{E_{a}}{RT} \biggr) \frac{E_{a}}{RT^{2}} \int _{t}^{\tau } \frac{\partial \mu _{a}}{\partial \omega } \varphi ( \psi - p ) \,\mathrm{d}\theta , \\& F_{2} = A\exp \biggl( -\frac{E_{a}}{RT} \biggr) \frac{E_{a}}{RT^{2}} \int _{t}^{\tau } \frac{\partial D}{\partial \omega } \nabla \varphi \cdot \nabla \psi \,\mathrm{d}\theta , \end{aligned}$$

and ψ solves

$$ \textstyle\begin{cases} -\nabla \cdot (D \nabla \psi ) + \mu _{a} \psi = \mu _{a} p&\text{in } (0, \tau ) \times \varOmega , \\ D \partial _{n}\psi = 0 &\text{on } (0, \tau ) \times ( \varGamma _{{\mathrm {cool}}}\cup \varGamma _{{\mathrm {rad}}}), \\ D \partial _{n}\psi + \frac{1}{2} \psi = 0 &\text{on } (0, \tau ) \times \varGamma _{{\mathrm {amb}}}. \end{cases} $$
(15)

Moreover, we remark that, as usual, the adjoint (bio-)heat equation is posed backward in time. For its analysis and numerical solution one can introduce the time shift \(\varTheta = \tau - t\) and then solve an equation that evolves forward in Θ (cf. [19, 20]).
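In practice, this simply means that the stored forward states are visited in reverse order while the adjoint is advanced with the same time steps. A minimal Python sketch of this bookkeeping, with the single-step adjoint solver left abstract, could look as follows.

```python
def solve_adjoint(T_history, p_terminal, adjoint_step, dt):
    """Sketch of the time reversal Theta = tau - t for the adjoint equation (14).

    T_history   : forward states at t_0 < ... < t_N = tau (already computed and stored)
    p_terminal  : p(tau, .) from the terminal condition of (14)
    adjoint_step: advances the adjoint by one implicit Euler step in Theta (assumed given)
    """
    p = p_terminal
    p_history = [p]
    for T in reversed(T_history[:-1]):      # t = tau - dt, tau - 2*dt, ..., 0
        p = adjoint_step(p, T, dt)
        p_history.append(p)
    return list(reversed(p_history))        # re-ordered so that index k corresponds to t_k
```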

Finally, the optimality condition (12) is equivalent to the following variational inequality

$$ \hat{J}'(\xi ) [\hat{\xi } - \xi ] = \bigl( \hat{J}'( \xi ), \hat{\xi } - \xi \bigr)_{L^{2}(\varOmega )} \geq 0 \quad \text{for all } \hat{\xi } \in \mathcal{A}, $$
(16)

where \(( u, v )_{H}\) denotes the inner product of u and v in some Hilbert space H. Here, the gradient of the reduced cost functional is given by

$$ \hat{J}'(\xi ) = \lambda \xi + \int _{0}^{\tau } ( T_{{\mathrm {b}}}- T ) p\, \mathrm{d}t. $$
(17)
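Given stored temperature and adjoint trajectories, the gradient (17) can be assembled nodewise with a quadrature rule in time. The following Python sketch uses the trapezoidal rule; the blood temperature and regularization parameter are placeholder values.

```python
import numpy as np

def reduced_gradient(xi, T_history, p_history, dt, T_b=310.0, lam=0.0):
    """Discrete counterpart of (17): lam*xi + int_0^tau (T_b - T) p dt,
    evaluated nodewise with the trapezoidal rule in time."""
    integrand = np.stack([(T_b - T) * p for T, p in zip(T_history, p_history)])
    return lam * xi + np.trapz(integrand, dx=dt, axis=0)
```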

This relation is used in Sect. 4 for the numerical solution of the parameter identification problem (5). Our results are summarized in Proposition 3.1.

Proposition 3.1

Let \(\mathcal{A}\) be a convex subset of \(L^{2}(\varOmega )\). The first order necessary conditions for \(\xi ^{*}\) being a minimizer of (7) are given by the state equations (1) and (2), the adjoint equations (14) and (15), as well as the variational inequality (16). The gradient \(\hat{J}'(\xi )\) of the reduced cost functional is given by (17).

4 Numerical methods

In the following we discuss the numerical methods used to solve the parameter identification problem. First, we describe the numerical solution techniques for the state and adjoint equations, and then we elaborate on the algorithms used for solving the optimization problem.

4.1 Solution of the PDEs

We solve all PDEs, i.e., the state and adjoint equations, with the finite element method. For this purpose, we triangulate our domain Ω with the help of GMSH, version 2.11.0 [22]. The assembly and solution of the linear systems are done with FEniCS, version 2018.1 [23, 24] and PETSc, version 3.10.5 [25], respectively. To solve the time-dependent PDEs (1) and (14), we first discretize them in time with the implicit Euler method. Further, all PDEs are discretized in space with linear Lagrange elements. The resulting sequences of linear systems corresponding to (1), (2), and (15) are then solved with the conjugate gradient method and an incomplete Cholesky factorization as preconditioner. For the solution of the sequence of linear systems arising from (14) we use the MINRES algorithm with the Jacobi method as preconditioner, as the corresponding matrices are symmetric, but not necessarily positive definite.
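As an illustration, a single implicit Euler step of the bio-heat equation (1) could be set up in legacy FEniCS syntax roughly as follows. This is a simplified sketch: all material and boundary parameters are placeholder constants, the spatially varying coefficients (e.g., the damage-dependent \(\mu _{a}\)) are frozen, and a single Robin boundary term stands in for the decomposition of Γ used in the actual model.

```python
from fenics import *  # legacy FEniCS (dolfin) interface

mesh = Mesh("liver.xml")                  # placeholder mesh, e.g. generated with GMSH
V = FunctionSpace(mesh, "P", 1)           # linear Lagrange elements

T_old = interpolate(Constant(310.0), V)   # temperature of the previous time step [K]
phi = Function(V)                         # radiative energy from (2), assumed precomputed
xi = Function(V)                          # current perfusion rate iterate

u, v = TrialFunction(V), TestFunction(V)
dt, rho, C_p, kappa, mu_a = 1.0, 1080.0, 3640.0, 0.52, 50.0   # placeholder values
alpha, T_ext, T_b = 25.0, 293.0, 310.0                        # placeholder boundary data

# One implicit Euler step of (1); the Robin condition enters through the ds terms.
a = (rho*C_p/dt)*u*v*dx + kappa*dot(grad(u), grad(v))*dx + xi*u*v*dx + alpha*u*v*ds
L = (rho*C_p/dt)*T_old*v*dx + xi*Constant(T_b)*v*dx \
    + mu_a*phi*v*dx + alpha*Constant(T_ext)*v*ds

T_new = Function(V)
solve(a == L, T_new,
      solver_parameters={"linear_solver": "cg", "preconditioner": "icc"})
```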

4.2 Solution of the optimization problem

Let us now turn our attention to the numerical solution of the optimal control problem (5), which we solve by means of the projected quasi-Newton method described in [26]. The idea of the method is the following: Assume that we have computed the k-th iterate \(\xi _{k}\). To compute the gradient of the reduced cost functional, we first solve the state equations (1) and (2) and then the adjoint equations (14) and (15). Subsequently, we compute \(g_{k} = \hat{J}'(\xi _{k})\) via the relation (17) and, with this, the search direction \(d_{k}\) is computed by an L-BFGS (limited-memory Broyden–Fletcher–Goldfarb–Shanno) update of the form

$$ d_{k} = -H_{k}^{-1} g_{k}, $$
(18)

where \(H_{k}\) denotes the L-BFGS approximation of the (reduced) Hessian of Ĵ at \(\xi _{k}\). The application of the inverse of \(H_{k}\) is performed efficiently via the well-known BFGS double loop [26, 27]. Next, we perform a line search along the projected path given by \(\mathcal{P}(\xi _{k} + \alpha d_{k})\), where \(\mathcal{P}\) denotes the projection onto \(\mathcal{A}\), to find a suitable step size \(\alpha _{k}\). This is done by the following Armijo rule (cf. [28, 29]): Define \(\xi _{k}(\alpha ) = \mathcal{P}(\xi _{k} + \alpha d_{k})\). Then, the step size \(\alpha _{k}\) is of the form \(\alpha _{k} = \beta ^{m_{k}} \gamma \), where \(\beta \in (0, 1)\), \(\gamma >0\), and \(m_{k}\) is the smallest integer satisfying

$$ \hat{J}\bigl(\xi _{k}(\alpha _{k})\bigr) \leq \hat{J}(\xi _{k}) + \gamma \bigl( \hat{J}'(\xi _{k}), \xi _{k}(\alpha _{k}) - \xi _{k} \bigr)_{L^{2}( \varOmega )}. $$
(19)

For all of our numerical results we choose \(\beta = \frac{1}{2}\) and \(\gamma = 1\times10^{-4}\) (cf. [27]). Finally, we update the iterate by \(\xi _{k+1} = \mathcal{P}(\xi _{k} + \alpha _{k} d_{k})\). For the stopping criterion we define the stationary measure as

$$ \varSigma (\xi ) := \bigl\Vert \xi - \mathcal{P}\bigl(\xi - \hat{J}'( \xi )\bigr) \bigr\Vert _{L^{2}(\varOmega )}, $$

and terminate the method once the relative stopping criterion

$$ \varSigma (\xi _{k}) \leq \operatorname{\mathtt{tol}} \varSigma (\xi _{0}) $$
(20)

is satisfied, where tol is a user-defined tolerance (cf. [26]). This procedure is summarized in Algorithm 1. A detailed description of the optimization methods can be found in, e.g., [26, 27], and their application to PDE-constrained optimization problems is covered in, e.g., [19–21].

Algorithm 1

Projected quasi-Newton algorithm
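The following Python sketch mirrors the loop of Algorithm 1 under simplifying assumptions: Euclidean inner products stand in for the \(L^{2}(\varOmega )\) ones, the routines cost, grad and project (each evaluation of cost and grad entails state and adjoint solves) are assumed to be given, and lbfgs_direction denotes the two-loop recursion sketched after the next paragraph. The initial trial step of the line search is simplified to one.

```python
import numpy as np

def projected_quasi_newton(xi0, cost, grad, project, lbfgs_direction,
                           tol=1e-3, max_iter=20, beta=0.5, gamma=1e-4, memory=5):
    """Sketch of Algorithm 1: projected L-BFGS with an Armijo line search, cf. (18)-(20)."""
    xi = project(xi0)
    stat0 = np.linalg.norm(xi - project(xi - grad(xi)))      # reference stationary measure
    history = []                                             # stored (s_k, y_k) pairs
    for _ in range(max_iter):
        g = grad(xi)
        if np.linalg.norm(xi - project(xi - g)) <= tol * stat0:   # stopping criterion (20)
            break
        d = lbfgs_direction(g, history)                      # search direction, cf. (18)
        alpha, J_old = 1.0, cost(xi)
        while True:                                          # Armijo rule on the projected path
            xi_new = project(xi + alpha * d)
            if cost(xi_new) <= J_old + gamma * g @ (xi_new - xi):  # sufficient decrease (19)
                break
            alpha *= beta
        s, y = xi_new - xi, grad(xi_new) - g
        if s @ y > 0:                                        # curvature condition satisfied
            history.append((s, y))
            history[:] = history[-memory:]                   # keep only the last l pairs
        else:
            history = []                                     # restart: next step is a gradient step
        xi = xi_new
    return xi
```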

For the computation of the search direction with the quasi-Newton method in step 7 of Algorithm 1 (cf. (18)) we use the algorithm given in [26, Chap. 5.5.3]. This corresponds to a projected BFGS method that approximates the reduced Hessian of the problem. We implemented this by means of a limited-memory BFGS update that only uses the information of the last \(l \in \mathbb{N}\) steps. In particular, we obtain a full projected BFGS method in the case \(l = \infty \). On the other hand, for \(l=0\) we choose \(H_{k} = I\), where I denotes the identity, and the whole method reduces to the projected gradient descent method described, e.g., in [19, 20, 26]. Therefore, when we speak of using a (projected) gradient descent method we refer to the case \(l=0\), and when we speak of using a (projected) L-BFGS method we refer to the case \(l > 0\). These two methods are compared in Sect. 5. Finally, note that in case the curvature condition for the BFGS method is not satisfied, we re-initialize the method with the identity, i.e., we perform a gradient descent step (see [26, Chap. 4.2.2] for details).
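For completeness, the two-loop recursion used above to apply the inverse of the L-BFGS approximation \(H_{k}\) in (18) could be sketched as follows (standard form with the usual initial scaling; with an empty history it reduces to the gradient descent direction, i.e., the case \(l = 0\)).

```python
import numpy as np

def lbfgs_direction(g, history):
    """Two-loop recursion: returns -H_k^{-1} g for the stored (s, y) pairs in history."""
    q = g.copy()
    alphas = []
    for s, y in reversed(history):            # first loop: newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append((a, rho, s, y))
        q -= a * y
    if history:                               # standard scaling of the initial Hessian guess
        s, y = history[-1]
        q *= (s @ y) / (y @ y)
    for a, rho, s, y in reversed(alphas):     # second loop: oldest pair first
        b = rho * (y @ q)
        q += (a - b) * s
    return -q
```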

4.3 Multiple measurements

The ideas and methods described before can be generalized to the case where multiple measurements are taken during the therapy. To do so, assume that m measurements \(T_{{\mathrm {meas}}}^{(1)},\ldots, T_{{\mathrm {meas}}}^{(m)}\) are taken at times \(\tau _{1} < \tau _{2} < \cdots < \tau _{m}\) and that we have an initial guess \(\xi _{0}\) for the perfusion rate. As before, all measurements should be taken before the end of the therapy, so \(\tau _{m} < \tau _{{\mathrm {end}}}\) and, additionally, we define \(\tau _{0} = 0\).

A first approach for solving this problem would be to use Algorithm 1 on each interval \((\tau _{k}, \tau _{k+1})\) separately, where the initial temperature is given by the measurement. However, this has the important drawback that we would also need measurements of the tissue damage at \(\tau _{k}\), which cannot be computed accurately from the thermometry data. Therefore, we instead use the previously simulated temperature distribution and tissue damage as initial conditions for the subsequent identification interval.

In particular, our method proceeds as follows: On the first interval \((0, \tau _{1})\) we use Algorithm 1 to identify the blood perfusion rate \(\xi ^{(1)}\) and compute the resulting temperature distribution \(T^{(1)} = T[\xi ^{(1)}]\) as well as the damage function \(\omega ^{(1)}\). These are then used as initial conditions for the state equations for the identification in the subsequent interval \((\tau _{1}, \tau _{2})\). Furthermore, we also use the perfusion rate \(\xi ^{(1)}\) as initial guess for the parameter identification algorithm on \((\tau _{1}, \tau _{2})\). These steps are repeated until the last identification is carried out. Therefore, the blood perfusion rate we compute with this approach is constant on each interval \((\tau _{k}, \tau _{k+1})\), and the simulated temperature is the one corresponding to this piecewise constant perfusion rate. As we want to predict the temperature in the time interval \((\tau _{m}, \tau _{{\mathrm {end}}})\), we choose the last computed perfusion rate \(\xi ^{(m)}\) as the perfusion rate on this interval. We do so because the effect of the perfusion rate depends on the magnitude of the difference \(T- T_{{\mathrm {b}}}\), which increases over time; therefore, the last identified perfusion rate should also be the most accurate one. The numerical experiments described in the following section confirm that this approach works well.
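The procedure can be summarized in a few lines of Python pseudocode; the routines identify (one run of Algorithm 1 on a time interval) and simulate (a forward solve of the state equations returning the final temperature and damage) are assumed to be given.

```python
def sequential_identification(measurements, times, xi0, T0, omega0,
                              identify, simulate, tau_end):
    """Sketch of the multi-measurement identification of Sect. 4.3."""
    xi, t_prev = xi0, 0.0
    for T_meas, tau_k in zip(measurements, times):
        # identify the perfusion rate on (t_prev, tau_k), warm-started with the previous xi
        xi = identify(T_meas, T0, omega0, xi, t_prev, tau_k)
        # hand the simulated temperature and damage over as initial data for the next interval
        T0, omega0 = simulate(xi, T0, omega0, t_prev, tau_k)
        t_prev = tau_k
    # predict the remaining treatment with the last identified perfusion rate
    T_final, omega_final = simulate(xi, T0, omega0, t_prev, tau_end)
    return xi, T_final, omega_final
```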

5 Numerical results

We now apply the previously introduced techniques for identifying the perfusion rate to a model problem: We generate an artificial temperature measurement \(T_{{\mathrm {meas}}}\) from a synthetic perfusion rate \(\xi _{{\mathrm {meas}}}\) as solution of the state equations. Furthermore, for all of our experiments we choose the set of admissible perfusion rates \(\mathcal{A}\) as

$$ \mathcal{A} := \bigl\{ \xi \in L^{2}(\varOmega ) \vert \xi \geq 0 \text{ a.e. in } \varOmega \bigr\} . $$

As stated previously, this is sensible as all physically meaningful perfusion rates are nonnegative.
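For this choice of \(\mathcal{A}\), the projection \(\mathcal{P}\) used in Algorithm 1 reduces to a pointwise cutoff at zero, applied in the discrete setting to the nodal values of the perfusion rate:

```python
import numpy as np

def project(xi):
    """Projection onto A = {xi >= 0 a.e.}: clip the nodal values of xi at zero."""
    return np.maximum(xi, 0.0)
```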

To demonstrate the behavior and capabilities of the parameter identification algorithm of Sect. 4, we first assume that there is no noise present in the measurement, i.e., we perform the identification with an “exact” measurement. Afterwards, we investigate its performance in the presence of noisy measurements. For simplicity, we consider axisymmetric problems so that the numerical studies can be performed in 2D. The axisymmetric geometry is shown in Fig. 2a. The parameters used for our problem are taken from [7] and are depicted in Table 1. Note that these parameters represent ex-vivo porcine tissue. However, the values are close to those of human tissue [12] and the results can be transferred to the in-vivo scenario.

Figure 2

Problem setting

5.1 Noiseless model problem

For this problem, we only compute the perfusion rate after \(\tau _{1} = 60\mbox{ s}\) and simulate the rest of the treatment with the perfusion rate computed in this first identification step. For the optimization algorithms we choose the initial guess for the perfusion rate as \(\xi = 0\), and the relative stopping tolerance is set to \(\operatorname{\mathtt{tol}} = 1\times10^{-3}\). Furthermore, we set the regularization parameter to \(\lambda = 0\), i.e., we do not use a Tikhonov regularization for this case. We stop the parameter identification after 20 iterations in case the stopping criterion is not satisfied. The synthetic perfusion rate \(\xi _{{\mathrm {meas}}}\) is depicted in Fig. 2b. The maximum perfusion rate is chosen as \(\xi _{\mathrm{max}} = 6\times10^{4} \mbox{ W}/(\mathrm{K}\ \mathrm{m}^{3})\) in accordance with [1]. Outside the blood vessels we set the perfusion rate to 0. For the numerical analysis of the parameter identification we consider two different types of blood vessels: first, “smooth vessels” that are modeled via two-dimensional Gaussian kernels with maximum height \(\xi _{\mathrm{max}}\), and second, “square vessels” that have a constant perfusion rate \(\xi _{\mathrm{max}}\). Note that for the L-BFGS algorithm we use (at most) the information of the last 5 iterations for the update of the approximate Hessian.

A single measurement

The identified perfusion rates are depicted in Fig. 3, where the result of the gradient descent method is shown in Fig. 3a and the perfusion rate computed with the L-BFGS method can be seen in Fig. 3b. The convergence history of both optimization algorithms is shown in Fig. 4, where the function values are given in Fig. 4a and the relative stationary measure is depicted in Fig. 4b. Finally, we also compare the simulated temperature, radiative energy and tissue damage to our artificial measurements. The absolute and relative errors in the \(L^{\infty }(0,\tau _{{\mathrm {end}}}; L^{\infty }(\varOmega ))\) and \(L^{2}(0,\tau _{{\mathrm {end}}}; L^{2}(\varOmega ))\) norms can be seen in Table 2 for both the gradient descent method and the L-BFGS method. Note that we define the tissue damage as \(\delta = 1 - \exp (-\omega )\), i.e., the measure that indicates whether the tissue is in its native (\(\delta = 0\)) or coagulated (\(\delta = 1\)) state, as the parameter ω enters our model only through δ.

Figure 3

Identified perfusion rates

Figure 4

Convergence history of the optimization methods

Table 2 Comparison between the simulated and measured temperature for a single measurement

First, we note that the proposed parameter identification performs well for both algorithms. The identified perfusion rates in Fig. 3 approximate the “measured” perfusion rate well, at least close to the applicator. The blood vessels in the first row, next to the radiating boundary, are identified best; their positions and shapes closely resemble the measurement data. However, the identification becomes worse for the vessels located above and below \(\varGamma _{{\mathrm {rad}}}\) and for the vessels further away from the applicator. In particular, the second row of vessels can only be seen very faintly for the gradient descent method, whereas the three middle vessels are still recognizable for the BFGS method, even though it underestimates the magnitude of the perfusion rate. This is due to the following reason: The influence of the perfusion rate is proportional to the temperature difference \(T- T_{{\mathrm {b}}}\). However, the temperature is highest close to \(\varGamma _{{\mathrm {rad}}}\) and decays with increasing distance from there. Thus, the temperature difference at the “outer” vessel locations is not significant, and neither is the effect of the perfusion rate. Hence, our algorithm performs well in finding the significant blood vessels close to the applicator.

In Fig. 4 we can further observe that the BFGS method outperforms the gradient descent algorithm, as the values of the cost functional are always lower for the former, and so are the values of the stationary measure. As the computational cost per iteration of the L-BFGS method is essentially the same as for the gradient descent algorithm, we can save valuable computational time by using the former, since it needs about half as many steps to reach a given tolerance in the cost functional.

The comparison of the simulated and measured physical quantities shown in Table 2 emphasizes that the BFGS method exhibits better properties. Furthermore, comparing the errors of the simulated physical quantities to the ones we obtain for a vanishing perfusion rate, we observe that the effect of blood perfusion is significant for the therapy planning, as stated in [1]. The simulation results of our algorithms are significantly better than those of the simulation that does not consider the effect of the perfusion rate. Moreover, the errors generated by the BFGS method are only about half of those generated with the gradient descent method for T and φ. The error in the tissue damage decreases even more dramatically, by a factor of 3.5 in the \(L^{\infty }\) norm and by a factor of 10 in the \(L^{2}\) norm, underlining the superior behavior of the BFGS method.

Finally, we also show the evolution of the error in temperature over time in Fig. 5 for both the \(L^{\infty }(\varOmega )\) and \(L^{2}(\varOmega )\) norm. We observe a similar picture as before, where the BFGS method outperforms the gradient descent method. Furthermore, we can see that even with only one measurement the methods produce comparatively low errors in the simulation, even though the results are “extrapolated” into the future. In particular, we again observe that the effect of blood perfusion is significant here, as the error increases rapidly when considering \(\xi = 0\), whereas the error stays comparatively small for the simulation with the identified perfusion rate.

Figure 5

Evolution of the error in temperature over time for one measurement

Multiple measurements

Now, we also examine the quality of the identification for multiple measurements. We choose the same setting as before, but this time we assume to have temperature measurements at \(\tau _{1} = 60\mbox{ s}\), \(\tau _{2} = 120\mbox{ s}\) and \(\tau _{3} = 180\mbox{ s}\). The identified perfusion rates for this setting are shown in Fig. 6, and the comparison to the measurement data is depicted in Table 3.

Figure 6

Identified perfusion rates for multiple measurements

Table 3 Comparison between the simulated and measured temperature for multiple measurements

Again, we observe that the L-BFGS method outperforms the gradient descent algorithm: In Fig. 6 we see that it identifies more blood vessels and resolves them more accurately. For example, the first column of vessels is clearly visible and well identified, and the three vessels in the middle of the second column can now be seen more clearly, although their perfusion rate is still underestimated by the method. For the gradient descent algorithm only the middle vessels of the first column are identified accurately. The ones further away from \(\varGamma _{{\mathrm {rad}}}\) are more visible than before, and the result resembles the one for the BFGS method with one measurement shown in Fig. 3b.

The same is true for the comparison of the simulated and synthetic data (cf. Table 3). Here, the temperature errors obtained with the gradient descent algorithm and three measurements are very close to those obtained by the BFGS method with only one measurement; the errors for the radiative energy and tissue damage, however, are lower. The BFGS method also improves its performance for three measurements compared to a single one, albeit only slightly for most errors.

Again, we also show the evolution of the error in temperature over time for the whole simulation in Fig. 7, this time in comparison to the results obtained with only one measurement. The results depicted here are similar to the ones of Fig. 5. In particular, the BFGS method outperforms the gradient descent method. Moreover, the results obtained with the gradient descent method after three identifications are comparable to those of the BFGS method for only one identification process. It can also be seen that the error is smaller overall if we use three temperature measurements instead of only one, and that this yields considerably better results even for late points in time during the therapy.

Figure 7

Evolution of the error in temperature over time for 3 measurements

Summing up, we observe that a single measurement can be sufficient for our purposes if the identification is carried out with the BFGS method. Doing so saves a lot of computational time and is a stepping stone for the use of the method in an online therapy-planning and -monitoring tool.

5.2 Noisy model problem

Let us now consider the case of noisy measurement data: We choose the same desired perfusion rate as before (cf. Fig. 2b) and generate the measurement by solving the state equations. Then, we add Gaussian noise with zero mean and a standard deviation of σ = 2°C. This is in accordance with [30] and [31], where the accuracy of temperature measurements using MR thermometry is reported to be 2.3°C and 1.3°C, respectively. To identify the perfusion rate for this problem, we first smooth the data with an isotropic linear diffusion process with end time \(2\times10^{-7}\), which is equivalent to convolving the noisy data with a Gaussian kernel that has a standard deviation of \(6.32\times10^{-4}\). Using a higher standard deviation leads to more smoothing, but also introduces a bias into the synthetic measurements such that the data are no longer compatible with the model. For more details on linear diffusion processes for filtering noise we refer to, e.g., [32]. As this problem is less regular, we choose the regularization parameter as \(\lambda = 2.5\times10^{-10}\). We compare the BFGS and gradient descent methods in the same context as before: first, we have a single measurement of temperature at \(\tau = 60\mbox{ s}\) and, second, we consider three measurements at \(\tau _{1} = 60\mbox{ s}\), \(\tau _{2} = 120\mbox{ s}\) and \(\tau _{3} = 180\mbox{ s}\). All other parameters are chosen as in Sect. 5.1.
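As an illustration, the noisy measurement and its smoothing can be generated along the following lines; scipy's gaussian_filter works in pixel units, so the physical standard deviation \(\sqrt{2 t_{\mathrm{end}}} \approx 6.32\times10^{-4}\) is converted with the grid spacing h, which (like the uniform grid itself) is an assumption of this sketch.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def noisy_smoothed_measurement(T_exact, h, sigma_noise=2.0, t_end=2.0e-7):
    """Add zero-mean Gaussian noise (std 2 degrees) to the exact temperature field and
    smooth it by linear diffusion up to t_end, i.e. by convolution with a Gaussian
    kernel of standard deviation sqrt(2*t_end); T_exact lives on a uniform grid
    with spacing h (placeholder assumption)."""
    T_noisy = T_exact + rng.normal(0.0, sigma_noise, size=T_exact.shape)
    sigma_pixels = np.sqrt(2.0 * t_end) / h   # convert the physical std to pixel units
    return gaussian_filter(T_noisy, sigma=sigma_pixels)
```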

A single measurement

In the case of a single measurement at \(\tau = 60\mbox{ s}\) we see that there is not much of a difference between the gradient descent method and the L-BFGS algorithm. The computed perfusion rates, depicted in Fig. 8, look nearly identical, and so do the errors, as can be seen in Table 4 and Fig. 9. As is to be expected, the results obtained from the noisy measurements are worse than the ones obtained in Sect. 5.1, where we only considered noiseless data. In particular, the computed perfusion rates are worse compared to the previous cases, and one can hardly distinguish the isolated vessels anymore. Instead, the vessels are more “smeared out”. However, we can still observe three peaks in the perfusion rate, corresponding to the vessels in the middle of the first column of the desired perfusion rate. Additionally, we can observe that the middle one of these three shows a higher perfusion rate, as is the case for the synthetic blood perfusion (cf. Fig. 2b). We also observe that both methods underestimate the perfusion rate by about 20%. This is due to the effect of the regularization term, which penalizes large values of ξ.

Figure 8

Identified perfusion rates for noisy measurements

Figure 9

Evolution of the error in temperature over time for one noisy measurement

Table 4 Comparison between the simulated and measured temperature for a single measurement

The quality of approximation for the whole therapy, as shown in Table 4, is comparable for both methods; neither of them performs significantly better than the other. The same is true when investigating the evolution of the error in temperature over time (cf. Fig. 9), where there is barely any visible difference between the two algorithms.

Multiple measurements

For the case of three noisy measurements, both the gradient descent algorithm and the L-BFGS method perform significantly better, and there are more pronounced differences between the methods. In particular, the obtained perfusion rates are much closer to the synthetic one, as Fig. 10 shows. Here, we observe that both methods can resolve the individual peaks of the blood vessels placed in the first row; now all five of them are clearly visible. Again, we observe that the algorithms underestimate the perfusion rate; however, the error is smaller for the L-BFGS method. We can also see some artifacts in the computed perfusion rates: noise appears in the regions behind the first column, especially for the L-BFGS method.

Figure 10

Identified perfusion rates for multiple noisy measurements

The quality of approximation for the whole therapy also improves, as can be seen in Table 5. Again, both methods show significant improvements, in particular for the error in temperature, which is about halved in the \(L^{\infty }\)-norm, and in the tissue damage, which decreases by nearly two-thirds in the \(L^{2}\)-norm. The evolution of the temperature error also confirms these findings, as the results are substantially better than for only one measurement. Additionally, we can see that the L-BFGS method now also performs better than the gradient descent algorithm (see Fig. 11).

Figure 11

Evolution of the error in temperature over time for 3 noisy measurements

Table 5 Comparison between the simulated and measured temperature for multiple noisy measurements

6 Conclusions

We have demonstrated that the proposed parameter identification approach, based on techniques from PDE-constrained optimization, can identify the blood perfusion rate in the relevant region around the applicator. This was done using synthetic measurements with and without artificial noise. Using three subsequent measurements instead of a single one notably improved the accuracy. The L-BFGS method requires significantly fewer iterations than the gradient descent method to converge to an acceptable solution, while the time per iteration is comparable for both methods. The next step will be to test the parameter identification with real MR thermometry data obtained from ex-vivo experiments with artificial blood vessels.