Keywords

1 Introduction

A shape functional is a real map defined on a set of admissible shapes. The goal of shape optimization is to modify an initial shape so that a shape functional attains an extremal value. A common approach is to employ steepest descent algorithms [8, Chap. 3.4]. Shapes may be parameterized by \(C^{1}\)-mappings acting on reference configurations. Then the shape gradient is a linear continuous operator on the non-reflexive Banach space \(C^1\), and the concept of steepest descent may not be well-defined; see [7, P. 103]. A compromise is to replace “steepest descents” with Riesz representatives of shape gradients with respect to a Hilbert space X. Henceforth, we refer to these representatives as X-representatives.

After recalling basic definitions of shape calculus, we provide a general perspective on error analysis in the energy norm for finite element approximations of X-representatives of shape gradients. Then, we zero in on shape functionals constrained to elliptic boundary value problems. For this case, insight into shape Hessians [12, 14] suggests to select representatives of shape gradients with respect to \(X = H^1_0(D)\), where D is a hold-all domain that encloses the initial guess \(\varOmega \); see Fig. 1. For the choice \(X = H^1_0(D)\), it is natural to consider discretization by means of linear Lagrangian finite elements [2, 14, 15]. We show that linear Lagrangian finite element approximations of \(H^1\)-representatives of shape gradients converge linearly with respect to the mesh width. Additionally, this convergence rate does not deteriorate when state and adjoint variables are replaced by linear Lagrangian finite elements solutions. This is an improvement on the result presented in [2], which involves approximations of state and adjoint variables with quadratic finite elements. Finally, we provide numerical evidence of the linear convergence rate predicted.

Fig. 1.
figure 1

The hold-all domain D encloses the domain \(\varOmega \).

2 Shape Functionals and Shape Gradients

Let \(\varOmega \subset \mathbb {R}^d\), \(d=2,3\), be an open bounded domain with piecewise smooth boundary \(\partial \varOmega \), and let \(\mathcal{J}(\varOmega )\in \mathbb {R}\) be a real-valued quantity of interest associated to it. One is often interested in shape sensitivity, which quantifies the impact of small perturbations of \(\partial \varOmega \) on the value \(\mathcal{J}(\varOmega )\).

We model perturbations of the domain \(\varOmega \) through maps of the form

$$\begin{aligned} T_\mathcal{V}({\mathbf {x}}) \,\,:=\,\,{\mathbf {x}}+ \mathcal{V}({\mathbf {x}}),\quad {\mathbf {x}}\in \mathbb {R}^d, \end{aligned}$$
(1)

where \(\mathcal{V}\) is a vector field in \(C^1(\mathbb {R}^d;\mathbb {R}^d)\). It can easily be proved that the map (1) is a diffeomorphism for \(\Vert \mathcal{V}\Vert _{C^1} <1\) [8, Lemma 6.13].

The value \(\mathcal{J}(\varOmega )\) is interpreted as the realization of a shape functional, a real map

$$\mathcal{J}: \mathcal{V}\mapsto \mathcal{J}(T_\mathcal{V}(\varOmega ))$$

defined on the ball \(\{\mathcal{V}\in C^1(\mathbb {R}^d;\mathbb {R}^d);\, \Vert \mathcal{V}\Vert _{C^1} <1\}\). Clearly, \(\mathcal{J}(\varOmega ) = \mathcal{J}(T_0(\varOmega ))\).

The sensitivity of \(\mathcal{J}(\varOmega )\) with respect to the perturbation direction \(\mathcal{V}\) is given by the Eulerian derivative of the shape functional \(\mathcal{J}\) in the direction \(\mathcal{V}\), that is,

$$\begin{aligned} d\mathcal{J}(\varOmega ;\mathcal{V}) \,\,:=\,\,\lim _{s\searrow 0} \frac{\mathcal{J}\left( T_{s\cdot \mathcal{V}} (\varOmega )\right) -\mathcal{J}(\varOmega )}{s}. \end{aligned}$$
(2)

We say that the shape functional is shape differentiable if Formula (2) defines a linear and bounded operator \(\mathcal{V}\mapsto d\mathcal{J}(\varOmega ;\mathcal{V})\). In literature, this operator is called shape gradient [9, Chap. 9, Sect. 3.4]. As mentioned in the introduction, X-representatives of shape gradients can be employed to solve shape optimization problems, that is, to find

$$\begin{aligned} \varOmega ^* \in \mathop {\text {argmin}}\limits _{\varOmega \in U_{\mathrm {ad}}}\, \mathcal{J}(\varOmega ), \end{aligned}$$

where \(U_{\mathrm {ad}}\) denotes a set of admissible shapes.

Often, the quantity of interest takes the form

$$\begin{aligned} \mathcal{J}(\varOmega ) = \int _B \alpha \nabla (u-g) \cdot \nabla (u-g) + \beta (u-g)^2 \, \mathrm {d}{{\varvec{x}}}, \end{aligned}$$
(3)

where the state function u is the solution of a boundary value problem stated on \(\varOmega \), \(B\subset \varOmega \), \(\alpha \) and \(\beta \) are two real constants and g is a sufficiently smooth target function. In this work, \(u\in H^1_0(\varOmega )\) is the (weak) solution of the elliptic boundary value problem with homogeneous Dirichlet boundary conditions

$$\begin{aligned} -\varDelta u + u= f \quad \text {in }\varOmega , \qquad u = 0 \quad \text {on }\partial \varOmega , \end{aligned}$$
(4)

that is,

$$\begin{aligned} \int _\varOmega \nabla u \cdot \nabla v + uv\, \mathrm {d}{{\varvec{x}}}= \int _\varOmega fv \,\mathrm {d}{{\varvec{x}}}\quad \text {for all }v \in H^1_0(\varOmega ), \end{aligned}$$
(5)

where \(f\in H^{1}({\varOmega })\). For the sake of brevity, we set \(g = 0\). Then, the shape gradient of the shape functional associated to (3) and constrained to (5) reads [4, Formula (2.9)]

$$\begin{aligned} \nonumber d\mathcal{J}(\varOmega ;\mathcal{V}) =&\int _{\varOmega } \bigg ( (\nabla f \cdot \mathcal{V})p +\nabla u \cdot ({\mathbf {D}}\mathcal{V}+{\mathbf {D}}\mathcal{V}^T) \nabla p \\&\qquad + {\text {div}}\mathcal{V}(fp +\chi _B (\alpha \nabla u\cdot \nabla u + \beta u^2) - \nabla u\cdot \nabla p - up )\bigg ) \, \mathrm {d}{{\varvec{x}}}, \end{aligned}$$
(6)

where the adjoint function \(p\in H^1_0(\varOmega )\) is the solution of

$$\begin{aligned} \int _\varOmega \nabla p \cdot \nabla v + pv\, \mathrm {d}{{\varvec{x}}}= \int _B \alpha \nabla u \cdot \nabla v + \beta uv \,\mathrm {d}{{\varvec{x}}}\quad \text {for all }v \in H^1_0(\varOmega ). \end{aligned}$$
(7)

Formula (3) is a prototypical PDE-constrained shape functional. In this work, Formula (6) is used as test case for proving convergence estimates and performing numerical experiments.

Remark 1

Formula (6) holds even if homogeneous Dirichlet boundary conditions in (4) are replaced by homogeneous Neumann boundary conditions, in which case the test and the trial spaces in (5) and (7) are replaced with \(H^{1}({\varOmega })\).

Remark 2

For the sake of simplicity, we restrict our considerations to homogeneous boundary conditions. However, we expect that the results of this work hold true for (sufficiently regular) inhomogeneous boundary conditions, too. Note that Formula (6) should be adjusted accordingly; see [4, Sect. 2].

3 Error Analysis for Finite Element Representatives

3.1 The General Case

Let \((\cdot ,\cdot )_X\) denote the inner product of a Hilbert space X, and let us assume that the shape gradient \(d\mathcal{J}\) is well-defined on X. The X-representative \(\mathcal{V}^X\) of \(d\mathcal{J}\) can be computed by solving

$$(\mathcal{V}^X,\mathcal{W})_X = d\mathcal{J}(\varOmega ;\mathcal{W}) \quad \text {for all } \mathcal{W}\in X.$$

Next, for an index set \(\mathcal{N}\), we introduce a family \(\{X_n\}_{n\in \mathcal{N}}\) of finite-dimensional subspaces of X. Let \(\left\{ \mathcal{V}^{X_n}\right\} _{n\in \mathcal{N}}\) be a sequence of approximate X-representatives of \(d\mathcal{J}\) defined by

$$(\mathcal{V}^{X_n},\mathcal{W}_n)_X = d\mathcal{J}(\varOmega ;\mathcal{W}_n) \quad \text {for all } \mathcal{W}_n \in X_n.$$

By Cea’s Lemma [11, Theorem 2.4.1], there exists a constant \(C>0\) independent of n such that

$$\begin{aligned} \Vert \mathcal{V}^X- \mathcal{V}^{X_n}\Vert _X \le C \inf _{\mathcal{W}_n\in X_n} \Vert \mathcal{V}^X- \mathcal{W}_n \Vert _X. \end{aligned}$$
(8)

By and large, the shape gradient of a PDE-constrained shape functional depends also on the state and the adjoint variables u and p. These functions are solutions of boundary value problem. Usually, only numerical approximations \(u_h\) and \(p_h\) are available. In that case, the approximate X-representative \(\mathcal{V}^{X_n}\) has to be replaced with the solution \(\mathcal{V}^{X_n}_h\) of

$$\begin{aligned} (\mathcal{V}^{X_n}_h,\mathcal{W}_n)_X = d\mathcal{J}_h(\varOmega ;\mathcal{W}_n) \quad \text {for all } \mathcal{W}_n \in X_n, \end{aligned}$$
(9)

where \(d\mathcal{J}_h\) is an approximation of the operator \(d\mathcal{J}\) obtained by replacing the functions u and p with their numerical approximations \(u_h\) and \(p_h\).

By Strang Lemma [11, Theorem 4.1.1], the estimate (8) should be corrected by adding a consistency term, that is,

$$\begin{aligned} \Vert \mathcal{V}^X- \mathcal{V}^{X_n}_h\Vert _X \le C \bigg (\inf _{\mathcal{W}_n\in X_n} \Vert \mathcal{V}^X&- \mathcal{W}_n \Vert _X \nonumber \\ +&\sup _{\mathcal{W}_n\in X_n} \frac{\vert d\mathcal{J}(\varOmega ;\mathcal{W}_n) -d\mathcal{J}_h(\varOmega ;\mathcal{W}_n)\vert }{\Vert \mathcal{W}_n\Vert _X}\bigg ) \end{aligned}$$
(10)

for a constant \(C>0\) independent of n and h.

3.2 \(H^1\)-Representatives and Linear Lagrangian Finite Elements

A popular approach in shape optimization consists of replacing the initial domain \(\varOmega \) with a polygon/polyhedron equipped with a finite element mesh \(\varOmega _h\). This mesh is used to compute linear Lagrangian finite element approximations of the functions u and p. Then, the coordinates of the mesh nodes are (iteratively) updated according to the shape gradient [8, Chap. 6.5]. This is equivalent to extending \(\varOmega _h\) to a mesh \(D_h\) that covers a hold-all domain D and choosing linear Lagrangian finite elements to construct the finite-dimensional subspace \(X_n\). Formula (10), standard finite element estimates, and Proposition 1 readily imply that, for this discretization, the approximate \(H^1\)-representative of (6) satisfies

$$\begin{aligned} \Vert \mathcal{V}^X-\mathcal{V}^{X_n}_h\Vert _{H^1(D)} = \mathcal{O}(h), \end{aligned}$$
(11)

which is the main result of this work.

Proposition 1

Let \(\varOmega \subset \mathbb {R}^d\) be a polyhedral domain, let \(f\in W^{1,4}(\varOmega )\) in (5), and let us assume that the solution u of (5) satisfies

$$\begin{aligned} \Vert u \Vert _{W^{2,4}(\varOmega )}\le C\Vert f \Vert _{L^4(\varOmega )}. \end{aligned}$$
(12)

Let \((V_h)_{h\in (0,1]}\) be a family of \(H^{1}_0(\varOmega )\)-conforming piecewise linear Lagrangian finite element spaces built on a quasi-uniform family of simplicial meshes \((\mathcal{T}^h)_{h\in (0,1]}\), that is, a family of meshes such that

$$\max \{\mathrm {diam}(T):\, T\in \mathcal{T}^h\} \le h\,\mathrm {diam}(\varOmega )$$

and

$$\min \{\mathrm {diam}(B_T):\, T\in \mathcal{T}^h\} \ge \rho h\,\mathrm {diam}(\varOmega ) \quad \text {for all } h\in (0,1]\,,$$

for a \(\rho > 0\), where \(B_T\) is the largest ball contained in the simplex T [10, Definition 4.4.13]. Let \(u_h,\, p_h\in V_h\) be solutions of

$$\begin{aligned} \int _\varOmega \nabla u_h\cdot \nabla v_h + u_h v_h\,\mathrm {d}{{\varvec{x}}}&= \int _\varOmega f v_h\,\mathrm {d}{{\varvec{x}}}&\text {for all }\, v_h \in V_h, \end{aligned}$$
(13)
$$\begin{aligned} \int _\varOmega \nabla p_h\cdot \nabla v_h + p_h v_h\,\mathrm {d}{{\varvec{x}}}&= \int _B \alpha \nabla u_h \cdot \nabla v_h + \beta u_hv_h \,\mathrm {d}{{\varvec{x}}}&\text {for all }\, v_h \in V_h, \end{aligned}$$
(14)

where \(\alpha \), \(\beta \in \mathbb {R}\), \(B\subset \varOmega \), and \(\alpha = 0\) or \(B=\varOmega \), if \(d=3\). Let \(d\mathcal{J}_h(\varOmega ;\mathcal{W}_n)\) denote the operator defined by Formula (6) with u and p replaced by \(u_h\) and \(p_h\), respectively. Then,

$$\begin{aligned} \sup _{\mathcal{W}_n\in X_n} \frac{\vert d\mathcal{J}(\varOmega ;\mathcal{W}_n) -d\mathcal{J}_h(\varOmega ;\mathcal{W}_n)\vert }{\Vert \mathcal{W}_n\Vert _{H^1(D)}} \le C(\varOmega ,f,u,p) h \end{aligned}$$
(15)

for a constant \(C(\varOmega ,f,u,p)>0\) independent of n and h.

Proof

First of all, note that

$$\begin{aligned} \nonumber d\mathcal{J}(\varOmega ;\mathcal{W}_n)\ -\&d\mathcal{J}_h(\varOmega ;\mathcal{W}_n) =\int _{\varOmega } (\nabla f \cdot \mathcal{W}_n+ f{\text {div}}\mathcal{W}_n)(p- p_h)\, \mathrm {d}{{\varvec{x}}}\\ \nonumber&+\int _{\varOmega }\nabla u \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla p -\nabla u_h \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla p_h\, \mathrm {d}{{\varvec{x}}}\\ \nonumber&+\int _{\varOmega } {\text {div}}\mathcal{W}_n \left( \nabla u_h\cdot \nabla p_h + u_h p_h -\nabla u\cdot \nabla p -up\right) \, \mathrm {d}{{\varvec{x}}}\,\\&+\int _{B} {\text {div}}\mathcal{W}_n \left( \alpha (\nabla u\cdot \nabla u -\nabla u_h\cdot \nabla u_h) + \beta (u^2-u_h^2)\right) \, \mathrm {d}{{\varvec{x}}}. \end{aligned}$$
(16)

We recall that, for generic functions \(q_0\in L^2(\varOmega )\) and \(q_1\), \(q_2\in L^4(\varOmega )\), the Cauchy-Schwarz inequality implies

$$\begin{aligned} \Vert q_0 q_1 q_2 \Vert _{L^1(\varOmega )} \le \Vert q_0 \Vert _{L^{2}({\varOmega })}\Vert q_1 q_2 \Vert _{L^{2}({\varOmega })} \le \Vert q_0 \Vert _{L^{2}({\varOmega })}\Vert q_1 \Vert _{L^4(\varOmega )} \Vert q_2 \Vert _{L^4(\varOmega )}. \end{aligned}$$
(17)

Thus, the first integral in (16) may be estimated as followsFootnote 1

$$\begin{aligned} \vert \int _{\varOmega }(\nabla f \cdot \mathcal{W}_n+ f{\text {div}}\mathcal{W}_n)(p- p_h) \,\mathrm {d}{{\varvec{x}}}\vert \le C \Vert \mathcal{W}_n\Vert _{H^{1}({\varOmega })} \Vert f \Vert _{W^{1,4}(\varOmega )} \Vert p - p_h \Vert _{L^4(\varOmega )}. \end{aligned}$$

The second integral in (16) may be estimated as follows

$$\begin{aligned}&\vert \int _{\varOmega } \nabla u \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla p -\nabla u_h \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla p_h \,\mathrm {d}{{\varvec{x}}}\vert \\&=\vert \int _{\varOmega } \nabla (u-u_h) \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla p +\nabla u_h \cdot ({\mathbf {D}}\mathcal{W}_n+{\mathbf {D}}\mathcal{W}_n^T) \nabla (p-p_h)\,\mathrm {d}{{\varvec{x}}}\vert \\&\le C \Vert \mathcal{W}_n\Vert _{H^{1}({\varOmega })} \left( \Vert u - u_h \Vert _{W^{1,4}(\varOmega )} \Vert p \Vert _{W^{1,4}(\varOmega )}+ \Vert u_h \Vert _{W^{1,4}(\varOmega )} \Vert p-p_h \Vert _{W^{1,4}(\varOmega )}\right) . \end{aligned}$$

The third and the fourth integral in (16) may be estimated similarly.

Stability of the Ritz projection with respect to \(W^{1,4}(\varOmega )\) [3]Footnote 2

$$\begin{aligned} \Vert u_h \Vert _{W^{1,4}(\varOmega )} \le C \Vert u \Vert _{W^{1,4}(\varOmega )} \end{aligned}$$
(18)

implies \(\Vert u - u_h \Vert _{W^{1,4}(\varOmega )}=\mathcal{O}( h)\). To show

$$\begin{aligned} \Vert p_h \Vert _{W^{1,4}(\varOmega )} \le C \Vert p \Vert _{W^{1,4}(\varOmega )}, \end{aligned}$$
(19)

which in turn implies \(\Vert p-p_h \Vert _{W^{1,4}(\varOmega )} = \mathcal{O}(h)\), it is necessary to repeat the proof of (18) given in [3] tracking the consistency term

$$\begin{aligned} \int _\varOmega \nabla (p-p_h)\cdot \nabla g^z_h + (p-p_h) g^z_h\,\mathrm {d}{{\varvec{x}}}= \int _B \alpha \nabla (u-u_h) \cdot \nabla g^z_h + \beta (u-u_h)g^z_h \,\mathrm {d}{{\varvec{x}}}. \end{aligned}$$
(20)

The discrete Green’s function \(g^z_h\in V_h\) is given in [3] and satisfies \(\Vert g^z_h \Vert _{H^{1}({\varOmega })} =\mathcal{O}(h^{-d/2}).\) By the Cauchy-Schwarz inequality and standard finite element estimates,

$$\begin{aligned} \vert \int _B \alpha \nabla (u-u_h) \cdot \nabla g^z_h + \beta (u-u_h)g^z_h \,\mathrm {d}{{\varvec{x}}}\vert = \mathcal{O}\left( (\vert \alpha \vert h+\vert \beta \vert h^2)h^{-d/2}\right) . \end{aligned}$$
(21)

The stability result (19) holds if (21) is bounded independently of h. For this reason, we need to set \(\alpha =0\) when \(d=3\), unless \(B=\varOmega \). In this latter case, by Galerkin orthogonality, (20) is bounded by \(\Vert (\beta -\alpha )(u-u_h)g_h^z\Vert _{L^1(\varOmega )}\).    \(\square \)

Remark 3

In Proposition 1, we assume \(W^{2,4}\)-regularity of the solution u of (5). This assumption is made to achieve linear convergence with respect to h in the estimate (15). However, a three-dimensional polyhedral domain must satisfy tight geometric conditions for u to be in \(W^{2,4}\) [6, Theorem 7.1]. Nevertheless, in [5] the authors show \(W^{1,\infty }\)-stability of the Ritz projection for general convex polyhedral domains. Therefore, we expect that (in the latter case) the right-hand side of (15) can be replaced with a term of order \(\mathcal{O}(h^{\alpha })\), where the rate \(\alpha \) depends on the regularity of u and satisfies \(0 < \alpha \le 1\).

Remark 4

In [4, 13], the authors show that one can expect superconvergence in the approximation of the shape gradient \(d\mathcal{J}\). In particular, they show that

$$\begin{aligned} \vert d\mathcal{J}(\varOmega ;\mathcal{W}) -d\mathcal{J}_h(\varOmega ;\mathcal{W})\vert \le C\Vert \mathcal{W}\Vert _{W^{2,4}(\varOmega )} h^2. \end{aligned}$$
(22)

However, in the right-hand side of (22) appears the \(W^{2,4}(\varOmega )\)-norm of \(\mathcal{W}\). Note that to prove convergence in the approximation of a \(H^1\)-representative of \(d\mathcal{J}\), the upper bound of

$$\vert d\mathcal{J}(\varOmega ;\mathcal{W}) -d\mathcal{J}_h(\varOmega ;\mathcal{W})\vert $$

cannot involve a norm stronger than the \(H^1\)-norm; see Eq. (10).

Remark 5

By the Hadamard structure theorem [9, Chap. 9, Theorem 3.6], most shape gradients admit representatives \(\mathfrak {g}(\varOmega )\) in the space of distributions \(\mathcal{D}^k(\partial \varOmega )\), that is,

$$\begin{aligned} d\mathcal{J}(\varOmega ;\mathcal{V}) = \langle \mathfrak {g}(\varOmega ), \gamma _{\partial \varOmega } \mathcal{V}\cdot {\mathbf {n}}\rangle _{\mathcal{D}^k(\partial \varOmega )}, \end{aligned}$$
(23)

where \( \gamma _{\partial \varOmega } \mathcal{V}\cdot {\mathbf {n}}\) is the normal component of \(\mathcal{V}\) on the boundary \(\partial \varOmega \). For instance, if \(u,p\in H^2(\varOmega )\), Formula (6) is equivalent to [4, Formula (2.10)]

$$\begin{aligned} d\mathcal{J}(\varOmega ;\mathcal{V}) = \int _{\partial \varOmega } \left( \mathcal{V}\cdot {\mathbf {n}}\right) \left( \alpha \nabla u \cdot \nabla u + \beta u^2 +\frac{\partial {p}}{\partial {\mathbf {n}}}\frac{\partial {u}}{\partial {\mathbf {n}}} \right) \, dS. \end{aligned}$$
(24)

We advise against the use of \(\mathfrak {g}(\varOmega )\) (which corresponds to the \(L^2(\partial \varOmega )\)-representative of \(d\mathcal{J}\)) to define descent directions because \(L^2\)-representatives might bristle with undesirable oscillations [1].

4 Numerical Experiments

We provide numerical evidence of the estimate (11). We employ linear Lagrangian finite elements on quasi-uniform triangular meshes. The experiments are performed in MATLAB and are partly based on the library LehrFEM developed at ETHZ. Mesh generation and uniform refinement are performed with the functions initmesh and refinemesh of the MATLAB PDE Toolbox [16]. The boundary of computational domains is approximated by a polygon, which is generally believed not to affect the convergence of linear finite elements [10, Sect. 10.2]. For domains with curved boundaries, the refined mesh is always adjusted to fit the boundary. Integrals in the domain are computed with a 3-point quadrature rule of order 3 in each triangle and line integrals with a 6-point Gauss quadrature on each segment.

We consider three different geometries for the domain \(\varOmega \) (see Fig. 2):

  1. 1.

    A disc of radius \(\sqrt{6/5}\) centered in (0.01,0.02).

  2. 2.

    A triangle with corners located at

    $$(-\sqrt{6/5}, -\sqrt{6/5}), (\sqrt{6/5}, -\sqrt{6/5}), (-\sqrt{6/5}, \sqrt{6/5}).$$
  3. 3.

    A circular sector of radius \(\sqrt{6/5}\) centered in (0.01, 0.02) of angle \(0.9\cdot 2\pi \).

The source function in (5) is

$$f(x,y) = \cos (x+ \pi /4).$$

The hold-all domain D is a square with edges of length 3 centered in the origin. The region of interest B is the whole domain \(\varOmega \). We set \(\alpha = 0\) and \(\beta = 1\) in (3).

Fig. 2.
figure 2

The domain \(\varOmega \) is chosen to be either a disc or a triangle or a sector.

The reference value \(\mathcal{V}^X\) is approximated by computing \(\mathcal{V}^{X_n}_h\) on a mesh with an extra level of refinement. In light or Remark 5, we employ both Formula (6) and Formula (24) to evaluate the right-hand side \(d\mathcal{J}_h\) in (9). To avoid biased results, we display convergence history of \(\Vert \mathcal{V}^X- \mathcal{V}^{X_n}_h\Vert _{H^1(D)}\) both with self- and cross-comparison.

In Fig. 3, we plot the convergence history when the domain \(\varOmega \) is either a disc (first row) or a triangle (second row). As predicted by (11), we observe linear convergence when the right-hand side in (9) is evaluated according to (6). Interestingly, using Formula (24) seems not to affect the convergence rate. The same behavior is observed when homogeneous Dirichlet boundary conditions are replaced by homogeneous Neumann boundary conditions. Note that, in this latter case, the boundary-integral counterpart of Formula (6) reads [4]

$$\begin{aligned} d\mathcal{J}(\varOmega ;\mathcal{V}) = \int _{\partial \varOmega } \mathcal{V}\cdot {\mathbf {n}}\left( \nabla u \cdot \nabla (\alpha u-p) + u(\beta u-p) + fp \right) \, dS. \end{aligned}$$
(25)

For the sake of brevity, we omit these plots.

Fig. 3.
figure 3

Convergence history when \(\varOmega \) is a disc (first row) and a triangle (second row). Line refers to evaluation of \(d\mathcal{J}_h\) according to Formula (6); line to Formula (24). We observe the linear convergence rate predicted by (11).

In Fig. 4 (first row), we plot the convergence history when the domain \(\varOmega \) is a sector. This domain does not guarantee that u and p are in \(H^2(\varOmega )\) because it has a re-entrant corner. We observe that the convergence rates decrease to fractional values. This is a consequence of the lower regularity of the functions u and p. Additionally, the convergence rates depend on the formula used to evaluate \(d\mathcal{J}_h\). In particular, in the cross-comparison, the convergence line saturates when Formula (6) is used. This may be due to a poor accuracy of the reference solution. However, we point out that Formulas (6) and (24) may not be equivalent due to the lack of regularity of the functions u and p; cf. Remark 5. Curiously, for homogeneous Neumann boundary conditions, the presence of the re-entrant corner seems to have a milder impact on convergence rates; see Fig. 4 (second row). However, note that the approximate algebraic convergence rates of

$$\Vert u - u_h \Vert _{H^{1}({\varOmega })} \quad \text {and} \quad \Vert p - p_h \Vert _{H^{1}({\varOmega })}$$

with respect to h drop to 0.67 and 0.62, respectively.

Fig. 4.
figure 4

Convergence history when \(\varOmega \) is a sector. Line refers to evaluation of \(d\mathcal{J}_h\) according to Formula (6); line to Formula (24)(in the first row) and to Formula (25) (in the second row). For Dirichlet boundary conditions, convergence rates decay to fractional values.

By the Hadamard structure theorem (see Remark 5), vector fields \(\mathcal{W}_n\) associated to interior nodes of the mesh \(\varOmega _h\) lie in the kernel of \(d\mathcal{J}\). However, these vector fields are not in the kernel of \(d\mathcal{J}_h\) because u and p are replaced by finite element approximations. Schulz et al. [14] report that this numerical error might largely affect the computation of the Riesz representative. Although we have not experienced this issue, we have repeated the numerical experiments by setting to zero the values of \(d\mathcal{J}_h(\varOmega ;\mathcal{W}_n)\) for all \(\mathcal{W}_n\) associated to interior nodes of \(\varOmega _h\). We have not observed any significative difference in the results. Thus, we acknowledge that computational resources might be saved by dropping the evaluation of \(d\mathcal{J}_h(\varOmega ;\mathcal{W}_n)\) for vector fields associated to interior nodes of \(\varOmega _h\).

5 Conclusion

Most shape optimization algorithms rely on Riesz representatives of shape gradients with respect to a chosen Hilbert space. Numerical discretization is inevitable when the shape functional is constrained to a boundary value problem. Formula (10) indicates how to estimate the discretization error when the Riesz representative is computed on a finite-dimensional trial space and the shape gradient can be evaluated only approximately.

For linear Lagrangian approximations of \(H^1\)-representatives, Proposition 1 implies that the discretization error decays linearly with respect to the mesh width h. This convergence behavior is observed in several numerical experiments.

As a consequence of the Hadamard structure theorem, most shape gradients can be equivalently formulated as boundary or volume integrals. Although Proposition 1 relies on the volume formulation of the shape gradient, we have observed linear convergence independently of the formula employed to evaluate \(d\mathcal{J}\). However, we advise to rely on the volume-based formula because it imposes lower regularity assumptions on the state and the adjoint variables [4, 9, 15].