1 Introduction

Images that take symmetric matrices as values arise in several ways in image acquisition and processing. For example, in diffusion tensor MRI [17] they result from the measurement of second-order diffusion tensors. Fields of structure tensors [13] arise as derived quantities within many image processing methods such as anisotropic diffusion [20] or variational optic flow computation [5]. By a transform introduced in [6, 7], see also [8], colour images can be transformed into fields of symmetric matrices.

Over the last two decades, many image processing methods have been devised for (symmetric) matrix-valued images, see e.g. the edited volumes [15, 21, 24]. In particular, a framework of matrix-valued morphology has been built up, starting from the seminal work of Burgeth et al. [10, 11] in which matrix-valued dilation and erosion operations were introduced. Continuous-scale matrix-valued morphology was established in [9]. Applications to colour images are found in [3, 6, 7, 14].

In their classical formulation morphological dilation and erosion are local filters in which image values from a neighbourhood of each image location are selected by a mask (or structuring element) and aggregated by taking the maximum and minimum, respectively. Since the application of the structuring element does not depend on the range of the image values being selected, the essential step in extending dilation and erosion to matrix-valued images is an appropriate generalisation of the maximum and minimum operations.

The concept of matrix-valued dilation and erosion from [10] relies on the combination of the Loewner ordering [16] as a partial order on symmetric matrices with the concept of total ordering by the trace of occurring matrices. In order to define the supremum of a set of symmetric matrices, one considers the set of all matrices that are greater than or equal to all given matrices w.r.t. the Loewner order, and chooses from this upper bound set the minimal element w.r.t. the trace total order. The criterion for choosing the minimal element from the upper bound set may be varied, leading to variants of the dilation and erosion operations, see [22, Sect. 2.4].

Clearly, the so-defined supremum and infimum generalise the scalar maximum and minimum operations underlying classical grey-value morphology in the sense that also the maximum of grey-values can be understood as the smallest value that is greater or equal to all grey-values from an input set. On the other hand, a sacrifice is made for this: In classical morphology, dilation and erosion and their compositions never extend the range of grey-values of an input set. This is of particular importance when the continuous limit of morphological operations is considered, giving rise to the partial differential equations (PDEs) of continuous-scale morphology [1, 4, 19]: These PDEs then fulfil an extremum principle. Unfortunately, this is no longer true for the matrix-valued dilation and erosion based on [10]; in general the supremum of a set of matrices will have a strictly greater trace than each of them; likewise for the infimum. We are therefore led to ask whether variants of matrix-valued dilation and erosion can be devised that allow for an extremum principle.

Taking a more principled approach to the scalar operations of classical morphology, we can identify two essential features of the maximum (and analogously for the minimum):

  (i) The maximum of given input values is greater or equal to each of them.

  (ii) The maximum of given input values is contained in their convex hull.

We remark that the scalar-valued maximum even happens to coincide with one of the input values. On one hand, in the context of multivariate data such a requirement would lead to discontinuous dependence on the input data and often result in values that do not represent the input data adequately. One might compare also the situation for median filtering of multivariate images, see [23], where filters that select only among the input values have turned out too restrictive. On the other hand, individual values of an image normally result from a sampling process, and are just representatives of a larger set. By admitting convex combination (averaging) which occurs in sampling in a natural way anyway, (ii) provides a conservative estimate of the underlying set of values. We are thus convinced that (ii) is the essential property of the maximum in this context.

The matrix dilation from [10] guarantees property (i) at the expense of giving up (ii). The trace criterion or its alternatives serve as a way to minimise the degree of violation of (ii). In turn, the exact fulfilment of (ii) is what underlies the extremum principle.

For a given set of symmetric matrices, however, the upper bound set will in general be disjoint from the convex hull, so that (ii) can be enforced only at the expense of tolerating violations of (i). We therefore aim at finding matrix-valued replacements for the scalar maximum/minimum operations that satisfy (ii), while minimising the degree of violation of (i) in a suitable sense.
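A minimal worked example may help to see why the upper bound set and the convex hull are typically disjoint. The two matrices below are our own illustrative choice and are not taken from the cited works:

$$\begin{aligned} \varvec{X}_1 := \begin{pmatrix}1&{}0\\ 0&{}0\end{pmatrix}\;,\quad \varvec{X}_2 := \begin{pmatrix}0&{}0\\ 0&{}1\end{pmatrix}\;. \end{aligned}$$

Any \(\varvec{Y}=(y_{ij})\) with \(\varvec{Y}\succeq \varvec{X}_1\) and \(\varvec{Y}\succeq \varvec{X}_2\) must have nonnegative diagonal entries in both differences, i.e. \(y_{11}\ge 1\) and \(y_{22}\ge 1\), hence \(\mathrm {trace}(\varvec{Y})\ge 2\); the identity matrix attains this bound. In contrast, every matrix in the convex hull has the form \(t\varvec{X}_1+(1-t)\varvec{X}_2\) with \(t\in [0,1]\) and therefore trace 1. In this example no upper bound lies in the convex hull, so (i) and (ii) cannot hold simultaneously.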

Our Contribution and Outline of the Paper. The structure of the paper follows the goals formulated above. In Sect. 2 we construct an optimisation procedure, similar to an interior point method, that satisfies property (ii) while minimising the violation of property (i). To the best of our knowledge, our method is the first one designed explicitly for this purpose. In Sect. 3 we validate our approach by experiments on synthetic data sets as well as dilation and erosion of colour images. We conjecture that the results obtained with our method are more intuitive than those obtained with previous methods that violate the extremum principle. We end the paper with a conclusion which indicates the potential of the proposed method for future developments.

2 Matrix Pseudomaximum and Pseudominimum with Extremum Principle

In this section we proceed by giving technical details of the underlying model for decomposing symmetric matrices, see e.g. [23] for related developments in addition to the works mentioned above. After that we give as indicated details on the algorithmic realisation of property (ii).

2.1 Theoretical Background of the Model

For a given matrix \(\varvec{Y}\in \mathrm {Sym}(n)\) we define \(N(\varvec{Y})\) as half the sum of squares of its negative eigenvalues,

$$\begin{aligned} N(\varvec{Y}) := \frac{1}{2} \sum \limits _{j=1}^n[\lambda _{j}(\varvec{Y})]_-^2 \end{aligned}$$
(1)

where \(\lambda _j(\varvec{Y})\) is the j-th-largest eigenvalue of \(\varvec{Y}\) and \([z]_-:=\frac{1}{2}(|z|-z)\) for \(z\in \mathbb {R}\). This can be seen as a penaliser for the degree of violation of the relation \(\varvec{Y}\succeq \varvec{0}\) where \(\succeq \) is the Loewner ordering and \(\varvec{0}\in \mathrm {Sym}(n)\) the zero matrix.
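The penaliser (1) is straightforward to evaluate numerically. The following is a minimal sketch using NumPy; the function name `neg_part_penalty` is our own hypothetical choice and not part of the original method description.

```python
import numpy as np

def neg_part_penalty(Y):
    """N(Y) from (1): half the sum of squared negative eigenvalues of symmetric Y."""
    lam = np.linalg.eigvalsh(Y)      # real eigenvalues of a symmetric matrix
    neg = np.minimum(lam, 0.0)       # equals -[lambda]_- for each eigenvalue
    return 0.5 * float(np.sum(neg ** 2))

# Example: eigenvalues 1 and -2 give N = 0.5 * (-2)^2
value = neg_part_penalty(np.diag([1.0, -2.0]))
```

The penalty vanishes exactly for positive semidefinite arguments, in line with its role as a measure of violation of \(\varvec{Y}\succeq \varvec{0}\).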

For a given (multi-) set

$$\begin{aligned} \mathcal {X} := (\varvec{X}_1,\ldots ,\varvec{X}_m)\;, \quad \varvec{X}_i\in \mathrm {Sym}(n) \end{aligned}$$
(2)

we define the pseudomaximum \(\varvec{\bigvee }(\mathcal {X})\) as the matrix from the convex hull \(\mathrm {conv}(\mathcal {X})\) of \(\mathcal {X}\) for which the total measure of violations of \(\varvec{Y}\succeq \varvec{X}_i\) is minimal:

$$\begin{aligned} \varvec{\bigvee }(\mathcal {X})&:= \mathop {\mathrm {argmin}}_{\varvec{Y}\in \mathrm {conv}(\mathcal {X})} E_\mathcal {X}(\varvec{Y})\;,\quad E_\mathcal {X}(\varvec{Y}):= \sum \limits _{i=1}^m N(\varvec{Y}-\varvec{X}_i) \,. \end{aligned}$$
(3)

Abbreviating \(-\mathcal {X}:=(-\varvec{X}_1,\ldots ,-\varvec{X}_m)\), the pseudominimum \(\varvec{\bigwedge }(\mathcal {X})\) is defined as

$$\begin{aligned} \varvec{\bigwedge }(\mathcal {X}) := -\varvec{\bigvee }(-\mathcal {X}) \;. \end{aligned}$$
(4)
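To make definition (3) concrete, the sketch below evaluates \(E_\mathcal {X}\) along the convex hull of two simple matrices and locates the minimiser by a plain grid search. The input matrices, the grid resolution, and the names `penalty` and `energy` are illustrative assumptions; the grid search only serves to visualise the definition and is not the algorithm proposed in this paper.

```python
import numpy as np

def penalty(Y):
    """N(Y): half the sum of squared negative eigenvalues, see (1)."""
    lam = np.linalg.eigvalsh(Y)
    return 0.5 * float(np.sum(np.minimum(lam, 0.0) ** 2))

def energy(Y, mats):
    """E_X(Y): sum of N(Y - X_i) over all input matrices, see (3)."""
    return sum(penalty(Y - X) for X in mats)

# Illustrative inputs (our own choice): two matrices, neither Loewner-dominating the other
X1 = np.diag([1.0, 0.0])
X2 = np.diag([0.0, 1.0])

# conv(X) is the segment t*X1 + (1-t)*X2, t in [0, 1]
ts = np.linspace(0.0, 1.0, 101)
values = [energy(t * X1 + (1.0 - t) * X2, [X1, X2]) for t in ts]
t_best = ts[int(np.argmin(values))]
```

For this input, \(E_\mathcal {X}\) restricted to the segment equals \(\frac{1}{2}\bigl((1-t)^2+t^2\bigr)\), so the pseudomaximum is the midpoint \(\frac{1}{2}(\varvec{X}_1+\varvec{X}_2)\).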

2.2 Analysis

If \(\varvec{Y}\in \mathrm {Sym}(n)\) has the spectral decomposition

$$\begin{aligned} \varvec{Y} = \sum \limits _{j=1}^n\lambda _j\varvec{w}_j\varvec{w}_j^{\mathrm {T}} \end{aligned}$$
(5)

with eigenvalues \(\lambda _j\) and unit eigenvectors \(\varvec{w}_j\), then the directional derivative of the j-th-largest eigenvalue in direction of a perturbation matrix \(\varvec{Z}\in \mathrm {Sym}(n)\) is

$$\begin{aligned} \left. \frac{\mathrm {d}\lambda _j(\varvec{Y}+\varepsilon \varvec{Z})}{\mathrm {d}\varepsilon }\right| _{\varepsilon =0} = \varvec{w}_j^{\mathrm {T}}\varvec{Z}\varvec{w}_j\;. \end{aligned}$$
(6)

From this it follows that the one-sided derivative of \(N(\varvec{Y})\) w.r.t. \(\varvec{Z}\) is

$$\begin{aligned} \left. \frac{\mathrm {d}N(\varvec{Y}+\varepsilon \varvec{Z})}{\mathrm {d}\varepsilon }\right| _{\varepsilon =0^+} = \sum \limits _{j\in \mathcal {T}^-}\lambda _j \varvec{w}_j^{\mathrm {T}}\varvec{Z}\varvec{w}_j \end{aligned}$$
(7)

where \( \mathcal {T}^-:=\{j\in \{1,\ldots ,n\}~\mid ~\lambda _j<0\} \).
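Formula (7) can be checked numerically against a one-sided finite difference. The sketch below does this for one fixed pair of matrices; the matrices, the step `eps`, and the function names are hypothetical choices made for illustration only.

```python
import numpy as np

def penalty(Y):
    """N(Y): half the sum of squared negative eigenvalues, see (1)."""
    lam = np.linalg.eigvalsh(Y)
    return 0.5 * float(np.sum(np.minimum(lam, 0.0) ** 2))

def penalty_derivative(Y, Z):
    """Right-hand side of (7): sum over negative eigenvalues of lambda_j * w_j^T Z w_j."""
    lam, W = np.linalg.eigh(Y)       # columns of W are unit eigenvectors
    total = 0.0
    for j in range(len(lam)):
        if lam[j] < 0:
            w = W[:, j]
            total += lam[j] * (w @ Z @ w)
    return float(total)

# Illustrative test matrices (our own choice); Y has one negative eigenvalue
Y = np.array([[1.0, 2.0], [2.0, -3.0]])
Z = np.array([[0.5, -1.0], [-1.0, 2.0]])

eps = 1e-6
finite_difference = (penalty(Y + eps * Z) - penalty(Y)) / eps
analytic = penalty_derivative(Y, Z)
```

The two values should agree up to an error of the order of `eps`, since the eigenvalues of this particular \(\varvec{Y}\) are distinct and nonzero.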

The objective function \(E_\mathcal {X}\) in (3) is convex. Whereas we have to defer a detailed proof to a future paper, we mention an important observation which is used in the proof: Let \(N^*>0\) be given, and dilate the cone of positive semidefinite matrices from \(\mathrm {Sym}(n)\) with the Euclidean ball of radius \(\sqrt{2N^*}\) as structuring element (note that by (1), \(N(\varvec{Y})\) is half the squared Euclidean distance of \(\varvec{Y}\) from this cone). This results in a convex set in \(\mathrm {Sym}(n)\), the boundary of which (a hypersurface) is the set of all \(\varvec{Y}\) that fulfil \(N(\varvec{Y})=N^*\). For example, in \(\mathrm {Sym}(2)\) this set is a cone (open toward the direction of increasing trace) with its tip truncated and replaced with a sphere segment. With additional calculations it follows that \(N(\varvec{Y})\) is convex. The same is true for \(E_{\mathcal {X}}\) which is the sum of translated copies of \(N(\varvec{Y})\). By refining the argument, it can be shown that the convexity is even strict within the convex hull of input data.

Therefore (7) can be used to construct a gradient descent algorithm similar to an interior point method [12] for the pseudomaximum: One starts at some location in \(\mathrm {conv}(\mathcal {X})\) and continues by update steps within the convex hull as long as \(E_\mathcal {X}\) can be reduced. As initialisation, one might simply choose the \(\varvec{X}_i\) with maximal trace; update steps for \(\varvec{Y}\) within the convex hull can be devised as moving towards any \(\varvec{X}_i\ne \varvec{Y}\). This can be realised with \(\varvec{Z}:=\varvec{X}_i-\varvec{Y}\) and a suitable step size which should be chosen small enough so that the sign of no relevant eigenvalue of any \(\varvec{Y}-\varvec{X}_i\) changes within the update step.

The pseudomaximum (3) is a weighted average of some Loewner-maximal input matrices, i.e. those which are not Loewner-less or equal to any other input matrix. If there is only one Loewner-maximal input matrix, which is then Loewner-greater or equal to all other input matrices, it is the pseudomaximum (and in this case also the matrix supremum as defined in [10]).
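The notion of Loewner-maximal input matrices can be tested with an eigenvalue check: \(\varvec{A}\preceq \varvec{B}\) holds if and only if the smallest eigenvalue of \(\varvec{B}-\varvec{A}\) is nonnegative. The following sketch assumes pairwise distinct input matrices; the function names and the tolerance `tol` are our own hypothetical choices.

```python
import numpy as np

def loewner_leq(A, B, tol=1e-12):
    """True if A is Loewner-less or equal to B, i.e. B - A is positive semidefinite."""
    return bool(np.linalg.eigvalsh(B - A)[0] >= -tol)   # smallest eigenvalue first

def loewner_maximal(mats):
    """Indices of inputs that are not Loewner-less or equal to any other input.

    Assumes pairwise distinct matrices (duplicates would dominate each other).
    """
    maximal = []
    for i, A in enumerate(mats):
        dominated = any(j != i and loewner_leq(A, B) for j, B in enumerate(mats))
        if not dominated:
            maximal.append(i)
    return maximal

# Illustrative inputs (our own choice): the first matrix dominates the other two
mats = [np.diag([2.0, 2.0]),
        np.diag([1.0, 0.0]),
        np.array([[1.0, 0.5], [0.5, 1.0]])]
indices = loewner_maximal(mats)
```

If the returned list contains a single index, the corresponding input matrix is the pseudomaximum according to the statement above.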

2.3 Exposition on the Algorithm

Given \(\mathcal {X}\) as above, our algorithm for finding \(\varvec{\bigvee }(\mathcal {X})\) proceeds as follows.

Initialisation. Let \(\varvec{Y}_0 := \mathop {\mathrm {argmax}}\nolimits _{\varvec{Y}\in \mathcal {X}} \mathrm {trace}(\varvec{Y})\;.\)

Iteration. For \(k=0,1,\ldots \):

  1.

    For each \(i=1,\ldots ,m\):

    Compute the spectral decomposition of \(\varvec{D}_{k,i}:=\varvec{Y}_k-\varvec{X}_i\):

    $$\begin{aligned} \varvec{D}_{k,i} = \sum \limits _{r=1}^n\lambda _{k,i,r} \varvec{w}_{k,i,r}\varvec{w}_{k,i,r}^{\mathrm {T}} \,. \end{aligned}$$
    (8)
  2.

    Determine the index set

    $$\begin{aligned} \mathcal {T}_{k}^-&:= \left\{ (i,r)~\mid ~\lambda _{k,i,r}<0\right\} \,. \end{aligned}$$
    (9)
  3.

    For each \(j=1,\ldots ,m\):

    Let \(\varvec{D}_{k,j}\) be defined as in Step 1. For \((i,r)\in \mathcal {T}_{k}^-\) let

    $$\begin{aligned} d_{k,j,i,r} := \lambda _{k,i,r}\varvec{w}_{k,i,r}^{\mathrm {T}}\varvec{D}_{k,j} \varvec{w}_{k,i,r}\,. \end{aligned}$$
    (10)

    Let

    $$\begin{aligned} g_{k,j}:= -\sum \limits _{(i,r)\in \mathcal {T}_{k}^-} d_{k,j,i,r} \,. \end{aligned}$$
    (11)
  4.

    Let

    $$\begin{aligned} (j^*(k), g_k) := \left( \mathrm {argmin},\min \right) _{j=1,\ldots ,m}g_{k,j} \,. \end{aligned}$$
    (12)
  5.

    If \(g_k\ge 0\), stop; \(\varvec{Y}_k\) is the sought minimiser.

    Otherwise choose a step size \(\tau _k\) which fulfils

    $$\begin{aligned} 2\tau _k d_{k,j^*(k),i,r}&\le |\lambda _{k,i,r} |\quad \text {for all }(i,r)\in \mathcal {T}_k^-\,,&\tau _k&\le 1 \end{aligned}$$
    (13)

    and let

    $$\begin{aligned} \varvec{Y}_{k+1}:=\varvec{Y}_k-\tau _k\varvec{D}_{k,j^*(k)} \,. \end{aligned}$$
    (14)

    Check whether \(E_\mathcal {X}(\varvec{Y}_{k+1})<E_\mathcal {X}(\varvec{Y}_k)\); if this is not the case, choose a smaller value for \(\tau _k\) and repeat (14).

  6.

    Numerical stopping criterion: If \(|E_\mathcal {X}(\varvec{Y}_{k+1})-E_\mathcal {X}(\varvec{Y}_k)|\) is below a predefined threshold, stop; \(\varvec{Y}_{k+1}\) is an approximation to the sought minimiser.
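The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the step size is found by simple halving starting from \(\tau =1\) until the energy decreases, which replaces the explicit bound (13), and the parameters `tol` and `max_iter` as well as all function names are hypothetical.

```python
import numpy as np

def energy(Y, mats):
    """E_X(Y) from (3)."""
    total = 0.0
    for X in mats:
        lam = np.linalg.eigvalsh(Y - X)
        total += 0.5 * np.sum(np.minimum(lam, 0.0) ** 2)
    return float(total)

def pseudomaximum(mats, tol=1e-12, max_iter=1000):
    mats = [np.asarray(X, dtype=float) for X in mats]
    Y = max(mats, key=np.trace).copy()          # initialisation: maximal trace
    E = energy(Y, mats)
    for _ in range(max_iter):
        # Steps 1 and 2: negative eigenpairs of all D_i = Y - X_i
        pairs = []
        for X in mats:
            lam, W = np.linalg.eigh(Y - X)
            pairs += [(lam[r], W[:, r]) for r in range(len(lam)) if lam[r] < 0]
        # Step 3: g_j, the directional derivative of E_X towards X_j, see (10), (11)
        g = [-sum(l * (w @ (Y - X) @ w) for l, w in pairs) for X in mats]
        # Step 4: steepest admissible direction
        j = int(np.argmin(g))
        # Step 5: stop if no direction reduces the energy
        if g[j] >= 0:
            break
        D = Y - mats[j]
        tau = 1.0                                # tau <= 1 keeps Y in the convex hull
        while tau > 1e-16:
            Y_new = Y - tau * D                  # update (14)
            E_new = energy(Y_new, mats)
            if E_new < E:
                break
            tau *= 0.5                           # assumption: halving instead of (13)
        else:
            break                                # no decreasing step found
        # Step 6: numerical stopping criterion
        converged = E - E_new < tol
        Y, E = Y_new, E_new
        if converged:
            break
    return Y

def pseudominimum(mats):
    """Pseudominimum via (4)."""
    return -pseudomaximum([-np.asarray(X, dtype=float) for X in mats])

Y_max = pseudomaximum([np.diag([1.0, 0.0]), np.diag([0.0, 1.0])])
```

Because every update is a convex combination of the current iterate and one input matrix, all iterates remain in \(\mathrm {conv}(\mathcal {X})\) up to rounding errors.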

3 Experiments

In this section we show the effect of morphological dilation and erosion using the pseudomaximum and pseudominimum of symmetric matrices introduced in Sect. 2. For brevity, we will refer to these operations as X-dilation and X-erosion (X standing for “obeying extremum principle”).

For comparison, we use morphological dilation using the matrix supremum from [10] which we will denote as L-dilation (L indicating the strict Loewner order between the supremum and the input data guaranteed by this approach), and two versions of the morphological erosion using matrix infima as defined in [10] and [7]. In [10], the infimum of positive definite matrices is defined as the matrix inverse of the supremum of the matrix inverses of the input matrices; this definition is suitable for positive definite matrix data as it is designed to preserve positive definiteness. In contrast, [7] uses an infimum that is minus the supremum of the sign-inverted input matrices. The latter definition cannot guarantee positive definite results even for positive definite input matrices; it is suitable for matrix data the eigenvalues of which can take either sign. We will denote the first variant as LP-erosion (P standing for “positive definite”), and the second one as LI-erosion (I for “indefinite”). Note that no such distinction is needed for X-erosion: by virtue of the extremum principle, (4) can be used also for positive definite matrices.

Fig. 1. Synthetic matrix morphology example. a Original data set consisting of symmetric positive definite \(2\times 2\) matrices \(\varvec{A}\) (depicted by ellipses \(\varvec{x}^\mathrm {T}\varvec{A}^{-1}\varvec{x}=1\)). Thin lines delineate one exemplary pixel and its corresponding structuring element. – b L-dilation following [10]. – c X-dilation. – d LP-erosion following [10]. – e X-erosion.

Experiment 1: Synthetic Data. We start with an experiment on synthetic data. Figure 1 a shows an array of 100 symmetric positive definite \(2\times 2\) matrices \(\varvec{A}\) depicted by ellipses \(\varvec{x}^\mathrm {T}\varvec{A}^{-1}\varvec{x}=1\). For the subsequent morphological operations we use a structuring element containing all pixels with distance \(\le 2\) from the centre (as shown for one exemplary pixel in the figure), and reflecting boundary conditions.
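The structuring element and boundary treatment described above can be sketched as follows. The mirroring convention at the boundary (edge pixels are repeated) and the function names are our own assumptions; the text only states that reflecting boundary conditions are used.

```python
import numpy as np

def disc_offsets(radius):
    """Pixel offsets (dy, dx) with Euclidean distance <= radius from the centre."""
    r = int(np.floor(radius))
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= radius * radius]

def reflect(i, n):
    """Mirror an index into the range 0..n-1 (assumed convention: edge pixel repeated)."""
    if i < 0:
        return -i - 1
    if i >= n:
        return 2 * n - i - 1
    return i

def neighbourhood(field, y, x, offsets):
    """Values of `field` under the structuring element centred at pixel (y, x)."""
    h, w = field.shape[:2]
    return [field[reflect(y + dy, h), reflect(x + dx, w)] for dy, dx in offsets]

offsets = disc_offsets(2)      # the 13-pixel structuring element
```

For a matrix field stored as an array of shape (height, width, 2, 2), `neighbourhood` returns the list of \(2\times 2\) matrices to which a supremum, infimum, pseudomaximum or pseudominimum is then applied.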

Frame b shows the result of L-dilation. Whereas in regions with well-aligned eigensystems (as near the lower boundary of the array) larger values are nicely propagated, the dilation creates matrices exceeding all contributing input matrices when these are not well aligned (as near the top boundary). Frame c shows the result of X-dilation, i.e. obtained using the proposed framework. In regions with well-aligned eigensystems the result is similar to that of L-dilation. In non-aligned regions, still larger eigenvalues are propagated but by the restriction to the convex hull of contributing input matrices no amplification of values is observed. Frames d and e in the bottom row of Fig. 1 show a similar effect for LP-erosion and X-erosion.

Let us note that the amplification of values as observable within the results of L-dilation could be interpreted as a potential trace of instability in the context of a PDE-based formulation.

Experiment 2: Colour Imagery. In our second experiment we apply the matrix-valued morphological operations to filter colour images. A correspondence between colour images and fields of \(\mathrm {Sym}(2)\) matrices was established in [6]. This correspondence is mediated by the HCL (hue–chroma–luminance) colour space. From a given RGB triple (r, g, b), chroma c is obtained by \(c:=M-m\) where \(M:=\max \{r,g,b\}\), \(m:=\min \{r,g,b\}\), luminance l by \(l:=\frac{1}{2}(M+m)\) and hue h by \(h:=D+\frac{1}{6} d/c \bmod 1\) where \(d:=g-b\), \(D:=0\) for \(r\ge g,b\); \(d:=b-r\), \(D:=1/3\) for \(g\ge r,b\); \(d:=r-g\), \(D:=2/3\) for \(b\ge r,g\). A symmetric \(2\times 2\) matrix \(\varvec{A}=\varvec{A}(r,g,b)\) is then obtained by

$$\begin{aligned} \varvec{A}:= \frac{2l-1}{\sqrt{2}} \begin{pmatrix}1&{}0\\ 0&{}1\end{pmatrix} + \frac{c}{\sqrt{2}} \begin{pmatrix}-\sin (2\pi h)&{}\cos (2\pi h)\\ \cos (2\pi h)&{}\sin (2\pi h) \end{pmatrix} \;. \end{aligned}$$
(15)

For further details see [6, 8, 14]. By this transformation a bijection between the RGB colour space and a compact convex set of symmetric matrices (namely, a bi-cone) is established, see Fig. 2.
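The colour-to-matrix transform (15) can be sketched in a few lines. The sketch assumes RGB values in \([0,1]\), normalises the hue by the chroma c (the usual hexcone convention), and sets the hue to zero for grey values where it is undefined; these conventions and the function names are our assumptions. Only the matrix-to-HCL half of the inverse is shown.

```python
import numpy as np

def rgb_to_sym2(r, g, b):
    """Map an RGB triple in [0,1]^3 to a symmetric 2x2 matrix via HCL, see (15)."""
    M, m = max(r, g, b), min(r, g, b)
    c = M - m                          # chroma
    l = 0.5 * (M + m)                  # luminance
    if c == 0:
        h = 0.0                        # assumption: hue of grey values set to 0
    elif M == r:
        h = ((g - b) / (6.0 * c)) % 1.0
    elif M == g:
        h = (1.0 / 3.0 + (b - r) / (6.0 * c)) % 1.0
    else:
        h = (2.0 / 3.0 + (r - g) / (6.0 * c)) % 1.0
    phi = 2.0 * np.pi * h
    R = np.array([[-np.sin(phi), np.cos(phi)],
                  [np.cos(phi), np.sin(phi)]])
    return ((2.0 * l - 1.0) * np.eye(2) + c * R) / np.sqrt(2.0)

def sym2_to_hcl(A):
    """Recover (h, c, l) from a matrix of the form (15)."""
    z = (A[0, 0] + A[1, 1]) / np.sqrt(2.0)      # 2l - 1
    y = (A[1, 1] - A[0, 0]) / np.sqrt(2.0)      # c * sin(2 pi h)
    x = np.sqrt(2.0) * A[0, 1]                  # c * cos(2 pi h)
    c = float(np.hypot(x, y))
    h = float((np.arctan2(y, x) / (2.0 * np.pi)) % 1.0)
    return h, c, 0.5 * (z + 1.0)

A_red = rgb_to_sym2(1.0, 0.0, 0.0)
h, c, l = sym2_to_hcl(rgb_to_sym2(0.0, 1.0, 0.0))
```

White and black map to the two tips \(\pm \frac{1}{\sqrt{2}}\varvec{I}\) of the bi-cone, and fully saturated colours lie on its rim where \(2l-1=0\) and \(c=1\).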

Fig. 2. Colour bi-cone, figure adapted from [7].

Following the procedure from [6], we can now wrap matrix-valued dilations and erosions in the RGB–\(\mathrm {Sym}(2)\) transform (15) and its inverse to obtain dilations and erosions for colour images. In this case, the infimum-based erosion is chosen as LI-erosion because the bi-cone of matrices is symmetric about zero. Our comparison therefore includes L-dilation, LI-erosion, X-dilation and X-erosion.

However, as pointed out in [6], a difficulty arises for L-dilation and LI-erosion as the supremum and infimum of matrices may generate values outside the bi-cone to which RGB colours are mapped. In [6, 7], an additional transform is therefore proposed to map back the supremum and infimum into the bi-cone.

For better comparison with X-dilation and X-erosion which do not require such a transform due to their built-in extremum principle, we omit the additional transform also for L-dilation and LI-erosion. Effectively, overshooting matrix values are just projected to the admissible colour range (sacrificing invertibility for these values).

Our colour test image is shown in Fig. 3a; two zoom-ins are shown in Fig. 4a, f. As in the previous experiment, we use the 13-pixel structuring element shown in Fig. 1, and reflecting boundary conditions. Frames b and c of Fig. 3 show the results of L-dilation and X-dilation, respectively, see also the clippings in Fig. 4b, c and g, h.

As expected, both operations extend bright structures. However, at locations where regions of comparable brightness but different colours meet, as in the region of Fig. 4a, the supremum-based dilation generates artificial colours brighter than their surrounds. This is not the case for X-dilation; instead, colours of similar brightness mutually retard their propagation. The difference image in Fig. 4k confirms this effect. In contrast, the clipping in Fig. 4f has fairly similar dilation results, see also the difference image in frame l. Here, adjacent colours are sufficiently similar or differ substantially in brightness such that the brighter colour can be propagated without generating artificial colours or exaggerating brightness.

Analogous observations can be made for erosion, see Fig. 3d, e as well as Fig. 4d, e, i, j (clippings) and m, n (difference images). Again, LI-erosion leads to brightness undershoots and artificial colours (albeit visually less pronounced due to their dark appearance) which are safely avoided by X-erosion.

Fig. 3. Colour morphology example. a Original colour image peppers, \(512\times 512\) pixels. – b L-dilation similar to [7]. – c X-dilation. – d LI-erosion similar to [7]. – e X-erosion.

Fig. 4. Colour morphology example, continued. a, f Two zoomed details from peppers image, \(100\times 100\) pixels each. – b, g L-dilation. – c, h X-dilation. – d, i LI-erosion. – e, j X-erosion. – k Difference of b and c. Middle grey represents zero, brighter colours represent positive differences, darker colours negative differences. – l Difference of g and h. – m Difference of d and e. – n Difference of i and j.

Experiment 3: Discontinuous Synthetic Colour Images. Making use of the same framework for colour images as in the previous paragraph, we now consider a particularly simple setting for colour images in which the interaction of colours during filtering is easily observed. Modifying the type of our experiments, we will now compare the proposed method with the PDE-based scheme built upon the discretisation of Rouy and Tourin [18], which was used as a building block in [2], and with the lattice-based dilation/erosion procedure from [6, 7], this time without any modification. We denote by RT-dilation and RT-erosion, respectively, the PDE-based results.

Let us note that the method of Rouy and Tourin is given by a first-order scheme. On one hand, this means that in the scalar case it is known to introduce blurry artefacts, compare again [2]. On the other hand, this is why it satisfies by construction in the scalar case the extremum principle as discussed in this work. Thus, any potential over-/undershoots that could be observed in results are due to the maximum respectively minimum construction as used in previous work.

In Fig. 5 we show the results of the experiment. Note that the images are very small, so that the figure effectively shows a zoom on the results. We have performed just one step of dilation/erosion with the discrete methods using a small \(3\times 3\) structuring element, which corresponds to two steps with the RT method. The results of the original lattice-based method displayed in the first column clearly show the effect of leaving the convex hull of values. The remarkable colour mixture is due to the colours chosen in the experiment and their arrangement in the colour bi-cone. The RT scheme, the results of which are shown in the second column, displays some blurry artefacts, yet one also clearly observes similar effects of leaving the convex hull of colours as with the previous method. The results obtained by the proposed method, given in the third column, are considerably more intuitive to interpret. Still we observe some mixing of the colours, which is due to the choice of our objective function within the optimisation: if there exist multiple input values not dominated by others in the underlying partial order, none of which can therefore represent the input set well, a compromise between them is found.

Fig. 5. Colour morphology example with simple images. This test is designed to observe in detail the effect of the extremum principle. a Original colour image, \(8\times 8\) pixels. – b L-dilation of [7]. – c RT-dilation. – d X-dilation. – e LI-erosion of [7]. – f RT-erosion. – g X-erosion.

4 Summary and Conclusion

We have shown that the use of the convex hull of matrices within the structuring element of matrix-valued dilation/erosion appears to be a suitable generalisation of the corresponding property in the scalar setting. The computational results indicate favourable stability properties in comparison to other methods in the field and encourage further investigation. In this context, generalisations of the objective function used in our optimisation will also be considered. It will also be of interest to what extent algebraic properties of classical scalar-valued dilation and erosion, such as adjunction, can be transferred to a matrix-valued setting.

Let us elaborate some more on the potential implications of our results. From the theoretical perspective of the numerical analysis of a potential PDE-based interpretation, the validity of a discrete extremum principle is a fundamental property in the classical theory of numerical schemes. This holds in particular for the underlying PDEs of dilation and erosion in the scalar case, which are Hamilton-Jacobi equations. It is a cornerstone of the numerical analysis of PDEs that a stability notion such as an extremum principle, together with consistency, can make it possible to prove convergence of the underlying scheme. Thus our paper may form a first step towards the numerical analysis of a matrix-valued PDE.

Turning to possible implications on the more practical side, our new matrix-valued dilation and erosion provide an interesting basis for building more complex morphological procedures. For instance, they appear well-suited as a building block for morphological levelings. In the near future we aim to explore this possibility and evaluate the performance of the proposed concept in that context.