1 Introduction

Intensity-derivative methods have been widely used in image processing to detect intensity changes as contours. The maximum of the first derivative (or the zero crossing of the second derivative) is used to locate step-like changes and discontinuities of intensity [4, 9, 10]. These changes and discontinuities are generally associated with the contours of objects present in the image. Although very significant progress has been made in edge detection, the empirical gradient estimation techniques proposed in the 1970s and 1980s are still often used alongside more modern techniques. The gradient operator used for edge detection is formally sound but, unfortunately, has yielded little further progress.

In the last few years, some authors have shown that the local phase information of an image is more robust than the intensity gradient [3]. Measuring the local phase at several scales, also known as phase congruency, is a way of characterising intensity differences in terms of the shape of the intensity profile. In practice, local phase information is estimated using quadrature filter kernels [2] and can easily be extended to higher dimensions using the monogenic signal representation [6].

The monogenic signal is considered a natural 2D extension of the analytic signal. It is derived from the generalisation of the Hilbert transform known as the Riesz transform. This generalisation has opened up new perspectives in image processing. Examples include, non-exhaustively, the work of Zang et al. on optical flow [16], the work of Unser et al. on wavelets [15] and a recent work on segmentation by Belaid et al. [3].

Our contribution lies in this double context of edge detection and the Riesz transform. We proceed by recalling the link between the Riesz transform and the gradient and Laplacian operators. Specifically, we derive the equivalents of the Canny detector and of the Laplacian zero crossings in the monogenic domain. This view allows us to highlight new perspectives on edge detection models.

This article is organised as follows: first, we give the brief background needed to understand the proposed approach; we then develop its details. Finally, some preliminary comparison results and a conclusion are presented.

2 Monogenic Signal and Riesz Transform

Using the Hilbert transform \(f_{\mathcal H}(x)\) of an original 1D signal f(x), it is possible to introduce the corresponding analytic signal \(f_{A}(x)\). This concept is a widely used tool in signal processing and is given by:

$$\begin{aligned} f_{A}(x)=f(x)+i f_{\mathcal H}(x). \end{aligned}$$
(1)
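As an illustration of Eq. (1), the following minimal sketch computes a discrete analytic signal with the FFT, assuming a real, periodic, uniformly sampled signal. The function name `analytic_signal` and the cosine test signal are illustrative choices, not part of the original method.

```python
import numpy as np

def analytic_signal(f):
    """Return f_A = f + i*f_H for a real, periodic 1D signal f (FFT-based)."""
    f = np.asarray(f, dtype=float)
    n = f.size
    spectrum = np.fft.fft(f)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero out the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

n = 256
x = np.arange(n)
f = np.cos(2 * np.pi * 8 * x / n)   # illustrative signal: 8 full periods
f_A = analytic_signal(f)
local_amplitude = np.abs(f_A)       # envelope of the signal
local_phase = np.angle(f_A)         # local phase in (-pi, pi]
```

For this cosine, the imaginary part of `f_A` is the corresponding sine, so the local amplitude is constant and the local phase grows linearly (modulo \(2\pi\)).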

Applying this signal to image processing requires generalising the Hilbert transform to multidimensional signals. A direct generalisation to higher dimensions is not obvious, because the concept of positive and negative frequencies is not clear in this case. Several attempts to generalise the 1D analytic signal can be found in the literature. However, the monogenic signal, introduced by Felsberg and Sommer [6], is considered a natural 2D extension of the analytic signal. It is derived from the generalisation of the Hilbert transform known as the Riesz transform.

It should be noted that this nD generalisation is based on preserving the 1D local phase information while adding the local orientation information. Both features are integrated into a larger space. Thus, for a real nD signal, the extension is represented by an analytic signal of dimension (\(n+1\)). The combination of the Riesz transform and the original signal forms the new generalised nD analytic signal [6]. The Riesz transform in the spatial domain is given by:

$$\begin{aligned} \mathbf {f}_{\mathcal {R}}(\mathbf x)=\frac{\mathbf x}{A_{n+1}|\mathbf x|^{n+1}}*f(\mathbf x)=(\mathbf {h}*f)(\mathbf x),\,\,\,\text {with}\,\,\, A_{n+1}=\frac{\pi ^{\frac{n+1}{2}}}{\varGamma (\frac{n+1}{2})}. \end{aligned}$$
(2)

For the particular case of a two-dimensional signal \((n=2)\) the generalised Hilbert transform is written as:

$$\begin{aligned} \mathbf {h}(\mathbf x)&=\frac{\mathbf x}{2\pi |\mathbf x|^3}=(h_x,h_y)(\mathbf x) =\Big (\frac{x}{2\pi (x^2+y^2)^{3/2}},\frac{y}{2\pi (x^2+y^2)^{3/2}}\Big ). \end{aligned}$$
(3)

Once the nD generalisation of the Hilbert transform is defined, it is then easy to introduce the generalisation of the analytic signal. The new signal \(f_{M}(\mathbf x): \mathbb {R}^n \rightarrow \mathbb {R}^{n+1}\), called the monogenic signal, is defined in a space of dimension \(n+1\) with sufficient degrees of freedom to represent the local characteristics of a signal in nD:

$$\begin{aligned} f_{M}(\mathbf x)=(\mathbf {f}_{\mathcal {R}},f)(\mathbf x). \end{aligned}$$
(4)
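A minimal sketch of Eqs. (2)-(4) for \(n=2\) is given below. It assumes a periodic image and applies the Riesz transform in the Fourier domain, where the kernel of Eq. (3) has the standard frequency response \(-i\,\mathbf u/|\mathbf u|\). The function names `riesz_transform` and `monogenic_signal` are illustrative.

```python
import numpy as np

def riesz_transform(f):
    """Return the two Riesz components (r_x, r_y) of a real 2D image f."""
    rows, cols = f.shape
    v = np.fft.fftfreq(rows)[:, None]    # frequencies along axis 0 (y)
    u = np.fft.fftfreq(cols)[None, :]    # frequencies along axis 1 (x)
    radius = np.sqrt(u**2 + v**2)
    radius[0, 0] = 1.0                   # avoid 0/0 at DC; numerators vanish there
    F = np.fft.fft2(f)
    r_x = np.real(np.fft.ifft2(-1j * u / radius * F))
    r_y = np.real(np.fft.ifft2(-1j * v / radius * F))
    return r_x, r_y

def monogenic_signal(f):
    """Stack (f_R, f) into an array of shape (3, rows, cols), as in Eq. (4)."""
    r_x, r_y = riesz_transform(f)
    return np.stack([r_x, r_y, f])

# Illustrative input: a horizontal cosine wave, constant along y.
cols = np.arange(64)
wave = np.tile(np.cos(2 * np.pi * 4 * cols / 64), (64, 1))
f_M = monogenic_signal(wave)
```

For this purely horizontal wave, the first Riesz component is the matching sine wave and the second vanishes, which mirrors the behaviour of the 1D Hilbert transform.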

The Riesz transform preserves the most interesting properties of the 1D Hilbert transform.

3 Link Between the Differential Operators and the Riesz Transform

Felsberg and Sommer [7] and Unser et al. [15] have recently shown the existence of a direct link between the Riesz transform \(\mathbf {f}_{\mathcal {R}}(\mathbf x)\) and the complex gradient (also called the Wirtinger operator):

$$\begin{aligned} \mathbf {f}_{\mathcal {R}}(\mathbf x)= & {} -\Bigg ( \frac{\partial }{\partial x}+i \frac{\partial }{\partial y} \Bigg ) \Bigg ( \frac{1}{2\pi |\mathbf x|}*f(\mathbf x) \Bigg ), \end{aligned}$$
(5)

which means that:

$$\begin{aligned} |\mathbf {f}_{\mathcal {R}}(\mathbf x)|= & {} \Bigg |\nabla \Bigg (f(\mathbf x)* \frac{1}{2\pi |\mathbf x|} \Bigg )\Bigg |. \end{aligned}$$
(6)

This formulation allows us to interpret the Riesz transform as the gradient of a smoothed image. Two models can be derived from this interpretation: the maximum of \(|\mathbf {f}_{\mathcal {R}}|\), which is similar to the Canny edge detector, and the zero crossing of the derivative of \(\mathbf {f}_{\mathcal {R}}\), which is analogous to the Laplacian model.

The first method, based on the maximum of the Riesz transform and denoted Max_TR, consists in convolving the image f with the gradient of the band-stop filter kernel \(\psi _s\) (see Fig. 1):

$$\begin{aligned} f*\nabla ( \psi _s), \end{aligned}$$
(7)

with \(\psi _s = h*c_s\), \(h=-\frac{1}{2\pi |\mathbf x|}\) and \(c_s\) a low-pass filter at a given scale \(s>0\). Thus the equivalent of function h in the Fourier domain is given by \(\mathrm {H}(\mathbf u)=-|\mathbf u|\).

With this new detector, an image contour is obtained by filtering the image with the first derivative of a band-stop filter (see Fig. 1), and then detecting the maxima of the resulting function. Based on the same interpretation, the second method, denoted PZ_DTR, follows the same development as that of the Haralick-Canny detector for the Laplacian zero crossings [4, 9].
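The Max_TR response can be sketched as follows. This is a minimal illustration rather than the authors' implementation: following the interpretation of Eq. (6), it computes the Riesz transform of the image smoothed by \(c_s\) directly in the Fourier domain, and it assumes a Poisson low-pass filter with frequency response \(e^{-2\pi |\mathbf u| s}\). The function name `max_tr_response`, the scale value and the crude threshold are hypothetical, and non-maximum suppression is omitted.

```python
import numpy as np

def max_tr_response(f, s=2.0):
    """Magnitude |q| of the Riesz transform of f smoothed at scale s."""
    rows, cols = f.shape
    v = np.fft.fftfreq(rows)[:, None]
    u = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(u**2 + v**2)
    c_s = np.exp(-2.0 * np.pi * radius * s)      # assumed Poisson low-pass filter
    safe = np.where(radius > 0, radius, 1.0)     # avoid 0/0 at DC
    F = np.fft.fft2(f) * c_s
    q1 = np.real(np.fft.ifft2(-1j * u / safe * F))
    q2 = np.real(np.fft.ifft2(-1j * v / safe * F))
    return np.hypot(q1, q2)

# Illustrative input: a vertical step edge between columns 31 and 32.
image = np.zeros((64, 64))
image[:, 32:] = 1.0
magnitude = max_tr_response(image)
edges = magnitude > 0.5 * magnitude.max()        # crude threshold, illustration only
```

Because the FFT treats the image as periodic, the step also produces a second response at the image border; a practical version would pad or window the image first.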

Fig. 1. Shape of the filters used. From left to right, the filters \(c_s\) (a), \(\psi _s\) (b), \(\nabla \psi _s\) (c) and \(\varDelta \psi _s\) (d) in the Fourier domain at a given scale \( s> 0 \).

In order to achieve this, we associate with f an image of the directions of the Riesz transform \(\mathbf {q}=(q_1,q_2)=(f*h_x*c_s,\,f*h_y*c_s)\) computed at scale s:

$$\begin{aligned} \mathbf {r}= \frac{\mathbf {q}}{|\mathbf {q}|}. \end{aligned}$$
(8)

A contour point is then defined as the location of a maximum of the norm of the Riesz transform in the direction specified by \(\mathbf {r}=(q_1/|\mathbf {q}|,q_2/|\mathbf {q}|)\). Thus, a contour point satisfies:

$$\begin{aligned} \frac{\partial |\mathbf {f}_{\mathcal {R}}(\mathbf x)|}{\partial \mathbf {r}}=0,~~\text {and}~~\frac{\partial ^2 |\mathbf {f}_{\mathcal {R}}(\mathbf x)|}{\partial \mathbf {r}^2}\le 0. \end{aligned}$$
(9)

The development of this property leads to

$$\begin{aligned} \frac{\partial |\mathbf {f}_{\mathcal {R}}(\mathbf x)|}{\partial \mathbf {r}}= & {} \mathbf {r}^t~~ \nabla |\mathbf {f}_{\mathcal {R}}(\mathbf x)|= \frac{\mathbf {q}^t}{|\mathbf {q}|}\nabla |\mathbf {q}|\\\nonumber= & {} \frac{\mathbf {q}^t}{|\mathbf {q}|} \begin{pmatrix} \frac{\partial }{\partial x} \sqrt{q_1^2+q_2^2} \\ \frac{\partial }{\partial y}\sqrt{q_1^2+q_2^2} \end{pmatrix}=\frac{\mathbf {q}^t}{|\mathbf {q}|} \begin{pmatrix} \frac{q_1 q_{1x}+q_2 q_{2x}}{|\mathbf {q}|}\\ \frac{q_1 q_{1y}+q_2 q_{2y}}{|\mathbf {q}|} \end{pmatrix}\\\nonumber= & {} \frac{\mathbf {q}^t}{|\mathbf {q}|} \begin{pmatrix} q_{1x} &{}q_{2x}\\ q_{1y} &{}q_{2y} \end{pmatrix}\frac{\mathbf {q}}{|\mathbf {q}|} =\mathbf {r}^tH\mathbf {r}, \end{aligned}$$
(10)

where \(H = \begin{pmatrix} q_{1x} &{} q_{2x} \\ q_{1y} &{} q_{2y} \end{pmatrix} \) represents a symmetric matrix similar to the Hessian matrix. The symmetry is due to certain properties of the Riesz transform (see [7], Eq. (9)). In addition, if we assume that the image f is locally coherent, that is, it has a one-dimensional structure along the direction of \(\mathbf {r}\), then the matrix H is of rank 1, and \(\mathbf {r}^t H \mathbf {r}= trace(H)=q_{1x}+q_{2y}\).
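One way to see the last identity, under the assumption that the symmetric rank-1 matrix H has its range spanned by the unit vector \(\mathbf {r}\) (so that \(H=\lambda \,\mathbf {r}\mathbf {r}^t\) for some scalar \(\lambda \), a symbol introduced here only for illustration), is the following:

$$\begin{aligned} \mathbf {r}^t H \mathbf {r}=\lambda \,(\mathbf {r}^t\mathbf {r})^2=\lambda ,\qquad trace(H)=\lambda \, trace(\mathbf {r}\mathbf {r}^t)=\lambda \,|\mathbf {r}|^2=\lambda . \end{aligned}$$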

Finally, we obtain the following equation:

$$\begin{aligned} \frac{\partial |\mathbf {f}_{\mathcal {R}}(\mathbf x)|}{\partial \mathbf {r}}= & {} q_{1x}+q_{2y}\\= & {} \frac{\partial }{\partial x}(f*h_x*c_s) + \frac{\partial }{\partial y}(f*h_y*c_s)\nonumber \\= & {} 0.\nonumber \end{aligned}$$
(11)

Equation (11) shows that the points which maximise \(|\mathbf {f}_{\mathcal {R}}(\mathbf x)|\) are those representing the zero crossing of the divergence of the Riesz transform \(\text {div}(\mathbf {q})\). Thus, by analogy with the Marr detector [10], the zero crossing of the divergence of the Riesz transform (DTR) provides information about the position of the contour.

Using the convolution product properties, Eq. (11) can be rewritten as follows:

$$\begin{aligned} f*\bigg ( \frac{\partial ^2 }{\partial x^2} + \frac{\partial ^2 }{\partial y^2}\bigg ) \bigg ( h*c_s\bigg ), \end{aligned}$$
(12)

so that our detector can be written symbolically as:

$$\begin{aligned} Image~contours=Zero~crossings\big (f*\varDelta \psi _s \big ). \end{aligned}$$
(13)
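Equations (11)-(13) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a periodic image and a Poisson low-pass filter \(c_s\) with frequency response \(e^{-2\pi |\mathbf u| s}\), and it evaluates the divergence \(\text {div}(\mathbf {q})\) in the Fourier domain. The function name `dtr_zero_crossings` and the scale value are hypothetical, and the second-order condition of Eq. (9) is not checked.

```python
import numpy as np

def dtr_zero_crossings(f, s=2.0):
    """Return div(q) and a boolean mask of its zero crossings."""
    rows, cols = f.shape
    v = np.fft.fftfreq(rows)[:, None]
    u = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(u**2 + v**2)                 # |w| with w = (u, v)
    c_s = np.exp(-2.0 * np.pi * radius * s)       # assumed Poisson low-pass filter
    # d/dx and d/dy act as i*2*pi*u and i*2*pi*v; the Riesz transform acts as
    # -i*w/|w|. Hence div(q) has the real multiplier 2*pi*|w|*c_s.
    div_q = np.real(np.fft.ifft2(2.0 * np.pi * radius * c_s * np.fft.fft2(f)))
    zc = np.zeros(f.shape, dtype=bool)
    zc[:, :-1] |= (div_q[:, :-1] * div_q[:, 1:]) < 0   # horizontal sign changes
    zc[:-1, :] |= (div_q[:-1, :] * div_q[1:, :]) < 0   # vertical sign changes
    return div_q, zc

# Illustrative input: a vertical step edge between columns 31 and 32.
image = np.zeros((64, 64))
image[:, 32:] = 1.0
div_q, zero_crossings = dtr_zero_crossings(image)
```

In flat regions, small numerical fluctuations can also change sign, so a practical detector would typically keep only the zero crossings where \(|\mathbf {q}|\) is large enough.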

This shows that a contour image is obtained by filtering the image with the second derivative of a band-stop filter (see Fig. 1), and then detecting the zeros of the resulting function. Figure 1 shows the shape of the \(c_s\) filter, of the band-stop filter kernel \(\psi _s\) and of its Laplacian used for edge detection.

Table 1. Summary of comparison results. The table reports the F measure (ODS) of the best score over the entire image database, the OIS measure of the best score per image and the AR measure representing the area under the Precision-Recall curve.

4 Results and Discussion

To evaluate the performance of the proposed approach, we compared the automatic results with the manual delineations of the Berkeley Segmentation Database [1]. One filter was chosen for these tests, namely the \(\alpha \)-scale-space filter [5]. This filter was chosen because its parameter \(\alpha \in ]0,1]\) makes it possible to recover classical filters such as the Gaussian filter (\(\alpha = 1\)) and the Poisson filter (\(\alpha = 0.5\)).
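As a hedged sketch of such a filter, the code below builds a frequency response of the form \(e^{-s|\boldsymbol{\omega }|^{2\alpha }}\), which is the usual form of the \(\alpha \)-scale-space kernels in the Fourier domain. The scale parameterisation, the function name `alpha_scale_space_filter` and the example values are assumptions made for illustration and may differ from the convention used in [5].

```python
import numpy as np

def alpha_scale_space_filter(shape, s, alpha):
    """Frequency response exp(-s * |omega|^(2*alpha)) on an FFT grid."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in ]0, 1]")
    rows, cols = shape
    v = np.fft.fftfreq(rows)[:, None]
    u = np.fft.fftfreq(cols)[None, :]
    omega = 2.0 * np.pi * np.sqrt(u**2 + v**2)   # angular frequency magnitude
    return np.exp(-s * omega ** (2.0 * alpha))

gaussian = alpha_scale_space_filter((64, 64), s=2.0, alpha=1.0)   # exp(-s |omega|^2)
poisson = alpha_scale_space_filter((64, 64), s=2.0, alpha=0.5)    # exp(-s |omega|)

# Smoothing an image amounts to a pointwise product in the Fourier domain.
image = np.zeros((64, 64))
image[:, 32:] = 1.0
smoothed = np.real(np.fft.ifft2(np.fft.fft2(image) * poisson))
```

All members of this family have unit gain at zero frequency, so the mean intensity of the image is preserved whatever the value of \(\alpha \).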

The evaluation of these methods is carried out using Precision-Recall curves, which are obtained by varying the detection threshold. Three performance measures were selected: the best score over the image dataset, ODS (Optimal Dataset Scale); the best score per image, OIS (Optimal Image Scale); and the area under the Precision-Recall curve, AR (Average Precision). A point of particular interest on these curves is defined by the measure \(F = 2\frac{\text {Precision} \cdot \text {Recall}}{\text {Precision} + \text {Recall}}\): the location of the maximum of this measure along the curve defines the optimum threshold and provides a summary score. Table 1 reports a summary of the results obtained on the 500 images of the Berkeley database. It should be noted that the most informative measure is the F measure (ODS); the others are included only for additional detail.
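The selection of the optimum threshold from the F measure can be sketched as follows. The function names and the precision and recall values are purely illustrative placeholders and are not results from the paper.

```python
import numpy as np

def f_measure(precision, recall):
    """F = 2 * P * R / (P + R), set to 0 where P + R = 0."""
    p = np.asarray(precision, dtype=float)
    r = np.asarray(recall, dtype=float)
    total = p + r
    safe = np.where(total > 0, total, 1.0)       # avoid division by zero
    return np.where(total > 0, 2.0 * p * r / safe, 0.0)

def optimal_threshold(thresholds, precision, recall):
    """Return the threshold maximising F along the curve, and that F value."""
    scores = f_measure(precision, recall)
    best = int(np.argmax(scores))
    return thresholds[best], float(scores[best])

# Hypothetical points of a Precision-Recall curve (not measured data).
thresholds = [0.2, 0.4, 0.6]
precision = [0.5, 0.7, 0.9]
recall = [0.8, 0.6, 0.3]
best_t, best_f = optimal_threshold(thresholds, precision, recall)
```

Applying this selection once to precision and recall aggregated over the whole dataset corresponds to the ODS score, whereas applying it image by image and then averaging corresponds to the OIS score.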

Table 1 summarises the comparison results between the proposed methods, Max_TR and PZ_DTR, and the classical models as well as some newer and more sophisticated models [1, 8, 13]. An overview of these results is illustrated in Fig. 2. The Max_TR approach significantly outperforms the remaining methods, irrespective of the selected filter (Gaussian or Poisson). However, the PZ_DTR approach is less effective than other more recent approaches. The latter use fairly elaborate techniques in practice, which motivates us to improve this detector, for example by introducing a multiscale principle.

Fig. 2. Edge detection results on the BSDS500 benchmark. From top to bottom: original images, corresponding manual segmentations, results obtained by the methods Max_TR, PZ_DTR.

Experimental tests have shown that the Poisson filter gives better results than the Gaussian one. Moreover, the best result is obtained for \(\alpha = 0.27\). Thus, the commonly used Gaussian filter is probably not the best choice. Indeed, since the recent appearance of the \(\alpha \) scale-space theory [5], other filter kernels having the same properties as the Gaussian filter have been put forward [2, 7].

Before the emergence of the Riesz transform, a combination of several directions was necessary to describe a structure correctly; using six directions was a good compromise for edge detection applications. With the monogenic signal, the filter is composed of three components, which can be seen as three directions. These three components, one even and two odd, are enough to cover the whole image naturally and correctly. Indeed, the odd part of the filter can be treated as a natural two-dimensional extension of the anisotropic filter. After development, this part reduces to a single component capable of detecting step-like structures. Therefore, compared with steerable filters, fewer directions are needed. These reasons explain the performance of the proposed approach compared with gradient-based approaches.

It is natural to consider comparing the proposed approach with more recent and efficient ones, such as the gPb-owt-ucm model of [1]. However, such methods are highly developed and sophisticated, and take into account texture, a multiscale framework and the presence of noise in images. To reach the same level and obtain better detection, it would be worthwhile to extend our approach to a multiscale framework and to include a component for handling texture. We also note that, according to the experiments carried out on the Berkeley database, the F measure depends on the size and type of the selected image sample. We therefore plan to evaluate our approach on other publicly available datasets.

5 Conclusion

In this paper, we proposed a new edge detection technique that has led to two methods, called Max_TR and PZ_DTR. These are based on the Riesz transform and are inspired by the maximum gradient and Laplacian zero-crossing models. The recent generalisation of the analytic signal, based on the Riesz transform, allowed us to build analogues of the classical models in the monogenic domain. Using different filters, we tested and compared our approach with conventional models and some newer models. It appears that the Max_TR method, based on the maximum of the Riesz transform, is significantly more accurate and more efficient. Although these results are preliminary, they seem promising. Indeed, these methods, which are simple to implement and easily extended to higher dimensions, open up new perspectives. A multiscale representation is expected to increase the detection quality significantly.