1 Introduction

Texture images have been one of the most important elements in computer vision systems during the last decades, with numerous applications in material sciences [4], physics [12], medicine [16], geology [15], biology [18], and many other areas.

Even though texture image (or visual texture) is not a rigorously defined concept, there is consensus on some properties that such images are expected to satisfy. One of the most important is locality, i.e., the idea that most information conveyed by a texture is confined to a local neighborhood around each pixel, that is, to local pixel patterns.

The locality property is one of the main motivations for modeling texture images using complex networks, more specifically, “small-world” models like those proposed in [2]. There, the image pixels are associated with vertexes in the network and the initial graph is complete (fully connected) with weighted edges. The edge weight corresponds to a normalized distance between pairs of corresponding pixels that takes into account both the spatial separation and the gray level dissimilarity. The dynamics of the image is analyzed by applying successive threshold values to the edge weight, in such a way that more and more edges are removed, making the network increasingly sparse. Finally, the authors in [2] propose that the distribution of degrees in this family of networks can be used to provide texture descriptors. They apply such descriptors in image classification with great success.

Despite the accuracy achieved by the degree distribution, this and other histogram-based measures extracted from complex networks are better suited to describing global information and are usually not precise enough to represent the local picture. We therefore propose in this study the use of Dirichlet series [7] to control the locality of the distribution by means of a carefully chosen parameter. This is a classical series in which each term is weighted by a power of its index, controlled by an exponent parameter, and the weighted terms are accumulated into a summation.

The proposed method, dubbed Dirichlet Complex Network (DCN) descriptors, employs the values in the degree histogram of the network as terms in the Dirichlet series and takes partial sums from that series to provide image descriptors. The accuracy in texture classification is tested over two benchmark databases (UIUC [9] and USPTex [3]) and compared to other state-of-the-art descriptors, namely, Local Binary Patterns (LBP) [13], LBP+VAR [13], Bouligand-Minkowski (BM) fractal descriptors [1], Local Phase Quantization (LPQ) [14], Binarized Statistical Image Features (BSIF) [8], and the original complex network (CN) descriptors in [2]. Our proposal is competitive with all the other methods in both databases. The results confirm our expectation about the potential of Dirichlet exponentiation as a means of revealing complex statistical relations that are not explicit in the original histogram.

2 Related Works

Most methods for texture recognition in the literature can be divided into local-based (e.g. co-occurrence matrices [6], local binary patterns [13], bag-of-visual-words [19] and their respective variations) and multiscale approaches (e.g. multifractals [20] and fractal descriptors [3], spatial pyramids [10], scale-invariant feature transform [11], and others).

Complex networks represent in this context a paradigm that allows a combination of both local and multiscale viewpoints over the image. The best-known and most successful method in this category is that presented in [2]. Despite the success of basic statistical quantifiers such as those used in [2], more recently the literature has presented more advanced techniques to better express the network model. An example of such alternative analysis is the estimation of a type of fractal dimension in [17] based on the well-known Riemann zeta function.

The method proposed here is inspired by [2] and [17], even though we do not use the zeta function but rather the more general idea of Dirichlet series. Besides, we do not focus on specific measures like the fractal dimension, but on a technique for obtaining texture descriptors that are as precise and general as possible.

3 Complex Networks Model for Texture Description

The complex networks employed here are described in detail in [2]; here we only summarize the main idea. In that model, the gray-scale image I is represented by a network G(V,E), where V is a set of vertexes and E a set of edges. Each pixel in I with Cartesian coordinates (x,y) is associated with a vertex \(v_{xy} \in V\). The set of edges is given by

$$\begin{aligned} E = \{e=(v_{xy},v_{x'y'}): \sqrt{(x-x')^2 + (y-y')^2} \le r \}, \end{aligned}$$
(1)

where r is the neighborhood radius, a predefined parameter. Each edge \(e = (v_{xy},v_{x'y'})\) is associated with a weight w(e), defined by

$$\begin{aligned} w(e) = \frac{(x-x')^2 + (y-y')^2 + r^2\frac{|I(x,y)-I(x',y')|}{L}}{2r^2}, \end{aligned}$$
(2)

where L is the maximum gray level. This corresponds to a normalized Euclidean distance in a three-dimensional space where the pixels are mapped to points with coordinates (x, y, I(x,y)).

To analyze the evolution dynamics of the network model, the original model G(V,E) gives rise to a family of subgraphs \(G_t(V,E_t)\), which preserve the set of vertexes but remove a subset of edges by thresholding the corresponding weights, i.e.:

$$\begin{aligned} E_t = \{e \in E: w(e) \le t \}. \end{aligned}$$
(3)

For each vertex \(g \in V\) we can compute its degree in \(G_t\) by

$$\begin{aligned} d_t(g) = |\{e \in E_t: g \in e\}|, \quad \forall g \in V, \end{aligned}$$
(4)

where \(|\cdot |\) stands for set cardinality.
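
The construction in (1) to (4) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in [2]: the function name `degree_map` is hypothetical, the image is assumed to be a list of rows of integer gray levels, L is assumed to be 255, self-loops are assumed to be excluded, and border pixels simply have fewer neighbors.

```python
def degree_map(image, r, t, max_gray=255):
    """Return d_t for every pixel of a gray-scale image (list of rows).

    Assumptions (not stated in the text): max_gray plays the role of L,
    a pixel is not connected to itself, and borders are not padded.
    """
    rows, cols = len(image), len(image[0])
    # Offsets (dx, dy) inside the disc of radius r, as in Eq. (1).
    offsets = [(dx, dy)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               if (dx, dy) != (0, 0) and dx * dx + dy * dy <= r * r]
    degrees = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            for dx, dy in offsets:
                xn, yn = x + dx, y + dy
                if 0 <= xn < cols and 0 <= yn < rows:
                    diff = abs(image[y][x] - image[yn][xn])
                    # Edge weight of Eq. (2).
                    w = (dx * dx + dy * dy + r * r * diff / max_gray) / (2 * r * r)
                    # Thresholding of Eq. (3); each kept edge adds to the degree, Eq. (4).
                    if w <= t:
                        degrees[y][x] += 1
    return degrees


flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
print(degree_map(flat, r=1, t=0.5))
```

On a constant image every edge has weight equal to its normalized squared spatial distance, so the degree only reflects how many neighbors fall inside the image; gray-level differences raise the weights and remove edges earlier as t decreases.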

4 Proposed Method

We propose the use of Dirichlet series as a mechanism to highlight different patterns in the degree distribution. The obtained descriptors are called Dirichlet Complex Network (DCN) descriptors.

The family of Dirichlet series is characterized by the general expression

$$\begin{aligned} \sum _{n=1}^{\infty }a_n n^{-\alpha }, \end{aligned}$$
(5)

where both the coefficients \(a_n\) and the exponent \(\alpha \) are in general complex-valued (here both are real-valued). As in any conventional series, n takes integer values, although in practice the same effect of a real-valued n is achieved by setting the \(\alpha \) parameter appropriately.

Here, we use the number of vertexes with a particular degree n as the term \(a_n\) and take the partial sums of the series. Therefore, given the degree vector \(d_t\) as defined in (4), we have the degree histogram

$$\begin{aligned} h_t(k) = \sum _{g \in V}\delta (d_t(g),k), \end{aligned}$$
(6)

where \(\delta (x,y)\) is the Kronecker delta (1 if \(x=y\), 0 otherwise), and the \(k^{th}\) partial sum of the proposed Dirichlet series is obtained by

$$\begin{aligned} D_t^{\alpha }(k) = \sum _{n=1}^{k}h_t(n) n^{\alpha }, \end{aligned}$$
(7)

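where \(\alpha \) is a free parameter that can be set empirically or by any specific heuristic. Note that (7) applies the exponent \(\alpha \) directly, rather than \(-\alpha \) as in (5), so that negative values of \(\alpha \) correspond to negative powers of n. The degree Dirichlet distribution is provided by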

$$\begin{aligned} p_t^{\alpha }(k) = D_t^{\alpha }(k) \Big/ \sum _{j=1}^{d_{max}} D_t^{\alpha }(j). \end{aligned}$$
(8)
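
As an illustration of (6) to (8), the following sketch computes the degree histogram, the Dirichlet partial sums and the normalized distribution from a flat list of vertex degrees. The function names are hypothetical, and the sketch assumes that vertexes with degree 0 are ignored, since the sum in (7) starts at n = 1.

```python
def degree_histogram(degrees, d_max):
    """h_t(k) for k = 1..d_max; index 0 is kept only to align indices."""
    hist = [0] * (d_max + 1)
    for d in degrees:
        if 1 <= d <= d_max:  # degree-0 vertexes do not enter the series
            hist[d] += 1
    return hist


def dirichlet_partial_sums(hist, alpha):
    """D_t^alpha(k) = sum_{n=1}^{k} h_t(n) * n**alpha, for k = 1..d_max."""
    sums, acc = [], 0.0
    for n in range(1, len(hist)):
        acc += hist[n] * n ** alpha
        sums.append(acc)
    return sums


def dirichlet_distribution(sums):
    """p_t^alpha(k): partial sums normalized so that they add up to 1."""
    total = sum(sums)
    return [s / total for s in sums] if total > 0 else [0.0] * len(sums)


degrees = [1, 1, 2, 3]                     # toy degree vector
hist = degree_histogram(degrees, d_max=3)  # [0, 2, 1, 1]
p = dirichlet_distribution(dirichlet_partial_sums(hist, alpha=-1))
print(p)
```

With a strongly negative exponent such as the values used later in the text, the factor n**alpha decays very quickly with n, so the partial sums are dominated by the lowest degrees.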

The statistical measures employed to compose the descriptors are similar to those described in [2], i.e., the energy E, entropy K and contrast C:

$$\begin{aligned} E_t^\alpha = \sum _{k=1}^{d_{max}} (p_t^\alpha (k))^2 \qquad K_t^\alpha = -\sum _{k=1}^{d_{max}} p_t^\alpha (k) \log p_t^\alpha (k) \qquad C_t^\alpha = \sum _{k=1}^{d_{max}} k^2 p_t^\alpha (k) \end{aligned}$$
(9)
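
The three measures in (9) are straightforward to compute once the distribution is available. The sketch below is a minimal illustration with a hypothetical function name; it assumes the natural logarithm (the base is not specified in the text) and takes 0 log 0 as 0.

```python
from math import log


def dcn_measures(p):
    """Energy, entropy and contrast of a distribution p, where p[k-1] = p(k)."""
    energy = sum(v * v for v in p)
    # Zero entries are skipped, i.e. 0 * log(0) is treated as 0 (assumption).
    entropy = -sum(v * log(v) for v in p if v > 0)
    # k runs from 1 to d_max, matching the summation limits in Eq. (9).
    contrast = sum(k * k * v for k, v in enumerate(p, start=1))
    return energy, entropy, contrast


print(dcn_measures([0.5, 0.5]))
```

For the two-bin uniform distribution above, the energy is 0.5, the entropy is log 2 and the contrast is 1·0.5 + 4·0.5 = 2.5.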

Here, we obtained interesting performance by combining \(\alpha =-9\) and \(\alpha =-10\). We also employed \(r=2\) and t ranging between 0.05 and 0.53, in increments of 0.015, as recommended in [2]. The dimension of the feature vector is reduced by the Karhunen-Loève (KL) transform [5], such that the final descriptors effectively correspond to

$$\begin{aligned} D = KL\left( \bigcup _{t \in [0.05,0.53],\,\alpha \in \{-9,-10\}}\{E_t^\alpha ,K_t^\alpha ,C_t^\alpha \} \right) , \end{aligned}$$
(10)

where KL(x) is the KL transform of the vector x. Figure 1 summarizes the main steps.
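
As a rough check on the size of the vector passed to the KL transform in (10), the sketch below enumerates the thresholds and counts the raw features. It assumes that both endpoints, 0.05 and 0.53, belong to the grid; the helper name `threshold_grid` is illustrative only.

```python
def threshold_grid(t_start=0.05, t_stop=0.53, step=0.015):
    """Thresholds t_start, t_start + step, ..., t_stop (endpoints assumed included)."""
    n_steps = round((t_stop - t_start) / step)
    # Rounding to 3 decimals avoids accumulated floating-point noise.
    return [round(t_start + i * step, 3) for i in range(n_steps + 1)]


thresholds = threshold_grid()
alphas = (-9, -10)
n_measures = 3  # energy, entropy and contrast
raw_dimension = len(thresholds) * len(alphas) * n_measures
print(len(thresholds), raw_dimension)  # 33 198
```

Under these assumptions, each image yields 33 thresholds and 198 raw features before the dimensionality reduction.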

Fig. 1. Proposed method. From left to right, the original texture, network representation, degree distribution, Dirichlet series (\(\alpha =-9\)) and the respective descriptors (energy, entropy and contrast).

5 Experiments

Two benchmark data sets are employed for validation and comparisons in this work, namely, UIUC as used in [9] and USPTex [3].

UIUC is a database composed of 40 large gray-scale images representing landscapes, animals, materials, etc. Each image is split into 25 non-overlapping windows, each one with dimensions \(256\times 256\). This results in a database of 1000 texture images categorized into 25 groups.

The process to generate USPTex [3] is similar to that employed in UIUC. Originally, 191 large photographs (\(512\times 384\)) are captured under non-controlled conditions, and from each of these images we extract 12 smaller non-overlapping windows (\(128 \times 128\)). This yields a collection of 2292 samples divided into 191 classes. Finally, since only gray-scale methods are considered in the comparison carried out here, these images are converted to gray levels.

The proposal described here is compared with other classical and state-of-the-art texture descriptors in the literature, namely, Local Binary Patterns (LBP) [13], LBP+VAR [13], Bouligand-Minkowski (BM) fractal descriptors [1], Local Phase Quantization (LPQ) [14], Binarized Statistical Image Features (BSIF) [8], and the original complex network (CN) descriptors in [2].

The classifier employed for the proposed descriptors is linear discriminant analysis [5]. Testing and training sets are determined by following a randomized 5-fold scheme, which is repeated 100 times to provide the average accuracy as well as the corresponding error (standard deviation). As for the other compared methods from the literature, we adopted the parameters suggested in the respective references.
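
The validation protocol can be outlined as follows. This is a sketch under stated assumptions rather than the exact experimental code: `evaluate` is a hypothetical callback that trains a classifier (linear discriminant analysis in this work) on the training indices and returns the number of correctly classified test samples, the folds are not stratified by class, and the standard deviation is taken over the repetitions.

```python
import random
from statistics import mean, stdev


def kfold_indices(n_samples, k, rng):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_kfold_accuracy(n_samples, evaluate, k=5, repeats=100, seed=0):
    """Mean and standard deviation of the accuracy over repeated k-fold runs."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        folds = kfold_indices(n_samples, k, rng)
        correct = 0
        for i, test_idx in enumerate(folds):
            # Every fold except the i-th one is used for training.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            correct += evaluate(train_idx, test_idx)
        scores.append(correct / n_samples)
    return mean(scores), stdev(scores)


# Dummy callback that counts every test sample as correct, for illustration only.
print(repeated_kfold_accuracy(20, lambda train, test: len(test), repeats=3))
```

In an actual experiment the callback would fit the classifier on the descriptors indexed by `train_idx` and count its correct predictions on `test_idx`.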

6 Results

Table 1 lists the accuracy of the proposed method (DCN) in texture classification compared with other state-of-the-art approaches. The proposed descriptors achieved the highest accuracy in both databases. Descriptors based on complex networks, i.e., those presented in [2] and the method proposed here, show a relevant advantage over the other approaches, especially in USPTex. This is precisely the more challenging data set, with a significantly larger number of images and categories.

Table 1. Percentage of images correctly classified in UIUC and USPTex databases and respective errors.

Fig. 2. Confusion matrices for the highest success rates on UIUC and USPTex databases.

Figure 2 exhibits the confusion matrices for the two methods providing the highest accuracies in both data sets. We restricted the USPTex matrices to the first 50 classes to facilitate visualization. Even though both approaches have difficulty distinguishing complex groups such as classes 18/19 in UIUC, the proposed method yields fewer gray points outside the diagonal, especially in the USPTex database.

Here the role of the exponentiated Dirichlet summation is to equip the histogram with a nonlinear viewpoint, which allows a richer analysis of the represented texture. In particular, the negative exponents employed in our application are able to give larger significance to the smaller values of the histogram. The final descriptors are in this way more balanced than the original ones, and information that was originally disregarded is now taken into account in the classification process. The effectiveness of this strategy is confirmed by the results obtained in the challenging task of classifying large databases of texture images.

7 Conclusions

This work proposed and investigated the use of Dirichlet series to improve the performance of histogram-based descriptors of texture images, in particular, descriptors obtained from complex network modeling.

The method was tested on the classification of benchmark databases, where it outperformed other state-of-the-art descriptors in accuracy. This performance is explained by the nonlinearity introduced by the Dirichlet series. The partial sum used here employs negative exponents, which, by giving higher weight to the smaller histogram values, make the descriptors more balanced and preserve information that is usually discarded by the classical complex network descriptors.

The high accuracy confirmed by the tests also suggests the potential of the proposed descriptors for practical applications in a number of real-world problems where the classification of texture images plays a fundamental role.