
1 Introduction

Image processing is complex, and in some cases the human eye cannot identify image details. Extracting attributes in order to compare two images is inherent to Near Sets theory [1]. In general, each image has attributes that can be used for classification, and computational algorithms are indispensable for extracting them. Near Sets (NS) and tolerance Near Sets (TNS) are theories that provide a formal basis for the observation, comparison, and classification of objects using n-dimensional attribute vectors [2]. Using these theories, the tolerance Nearness Measure (tNM) between two images can be obtained.

TNS have been applied in many areas and have proved very promising in image analysis, comparing either the gray-level values of pixels or texture attributes. This paper describes an implementation of tNM that obtains a similarity index between two images. One advantage of the tNM implementation is the possibility of using parallel processing during the tolerance classification of image objects or subimages [3]. Texture attributes are obtained from the images using the Gray-Level Co-occurrence Matrix (GLCM) [4].

The rest of this paper is organized as follows. Section 2 presents the mathematical description of NS, TNS, and tNM, and Sect. 3 describes the tNM implementation. Section 4 presents the applications and results, and Sect. 5 the conclusions and future work.

2 NS, TNS, tNM

Perceptual objects are objects that can be detected by humans. A perceptual system associates perceptual objects with a set of probe functions that describe them, and is formally defined as follows.

Definition 1:

A perceptual system \( \left( {O,F} \right) \) consists of a nonempty set \( O \) of perceptual objects and a nonempty set \( F \) of real-valued functions \( \varPhi \), such that \( \varPhi \, \in \,F \,|\,\varPhi : O\, \to \,\mathbb{R} \).

2.1 Object Description

If \( \left( {O, F} \right) \) is a perceptual system and \( B\, \subseteq \,F \) is a set of probe functions, the description of a perceptual object \( x \in O \) is the vector given by Eq. (1):

$$ \varPhi_{B} \left( x \right) = \left( {\varPhi_{1} \left( x \right),\varPhi_{2} \left( x \right), \ldots ,\varPhi_{i} \left( x \right), \ldots ,\varPhi_{l} \left( x \right)} \right), $$
(1)

where \( l \) is the dimension of the vector \( \varPhi_{B} \left( x \right) \) and each \( \varPhi_{i} \) is a probe function.

Thus, given a perceptual system \( \left( {O, F} \right) \) and \( B \, \subseteq \,F \), \( O \) is a set of objects whose characteristics are described by the vector \( \varPhi_{B} \).
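As a minimal sketch of Eq. (1), assume that a perceptual object is a small gray-level subimage stored as a NumPy array. The probe functions `mean_gray` and `std_gray` are hypothetical choices made only for illustration; they are not the probe functions used in this paper.

```python
import numpy as np

# Hypothetical probe functions: each maps an object (a subimage) to a real number.
def mean_gray(x):
    return float(np.mean(x))

def std_gray(x):
    return float(np.std(x))

def describe(x, probes):
    """Return the description vector Phi_B(x) = (phi_1(x), ..., phi_l(x))."""
    return np.array([phi(x) for phi in probes], dtype=float)

B = [mean_gray, std_gray]            # B is a subset of F, here with l = 2
x = np.array([[10, 20], [30, 40]])   # a toy 2 x 2 subimage
phi_B_x = describe(x, B)
print(phi_B_x)
```

The length of the resulting vector equals the number of probe functions in B, which is the dimension \( l \) of Eq. (1).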

An important definition related to NS is the indiscernibility relation, which classifies objects into equivalence classes [2], so that the properties of reflexivity, symmetry, and transitivity are satisfied by all the objects in a class. A relation \( R \) on a set \( A \) is an equivalence relation if \( \forall \,a\, \in \,A,\; aRa \) (reflexivity); \( \forall \,a,\,b\, \in \,A \), if \( aRb \) then \( bRa \) (symmetry); and \( \forall \,a,b,c\, \in \,A \), if \( aRb \) and \( bRc \) then \( aRc \) (transitivity). The definition of the indiscernibility relation follows.

2.2 Perceptual Indiscernibility Relation

Definition 2:

Let \( \left( {O, F} \right) \) be a perceptual system. For each \( B \, \subseteq \, F \), the perceptual indiscernibility relation \( \sim_{B} \) is defined as Eq. (2):

$$ \sim_{B} \, = \left\{ {\left( {x,y} \right) \in O \times O : \forall \varPhi_{i} \, \in \,B,\;\varPhi_{i} \left( x \right) = \varPhi_{i} \left( y \right)} \right\}, $$
(2)

meaning that two perceptual objects x and y are indiscernible if they have the same value for every probe function in B.
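The relation of Eq. (2) partitions the objects into equivalence classes. The sketch below illustrates this on toy data, assuming the objects are integers and the probe functions are two hypothetical remainders; both are illustrative assumptions rather than anything taken from the experiments.

```python
from collections import defaultdict

def indiscernibility_classes(objects, probes):
    """Group objects whose descriptions agree on every probe function in B."""
    classes = defaultdict(list)
    for x in objects:
        key = tuple(phi(x) for phi in probes)  # description vector as a hashable key
        classes[key].append(x)
    return list(classes.values())

# Toy objects (integers) and hypothetical probe functions on them.
probes = [lambda x: x % 2, lambda x: x % 3]
classes = indiscernibility_classes(range(12), probes)
print(classes)
```

Because every object receives exactly one key, each object belongs to exactly one class, which is what distinguishes these equivalence classes from the tolerance classes introduced later.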

This perceptual indiscernibility relation is a modification of the relation described by Pawlak [5] in his rough set theory; the difference is that NS always considers a pair of sets that are close to each other.

2.3 Weak Perceptual Indiscernibility Relation

Definition 3:

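Let \( \left( {O,F} \right) \) be a perceptual system, \( B \, \subseteq \,F \), and \( X, Y \, \subseteq \,O \). The set X is weakly near to the set Y if there are \( x \, \in \,X \), \( y \, \in \, Y \), and \( \varPhi_{i} \, \in \, B \) such that \( x \simeq_{B} y \), where the weak perceptual indiscernibility relation \( \simeq_{B} \) is defined by Eq. (3):

$$ \simeq_{B} \, = \left\{ {\left( {x,y} \right) \in O \times O : \exists \varPhi_{i} \, \in \,B,\;\varPhi_{i} \left( x \right) = \varPhi_{i} \left( y \right)} \right\}, $$
(3)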
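The weak relation asks only that at least one probe function agrees on the two objects. A minimal sketch, using the same kind of hypothetical integer objects and remainder probe functions as an illustration:

```python
def weakly_indiscernible(x, y, probes):
    """True if at least one probe function gives the same value for x and y."""
    return any(phi(x) == phi(y) for phi in probes)

def weakly_near(X, Y, probes):
    """True if some x in X and some y in Y are weakly indiscernible."""
    return any(weakly_indiscernible(x, y, probes) for x in X for y in Y)

# Hypothetical probe functions on integer objects.
probes = [lambda v: v % 2, lambda v: v % 3]
print(weakly_indiscernible(2, 5, probes))  # the remainders modulo 3 agree
print(weakly_near([1], [6], probes))       # no probe function agrees
```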

The previous definitions belong to NS theory. Tolerance NS was proposed as an improvement that applies a tolerance when measuring the relation between perceptual objects, as follows.

2.4 Tolerance Near Sets

TNS is characterized by a tolerance relation between perceptual objects, defined as follows.

Definition 4:

Let \( \left( {O, F} \right) \) be a perceptual system. For \( B \, \subseteq \,F \), the perceptual tolerance relation \( \cong_{B, \varepsilon } \) is defined by Eq. (4), where \( \parallel \cdot \parallel_{2} \) denotes the \( L^{2} \) norm.

$$ \cong_{B, \varepsilon }\,= \left\{ {\left( {x,y} \right) \in O \,\times\,O:\,\parallel \varPhi_{B} \left( x \right) - \varPhi_{B} \left( y \right)\parallel_{2} \le \varepsilon } \right\}, $$
(4)

The main difference between NS and TNS is that the TNS relation is reflexive and symmetric, but not necessarily transitive.

The tolerance concept is inherent to the idea of proximity between objects [6]: it is possible to identify image segments that are similar to each other within a tolerable difference, and in TNS such segments are placed in the same class. However, two image segments may each be similar to a third segment without being similar to each other. In that case the TNS classification yields two different classes that share the third segment, and the transitivity property is not satisfied for all objects.

From the definition of the perceptual tolerance relation, it follows that transitivity does not hold for all perceptual objects in TNS. Another characteristic of TNS is the tolerance value \( \varepsilon \), the threshold on the distance between perceptual objects: if the distance is less than or equal to this value, the objects are placed in the same class.
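The loss of transitivity can be seen in a small numeric sketch of the tolerance relation of Eq. (4). The one-dimensional descriptions `a`, `b`, and `c` and the value of `eps` are hypothetical values chosen for illustration.

```python
import numpy as np

def tolerant(phi_x, phi_y, eps):
    """True if the L2 distance between two description vectors is at most eps."""
    diff = np.asarray(phi_x, dtype=float) - np.asarray(phi_y, dtype=float)
    return bool(np.linalg.norm(diff) <= eps)

eps = 0.1
a, b, c = [0.00], [0.08], [0.16]   # hypothetical one-dimensional descriptions
print(tolerant(a, b, eps), tolerant(b, c, eps), tolerant(a, c, eps))
```

Here `a` is within tolerance of `b`, and `b` is within tolerance of `c`, but `a` and `c` are not within tolerance of each other, so `b` would belong to two different classes.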

According to Poli et al. [6], when images are used as perceptual objects, the basic structure of TNS consists of a nonempty set of images and a finite set of probe functions. Each object description consists of several measurements obtained by image processing techniques. TNS provides a quantitative approach that uses these measurements as probe functions, together with the threshold \( \varepsilon \), to determine the similarity of objects without requiring them to be exactly the same [3].

2.5 Tolerance Nearness Measure

The tNM was introduced by Henry and Peters [3] out of the need to determine the degree of similarity between objects when applying NS to content-based image retrieval.

Definition 5:

Let \( \left( {O, F} \right) \) be a perceptual system with two disjoint sets X and Y, such that \( Z = X \cup Y \). The similarity measure \( tNM_{ \cong_{B,\varepsilon} } \left( {X, Y} \right) \) between X and Y is given by Eq. (5):

$$ tNM_{ \cong_{B,\varepsilon} } \left( {X, Y} \right) = \left( {\sum\limits_{{C \in H_{ \cong_{B,\varepsilon} } \left( Z \right)}} {\left| C \right|} } \right)^{ - 1} \times \sum\limits_{{C \in H_{ \cong_{B,\varepsilon} } \left( Z \right)}} {\left| C \right|} \frac{{\min \left( {\left| {C \cap X} \right|,\, \left| {C \cap Y} \right|} \right)}}{{\max \left( {\left| {C \cap X} \right|,\, \left| {C \cap Y} \right|} \right)}} $$
(5)

with \( C \) denoting a TNS class and \( H_{ \cong_{B,\varepsilon} } \left( Z \right) \) the set of all classes in Z.

The first factor of Eq. (5) is the inverse of the sum of the cardinalities of all classes in Z. The second factor is the sum, over all classes in H, of the ratio between the smaller and the larger of the intersections of class C with X and with Y, weighted by the cardinality of C.
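Equation (5) can be evaluated directly once the classes are known. The sketch below assumes the classes are supplied as Python sets of object identifiers; the function name `tnm` and the toy sets are illustrative assumptions.

```python
def tnm(classes, X, Y):
    """Evaluate Eq. (5) for tolerance classes over Z, the union of X and Y."""
    X, Y = set(X), set(Y)
    classes = [set(C) for C in classes]
    total = sum(len(C) for C in classes)      # normalizing factor of Eq. (5)
    if total == 0:
        return 0.0
    acc = 0.0
    for C in classes:
        nx, ny = len(C & X), len(C & Y)       # |C ∩ X| and |C ∩ Y|
        if max(nx, ny) > 0:
            acc += len(C) * min(nx, ny) / max(nx, ny)
    return acc / total

X = {"x1", "x2", "x3"}
Y = {"y1", "y2", "y3"}
classes = [{"x1", "x2", "y1"}, {"x3", "y2", "y3"}, {"x2", "y1"}]
print(tnm(classes, X, Y))
```

For these toy classes the weighted ratios are 1.5, 1.5, and 2, and the class sizes add up to 8, so the result is 5/8 = 0.625. A class that contains objects from only one of the two sets contributes zero, which is what pulls tNM down for dissimilar images.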

3 Methodology of tNM Implementation

In this section, the TNS classification algorithm and tNM implementation are described.

3.1 TNS Classification

The TNS classification algorithm for a perceptual system (O, F), with \( B\, \subseteq \, F \) a set of probe functions and a set of n objects, is described in Algorithm 1.

Algorithm 1. TNS classification.
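One simple way to obtain tolerance classes is sketched below. It is a greedy procedure written for illustration under stated assumptions and may differ from the steps of Algorithm 1: starting from each object in turn, it collects the objects that are within \( \varepsilon \) of every member already collected, so that all members of a class are pairwise within tolerance. The function name and the toy descriptions are hypothetical.

```python
import numpy as np

def tolerance_classes(descriptions, eps):
    """Greedy sketch: one pairwise-tolerant class seeded at each object."""
    D = np.asarray(descriptions, dtype=float)
    n = len(D)
    # Pairwise L2 distances between all description vectors.
    dist = np.linalg.norm(D[:, None, :] - D[None, :, :], axis=2)
    near = dist <= eps
    classes = set()
    for i in range(n):
        members = [i]
        for j in range(n):
            # Add j only if it is within eps of every current member.
            if j != i and all(near[j, m] for m in members):
                members.append(j)
        classes.add(frozenset(members))       # identical classes are merged
    return [sorted(c) for c in sorted(classes, key=sorted)]

descriptions = [[0.00], [0.08], [0.16], [0.50]]
print(tolerance_classes(descriptions, eps=0.1))
```

With these values, object 1 appears in two classes, `[0, 1]` and `[1, 2]`, while object 3 forms a class on its own. The result of a greedy pass can depend on the order of the objects, so this is a sketch rather than an exhaustive enumeration of tolerance classes.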

3.2 tNM Algorithm

The algorithm to compute tNM takes two images, X and Y, as input, and one of two approaches can be selected. The first, denoted GL, uses the gray level of the pixels as the probe function, so the objects are the pixels of the images. In this case, the tolerance represents the number of distinct gray levels placed in the same class; if each class holds only one gray level, the tolerance is zero. The second approach, denoted SA, divides each image into subimages, which become the objects of the perceptual system. Statistical attributes of each subimage are then obtained from the Gray-Level Co-occurrence Matrix (GLCM), which describes the occurrence of patterns of pixel pairs in the image [4].

After the GLCM is computed, statistical attributes such as correlation, energy, contrast, and homogeneity can be obtained. These attributes are used as probe functions to compute the Euclidean distance between each pair of subimages. Two subimages whose distance is less than or equal to the tolerance \( \varepsilon \) are placed in the same class by Algorithm 1.
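The four attributes can be computed from a normalized GLCM as sketched below in plain NumPy. Several details are assumptions made for illustration, since the paper does not state them: pixel pairs are taken between horizontal neighbors only, energy is the sum of squared entries, and homogeneity uses the weight \( 1/(1 + |i - j|) \). Image libraries differ on these conventions.

```python
import numpy as np

def glcm(img, levels):
    """Normalized co-occurrence matrix for horizontally adjacent pixel pairs."""
    img = np.asarray(img)
    P = np.zeros((levels, levels), dtype=float)
    left, right = img[:, :-1].ravel(), img[:, 1:].ravel()
    np.add.at(P, (left, right), 1.0)   # count each (left, right) gray-level pair
    return P / P.sum()

def glcm_attributes(P):
    """Contrast, energy, homogeneity and correlation of a normalized GLCM."""
    i, j = np.indices(P.shape)
    contrast = np.sum(P * (i - j) ** 2)
    energy = np.sum(P ** 2)
    homogeneity = np.sum(P / (1.0 + np.abs(i - j)))
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum(P * (i - mu_i) ** 2))
    sd_j = np.sqrt(np.sum(P * (j - mu_j) ** 2))
    denom = sd_i * sd_j
    # Correlation is undefined for a constant subimage; 0.0 is an arbitrary choice.
    correlation = np.sum(P * (i - mu_i) * (j - mu_j)) / denom if denom > 0 else 0.0
    return np.array([contrast, energy, homogeneity, correlation])

sub = np.array([[0, 0, 1],
                [0, 1, 1],
                [1, 1, 1]])            # toy subimage with two gray levels
attrs = glcm_attributes(glcm(sub, levels=2))
print(attrs)
```

The returned array plays the role of the description vector of a subimage, so the Euclidean distance between two such arrays is the quantity compared with the tolerance.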

Algorithm 2. tNM computation.

Algorithm 2 computes tNM after dividing each input image into n subimages. In step (1), the variable K is initialized to zero. In step (2), both input images are divided into n subimages each; in the GL approach, a subimage is a single pixel. In step (3), the attributes of each subimage are computed to form its probe function vector. In step (4), the subimages are classified using the TNS classification of Algorithm 1. After these four steps, the computation of tNM proper begins. For each class \( C_{i} \) obtained, step (5) computes the ratio between the smaller intersection of the class with the input images, \( \min \left( {\left| {X\, \cap \,C_{i} } \right|,\left| {Y\, \cap \,C_{i} } \right|} \right) \), and the larger one, \( \max \left( {\left| {X\, \cap \,C_{i} } \right|,\left| {Y\, \cap \,C_{i} } \right|} \right) \). In step (6), this ratio is multiplied by the cardinality of the class, \( \left| {C_{i} } \right| \), and in step (7) the product is accumulated in K. These three steps are repeated for all classes. Finally, in step (8), K is divided by the sum of the class cardinalities, \( \sum\nolimits_{i} {\left| {C_{i} } \right|} \), the normalizing factor of Eq. (5), which yields the tNM value.
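For the GL approach, the steps above can be sketched as follows. The sketch assumes that the classes are disjoint bins of `levels_per_class` consecutive gray levels, which is a simplification adopted here for illustration and is not confirmed as the procedure of the tNM system; the function name and parameters are hypothetical.

```python
import numpy as np

def tnm_gray_level(img_x, img_y, levels_per_class, n_levels=256):
    """tNM of two gray-level images, with classes taken as gray-level bins."""
    x = np.asarray(img_x).ravel() // levels_per_class
    y = np.asarray(img_y).ravel() // levels_per_class
    n_classes = -(-n_levels // levels_per_class)      # ceiling division
    in_x = np.bincount(x, minlength=n_classes)        # |C_i ∩ X| for every class
    in_y = np.bincount(y, minlength=n_classes)        # |C_i ∩ Y| for every class
    K = 0.0                                           # step (1)
    for nx, ny in zip(in_x, in_y):                    # steps (5) to (7)
        if max(nx, ny) > 0:
            K += (nx + ny) * min(nx, ny) / max(nx, ny)
    # Step (8): with disjoint bins the class sizes add up to the pixel count.
    return K / (x.size + y.size)

img_x = np.array([[0, 10], [200, 210]])
img_y = np.array([[5, 100], [205, 215]])
print(tnm_gray_level(img_x, img_y, levels_per_class=25))
```

In this toy case the bins contribute 1.5, 0, and 4 to K, and there are 8 pixels in total, giving 0.6875. Two identical images give 1.0.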

4 Application and Results

The relevance of comparing cities is confirmed by recent articles such as Domingues et al. [7], who described previous studies of city structures using complex networks, which contribute to the understanding and improvement of transit systems, growth, and city planning. This paper describes the application of tNM to downtown images, considering the textures and gray levels of satellite images, to indicate how similar each city is to the others in aspects such as structures, paving, and vegetation.

The tNM system was developed in Python. This section first describes the experiments used to determine the parameters for computing tNM on a pair of city images, and then an experiment classifying 26 cities around the world.

4.1 Determination of Parameters

To determine parameter values such as the tolerance and the subimage size for both approaches, tNM was applied to two city images of 600 × 600 pixels, Mexico City and Frankfurt, obtained from Google Maps (Fig. 1).

Fig. 1. Pair of city images. (a) Mexico City. (b) Frankfurt.

The first tNM measurement with the GL approach used a tolerance of 10%, or 25 gray levels per class; 10 classes were generated, the execution time was 10 s, and tNM was 0.747. The second used a tolerance of 50%, or 124 gray levels per class; 2 classes were generated, the execution time was 5 s, and tNM was 0.816.

The first tNM measurement with the SA approach used a tolerance of 0.9. With subimages of 10 × 10 pixels, 7,200 subimages were generated and classified into two classes; the execution time was 10 min and tNM was 0.999. Here the tNM value does not indicate high similarity between the two images, but rather the low resolution of tNM at such a high tolerance.
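The count of 7,200 subimages follows from cutting each 600 × 600 image into non-overlapping 10 × 10 blocks, which gives 3,600 blocks per image. A minimal sketch of such a split, assuming non-overlapping blocks and a placeholder array in place of a real city image:

```python
import numpy as np

def split_into_subimages(img, size):
    """Cut a 2-D image into non-overlapping size x size blocks (edges are cropped)."""
    img = np.asarray(img)
    rows, cols = img.shape[0] // size, img.shape[1] // size
    blocks = img[:rows * size, :cols * size].reshape(rows, size, cols, size)
    return blocks.swapaxes(1, 2).reshape(rows * cols, size, size)

image = np.zeros((600, 600), dtype=np.uint8)   # placeholder for one city image
subimages = split_into_subimages(image, 10)
print(len(subimages), 2 * len(subimages))      # 3600 per image, 7200 for the pair
```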

The second measurement with the SA approach used a tolerance of 0.5. With the same 10 × 10 subimages, the 7,200 subimages were classified into four classes; the execution time was 10 min and tNM was 0.991. This value is close to that of the previous experiment, indicating the same high degree of generalization.

The next experiment with the SA approach used a tolerance of 0.1. With the same 10 × 10 subimages, the 7,200 subimages were classified into 28 classes. The execution time remained the same, and tNM was 0.661, a value that seems realistic for the two images.

Figure 2 shows how tNM varies with the tolerance in the SA approach for the image pair of Fig. 1. With a tolerance of 0.01, tNM is near zero; with 0.10, tNM is near 0.5; and above 0.65, tNM is near 1, showing generalization. The figure indicates that 0.1 is a suitable tolerance for the experiments with the SA approach.

Fig. 2. tNM as a function of the tolerance in the SA approach.

4.2 Comparing City Images

In the following experiment, several images of cities around the world were compared (Table 1).

Table 1. Cities around the continents

The images were obtained from Google Maps, and the cities were chosen according to their population density and location across the continents. In total, 26 images were used: 6 from America and 5 from each of the other continents (Europe, Asia, Africa, and Oceania). In this experiment, the images were fixed at 256 × 256 pixels, with a tolerance of 10% for the GL approach and 0.1 for SA.

4.3 Highest and Lowest Values of tNM Obtained Comparing City Images

Table 2 shows the thirty highest tNM values obtained when comparing the cities with the GL approach. The highest value, 0.950, was obtained between Regina and Edmonton, both in Canada, in America. The images of these two cities are shown in Fig. 3(a) and (b), respectively. Table 3 shows the thirty highest tNM values obtained with the SA approach; in this case the highest value, 0.936, was obtained between Regina and Pointe Noire, in America and Africa, respectively. The image of Pointe Noire is shown in Fig. 3(c). The tNM between Regina and Edmonton with the SA approach was 0.647, not as high, which shows the difference between the GL and SA approaches; the tNM between Regina and Pointe Noire with the GL approach was 0.831.

Table 2. Highest tNM values obtained in GL approach.
Fig. 3. City images: (a) Regina, (b) Edmonton, (c) Pointe Noire, with the highest tNM values for the GL approach (Regina × Edmonton) and for the SA approach (Regina × Pointe Noire).

Table 3. Highest tNM values obtained in SA approach.

Figure 4 shows the tNM obtained when Regina is compared with all the other cities in this experiment, using both approaches, with the highest tNM for GL and for SA highlighted. The tNM values of the two approaches are not close for most cities, although they follow a broadly similar trend, which shows the difference between the GL and statistical approaches.

Fig. 4. tNM values for GL and SA obtained when Regina is compared with all the other cities, showing the highest value in both approaches.

Table 4 shows the five lowest tNM values obtained with the GL approach. The lowest value, 0.349, was obtained between Monrovia and Newcastle. With the SA approach, the tNM between these two cities was 0.911, showing that they are very similar under SA, because the gray level is not considered, as can be seen in the images in Fig. 5(a) and (b), respectively.

Table 4. Five lowest tNM values in GL approach.
Fig. 5. City images: (a) Monrovia, (b) Newcastle, with the lowest tNM value for the GL approach.

Figure 6 shows the tNM between Monrovia and all the other cities in the experiment, highlighting the lowest GL value.

Fig. 6. tNM values for GL and SA obtained when Monrovia is compared with all the other cities, showing the lowest GL value.

Table 5 shows the five lowest tNM values obtained with the SA approach. The lowest value, 0.062, was obtained between Matola and Canberra, in Africa and Oceania, respectively. The tNM between these two cities with the GL approach was 0.802, not nearly as low as with the SA approach. The images of these two cities are shown in Fig. 7(a) and (b), respectively.

Table 5. Five lowest tNM values obtained in SA approach.
Fig. 7. City images: (a) Matola, (b) Canberra, with the lowest tNM value for the SA approach.

Figure 8 shows the tNM between Matola and all the other cities in the experiment, highlighting the lowest SA value. In this case almost all GL values were above the SA values, showing that the gray levels of the other images were similar to those of the Matola image, although the statistical attributes were different.

Fig. 8. tNM values for GL and SA obtained when Matola is compared with all the other cities, showing the lowest SA value.

4.4 City Image Classification

Tables 6 and 7 show the TNS classification of the city images using the tNM results. Since tNM is a measure of similarity, whereas Algorithm 1 compares the distance between objects with a tolerance \( \varepsilon \), a tNM distance, denoted \( d_{tNM} \), was defined as in Eq. (6):

$$ d_{tNM} = \left( {1 - tNM} \right) $$
(6)

Table 6. Classes in GL approach.
Table 7. Classes in SA approach.

Using \( d_{tNM} \) with the tNM values obtained for the GL approach, the classes shown in Table 6 were generated; for the SA approach, those in Table 7.
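As a small illustration of Eq. (6), the sketch below converts a matrix of pairwise tNM values into distances and checks which pairs fall within a tolerance. The three-city matrix and the threshold `eps` are hypothetical values, not results from this paper.

```python
import numpy as np

def tnm_distance(tnm_matrix):
    """Eq. (6): convert pairwise tNM similarities into distances."""
    return 1.0 - np.asarray(tnm_matrix, dtype=float)

# Hypothetical tNM values for three cities (not results from this paper).
tnm_matrix = [[1.00, 0.95, 0.40],
              [0.95, 1.00, 0.93],
              [0.40, 0.93, 1.00]]
d = tnm_distance(tnm_matrix)
eps = 0.1                       # hypothetical tolerance on the distance
within = d <= eps
print(within)
```

In this example the first and third cities are each within tolerance of the second but not of each other, so the second city would appear in two classes.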

Note that in these classifications a city can appear in more than one class, because TNS classes are not equivalence classes. Regina, Edmonton, and Pointe Noire appear together in Class 14 of Table 6. These cities had the highest tNM values in the GL and SA approaches, as shown in Fig. 4. Monrovia is alone in Class 19, since it has the lowest GL tNM, as shown in Fig. 6. In SA, Regina appears together with Pointe Noire, the pair with the highest tNM, in several classes, such as Class 6, Class 10, and Class 12; Edmonton is not present in these classes, although Regina and Edmonton had the highest GL tNM. Canberra, which had the lowest SA tNM, is alone in SA Class 8.

4.5 Average tNM Between Continents

Table 8 gives the average and standard deviation of tNM between cities, grouped by continent, in the GL approach. The average tNM between different continents was close to 0.700, as shown in the last row, average2, where each value is the average of the column excluding the within-continent average on the diagonal. The within-continent average exceeded the average between different continents only for America, showing that in the other continents the kinds of cities were more diverse.
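The averages described above can be sketched as follows. The exact procedure behind Tables 8 and 9 is not given, so this is an assumption-laden illustration: within a continent, comparisons of a city with itself are skipped, and `average2` is the mean of each column without its diagonal entry. The function name, the labels, and the four-city matrix are hypothetical, and each continent is assumed to contain at least two cities.

```python
import numpy as np

def continent_averages(tnm, labels):
    """Mean tNM for every pair of continents, plus the off-diagonal column means."""
    tnm = np.asarray(tnm, dtype=float)
    labels = np.asarray(labels)
    names = sorted(set(labels.tolist()))
    avg = np.zeros((len(names), len(names)))
    for a, ca in enumerate(names):
        for b, cb in enumerate(names):
            block = tnm[np.ix_(labels == ca, labels == cb)]
            if ca == cb:
                # Within a continent, skip each city's comparison with itself.
                mask = ~np.eye(block.shape[0], dtype=bool)
                avg[a, b] = block[mask].mean()
            else:
                avg[a, b] = block.mean()
    off_diag = ~np.eye(len(names), dtype=bool)
    average2 = np.array([avg[off_diag[:, b], b].mean() for b in range(len(names))])
    return names, avg, average2

# Hypothetical tNM values for four cities on two continents, "A" and "B".
tnm_values = [[1.0, 0.9, 0.5, 0.6],
              [0.9, 1.0, 0.7, 0.8],
              [0.5, 0.7, 1.0, 0.4],
              [0.6, 0.8, 0.4, 1.0]]
names, avg, average2 = continent_averages(tnm_values, ["A", "A", "B", "B"])
print(names, avg, average2)
```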

Table 8. Average tNM between Continents in GL approach.

Table 9 gives the corresponding average and standard deviation of tNM in the SA approach. The average tNM between different continents was close to 0.600, as shown in the last row, average2. In this approach, the within-continent average was above the average between different continents for most continents, the exception being Africa, where the average tNM was 0.630.

Table 9. Average tNM between Continents in SA approach.

5 Conclusions

In this work, two approaches to computing tNM on images were developed. In the GL approach, an object is a pixel with its gray level; in the SA approach, an object is a subimage with its statistical attributes. The experiments showed that reasonable tolerance values are 10% for the GL approach and 0.1 for SA. With these tolerances, 26 downtown images of cities around the world, distributed over five continents, were compared and classified using the tNM-based distance \( d_{tNM} \). The results showed that the two approaches can yield very different tNM values, which depend on the gray levels of the image in the GL approach and on the statistical attributes in the SA approach. The differences can also be explained by the tolerance value used in the GL approach and the subimage size used in the SA approach. As future work, experiments are suggested using more than one image of the same city, to verify how tNM varies across images of the same city as the tolerance and subimage size vary.