1 Introduction

One of the most important challenges of modern cartography is the automation of geographical information generalisation [1, 4, 9, 10]. It requires capturing the data's crucial information, general patterns, and tendencies, and then retaining them at lower levels of detail (LoD), which correspond to smaller map scales. Up to now this task has been performed manually by skilled cartographers, and it is difficult to algorithmise. Even though there is an array of generalisation algorithms addressing particular generalisation operators (such as object selection, simplification, smoothing, aggregation, amplification, etc.), what still poses a problem is the control of the entire generalisation process – starting from the decision "whether to generalise at all", through the choice of the appropriate operator and algorithm, up to the final selection of the latter's parameters [1, 4, 9, 10].

The facts described above lead to the conclusion that, apart from the particular tools employed in the generalisation process, a decision-making system is required to manage operations on many different levels. Such a system should possess and utilise the skilled cartographer's knowledge. However, such skills usually result from years of practice and experience, along with refined aesthetic taste, and are therefore not available explicitly. Taking the above into consideration, according to the author, the methods of knowledge discovery in databases (KDD) might be used to convert this hidden knowledge into an explicit form [5, 10].

Therefore the author's ultimate aim is to create a base of fuzzy rules governing the generalisation process. The step prior to that is to choose the features crucial for the generalisation process itself, which will simplify the decision system. The following paper addresses the methodology of attribute selection (using reducts) that takes the specific data features into consideration. While in the previous work [10] the classical rough set approach for categorical data was used, this paper focuses on the problem of numeric feature selection based on numeric decisions – it discusses and provides some new extensions of the fuzzy rough set (FRST) approach. Rough set approaches are chosen for feature selection as they are easily understandable and give good intuition about why a certain attribute is selected.

2 Data Specifics

The currently developed geospatial databases provide an array of information – in the form of attributes (reflected in the database structure) as well as more implicit features (connected with the objects' geometry and topography) – which can be used in the generalisation of geographical information. In this way the data is described by a number of attributes that can potentially be used in the further management of the generalisation process.

What is worth emphasising is the fact that the attributes can be expressed in different measurement scales: qualitative (ranging from the binary scale, through the classifying scale, to the ordinal scale) as well as quantitative. Thus, the decision attribute can also be represented in different measurement scales.

In this article the selection generalisation operator is considered, with the decision expressed in two measurement scales (Table 1):

Table 1. Attribute values of the test dataset (buildings): a1 – building function (r – residential, o – office, s – shops & services, g – religious), a2 – public function (1 – yes, 0 – no), a3 – area (in square metres), a4 – shortest distance to the river, a5 – shortest distance to the road, a6 – shortest distance to another building, a7 – shortest distance to the forest, a8 – shortest distance to the built-up area; attributes a3 to a8 are calculated based on the objects' geometry, a4 to a8 are expressed in metres; decision attributes (established by an expert) in different scales: dec1 – quantitative scale, dec2 – binary scale (for the LoD 1:20 000: dec2 = 1 for dec1 ≥ 20 000)
  1. Binary – for systems created for one particular level of detail (corresponding to the scale 1:20 000): 1 – the object is selected, 0 – the object is not selected;

  2. Quantitative – for a system of universal character, allowing objects to be selected at any map scale (within an assumed range) – the attribute's value is the corresponding scale denominator.

However, the second variant is strongly preferred. Firstly, it does not require designing separate systems for each of the desired scales. In the past, when analogue maps prevailed, it was possible to distinguish the scales at which the data were to be generalised (they corresponded to the scales at which the maps were printed). Nowadays, however, most maps are accessed interactively via the Internet and the end user can choose any scale, thus generalisation to all levels of detail is useful.

The test dataset corresponds to topographical data collected at the 1:10 000 level of detail, which is available for the whole area of Poland in the National Cartographical Database (pl. Państwowy Zasób Kartograficzny) and known as BDOT10k (pl. Baza Danych Obiektów Topograficznych – Topographical Objects Database). However, the data are strongly simplified (Fig. 1).

Fig. 1. Graphical representation of the test dataset (numbered buildings) with other objects providing spatial context: forests, roads, river, built-up area

3 Rough Set Based Feature Selection

3.1 Rough Sets

Rough set theory allows the complexity of a system to be reduced by searching for a reduct B – a subset of the entire attribute set A [6–8, 11]. The search is based on the indiscernibility relation, which can be defined as:

$$ R_{B} = \left\{ {\left( {x, y} \right) \in X^{2} :\;\left( {\forall a \in B} \right)(a\left( x \right) = a\left( y \right))} \right\} $$

The so-called decision reduct ensures the preservation of the original discernibility towards the decision: if objects from different decision classes are discernible on the attribute set A, they are also discernible on its subset B ⊆ A, being a reduct. The reduct is minimal, which means that none of the reduct's attributes can be omitted without losing the discernibility mentioned above [6–8].
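
To make the reduct condition concrete, the following is a minimal Python sketch of the classical (crisp) check that a candidate attribute subset preserves discernibility towards the decision; the toy table and values are hypothetical illustrations, not the paper's dataset:

```python
from itertools import combinations

def indiscernible(x, y, attrs):
    """x and y are indiscernible w.r.t. attrs if they agree on every attribute."""
    return all(x[a] == y[a] for a in attrs)

def preserves_discernibility(objects, decisions, attrs):
    """Decision-reduct condition: every pair of objects with different
    decisions must remain discernible on the attribute subset attrs."""
    for (i, x), (j, y) in combinations(enumerate(objects), 2):
        if decisions[i] != decisions[j] and indiscernible(x, y, attrs):
            return False
    return True

# Toy categorical data (hypothetical):
objects = [{"a1": "r", "a2": 0},
           {"a1": "o", "a2": 0},
           {"a1": "r", "a2": 1}]
decisions = [0, 1, 1]
print(preserves_discernibility(objects, decisions, {"a1", "a2"}))  # True
print(preserves_discernibility(objects, decisions, {"a2"}))        # False
```

A reduct is then a preserving subset from which no attribute can be dropped without breaking the check.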

The approach described comes with a particular constraint: the attributes as well as the decisions should be expressed in the classifying (not ordinal) scale. Otherwise, discretisation is required, which entails a partial loss of information (including e.g. the order of the distinguished classes).

One of the extensions of this approach considers graded indiscernibility between objects. Thus, the classes of an attribute can be more or less similar to each other [8]. The established dissimilarities between attribute classes – degrees of discernibility – can be expressed in the form of a matrix (example – Table 2).

Table 2. Different discernibility degrees for classes of attribute a1

3.2 Dominance-Based Rough Set Approach

The dominance-based rough set (DBRS) approach, an extension of rough set theory, enables the use of attributes as well as decisions expressed in the ordinal scale without loss of information. The theory postulates the approximation (and consequently the calculation of reducts) for unions of subsequent decision classes. It is insufficient, however, in the case of attributes expressed in quantitative scales: it assumes a monotonic relation between the attributes, but does not establish the distance between subsequent classes [3].

3.3 Fuzzy Rough Sets

The hybrid of fuzzy set theory and rough set theory, which enables the creation of fuzzy reducts, employs attributes in the quantitative scale. The discernibility relation based on the equality of attribute values is replaced with a measure expressing the closeness of the objects, represented by a fuzzy discernibility relation R [2, 11, 12].

The discernibility matrix, present also in the traditional rough set theory, is then filled with the measure of closeness (based on each attribute) for each pair of objects with different decision values. In this paper the value of fuzzy discernibility is calculated as follows:

$$ R_{a} \left( {x, y} \right) = \frac{|a\left( x \right) - a\left( y \right)|}{l(a)} $$
(1)
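
As a one-line illustration of formula (1); it treats l(a) as the length of attribute a's value range, a common normalisation in FRST stated here as an assumption, since the paper does not define l(a) explicitly:

```python
def fuzzy_discernibility(a_x, a_y, value_range):
    """Formula (1): R_a(x, y) = |a(x) - a(y)| / l(a); 0 means the objects
    look identical on attribute a, values near 1 mean maximally different."""
    return abs(a_x - a_y) / value_range

# Hypothetical building areas (a3) with an assumed attribute span of 500 m^2:
print(fuzzy_discernibility(120.0, 370.0, 500.0))  # 0.5
```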

Subsequently, on the basis of the discernibility matrix, the quality of a reduct can be calculated by finding the best fuzzy discernibility R_b for each pair of objects and taking its minimal value over all pairs (instead of max, any t-conorm ⊥ can be used):

$$ q(B) = \hbox{min} \left( {\hbox{max} \left( {R_{b \in B}} \right)} \right) $$
(2)

This approach is based on the original RST assumption that each reduct is as good as its weakest component, meaning it is as good as the least discernible pair of objects belonging to separate decision classes. Therefore, the minimum operator is used in the original approach; however, some authors find it overly restrictive and allow the use of an average instead [2]:

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b \in B} } \right)} \right) $$
(3)

This approach was used in the following work. The quality \( q(B) \) can then be compared with the quality of the whole attribute set A (where ε is the acceptable tolerance of quality loss) [2]:

$$ q (B) \ge (1 - \varepsilon )q(A) $$
(4)
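
Formulas (2)–(4) can be sketched as follows; the per-pair R_b values below are hypothetical, and max/min stand in for the general t-conorm/t-norm:

```python
def reduct_quality(pair_discernibility, aggregate="mean"):
    """q(B): for each pair of objects from different decision classes take the
    best (max) discernibility over the attributes of B, then aggregate over
    pairs with min (formula 2) or mean (formula 3)."""
    per_pair = [max(values) for values in pair_discernibility]
    return min(per_pair) if aggregate == "min" else sum(per_pair) / len(per_pair)

def acceptable(q_b, q_a, eps):
    """Formula (4): q(B) >= (1 - eps) * q(A)."""
    return q_b >= (1 - eps) * q_a

# Hypothetical R_b values for three object pairs and two attributes of B:
pairs = [[0.2, 0.9], [0.6, 0.1], [0.4, 0.4]]
print(reduct_quality(pairs, "min"))   # 0.4
print(reduct_quality(pairs, "mean"))  # ≈ 0.633
print(acceptable(0.63, 0.70, 0.15))   # True
```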

Another approach to reduct evaluation is to punish pairs of objects (x, y) which belong to different decision classes but are nearly indiscernible on the reduct's attributes. It can be expressed as [12]:

$$ P_{B} = {\mathcal{T}}_{b \in B} \left(1 - R_{b} \left( {x, y} \right)\right) $$
(5)
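
A minimal sketch of the punishment term (5), instantiating the t-norm T as the product (one admissible choice; the paper does not fix the t-norm here):

```python
from math import prod

def punishment(r_values):
    """Formula (5) with the product t-norm: P_B = prod over b in B of
    (1 - R_b(x, y)); close to 1 when a pair from different decision classes
    is nearly indiscernible on every attribute of B (the case being punished)."""
    return prod(1.0 - r for r in r_values)

# A nearly indiscernible pair (hypothetical R_b values) is punished heavily:
print(punishment([0.1, 0.2]))  # ≈ 0.72
print(punishment([0.9, 0.8]))  # ≈ 0.02, a well-discerned pair
```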

The FRST approaches are the first that enable the calculation of reducts for the data presented at the beginning without discretisation. However, they make it possible to establish a reduct only for the first decision variant – the binary decision scale.

4 Fuzzy Rough Reducts for Quantitative Decision

4.1 Adaptations of Existing FRST Methods

Few authors directly address the problem of reducts for decisions in the quantitative scale (dec1); however, some of the solutions from Sect. 3.3 can be adapted for this case.

In formula (2) or (3), the similarity of objects by decision can be added (now all pairs of objects are compared):

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b \in B} , 1 - R_{dec} } \right)} \right) $$
(6)

However, the disadvantage of this solution is that it overly promotes pairs of objects which are indiscernible according to the decision (R_dec ≅ 0), which is in fact not interesting when looking for decision reducts.

Formula (5) can also be adapted by adding the fuzzy discernibility relation on the decision to the t-norm [12]:

$$ {\mathcal{T}}(P_{B} , R_{dec} ) $$
(7)

The final punishment is calculated as a sum:

$$ Sim\left( {d/B} \right) = \mathop \sum \limits_{x, y: d\left( x \right) \ne d\left( y \right)} R_{dec} \mathop \prod \limits_{b \in B} (1 - R_{b} ) $$
(8)

The disadvantage of this approach (formulas 7 and 8) is that it does not seem intuitive, as it combines indiscernibility by decision with data already aggregated by a t-norm.
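
For completeness, a sketch of the adapted penalty (formulas 7–8), with the per-attribute and decision relations computed as in formula (1); all the numbers and attribute spans are hypothetical:

```python
from math import prod

def similarity_penalty(objects, decisions, attrs, spans, dec_span):
    """Formula (8): sum, over pairs with different decisions, of R_dec times
    the product t-norm of (1 - R_b) over the attributes in B."""
    r = lambda v, w, span: abs(v - w) / span  # formula (1)
    total = 0.0
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if decisions[i] == decisions[j]:
                continue
            total += (r(decisions[i], decisions[j], dec_span)
                      * prod(1.0 - r(objects[i][a], objects[j][a], spans[a])
                             for a in attrs))
    return total

# Two buildings, one attribute (a3), quantitative decisions (scale denominators):
objs = [{"a3": 100.0}, {"a3": 300.0}]
decs = [10000.0, 20000.0]
print(similarity_penalty(objs, decs, ["a3"], {"a3": 400.0}, 10000.0))  # 0.5
```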

4.2 Proposed Solutions

The proposed solution intends to be more intuitive for a non-mathematical expert. It is based on establishing the value of a relative relation R_a_rel that considers the objects' relation R both on the attribute (R_a) and on the decision (R_dec). Therefore, it is proposed to calculate the relative tolerance relation for each pair of objects as a t-norm of R_a and R_dec:

$$ R_{a\_rel} = { \mathcal{T}}\left( {R_{a} , R_{dec} } \right) $$
(9)

From the point of view of the applications described in the introduction, the most interesting t-norms seem to be:

  1. product(R_a, R_dec)

  2. Hamacher product(R_a, R_dec): \( {\rm T}_{{H_{0} }} \left( {a, b} \right) = \left\{ {\begin{array}{*{20}c} 0 & {if \;a = b = 0} \\ {\frac{ab}{a + b - ab}} & {otherwise} \\ \end{array} } \right. \)

  3. min(R_a, R_dec)
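
The three candidate t-norms can be written down directly; the sample R_a and R_dec values below are hypothetical:

```python
def t_product(a, b):
    return a * b

def t_hamacher(a, b):
    """Hamacher product: 0 when a = b = 0, else ab / (a + b - ab)."""
    return 0.0 if a == b == 0 else a * b / (a + b - a * b)

def t_min(a, b):
    return min(a, b)

# Relative relation R_a_rel = T(R_a, R_dec) for one hypothetical pair:
r_a, r_dec = 0.5, 0.8
print(t_product(r_a, r_dec))   # 0.4
print(t_hamacher(r_a, r_dec))  # 0.4 / 0.9 ≈ 0.444
print(t_min(r_a, r_dec))       # 0.5
```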

The further procedure is identical to the classic FRST method, though all possible pairs of objects are compared and R_b_rel is used instead of R_b:

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b\_rel,\;b \in B} } \right)} \right) $$
(10)

Such an approach makes it possible to follow the significance of each attribute (in relation to the decision) for each pair of objects. This can be valuable for an expert using the decision system, as it allows the importance of the attributes to be understood intuitively.
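
Formula (10) differs from (3) mainly in that all pairs of objects are considered and the relative relation is aggregated; a sketch under hypothetical R_b_rel values:

```python
def quality_relative(r_rel_pairs, attrs):
    """Formula (10): for every pair of objects take the best R_b_rel over the
    candidate attributes, then average over ALL pairs (not only pairs from
    different decision classes)."""
    per_pair = [max(pair[a] for a in attrs) for pair in r_rel_pairs]
    return sum(per_pair) / len(per_pair)

# Hypothetical R_b_rel values for three pairs of buildings:
pairs = [{"a1": 0.2, "a8": 0.6},
         {"a1": 0.5, "a8": 0.1},
         {"a1": 0.3, "a8": 0.3}]
print(quality_relative(pairs, {"a8"}))        # ≈ 0.333
print(quality_relative(pairs, {"a1", "a8"}))  # ≈ 0.467
```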

5 Experiments on Test Data

5.1 Some General Assumptions

The calculation of the fuzzy discernibility indicator R_a for a pair of objects depends on the scale in which attribute a is expressed. Therefore:

  • For attribute a2 (expressed in the binary scale) the classical discernibility approach, based on the equivalence relation, was employed,

  • For attribute a1 (expressed in the classifying scale) a similarity matrix was employed (Table 2),

  • For attributes a3 to a8 (in the quantitative scale) the tolerance relation R based on formula (1) was employed.
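
The per-scale rules above can be gathered in a small dispatcher; the similarity-matrix entry and attribute spans below are hypothetical placeholders:

```python
def discernibility(attr, x_val, y_val, spans, sim_matrix):
    """Per-scale fuzzy discernibility as used in the experiments (a sketch):
    a2 (binary)      -> classical 0/1 equivalence,
    a1 (classifying) -> degree looked up in a similarity matrix (cf. Table 2),
    a3..a8 (quant.)  -> formula (1), |a(x) - a(y)| / l(a)."""
    if attr == "a2":
        return 0.0 if x_val == y_val else 1.0
    if attr == "a1":
        return sim_matrix[(x_val, y_val)]
    return abs(x_val - y_val) / spans[attr]

sim = {("r", "o"): 0.5}  # hypothetical Table 2 entry: residential vs office
print(discernibility("a1", "r", "o", {}, sim))              # 0.5
print(discernibility("a2", 1, 0, {}, {}))                   # 1.0
print(discernibility("a4", 10.0, 60.0, {"a4": 100.0}, {}))  # 0.5
```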

Due to the specificity of the described problem, establishing all possible reducts of the set was not necessary. In practice, for the purpose of further application, only one, sometimes a few, reducts will be used. The attributes necessary for the calculation are usually easily accessible, as they are either available in the databases as descriptive attributes designed in the database structure, or are easy to calculate on the basis of the objects' geometry. Therefore, the reducts were calculated with Johnson's heuristic. It operates as follows: at each step, the attribute whose addition results in the biggest increase of the quality \( q \) (understood as in formula 3) is added to the reduct. These steps may be repeated until:

  • Obtaining a quality \( q \) that fulfils condition (4) assumed by the user, or

  • Reaching the point when adding further attributes results in an increase of quality \( q \) lower than the established \( \Delta q \).

In this work the second criterion (with \( \Delta q = 0.02 \)) was employed, due to the necessity of maintaining low system complexity (and consequently not overly numerous reducts) when higher complexity does not increase the overall quality significantly.
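
The greedy search with the Δq stopping rule can be sketched as follows; the quality function in the example is a hypothetical additive stand-in for formula (3) or (10):

```python
def greedy_reduct(quality_of, all_attrs, delta_q=0.02):
    """Johnson-style greedy heuristic: repeatedly add the attribute giving the
    largest increase of the quality q; stop when the best available gain falls
    below delta_q (or nothing improves q any more)."""
    reduct, q = set(), 0.0
    while reduct != all_attrs:
        best_attr, best_q = None, q
        for a in sorted(all_attrs - reduct):
            candidate_q = quality_of(reduct | {a})
            if candidate_q > best_q:
                best_attr, best_q = a, candidate_q
        if best_attr is None or best_q - q < delta_q:
            break
        reduct.add(best_attr)
        q = best_q
    return reduct

# Hypothetical quality: a2 contributes too little (gain < delta_q) to be kept.
gains = {"a1": 0.50, "a2": 0.01, "a8": 0.45}
quality_of = lambda attrs: sum(gains[a] for a in attrs)
print(sorted(greedy_reduct(quality_of, set(gains))))  # ['a1', 'a8']
```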

Similarly, further reducts can be calculated (starting from a subsequent attribute); however, this work limited itself to only one reduct in each example.

5.2 Fuzzy Reducts for Binary Decision

Determination of the reduct started with combining the objects belonging to different decision classes (dec2) into pairs and calculating the discernibility matrix of relations R (according to the rules described in Sect. 5.1). Then the consecutive elements of the reduct were established using the greedy heuristic based on the \( q \) value – Table 3.

Table 3. Subsequent steps of fuzzy reduct creation for the binary decision, including the corresponding reducts qualities (the fields with the highest accuracy are highlighted, while the elements added to the reduct are in bold)

Consequently, the reduct {a1, a2, a8} was established. Adding another attribute would not increase the quality of the reduct, so no further attribute was added. Worth mentioning is the high quality of the decision reduct: the ability to distinguish objects with different decisions is high.

5.3 Fuzzy Reducts for Quantitative Decisions

First, the method using formula 6 was applied. The results of the subsequent steps are presented in Table 4. One of the possible decision reducts is exactly the same as the one described in Sect. 5.2, although it allows a number of object types to be distinguished. However, the accuracies seem over-optimistic: the goal was to distinguish between more exact decisions, yet the qualities here are higher than for the binary decision.

Table 4. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the method used: according to the formula 6

The next step was to test the method proposed in Sect. 4.2. In this case all possible pairs of objects were juxtaposed. For each of them R_a, R_dec1 and R_a_rel were calculated (in two variants: using the product t-norm and the Hamacher product). Then the quality based on R_a_rel was calculated for particular attributes – the subsequent steps of the heuristic are illustrated in the tables below (Tables 5 and 6).

Table 5. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the t-norm used: \( R_{rel} = {\text{ product}}\left( {R_{a} , \, R_{dec} } \right) \)
Table 6. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the t-norm used: \( R_{rel} = {\text{ Hamacher product}}\left( {R_{a} , \, R_{dec} } \right) \)
Table 7. Subsequent steps of fuzzy reduct creation for the quantitative decision (decision artificially brought to the binary scale), including the corresponding accuracies (highlights as in Table 3); the t-norm used: product and Hamacher product (results for both are identical)

The method results in the same reducts as the previous ones, irrespective of which t-norm is used. What is more, even though the quality values differ depending on the t-norm used (and differ even more from the corresponding ones in Table 4), there seem to be noticeable tendencies in their distribution over the attributes.

In the next stage, in order to test the universality of the method, the same steps were taken, though in this case on the binary decision (dec2) – Table 7. In this case the results, irrespective of the chosen t-norm, were alike (\( R_{dec} \in \left\{ {0, 1} \right\} \), so all t-norms have the same value in the R_b function). It should be noted that the result (the calculated reduct) and the relation between the qualities of particular reducts are identical to those calculated with the classic method described in Sect. 5.2. The ratio of the corresponding accuracies in Tables 3 and 7 equals ca. 0.53, which reflects the proportion of the number of pairs in different decision classes to the total number of pairs (35/66), or in other words the average discernibility of all pairs of objects with respect to the decision.

6 Conclusions

The work above addressed the problem of reduct calculation in the case of decisions in the quantitative scale. In the process, a tolerance relation (R_rel), understood as a t-norm of the tolerance relations of the attribute (R_a) and the decision (R_dec), was employed.

The conducted tests using the two types of t-norm – product and Hamacher product – gave similar results, meaning they both allowed the same reducts to be achieved irrespective of differences in the quality values of the reducts. Generally, the reduct quality values calculated with this method were lower than for the binary decision, which seems justifiable considering the necessity of distinguishing the objects according to their continuous decision value. They are also lower than the qualities calculated by one of the existing methods (formula 6); however, those qualities seem over-optimistic, as they grow unreasonably thanks to the objects having the same or similar decision values.

The methodology employed in the case of the artificially binary decision allowed the same reduct to be achieved as in the original FRST method. However, even though the relation of quality values between particular attributes was maintained, the absolute values of quality were different. This is a result of calculating the discernibility relation for all pairs of objects.

The tests indicate that the proposed method can be applied in the process of generalisation of geographical data mentioned above. However, there is also potential for other application areas. Depending on the application type, another t-norm for calculating R_rel can be used.

The main advantage over the other methods is that it is intuitively understandable and prevents a black-box solution, allowing the user to follow the importance of each attribute at every single step of the reduct computation. Therefore, the method can be employed, among others, in the creation of a generalisation control system, at the first stage of its development – attribute selection.