1 Introduction

One of the most important challenges of modern cartography is the automation of geographical information generalisation [1, 4, 9, 10]. It requires capturing the data's crucial information, general patterns, and tendencies, and then retaining them at lower levels of detail (LoD), which correspond to smaller map scales. Up to now this task has been performed manually by skilled cartographers, and it is difficult to algorithmise. Even though there is an array of generalisation algorithms addressing particular generalisation operators (such as object selection, simplification, smoothing, aggregation, amplification, etc.), what still poses a problem is the control of the entire generalisation process – starting from the decision "whether to generalise at all", through the choice of the appropriate operator and algorithm, up to the final selection of the latter's parameters [1, 4, 9, 10].

The facts described above lead to the conclusion that, apart from the particular tools employed in the generalisation process, a decision-making system is required to manage operations on many different levels. Such a system should possess and utilise the skilled cartographer's knowledge. However, such skills usually result from years of practice and experience, along with refined aesthetic taste, and are therefore not available explicitly. Taking the above into consideration, according to the author, the methods of knowledge discovery in databases (KDD) might be used to convert this hidden knowledge into an explicit form [5, 10].

Therefore the author's ultimate aim is to create a base of fuzzy rules governing the generalisation process. The step prior to that is to choose the features crucial for the generalisation process itself, which will simplify the decision system. The following paper addresses the methodology of attribute selection (using reducts) that takes the specific data features into consideration. While in the previous work [10] the classical rough set approach for categorical data was used, this paper focuses on the problem of numeric feature selection based on numeric decisions – it discusses and provides some new extensions of the fuzzy rough set (FRST) approach. Rough set approaches are chosen for feature selection as they are easily understandable and give good intuition about why a certain attribute is selected.

2 Data Specifics

The currently developed geospatial databases provide an array of information – in the form of attributes (reflected in the database structure) as well as more implicit features (connected with the objects' geometry and topography) – which can be used in the generalisation of geographical information. In this way the data is described by a number of attributes that can potentially be used in the further management of the generalisation process.

What is worth emphasising is the fact that the attributes can be expressed in different measurement scales: qualitative (ranging from the binary scale, through the classifying scale, to the ordinal scale) as well as quantitative. Thus, the decision attribute can also be represented in different measurement scales.

In this article the selection generalisation operator is considered, with the decision expressed in two measurement scales (Table 1):

Table 1. Attribute values of the test dataset (buildings): a1 – building function (r – residential, o – office, s – shops & services, g – religious), a2 – public function (1 – yes, 0 – no), a3 – area (in square metres), a4 – shortest distance to the river, a5 – shortest distance to the road, a6 – shortest distance to another building, a7 – shortest distance to the forest, a8 – shortest distance to the built-up area; attributes a3 to a8 are calculated based on the objects' geometry, a4 to a8 are expressed in metres; decision attributes (established by an expert) in different scales: dec1 – quantitative scale, dec2 – binary scale (for the LoD 1:20 000: dec2 = 1 for dec1 ≥ 20 000)
  1. Binary – for systems created for one particular level of detail (corresponding to the scale 1:20 000): 1 – the object is selected, 0 – the object is not selected;

  2. Quantitative – for a system of universal character, allowing objects to be selected at any map scale (within an assumed range) – the attribute's value is the corresponding scale denominator.

However, the second variant is strongly preferred. Firstly, it does not require designing separate systems for each of the desired scales. In the past, when analogue maps prevailed, it was possible to distinguish the scales at which the data were to be generalised (they corresponded to the scales at which the maps were printed). Nowadays, however, most maps are accessed interactively via the Internet and the end user can choose any scale, thus generalisation to all levels of detail is useful.

The test dataset corresponds to topographical data collected at the 1:10 000 level of detail, which is available for the whole area of Poland in the National Cartographical Database (pl. Państwowy Zasób Kartograficzny) and known as BDOT10k (pl. Baza Danych Obiektów Topograficznych – Topographical Objects Database). However, the data are strongly simplified (Fig. 1).

Fig. 1. Graphical representation of the test dataset (numbered buildings) with other objects providing spatial context: forests, roads, river, built-up area

3 Rough Set Based Feature Selection

3.1 Rough Sets

Rough set theory allows the complexity of a system to be reduced by searching for a reduct B – a subset of the entire attribute set A [6–8, 11]. The search is based on the indiscernibility relation, which can be defined as:

$$ R_{B} = \left\{ {\left( {x, y} \right) \in X^{2} :\;\left( {\forall a \in B} \right)(a\left( x \right) = a\left( y \right))} \right\} $$

The so-called decision reduct ensures the preservation of the original discernibility towards the decision: if objects from different decision classes are discernible on the attribute set A, they are also discernible on its subset B ⊆ A, being a reduct. The reduct is minimal, which means that none of the reduct's attributes can be omitted without losing the discernibility mentioned above [6–8].
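
To make the reduct condition concrete, the following is a minimal Python sketch of the classical (crisp) check that a candidate attribute subset preserves discernibility towards the decision; the toy table and values are hypothetical illustrations, not the paper's dataset:

```python
from itertools import combinations

def indiscernible(x, y, attrs):
    """x and y are indiscernible w.r.t. attrs if they agree on every attribute."""
    return all(x[a] == y[a] for a in attrs)

def preserves_discernibility(objects, decisions, attrs):
    """Decision-reduct condition: every pair of objects with different
    decisions must remain discernible on the attribute subset attrs."""
    for (i, x), (j, y) in combinations(enumerate(objects), 2):
        if decisions[i] != decisions[j] and indiscernible(x, y, attrs):
            return False
    return True

# Toy categorical data (hypothetical):
objects = [{"a1": "r", "a2": 0},
           {"a1": "o", "a2": 0},
           {"a1": "r", "a2": 1}]
decisions = [0, 1, 1]
print(preserves_discernibility(objects, decisions, {"a1", "a2"}))  # True
print(preserves_discernibility(objects, decisions, {"a2"}))        # False
```

A reduct is then a preserving subset from which no attribute can be dropped without breaking the check.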

The approach described comes with a particular constraint: the attributes as well as the decisions should be expressed in the classifying (not ordinal) scale. Otherwise, discretisation is required, which entails a partial loss of information (including e.g. the order of the distinguished classes).

One of the extensions of this approach considers graded indiscernibility between objects. Thus, the classes of an attribute can be more or less similar to each other [8]. The established dissimilarities between attribute classes – degrees of discernibility – can be expressed in the form of a matrix (example – Table 2).

Table 2. Different discernibility degrees for classes of attribute a1

3.2 Dominance-Based Rough Set Approach

The dominance-based rough set (DBRS) approach, an extension of rough set theory, enables the use of attributes as well as decisions expressed in the ordinal scale without loss of information. The theory postulates the approximation (and consequently the calculation of reducts) for unions of subsequent decision classes. It is insufficient, however, in the case of attributes expressed in quantitative scales: it assumes a monotonic relation between the attributes, but does not establish the distance between subsequent classes [3].

3.3 Fuzzy Rough Sets

The hybrid of fuzzy set theory and rough set theory, which enables the creation of fuzzy reducts, employs attributes in the quantitative scale. The discernibility relation based on the equality of attribute values is replaced with a measure expressing the closeness of the objects, represented by a fuzzy discernibility relation R [2, 11, 12].

The discernibility matrix, present also in the traditional rough set theory, is then filled with the measure of closeness (based on each attribute) for each pair of objects with different decision values. In this paper the value of fuzzy discernibility is calculated as follows:

$$ R_{a} \left( {x, y} \right) = \frac{|a\left( x \right) - a\left( y \right)|}{l(a)} $$
(1)
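
As a one-line illustration of formula (1); it treats l(a) as the length of attribute a's value range, a common normalisation in FRST stated here as an assumption, since the paper does not define l(a) explicitly:

```python
def fuzzy_discernibility(a_x, a_y, value_range):
    """Formula (1): R_a(x, y) = |a(x) - a(y)| / l(a); 0 means the objects
    look identical on attribute a, values near 1 mean maximally different."""
    return abs(a_x - a_y) / value_range

# Hypothetical building areas (a3) with an assumed attribute span of 500 m^2:
print(fuzzy_discernibility(120.0, 370.0, 500.0))  # 0.5
```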

Subsequently, on the basis of the discernibility matrix, the quality of a reduct can be calculated by finding the best fuzzy discernibility R_b for each pair of objects and taking its minimal value over all pairs (instead of max, any t-conorm ⊥ can be used):

$$ q(B) = \hbox{min} \left( {\hbox{max} \left( {R_{b \in B}} \right)} \right) $$
(2)

This approach is based on the original RST assumption that each reduct is as good as its weakest component, meaning it is as good as the least discernible pair of objects belonging to separate decision classes. Therefore, the minimum operator is used in the original approach; however, some authors find it overly restrictive and allow the use of an average instead [2]:

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b \in B} } \right)} \right) $$
(3)

This approach was used in the following work. The quality \( q(B) \) can then be compared with the quality of the whole attribute set A (where ε is the acceptable tolerance of quality loss) [2]:

$$ q (B) \ge (1 - \varepsilon )q(A) $$
(4)
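
Formulas (2)–(4) can be sketched as follows; the per-pair R_b values below are hypothetical, and max/min stand in for the general t-conorm/t-norm:

```python
def reduct_quality(pair_discernibility, aggregate="mean"):
    """q(B): for each pair of objects from different decision classes take the
    best (max) discernibility over the attributes of B, then aggregate over
    pairs with min (formula 2) or mean (formula 3)."""
    per_pair = [max(values) for values in pair_discernibility]
    return min(per_pair) if aggregate == "min" else sum(per_pair) / len(per_pair)

def acceptable(q_b, q_a, eps):
    """Formula (4): q(B) >= (1 - eps) * q(A)."""
    return q_b >= (1 - eps) * q_a

# Hypothetical R_b values for three object pairs and two attributes of B:
pairs = [[0.2, 0.9], [0.6, 0.1], [0.4, 0.4]]
print(reduct_quality(pairs, "min"))   # 0.4
print(reduct_quality(pairs, "mean"))  # ≈ 0.633
print(acceptable(0.63, 0.70, 0.15))   # True
```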

Another approach to reduct evaluation is to punish pairs of objects (x, y) which belong to different decision classes but are nearly indiscernible on the reduct's attributes. It can be expressed as [12]:

$$ P_{B} = {\mathcal{T}}_{b \in B} \left(1 - R_{b} \left( {x, y} \right)\right) $$
(5)
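
A minimal sketch of the punishment term (5), instantiating the t-norm T as the product (one admissible choice; the paper does not fix the t-norm here):

```python
from math import prod

def punishment(r_values):
    """Formula (5) with the product t-norm: P_B = prod over b in B of
    (1 - R_b(x, y)); close to 1 when a pair from different decision classes
    is nearly indiscernible on every attribute of B (the case being punished)."""
    return prod(1.0 - r for r in r_values)

# A nearly indiscernible pair (hypothetical R_b values) is punished heavily:
print(punishment([0.1, 0.2]))  # ≈ 0.72
print(punishment([0.9, 0.8]))  # ≈ 0.02, a well-discerned pair
```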

The FRST approaches are the first that enable the calculation of reducts for the data presented at the beginning without discretisation. However, they make it possible to establish a reduct only for the first decision variant – the binary decision scale.

4 Fuzzy Rough Reducts for Quantitative Decision

4.1 Adaptations of Existing FRST Methods

Few authors directly address the problem of reducts for decisions in the quantitative scale (dec1); however, some of the solutions from Sect. 3.3 can be adapted for this case.

In formula (2) or (3), the similarity of objects by decision can be added (now all pairs of objects are compared):

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b \in B} , 1 - R_{dec} } \right)} \right) $$
(6)

However, the disadvantage of this solution is that it overly promotes pairs of objects which are indiscernible according to the decision (R_dec ≅ 0), which is in fact not interesting when looking for decision reducts.

Formula (5) can also be adapted by adding the fuzzy discernibility relation on the decision to the t-norm [12]:

$$ {\mathcal{T}}(P_{B} , R_{dec} ) $$
(7)

The final punishment is calculated as a sum:

$$ Sim\left( {d/B} \right) = \mathop \sum \limits_{x, y: d\left( x \right) \ne d\left( y \right)} R_{dec} \mathop \prod \limits_{b \in B} (1 - R_{b} ) $$
(8)

The disadvantage of this approach (formulas 7 and 8) is that it does not seem intuitive, as it combines indiscernibility by decision with data already aggregated by a t-norm.
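
For completeness, a sketch of the adapted penalty (formulas 7–8), with the per-attribute and decision relations computed as in formula (1); all the numbers and attribute spans are hypothetical:

```python
from math import prod

def similarity_penalty(objects, decisions, attrs, spans, dec_span):
    """Formula (8): sum, over pairs with different decisions, of R_dec times
    the product t-norm of (1 - R_b) over the attributes in B."""
    r = lambda v, w, span: abs(v - w) / span  # formula (1)
    total = 0.0
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if decisions[i] == decisions[j]:
                continue
            total += (r(decisions[i], decisions[j], dec_span)
                      * prod(1.0 - r(objects[i][a], objects[j][a], spans[a])
                             for a in attrs))
    return total

# Two buildings, one attribute (a3), quantitative decisions (scale denominators):
objs = [{"a3": 100.0}, {"a3": 300.0}]
decs = [10000.0, 20000.0]
print(similarity_penalty(objs, decs, ["a3"], {"a3": 400.0}, 10000.0))  # 0.5
```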

4.2 Proposed Solutions

The proposed solution intends to be more intuitive for a non-mathematical expert. It is based on establishing the value of a relative relation R_a_rel that considers the objects' relation R both on the attribute (R_a) and on the decision (R_dec). Therefore, it is proposed to calculate the relative tolerance relation for each pair of objects as a t-norm of R_a and R_dec:

$$ R_{a\_rel} = { \mathcal{T}}\left( {R_{a} , R_{dec} } \right) $$
(9)

From the point of view of the applications described in the introduction, the most interesting t-norms seem to be:

  1. product(R_a, R_dec)

  2. Hamacher product(R_a, R_dec): \( {\rm T}_{{H_{0} }} \left( {a, b} \right) = \left\{ {\begin{array}{*{20}c} 0 & {if \;a = b = 0} \\ {\frac{ab}{a + b - ab}} & {otherwise} \\ \end{array} } \right. \)

  3. min(R_a, R_dec)
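
The three candidate t-norms can be written down directly; the sample R_a and R_dec values below are hypothetical:

```python
def t_product(a, b):
    return a * b

def t_hamacher(a, b):
    """Hamacher product: 0 when a = b = 0, else ab / (a + b - ab)."""
    return 0.0 if a == b == 0 else a * b / (a + b - a * b)

def t_min(a, b):
    return min(a, b)

# Relative relation R_a_rel = T(R_a, R_dec) for one hypothetical pair:
r_a, r_dec = 0.5, 0.8
print(t_product(r_a, r_dec))   # 0.4
print(t_hamacher(r_a, r_dec))  # 0.4 / 0.9 ≈ 0.444
print(t_min(r_a, r_dec))       # 0.5
```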

The further procedure is identical to the classic FRST method, though all possible pairs of objects are compared and R_b_rel is used instead of R_b:

$$ q(B) = {\text{mean}}\left( {\hbox{max} \left( {R_{b\_rel,\;b \in B} } \right)} \right) $$
(10)

Such an approach makes it possible to follow the significance of each attribute (in relation to the decision) for each pair of objects. This can be valuable for an expert using the decision system, as it allows the importance of the attributes to be understood intuitively.
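
Formula (10) differs from (3) mainly in that all pairs of objects are considered and the relative relation is aggregated; a sketch under hypothetical R_b_rel values:

```python
def quality_relative(r_rel_pairs, attrs):
    """Formula (10): for every pair of objects take the best R_b_rel over the
    candidate attributes, then average over ALL pairs (not only pairs from
    different decision classes)."""
    per_pair = [max(pair[a] for a in attrs) for pair in r_rel_pairs]
    return sum(per_pair) / len(per_pair)

# Hypothetical R_b_rel values for three pairs of buildings:
pairs = [{"a1": 0.2, "a8": 0.6},
         {"a1": 0.5, "a8": 0.1},
         {"a1": 0.3, "a8": 0.3}]
print(quality_relative(pairs, {"a8"}))        # ≈ 0.333
print(quality_relative(pairs, {"a1", "a8"}))  # ≈ 0.467
```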

5 Experiments on Test Data

5.1 Some General Assumptions

The calculation of the fuzzy discernibility indicator R_a for a pair of objects depends on the scale in which attribute a is expressed. Therefore:

  • For attribute a2 (expressed in the binary scale) the classical discernibility approach, based on the equivalence relation, was employed,

  • For attribute a1 (expressed in the classifying scale) a similarity matrix was employed (Table 2),

  • For attributes a3 to a8 (in the quantitative scale) the tolerance relation R based on formula (1) was employed.
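
The per-scale rules above can be gathered in a small dispatcher; the similarity-matrix entry and attribute spans below are hypothetical placeholders:

```python
def discernibility(attr, x_val, y_val, spans, sim_matrix):
    """Per-scale fuzzy discernibility as used in the experiments (a sketch):
    a2 (binary)      -> classical 0/1 equivalence,
    a1 (classifying) -> degree looked up in a similarity matrix (cf. Table 2),
    a3..a8 (quant.)  -> formula (1), |a(x) - a(y)| / l(a)."""
    if attr == "a2":
        return 0.0 if x_val == y_val else 1.0
    if attr == "a1":
        return sim_matrix[(x_val, y_val)]
    return abs(x_val - y_val) / spans[attr]

sim = {("r", "o"): 0.5}  # hypothetical Table 2 entry: residential vs office
print(discernibility("a1", "r", "o", {}, sim))              # 0.5
print(discernibility("a2", 1, 0, {}, {}))                   # 1.0
print(discernibility("a4", 10.0, 60.0, {"a4": 100.0}, {}))  # 0.5
```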

Due to the specificity of the described problem, establishing all possible reducts of the set was not necessary. In practice, for the purpose of further application, only one, sometimes a few, reducts will be used. The attributes necessary for the calculation are usually easily accessible, as they are either available in the databases as descriptive attributes designed in the database structure, or are easy to calculate on the basis of the objects' geometry. Therefore, the reducts were calculated with Johnson's heuristic. It operates as follows: at each step, the attribute whose addition results in the biggest increase of the quality \( q \) (understood as in formula 3) is added to the reduct. These steps may be repeated until:

  • Obtaining a quality \( q \) that fulfils condition (4) assumed by the user, or

  • Reaching the point when adding further attributes results in an increase of quality \( q \) lower than the established \( \Delta q \).

In this work the second criterion (with \( \Delta q = 0.02 \)) was employed, due to the necessity of maintaining low system complexity (and consequently not overly numerous reducts) when higher complexity does not increase the overall quality significantly.
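
The greedy search with the Δq stopping rule can be sketched as follows; the quality function in the example is a hypothetical additive stand-in for formula (3) or (10):

```python
def greedy_reduct(quality_of, all_attrs, delta_q=0.02):
    """Johnson-style greedy heuristic: repeatedly add the attribute giving the
    largest increase of the quality q; stop when the best available gain falls
    below delta_q (or nothing improves q any more)."""
    reduct, q = set(), 0.0
    while reduct != all_attrs:
        best_attr, best_q = None, q
        for a in sorted(all_attrs - reduct):
            candidate_q = quality_of(reduct | {a})
            if candidate_q > best_q:
                best_attr, best_q = a, candidate_q
        if best_attr is None or best_q - q < delta_q:
            break
        reduct.add(best_attr)
        q = best_q
    return reduct

# Hypothetical quality: a2 contributes too little (gain < delta_q) to be kept.
gains = {"a1": 0.50, "a2": 0.01, "a8": 0.45}
quality_of = lambda attrs: sum(gains[a] for a in attrs)
print(sorted(greedy_reduct(quality_of, set(gains))))  # ['a1', 'a8']
```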

Similarly, further reducts can be calculated (starting from a subsequent attribute); however, this work limited itself to only one reduct in each example.

5.2 Fuzzy Reducts for Binary Decision

Determination of the reduct started with combining the objects belonging to different decision classes (dec2) into pairs and calculating the discernibility matrix of relations R (according to the rules described in Sect. 5.1). Then the consecutive elements of the reduct were established using the greedy heuristic based on the \( q \) value – Table 3.

Table 3. Subsequent steps of fuzzy reduct creation for the binary decision, including the corresponding reducts qualities (the fields with the highest accuracy are highlighted, while the elements added to the reduct are in bold)

Consequently, the reduct {a1, a2, a8} was established. Adding another attribute would not increase the quality of the reduct, so no further attribute was added. Worth mentioning is the high quality of the decision reduct: the ability to distinguish objects with different decisions is high.

5.3 Fuzzy Reducts for Quantitative Decisions

First, the method using formula 6 was applied. The results of the subsequent steps are presented in Table 4. One of the possible decision reducts is exactly the same as the one described in Sect. 5.2, although it allows a number of object types to be distinguished. However, the accuracies seem over-optimistic: the goal was to distinguish between more exact decisions, yet the qualities here are higher than for the binary decision.

Table 4. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the method used: according to the formula 6

The next step was to test the method proposed in Sect. 4.2. In this case all possible pairs of objects were juxtaposed. For each of them R_a, R_dec1 and R_a_rel were calculated (in two variants: using the product t-norm and the Hamacher product). Then the quality based on R_a_rel was calculated for particular attributes – the subsequent steps of the heuristic are illustrated in the tables below (Tables 5 and 6).

Table 5. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the t-norm used: \( R_{rel} = {\text{ product}}\left( {R_{a} , \, R_{dec} } \right) \)
Table 6. Subsequent steps of fuzzy reduct creation for the quantitative decision, including the corresponding qualities (highlights as in Table 3); the t-norm used: \( R_{rel} = {\text{ Hamacher product}}\left( {R_{a} , \, R_{dec} } \right) \)
Table 7. Subsequent steps of fuzzy reduct creation for the quantitative decision (decision artificially brought to the binary scale), including the corresponding accuracies (highlights as in Table 3); the t-norm used: product and Hamacher product (results for both are identical)

The method results in the same reducts as the previous ones, irrespective of which t-norm is used. What is more, even though the quality values differ depending on the t-norm used (and differ even more from the corresponding ones in Table 4), there seem to be noticeable tendencies in their distribution over the attributes.

In the next stage, in order to test the universality of the method, the same steps were taken, though in this case on the binary decision (dec2) – Table 7. In this case the results, irrespective of the chosen t-norm, were alike (\( R_{dec} \in \left\{ {0, 1} \right\} \), so all t-norms have the same value in the R_b function). It should be noted that the result (the calculated reduct) and the relation between the qualities of particular reducts are identical to those calculated with the classic method described in Sect. 5.2. The ratio of the corresponding accuracies in Tables 3 and 7 equals ca. 0.53, which reflects the proportion of the number of pairs in different decision classes to the total number of pairs (35/66), or in other words the average discernibility of all pairs of objects with respect to the decision.

6 Conclusions

The work above addressed the problem of reduct calculation in the case of decisions in the quantitative scale. In the process, a tolerance relation (R_rel), understood as a t-norm of the tolerance relations of the attribute (R_a) and the decision (R_dec), was employed.

The conducted tests using the two types of t-norm – product and Hamacher product – gave similar results, meaning they both allowed the same reducts to be achieved irrespective of differences in the quality values of the reducts. Generally, the reduct quality values calculated with this method were lower than for the binary decision, which seems justifiable considering the necessity of distinguishing the objects according to their continuous decision value. They are also lower than the qualities calculated by one of the existing methods (formula 6); however, those qualities seem over-optimistic, as they grow unreasonably thanks to the objects having the same or similar decision values.

The methodology employed in the case of the artificially binary decision allowed the same reduct to be achieved as in the original FRST method. However, even though the relation of quality values between particular attributes was maintained, the absolute values of quality were different. This is a result of calculating the discernibility relation for all pairs of objects.

The tests indicate that the proposed method can be applied in the process of generalisation of geographical data mentioned above. However, there is also potential for other application areas. Depending on the application type, another t-norm for calculating R_rel can be used.

The main advantage over the other methods is that it is intuitively understandable and prevents a black-box solution, allowing the user to follow the importance of each attribute at every single step of the reduct computation. Therefore, the method can be employed, among others, in the creation of a generalisation control system, at the first stage of its development – attribute selection.