Information System Design Using Fuzzy and Rough Set Theory
Glossary
 Rough Sets

Rough set theory is a technique for dealing with uncertainty and for identifying cause-effect relationships in databases. It is based on a partitioning of some domain into equivalence classes and the definition of lower and upper approximation regions based on this partitioning to denote certain and possible inclusion in the rough set.
 Fuzzy Sets

Fuzzy set theory is another technique for dealing with uncertainty. It is based on the concept of measuring the degree of inclusion in a set through the use of a membership value. Whereas an element either belongs or does not belong to a regular set, an element can belong to a fuzzy set to a certain degree, with zero indicating nonmembership, one indicating complete membership, and values between zero and one indicating partial or uncertain membership in the set.
 Information Theory

Information theory involves the study of measuring the information content of a signal. In databases information theoretic measures can be used to measure the information content of data. Entropy is one such measure.
 Database

A collection of data and the application programs that make use of this data for some enterprise is a database.
 Information System

An information system is a database enhanced with additional tools that can be used by management for planning and decision-making.
 Data Mining

Data mining involves the discovery of patterns or rules in a set of data. These patterns generate some knowledge and information from the raw data that can be used for making decisions. There are many approaches to data mining, and uncertainty management techniques play a vital role in knowledge discovery.
Definition of the Subject and Its Importance
Databases and information systems are ubiquitous in this age of information and technology. Computers have revolutionized the way data can be manipulated and stored, allowing for very large databases with sophisticated capabilities. With so much money and manpower invested in the design and daily use of these systems, it is imperative that they be as correct, secure, and adaptable to the changing needs of the enterprise as possible. Therefore it is important to understand the design and implementation of such systems and to be able to utilize all their capabilities.
Scientists and business executives alike know the value of information. The challenge has been to produce relevant information for an ever-changing, uncertain world from data and facts stored on computers and archival devices. These data are considered to be exact, certain, factual values. The real world, however, is uncertain, inexact, and fraught with errors. It is a challenge, then, to extract useful and relevant information from ordinary databases. Uncertainty management techniques such as rough and fuzzy sets can help. These are emerging topics of importance in both the areas of data science/big data (Cady 2017; Dhar 2013) and the Internet of things (Höller et al. 2014; Stankovic 2014).
Introduction
Databases are recognized for their ability to store and update data in an efficient manner, providing reliability and the elimination of data redundancy. The relational database model, in particular, has well-established mechanisms built into the model for properly designing the database and maintaining data integrity and consistency. Data alone, however, are only facts. What is needed is information. Knowledge discovery attempts to derive information from the pure facts, discovering high-level regularities in the data. It is defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data (Frawley et al. 1991; Han et al. 1992).
An innovative technique in the field of uncertainty and knowledge discovery is based on rough sets. Rough set theory, introduced and further developed mathematically by Pawlak (1984), provides a framework for the representation of uncertainty. It has been used in various applications such as the rough querying of crisp data (Beaubouef and Petry 1994b), uncertainty management in databases (Beaubouef and Petry 2007a), the mining of spatial data (Beaubouef and Petry 2002), and improved information retrieval (Srinivasan 1991). These techniques may readily be extended for use with object-oriented, spatial, and other complex databases and may be integrated with additional data mining techniques for a comprehensive knowledge discovery approach.
Fuzzy Sets and Rough Sets
Fuzzy Set Theory
Fuzzy set theory (Zadeh 1965) is another approach for managing uncertainty. It predates rough sets by a number of years and likewise has well-developed theory, properties, and applications. Applications involving fuzzy logic are diverse and plentiful, ranging from fuzzy control systems in industry to fuzzy logic in databases (Buckles and Petry 1982a; Petry 1996).
Definition
That is, for a fuzzy set the characteristic function takes on all values between 0 and 1 and not just the discrete values of 0 or 1 representing the binary choice for membership in a conventional crisp set such as C. For a fuzzy set, the characteristic function is often called the membership function and denoted μ_{F} (x).
Definition
Support and α-Cuts
A related concept to the support is the α-cut. The α-cut of a fuzzy set A is the nonfuzzy set of elements of the universe whose membership grade is greater than or equal to some value α: A_{α} = {x | μ_{A}(x) ≥ α} for 0 ≤ α ≤ 1. Notice that the α-cuts of a set are subsets of the support. The values of α can be chosen arbitrarily but are usually picked to select desired subsets of the universe.
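As a brief sketch (with an illustrative universe and membership values not taken from the text), the support and an α-cut can be computed directly from a membership function:

```python
# A fuzzy set over a small universe, represented as a dict mapping
# element -> membership grade (illustrative values only).
A = {"a": 0.2, "b": 0.5, "c": 0.8, "d": 1.0}

def alpha_cut(fuzzy_set, alpha):
    """Return the crisp set {x | mu_A(x) >= alpha}."""
    return {x for x, mu in fuzzy_set.items() if mu >= alpha}

def support(fuzzy_set):
    """The support: all elements with nonzero membership."""
    return {x for x, mu in fuzzy_set.items() if mu > 0}

print(alpha_cut(A, 0.5))   # {'b', 'c', 'd'} (order may vary)
print(alpha_cut(A, 0.9))   # {'d'}
```

Note that every α-cut is a subset of the support, as the text observes.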
For ordinary crisp sets, A ∩ A′ = ∅, where A′ denotes the complement of A; however, this is not generally true for a fuzzy set and its complement. This may seem to violate the law of the excluded middle, but this is just the essential nature of fuzzy sets. Since fuzzy sets have imprecise boundaries, we cannot place an element exclusively in a set or in its complement.
The dilation operation dilates fuzzy elements by increasing the membership grade more for the elements with smaller membership grades.
Intensification (INT (A)). The intensification operation is like contrast intensification of a picture. It raises the membership grade of those elements within the crossover points and reduces the membership grade of those outside the crossover points.
Another example is that the dilation operation may be used to represent the linguistic modifier More-or-less.
The exponents such as μ^{2} for CON or μ^{1/2} for DIL can be viewed as specific values of parameterized exponents. So the hedge Extremely might be modeled by μ^{3} or other values that might be obtained by a consensus of opinions.
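These hedges can be sketched as pointwise transforms of a membership function. The membership values below are illustrative, and the intensification formula shown is one common formulation (2μ² below the 0.5 crossover point, 1 − 2(1 − μ)² above it), not one fixed by the text:

```python
# Linguistic hedges as pointwise transforms (illustrative values).
A = {"x1": 0.1, "x2": 0.4, "x3": 0.6, "x4": 0.9}

def con(fs):
    """Concentration (e.g. 'Very'): mu^2 lowers grades, small ones most."""
    return {x: mu ** 2 for x, mu in fs.items()}

def dil(fs):
    """Dilation (e.g. 'More-or-less'): mu^(1/2) raises grades, small ones most."""
    return {x: mu ** 0.5 for x, mu in fs.items()}

def intensify(fs):
    """Intensification: push grades away from the 0.5 crossover point."""
    return {x: 2 * mu ** 2 if mu <= 0.5 else 1 - 2 * (1 - mu) ** 2
            for x, mu in fs.items()}
```

With these definitions, intensification lowers the grade of x2 (0.4 → 0.32) and raises that of x3 (0.6 → 0.68), matching the described contrast-intensification behavior.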
Rough Set Theory

U is the universe, which cannot be empty.

R is the indiscernibility relation or equivalence relation.

A = (U, R), an ordered pair, is called an approximation space.

[x]_{R} denotes the equivalence class of R containing x, for any element x of U.

Elementary sets in A – the equivalence classes of R.

Definable set in A – any finite union of elementary sets in A.

Lower approximation of X in A is the set \( \underline{R}X \) = {x ∈ U | [x]_{R} ⊆ X}.

Upper approximation of X in A is the set \( \overline{R}X \) = {x ∈ U | [x]_{R} ∩ X ≠ ∅}.
POS_{R}(X) = \( \underline{R}X \) denotes the R-positive region of X, or those elements which certainly belong to the rough set. The R-negative region of X, NEG_{R}(X) = U − \( \overline{R}X \), contains elements which do not belong to the rough set, and the boundary or R-borderline region of X, BN_{R}(X) = \( \overline{R}X \) − \( \underline{R}X \), contains those elements which may or may not belong to the set. X is R-definable if and only if \( \underline{R}X \) = \( \overline{R}X \). Otherwise, \( \underline{R}X \) ≠ \( \overline{R}X \) and X is rough with respect to R. A rough set in A is the group of subsets of U with the same upper and lower approximations.
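The approximation regions can be computed directly from the equivalence classes. A minimal sketch, using an illustrative universe and partition not taken from the text:

```python
# Rough approximation regions over an illustrative universe.
# The indiscernibility relation R is given by its equivalence classes.
classes = [{1, 2}, {3, 4}, {5, 6, 7}, {8, 9}]
X = {1, 2, 3, 5}                       # the set being approximated

def lower_approx(classes, X):
    """Union of classes wholly contained in X (certain members)."""
    out = set()
    for c in classes:
        if c <= X:
            out |= c
    return out

def upper_approx(classes, X):
    """Union of classes that intersect X (possible members)."""
    out = set()
    for c in classes:
        if c & X:
            out |= c
    return out

U = set().union(*classes)
lower = lower_approx(classes, X)       # POS_R(X) = {1, 2}
upper = upper_approx(classes, X)       # {1, 2, 3, 4, 5, 6, 7}
boundary = upper - lower               # BN_R(X)  = {3, 4, 5, 6, 7}
negative = U - upper                   # NEG_R(X) = {8, 9}
```

Here X is rough with respect to R, since the lower and upper approximations differ.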
Because there are advantages to both fuzzy set and rough set theories, several researchers have studied various ways of combining the two theories (Chanas and Kuchta 1992; Dubois and Prade 1987, 1992; Nanda and Majumdar 1992). Others have investigated the interrelations between the two theories (Chanas and Kuchta 1992; Pawlak 1985; Wygralak 1989). Fuzzy sets and rough sets are not equivalent, but complementary.
It has been shown in Wygralak (1989) that rough sets can be expressed by a fuzzy membership function μ : U → {0, 0.5, 1} to represent the negative, boundary, and positive regions. In this model, all elements of the lower approximation, or positive region, have a membership value of one. Those elements of the boundary region are assigned a membership value of 0.5. Elements not belonging to the rough set have a membership value of zero. Rough set definitions of union and intersection can be modified so that the fuzzy model satisfies all the properties of rough sets (Beaubouef and Petry 2000).
We integrate fuzziness into the rough set model in order to quantify levels of roughness in boundary region areas through the use of fuzzy membership values. Therefore, we do not require membership values of elements of the boundary region to equal 0.5, but allow them to range from zero to one, noninclusive. Additionally, the union and intersection operators for fuzzy rough sets are comparable to those for ordinary fuzzy sets, where MIN and MAX are used to obtain membership values of redundant elements.
Let U be a universe, X a rough set in U.
Definition
Definition
Definition
Rough Relational Database
The rough relational database model (Beaubouef et al. 1995) is an extension of the standard relational database model of Codd (Petry 1996). It captures all the essential features of rough sets theory including indiscernibility of elements denoted by equivalence classes and lower and upper approximation regions for defining sets which are indefinable in terms of the indiscernibility.
Every attribute domain is partitioned by some equivalence relation designated by the database designer or user. Within each domain, those values that are considered indiscernible belong to an equivalence class. This information is used by the query mechanism to retrieve information based on equivalence with the class to which the value belongs, rather than on equality of values, making the exact wording of queries less critical.
Recall is also improved in the rough relational database because rough relations provide possible matches to the query in addition to the certain matches which are obtained in the standard relational database. This is accomplished by using set containment in addition to equality of attributes in the calculation of lower and upper approximation regions of the query result.
The rough relational database has several features in common with the ordinary relational database. Both models represent data as a collection of relations containing tuples. These relations are sets. The tuples of a relation are its elements and, like elements of sets in general, are unordered and nonduplicated. A tuple t_{i} takes the form (d_{i1}, d_{i2}, ..., d_{im}), where d_{ij} is a domain value of a particular domain set D_{j}. In the ordinary relational database, d_{ij} ∈ D_{j}. In the rough database, however, as in other non-first normal form extensions to the relational model (Makinouchi 1977; Roth et al. 1987), d_{ij} ⊆ D_{j}, and although it is not required that d_{ij} be a singleton, d_{ij} ≠ ∅. Let P(D_{i}) denote powerset(D_{i}) − ∅.
Definition
A rough relation R is a subset of the set cross product P(D_{1}) × P(D_{2}) × ⋅ ⋅ ⋅ × P(D_{m}).
A rough tuple t is any member of R, which implies that it is also a member of P(D_{1}) × P(D_{2}) × ⋅ ⋅ ⋅ × P(D_{m}). If t_{i} is some arbitrary tuple, then t_{i} = (d_{i1}, d_{i2}, ..., d_{im}) where d_{ij} ⊆ D_{j}. A tuple in this model differs from that of ordinary databases in that the tuple components may be sets of domain values rather than single values. The set braces are omitted from singletons for notational simplicity.
Let [d_{xy}] denote the equivalence class to which d_{xy} belongs. When d_{xy} is a set of values, the equivalence class is formed by taking the union of equivalence classes of members of the set; if d_{xy} = {c_{1}, c_{2}, …, c_{n}}, then [d_{xy}] = [c_{1}] ∪ [c_{2}] ∪ … ∪ [c_{n}].
Definition
Tuples t_{i} = (d_{i1}, d_{i2}, …, d_{im}) and t_{k} = (d_{k1}, d_{k2}, …, d_{km}) are redundant if [d_{ij}] = [d_{kj}] for all j = 1,…, m.
In the rough relational database, redundant tuples are removed in the merging process since duplicates are not allowed in sets, the structure upon which the relational model is based.
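Redundancy testing by indiscernibility can be sketched as follows. The equivalence classes and city names below are illustrative assumptions, not drawn from the text; a set-valued component maps to the union of its members' classes, per the definition above:

```python
# Redundancy of rough tuples via indiscernibility (illustrative classes).
# Each domain value maps to an equivalence-class label.
eq_class = {"BATON ROUGE": "BR", "CAPITAL": "BR",
            "NEW ORLEANS": "NO", "NOLA": "NO"}

def class_of(value):
    """Equivalence class of a tuple component; a set-valued component
    maps to the union of the classes of its members."""
    if isinstance(value, (set, frozenset)):
        return frozenset(eq_class[v] for v in value)
    return frozenset({eq_class[value]})

def redundant(t1, t2):
    """Tuples are redundant if corresponding components have equal classes."""
    return all(class_of(a) == class_of(b) for a, b in zip(t1, t2))

t1 = ("BATON ROUGE",)
t2 = ("CAPITAL",)
t3 = ({"NOLA", "CAPITAL"},)
print(redundant(t1, t2))  # True: same equivalence class
print(redundant(t1, t3))  # False: the set value spans two classes
```

During merging, tuples for which `redundant` holds would collapse to a single tuple, since a relation is a set.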
There are two basic types of relational operators. The first type arises from the fact that relations are considered sets of tuples. Therefore, operations which can be applied to sets also apply to relations. The most useful of these for database purposes are set difference, union, and intersection. Operators which do not come from set theory, but which are useful for retrieval of relational data, are select, project, and join.
In the rough relational database, relations are rough sets as opposed to ordinary sets. Therefore, new rough operators (−, ∪, ∩, σ, π, ⋈), which are comparable to the standard relational operators, must be developed for the rough relational database. Moreover, a mechanism must exist within the database to mark tuples of a rough relation as belonging to the lower or upper approximation of that rough relation. Properties of the rough relational operators can be found in Beaubouef et al. (1995).
Information Theory
In communication theory, Shannon (1948) introduced the concept of entropy, which was used to characterize the information content of signals. Since then, variations of these information theoretic measures have been successfully applied to applications in many diverse fields. In particular, the representation of uncertain information by entropy measures has been applied to all areas of databases, including fuzzy database querying (Buckles and Petry 1983), data allocation (Fung and Lam 1980), classification in rule-based systems (Quinlan 1986), and measuring uncertainty in rough and fuzzy rough relational databases (Beaubouef et al. 1998).
In fuzzy set theory, the representation of uncertain information measures has been extensively studied (Bhandari and Pal 1993; de Luca and Termini 1972; Klir and Folger 1988). This article relates the concepts of information theory to rough sets and compares these information-theoretic measures to established rough set metrics of uncertainty. The measures are then applied to the rough relational database model (Beaubouef et al. 1995). The information content of both stored relation schemas and rough relations is expressed as types of rough entropy.
Rough set theory (Pawlak 1982) inherently models two types of uncertainty. The first type of uncertainty arises from the indiscernibility relation that is imposed on the universe, partitioning all values into a finite set of equivalence classes. If every equivalence class contains only one value, then there is no loss of information caused by the partitioning. In any coarser partitioning, however, there are fewer classes, and each class will contain a larger number of members. Our knowledge, or information, about a particular value decreases as the granularity of the partitioning becomes coarser.
Uncertainty is also modeled through the approximation regions of rough sets where elements of the lower approximation region have total participation in the rough set and those of the upper approximation region have uncertain participation in the rough set. Equivalently, the lower approximation is the certain region, and the boundary area of the upper approximation region is the possible region.
The second measure, roughness, represents the degree of incompleteness of knowledge about the rough set. It is calculated by subtracting the accuracy from 1: ρ_{R}(X) = 1 − α_{R}(X).
These measures require knowledge of the number of elements in each of the approximation regions and are good metrics for uncertainty as it arises from the boundary region, implicitly taking into account equivalence classes as they belong wholly or partially to the set. However, accuracy and roughness measures do not necessarily provide us with information on the uncertainty related to the granularity of the indiscernibility relation for those values that are totally included in the lower approximation region. For example,
Let the rough set X be defined as follows: X = {A11, A12, A21, A22, B11, C1}, over a universe of nine values partitioned by three successively finer indiscernibility relations A_{1}, A_{2}, and A_{3}: A_{1} groups the four A-values into a single class, A_{2} splits them into the classes {A11, A12} and {A21, A22}, and A_{3} places each A-value in its own singleton class, while the classes {B11, B12, B13} and {C1, C2} are the same in all three partitionings.
All three of the above partitionings result in the same upper and lower approximation regions for the given set X and hence the same accuracy measure (4/9 = 0.444) since only those classes belonging to the lower approximation region were repartitioned. It is obvious, however, that there is more uncertainty in A_{1} than in A_{2} and more uncertainty in A_{2} than in A_{3}. Therefore, a more comprehensive measure of uncertainty is needed.
We derive such a measure from techniques used for measuring entropy in classical information theory. Countless variations of the classical entropy have been developed, each tailored for a particular application domain or for measuring a particular type of uncertainty. Our rough entropy is defined such that we may apply it to rough databases. We define the entropy of a rough set X as follows:
Definition
The term ρ_{R}(X) denotes the roughness of the set X. The second term is the summation of the probabilities for each equivalence class belonging either wholly or in part to the rough set X. There is no ordering associated with individual class members. Therefore the probability of any one value of the class being named is the reciprocal of the number of elements in the class. If c_{i} is the cardinality of, or number of elements in, equivalence class i and all members of a given equivalence class are equally likely, P_{i} = 1/c_{i} represents the probability of one of the values in class i. Q_{i} denotes the probability of equivalence class i within the universe. Q_{i} is computed by taking the number of elements in class i and dividing by the total number of elements in all equivalence classes combined. The entropy of the sample rough set X, E_{r}(X), is given below for each of the possible indiscernibility relations A_{1}, A_{2}, and A_{3}.
Using A_{1}: –(5/9)[(4/9)log(1/4) + (3/9)log(1/3) + (2/9)log(1/2)] = 0.274
Using A_{2}: –(5/9)[(2/9)log(1/2) + (2/9)log(1/2) + (3/9)log(1/3) + (2/9)log(1/2)] = 0.20
Using A_{3}: –(5/9)[(1/9)log(1) + (1/9)log(1) + (1/9)log(1) + (1/9)log(1) + (3/9)log(1/3) + (2/9)log(1/2)] = 0.048
From the above calculations, it is clear that although each of the partitionings results in identical roughness measures, the entropy decreases as the classes become smaller through finer partitionings.
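A minimal sketch of this computation, assuming (consistent with the figures above) base-10 logarithms and class probabilities Q_{i} = c_{i}/|U|:

```python
import math

# Rough entropy: E_r(X) = -roughness * sum(Q_i * log10(P_i)), summed over
# the equivalence classes belonging wholly or partly to X.
def rough_entropy(roughness, class_sizes, universe_size):
    total = 0.0
    for c in class_sizes:
        Q = c / universe_size          # probability of class i within U
        P = 1 / c                      # probability of any one value in class i
        total += Q * math.log10(P)
    return -roughness * total

# Class sizes for A1 and A2 from the calculations above (roughness = 5/9):
print(round(rough_entropy(5/9, [4, 3, 2], 9), 3))      # 0.274
print(round(rough_entropy(5/9, [2, 2, 3, 2], 9), 2))   # 0.2
```

The roughness factor is constant across the three partitionings, so the decrease in entropy comes entirely from the finer classes in the summation.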
Entropy and the Rough Relational Database
The basic concepts of rough sets and their information-theoretic measures carry over to the rough relational database model (Beaubouef et al. 1995). Recall that in the rough relational database, all domains are partitioned into equivalence classes and relations are not restricted to first normal form. We therefore have a type of rough set for each attribute of a relation. This results in a rough relation, since any tuple having a value for an attribute that belongs to the boundary region of its domain is a tuple belonging to the boundary region of the rough relation.
There are two things to consider when measuring uncertainty in databases: the uncertainty, or entropy, of a rough relation that exists in a database at some given time and the entropy of a relation schema for an existing relation or query result. We must consider both, since the approximation regions arise only from set values for attributes in given tuples. Without the extension of a database containing actual values, we know only about the indiscernibility of attributes; we cannot consider the approximation regions.
We define the entropy for a rough relation schema as follows:
Definition
This is similar to the definition of entropy for rough sets without factoring in roughness since there are no elements in the boundary region (lower approximation = upper approximation). However, because a relation is a cross product among the domains, we must take the sum of all these entropies to obtain the entropy of the schema. The schema entropy provides a measure of the uncertainty inherent in the definition of the rough relation schema taking into account the partitioning of the domains on which the attributes of the schema are defined.
We extend the schema entropy E_{s}(S) to define the entropy of an actual rough relation instance E_{R}(R) of some database D by multiplying each term in the product by the roughness of the rough set of values for the domain of that given attribute.
Definition
We obtain the Dρ_{j}(R) values by letting the nonsingleton domain values represent elements of the boundary region, computing the original rough set accuracy and subtracting it from one to obtain the roughness. DQ_{i} is the probability of a tuple in the database relation having a value from class i, and DP_{i} is the probability of a value for class i occurring in the database relation out of all the values which are given.
Information theoretic measures again prove to be a useful metric for quantifying information content. In rough sets and the rough relational database, this is especially useful since in ordinary rough sets, Pawlak’s measure of roughness does not seem to capture the information content as precisely as our rough entropy measure.
In rough relational databases, knowledge about entropy can either guide the database user toward less uncertain data or act as a measure of the uncertainty of a data set or relation. As rough relations become larger in terms of the number of tuples or attributes, the automatic calculation of some measure of entropy becomes a necessity. Our rough relation entropy measure fulfills this need.
Fuzzy Rough Relational Database
The fuzzy rough relational database, as in the ordinary relational database, represents data as a collection of relations containing tuples. Because a relation is considered a set having the tuples as its members, the tuples are unordered. In addition, there can be no duplicate tuples in a relation. A tuple t_{i} takes the form (d_{i1}, d_{i2}, …, d_{im}, d_{iμ}), where d_{ij} is a domain value of a particular domain set D_{j} and d_{iμ} ∈ D_{μ}, where D_{μ} is the interval [0,1], the domain for fuzzy membership values. In the ordinary relational database, d_{ij} ∈ D_{j}. In the fuzzy rough relational database, except for the fuzzy membership value, however, d_{ij} ⊆ D_{j}, and although d_{ij} is not restricted to be a singleton, d_{ij} ≠ ∅. Let P(D_{i}) denote any non-null member of the powerset of D_{i}.
Definition
A fuzzy rough relation R is a subset of the set cross product P(D_{1}) × P(D_{2}) × ⋅ ⋅ ⋅ × P(D_{m}) × D_{μ}.
For a specific relation, R, membership is determined semantically. Given that D_{1} is the set of names of nuclear/chemical plants, D_{2} is the set of locations, and assuming that RIVERB is the only nuclear power plant that is located in VENTRESS,
are all elements of P(D_{1}) × P(D_{2}) × D_{μ}. However, only the element (RIVERB, VENTRESS, 1) of those listed above is a member of the relation R(PLANT, LOCATION, μ), which associates each plant with the town or community in which it is located. A fuzzy rough tuple t is any member of R. If t_{i} is some arbitrary tuple, then t_{i} = (d_{i1}, d_{i2}, …, d_{im}, d_{iμ}) where d_{ij} ⊆ D_{j} and d_{iμ} ∈ D_{μ}.

Definition
An interpretation α = (a_{1}, a_{2}, …, a_{m}, a_{μ}) of a fuzzy rough tuple t_{i} = (d_{i1}, d_{i2}, …, d_{im}, d_{iμ}) is any value assignment such that a_{j} ∈ d_{ij} for all j.
The interpretation space is the cross product D_{1} × D_{2} × ⋅ ⋅ ⋅ × D_{m} × D_{μ}, but is limited for a given relation R to the set of those tuples which are valid according to the underlying semantics of R. In an ordinary relational database, because domain values are atomic, there is only one possible interpretation for each tuple t_{i}. Moreover, the interpretation of t_{i} is equivalent to the tuple t_{i}. In the fuzzy rough relational database, this is not always the case.
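Enumerating the interpretations of a fuzzy rough tuple amounts to taking the cross product of its set-valued components. A sketch, where the second LOCATION value "NEW ROADS" is an illustrative assumption added to give the tuple more than one interpretation:

```python
from itertools import product

# A fuzzy rough tuple may have set-valued components; an interpretation
# picks one value from each component (the mu value carries over).
t = ({"RIVERB"}, {"VENTRESS", "NEW ROADS"}, 0.8)   # last slot is mu

def interpretations(tup):
    """All value assignments (a1, ..., am, mu) for a fuzzy rough tuple."""
    *components, mu = tup
    return [(*choice, mu) for choice in product(*components)]

for i in interpretations(t):
    print(i)
# Two interpretations, one per candidate LOCATION value.
```

A tuple with only singleton components has exactly one interpretation, matching the ordinary relational case.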
Let [d_{xy}] denote the equivalence class to which d_{xy} belongs. When d_{xy} is a set of values, the equivalence class is formed by taking the union of equivalence classes of members of the set; if d_{xy} = {c_{1}, c_{2}, …, c_{n}}, then [d_{xy}] = [c_{1}] ∪ [c_{2}] ∪ … ∪ [c_{n}].
Definition
Tuples t_{i} = (d_{i1}, d_{i2}, …, d_{in}, d_{iμ}) and t_{k} = (d_{k1}, d_{k2}, …, d_{kn}, d_{kμ}) are redundant if [d_{ij}] = [d_{kj}] for all j = 1, …, n.
If a relation contains only those tuples of a lower approximation, i.e., those tuples having a μ value equal to one, the interpretation α of a tuple is unique. This follows immediately from the definition of redundancy. In fuzzy rough relations, there are no redundant tuples. The merging process used in relational database operations removes duplicate tuples since duplicates are not allowed in sets, the structure upon which the relational model is based.
Tuples may be redundant in all values except μ. As in the union of fuzzy rough sets where the maximum membership value of an element is retained, it is the convention of the fuzzy rough relational database to retain the tuple having the higher μ value when removing redundant tuples during merging. If we are supplied with identical data from two sources, one certain and the other uncertain, we would want to retain the data that is certain, avoiding loss of information.
Recall that the rough relational database is in nonfirst normal form; there are some attribute values that are sets. Another definition, which will be used for upper approximation tuples, is necessary for some of the alternate definitions of operators to be presented. This definition captures redundancy between elements of attribute values that are sets:
Definition
Two subtuples X = (d_{x1}, d_{x2}, …, d_{xm}) and Y = (d_{y1}, d_{y2}, …, d_{ym}) are roughly redundant, ≈_{R}, if for some [p] ⊆ [d_{xj}] and [q] ⊆ [d_{yj}], [p] = [q] for all j = 1, …, m.
In order for any database to be useful, a mechanism for operating on the basic elements and retrieving specified data must be provided. The concepts of redundancy and merging play a key role in the operations defined.
We must first design our database using some type of semantic model. We use a variation of the entity-relationship diagram that we call a fuzzy rough ER diagram. This diagram is similar to the standard ER diagram in that entity types are depicted in rectangles, relationships with diamonds, and attributes with ovals. However, in the fuzzy rough model, it is understood that membership values exist for all instances of entity types and relationships. Attributes which allow values where we want to be able to define equivalences are denoted with an asterisk (∗) above the oval. These values are defined in the indiscernibility relation, which is not actually part of the database design, but inherent in the fuzzy rough model.
Our fuzzy rough ER model (Beaubouef and Petry 2000) is similar to the second and third levels of fuzziness defined by Zvieli and Chen (1986). However, in our model, all entity and relationship occurrences (second level) are of the fuzzy type, so we do not mark an “f” beside each one. Zvieli and Chen’s third level considers attributes that may be fuzzy. They use triangles instead of ovals to represent these attributes. We do not introduce fuzziness at the attribute level of our model here, only roughness or indiscernibility, and denote those attributes with the “∗.” From the fuzzy rough ER diagram, we design the structure of the fuzzy rough relational database. If we have a priori information about the types of queries that will be involved, we can make intelligent choices that will maximize computer resources.
We next formally define the fuzzy rough relational database operators and discuss issues relating to the real-world problems of data representation and modeling. We may view indiscernibility as being modeled through the use of the indiscernibility relation, imprecision through the use of non-first normal form constructs, and degree of uncertainty and fuzziness through the use of tuple membership values, which are given as the value for the μ attribute in every fuzzy rough relation.
Fuzzy Rough Relational Operators
In Beaubouef et al. (1995), we defined several operators for the rough relational algebra. We now define similar operators for the fuzzy rough relational database as in Beaubouef and Petry (1994a). Recall that for all of these operators, the indiscernibility relation is used for equivalence of attribute values rather than equality of values.
Difference
The fuzzy rough relational difference operator is very much like the ordinary difference operator in relational databases and in sets in general. It is a binary operator that returns those elements of the first argument that are not contained in the second argument.
In the fuzzy rough relational database, the difference operator is applied to two fuzzy rough relations and, as in the rough relational database, indiscernibility, rather than equality of attribute values, is used in the elimination of redundant tuples. Hence, the difference operator is somewhat more complex. Let X and Y be two union-compatible fuzzy rough relations.
Definition
The resulting fuzzy rough relation contains all those tuples which are in the lower approximation of X, but not redundant with a tuple in the lower approximation of Y. It also contains those tuples belonging to upper approximation regions of both X and Y, but which have a higher μ value in X than in Y. For example, let X contain the tuple (MODERN, 1) and Y contain the tuple (MODERN, .02). It would not be desirable to subtract out certain information with possible information, so X  Y yields (MODERN, 1).
Union
Because relations in databases are considered as sets, the union operator can be applied to any two union-compatible relations to result in a third relation which has as its tuples all the tuples contained in either or both of the two original relations. The union operator can be extended to apply to fuzzy rough relations. Let X and Y be two union-compatible fuzzy rough relations.
Definition
The resulting relation T contains all tuples in either X or Y or both, merged together and having redundant tuples removed. If X contains a tuple that is redundant with a tuple in Y except for the μ value, the merging process will retain only that tuple with the higher μ value.
Intersection
The fuzzy rough intersection, another binary operator on fuzzy rough relations, can be defined similarly.
Definition
In intersection, the MIN operator is used in the merging of equivalent tuples having different μ values, and the result contains all tuples that are members of both of the original fuzzy rough relations.
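The MAX/MIN merging conventions for union and intersection can be sketched as follows. For brevity, indiscernibility is simplified here to equality of tuple values, and the tuples are illustrative; a full implementation would compare equivalence classes as in the redundancy definition:

```python
# Fuzzy rough union and intersection sketch: relations are dicts mapping
# a tuple of attribute values to its mu membership value.
def fr_union(X, Y):
    """Union: redundant tuples keep the higher (MAX) mu value."""
    merged = dict(X)
    for t, mu in Y.items():
        merged[t] = max(mu, merged.get(t, 0.0))
    return merged

def fr_intersection(X, Y):
    """Intersection: tuples in both relations keep the lower (MIN) mu."""
    return {t: min(X[t], Y[t]) for t in X if t in Y}

X = {("MODERN",): 1.0, ("EASTSIDE",): 0.6}
Y = {("MODERN",): 0.02, ("SOWELA",): 0.9}
print(fr_union(X, Y))          # ('MODERN',) keeps mu = 1.0
print(fr_intersection(X, Y))   # only ('MODERN',), with mu = 0.02
```

As in the difference example, merging with MAX ensures that certain information is never displaced by merely possible information.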
Definition
Select
The select operator for the fuzzy rough relational database model, σ, is a unary operator which takes a fuzzy rough relation X as its argument and returns a fuzzy rough relation containing a subset of the tuples of X, selected on the basis of values for a specified attribute. The operation σ_{A = a}(X), for example, returns those tuples in X where attribute A is equivalent to the class [a]. In general, select returns a subset of the tuples that match some selection criteria.
Let R be a relation schema, X a fuzzy rough relation on that schema, and A an attribute in R, a = {a_{i}} and b = {b_{j}}, where a_{i},b_{j} ∈ dom(A) and ∪_{x} is interpreted as “the union over all x.”
Definition
Assume we want to retrieve those elements where CITY = “ADDIS” from the following fuzzy rough tuples:
The result of the selection is the following:
where the μ for the second tuple is the product of the original membership value 0.7 and 1/3.

Project
Project is a unary fuzzy rough relational operator. It returns a relation that contains a subset of the columns of the original relation. Let X be a fuzzy rough relation with schema A, and let B be a subset of A. The fuzzy rough projection of X onto B is a fuzzy rough relation Y obtained by omitting the columns of X that correspond to attributes in A – B and removing redundant tuples. Recall that the definition of redundancy accounts for indiscernibility, which is central to rough set theory, and that higher μ values have priority over lower ones.
Definition
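A minimal sketch of this projection, assuming the illustrative (attribute-dict, μ) row representation; the attribute names and values are hypothetical:

```python
def fr_project(rows, attrs):
    """pi_{attrs}(X): drop all other columns, then remove redundant
    tuples, keeping the highest mu among duplicates."""
    best = {}
    for row, mu in rows:
        key = tuple(row[a] for a in attrs)
        best[key] = max(best.get(key, 0.0), mu)
    return [({a: v for a, v in zip(attrs, key)}, mu)
            for key, mu in best.items()]

# Two rows collapse to one after projection; the higher mu survives.
HOUSES = [({"STYLE": "MODERN", "CITY": "ADDIS"}, 1.0),
          ({"STYLE": "MODERN", "CITY": "GONDAR"}, 0.6)]
print(fr_project(HOUSES, ["STYLE"]))
```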
Join
Join is a binary operator that takes related tuples from two relations and combines them into single tuples of the resulting relation. It uses common attributes to combine the two relations into one, usually larger, relation. Let X(A_{1}, A_{2}, …, A_{m}) and Y(B_{1}, B_{2}, …, B_{n}) be fuzzy rough relations with m and n attributes, respectively, and let C = A ∪ B be the schema of the resulting fuzzy rough relation T.
Definition
<JOIN CONDITION> is a conjunction of one or more conditions of the form A = B.
Only those tuples which resulted from the “joining” of tuples that were both in lower approximations in the original relations belong to the lower approximation of the resulting fuzzy rough relation. All other “joined” tuples belong to the upper approximation only (the boundary region) and have membership values less than one. The fuzzy membership value of the resultant tuple is simply calculated as in Buckles and Petry (1985) by taking the minimum of the membership values of the original tuples. Taking the minimum value also follows the logic of Ola and Ozsoyoglu (1993), where, in joins of tuples with different levels of information uncertainty, the resultant tuple can have no greater certainty than that of its least certain component.
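The MIN rule for the membership of a joined tuple can be sketched as follows (same illustrative row representation; the join condition here is restricted to equality on shared attributes):

```python
def fr_join(xs, ys, on):
    """Join tuples agreeing on the attributes in `on`; the membership
    of a joined tuple is the MIN of the memberships of its components,
    so the result is certain (mu = 1) only when both inputs were."""
    out = []
    for rx, mx in xs:
        for ry, my in ys:
            if all(rx[a] == ry[a] for a in on):
                out.append(({**rx, **ry}, min(mx, my)))
    return out

# Joining a certain tuple with an uncertain one yields an uncertain result.
X_REL = [({"ID": 1, "CITY": "ADDIS"}, 1.0)]
Y_REL = [({"CITY": "ADDIS", "STYLE": "MODERN"}, 0.7)]
print(fr_join(X_REL, Y_REL, ["CITY"]))
```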
Fuzzy and rough set techniques integrated into the underlying data model result in databases that can more accurately represent real-world enterprises, since they incorporate uncertainty management directly into the data model itself. This is useful in itself for obtaining greater information through the querying of rough and fuzzy databases. Additional benefits may be realized when they are used in the process of data mining.
Rough Set Modeling of Spatial Data
Many of the problems associated with data are prevalent in all types of database systems. Spatial databases and GIS contain descriptive as well as positional data (Jing and Wenwen 2016). The various forms of uncertainty occur in both types of data, so many of the issues apply to ordinary databases as well: integration of data from multiple sources, time-variant data, uncertain data, imprecision in measurement, inconsistent wording of descriptive data, and “binning” or grouping of data into fixed categories, all of which also arise in spatial contexts (Petry et al. 2005; Tavana et al. 2016).
Often spatial data is associated with a particular grid. The positions are set up in a regular matrix-like structure, and data is affiliated with point locations on the grid. This is the case for raster data and for other types of non-vector data such as topography or sea-surface temperature data. There is a trade-off between the resolution or scale of the grid and the amount of system resources necessary to store and process the data. Higher resolutions provide more information, but at a cost of memory space and execution time.
If we approach these data issues from a rough set point of view, it can be seen that there is indiscernibility inherent in the process of gridding or rasterizing data. A data item at a particular grid point may in essence represent data near the point as well, since point data must often be mapped to the grid using techniques such as nearest-neighbor assignment, averaging, or statistics. The rough set indiscernibility relation may be set up so that the entire spatial area is partitioned into equivalence classes, where each point on the grid belongs to an equivalence class. If the resolution of the grid changes, then this in fact changes the granularity of the partitioning, resulting in fewer, but larger, classes.
The approximation regions of rough sets are beneficial whenever information concerning spatial data regions is accessed. Consider a region such as a forest. One can reasonably conclude that any grid point identified as FOREST that is surrounded on all sides by grid points also identified as FOREST is, in fact, a point represented by the feature FOREST. However, consider points identified as FOREST that are adjacent to points identified as MEADOW. Is it not possible that these points represent meadow area as well as forest area but were identified as FOREST in the classification process? Likewise, points identified as MEADOW but adjacent to FOREST points may represent areas that contain part of the forest. This uncertainty maps naturally to the use of the approximation regions of the rough set theory, where the lower approximation region represents certain data and the boundary region of the upper approximation represents uncertain data. It applies to spatial database querying and spatial database mining operations.
If we force a finer granulation of the partitioning, a smaller boundary region results. This occurs when the resolution is increased. As the partitioning becomes finer and finer, finally a point is reached where the boundary region is nonexistent. Then the upper and lower approximation regions are the same, and there is no uncertainty in the spatial data as can be determined by the representation of the model.
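The FOREST/MEADOW reasoning above can be sketched on a toy labeled raster: a cell is certainly a feature (lower approximation) when it and all its 4-neighbors carry that label, while feature cells with differently labeled neighbors, and those neighbors themselves, fall in the boundary of the upper approximation. This is an illustrative encoding, not the authors' exact construction.

```python
def approximations(grid, feature):
    """Lower/upper approximation of a feature on a labeled grid."""
    rows, cols = len(grid), len(grid[0])

    def nbrs(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield grid[r + dr][c + dc]

    lower, upper = set(), set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == feature:
                upper.add((r, c))
                if all(n == feature for n in nbrs(r, c)):
                    lower.add((r, c))   # certainly the feature
            elif any(n == feature for n in nbrs(r, c)):
                upper.add((r, c))       # possibly the feature (boundary)
    return lower, upper

# F = forest, M = meadow; only the leftmost forest column is certain.
GRID = [["F", "F", "M"],
        ["F", "F", "M"],
        ["F", "F", "M"]]
low, up = approximations(GRID, "F")
print(len(low), len(up))
```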
Worboys (1998a) models imprecision in spatial data based on the resolution at which the data is represented and addresses issues related to the integration of such data. This approach relies on indiscernibility – a core concept for rough sets – but does not carry over the entire framework and is just described as “reminiscent of the theory of rough sets” (Worboys 1998b). Ahlqvist and colleagues (Ahlqvist et al. 2000) used a rough set approach to define a rough classification of spatial data and to represent spatial locations. They also proposed a measure for the quality of a rough classification compared to a crisp classification and evaluated their technique on actual data from vegetation map layers. They considered the combination of fuzzy and rough set approaches for reclassification as required by the integration of geographic data. Another research group in a mapping and GIS context (Wang et al. 2002) developed an approach using a rough raster space for the field representation of a spatial entity and evaluated it on a classification case study for remote sensing images. Bittner and Stell (2003) consider K-labeled partitions, which can represent maps, and then develop their relationship to rough sets to approximate map objects with vague boundaries. Additionally they investigate stratified partitions, which can be used to capture levels of detail or granularity, such as in consideration of scale transformations in maps, and extend this approach using the concepts of stratified rough sets. A further approach for dealing with uncertain spatial data uses a Dempster-Shafer representation (Shafer 1976): by considering uncertainty in a spatial location as having a range that is most probable and an outer range that is possible, and using nested intervals around a point, the uncertainty can be modeled (Elmore et al. 2017b).
Data Mining in Rough Databases
In association rule mining (Agrawal et al. 1993), for a rule X_{j} ⇒ X_{k} over a set of transactions T_{i}, two quantities measure the fraction of transactions that:
 1.
Contain both X_{j} and X_{k} (i.e., X_{j} ∪ X_{k}) – called the support s.
 2.
If T_{i} contains X_{j}, then T_{i} also contains X_{k} – called the confidence c.
These can be expressed as probabilities:
 1.
s = Prob(X_{j} ∪ X_{k}), and
 2.
c = Prob(X_{k} | X_{j}).
We assume the system user has provided minimum values for these in order to generate only sufficiently interesting rules. A rule whose support and confidence exceeds these minimums is called a strong rule.
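Support and confidence can be computed directly from a transaction set; the grocery items below are hypothetical example data:

```python
def support_confidence(transactions, xj, xk):
    """s = fraction of transactions containing Xj u Xk;
    c = fraction of transactions containing Xj that also contain Xk,
    i.e. c = Prob(Xk | Xj)."""
    xj, xk = set(xj), set(xk)
    both = sum(1 for t in transactions if xj | xk <= set(t))
    has_xj = sum(1 for t in transactions if xj <= set(t))
    s = both / len(transactions)
    c = both / has_xj if has_xj else 0.0
    return s, c

# Hypothetical transactions; rule {bread} => {milk}.
T = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"milk"}, {"bread"}]
s, c = support_confidence(T, {"bread"}, {"milk"})
print(s, c)  # s = 2/4, c = 2/3
```

A strong rule is then one whose s and c both exceed the user-supplied minimums.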
The value a can be a subjective value obtained from the user depending on a relative assessment of the roughness of the query result T. For the data mining example of the next section, we chose a neutral default value of a = ½. Note that W(X_{j}) is included in the summation only if all of the values of the itemset X_{j} are included in the transaction, i.e., it is a subset of the transaction.
In the spatial data mining area, there have been only a few efforts using rough sets. In the research described in Beaubouef and Petry (2002) and Bhattacharya and Bhatnagar (2012), approaches for attribute induction knowledge discovery (Raschia and Mouaddib 2002) in rough spatial data are investigated. Bittner (2000) considers rough sets for spatiotemporal data and how to discover characteristic configurations of spatial objects, focusing on the use of topological relationships for characterizations. In a survey of uncertainty-based spatial data mining, Shi et al. (2003) provide a brief general comparison of fuzzy and rough set approaches for spatial data mining.
Aggregation of Uncertain Information
Uncertainty arising from multiple sources and of many forms appears in the everyday activities and decisions of humans. We want to examine approaches that can be used to combine these uncertainties into forms that can become useful for decisionmaking. Effective decisionmaking should be able to make use of all the available, relevant information about such combined uncertainty (Ferson and Kreinovich 2001). In this section we describe approaches for combining separately possibilistic uncertainty, probabilistic uncertainty, and situations where both forms of uncertainty appear.
To formalize the discussion, let V be a discrete variable taking values in a space X that has both aleatory and epistemic sources of uncertainty (Parsons 2001). Let P be a probability distribution P: X → [0, 1] such that p_{k} ∈ [0, 1] and Σ_{k=1}^{n} p_{k} = 1, modeling the aleatory uncertainty. The epistemic uncertainty can then be modeled by a possibility distribution Π: X → [0, 1] (Zadeh 1978), where π(x_{k}) gives the possibility that x_{k} is the value of V, k = 1…n. A usual requirement here is the normality condition, Max_{x}[π(x)] = 1; that is, at least one element in X must be fully possible. Abbreviating our notation so that p_{k} = p(x_{k}) and π_{k} = π(x_{k}), we have P = {p_{1}, p_{2}, …, p_{n}} and Π = {π_{1}, π_{2}, …, π_{n}}.
Definition
Shannon Entropy
Definition
Gini Index
Some practitioners use G(P) rather than S(P) since it does not involve a logarithm, making analytic solutions simpler. The Gini index is used to study inequality in various areas such as economics, ecology, and engineering (Aristondo et al. 2012). A very important application of the Gini index is as a splitting criterion for decision tree induction in machine learning and data mining (Breiman et al. 1984).
It is accepted in practice for diagnostic test selection that the Shannon and Gini measures are interchangeable (Sent and van de Gaag 2007). The specific relationship of Shannon entropy and the Gini index has been discussed in the literature (Eliazar and Sokolov 2010). Theoretical support for this practice is provided in Yager’s independent consideration of alternative measures of entropy (Yager 1995), where he derives the same form for an entropy measure as the Gini measure.
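The two measures compared above have the standard forms S(P) = −Σ p_k log p_k and G(P) = 1 − Σ p_k², which can be computed directly:

```python
import math

def shannon(p):
    """Shannon entropy S(P) = -sum p_k log p_k (0 log 0 taken as 0)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def gini(p):
    """Gini index G(P) = 1 - sum p_k^2; no logarithm involved."""
    return 1 - sum(pk * pk for pk in p)

# Both vanish on a degenerate distribution and peak on a uniform one.
print(shannon([1.0]), gini([1.0]))
print(shannon([0.25] * 4), gini([0.25] * 4))
```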
Definition
Specificity
Definition
Consistency
This measure does not represent an inherent relationship but rather represents the intuition that a lowering of an event’s possibility tends to lower its probability, but not the converse.
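Assuming the definition referred to above is the standard probability–possibility consistency measure Cons(P, Π) = Σ p_k π_k from the possibility-theory literature, it can be computed as:

```python
def consistency(p, poss):
    """Cons(P, Pi) = sum p_k * pi_k: high when the probable events
    are also highly possible, low when probability mass sits on
    events of low possibility."""
    return sum(pk * pik for pk, pik in zip(p, poss))

# Lowering the possibility of an event lowers the measure.
print(consistency([0.5, 0.5], [1.0, 1.0]))  # fully consistent
print(consistency([0.5, 0.5], [1.0, 0.2]))
```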
There is some research related to our specific concern of aggregating probabilistic and possibilistic uncertainty. One approach is the possibilistic conditioning of probability distributions due to Yager (2012); this form of aggregation makes it straightforward to apply the information measures (Elmore et al. 2014).
The distribution P_{1} resulting from the transformation can then be aggregated with some other probability distribution P_{2} using a spectrum of operators such as min, max, and mean (Petry et al. 2015). The aggregated probability distribution can then be evaluated with the Gini information measure to assess whether it has enhanced information content over that of the initial distribution P_{2}. Additional recent approaches include an intelligent quality-based approach to fusing multi-source probabilistic information (Yager and Petry 2016) and fuzzy Choquet integration of homogeneous possibility and probability distributions (Anderson et al. 2016).
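A hedged sketch of this aggregate-then-evaluate pattern (the distributions and the choice of min are illustrative, and the pointwise-aggregate-and-renormalize step is one simple way to realize such operators):

```python
def aggregate(p1, p2, op):
    """Pointwise aggregation of two probability distributions with
    operator `op`, renormalized so the result sums to 1."""
    raw = [op(a, b) for a, b in zip(p1, p2)]
    total = sum(raw)
    return [v / total for v in raw]

def gini(p):
    """G(P) = 1 - sum p_k^2; lower means a more concentrated,
    more informative distribution."""
    return 1 - sum(pk * pk for pk in p)

P1 = [0.7, 0.2, 0.1]   # e.g. from possibilistic conditioning
P2 = [0.4, 0.4, 0.2]   # some other probability distribution
agg = aggregate(P1, P2, min)
# Compare information content of the aggregate against P2 alone.
print(gini(agg), gini(P2))
```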
Future Directions
There are several other approaches to uncertainty representation that may be more suitable for certain applications. Type-2 fuzzy sets have attracted considerable recent interest (Mendel and John 2002): whereas in ordinary fuzzy sets the underlying membership functions are crisp, in type-2 fuzzy sets the membership functions are themselves fuzzy. Intuitionistic sets, introduced by Atanassov (1986, 2000), are another generalization of a fuzzy set. Two characteristic functions are used, capturing both the ordinary degree of membership in the intuitionistic set and the degree of nonmembership of elements in the set, and they can be used in database design (Beaubouef and Petry 2007b). Related to the concepts introduced by rough sets is the idea of granularity for managing complex data by abstraction using information granules, as discussed by Lin (1997, 1999). A granular set approach has also been introduced (Ligeza 2002), consisting of a set and a number of disjoint subsets that constitute a semi-partition. A Dempster-Shafer approach, analogous to rough sets, can be used to model uncertainty in spatial, temporal, and spatiotemporal application domains (Elmore et al. 2017a, b). Some prior database research on ordered relations (Ginsburg and Hull 1983), although not presented in the context of uncertainty of data, may provide approaches to extend our work in this area. A main emphasis for future work is the incorporation of some of these research topics into mainstream databases, commercial GIS products, and semistructured data on the semantic web.
Acknowledgments
The authors would like to thank the Naval Research Laboratory’s Base Program, Program Element No. 0602435 N, for sponsoring this research.
Bibliography
Primary Literature
 Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data. ACM Press, New York, pp 207–216
 Ahlqvist O, Keukelaar J, Oukbir K (2000) Rough classification and accuracy assessment. Int J Geogr Inf Sci 14:475–496
 Anderson D, Elmore P, Petry F, Havens T (2016) Fuzzy Choquet integration of homogeneous possibility and probability distributions. Inf Sci 363:24–39
 Aristondo O, Garcia-Lapresta J, de la Vega C, Pereira R (2012) The Gini index, the dual decomposition of aggregation functions and the consistent measurement of inequality. Int J Intell Syst 27:132–152
 Atanassov K (1986) Intuitionistic fuzzy sets. Fuzzy Sets Syst 20:87–96
 Atanassov K (2000) Intuitionistic fuzzy sets: theory and applications. Physica-Verlag, Heidelberg
 Beaubouef T, Petry F (1994a) Fuzzy set quantification of roughness in a rough relational database model. In: Proceedings of the third IEEE international conference on fuzzy systems, Orlando, pp 172–177
 Beaubouef T, Petry F (1994b) Rough querying of crisp data in relational databases. In: Proceedings of the third international workshop on rough sets and soft computing (RSSC’94), San Jose, Hershey, pp 368–375
 Beaubouef T, Petry F (2000) Fuzzy rough set techniques for uncertainty processing in a relational database. Int J Intell Syst 15:389–424
 Beaubouef T, Petry F (2002) A rough set foundation for spatial data mining involving vague regions. In: Proceedings of FUZZ-IEEE’02, Honolulu, pp 767–772
 Beaubouef T, Petry F (2007a) Rough sets: a versatile theory for approaches to uncertainty management in databases. In: Rough computing: theories, technologies and applications. Idea Group, Inc
 Beaubouef T, Petry F (2007b) Intuitionistic rough sets for database applications. In: Peters JF et al (eds) Transactions on rough sets VI. LNCS 4374. Springer, Berlin/New York, pp 26–30
 Beaubouef T, Petry F, Buckles B (1995) Extension of the relational database and its algebra with rough set techniques. Comput Intell 11:233–245
 Beaubouef T, Petry F, Arora G (1998) Information-theoretic measures of uncertainty for rough sets and rough relational databases. Inf Sci 109:185–195
 Bhandari D, Pal NR (1993) Some new information measures for fuzzy sets. Inf Sci 67:209–228
 Bhattacharya S, Bhatnagar V (2012) Fuzzy data mining: a literature survey and classification framework. Int J Netw Virt Org 11:382–408
 Bittner T (2000) Rough sets in spatio-temporal data mining. In: Proceedings of the international workshop on temporal, spatial and spatio-temporal data mining. Springer, Berlin/Heidelberg, pp 89–104
 Bittner T, Stell J (2003) Stratified rough sets and vagueness. In: Kuhn W, Worboys M, Timpf S (eds) Spatial information theory: cognitive and computational foundations of geographic information science, international conference (COSIT’03), pp 286–303
 Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth & Brooks/Cole, Monterey
 Buckles B, Petry F (1982) A fuzzy representation for relational data bases. Int J Fuzzy Sets Syst 7(3):213–226
 Buckles B, Petry F (1983) Information-theoretical characterization of fuzzy relational databases. IEEE Trans Syst Man Cybern 13:74–77
 Buckles BP, Petry F (1985) Uncertainty models in information and database systems. J Inf Sci 11:77–87
 Cady F (2017) The data science handbook. Wiley, New York
 Chanas S, Kuchta D (1992) Further remarks on the relation between rough and fuzzy sets. Fuzzy Sets Syst 47:391–394
 de Luca A, Termini S (1972) A definition of a nonprobabilistic entropy in the setting of fuzzy set theory. Inf Control 20:301–312
 Dhar V (2013) Data science and prediction. Commun ACM 56(12):64–73
 Dubois D, Prade H (1983) Unfair coins and necessity measures: towards a possibilistic interpretation of histograms. Fuzzy Sets Syst 10:15–27
 Dubois D, Prade H (1987) Twofold fuzzy sets and rough sets – some issues in knowledge representation. Fuzzy Sets Syst 23:3–18
 Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Slowinski R (ed) Intelligent decision support: handbook of applications and advances of the rough sets theory. Kluwer Academic Publishers, Boston
 Eliazar I, Sokolov I (2010) Maximization of statistical heterogeneity: from Shannon’s entropy to Gini’s index. Phys A 389:3023–3038
 Elmore P, Petry F, Yager R (2014) Comparative measures of aggregated uncertainty representations. J Ambient Intell Humaniz Comput 5(6):809–819
 Elmore P, Petry F, Yager R (2017a) Dempster-Shafer approach to temporal uncertainty. IEEE Trans Emerg Topics Comput Intell 1(5):316–325
 Elmore P, Petry F, Yager R (2017b) Geospatial modeling using Dempster-Shafer theory. IEEE Trans Cybern 47(6):1551–1561
 Ferson S, Kreinovich V (2001) Representation, elicitation, and aggregation of uncertainty in risk analysis – from traditional probabilistic techniques to more general, more realistic approaches: a survey. University of Texas at El Paso computer science tech report #1112001
 Frawley W, Piatetsky-Shapiro G, Matheus C (1991) Knowledge discovery in databases: an overview. In: Piatetsky-Shapiro G, Frawley W (eds) Knowledge discovery in databases. AAAI/MIT Press, Menlo Park, pp 1–27
 Fung KT, Lam CM (1980) The database entropy concept and its application to the data allocation problem. Infor 18(4):354–363
 Gini C (1912) Variabilita e mutabilita (Variability and mutability). Tipografia di Paolo Cuppini, Bologna, p 156
 Ginsburg S, Hull R (1983) Order dependency in the relational model. Theor Comput Sci 26:146–195
 Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufman, San Diego
 Han J, Cai Y, Cercone N (1992) Knowledge discovery in databases: an attribute-oriented approach. In: Proceedings of the 18th VLDB conference, Vancouver, pp 547–559
 Höller J, Tsiatsis V, Mulligan C, Karnouskos S, Avesand S, Boyle D (2014) From machine-to-machine to the internet of things: introduction to a new age of intelligence. Academic Press, Waltham
 Jing L, Wenwen Z (2016) Overview on the using rough set theory on GIS spatial relationships constraint. Int J Adv Res Artif Intell:11–15
 Klir GJ, Folger TA (1988) Fuzzy sets, uncertainty, and information. Prentice Hall, Englewood Cliffs
 Ligeza A (2002) Granular sets and granular relation. In: Intelligent information systems. Physica-Verlag, Heidelberg, pp 331–340
 Lin TY (1997) Granular computing: from rough sets and neighborhood systems to information granulation and computing in words. Eur Congr Intell Tech Soft Comput 812:1602–1606
 Lin TY (1999) Granular computing: fuzzy logic and rough sets. In: Zadeh L, Kacprzyk J (eds) Computing with words in information/intelligent systems. Physica-Verlag, Heidelberg, pp 183–200
 Makinouchi A (1977) A consideration on normal form of not-necessarily normalized relation in the relational data model. In: Proceedings of the 3rd international conference on VLDB, pp 447–453
 Mendel J (2017) Uncertain rule-based fuzzy systems, 2nd edn. Springer, Berlin
 Mendel J, John R (2002) Type-2 fuzzy sets made simple. IEEE Trans Fuzzy Syst 10:117–127
 Nanda S, Majumdar S (1992) Fuzzy rough sets. Fuzzy Sets Syst 45:157–160
 Ola A, Ozsoyoglu G (1993) Incomplete relational database models based on intervals. IEEE Trans Knowl Data Eng 5:293–308
 Parsons S (2001) Qualitative methods for reasoning under uncertainty. MIT Press, Cambridge
 Pawlak Z (1982) Rough sets. Int J Comput Inform Sci 11:341–356
 Pawlak Z (1984) Rough sets. Int J Man-Mach Stud 21:127–134
 Pawlak Z (1985) Rough sets and fuzzy sets. Fuzzy Sets Syst 17:99–102
 Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer Academic Publishers, Norwell
 Pedrycz W, Gomide F (1996) An introduction to fuzzy sets: analysis and design. MIT Press, Boston
 Petry F (1996) Fuzzy databases: principles and applications. Kluwer Press, Boston
 Petry F, Robinson V, Cobb M (2005) Fuzzy modeling with spatial information for geographic problems. Springer, Berlin/Heidelberg
 Petry F, Elmore P, Yager R (2015) Combining uncertain information of differing modalities. Inf Sci 322:237–256
 Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
 Raschia G, Mouaddib N (2002) SAINTETIQ: a fuzzy set-based approach to database summarization. Fuzzy Sets Syst 129:137–162
 Roth M, Korth H, Batory D (1987) SQL/NF: a query language for non-1NF databases. Inf Syst 12:99–114
 Sent D, van de Gaag L (2007) On the behavior of information measures for test selection. In: Carbonell J, Siekmann J (eds) Lecture notes in AI 4594. Springer, Berlin
 Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton
 Shannon CE (1948) The mathematical theory of communication. Bell Syst Tech J 27:379–422
 Shi W, Wang S, Li D, Wang X (2003) Uncertainty-based spatial data mining. In: Proceedings of Asia GIS Association, Wuhan, pp 124–135
 Srinivasan P (1991) The importance of rough approximations for information retrieval. Int J Man-Mach Stud 34:657–671
 Stankovic J (2014) Research directions for the internet of things. IEEE Internet Things J 1(1):3–9
 Tavana M, Liu W, Elmore P, Petry F, Bourgeois BS (2016) A practical taxonomy of methods and literature for managing uncertain spatial data in geographic information systems. Measurement 82:123–162
 Wang S, Li D, Shi W, Wang X (2002) Rough spatial description. In: International archives of photogrammetry and remote sensing, XXXII, Commission II, pp 503–510
 Worboys M (1998a) Computation with imprecise geospatial data. Comput Environ Urban Syst 22:85–106
 Worboys M (1998b) Imprecision in finite resolution spatial data. GeoInformatica 2:257–280
 Wygralak M (1989) Rough sets and fuzzy sets – some remarks on interrelations. Fuzzy Sets Syst 29:241–243
 Yager R (1982) Measuring tranquility and anxiety in decision making. Int J Gen Syst 8:139–146
 Yager R (1992) On the specificity of a possibility distribution. Fuzzy Sets Syst 50:279–292
 Yager R (1995) Measures of entropy and fuzziness related to aggregation operators. Inf Sci 82:147–166
 Yager R (2012) Conditional approach to possibility-probability fusion. IEEE Trans Fuzzy Syst 20:46–56
 Yager R, Petry F (2016) An intelligent quality-based approach to fusing multi-source probabilistic information. Info Fusion 31:127–136
 Zadeh L (1965) Fuzzy sets. Inf Control 8:338–353
 Zadeh L (1978) Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets Syst 1:3–28
 Zvieli A, Chen P (1986) Entity-relationship modeling and fuzzy databases. In: Proceedings of the international conference on data engineering, pp 320–327
Books and Reviews
 Aczel J, Daroczy Z (1975) On measures of information and their characterization. Academic Press, New York
 Angryk R, Petry F (2007) Attribute-oriented fuzzy generalization in proximity and similarity-based relational database systems. Int J Intell Syst 22:763–781
 Arora G, Petry F, Beaubouef T (1997) Information measure of type β under similarity relations. In: Sixth IEEE international conference on fuzzy systems, Barcelona, pp 857–862
 Arora G, Petry F, Beaubouef T (2001) A note on new parametric measures of information for fuzzy sets. J Combinatorics Info Syst Sci 26:167–174
 Beaubouef T, Petry F (2001a) Vague regions and spatial relationships: a rough set approach. In: Fourth international conference on computational intelligence and multimedia applications, Yokosuka City, pp 313–318
 Beaubouef T, Petry F (2001b) Vagueness in spatial data: rough set and egg-yolk approaches. In: 14th international conference on industrial & engineering applications of artificial intelligence, pp 367–373
 Beaubouef T, Petry F (2003) Rough set uncertainty in an object oriented data model. In: Bouchon-Meunier B, Foulloy L, Yager R (eds) Intelligent systems for information processing: from representation to applications. Elsevier, Amsterdam, pp 37–46
 Beaubouef T, Petry F (2005a) Normalization in a rough relational database. In: International conference on rough sets, fuzzy sets, data mining and granular computing, pp 257–265
 Beaubouef T, Petry F (2005b) Representation of spatial data in an OODB using rough and fuzzy set modeling. Soft Comput J 9:364–373
 Beaubouef T, Petry F (2007) An attribute-oriented approach for knowledge discovery in rough relational databases. In: Proceedings of FLAIRS’07, pp 507–508
 Beaubouef T, Petry F, Arora G (1998) Information measures for rough and fuzzy sets and application to uncertainty in relational databases. In: Pal S, Skowron A (eds) Rough-fuzzy hybridization: a new trend in decision-making. Springer, Singapore, pp 200–214
 Beaubouef T, Ladner R, Petry F (2004) Rough set spatial data modeling for data mining. Int J Intell Syst 19:567–584
 Beaubouef T, Petry F, Ladner R (2007) Spatial data methods and vague regions: a rough set approach. Appl Soft Comput J 7:425–440
 Buckles B, Petry F (1982) Security and fuzzy databases. In: Proceedings of the 1982 IEEE international conference on cybernetics and society, pp 622–625
 Codd E (1970) A relational model of data for large shared data banks. Commun ACM 13:377–387
 Ebanks B (1983) On measures of fuzziness and their representations. J Math Anal Appl 94:24–37
 Grzymala-Busse J (1991) Managing uncertainty in expert systems. Kluwer Academic Publishers, Boston
 Han J, Nishio S, Kawano H, Wang W (1998) Generalization-based data mining in object-oriented databases using an object-cube model. Data Knowl Eng 25:55–97
 Havrda J, Charvat F (1967) Quantification methods of classification processes: concepts of structural α entropy. Kybernetica 3:149–172
 Kapur J, Kesavan H (1992) Entropy optimization principles with applications. Academic Press, New York
 Slowinski R (1992) A generalization of the indiscernibility relation for rough sets analysis of quantitative information. In: 1st international workshop on rough sets: state of the art and perspectives, Poland, pp 41–48