FACTOID # 161: If you are looking for work, just go to the Falkland Islands! They have full employment and a labor shortage.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > Rough set
Cover of "Rough Sets: Theoretical Aspects of Reasoning about Data" by Zdzisław Pawlak (Kluwer 1991).
Cover of "Rough Sets: Theoretical Aspects of Reasoning about Data" by Zdzisław Pawlak (Kluwer 1991).

A rough set is a formal approximation of a crisp set (i.e., conventional set) in terms of a pair of sets which give the lower and the upper approximation of the original set. The lower and upper approximation sets themselves are crisp sets in the standard version of rough set theory (Pawlak 1991), but in other variations, the approximating sets may be fuzzy sets as well. Image File history File links Download high-resolution version (939x1448, 1379 KB) Cover of Rough Sets: Theoretical Aspects of Reasoning about Data by Zdzislaw Pawlak (Kluwer 1991). ... Image File history File links Download high-resolution version (939x1448, 1379 KB) Cover of Rough Sets: Theoretical Aspects of Reasoning about Data by Zdzislaw Pawlak (Kluwer 1991). ... The notion of a set is one of the most important and fundamental concepts in modern mathematics. ...

Contents

Definitions

This section contains an explanation of the basic framework of rough set theory, along with some of the key definitions. A review of this basic material can be found in sources such as Pawlak (1991), Ziarko (1998), Ziarko & Shan (1995), and many others.


Information System Framework

Let I = (mathbb{U},mathbb{A}) be an information system (attribute-value system), where mathbb{U} is a non-empty set of finite objects (the universe) and mathbb{A}is a non-empty finite set of attributes such that a:mathbb{U} rightarrow V_a for every a in mathbb{A}. Va is the set of values that attribute a may take. In words, the information table simply assigns a value Va to each attribute a of each object in universe mathbb{U}. To meet Wikipedias quality standards, this article or section may require cleanup. ...


With any P subseteq mathbb{A} there is an associated equivalence relation IND(P): In mathematics, an equivalence relation, denoted by an infix ~, is a binary relation on a set X that is reflexive, symmetric, and transitive. ...

IND(P) = {(x,y) in mathbb{U}^2 mid forall a in P, a(x)=a(y)}


The partition of mathbb{U} generated by IND(P) is denoted mathbb{U}/IND(P) (or mathbb{U}/P) and can be calculated as follows:

mathbb{U}/IND(P) = otimes {ain P mid mathbb{U}/IND({a})},


where

A otimes B = {Xcap Y mid forall X in A, forall Y in B, X cap Y neq emptyset}


If (x,y)in IND(P), then x and y are indiscernible by attributes from P. In words, for any selected subset of attributes P, there will be sets of objects that are indiscernible based on those attributes. These indistinguishable sets of objects therefore define an equivalence or indiscernibility relation, referred to as the P-indiscernibility relation.


Example: Equivalence Class Structure

For example, consider the following information table:

Sample Information System
Object P1 P2 P3 P4 P5
O1 1 2 0 1 1
O2 1 2 0 1 1
O3 2 0 0 1 0
O4 0 0 1 2 1
O5 2 1 0 2 1
O6 0 0 1 2 2
O7 2 0 0 1 0
O8 0 1 2 2 1
O9 2 1 0 2 2
O10 2 0 0 1 0


When the full set of attributes P = {P1,P2,P3,P4,P5} is considered, we see that we have the following seven equivalence classes:

begin{cases} {O_{1},O_{2}}  {O_{3},O_{7},O_{10}}  {O_{4}}  {O_{5}}  {O_{6}}  {O_{8}}  {O_{9}} end{cases}


Thus, the two objects within the first equivalence class, {O1,O2}, cannot be distinguished from one another based on the available attributes, and the three objects within the second equivalence class, {O3,O7,O10}, cannot be distinguished from one another based on the available attributes. The remaining five objects are each discernible from all other objects. The equivalence classes of the P-indiscernibility relation are denoted [x]P.


It is apparent that different attribute subset selections will in general lead to different indiscernibility classes. For example, if attribute P = P1 alone is selected, we obtain the following much coarser equivalence class structure:

begin{cases} {O_{1},O_{2}}  {O_{3},O_{5},O_{7},O_{9},O_{10}}  {O_{4},O_{6},O_{8}} end{cases}

Definition of Rough Set

Let X subseteq mathbb{U} be a target set that we wish to represent using attribute subset P. That is, we are told that an arbitrary set of objects X comprising a single class, and we wish to express this class (i.e., this subset) using using the equivalence classes induced by attribute subset P. In general, X cannot be expressed exactly, because the set may include and exclude objects which are indistinguishable based on attributes P.


For example, consider the target set X = {O1,O2,O3,O4}, and let attribute subset P = {P1,P2,P3,P4,P5}, the full available set of features. It will be noted that the set X cannot be expressed exactly because in [x]P, objects {O3,O7,O10} are indiscernible. Thus, there is no way to represent any set X which includes O3 but excludes objects O7 and O10.


However, the target set X can be approximated using only the information contained within P by constructing the P-lower and P-upper approximations of X:

{underline P}X= {x mid [x]_P subseteq X}
{overline P}X = {x mid [x]_P cap X neq emptyset }

Lower Approximation and Positive Region

The P-lower approximation or positive region is the union of all equivalence classes in [x]P which are contained by (i.e., are subsets of) of the target set. In the example, {underline P}X = {O_{1},O_{2}} cup {O_{4}}, the union of the two equivalence classes in [x]P which are contained in the target set. The lower approximation is the complete set of objects that in mathbb{U}/P can be positively (i.e., unambiguously) classified as belonging to target set X.


Upper Approximation and Negative Region

The P-upper approximation is the union of all equivalence classes in [x]P which have non-empty intersection with the target set. In the example, {overline P}X = {O_{1},O_{2}} cup {O_{4}} cup {O_{3},O_{7},O_{10}}, the union of the three equivalence classes in [x]P that have non-empty intersection with the target set. The upper approximation is the complete set of objects that in mathbb{U}/P that cannot be positively (i.e., unambiguously) classified as belonging to the complement of the target set overline X. In other words, the upper approximation is the complete set of objects that are possibly members of the target set X.


The set mathbb{U}-{overline P}X therefore represents the negative region, containing the set of objects that can be definitely ruled out as members of the target set.


Boundary Region

The boundary region, given by set difference {overline P}X - {underline P}X, consists of those objects that can neither be ruled in nor ruled out as members of the target set X.


In summary, the lower approximation of a target set is a conservative approximation consisting of only those objects which can positively be identified as members of the set. (These objects have no indiscernible "clones" which are excluded by the target set.) The upper approximation is a liberal approximation which includes all objects that might be members of target set. (Some objects in the upper approximation may not be members of the target set.) From the perspective of mathbb{U}/P, the lower approximation contains objects that are members of the target set with certainty (probability = 1), while the upper approximation contains objects that are members of the target set with nonzero probability (probability > 0).


The Rough Set

The tuple langle{underline P}X,{overline P}Xrangle composed of the lower and upper approximation is called a rough set. Thus, a rough set is composed of two crisp sets, one representing a lower boundary of the target set X, and one representing an upper boundary of the target set X.


The accuracy of the rough set representation of the set X can be given (Pawlak 1991) by the following:

alpha_{P}(X) = frac{left | {underline P}X right |} {left | {overline P}X right |}

That is, the accuracy of the rough set representation of X, αP(X), 0 leq alpha_{P}(X) leq 1, is the ratio of the number of objects which can be positively be placed in X to the number of objects that can be possibly be placed in X. This provides a measure of how closely the rough set is approximating the target set. Clearly, when the upper and lower approximations are equal (i.e., boundary region empty), then αP(X) = 1, and the approximation is perfect. Whenever the lower approximation is empty, the accuracy is zero (regardless of the size of the upper approximation).


Formal Properties of Rough Sets

The important formal properties of rough sets and boundaries are given in Pawlak (1991), and the other sources cited by that book.


Definability

In general, the upper and lower approximations are not equal. In such cases we say that target set X is undefinable or roughly definable on attribute set P. When the upper and lower approximations are equal (i.e., the boundary is empty), {overline P}X = {underline P}X, then the target set X is definable on attribute set P. We can distinguish the following special cases of undefinability (Pawlak, Wong, & Ziarko 1988):

  • Set X is internally definable if {underline P}X neq emptyset and {overline P}X = mathbb{U}. This means that on attribute set P, there are objects which we can be certain belong to target set X, but there are no objects which we can definitively exclude from set X.
  • Set X is externally definable if {underline P}X = emptyset and {overline P}X neq mathbb{U}. This means that on attribute set P, there are no objects which we can be certain belong to target set X, but there are objects which we can definitively exclude from set X.
  • Set X is totally non-definable if {underline P}X = emptyset and {overline P}X = mathbb{U}. This means that on attribute set P, there are no objects which we can be certain belong to target set X, and there are no objects which we can definitively exclude from set X. Thus, on attribute set P, we cannot decide whether any object is, or is not, a member of X.

Reduct and Core

An interesting question is whether there are attributes in the information system (attribute-value table) which are more important to the knowledge represented in the equivalence class structure than other attributes. Often we wonder whether there is a subset of attributes which by itself can fully characterize the knowledge in the database. Such an attribute set is called a reduct.


Formally (Ziarko & Shan 1995), a reduct is a subset of attributes RED subseteq P such that

  • [x]RED = [x]P, that is, the equivalence classes induced by reduced attribute set RED is the same as the equivalence class structure induced by full attribute set P.
  • Attribute set RED is minimal in the sense that [x]_{(RED-A)} neq [x]_P for any attribute A in RED. In other words, no attribute can be removed from set RED without changing the equivalence classes [x]P.

A reduct can be though of a sufficient set of features; sufficient, that is, to represent the category structure. In the example table above, attribute set {P3,P4,P5} is a reduct. The information system projected on just these attributes possesses the same equivalence class structure as that expressed by the full attribute set:

begin{cases} {O_{1},O_{2}}  {O_{3},O_{7},O_{10}}  {O_{4}}  {O_{5}}  {O_{6}}  {O_{8}}  {O_{9}} end{cases}


Attribute set {P3,P4,P5} is a legitimate reduct because eliminating elimination any of these attributes causes a collapse of the equivalence class structure, with the result that [x]_{RED} neq [x]_P.


The reduct of an information system is not unique. There may be many subsets of attributes which preserve the equivalence class structure (i.e., the knowledge) expressed in the information system. In the example information system above, another reduct is {P1,P2,P5}, producing the same equivalence class structure as [x]P.


The set of attributes which is common to all reducts is called the core. The core is the set of attributes which is possessed by every legitimate reduct, and therefore consists of attributes which cannot be removed from the information system without causing collapse of the equivalence class structure. The core may be though of as the set of necessary attributes; necessary, that is, for the category structure to be represented. In the example, the only such attribute is {P5}. Any one of the other attributes can be removed in isolation without damaging the equivalence class structure, and hence these are all dispensable. However, removing {P5} in isolation does change the equivalence class structure, and thus {P5} is the indispensable of this information system, and hence the core.


It is possible for the core to be empty, which means that there is no indispensable attribute. Any single attribute in the information system can be deleted without altering the equivalence class structure. In such cases, there is no essential or necessary attribute which is required for the the class structure to be represented.


Attribute Dependency

One of the most important aspects of database analysis or data acquisition is the discovery of attribute dependencies. That is, we wish to discover which variables are strongly related to which other variables. Generally, it is these strong relationships that will warrant further investigation, and that will ultimately be of use in predictive modeling.


In rough set theory, the notion of dependency is defined very simply. Let us take two (disjoint) sets of attributes, set P and set Q, and inquire what degree of dependency obtains between them. Each attribute set induces an (indiscernibility) equivalence class structure, the equivalence classes induced by P given by [x]P, and the equivalence classes induced by Q given by [x]Q.


Let [x]_Q = {Q_1, Q_2, Q_3, dots, Q_N }, where Qi is a given equivalence class from the equivalence class structure induced by attribute set Q. Then, the dependency of attribute set Q on attribute set P, γP(Q), is given by

gamma_{P}(Q) = frac{left | sum_{i=1}^N {underline P}Q_i right |} {left | mathbb{U} right |} leq 1


That is, for each equivalence class Qi in [x]Q, we add up the size of its lower approximation by the attributes in P, i.e., {underline P}Q_i. This approximation (as above, for arbitrary set X) is the number of objects which on attribute set P can be positively identified as belonging to target set Qi. Added across all equivalence classes in [x]Q, the numerator above represents the total number of objects which – based on attribute set P – can be positively categorized according to the classification induced by attributes Q. The dependency ratio therefore expresses the proportion (within the entire universe) of such classifiable objects . The dependency γP(Q) "can be interpreted as a proportion of such objects in the information system for which it suffices to know the values of attributes in P to determine the values of attributes in Q" (Ziarko & Shan 1995).


Thus, this measure of dependency expresses the degree of functional (i.e., deterministic) dependency of attribute set Q on attribute set P. It is not symmetric. The relationship of this notion of attribute dependency to more traditional information-theoretic (i.e., entropic) notions of attribute dependence has been discussed in a number of sources (e.g., Pawlak, Wong, & Ziarko 1988; Yao & Yao 2002; Wong, Ziarko, & Ye 1986).


Rule Extraction


The category representations discussed above are all extensional in nature. That is, a category or complex class is simply the sum of all its members. To represent a category is then just to be able to list or identify all the objects belonging to that category. However, extensional category representation have very limited practical use, because they provide no insight for deciding whether novel (never-before-seen) objects are members of the category. What is generally desired is an intentional description of the category, a representation of the category based on a set of rules that describe the scope of the category. The choice of such rules is not unique, and therein lies the issue of inductive bias. See Version space and Model selection for more about this issue. Image File history File links Crystal_128_clock. ... Informally speaking, the inductive bias of a machine learning algorithm refers to additional assumptions, that the learner will use to predict correct outputs for situations that have not been encountered so far. ... A version space in concept learning or induction is the subset of all hypotheses that are consistent with the observed training examples (Mitchell 1997). ... Model selection is the task of selecting a mathematical model from a set of potential models, given evidence. ...


Here we present a simple rule extraction procedure based on Ziarko & Shan (1995). More advanced rough-set rule-learning systems can be found in the literature on rough association rule learning.[citation needed] In data mining and treatment learning, association rule learners are used to discover elements that co-occur frequently within a data set [1] consisting of multiple independent selections of elements (such as purchasing transactions), and to discover rules, such as implication or correlation, which relate co-occurring elements. ...


Other rule-learning methods can be found in Pawlak (1991), Stefanowski (1998), Bazan et al. (2004), etc.


Applications

Rough sets can be used as a theoretical basis for some problems in machine learning. They have found to be particularly useful for rule induction and feature selection (semantics-preserving dimensionality reduction). They have also inspired some logical research.[citation needed] As a broad subfield of artificial intelligence, Machine learning is concerned with the development of algorithms and techniques that allow computers to learn. At a general level, there are two types of learning: inductive, and deductive. ... Rule induction is an area of machine learning in which formal rules are extracted from a set of observations. ... Feature selection, also known as subset selection or variable selection, is a process commonly used in machine learning, wherein a subset of the features available from the data are selected for application of a learning algorithm. ...


History

The idea of rough set was proposed by Pawlak (1991) as a new mathematical tool to deal with vague concepts. Comer, Grzymala-Busse, Iwinski, Nieminen, Novotny, Pawlak, Obtulowicz, and Pomykala have studied algebraic properties of rough sets. Different algebraic semantics have been developed by P. Pagliani, I. Duntsch, M. K. Chakraborty, M. Banerjee and A. Mani. These have been extended to more generalized rough sets by D. Cattaneo and A. Mani in particular. Rough sets can be used to represent ambiguity, vagueness and general uncertainty. Fuzzy-rough sets further extend the rough set concept through the use of fuzzy equivalence classes. Comer may refer to: Anjanette Comer, Armerican actres B. B. Comer, American politician Comer Vann Woodward, American historian Clement Comer Clay, governor of Alabama Comer, Georgia, a city in Madison County Jane Comer Gilfillan, American politician Samuel M. Comer, movie set decorator Category: ... The Novotny (also often spelled as Nowotny, even in non-German sources) is a device found in chess problems. ... Look up ambiguity in Wiktionary, the free dictionary. ... Ambiguity is one way in which the meanings of words and phrases can be unclear, but there is another way, which is different from ambiguity: vagueness. ... // Relation between uncertainty, probability and risk In his seminal work Risk, Uncertainty, and Profit, Frank Knight (1921) established the important distinction between risk and uncertainty: … Uncertainty must be taken in a sense radically distinct from the familiar notion of Risk, from which it has never been properly separated. ...


See also

An algebraic semantics of a programming language is a form of axiomatic semantics based on algebraic laws for describing and reasoning about program semantics in a formal manner. ... An alternative set theory is an alternative mathematical approach to the concept of set. ... Fuzzy logic is derived from fuzzy set theory dealing with reasoning that is approximate rather than precisely deduced from classical predicate logic. ... Fuzzy sets are an extension of classical set theory and are used in fuzzy logic. ... Granular Computing is an emerging computing paradigm of information processing. ... This article or section is in need of attention from an expert on the subject. ... A version space in concept learning or induction is the subset of all hypotheses that are consistent with the observed training examples (Mitchell 1997). ...

References

  • Bazan, Jan, Nguyen, Hung Son and Szczuka, Marcin (2004). "A view on rough set concept approximations". Fundamenta Informaticae 59: 107–118.
  • Chanas, Stefan, Kuchta, Dorota (1992). "Further remarks on the relation between rough and fuzzy sets". Fuzzy Sets and Systems 47 (3): 391–394.
  • Comer, Stephen D. (1991). "An algebraic approach to the approximation of information". Fundamenta Informaticae 14 (4): 495–502.
  • Dubois, D. (1990). "Rough fuzzy sets and fuzzy rough sets". International Journal of General Systems 17: 191–209.
  • Jensen, Richard, Shen, Qiang (2004). "Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough Based Approaches". IEEE Transactions on Knowledge and Data Engineering 16 (12): 1457–1471.
  • Pawlak, Zdzisław, Wong, S. K. M. and Ziarko, Wojciech (1988). "Rough sets: Probabilistic versus deterministic approach". International Journal of Man-Machine Studies 29: 81–95.
  • Pawlak, Zdzisław (1991). Rough Sets: Theoretical Aspects of Reasoning About Data. Dordrecht: Kluwer Academic Publishing. ISBN: 0792314727.
  • Stefanowski, Jerzy (1998). Polkowski, Lech and Skowron, Andrzej "On rough set based approaches to induction of decision rules". Rough Sets in Knowledge Discovery 1: Methodology and Applications, 500–529, Heidelberg: Physica-Verlag.
  • Wong, S. K. M., Ziarko, Wojciech and Ye, R. Li (1986). "Comparison of rough-set and statistical methods in inductive learning". International Journal of Man-Machine Studies 24: 53–72.
  • Yao, J. T.; Yao, Y. Y. (2002). "Induction of classification rules by granular computing". Proceedings of the Third International Conference on Rough Sets and Current Trends in Computing (TSCTC'02), 331–338, London, UK: Springer-Verlag.
  • Ziarko, Wojciech (1998). "Rough sets as a methodology for data mining". Rough Sets in Knowledge Discovery 1: Methodology and Applications, 554–576, Heidelberg: Physica-Verlag.
  • Ziarko, Wojciech; Shan, Ning (1995). "Discovering attribute relationships, dependencies and rules by using rough sets". Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS'95), 293–299.

External links

  • The International Rough Set Society

  Results from FactBites:
 
Rough Set Data Analysis: A Round to Noninvasive Knowledge Discovery (188 words)
Rough set data analysis: A road to non-invasive knowledge discovery
This is not the first book on rough set analysis and certainly not the first book on knowledge discovery algorithms, but it is the first attempt to do this in a non-invasive way.
This book reports the ideas of the authors, but many citations of papers on Rough Set Data Analysis in knowledge discovery by other research groups are included as well.
Rough sets (1241 words)
The rough sets (proposed by Pawlak,1991) is one of the techniques for the identification and recognition of common patterns in data, especially in the case of uncertain and incomplete data.
As the roughness of the knowledge grows, the boundary region vanishes since it is being incorporated into the positive region of the classification.
The rough sets approach to data classification is often referred to as a machine-learning procedure for incomplete or imprecise data and was extensively tested in various scientific domains (Slowinski, 1992).
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.