Article Outline
Glossary
Definition of the Subject
Introduction
Discretization
LEM2 Algorithm
Inconsistent Data
Missing Attribute Values
MLEM2
Classification System
Validation
Future Directions
Bibliography
Glossary
- Discretization:
  Discretization is a process of converting numerical attributes into symbolic ones by splitting the numerical attribute domain into intervals. Usually discretization is conducted before the main process of rule induction, but in some rule induction algorithms, e.g., in MLEM2 (Modified LEM2), rules are induced concurrently with discretization.
- LEM2 algorithm:
  LEM2 (Learning from Examples Module, version 2) is the basic rule induction algorithm of the machine learning/data mining system LERS. LEM2, first implemented in 1990, uses the idea of a local covering to induce a minimal set of minimal rules describing all data concepts.
- LERS machine learning/data mining system:
  LERS (Learning from Examples based on Rough Sets) is a rule induction system created at the University of Kansas. Its first implementation was done in Franz Lisp in 1988. This first version of LERS had only one algorithm, called LEM1 (Learning from Examples Module, version 1), to induce all rules from input data.
- Missing attribute values:
  Missing attribute values frequently affect real-life data. Some attribute values are lost (e.g., erased); others are "do not care" conditions (such attribute values were irrelevant for classification of the case). In most existing machine learning/data mining systems, some method of handling missing attribute values is applied before the main process of rule induction. However, in MLEM2, rule induction and handling of missing attribute values are conducted at the same time.
- Rule induction:
  Rule induction is understood here as an instance of supervised learning. Rule induction is one of the basic processes of acquiring knowledge (knowledge extraction) in the form of rule sets from raw data. This process is widely used in machine learning (data mining). A data set contains cases (examples) characterized by attribute values and classified as members of concepts by an expert. Rules are expressions of the following format:
  if condition\(_1\) and condition\(_2\) and … and condition\(_n\) then decision.
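The rule format and the notion of discretization defined in the glossary can be illustrated with a minimal Python sketch. It is not the LERS, LEM2, or MLEM2 implementation: the names `discretize` and `Rule`, the `left..right` interval labels, the attribute names, and the cut point 99.0 are all hypothetical choices made for illustration. The sketch assumes the cut points are given in advance and lie strictly inside the attribute domain, and that each condition is an (attribute, value) pair over symbolic values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


def discretize(value: float, lo: float, hi: float, cuts: List[float]) -> str:
    """Return the label of the interval that contains `value`.

    The domain [lo, hi] is split at `cuts` into half-open intervals
    [left, right); the last interval also contains `hi`. Values outside
    the domain fall into the first or the last interval.
    """
    bounds = [lo, *sorted(cuts), hi]
    intervals = list(zip(bounds, bounds[1:]))
    for left, right in intervals[:-1]:
        if value < right:
            return f"{left}..{right}"
    left, right = intervals[-1]
    return f"{left}..{right}"


@dataclass(frozen=True)
class Rule:
    """A rule of the form: if condition_1 and ... and condition_n then decision."""

    conditions: Tuple[Tuple[str, str], ...]  # (attribute, symbolic value) pairs
    decision: Tuple[str, str]                # (decision attribute, concept)

    def matches(self, case: Dict[str, str]) -> bool:
        # A case matches when it satisfies every condition of the rule.
        return all(case.get(attr) == val for attr, val in self.conditions)

    def __str__(self) -> str:
        lhs = " and ".join(f"({a}, {v})" for a, v in self.conditions)
        d_attr, d_val = self.decision
        return f"if {lhs} then ({d_attr}, {d_val})"


# Hypothetical data: one numerical attribute with a single cut point.
TEMP_CUTS = [99.0]
rule = Rule(
    conditions=(("Temperature", "99.0..104.0"), ("Headache", "yes")),
    decision=("Flu", "yes"),
)
case = {
    "Temperature": discretize(101.2, 96.0, 104.0, TEMP_CUTS),
    "Headache": "yes",
}

print(rule)
print(rule.matches(case))
```

Here discretization is applied as a preprocessing step, which corresponds to the usual arrangement described above; an algorithm such as MLEM2, which induces rules concurrently with discretization, would not rely on precomputed cut points in this way.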
© 2012 Springer-Verlag
Cite this entry
Grzymala-Busse, J.W. (2012). Rule Induction, Missing Attribute Values and Discretization. In: Meyers, R. (eds) Computational Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1800-9_170
DOI: https://doi.org/10.1007/978-1-4614-1800-9_170
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1799-6
Online ISBN: 978-1-4614-1800-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering