Rule Induction, Missing Attribute Values and Discretization

Article Outline

Glossary

Definition of the Subject

Introduction

Discretization

LEM2 Algorithm

Inconsistent Data

Missing Attribute Values

MLEM2

Classification System

Validation

Future Directions

Bibliography

Glossary

Discretization:

Discretization is the process of converting numerical attributes into symbolic ones by splitting the numerical attribute domain into intervals. Usually discretization is conducted before the main process of rule induction, but in some rule induction algorithms, e.g., in MLEM2 (Modified LEM2), rules are induced concurrently with discretization.
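
As an illustration of splitting a numerical domain into intervals, the following minimal sketch uses equal-width intervals. This is only one simple scheme, picked here for illustration; the entry does not say which discretization method LERS or MLEM2 applies, and the function name `equal_width_discretize` and the sample values are hypothetical.

```python
def equal_width_discretize(values, k):
    """Replace each number by a symbolic label naming one of k equal-width intervals.

    Assumption: equal-width splitting is used only as an example scheme.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Interval boundaries: lo, lo + width, ..., hi
    bounds = [lo + i * width for i in range(k)] + [hi]
    labels = []
    for v in values:
        # Index of the interval containing v; the maximum falls into the last interval
        i = min(int((v - lo) / width), k - 1) if width > 0 else 0
        labels.append(f"{bounds[i]}..{bounds[i + 1]}")
    return labels


# Hypothetical numerical attribute with four cases, split into two intervals
print(equal_width_discretize([4.0, 5.0, 6.5, 8.0], 2))
```

Each numerical value is replaced by an interval label, so a rule induction algorithm that expects symbolic attributes can then treat the labels as ordinary attribute values.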

LEM2 algorithm:

LEM2 (Learning from Examples Module, version 2) is the basic rule induction algorithm of the machine learning/data mining system LERS. LEM2, first implemented in 1990, uses the idea of a local covering to induce a minimal set of minimal rules describing all data concepts.
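
The local-covering idea can be sketched as follows. This is a simplified illustration written from the general description of LEM2, not the LERS implementation; it assumes consistent data, symbolic (string) attribute values and no missing values. The names `block`, `block_of`, `lem2` and the toy data are hypothetical.

```python
def block(cases, pair):
    """[t]: indices of the cases in which attribute a has value v, for t = (a, v)."""
    a, v = pair
    return {i for i, c in enumerate(cases) if c[a] == v}


def block_of(cases, pairs):
    """[T]: indices of the cases that satisfy every pair in T."""
    result = set(range(len(cases)))
    for t in pairs:
        result &= block(cases, t)
    return result


def lem2(cases, concept):
    """Return a covering of `concept` as a list of sets of (attribute, value) pairs."""
    B = set(concept)
    G = set(B)                      # cases of the concept still to be covered
    covering = []
    while G:
        T = set()
        while not T or not block_of(cases, T) <= B:
            # Pairs occurring in the cases of G and not yet used in T
            candidates = sorted({(a, cases[i][a]) for i in G for a in cases[i]} - T)
            # Prefer the largest overlap with G, then the smallest block
            t = max(candidates,
                    key=lambda p: (len(block(cases, p) & G), -len(block(cases, p))))
            T.add(t)
            G &= block(cases, t)
        # Drop conditions that are not needed to stay inside the concept
        for t in sorted(T):
            if len(T) > 1 and block_of(cases, T - {t}) <= B:
                T.remove(t)
        covering.append(T)
        G = B - set().union(*(block_of(cases, S) for S in covering))
    # Drop rules whose cases are already covered by the remaining rules
    for T in list(covering):
        rest = [S for S in covering if S is not T]
        if rest and set().union(*(block_of(cases, S) for S in rest)) == B:
            covering = rest
    return covering


# Hypothetical toy data: condition attributes only; the concept is given by case indices
cases = [
    {"temp": "high", "headache": "yes"},
    {"temp": "very_high", "headache": "yes"},
    {"temp": "normal", "headache": "no"},
    {"temp": "high", "headache": "no"},
    {"temp": "very_high", "headache": "no"},
]
flu = {0, 1, 4}
for T in lem2(cases, flu):
    print(" and ".join(f"({a}, {v})" for a, v in sorted(T)), "-> flu")
```

Each set of pairs in the result corresponds to the condition part of one rule: every case it matches belongs to the concept, and together the rules cover the whole concept.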

LERS machine learning/data mining system:

LERS (Learning from Examples based on Rough Sets) is a rule induction system created at the University of Kansas. Its first implementation was written in Franz Lisp in 1988. This first version of LERS had only one algorithm, called LEM1 (Learning from Examples Module, version 1), to induce all rules from input data.

Missing attribute values:

Missing attribute values frequently affect real-life data. Some attribute values are lost (e.g., erased); others are “do not care” conditions (such attribute values were irrelevant for classification of the case). In most existing machine learning/data mining systems, some method of handling missing attribute values is applied before the main process of rule induction. However, in MLEM2, rule induction and handling of missing attribute values are conducted at the same time.
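
The difference between the two kinds of missing values can be made concrete with a small sketch. It follows one interpretation found in the rough-set literature: a case with a lost value is left out of every attribute-value block of that attribute, while a case with a "do not care" value is placed in all of them. The markers `"?"` and `"*"`, the function name `attribute_value_blocks` and the sample data are assumptions made for this illustration, not details given in this entry.

```python
LOST, DO_NOT_CARE = "?", "*"   # assumed markers for the two kinds of missing values


def attribute_value_blocks(cases, attribute):
    """Map each known value of `attribute` to the set of case indices matching it."""
    known = {c[attribute] for c in cases} - {LOST, DO_NOT_CARE}
    blocks = {v: set() for v in known}
    for i, c in enumerate(cases):
        v = c[attribute]
        if v == DO_NOT_CARE:
            # "Do not care": the case is compatible with every known value
            for b in blocks.values():
                b.add(i)
        elif v != LOST:
            blocks[v].add(i)
        # Lost value: the case joins no block of this attribute
    return blocks


# Hypothetical data: case 1 has a lost value, case 2 a "do not care" condition
cases = [{"temp": "high"}, {"temp": "?"}, {"temp": "*"}, {"temp": "normal"}]
print(attribute_value_blocks(cases, "temp"))
```

Under this interpretation no separate preprocessing step replaces the missing values; they only change which cases a condition such as (temp, high) is considered to match.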

Rule induction:

Rule induction is understood here as an instance of supervised learning. Rule induction is one of the basic processes of acquiring knowledge (knowledge extraction) in the form of rule sets from raw data. This process is widely used in machine learning (data mining). A data set contains cases (examples) characterized by attribute values and classified as members of concepts by an expert. Rules are expressions of the following format:

if condition\( _1 \) and condition\( _2 \) and … and condition\( _n \) then decision.
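
A rule of this format can be represented as a conjunction of attribute-value conditions together with a decision. The sketch below is one possible representation, not the data structure used by LERS; the class name `Rule`, its fields and the example attributes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: tuple   # (attribute, value) pairs; all of them must hold
    decision: tuple     # (decision attribute, value)

    def matches(self, case):
        """True if the case satisfies every condition of the rule."""
        return all(case.get(a) == v for a, v in self.conditions)

    def __str__(self):
        conds = " and ".join(f"({a}, {v})" for a, v in self.conditions)
        d, w = self.decision
        return f"if {conds} then ({d}, {w})"


# Hypothetical rule and case
rule = Rule((("temperature", "high"), ("headache", "yes")), ("flu", "yes"))
print(rule)
print(rule.matches({"temperature": "high", "headache": "yes"}))
```

A case is classified by a rule only when all of the rule's conditions hold for it; a case that fails any single condition is not matched.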

© 2012 Springer-Verlag

Cite this entry

Grzymala-Busse, J.W. (2012). Rule Induction, Missing Attribute Values and Discretization. In: Meyers, R. (eds) Computational Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1800-9_170
