Induction of Modular Classification Rules: Using Jmax-pruning
The Prism family of algorithms induces modular classification rules which, in contrast to those produced by decision tree induction algorithms, do not necessarily fit together into a decision tree structure. Classifiers induced by Prism algorithms achieve accuracy comparable to that of decision trees and in some cases even outperform them. Both kinds of algorithms tend to overfit on large and noisy datasets, which has led to the development of pruning methods. Pruning methods use various metrics to truncate decision trees or to eliminate whole rules or single rule terms from a Prism rule set. Many pre-pruning and post-pruning methods exist for decision trees; for Prism algorithms, however, only one pre-pruning method has been developed: J-pruning. Recent work with Prism algorithms examined J-pruning in the context of very large datasets and found that the current method does not exploit its full potential. This paper revisits the J-pruning method for the Prism family of algorithms, develops a new pruning method, Jmax-pruning, discusses it in theoretical terms and evaluates it empirically.
Keywords: Decision Tree, Training Instance, Target Class, Rule Induction, Pruning Method
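To make the contrast between the two pruning strategies concrete, the sketch below computes the J-measure of a rule and applies a Jmax-style truncation. This is a minimal illustration under stated assumptions, not the authors' implementation: the J-measure formula follows the Smyth and Goodman definition on which J-pruning is based, and the sketch assumes that Jmax-pruning keeps the rule prefix with the highest recorded J-value rather than stopping, as J-pruning does, at the first decrease. All function names, probabilities, and example J-values are hypothetical.

```python
import math

def j_measure(p_y, p_x_given_y, p_x):
    """J-measure of a rule 'IF antecedent THEN class' (Smyth & Goodman).

    p_y         -- probability that the rule's antecedent fires
    p_x_given_y -- probability of the target class given the antecedent
    p_x         -- prior probability of the target class (assumed in (0, 1))
    """
    def term(p, q):
        # Convention: 0 * log2(0/q) = 0
        return 0.0 if p == 0.0 else p * math.log2(p / q)
    # Cross-entropy between posterior and prior class distributions,
    # weighted by how often the antecedent fires.
    return p_y * (term(p_x_given_y, p_x) +
                  term(1.0 - p_x_given_y, 1.0 - p_x))

def jmax_truncate(j_values):
    """Given the J-value recorded after each appended rule term, return
    how many terms to keep: the prefix with the maximum J-value
    (Jmax-style), instead of stopping at the first drop (J-pruning)."""
    best = max(range(len(j_values)), key=j_values.__getitem__)
    return best + 1

# Hypothetical rule: fires on 30% of instances, class prior 25%,
# 80% of covered instances belong to the target class.
print(round(j_measure(0.3, 0.8, 0.25), 4))  # -> 0.2883

# Hypothetical J-values recorded while specialising one rule term by term.
# J-pruning would stop after term 2 (first decrease); a Jmax-style cut
# keeps four terms, where J peaks.
js = [0.021, 0.035, 0.029, 0.041, 0.038]
print(jmax_truncate(js))  # -> 4
```

The example J-value sequence shows why stopping at the first decrease can be premature: the J-value dips after the second term but reaches its global maximum at the fourth, which is the kind of case the abstract alludes to when it says J-pruning does not exploit its full potential.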