Rough Set Based Decision Tree Model for Classification
A decision tree, a commonly used classification model, is constructed recursively in a top-down fashion (from general concepts to particular examples) by repeatedly splitting the training data set. ID3 is a greedy algorithm that considers one attribute at a time for splitting at a node. In C4.5, all attributes, barring the nominal attributes used at the parent nodes, are retained for further computation, which adds memory and computational overhead. Rough Set theory (RS) simplifies the search for dominant attributes in information systems. In this paper, a Rough set based Decision Tree (RDT) model, which combines RS tools with classical DT capabilities, is proposed to address this computational overhead. The experiments compare the performance of RDT with the RS approach and the ID3 algorithm. RDT performs better than the RS approach in accuracy and rule complexity, while RDT and ID3 are comparable.
Keywords: Rough set, supervised learning, decision tree, feature selection, classification, data mining
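The abstract mentions two attribute-ranking ideas: the greedy, one-attribute-at-a-time split choice of ID3, and the rough set search for dominant attributes. The sketch below illustrates one standard measure for each, information gain and the rough set dependency degree (the fraction of objects in the positive region). It is not the RDT algorithm, whose details the abstract does not give. The function names and the toy table `rows` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction from splitting rows on one attribute (ID3-style criterion)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def dependency_degree(rows, attrs, target):
    """Rough set dependency degree: share of rows whose equivalence class
    under attrs maps to a single decision value (the positive region)."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[target])
    positive = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return positive / len(rows)


# Hypothetical toy decision table (not data from the paper).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]

# Greedy choice of a single split attribute, as ID3 does at each node.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
```

On this toy table both measures agree: `windy` fully determines `play`, while `outlook` carries no information about it. On real data the two rankings can differ.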