Detecting a Compact Decision Tree Based on an Appropriate Abstraction

Kudoh, Yoshimitsu; Haraguchi, Makoto

doi:10.1007/3-540-44491-2_10

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1983)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

It is generally accepted that data mining requires pre-processing to exclude irrelevant and meaningless aspects of the data before mining algorithms are applied. From this viewpoint, we have already proposed the notion of Information Theoretical Abstraction and implemented a system, ITA. Given a relational database and a family of possible abstractions for its attribute values, called an abstraction hierarchy, ITA selects the best abstraction among the possible ones so that the class distributions needed to perform our classification task are preserved as far as possible. In our previous experiments, a single application of abstraction to the whole database proved effective in reducing the size of the detected rules without worsening the classification error. However, because C4.5 performs attribute selection serially and repeatedly, ITA does not in general guarantee that class distributions are preserved over a sequence of attribute selections. For this reason, in this paper we propose a new version of ITA, called iterative ITA, which tries to preserve the class distributions as far as possible at each attribute-selection step.
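The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact measure ITA uses, so this sketch assumes that "preserving class distributions" is scored by the information gain of the abstracted attribute with respect to the class. The function names, the dictionary encoding of the abstraction hierarchy, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by grouping rows on `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def best_abstraction(values, labels, hierarchy):
    """Pick the candidate abstraction whose abstracted attribute keeps the
    most information about the class (assumed scoring, not the paper's)."""
    def score(name):
        mapping = hierarchy[name]
        return information_gain([mapping[v] for v in values], labels)
    return max(hierarchy, key=score)


# Toy data: one attribute, one class label per row (hypothetical).
values = ["apple", "pear", "beef", "pork"]
labels = ["yes", "yes", "no", "no"]

# Two candidate abstractions of the same attribute values (hypothetical).
hierarchy = {
    "by_kind": {"apple": "fruit", "pear": "fruit", "beef": "meat", "pork": "meat"},
    "by_colour": {"apple": "red", "beef": "red", "pear": "pale", "pork": "pale"},
}

chosen = best_abstraction(values, labels, hierarchy)
```

Here `by_kind` groups rows so that each abstract value has a pure class distribution, while `by_colour` mixes the classes in every group, so the sketch selects `by_kind`. Under the same assumption, an iterative variant would repeat this selection on the subset of rows reaching each node while the tree is grown, rather than once for the whole database.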

Author information

Authors and Affiliations

Division of Electronics and Information Engineering, Hokkaido University, N 13 W 8, 060-8628, Sapporo, JAPAN
Yoshimitsu Kudoh

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
Kwong Sak Leung & Lai-Wan Chan
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong
Helen Meng

About this paper

Cite this paper

Kudoh, Y., Haraguchi, M. (2000). Detecting a Compact Decision Tree Based on an Appropriate Abstraction. In: Leung, K.S., Chan, LW., Meng, H. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2000. Data Mining, Financial Engineering, and Intelligent Agents. IDEAL 2000. Lecture Notes in Computer Science, vol 1983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44491-2_10

DOI: https://doi.org/10.1007/3-540-44491-2_10
Published: 27 May 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41450-6
Online ISBN: 978-3-540-44491-6
eBook Packages: Springer Book Archive
