Abstract
Decision tree learning is relatively non-robust: a small change in the training set may significantly change the structure of the induced decision tree. This paper presents a decision tree construction method in which the domain model is built by consensus clustering of N decision trees induced in N-fold cross-validation. Experimental results show that consensus decision trees are simpler than C4.5 decision trees, indicating that they may be a more stable approximation of the intended domain model than decision trees constructed from the entire set of training instances.
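The abstract describes inducing N trees, one per fold of N-fold cross-validation, and then combining them. The sketch below illustrates only that setup, assuming scikit-learn (the paper itself used C4.5, and its actual consensus step is a clustering of the tree structures, which the abstract does not detail); a simple majority vote over the N trees and the name predict_consensus are stand-ins introduced here for illustration.

```python
# Minimal sketch: induce N decision trees via N-fold cross-validation,
# then combine their predictions. Majority voting is used here only as a
# stand-in for the paper's consensus-clustering step.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
N = 10  # number of folds, hence number of induced trees

# Train one tree on each fold's training portion.
trees = []
for train_idx, _ in KFold(n_splits=N, shuffle=True, random_state=0).split(X):
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X[train_idx], y[train_idx])
    trees.append(tree)

def predict_consensus(X_new):
    """Majority vote across the N trees (hypothetical stand-in for the
    consensus decision tree described in the paper)."""
    votes = np.stack([t.predict(X_new) for t in trees])  # shape (N, n_samples)
    return np.apply_along_axis(
        lambda col: np.bincount(col.astype(int)).argmax(), 0, votes
    )

print(predict_consensus(X[:5]))
```

Note that voting yields an ensemble (as in bagging), not a single simple tree; the paper's point is to recover one consensus tree, so this sketch only reproduces the cross-validation setup, not the claimed simplicity result.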
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kavšek, B., Lavrač, N., and Ferligoj, A. (2002). Hierarchical Clustering of Multiple Decision Trees. In: Jajuga, K., Sokołowski, A., and Bock, H.-H. (eds.), Classification, Clustering, and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56181-8_38
Print ISBN: 978-3-540-43691-1
Online ISBN: 978-3-642-56181-8