Clustering Via Decision Tree Construction

Foundations and Advances in Data Mining

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 180)

Abstract

Clustering is an exploratory data analysis task. It aims to find the intrinsic structure of data by organizing data objects into similarity groups, or clusters. It is often called unsupervised learning because no class labels denoting an a priori partition of the objects are given. This contrasts with supervised learning (e.g., classification), in which the data objects are already labeled with known classes. Past research in clustering has produced many algorithms, but these algorithms have some shortcomings. In this paper, we propose a novel clustering technique based on a supervised learning technique called decision tree construction. The new technique is able to overcome many of these shortcomings. The key idea is to use a decision tree to partition the data space into cluster (or dense) regions and empty (or sparse) regions (which produce outliers and anomalies). We achieve this by introducing virtual data points into the space and then applying a modified decision tree algorithm. The technique is able to find “natural” clusters in large, high-dimensional spaces efficiently. It is suitable for clustering in the full-dimensional space as well as in subspaces. It also provides easily comprehensible descriptions of the resulting clusters. Experiments on both synthetic and real-life data show that the technique is effective and scales well for large, high-dimensional datasets.
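The key idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's algorithm: it assumes that the virtual points are drawn uniformly over the bounding box, that they are physically added to the data, and that a plain information-gain criterion picks a single axis-parallel cut. The chapter's modified decision tree algorithm is not described on this page, and the names `entropy` and `best_split` are hypothetical.

```python
import math
import random


def entropy(n_real, n_virtual):
    """Binary entropy (in bits) of a region holding real and virtual points."""
    total = n_real + n_virtual
    h = 0.0
    for count in (n_real, n_virtual):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def best_split(points, labels):
    """Return (gain, dim, threshold) of the best single axis-parallel cut.

    labels[i] is 1 for a real data point and 0 for a virtual point.
    Assumption: standard information gain, not the chapter's criterion.
    """
    n = len(points)
    n_real = sum(labels)
    n_virtual = n - n_real
    base = entropy(n_real, n_virtual)
    best = (0.0, None, None)
    for dim in range(len(points[0])):
        order = sorted(range(n), key=lambda i: points[i][dim])
        left_real = left_virtual = 0
        for rank in range(n - 1):
            i = order[rank]
            if labels[i]:
                left_real += 1
            else:
                left_virtual += 1
            lo = points[i][dim]
            hi = points[order[rank + 1]][dim]
            if lo == hi:
                continue  # cannot cut between two identical coordinates
            n_left = rank + 1
            n_right = n - n_left
            after = (
                n_left * entropy(left_real, left_virtual)
                + n_right * entropy(n_real - left_real, n_virtual - left_virtual)
            ) / n
            gain = base - after
            if gain > best[0]:
                best = (gain, dim, (lo + hi) / 2)
    return best


rng = random.Random(0)
# Real data: one dense band occupying x in [0, 0.3] of the unit square.
real = [(rng.uniform(0.0, 0.3), rng.uniform(0.0, 1.0)) for _ in range(200)]
# Virtual data: the same number of points spread uniformly over the whole square.
virtual = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(200)]

points = real + virtual
labels = [1] * len(real) + [0] * len(virtual)

gain, dim, threshold = best_split(points, labels)
print(f"cut on dimension {dim} at {threshold:.3f} (gain {gain:.3f})")
```

In this toy setup the cut should fall on the first dimension near the edge of the dense band, so that one side is mostly real points (a candidate cluster region) and the other holds only virtual points (a sparse region). Applying the split recursively would yield a tree whose leaves describe clusters as axis-parallel boxes.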

Editor information

Wesley Chu, Tsau Young Lin

About this chapter

Cite this chapter

Liu, B., Xia, Y., Yu, P. Clustering Via Decision Tree Construction. In: Chu, W., Young Lin, T. (eds) Foundations and Advances in Data Mining. Studies in Fuzziness and Soft Computing, vol 180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11362197_5

  • DOI: https://doi.org/10.1007/11362197_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25057-9

  • Online ISBN: 978-3-540-32393-8

  • eBook Packages: Engineering, Engineering (R0)