Building a Concept Hierarchy from a Distance Matrix

Kuo, Huang-Cheng; Huang, Jen-Peng

doi:10.1007/3-540-32392-9_10

Part of the book series: Advances in Soft Computing (AINSC, volume 31)

Abstract

Concept hierarchies are important in many generalized data mining applications, such as multiple-level association rule mining. In the literature, a concept hierarchy is usually supplied by domain experts. In this paper, we propose algorithms that automatically build a concept hierarchy from a provided distance matrix. Our approach modifies the traditional hierarchical clustering algorithms. To evaluate an algorithm, we derive a distance matrix from the concept hierarchy it builds and use the root mean squared error between the provided distance matrix and the derived distance matrix as the evaluation criterion. We compare the traditional hierarchical clustering algorithm and our modified algorithm under three strategies for computing cluster distance: single link, average link, and complete link. Empirical results show that the traditional algorithm performs better under the complete link strategy than under the other two. Our modified algorithms perform almost the same under all three strategies, and they perform better than the traditional algorithms under various situations.
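
The evaluation loop described in the abstract can be illustrated with a minimal Python sketch. It covers only the traditional agglomerative baseline, since the abstract does not describe the modified algorithm. Two points are assumptions rather than details from the paper: the derived distance between two items is taken to be the merge height at which they first fall into the same cluster (the cophenetic distance), and the function names `cluster_distance`, `derived_distance_matrix`, and `rmse` are hypothetical.

```python
import math
from itertools import combinations


def cluster_distance(dist, a, b, strategy):
    """Distance between clusters a and b (lists of item indices)."""
    pairwise = [dist[i][j] for i in a for j in b]
    if strategy == "single":      # closest pair of members
        return min(pairwise)
    if strategy == "complete":    # farthest pair of members
        return max(pairwise)
    if strategy == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown strategy: {strategy}")


def derived_distance_matrix(dist, strategy):
    """Run traditional agglomerative clustering on a symmetric distance
    matrix and return the matrix derived from the resulting hierarchy.

    Assumption: the derived distance of two items is the height of the
    merge that first puts them in the same cluster.
    """
    n = len(dist)
    clusters = [[i] for i in range(n)]
    derived = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        (p, q), height = min(
            (((p, q), cluster_distance(dist, clusters[p], clusters[q], strategy))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        # Every cross-cluster pair of items first meets at this merge.
        for i in clusters[p]:
            for j in clusters[q]:
                derived[i][j] = derived[j][i] = height
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return derived


def rmse(provided, derived):
    """Root mean squared error over the distinct item pairs."""
    n = len(provided)
    squared = [(provided[i][j] - derived[i][j]) ** 2
               for i, j in combinations(range(n), 2)]
    return math.sqrt(sum(squared) / len(squared))
```

Calling `rmse(D, derived_distance_matrix(D, s))` for each `s` in `"single"`, `"average"`, and `"complete"` gives one score per strategy for a provided matrix `D`, which is the kind of comparison the abstract reports. The sketch recomputes all cluster distances at every merge for clarity, so it is suitable only for small matrices.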

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Chiayi University, Taiwan, 600
Huang-Cheng Kuo
Department of Information Management, Southern Taiwan University of Technology, Taiwan, 710
Jen-Peng Huang

Editor information

Editors and Affiliations

Institute of Computer Sciences, Polish Academy of Sciences, ul. Ordona 21, 01-237, Warszawa, Poland
Mieczysław A. Kłopotek, Sławomir T. Wierzchoń & Krzysztof Trojanowski

About this paper

Cite this paper

Kuo, HC., Huang, JP. (2005). Building a Concept Hierarchy from a Distance Matrix. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_10

DOI: https://doi.org/10.1007/3-540-32392-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25056-2
Online ISBN: 978-3-540-32392-1
eBook Packages: Engineering, Engineering (R0)
