Hierarchical Clustering

Mirkin, Boris

doi:10.1007/978-0-85729-287-2_7

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

Hierarchical clustering builds a binary hierarchy on the entity set. This chapter explains one algorithm for agglomerative clustering and two algorithms for divisive clustering, all three based on the same square-error criterion as the K-Means partitioning method. Agglomerative clustering starts from the trivial partition into singletons and merges two clusters at a time. Divisive clustering splits clusters into parts and should be the more interesting approach computationally, because it can use fast splitting algorithms and can stop splitting whenever that seems appropriate. One divisive algorithm splits a cluster by running conventional K-Means at K = 2. The other maximizes a summary association coefficient to make splits conceptually, that is, using one feature at a time. The last section is devoted to Single Link clustering, a popular method for extracting elongated structures from data. The relations between single link clustering and two popular graph-theoretic structures, the Minimum Spanning Tree (MST) and connected components, are explained.
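
As a rough illustration of the agglomerative scheme described in the abstract, the sketch below assumes Ward's form of the square-error criterion: the cost of merging two clusters is the resulting increase in total square error, which equals N_a N_b / (N_a + N_b) times the squared Euclidean distance between the two centroids. The function names (`centroid`, `ward_distance`, `agglomerate`) and the toy data are illustrative assumptions, not taken from the chapter, and the exhaustive search over all cluster pairs is chosen for clarity rather than speed.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def ward_distance(a, b):
    """Increase in total square error caused by merging clusters a and b."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def agglomerate(points, n_clusters=1):
    """Merge the two Ward-nearest clusters until n_clusters remain.

    Returns the final clusters and the merge history as
    (cluster_a, cluster_b, ward_distance) triples.
    """
    clusters = [[p] for p in points]  # trivial start: one singleton per entity
    history = []
    while len(clusters) > n_clusters:
        # Naive search: evaluate every pair and keep the cheapest merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        cost = ward_distance(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], cost))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, history


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated pairs of points.
    toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    final_clusters, merges = agglomerate(toy, n_clusters=2)
    print(final_clusters)
    print([cost for _, _, cost in merges])
```

The merge history is what defines the binary hierarchy: each recorded triple corresponds to one internal node of the tree, and its cost can serve as the node's height. Stopping at `n_clusters > 1` simply cuts that tree early.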

Author information

Authors and Affiliations

Research University – Higher School of Economics, School of Applied Mathematics and Informatics, 11 Pokrovsky Boulevard, Moscow, RF, Russia
Boris Mirkin
Department of Computer Science, Birkbeck University of London, Malet Street, London, UK
Boris Mirkin

About this chapter

Cite this chapter

Mirkin, B. (2011). Hierarchical Clustering. In: Core Concepts in Data Analysis: Summarization, Correlation and Visualization. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-287-2_7

DOI: https://doi.org/10.1007/978-0-85729-287-2_7
Published: 09 February 2011
Publisher Name: Springer, London
Print ISBN: 978-0-85729-286-5
Online ISBN: 978-0-85729-287-2
eBook Packages: Computer Science, Computer Science (R0)
