Hierarchical clustering builds a binary hierarchy on the entity set. This chapter presents one algorithm for agglomerative clustering and two algorithms for divisive clustering, all three based on the same square-error criterion as the K-Means partitioning method. Agglomerative clustering starts from the trivial partition into singletons and merges two clusters at a time. Divisive clustering splits clusters into parts and should be the more attractive approach computationally, because it can use fast splitting algorithms and can stop splitting whenever that seems appropriate. One divisive algorithm splits a cluster by applying conventional K-Means at K = 2. The other makes conceptual splits, that is, splits using one feature at a time, by maximizing a summary association coefficient. The last section is devoted to Single Link clustering, a popular method for extracting elongated structures from data. The relations between Single Link clustering and two popular graph-theoretic structures, the Minimum Spanning Tree (MST) and connected components, are explained.
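The relation between Single Link clustering, the MST and connected components can be illustrated with a minimal Python sketch. It is not the chapter's own code: the function name `single_link_clusters`, the use of Euclidean distance, and the stopping rule "stop when k components remain" are assumptions made for illustration. The sketch runs Kruskal-style merging over edges sorted by length; the accepted edges form part of an MST, and the connected components left when k of them remain are the k single-link clusters.

```python
import math
from itertools import combinations


def single_link_clusters(points, k):
    """Split `points` into k single-link clusters (illustrative sketch).

    Edges are processed shortest first, as in Kruskal's MST algorithm.
    Merging stops once k connected components remain; those components
    are the single-link clusters.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, shortest first (an assumption:
    # any dissimilarity could be used instead).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    components = n
    accepted = []  # MST edges accepted so far, as (i, j, distance)
    for d, i, j in edges:
        if components <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            accepted.append((i, j, d))
            components -= 1

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), accepted


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (0, 2), (5, 0), (5, 1)]
    clusters, accepted = single_link_clusters(data, k=2)
    print(clusters)
    print(accepted)
```

Calling the function with k = 1 accepts n - 1 edges, which together form a complete MST; stopping earlier is equivalent to removing the longest MST edges and reading off the connected components.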





Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Boris Mirkin (1, 2)
  1. Research University – Higher School of Economics, School of Applied Mathematics and Informatics, Moscow, Russia
  2. Department of Computer Science, Birkbeck University of London, London, UK
