Abstract
This chapter is about dividing a dataset, or a subset of it, in two parts. If both parts are to be clusters, this is referred to as divisive clustering; if just one part is to be a cluster, this will be referred to as separative clustering. Iterative application of divisive clustering builds a binary hierarchy, from which we will be interested in extracting a partition of the dataset. Iterative application of separative clustering builds a set of clusters, possibly overlapping. The first three sections introduce three different approaches to divisive clustering: Ward clustering, spectral clustering and single-link clustering. Ward clustering is an extension of K-means clustering driven by the so-called Ward distance between clusters; it is also a natural niche for conceptual clustering, in which every division is made over a single feature to attain immediate interpretability of the hierarchy branches and clusters. Spectral clustering gained popularity with the so-called Normalized Cut approach to divisive clustering. A relaxation of this combinatorial problem appears to be equivalent to optimizing the Rayleigh quotient of a Laplacian transformation of the similarity matrix under consideration. In fact, other approaches under consideration, such as uniform clustering and semi-average clustering, may also be treated within the spectral approach. Single-link clustering formalizes the nearest-neighbor approach and is closely related to graph-theoretic concepts: connected components and maximum spanning trees. One may think of divisive clustering as a process for building a binary hierarchy "top-down", in contrast to agglomerative clustering (Sect. 4.6), which builds a binary hierarchy "bottom-up". The two remaining sections describe two separative clustering approaches as extensions of popular approaches to this case.
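To make the Ward distance concrete: between clusters A and B with centroids c_A, c_B and sizes n_A, n_B, it is n_A·n_B/(n_A + n_B) times the squared Euclidean distance between the centroids, so merging (or refusing to split) is penalized by both centroid separation and cluster sizes. A minimal sketch in Python (NumPy assumed; the function name and toy data are illustrative, not from the chapter):

```python
import numpy as np

def ward_distance(A, B):
    """Ward distance between clusters A and B (rows are points):
    n_A * n_B / (n_A + n_B) times the squared Euclidean distance
    between the cluster centroids."""
    nA, nB = len(A), len(B)
    cA, cB = A.mean(axis=0), B.mean(axis=0)
    return nA * nB / (nA + nB) * np.sum((cA - cB) ** 2)

# Two tiny clusters on the real line: centroids 0.5 and 4.5
A = np.array([[0.0], [1.0]])
B = np.array([[4.0], [5.0]])
print(ward_distance(A, B))  # 2*2/4 * 4.0^2 = 16.0
```

In a divisive Ward step, one would seek the two-part split of a cluster that maximizes this quantity between the parts, mirroring the agglomerative criterion of Ward (1963).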
One tries to find a cluster of maximum inner summary similarity over a similarity matrix preprocessed according to the uniform or modularity approaches considered in Sect. 4.6.3. The other applies the encoder-decoder least-squares approach to modeling the data by a one-cluster structure. It appears that the criteria emerging within the latter approach are much akin to those described earlier, the summary and semi-average similarities, although the parameters can now be adjusted according to the least-squares approach. The least-squares approach also applies to a distinct direction, the so-called additive clustering approach, which can be usefully applied to the analysis of similarity data.
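As a sketch of the preprocessing and the summary criterion: assuming the uniform approach subtracts a constant similarity threshold from every entry, and the modularity approach (after Newman 2006) subtracts the "random interactions" term a_i·a_j/T, where a_i are the row totals and T is the grand total, a candidate cluster S is then scored by the sum of the preprocessed similarities over its element pairs. The function names and toy matrix below are illustrative, not the chapter's notation:

```python
import numpy as np

def uniform_preprocess(A, pi):
    """Uniform approach: subtract a constant similarity threshold pi."""
    return A - pi

def modularity_preprocess(A):
    """Modularity approach: subtract the 'random' term a_i*a_j/T,
    where a_i are the row totals and T is the grand total of A."""
    a = A.sum(axis=1)
    T = A.sum()
    return A - np.outer(a, a) / T

def summary_similarity(B, S):
    """Inner summary similarity of cluster S over a preprocessed matrix B."""
    S = list(S)
    return B[np.ix_(S, S)].sum()

# Toy symmetric similarity matrix: {0,1} and {2,3} are the tight pairs
A = np.array([[0., 3., 1., 1.],
              [3., 0., 1., 1.],
              [1., 1., 0., 3.],
              [1., 1., 3., 0.]])
B = uniform_preprocess(A, pi=1.5)
print(summary_similarity(B, [0, 1]) > summary_similarity(B, [0, 2]))  # True
```

After either preprocessing, a positive summary similarity marks a genuine cluster, so maximizing it separates one cluster from the rest; note that the modularity-preprocessed matrix always sums to zero overall.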
References
L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees (Wadsworth, Belmont, CA, 1984)
B. Mirkin, Mathematical Classification and Clustering (Kluwer Academic Press, 1996)
B. Mirkin, Clustering: A Data Recovery Approach (Chapman & Hall/CRC, 2012)
F. Murtagh, Multidimensional Clustering Algorithms (Physica-Verlag, Vienna, 1985)
Articles
O. Borůvka, Příspěvek k řešení otázky ekonomické stavby elektrovodních sítí (Contribution to the solution of a problem of economical construction of electrical networks) (in Czech). Elektronický Obzor 15, 153–154 (1926)
D.H. Fisher, Knowledge acquisition via incremental conceptual clustering. Mach. Learn. 2, 139–172 (1987)
S. Guattery, G. Miller, On the quality of spectral separators. SIAM J. Matrix Anal. Appl. 19(3), 701–719 (1998)
C. Klein, M. Randic, Resistance distance. J. Math. Chem. 12, 81–95 (1993)
J.B. Kruskal, On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Am. Math. Soc. 7(1), 48–50 (1956)
G.N. Lance, W.T. Williams, A general theory of classificatory sorting strategies: 1. Hierarchical Systems. Comput. J. 9, 373–380 (1967)
U. von Luxburg, A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)
B. Mirkin, Additive clustering and qualitative factor analysis methods for similarity matrices. J. Classif. 4, 7–31 (1987); Erratum 6, 271–272 (1989)
B. Mirkin, R. Camargo, T. Fenner, G. Loizou, P. Kellam, Similarity clustering of proteins using substantive knowledge and reconstruction of evolutionary gene histories in herpesvirus. Theor. Chem. Acc.: Theory, Comput., Model. 125(3–6), 569–582 (2010)
F. Murtagh, G. Downs, P. Contreras, Hierarchical clustering of massive, high dimensional data sets by exploiting ultrametric embedding. SIAM J. Sci. Comput. 30, 707–730 (2008)
M.E.J. Newman, Modularity and community structure in networks. PNAS 103(23), 8577–8582 (2006)
R.C. Prim, Shortest connection networks and some generalizations. Bell Syst. Tech. J. 36, 1389–1401 (1957)
J. Shi, J. Malik, Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
R.N. Shepard, P. Arabie, Additive clustering: Representation of similarities as combinations of discrete overlapping properties. Psychol. Rev. 86, 87–123 (1979)
S.K. Tasoulis, D.K. Tasoulis, V.P. Plagianakos, Enhancing principal direction divisive clustering. Pattern Recogn. 43, 3391–3411 (2010)
J.H. Ward Jr., Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58, 236–244 (1963)
Cite this chapter
Mirkin, B. (2019). Divisive and Separate Cluster Structures. In: Core Data Analysis: Summarization, Correlation, and Visualization. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-00271-8_5
Print ISBN: 978-3-030-00270-1
Online ISBN: 978-3-030-00271-8