Abstract
Pairwise proximities are often the starting point for finding clusters with cluster analysis techniques. We refer to this approach as pairwise data clustering (Mucha 2009). A well-known example is Gaussian model-based cluster analysis of observations in its simplest settings: the sum of squares method and the logarithmic sum of squares method. These simple methods become more general when the observations are weighted. For instance, with suitable weights, the rows and columns of a contingency table can be clustered on the basis of pairwise chi-square distances. The proposed built-in validation techniques ultimately aim at finding the appropriate number of clusters. They verify the results of the two most important families of methods, hierarchical and partitional clustering. Pairwise clustering should be accompanied by multivariate graphics such as heatmaps and plot-dendrograms.
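To make the idea of weighted pairwise clustering concrete, the following is a minimal sketch and not the ClusCorr98 implementation. It assumes the standard correspondence-analysis definition of the chi-square distance between row profiles, uses the row masses as observation weights, and evaluates the weighted within-cluster sum of squares from the pairwise distances alone through the identity W_k = (1 / (2 M_k)) Σ_{i,j ∈ C_k} m_i m_j d_ij, where M_k is the total weight of cluster C_k. The function names `chi_square_distances` and `within_cluster_ss` and the small example table are hypothetical.

```python
import numpy as np

def chi_square_distances(table):
    """Squared chi-square distances between the row profiles of a
    contingency table, plus the row masses used as weights.
    Assumes every row total and column total is positive."""
    P = np.asarray(table, dtype=float)
    P = P / P.sum()                       # correspondence matrix
    row_mass = P.sum(axis=1)              # weights of the rows
    col_mass = P.sum(axis=0)              # scaling of the columns
    profiles = P / row_mass[:, None]      # each row sums to 1
    diff = profiles[:, None, :] - profiles[None, :, :]
    D = (diff ** 2 / col_mass).sum(axis=2)
    return D, row_mass

def within_cluster_ss(D, weights, labels):
    """Weighted within-cluster sum of squares computed only from the
    pairwise squared distances D (no centroids are needed)."""
    D = np.asarray(D, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        m = weights[idx]
        total += m @ D[np.ix_(idx, idx)] @ m / (2.0 * m.sum())
    return total

# Illustrative table (made up): rows 0 and 1 share one profile,
# rows 2 and 3 share another.
table = np.array([[10, 20], [20, 40], [30, 10], [60, 20]])
D, w = chi_square_distances(table)
print(within_cluster_ss(D, w, [0, 0, 1, 1]))  # partition by profile
print(within_cluster_ss(D, w, [0, 1, 0, 1]))  # mixed partition
```

In this made-up table the first partition groups rows with identical profiles, so its criterion value is close to zero, while the mixed partition gives a clearly larger value. Clustering the columns would follow the same pattern applied to the transposed table.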
References
Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3):803–821
Bartel HG, Mucha HJ, Dolata J (2003) Über eine Modifikation eines graphentheoretisch basierten partitionierenden Verfahrens der Clusteranalyse. Match 48:209–223
CIA World Factbook (2003) World’s largest merchant fleets by country of owner. http://www.geographic.org
Flury B, Riedwyl H (1988) Multivariate statistics: a practical approach. Chapman and Hall, London
Greenacre MJ (1988) Clustering the rows and columns of a contingency table. J Classif 5:39–51
Hennig C (2007) Cluster-wise assessment of cluster stability. Comput Stat Data Anal 52(1):258–271
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218
Mucha HJ (2004) Automatic validation of hierarchical clustering. In: Antoch J (ed) Proceedings in computational statistics, COMPSTAT 2004, Prague. Physica-Verlag, Heidelberg, pp 1535–1542
Mucha HJ (2009) ClusCorr98 for Excel 2007: clustering, multivariate visualization, and validation. In: Mucha HJ, Ritter G (eds) Classification and clustering: models, software and applications. Report 26, WIAS, Berlin, pp 14–40
Mucha HJ, Bartel HG, Dolata J (2005) Techniques of rearrangements in binary trees (dendrograms) and applications. Match 54:561–582
Späth H (1985) Cluster dissection and analysis. Horwood, Chichester
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Mucha, HJ. (2014). Pairwise Data Clustering Accompanied by Validation and Visualisation. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_4
DOI: https://doi.org/10.1007/978-3-319-01264-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01263-6
Online ISBN: 978-3-319-01264-3
eBook Packages: Mathematics and Statistics (R0)