Pairwise Data Clustering Accompanied by Validation and Visualisation

  • Conference paper
German-Japanese Interchange of Data Analysis Results

Abstract

Pairwise proximities are often the starting point for finding clusters by applying cluster analysis techniques. We refer to this approach as pairwise data clustering (Mucha 2009). A well-known example is Gaussian model-based cluster analysis of observations in its simplest settings: the sum of squares method and the logarithmic sum of squares method. These simple methods become more general when the observations are weighted. For instance, weighting makes it possible to cluster the rows and columns of a contingency table based on pairwise chi-square distances. Finding the appropriate number of clusters is the ultimate aim of the proposed built-in validation techniques. They verify the results of the two most important families of methods, hierarchical and partitional clustering. Pairwise clustering should be accompanied by multivariate graphics such as heatmaps and plot-dendrograms.
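As a rough illustration of why pairwise distances are sufficient for the sum of squares criterion, the sketch below evaluates the weighted within-cluster sum of squares from a matrix of squared distances alone, using the standard identity that the weighted sum of squared deviations from a cluster centroid equals the weighted sum of all pairwise squared distances in that cluster divided by twice the cluster's total weight. It also builds chi-square distances between the row profiles of a contingency table, with row masses as weights. The function names are hypothetical and chosen for this example; this is not the ClusCorr98 implementation, only a minimal sketch under those assumptions.

```python
import numpy as np


def squared_euclidean(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=2)


def chi_square_row_distances(table):
    """Squared chi-square distances between row profiles of a contingency table.

    Returns the distance matrix and the row masses (row totals / grand total).
    Assumes every row total and column total is positive.
    """
    P = np.asarray(table, dtype=float)
    P = P / P.sum()
    row_mass = P.sum(axis=1)
    col_mass = P.sum(axis=0)
    # Dividing each profile by sqrt(column mass) turns the chi-square
    # distance into an ordinary Euclidean distance.
    scaled = (P / row_mass[:, None]) / np.sqrt(col_mass)
    return squared_euclidean(scaled), row_mass


def within_ss(D2, labels, weights=None):
    """Weighted within-cluster sum of squares from squared distances only.

    For each cluster: sum_{i,j} w_i * w_j * d_ij^2 / (2 * sum_i w_i).
    With unit weights this is the usual sum of squares criterion.
    """
    D2 = np.asarray(D2, dtype=float)
    labels = np.asarray(labels)
    if weights is None:
        w = np.ones(D2.shape[0])
    else:
        w = np.asarray(weights, dtype=float)
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        wk = w[idx]
        block = D2[np.ix_(idx, idx)]
        total += (wk @ block @ wk) / (2.0 * wk.sum())
    return total
```

For the four points (0, 0), (2, 0), (10, 0), (10, 4) split into the two obvious pairs, `within_ss(squared_euclidean(X), [0, 0, 1, 1])` gives 10, the same value obtained from the cluster centroids. Putting all rows of a contingency table into one cluster with the row masses as weights reproduces the total inertia, that is, the chi-square statistic divided by the grand total.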

References

  • Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3):803–821

  • Bartel HG, Mucha HJ, Dolata J (2003) Über eine Modifikation eines graphentheoretisch basierten partitionierenden Verfahrens der Clusteranalyse. Match 48:209–223

  • CIA World Factbook (2003) World’s largest merchant fleets by country of owner. http://www.geographic.org

  • Flury B, Riedwyl H (1988) Multivariate statistics: a practical approach. Chapman and Hall, London

  • Greenacre MJ (1988) Clustering the rows and columns of a contingency table. J Classif 5:39–51

  • Hennig C (2007) Cluster-wise assessment of cluster stability. Comput Stat Data Anal 52(1):258–271

  • Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218

  • Mucha HJ (2004) Automatic validation of hierarchical clustering. In: Antoch J (ed) Proceedings in computational statistics, COMPSTAT 2004, Prague. Physica-Verlag, Heidelberg, pp 1535–1542

  • Mucha HJ (2009) ClusCorr98 for Excel 2007: clustering, multivariate visualization, and validation. In: Mucha HJ, Ritter G (eds) Classification and clustering: models, software and applications. Report 26, WIAS, Berlin, pp 14–40

  • Mucha HJ, Bartel HG, Dolata J (2005) Techniques of rearrangements in binary trees (dendrograms) and applications. Match 54:561–582

  • Späth H (1985) Cluster dissection and analysis. Horwood, Chichester


Author information

Corresponding author

Correspondence to Hans-Joachim Mucha.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Mucha, HJ. (2014). Pairwise Data Clustering Accompanied by Validation and Visualisation. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_4
