Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Visualizing Clustering Results

  • Alexander Hinneburg
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_617

Synonyms

Dendrogram; Heat map

Definition

Visualizing clusters helps human experts evaluate, explore, and interpret the results of a cluster analysis. Clustering is an unsupervised learning technique that groups a set of n data objects D = {x1, …, xn} into clusters so that objects in the same cluster are similar to each other and objects from different clusters are dissimilar. The data can be available (i) as an (n × n) matrix of pairwise similarities (or dissimilarities) or (ii) as an (n × d) data matrix, which describes each data object by a d-dimensional vector. The second form has to be accompanied by a suitable similarity or dissimilarity measure, which computes a (dis)similarity score for a pair of d-dimensional vectors. A typical example of such a measure is the Euclidean metric. Clustering results may come in different forms: (i) as a partition of D, (ii) as a model that summarizes properties of D, and (iii) as a set of hierarchically nested partitions of D....
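As a minimal sketch of how the two data forms relate, the following Python snippet converts an (n × d) data matrix into an (n × n) dissimilarity matrix using the Euclidean metric. The function name `pairwise_euclidean` and the toy data are illustrative assumptions and are not taken from the entry; NumPy is assumed to be available.

```python
import numpy as np


def pairwise_euclidean(X):
    """Return the (n x n) matrix of Euclidean distances between the rows of X.

    X is an (n x d) data matrix: one d-dimensional vector per data object.
    (Hypothetical helper, named here for illustration only.)
    """
    X = np.asarray(X, dtype=float)
    # Broadcasting yields all pairwise difference vectors, shape (n, n, d).
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy (n = 3, d = 2) data matrix, invented for this example.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# Form (i): the symmetric (n x n) dissimilarity matrix with a zero diagonal.
D = pairwise_euclidean(X)
print(D)
```

The broadcasting approach needs memory for n × n × d values, so it is only suitable for small n; for larger data sets a dedicated routine for pairwise distances would be the more practical choice.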

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Institute of Computer Science, Martin-Luther-University Halle-Wittenberg, Halle/Saale, Germany