Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Clustering Validity

  • Michalis Vazirgiannis
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_616


Cluster stability; Cluster validation; Quality assessment; Stability-based validation of clustering


A central problem in clustering is deciding on the optimal partitioning of the data into clusters. In this context, visualizing the data set is a crucial means of verifying the clustering results. For large multidimensional data sets (e.g., more than three dimensions), however, effective visualization is cumbersome. Moreover, perceiving clusters with the available visualization tools is difficult for people who are not accustomed to higher-dimensional spaces. The procedure of evaluating the results of a clustering algorithm is known as cluster validity. Cluster validity comprises a set of techniques for finding the set of clusters that best fits the natural partitions of a given data set without any a priori class information. The outcome of the clustering process is validated by a cluster validity index.
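To make the idea of a validity index concrete, here is a minimal sketch in Python. The entry itself does not name a specific index at this point, so the choice of the Dunn index (smallest between-cluster distance divided by the largest cluster diameter) is an assumption made purely for illustration, as are the function name `dunn_index` and the toy data. Higher values suggest compact, well-separated clusters.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(points, labels):
    """Illustrative validity index (an assumption, not the entry's method):
    smallest between-cluster distance / largest within-cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    # Largest distance between two points that share a cluster.
    diameter = max(
        (dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points in different clusters.
    separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    # All clusters are single points (or duplicates): perfectly compact.
    if diameter == 0.0:
        return float("inf")
    return separation / diameter


# Hypothetical toy data: two visually obvious groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
natural = [0, 0, 0, 1, 1, 1]   # partition matching the two groups
mixed = [0, 0, 1, 1, 1, 0]     # partition that mixes the groups

# Without any class labels, the index ranks the natural partition higher.
best = max([natural, mixed], key=lambda labels: dunn_index(points, labels))
```

In this sketch the index is used to compare candidate partitions of the same data and keep the one with the highest score, which mirrors the role the text assigns to a validity index. The pairwise computation is quadratic in the number of points, so it is only suitable for small examples.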



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Athens University of Economics and Business, Athens, Greece

Section editors and affiliations

  • Dimitrios Gunopulos
  1. Department of Computer Science and Engineering, The University of California at Riverside, Bourns College of Engineering, Riverside, USA