Abstract
Given a pairwise dissimilarity matrix D of a set of n objects, visual methods for cluster tendency assessment (such as VAT) generally represent D as an n×n image \(\mathrm{I}(\tilde{\bf D})\), in which the objects are reordered so that hidden cluster structure appears as dark blocks along the diagonal. A major limitation of such methods is their inability to highlight cluster structure in \(\mathrm{I}(\tilde{\bf D})\) when D contains highly complex clusters. To address this problem, this paper proposes an improved VAT (iVAT) method that combines a path-based distance transform with VAT. An automated VAT (aVAT) method is also proposed to determine the number of clusters from \(\mathrm{I}(\tilde{\bf D})\) automatically. Experimental results on several synthetic and real-world data sets demonstrate the effectiveness of both methods.
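The abstract does not spell out either step, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the path-based transform is the minimax path distance (the smallest achievable "largest hop" over all paths between two objects) and that the reordering is the Prim-like nearest-neighbour ordering commonly described for VAT. The function names `vat_order`, `minimax_path_transform`, and `reorder` are hypothetical, and the cubic-time transform is chosen for clarity rather than speed.

```python
def minimax_path_transform(D):
    """Assumed path-based transform: P[i][j] is the minimum, over all paths
    from i to j, of the largest dissimilarity on the path (Floyd-Warshall style)."""
    n = len(D)
    P = [list(row) for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(P[i][k], P[k][j])
                if via_k < P[i][j]:
                    P[i][j] = via_k
    return P


def vat_order(D):
    """Assumed VAT-style ordering: seed with an endpoint of the largest
    dissimilarity, then repeatedly append the unordered object that is
    closest to any object already in the ordering."""
    n = len(D)
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = [j for j in range(n) if j != start]
    # best[j]: smallest dissimilarity from j to any already-ordered object.
    best = list(D[start])
    while remaining:
        nxt = min(remaining, key=lambda j: best[j])
        order.append(nxt)
        remaining.remove(nxt)
        best = [min(b, d) for b, d in zip(best, D[nxt])]
    return order


def reorder(D, order):
    """Permute rows and columns of D by the same ordering."""
    return [[D[i][j] for j in order] for i in order]


# Toy data (illustrative only): two interleaved groups on a line.
x = [0, 10, 1, 11, 2]
D = [[abs(a - b) for b in x] for a in x]
P = minimax_path_transform(D)
order = vat_order(P)
image = reorder(P, order)
```

On this toy input the reordered matrix has two diagonal blocks of small values separated by large off-diagonal entries, which is the block structure the abstract describes as dark blocks once the matrix is rendered as a grayscale image.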
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, L., Nguyen, U.T.V., Bezdek, J.C., Leckie, C.A., Ramamohanarao, K. (2010). iVAT and aVAT: Enhanced Visual Analysis for Cluster Tendency Assessment. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_5
DOI: https://doi.org/10.1007/978-3-642-13657-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13656-6
Online ISBN: 978-3-642-13657-3
eBook Packages: Computer Science, Computer Science (R0)