Abstract
The clustering assumption is to maximize the within-cluster similarity and simultaneously to minimize the between-cluster similarity for a given unlabeled dataset. This paper deals with a new spectral clustering algorithm based on a similarity and dissimilarity criterion by incorporating a dissimilarity criterion into the normalized cut criterion. The within-cluster similarity and the between-cluster dissimilarity can be enhanced to result in good clustering performance. Experimental results on toy and real-world datasets show that the new spectral clustering algorithm has a promising performance.
Similar content being viewed by others
References
Barreto A, Araujo AA, Kremer S (2003) A taxonomy for spatiotemporal connectionist networks revisited: the unsupervised case. Neural Comput 15:1255–1320
Bezdek JC, Pal RN (1998) Some new indexes of cluster validity. IEEE Trans Pattern Recognit Mach Intell 28(3):301–315
Chen W, Feng G (2012) Spectral clustering: a semisupervised approach. Neurocomputing 77(1):229–242
Chen WY, Song Y, Bai H, Lin C-J, Chang E (2011) Parallel spectral clustering in distributed systems. IEEE Trans Pattern Recognit Mach Intell 33(3):568–586
Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Recognit Mach Intell 1(4):224–227
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
CHQ Ding, X He, H Zha, M Gu, HD Simon (2001) A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of the first IEEE International Conference on Data Mining (ICDM), Washington. DC, USA, pp 107–114
Duda R, Hart P, Stork D (2000) Pattern classification. Wiley-Interscience, London
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701
Graham DB, Allinson NM (1998) Characterizing virtual Eigensignatures for general purpose face recognition. Face Recognit: From Theory Appl, NATO ASI Ser F, Comput Syst Sci 163:446–456
Hagen L, Kahng AB (1992) New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput Aided Des Integr Circuits Syst 11(9):1074–1085
Lago-Fernández LF, Corbacho F (2010) Normality-based validation for crisp clustering. Pattern Recogn 43(3):782–795
Z Lu, M Carreira-Perpinan (2008) Constrained spectral clustering through affinity propagation. In: Proceedings of CVPR, Anchorage, Alaska, USA, pp 1–8
Lu H, Fu Z, Shu X (2014) Non-negative and sparse spectral clustering. Pattern Recogn 47(1):418–426
Luxburg U (2007) A tutorial on spectral clustering. Statistics and computing 17(4):395–416
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Serkar S, Soundararajan P (2000) Supervised learning of large perceptual organization: graph spectral partitioning and learning automata. IEEE Trans Pattern Anal Mach Intell 22(5):504–525
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Wacquet G, Caillault EP, Hamad D, H´ebert P-A (2013) Constrained spectral embedding for k-way data clustering. Pattern Recogn Lett 34(9):1009–1017
X Wang, I Davidson (2010) Flexible constrained spectral clustering. In: The 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington DC, USA, pp 563–572
Wang S, Siskind J (2003) Image segmentation with ratio cut. IEEE Trans Pattern Anal Mach Intell 25(6):675–690
Wu Z, Leahy R (1993) An optimal graph theoretic approach to data clustering: theory and its application to image segmentation. IEEE Trans Pattern Anal Mach Intell 15(11):1101–1113
AY Ng, MI Jordan (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems. Vancouver, British Columbia, Canada, pp 849–856
L Zelnik-Manor, P Perona (2004) Self-tuning spectral clustering. In: Saul LK, Weiss Y, Bottou L (eds) The 18th annual conference on neural information processing systems, Vancouver, British Columbia, Canada, pp 1601–1608
HS Zou, WD Zhou, L Zhang, CL Wu, RC Liu, LC Jiao (2009) A new constrained spectral clustering for sar image segmentation. In: Proceedings 2009 2nd Asian-Pacific Conference on Synthetic Aperture Radar, Xian, China, pp 680–683
Author information
Authors and Affiliations
Corresponding author
Appendix A
Appendix A
Proof:
To prove Theorem 1, we need to prove that the weight of the between-cluster similarity in (1 − m)y T Dy − m y T Qy is less than that in y T Dy. Thus, the between-cluster similarity would have less effect on the maximization of the within-cluster similarity.
Without loss of generality, assume that a graph G can be partitioned into two disjoint sub-graphs X 1 and X 2. Consider the continuous and loose form of the indicator vector y, and let y i ∊ {1, −b} with \(b = \frac{{\sum\nolimits_{{z_{i} > 0}} {D_{i} } }}{{\sum\nolimits_{{z_{i} < 0}} {D_{ii} } }}\) and indicator vector z = [z 1,…, z n ]. First, we expand the denominator of the objective in (4) and simplify it, and have
where Sim w is the sum of the first two terms in (15) which can be viewed as the within-cluster similarity, and \(Sim_{b} = \sum\nolimits_{{x_{i} \in X_{2} }} {\sum\nolimits_{{x_{j} \in X_{1} }} {W_{ij} } }\) denotes the between-cluster similarity. Next, we expand the denominator of the objective in (12), and have
Since Q ij = 1 − W ij , we substitute it into (16) and get
Comparing (15) with (17), we can see that the difference between them is the weight of the between-cluster similarity. In addition, it is obvious that the following inequality
holds true. If and only if m = 0, the inequality becomes equality.Thus, the between-cluster similarity in the denominator of (12) has a smaller weight than that of (4).
This completes the proof of Theorem 1.
Rights and permissions
About this article
Cite this article
Wang, B., Zhang, L., Wu, C. et al. Spectral clustering based on similarity and dissimilarity criterion. Pattern Anal Applic 20, 495–506 (2017). https://doi.org/10.1007/s10044-015-0515-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10044-015-0515-x