
Spectral clustering based on similarity and dissimilarity criterion


Abstract

The clustering assumption is that, for a given unlabeled dataset, the within-cluster similarity should be maximized while the between-cluster similarity is simultaneously minimized. This paper presents a new spectral clustering algorithm based on a similarity and dissimilarity criterion, obtained by incorporating a dissimilarity criterion into the normalized cut criterion. The within-cluster similarity and the between-cluster dissimilarity can then both be enhanced, which leads to good clustering performance. Experimental results on toy and real-world datasets show that the new spectral clustering algorithm has promising performance.



Author information


Corresponding author

Correspondence to Bangjun Wang.

Appendix A

Proof:

To prove Theorem 1, we need to show that the weight of the between-cluster similarity in \((1 - m)\,\mathbf{y}^{T}\mathbf{D}\mathbf{y} - m\,\mathbf{y}^{T}\mathbf{Q}\mathbf{y}\) is less than that in \(\mathbf{y}^{T}\mathbf{D}\mathbf{y}\). Thus, the between-cluster similarity would have less effect on the maximization of the within-cluster similarity.
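
For readability, the two denominators compared throughout the proof can be placed side by side. This restatement assumes that the objectives in (4) and (12) of the main text (not reproduced here) take the forms used below, with the dissimilarity matrix defined elementwise as in the substitution step that follows:

$$\underbrace{\mathbf{y}^{T}\mathbf{D}\mathbf{y}}_{\text{denominator of (4)}}
\qquad \text{versus} \qquad
\underbrace{(1 - m)\,\mathbf{y}^{T}\mathbf{D}\mathbf{y} - m\,\mathbf{y}^{T}\mathbf{Q}\mathbf{y}}_{\text{denominator of (12)}},
\qquad Q_{ij} = 1 - W_{ij}.$$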

Without loss of generality, assume that a graph G can be partitioned into two disjoint sub-graphs \(X_{1}\) and \(X_{2}\). Consider the continuous, relaxed form of the indicator vector \(\mathbf{y}\), and let \(y_{i} \in \{1, -b\}\) with \(b = \frac{\sum_{z_{i} > 0} D_{ii}}{\sum_{z_{i} < 0} D_{ii}}\), where \(\mathbf{z} = [z_{1}, \ldots, z_{n}]\) denotes the indicator vector. First, we expand the denominator of the objective in (4), simplify it, and have

$$\begin{aligned} \mathbf{y}^{T}\mathbf{D}\mathbf{y} &= \left( \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij} + b^{2} \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{2}} W_{ij} \right) \\ &\quad + \left( \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{2}} W_{ij} + b^{2} \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij} \right) \\ &= Sim_{w} + (1 + b^{2})\, Sim_{b} , \end{aligned}$$
(15)

where \(Sim_{w}\) is the sum of the first two terms in (15), which can be viewed as the within-cluster similarity, and \(Sim_{b} = \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij}\) denotes the between-cluster similarity. Next, we expand the denominator of the objective in (12), and have

$$\begin{aligned} & (1 - m)\,\mathbf{y}^{T}\mathbf{D}\mathbf{y} - m\,\mathbf{y}^{T}\mathbf{Q}\mathbf{y} \\ &= (1 - m)\left( \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij} \right) \\ &\quad + (1 - m)\left( b^{2} \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{2}} W_{ij} + (1 + b^{2}) \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij} \right) \\ &\quad - m\left( \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{1}} Q_{ij} + b^{2} \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{2}} Q_{ij} - 2b \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{2}} Q_{ij} \right) \end{aligned}$$
(16)

Since \(Q_{ij} = 1 - W_{ij}\), we substitute it into (16) and get

$$\begin{aligned} & (1 - m)\,\mathbf{y}^{T}\mathbf{D}\mathbf{y} - m\,\mathbf{y}^{T}\mathbf{Q}\mathbf{y} \\ &= \left( \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{1}} W_{ij} + b^{2} \sum_{\mathbf{x}_{i} \in X_{2}} \sum_{\mathbf{x}_{j} \in X_{2}} W_{ij} \right) \\ &\quad + \left( \left( 1 + b^{2} \right) - m\left( 1 + b^{2} \right) \right) \sum_{\mathbf{x}_{i} \in X_{1}} \sum_{\mathbf{x}_{j} \in X_{2}} W_{ij} \\ &= Sim_{w} + \left( \left( 1 + b^{2} \right) - m\left( 1 + b^{2} \right) \right) Sim_{b} \end{aligned}$$
(17)

Comparing (15) with (17), we can see that they differ only in the weight of the between-cluster similarity. Moreover, the following inequality

$$\left( 1 + b^{2} \right) - m\left( 1 + b^{2} \right) \le 1 + b^{2}$$
(18)

holds. The inequality becomes an equality if and only if \(m = 0\). Thus, for \(m > 0\), the between-cluster similarity in the denominator of (12) has a smaller weight than in the denominator of (4).

This completes the proof of Theorem 1.
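
The two dataset-independent facts established above, the decomposition (15) and the inequality (18), can also be checked numerically. The following NumPy sketch is illustrative only: it assumes a small symmetric toy affinity matrix with two clusters and the indicator convention \(y_{i} \in \{1, -b\}\) from the proof; the cluster sizes and affinity values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-cluster affinity matrix W (symmetric, nonnegative): X1 has n1 points, X2 has n2.
n1, n2 = 4, 6
n = n1 + n2
W = rng.uniform(0.0, 0.2, size=(n, n))   # weak background similarity
W[:n1, :n1] += 0.7                        # strong within-cluster similarity in X1
W[n1:, n1:] += 0.7                        # strong within-cluster similarity in X2
W = (W + W.T) / 2.0                       # symmetrize
np.fill_diagonal(W, 1.0)

d = W.sum(axis=1)                         # degrees D_ii
D = np.diag(d)

# Indicator y_i in {1, -b} with b = sum_{i in X1} D_ii / sum_{i in X2} D_ii, as in the proof.
b = d[:n1].sum() / d[n1:].sum()
y = np.concatenate([np.ones(n1), -b * np.ones(n2)])

# Within- and between-cluster similarity terms appearing in (15).
Sim_w = W[:n1, :n1].sum() + b**2 * W[n1:, n1:].sum()
Sim_b = W[n1:, :n1].sum()

# Decomposition (15): y^T D y = Sim_w + (1 + b^2) Sim_b.
lhs = y @ D @ y
rhs = Sim_w + (1.0 + b**2) * Sim_b
assert np.isclose(lhs, rhs), (lhs, rhs)

# Inequality (18): (1 + b^2) - m (1 + b^2) <= 1 + b^2 for any m >= 0,
# i.e. the between-cluster similarity receives a smaller (or equal) weight.
for m in np.linspace(0.0, 1.0, 11):
    assert (1.0 + b**2) - m * (1.0 + b**2) <= (1.0 + b**2) + 1e-12

print("decomposition (15) and inequality (18) verified on the toy example")
```

For any symmetric \(\mathbf{W}\), the decomposition (15) holds exactly, and (18) holds for every \(m \ge 0\), so the check passes regardless of the random seed.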


Cite this article

Wang, B., Zhang, L., Wu, C. et al. Spectral clustering based on similarity and dissimilarity criterion. Pattern Anal Applic 20, 495–506 (2017). https://doi.org/10.1007/s10044-015-0515-x
