Abstract
Fuzzy \(c\)-means (FCM) is a well-known partitional clustering method that allows an object to belong to two or more clusters, each with a membership grade between zero and one. Because the membership grade matrix conveys rich information, FCM has recently been widely used in real-world application domains where well-separated clusters are typically not available. In addition, the simple centroid-based iterative procedure of FCM is appealing when dealing with large-volume data.
References
Banerjee, A., Merugu, S., Dhillon, I., Ghosh, J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6, 1705–1749 (2005)
Bezdek, J.: A convergence theorem for the fuzzy isodata clustering algorithms. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-2(1), 1–8 (1980)
Bezdek, J.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)
Bezdek, J., Hathaway, R., Huggins, V.: Parametric estimation for normal mixtures. Pattern Recognit. Lett. 3, 79–84 (1985)
Bobrowski, L., Bezdek, J.: C-means clustering with the \(l_1\) and \(l_\infty \) norms. IEEE Trans. Syst. Man Cybern. 21(3), 545–554 (1991)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Bregman, L.: The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7, 200–217 (1967)
Chen, L., Ng, R.: On the marriage of lp-norms and edit distance. In: Proceedings of the 13th International Conference on Very Large Data Bases, pp. 792–803. Toronto, Canada (2004)
Davé, R.: Characterization and detection of noise in clustering. Pattern Recognit. Lett. 12(11), 657–664 (1991)
Dunn, J.: A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters. J. Cybern. 3, 32–57 (1973)
Flanders, H.: Differential Forms with Applications to the Physical Sciences. Dover Publications, New York (1989)
Golub, G., van Loan, C.: Matrix Computations. Johns Hopkins, Baltimore (1996)
Groll, L., Jakel, J.: A new convergence proof of fuzzy \(c\)-means. IEEE Trans. Fuzzy Syst. 13(5), 717–720 (2005)
Hathaway, R., Bezdek, J.: Local convergence of the fuzzy \(c\)-means algorithms. Pattern Recognit. 19, 477–480 (1986)
Hathaway, R., Bezdek, J.: Recent convergence results for the fuzzy c-means clustering algorithms. J. Classif. 5, 237–247 (1988)
Hathaway, R., Bezdek, J., Tucker, W.: An improved convergence theory for the fuzzy c-means clustering algorithms. In: Bezdek, J. (ed.) Analysis of Fuzzy Information, vol. 3, pp. 123–131. CRC Press, Boca Raton (1987)
Hathaway, R.J., Bezdek, J.C., Hu, Y.: Generalized fuzzy c-means clustering strategies using \(l_p\) norm distances. IEEE Trans. Fuzzy Syst. 8(5), 576–582 (2000)
Honda, K., Notsu, A., Ichihashi, H.: Fuzzy pca-guided robust k-means clustering. IEEE Trans. Fuzzy Syst. 18(1), 67–79 (2010)
Hoppner, F., Klawonn, F.: A contribution to convergence theory of fuzzy c-means and derivatives. IEEE Trans. Fuzzy Syst. 11(5), 682–694 (2003)
Hoppner, F., Klawonn, F., Kruse, R., Runkler, T.: Fuzzy Cluster Analysis. Wiley, New York (1999)
Ismail, M., Selim, S.: Fuzzy \(c\)-means: optimality of solutions and effective termination of the algorithm. Pattern Recognit. 19, 481–485 (1984)
Jajuga, K.: \(l_1\) norm-based fuzzy clustering. Fuzzy Sets Syst. 39, 43–50 (1991)
Karoubi, M., Leruste, C.: Algebraic Topology via Differential Geometry. Cambridge University Press, Cambridge (1987)
Kersten, P.: Fuzzy order statistics and their application to fuzzy clustering. IEEE Trans. Fuzzy Syst. 7, 708–712 (1999)
Kim, T., Bezdek, J., Hathaway, R.: Optimality tests for fixed points of the fcm algorithm. Pattern Recognit. 21(6), 651–663 (1988)
Klawonn, F., Keller, A.: Fuzzy clustering based on modified distance measures. In: Proceedings of the 3rd International Symposium on Advances in Intelligent Data Analysis, pp. 291–302 (1999)
Leski, J.M.: Generalized weighted conditional fuzzy clustering. IEEE Trans. Fuzzy Syst. 11(6), 709–715 (2003)
Li, R., Mukaidono, M.: A maximum entropy approach to fuzzy clustering. In: Proceedings of the 4th IEEE International Conference on Fuzzy Systems, pp. 2227–2232. Yokohama, Japan (1995)
Luenberger, D., Ye, Y.: Linear and Nonlinear Programming, 3rd edn. Springer, New York (2008)
MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297 (1967)
Meila, M.: Comparing clusterings: an axiomatic view. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 577–584 (2005)
Menard, M., Courboulay, V., Dardignac, P.: Possibilistic and probabilistic fuzzy clustering: Unification within the framework of the nonextensive thermostatistics. Pattern Recognit. 36(6), 1325–1342 (2003)
Miyamoto, S., Agusta, Y.: An efficient algorithm for \(l_1\) fuzzy \(c\)-means and its termination. Control Cybern. 24(4), 421–436 (1995)
Miyamoto, S., Ichihashi, H., Honda, K.: Algorithms for Fuzzy Clustering: Methods in c-Means Clustering with Applications. Springer, Berlin (2008)
Miyamoto, S., Umayahara, K.: Fuzzy clustering by quadratic regularization. In: Proceedings of the 7th IEEE International Conference on Fuzzy Systems, pp. 394–1399 (1998)
Ohashi, Y.: Fuzzy clustering and robust estimation. In: Proceedings of the 9th SAS Users Group International Meeting. Hollywood Beach, FL, USA (1984)
Pedrycz, W.: Conditional fuzzy \(c\)-means. Pattern Recognit. Lett. 17, 625–632 (1996)
Pedrycz, W., Loia, V., Senatore, S.: Fuzzy clustering with viewpoints. IEEE Trans. Fuzzy Syst. 18(2), 274–284 (2010)
Rose, K., Gurewitz, E., Fox, G.: A deterministic annealing approach to clustering. Pattern Recognit. Lett. 11, 589–594 (1990)
Selim, S., Ismail, M.: On the local optimality of the fuzzy isodata clustering algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 8, 284–288 (1986)
Sledge, I., Bezdek, J., Havens, T., Keller, J.: Relational generalizations of cluster validity indices. IEEE Trans. Fuzzy Syst. 18(4), 771–786 (2010)
Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Addison-Wesley, Boston (2005)
Teboulle, M.: A unified continuous optimization framework for center-based clustering methods. J. Mach. Learn. Res. 8, 65–102 (2007)
Tucker, W.: Counterexamples to the convergence theorem for fuzzy isodata clustering algorithm. In: Bezdek, J. (ed.) Analysis of Fuzzy Information, vol. 3, pp. 109–122. CRC Press, Boca Raton (1987)
Wu, J., Xiong, H., Chen, J.: Adapting the right measures for k-means clustering. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 877–886 (2009)
Wu, J., Xiong, H., Liu, C., Chen, J.: A generalization of distance functions for fuzzy \(c\)-means clustering with centroids of arithmetic means. IEEE Trans. Fuzzy Syst. (Forthcoming, 2012)
Wu, K., Yang, M.: Alternative \(c\)-means clustering algorithms. Pattern Recognit. 35, 2267–2278 (2002)
Yang, M.: On a class of fuzzy classification maximum likelihood procedures. Fuzzy Sets Syst. 57, 365–375 (1993)
Yi, B.K., Faloutsos, C.: Fast time sequence indexing for arbitrary lp norms. In: Proceedings of the 26th International Conference on Very Large Data Bases, pp. 792–803. Cairo, Egypt (2000)
Yu, J., Yang, M.S.: Optimality test for generalized fcm and its application to parameter selection. IEEE Trans. Fuzzy Syst. 13(1), 164–176 (2005)
Yu, J., Yang, M.S.: A generalized fuzzy clustering regularization model with optimality tests and model complexity analysis. IEEE Trans. Fuzzy Syst. 15(5), 904–915 (2007)
Zangwill, W.: Nonlinear Programming: A Unified Approach. Prentice-Hall, Englewood Cliffs (1969)
Zhao, Y., Karypis, G.: Criterion functions for document clustering: experiments and analysis. Mach. Learn. 55(3), 311–331 (2004)
Appendix
As previously mentioned in Sect. 3.4, the degeneration of \(\varvec{U}\) may happen for GD-FCM using \(f^\phi _{II}\), although the probability of occurrence is extremely low for real-world data. Here, we provide a solution for this degeneration case and show how the global convergence of GD-FCM can still be guaranteed.
During a GD-FCM iteration, assume that we get \((\varvec{U}^{de},\varvec{V}^{de})=T_{mf}(\bar{\varvec{U}},\bar{\varvec{V}})\), where \(\varvec{U}^{de}\) is degenerate but \(\bar{\varvec{U}}\) is nondegenerate. Without loss of generality, suppose there is only one \(r\) such that \(u^{de}_{rk}=0~\forall ~k\). Then, we let \(\hat{\varvec{V}}=(\hat{\varvec{v}}_1,\hat{\varvec{v}}_2,\ldots ,\hat{\varvec{v}}_c)^T\) be
\[\hat{\varvec{v}}_l=\begin{cases}\varvec{v}^{de}_l, & l\ne r,\\ \varvec{x}, & l=r,\end{cases}\]
where \(\varvec{x}\in \mathrm{Conv}(\fancyscript{X})\).
Next, we try to resume the iteration by having \(\hat{\varvec{U}}=F(\hat{\varvec{V}})\). If \(\hat{\varvec{U}}\) is nondegenerate, then we define \((\hat{\varvec{U}},\hat{\varvec{V}})\doteq T_{mf}(\bar{\varvec{U}},\bar{\varvec{V}})\), and resume the iteration based on \((\hat{\varvec{U}},\hat{\varvec{V}})\). Otherwise, we keep choosing a new \(\varvec{x}\in \mathrm{Conv}(\fancyscript{X})\) for \(\hat{\varvec{v}}_r\) until we have a nondegenerate \(\hat{\varvec{U}}=F(\hat{\varvec{V}})\). Typically, we choose \(\varvec{x}\) from \(\fancyscript{X}\) to ensure that \(\exists ~k\) such that \(\hat{u}_{rk}\ne 0\).
Now, we establish the descent theorem when there is degeneration. Assume that \((\bar{\varvec{U}},\bar{\varvec{V}})\) is not in \(\Omega ^{\prime }\) of Eq. (3.37). Since \(u^{de}_{rk}=0~\forall ~k\), the \(r\)th centroid does not affect the objective value, so we have \(J_{mf}(\varvec{U}^{de},\varvec{V}^{de})= J_{mf}(\varvec{U}^{de},\hat{\varvec{V}})\ge J_{mf}(\hat{\varvec{U}},\hat{\varvec{V}})\). Furthermore, by Theorem 3.6, we have \(J_{mf}(\bar{\varvec{U}},\bar{\varvec{V}})>J_{mf}(\varvec{U}^{de},\varvec{V}^{de})\), which implies that \(J_{mf}(\bar{\varvec{U}},\bar{\varvec{V}})>J_{mf}(\hat{\varvec{U}},\hat{\varvec{V}})\). The descent theorem therefore holds. In addition, since we can “skip” the degenerate solution by jumping from \((\bar{\varvec{U}},\bar{\varvec{V}})\) to \((\hat{\varvec{U}},\hat{\varvec{V}})\), we still have \(T_{mf}:M_{fc}\times \fancyscript{S}^c\mapsto M_{fc}\times \fancyscript{S}^c\), and the closedness of \(T_{mf}\) and the compactness of \(M_{fc}\times \mathrm{Conv}(\fancyscript{X})^c\) still hold. By assembling the above results, we again get the global convergence by Zangwill’s convergence theorem.
Finally, if we cannot find any \(\varvec{x}\in \mathrm{Conv}(\fancyscript{X})\) for \(\hat{\varvec{v}}_r\) such that \(\hat{\varvec{U}}=F(\hat{\varvec{V}})\) is nondegenerate, we simply return \((\bar{\varvec{U}},\bar{\varvec{V}})\) as the solution.
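The repair procedure described in this appendix can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used for GD-FCM: the function names `degenerate_rows`, `repair_degenerate`, and `crisp_memberships` are hypothetical, the membership update \(F\) is passed in as a callable, and candidates for \(\hat{\varvec{v}}_r\) are drawn only from the data points in \(\fancyscript{X}\) rather than from all of \(\mathrm{Conv}(\fancyscript{X})\). Because \(f^\phi _{II}\) is not reproduced here, a crisp nearest-centroid assignment serves as a stand-in for \(F\) that can actually produce an empty cluster.

```python
import numpy as np


def degenerate_rows(U):
    """Indices r with u_rk = 0 for every k, i.e. clusters with no membership."""
    return np.flatnonzero(~U.any(axis=1))


def repair_degenerate(X, V_de, F):
    """Try to replace the centroid of a degenerate cluster by a data point.

    X    : (n, d) float array of data points
    V_de : (c, d) float array of centroids with degenerate U = F(X, V_de)
    F    : callable (X, V) -> (c, n) membership matrix (hypothetical stand-in)

    Returns (U_hat, V_hat), or None if no data point removes the degeneracy,
    in which case the caller keeps the previous iterate (U_bar, V_bar).
    """
    U_de = F(X, V_de)
    empty = degenerate_rows(U_de)
    if empty.size == 0:
        return U_de, V_de.copy()      # nothing to repair
    r = empty[0]                      # the text assumes a single such r
    for x in X:                       # assumption: candidates are data points only
        V_hat = V_de.copy()
        V_hat[r] = x                  # v_hat_l = v_de_l for l != r, v_hat_r = x
        U_hat = F(X, V_hat)
        if degenerate_rows(U_hat).size == 0:
            return U_hat, V_hat
    return None


def crisp_memberships(X, V):
    """Stand-in for F: each point gets membership 1 in its nearest cluster."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)   # shape (c, n)
    U = np.zeros_like(d2)
    U[d2.argmin(axis=0), np.arange(X.shape[0])] = 1.0
    return U


# Toy usage: the third centroid lies far from all data, so its cluster is empty.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
V_demo = np.array([[0.0, 0.5], [10.0, 10.5], [100.0, 100.0]])
repaired = repair_degenerate(X_demo, V_demo, crisp_memberships)
```

Returning `None` when no candidate works mirrors the final fallback above, where \((\bar{\varvec{U}},\bar{\varvec{V}})\) is returned as the solution.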
© 2012 Springer-Verlag Berlin Heidelberg
Wu, J. (2012). Generalizing Distance Functions for Fuzzy c-Means Clustering. In: Advances in K-means Clustering. Springer Theses. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29807-3_3
DOI: https://doi.org/10.1007/978-3-642-29807-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29806-6
Online ISBN: 978-3-642-29807-3
eBook Packages: Computer Science (R0)