Abstract
Fuzzy c-means clustering and its derivatives have been very successful on many clustering problems. However, fuzzy c-means and similar algorithms run into problems with high-dimensional data sets and with a large number of prototypes. In particular, we discuss hard c-means, noise clustering, and fuzzy c-means with a polynomial fuzzifier function together with its noise variant. A special test data set that is optimal for clustering is used to show the weaknesses of these algorithms in high dimensions. We also show that a high number of prototypes influences the clustering procedure in much the same way as a high number of dimensions. Finally, we show that the negative effects of high-dimensional data sets can be reduced by adjusting the parameter of the algorithms, i.e., the fuzzifier, depending on the number of dimensions.
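To make the effect described in the abstract concrete, the following minimal sketch computes the standard fuzzy c-means membership degrees, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), and reports how crisp they are as the number of dimensions grows. It is an illustration under stated assumptions, not the chapter's experiment: the uniform random data, the function names `fcm_memberships` and `mean_max_membership`, and all parameter defaults are hypothetical choices made here, and the chapter's own test data set is not reproduced.

```python
import math
import random


def fcm_memberships(point, prototypes, m=2.0):
    """Standard fuzzy c-means membership degrees of one point.

    u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)), with Euclidean distances d_i
    and fuzzifier m > 1.
    """
    dists = [math.dist(point, p) for p in prototypes]
    # A point that coincides with one or more prototypes is shared among them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(dists))]
    exponent = 2.0 / (m - 1.0)
    # Scale by the smallest distance so every weight lies in (0, 1];
    # this is algebraically the same formula but avoids overflow.
    d_min = min(dists)
    weights = [(d_min / d) ** exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def mean_max_membership(dim, n_prototypes=4, n_points=200, m=2.0, seed=0):
    """Average largest membership for uniform random points in [0, 1]^dim.

    Assumption for illustration only: prototypes and points are both drawn
    uniformly at random; no clustering is actually run.
    """
    rng = random.Random(seed)
    prototypes = [[rng.random() for _ in range(dim)]
                  for _ in range(n_prototypes)]
    total = 0.0
    for _ in range(n_points):
        x = [rng.random() for _ in range(dim)]
        total += max(fcm_memberships(x, prototypes, m))
    return total / n_points


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(mean_max_membership(dim), 3))
```

In this sketch, distances to all prototypes become nearly equal as the dimension grows, so every membership drifts toward 1/c and the prototypes receive almost no distinguishing signal. Lowering the fuzzifier `m` sharpens the memberships again, which is the kind of dimension-dependent adjustment the abstract refers to; the specific rule used in the chapter is not shown here.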
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Winkler, R., Klawonn, F., Kruse, R. (2012). Problems of Fuzzy c-Means Clustering and Similar Algorithms with High Dimensional Data Sets. In: Gaul, W., Geyer-Schulz, A., Schmidt-Thieme, L., Kunze, J. (eds) Challenges at the Interface of Data Analysis, Computer Science, and Optimization. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24466-7_9
DOI: https://doi.org/10.1007/978-3-642-24466-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24465-0
Online ISBN: 978-3-642-24466-7
eBook Packages: Mathematics and Statistics (R0)