Abstract
Semi-supervised clustering algorithms aim to discover the hidden structure of data sets with the help of expert knowledge, generally expressed as constraints on the data such as class labels or pairwise relations. Most of the time, the expert is treated as an oracle that provides only correct constraints. This paper focuses on the case where some label constraints are erroneous and investigates in more detail three semi-supervised fuzzy c-means clustering approaches, since they have been tailored to naturally handle uncertainty in the expert labeling. In order to run a fair comparison between the existing algorithms, we propose formal improvements that guarantee and speed up their convergence. Experiments conducted on real and synthetic data sets, under uncertain labels and noise in the constraints, show the effectiveness of fuzzy clustering algorithms for noisy semi-supervised clustering.
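To make the setting concrete, the following is a minimal NumPy sketch of a partially supervised fuzzy c-means with noisy labels. It is an illustration under stated assumptions, not any of the three variants compared in the paper: the fuzzifier is fixed at 2, the distance is Euclidean, each labeled point carries a single crisp class, and the objective adds a penalty weighted by `alpha` on the gap between memberships and labels. The names `ss_fcm`, `flip_labels`, `alpha`, and the toy data are all hypothetical.

```python
import numpy as np


def ss_fcm(X, labels, n_clusters, alpha=1.0, n_iter=200, tol=1e-6):
    """Partially supervised fuzzy c-means sketch (fuzzifier fixed at 2).

    labels: int array of length n; -1 marks an unlabeled point.
    Assumes every cluster receives some weight (e.g. unlabeled points exist).
    Returns (U, V): memberships (n x c) and cluster centers (c x d).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    idx = np.flatnonzero(labels >= 0)
    b = np.zeros((n, 1))
    b[idx] = 1.0                          # 1 for labeled points, 0 otherwise
    F = np.zeros((n, n_clusters))
    F[idx, labels[idx]] = 1.0             # one-hot encoding of the given labels
    # Start from the labels where available, uniform memberships elsewhere.
    U = np.full((n, n_clusters), 1.0 / n_clusters)
    U[idx] = F[idx]
    for _ in range(n_iter):
        # Center update: weighted mean with weights u^2 + alpha * (u - b*f)^2.
        W = U ** 2 + alpha * (U - b * F) ** 2
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances, offset to avoid division by zero.
        D2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = 1.0 / D2
        fcm = inv / inv.sum(axis=1, keepdims=True)   # plain FCM memberships
        # Membership update: plain FCM for unlabeled points, a blend of the
        # FCM term and the label for labeled points.
        U_new = ((1.0 + alpha * (1.0 - b)) * fcm + alpha * b * F) / (1.0 + alpha)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


def flip_labels(labels, rate, n_clusters, rng):
    """Move a fraction `rate` of the known labels to a different class."""
    noisy = labels.copy()
    idx = np.flatnonzero(labels >= 0)
    n_flip = int(round(rate * idx.size))
    chosen = rng.choice(idx, size=n_flip, replace=False)
    shift = rng.integers(1, n_clusters, size=n_flip)  # never 0, so class changes
    noisy[chosen] = (noisy[chosen] + shift) % n_clusters
    return noisy


# Toy data: two Gaussian blobs, 20% of the points labeled, 25% of those flipped.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(8.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)
labels = np.full(200, -1)
known = rng.choice(200, size=40, replace=False)
labels[known] = y[known]
noisy = flip_labels(labels, 0.25, 2, rng)

U, V = ss_fcm(X, noisy, 2)
unlabeled = labels < 0
print("accuracy on unlabeled points:", (U.argmax(axis=1) == y)[unlabeled].mean())
```

In this sketch a labeled point keeps a membership of at least `alpha / (1 + alpha)` in its given class, so `alpha` controls how strongly a possibly erroneous label can override the geometry of the data; treating `alpha` as a trust parameter is a design choice of the sketch, not a claim about the paper's algorithms.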
Notes
1. Available at http://archive.ics.uci.edu/ml.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Antoine, V., Labroche, N. (2018). Semi-supervised Fuzzy c-Means Variants: A Study on Noisy Label Supervision. In: Medina, J., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations. IPMU 2018. Communications in Computer and Information Science, vol 854. Springer, Cham. https://doi.org/10.1007/978-3-319-91476-3_5
DOI: https://doi.org/10.1007/978-3-319-91476-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91475-6
Online ISBN: 978-3-319-91476-3
eBook Packages: Computer Science, Computer Science (R0)