On the Robustness of Kernel-Based Clustering
This paper evaluates the robustness of two unsupervised learning methods that operate in feature spaces induced by a kernel function: kernel k-means and kernel symmetric non-negative matrix factorization. The main hypothesis is that non-linear kernels make these clustering algorithms more robust to noise and outliers. The hypothesis is tested by applying kernel and non-kernel versions of the algorithms to data contaminated with different degrees of noise. The results show that the kernel versions of the clustering algorithms are indeed more robust, i.e., they produce estimates with lower bias in the presence of noise.
Keywords: Cluster Algorithm, Gaussian Kernel, Kernel Method, Robust Statistic, Probabilistic Latent Semantic Analysis
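To make the idea of clustering in a kernel-induced feature space concrete, the following is a minimal sketch of kernel k-means with a Gaussian kernel. It is not the paper's implementation: the function names gaussian_kernel and kernel_kmeans, the bandwidth parameter sigma, the random initialization, and the toy data are all assumptions chosen for illustration. The sketch relies only on the standard identity that the squared feature-space distance from a point to a cluster mean can be computed from kernel values alone.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix; sigma is an assumed bandwidth."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Illustrative kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)  # random initial assignment
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # empty cluster: leave its distances at infinity
            # ||phi(x_i) - mu_c||^2
            #   = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_{j,l} K_jl,
            # with j, l ranging over the members of cluster c.
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

# Toy usage: two Gaussian blobs plus a few uniformly scattered outliers.
rng = np.random.default_rng(1)
blobs = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
                   rng.normal(5.0, 0.3, (20, 2))])
outliers = rng.uniform(-10.0, 15.0, (4, 2))
X = np.vstack([blobs, outliers])
labels = kernel_kmeans(gaussian_kernel(X, sigma=1.0), k=2)
```

Because the Gaussian kernel value between a distant outlier and every other point is close to zero, such a point contributes little to the kernel sums that define each cluster mean. This is one intuition for the hypothesis stated in the abstract, not a substitute for the paper's experiments.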