Semi-Supervised Kernel Clustering with Sample-to-Cluster Weights
Collecting unlabelled data is often effortless, while labelling it can be difficult: either the amount of data is too large, or samples cannot be assigned a specific class label with certainty. In semi-supervised clustering, the aim is to set the cluster centres close to their label-matching samples and to unlabelled samples. Kernel-based clustering methods are known to improve clustering results by clustering in feature space. In this paper we propose a semi-supervised kernel-based clustering algorithm that convergently minimizes an error function with sample-to-cluster weights. These sample-to-cluster weights are set according to the class label, i.e. matching, non-matching or unlabelled. The algorithm can be used with many kernel-based clustering methods; we suggest Kernel Fuzzy C-Means, Relational Neural Gas and Kernel K-Means. We empirically evaluate the performance of this algorithm on two real-life datasets, namely Steel Plates Faults and MiniBooNE.
Keywords: Class Label, Sample Label, Normalize Mutual Information, Cluster Assignment, Cluster Label
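The abstract does not state the error function itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hard-assignment, Kernel K-Means style variant in which every cluster carries a class label, each sample i gets a weight w_ik toward cluster k (matching, non-matching or unlabelled), and the centres are weighted means in feature space. The weight values, the `UNLABELLED` marker, the function names and the initialisation from labelled samples are all hypothetical choices made for illustration, and the sketch makes no convergence claim.

```python
import numpy as np

UNLABELLED = -1  # hypothetical marker for samples without a class label


def rbf_kernel(X, gamma=0.1):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def sample_to_cluster_weights(y, cluster_labels,
                              w_match=2.0, w_unlabelled=1.0, w_mismatch=0.1):
    """Weight matrix W[i, k] chosen by label relation (values are assumptions)."""
    y = np.asarray(y)[:, None]
    c = np.asarray(cluster_labels)[None, :]
    W = np.where(y == c, w_match, w_mismatch)
    return np.where(y == UNLABELLED, w_unlabelled, W)


def feature_space_distances(K, C):
    """Squared distances ||phi(x_i) - m_k||^2 with m_k = sum_j C[j, k] phi(x_j)."""
    cross = K @ C                            # <phi(x_i), m_k>
    centre_norm = np.sum(C * cross, axis=0)  # <m_k, m_k>
    return np.diag(K)[:, None] - 2.0 * cross + centre_norm[None, :]


def weighted_kernel_kmeans(K, y, cluster_labels, n_iter=50):
    """Alternate nearest-centre assignment and weighted centre updates."""
    y = np.asarray(y)
    cluster_labels = np.asarray(cluster_labels)
    n, k = len(y), len(cluster_labels)
    W = sample_to_cluster_weights(y, cluster_labels)
    # Assumption: each centre starts from the labelled samples of its own class,
    # so every cluster label needs at least one labelled sample.
    U = (y[:, None] == cluster_labels[None, :]).astype(float)
    assign = None
    for _ in range(n_iter):
        C = U * W                               # weighted memberships
        mass = C.sum(axis=0)
        C = C / np.where(mass > 0, mass, 1.0)   # guard against empty clusters
        D = feature_space_distances(K, C)
        new_assign = np.argmin(D, axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break                               # assignments are stable
        assign = new_assign
        U = np.eye(k)[assign]                   # one-hot hard assignment
    rows = np.arange(n)
    # Weighted error of the final assignment against the most recent centres.
    error = float(np.sum(W[rows, assign] * D[rows, assign]))
    return assign, error
```

Calling `weighted_kernel_kmeans(rbf_kernel(X), y, cluster_labels)` with `y` holding class labels and `UNLABELLED` for the rest returns one cluster index per sample plus the weighted error. Because a non-matching sample receives a small weight, it barely moves a centre of the wrong class, while matching samples pull their centre most strongly.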