Mode Seeking Clustering by KNN and Mean Shift Evaluated
Abstract
Mode-seeking clustering is most commonly performed with the mean shift algorithm. A less well-known alternative with different computational complexity properties is kNN mode seeking, which is based on the nearest neighbor rule instead of the Parzen kernel density estimator. It is faster and allows for much higher dimensionalities. We compare the performance of both procedures on a number of labeled datasets by comparing the retrieved clusters with the given class labels. In addition, the properties of both procedures are investigated for prototype selection.
kNN mode seeking is shown to perform well and to be feasible for large-scale problems with hundreds of dimensions and up to a hundred thousand data points. For smaller datasets, the mean shift algorithm may perform better than kNN mode seeking.
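The abstract does not spell out the kNN mode seeking procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: the density at a point is taken to be inversely related to the distance to its k-th nearest neighbor, each point is linked to the densest point in its k-neighborhood, and links are followed until a point that links to itself (a mode) is reached. The function name `knn_mode_seeking`, the density proxy, and the tie-breaking rule are illustrative choices, not the authors' implementation, and the brute-force distance matrix is suitable only for small datasets.

```python
import numpy as np

def knn_mode_seeking(X, k):
    """Illustrative kNN mode seeking (hypothetical helper, not the paper's code).

    Returns (labels, modes): a cluster index per row of X and the row
    indices of the points that act as modes.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Brute-force pairwise Euclidean distances: O(n^2) memory, sketch only.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # k nearest neighbors of every point; the point itself is included
    # (distance 0). A stable sort breaks distance ties by index.
    nn = np.argsort(d, axis=1, kind="stable")[:, :k]
    # Assumed density proxy: a smaller k-th neighbor distance means a
    # higher density, so negate the distance.
    density = -d[np.arange(n), nn[:, -1]]
    # Each point links to the densest point within its neighborhood.
    parent = nn[np.arange(n), np.argmax(density[nn], axis=1)]
    # Follow the links (pointer jumping) until every point reaches a mode.
    while True:
        nxt = parent[parent]
        if np.array_equal(nxt, parent):
            break
        parent = nxt
    modes, labels = np.unique(parent, return_inverse=True)
    return labels, modes
```

In this sketch the neighborhood size `k` plays the role that the kernel width plays in mean shift: a larger `k` merges more local modes into fewer clusters. The returned `modes` are indices of actual data points, which is why a procedure of this kind can also be read as a prototype selector.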
Keywords
Neighborhood Size, Cluster Procedure, Neighbor Rule, Shift Algorithm, Prototype Selection