Mode Seeking Clustering by KNN and Mean Shift Evaluated
Clustering by mode seeking is most commonly performed with the mean shift algorithm. A less well-known alternative with different computational complexity is kNN mode seeking, which is based on the nearest neighbor rule instead of the Parzen kernel density estimator. It is faster and can handle much higher dimensionalities. We compare the performance of both procedures on a number of labeled datasets by comparing the retrieved clusters with the given class labels. In addition, we investigate the properties of both procedures for prototype selection.
It is shown that kNN mode seeking performs well and is feasible for large-scale problems with hundreds of dimensions and up to a hundred thousand data points. The mean shift algorithm may perform better than kNN mode seeking for smaller datasets.
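As an illustration of the idea behind kNN mode seeking, the following is a minimal sketch and not the implementation evaluated here. It assumes that the density at a point is ranked by the distance to its k-th nearest neighbor (a smaller distance meaning a higher density), that each point then points to the densest point among its k nearest neighbors, and that pointers are followed until a point that points to itself (a mode) is reached. The function name `knn_mode_seeking` and the brute-force distance matrix are choices made for this example only.

```python
import numpy as np

def knn_mode_seeking(X, k):
    """Return, for each row of X, the index of the mode it is assigned to."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Brute-force pairwise squared Euclidean distances (O(n^2) memory),
    # acceptable for a small illustrative example only.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest neighbors of each point, the point itself
    # included. A stable sort keeps tie-breaking deterministic.
    nn = np.argsort(d2, axis=1, kind="stable")[:, :k]
    # Squared distance to the k-th neighbor: a small value means high density.
    radius = d2[np.arange(n), nn[:, -1]]
    # Each point points to the densest point in its neighborhood.
    pointer = nn[np.arange(n), np.argmin(radius[nn], axis=1)]
    # Follow the pointers until every point has reached a fixed point (a mode).
    while True:
        nxt = pointer[pointer]
        if np.array_equal(nxt, pointer):
            return pointer
        pointer = nxt

# Two small one-dimensional groups; each collapses onto its central point.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
labels = knn_mode_seeking(X, k=3)
print(labels)  # [1 1 1 4 4 4]
```

In this sketch the neighborhood size `k` is the only parameter: a larger `k` tends to merge points into fewer modes, and a smaller `k` yields more. Unlike mean shift, no kernel density gradient is evaluated in the feature space; only nearest neighbor distances are needed.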
Keywords: Neighborhood Size, Cluster Procedure, Neighbor Rule, Shift Algorithm, Prototype Selection