Mode Seeking Clustering by KNN and Mean Shift Evaluated

  • Robert P. W. Duin
  • Ana L. N. Fred
  • Marco Loog
  • Elżbieta Pękalska
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7626)

Abstract

Mode-seeking clustering is most commonly performed with the mean shift algorithm. A less well-known alternative with different computational complexity is kNN mode seeking, which is based on the nearest neighbor rule instead of the Parzen kernel density estimator. It is faster and scales to much higher dimensionalities. We compare the performance of both procedures on a number of labeled datasets by comparing the retrieved clusters with the given class labels. In addition, we investigate the properties of both procedures for prototype selection.

We show that kNN mode seeking performs well and is feasible for large-scale problems with hundreds of dimensions and up to a hundred thousand data points. The mean shift algorithm may perform better than kNN mode seeking on smaller datasets.
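
The abstract names kNN mode seeking without spelling out the procedure. The sketch below illustrates one common formulation of the idea and is not necessarily the authors' implementation: the distance to the k-th nearest neighbor serves as an inverse density estimate, each point points to the densest point among its k nearest neighbors, and the pointers are followed until they stop changing. The function name `knn_mode_seeking`, its parameters, and the brute-force distance computation are assumptions made for illustration only.

```python
import numpy as np

def knn_mode_seeking(X, k):
    """Hypothetical sketch: assign each row of X to a density mode.

    Returns (labels, modes), where modes holds the row indices of the
    points that act as cluster modes and labels indexes into modes.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances (O(n^2) memory; fine for a sketch).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Indices of the k nearest neighbors of each point, the point itself included.
    nn = np.argsort(d2, axis=1, kind="stable")[:, :k]
    # Squared distance to the k-th neighbor: a small value means a high local density.
    radius = d2[np.arange(n), nn[:, -1]]
    # Each point points to the densest point within its own neighborhood.
    parent = nn[np.arange(n), np.argmin(radius[nn], axis=1)]
    # Follow the pointers until every point has reached a fixed point (a mode).
    while True:
        nxt = parent[parent]
        if np.array_equal(nxt, parent):
            break
        parent = nxt
    modes, labels = np.unique(parent, return_inverse=True)
    return labels, modes

# Usage: two small, well-separated groups on a line.
X = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
labels, modes = knn_mode_seeking(X, k=3)
```

In this sketch the neighborhood size `k` is the only parameter: a larger `k` smooths the density estimate and tends to yield fewer, larger clusters. No kernel density evaluation in the feature space is needed, only pairwise distances.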

Keywords

Neighborhood Size, Cluster Procedure, Neighbor Rule, Shift Algorithm, Prototype Selection

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Robert P. W. Duin (1)
  • Ana L. N. Fred (2)
  • Marco Loog (1)
  • Elżbieta Pękalska (3)
  1. Pattern Recognition Laboratory, Delft University of Technology, The Netherlands
  2. Department of Electrical and Computer Engineering, Instituto Superior Técnico (IST - Technical University of Lisbon), Portugal
  3. School of Computer Science, University of Manchester, United Kingdom
