Cluster Analysis of Data with Reduced Dimensionality: An Empirical Study

  • Pavel Krömer
  • Jan Platoš
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 423)


Cluster analysis is an important high-level data mining procedure that can be used to identify meaningful groups of objects within large data sets. Various dimension reduction methods are used to reduce the complexity of data before further processing. The lower-dimensional projections of original data sets can be seen as simplified models of the original data. In this paper, several clustering algorithms are used to process low-dimensional projections of complex data sets and are compared with each other. The properties and quality of the clustering obtained by each method are evaluated, and the suitability of each method for processing reduced data sets is assessed.


Clustering, Metric multidimensional scaling, Sammon’s projection, Affinity propagation, Mean shift, DBSCAN



This work was supported by the IT4Innovations Centre of Excellence project (CZ.1.05/1.1.00/02.0070), funded by the European Regional Development Fund and the national budget of the Czech Republic via the Research and Development for Innovations Operational Programme and by Project SP2015/146 of the Student Grant System, VŠB—Technical University of Ostrava.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. IT4Innovations & Department of Computer Science, VŠB Technical University of Ostrava, Ostrava, Czech Republic
