Finding Groups in Ordinal Data: An Examination of Some Clustering Procedures

  • Conference paper
Classification as a Tool for Research

Abstract

This article uses ordinal data simulated with the cluster.Gen function of the clusterSim package for R to evaluate cluster analysis procedures that combine the GDM distance for ordinal data (see Jajuga et al. 2003; Walesiak 1993, 2006), nine clustering methods, and eight internal cluster quality indices for determining the number of clusters. Seventy-two clustering procedures are evaluated on simulated data originating from a variety of models. The models contain a known cluster structure and differ in the number of true dimensions, the number of categories for each variable, the density and shape of the clusters, the number of true clusters, and the number of noisy variables. Each clustering result was compared with the known cluster structure of the model using the corrected Rand index of Hubert and Arabie (1985).
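The comparison step in the abstract relies on the corrected Rand index. The following is a minimal Python sketch of that index, computed from the contingency table of two partitions. It is an illustration only, not the authors' implementation (the paper works in R with clusterSim). The function name `corrected_rand_index` and the choice to return 1.0 in the degenerate case are assumptions made for this example.

```python
from collections import Counter
from math import comb


def corrected_rand_index(labels_true, labels_pred):
    """Corrected (adjusted) Rand index of Hubert and Arabie (1985).

    Hypothetical helper written for illustration; both arguments are
    equal-length sequences of cluster labels for the same objects.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both partitions must label the same objects")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two objects counts as agreement

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Numbers of object pairs placed together in both, in the first,
    # and in the second partition, respectively.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # value expected by chance
    maximum = 0.5 * (sum_rows + sum_cols)         # upper bound of the index

    if maximum == expected:
        # Degenerate case (e.g. both partitions put everything in one
        # cluster); returning 1.0 is an assumption of this sketch.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    known = [0, 0, 1, 2]
    found = [0, 0, 1, 1]
    print(round(corrected_rand_index(known, found), 4))
```

The index equals 1 when the two partitions agree up to a relabeling of clusters and is close to 0 when the agreement is no better than chance, which is why it suits comparisons against a known cluster structure.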

References

  • Anderberg, M. R. (1973). Cluster analysis for applications. New York, San Francisco, London: Academic Press.

  • Gordon, A. D. (1999). Classification. London: Chapman & Hall/CRC.

  • Hubert, L. J., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.

  • Jajuga, K., & Walesiak, M. (2000). Standardisation of data set under different measurement scales. In R. Decker & W. Gaul (Eds.), Classification and information processing at the turn of the millennium (pp. 105–112). Berlin, Heidelberg: Springer-Verlag.

  • Jajuga, K., Walesiak, M., & Ba̧k, A. (2003). On the general distance measure. In M. Schwaiger & O. Opitz (Eds.), Exploratory data analysis in empirical research (pp. 104–109). Berlin, Heidelberg: Springer-Verlag.

  • Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. New York: Wiley (second edition: 2005).

  • Kendall, M. G. (1966). Discrimination and classification. In P. R. Krishnaiah (Ed.), Multivariate analysis I (pp. 165–185). New York: Academic Press.

  • Macnaughton-Smith, P., Williams, W. T., Dale, M. B., & Mockett, L. G. (1964). Dissimilarity analysis: A new technique of hierarchical sub-division. Nature, 202, 1034–1035.

  • Milligan, G. W. (1985). An algorithm for generating artificial test clusters. Psychometrika, 50(1), 123–127.

  • Milligan, G. W. (1996). Clustering validation: results and implications for applied analyses. In P. Arabie, L. J. Hubert & G. de Soete (Eds.), Clustering and classification (pp. 341–375). Singapore: World Scientific.

  • Milligan, G. W., & Cooper, M. C. (1988). A study of standardization of variables in cluster analysis. Journal of Classification, 5(2), 181–204.

  • Podani, J. (1999). Extending Gower’s general coefficient of similarity to ordinal characters. Taxon, 48, 331–340.

  • Qiu, W., & Joe, H. (2006). Generation of random clusters with specified degree of separation. Journal of Classification, 23(2), 315–334.

  • Soffritti, G. (2003). Identifying multiple cluster structures in a data matrix. Communications in Statistics. Simulation and Computation, 32(4), 1151–1177.

  • Stevens, S. S. (1959). Measurement, psychophysics and utility. In C. W. Churchman & P. Ratoosh (Eds.), Measurement. Definitions and theories (pp. 18–63). New York: Wiley.

  • Tibshirani, R., & Walther, G. (2005). Cluster validation by prediction strength. Journal of Computational and Graphical Statistics, 14(3), 511–528.

  • Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society, ser. B, 63(2), 411–423.

  • Walesiak, M. (1993). Statystyczna analiza wielowymiarowa w badaniach marketingowych [Multivariate statistical analysis in marketing research]. Wrocław University of Economics, Research Papers no. 654.

  • Walesiak, M. (2006). Uogólniona miara odległości w statystycznej analizie wielowymiarowej [The generalised distance measure in multivariate statistical analysis]. Wrocław: Wydawnictwo AE.

  • Walesiak, M., & Dudek, A. (2009). clusterSim package, http://www.R-project.org/.

Correspondence to Marek Walesiak.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Walesiak, M., Dudek, A. (2010). Finding Groups in Ordinal Data: An Examination of Some Clustering Procedures. In: Locarek-Junge, H., Weihs, C. (eds) Classification as a Tool for Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10745-0_19
