Cluster Analysis: An Application to a Real Mixed-Type Data Set

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 179)

Abstract

When multivariate data are available, it is crucial to summarize them so as to extract appropriate and useful information and, consequently, to make proper decisions. Cluster analysis fully meets this requirement: it partitions data into meaningful groups such that both the similarity within each cluster and the dissimilarity between clusters are maximized. Because of its usefulness, clustering is applied in a broad variety of contexts, which explains its appeal in many disciplines. Most existing clustering approaches are limited to either numerical or categorical data only. However, since data sets composed of mixed types of attributes are very common in real-life applications, it is well worth performing clustering on them. In this paper we therefore stress the importance of this approach by implementing an application on a real-world mixed-type data set.
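To make the idea of clustering mixed-type data concrete, the following is a minimal sketch of one common way to compare records that combine numerical and categorical attributes: a Gower-style dissimilarity, followed by assignment of each record to its nearest representative (medoid). This is an illustration under stated assumptions, not necessarily the method used in the chapter; the function names, the toy records, and the attribute ranges are all hypothetical.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, int, str]


def gower_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average per-attribute dissimilarity between two mixed-type records.

    ranges[i] is the observed range of attribute i if it is numeric,
    or None if the attribute is categorical.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical attribute: 0 on a match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric attribute: absolute difference scaled to [0, 1].
            total += abs(x - y) / r
        # A numeric attribute with zero range contributes nothing.
    return total / len(ranges)


def assign_to_medoids(
    records: Sequence[Sequence[Value]],
    medoids: Sequence[Sequence[Value]],
    ranges: Sequence[Optional[float]],
) -> List[int]:
    """Label each record with the index of its closest medoid."""
    return [
        min(
            range(len(medoids)),
            key=lambda k: gower_dissimilarity(rec, medoids[k], ranges),
        )
        for rec in records
    ]


# Hypothetical toy data: (age, income in thousands, card type).
records = [
    (25, 20.0, "basic"),
    (30, 25.0, "basic"),
    (55, 80.0, "gold"),
    (60, 90.0, "gold"),
]
# Ranges of the two numeric attributes in the toy data; None marks
# the categorical attribute.
ranges = [35.0, 70.0, None]
# Assumed starting medoids, chosen by hand for the illustration.
medoids = [records[0], records[3]]

labels = assign_to_medoids(records, medoids, ranges)
print(labels)
```

Scaling each numeric difference by the attribute's range keeps numerical and categorical attributes on a comparable 0 to 1 scale, so neither type dominates the grouping by construction.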

Corresponding author

Correspondence to G. Caruso.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this chapter

Caruso, G., Gattone, S.A., Balzanella, A., Di Battista, T. (2019). Cluster Analysis: An Application to a Real Mixed-Type Data Set. In: Flaut, C., Hošková-Mayerová, Š., Flaut, D. (eds) Models and Theories in Social Systems. Studies in Systems, Decision and Control, vol 179. Springer, Cham. https://doi.org/10.1007/978-3-030-00084-4_27
