Comparing Approaches for Clustering Mixed Mode Data: An Application in Marketing Research

  • Isabella Morlini
  • Sergio Zani
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Practical applications in marketing research often involve mixtures of categorical and continuous variables. For the purpose of clustering, a variety of algorithms have been proposed to deal with mixed mode data. In this paper we apply some of these techniques to two data sets concerning marketing problems. We also propose an approach based on the consensus between partitions obtained by considering each variable, or each subset of variables measured on the same scale, separately. This approach may be applied to data with many categorical variables and does not impose restrictive assumptions on the distribution of the variables. Finally, we suggest a summarizing fuzzy partition with membership degrees obtained as a function of the classes determined by the different methods.
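The abstract does not spell out how the consensus or the membership degrees are computed, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes that agreement between two hard partitions is measured with the Rand index, and that a unit's membership degree in a class is simply the fraction of methods assigning it to that class, with class labels already matched across methods. The function names `rand_index` and `membership_degrees` are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of unit pairs on which two hard partitions agree.

    A pair counts as an agreement when both partitions put the two
    units in the same class, or both put them in different classes.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must cover the same units")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two units")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def membership_degrees(partitions, n_classes):
    """Fraction of methods assigning each unit to each class.

    Assumption: labels 0..n_classes-1 refer to the same class in every
    partition, i.e. the classes have been matched beforehand.
    """
    n_units = len(partitions[0])
    degrees = [[0.0] * n_classes for _ in range(n_units)]
    for labels in partitions:
        for unit, cls in enumerate(labels):
            degrees[unit][cls] += 1.0 / len(partitions)
    return degrees
```

In this sketch, a unit on which all methods agree receives a membership degree of 1 in a single class, while a unit on which the methods disagree has its membership spread over several classes. The Rand index is invariant to relabeling of classes, whereas the membership computation is not, which is why the label-matching assumption matters.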


Membership Degree · Optional Accessory · Marketing Research · Rand Index · Fuzzy Partition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. DSSCQ, Università di Modena e Reggio Emilia, Modena, Italy
