Abstract
Learning the ‘true’ number of clusters in a given data set is a fundamental and largely unsolved problem in data analysis, which seriously affects the identification of customer segments in marketing research.
In this paper, we discuss the properties of relevant criteria commonly used to estimate the number of clusters. Moreover, we outline two adaptive clustering algorithms: a growing k-means algorithm and a growing self-organizing neural network. In the empirical part of the paper, we find that the first algorithm stops growing at exactly the number of clusters that the JUMP-criterion identifies as optimal. This cluster solution proves to be rather similar to the one we obtain by applying the neural network approach. To evaluate the clusters, we use association rules. By testing these rules, we show how the patterns underlying particular market segments differ.
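To make the idea of a jump-type criterion concrete, the following is a minimal sketch of the jump method as it is commonly described: for each candidate k, compute the k-means distortion d_k (average squared distance to the nearest center, per dimension), transform it to d_k^(-p/2) for p-dimensional data, and pick the k with the largest increase over k-1. This is not the chapter's implementation. The function names, the farthest-first seeding, the plain Lloyd iterations, and the synthetic three-segment data are all assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_distortion(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the average squared distance to the
    assigned center, divided by the number of dimensions."""
    dim = len(points[0])
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster runs empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    total = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return total / (len(points) * dim)


def jump_method(points, max_k):
    """Return (best_k, jumps), where jumps[k] = d_k^(-Y) - d_(k-1)^(-Y), Y = p/2.
    Assumes every distortion is positive (fewer clusters than distinct points)."""
    power = len(points[0]) / 2.0
    transformed = [0.0]  # convention: the transformed distortion for k = 0 is 0
    for k in range(1, max_k + 1):
        transformed.append(kmeans_distortion(points, k) ** (-power))
    jumps = {k: transformed[k] - transformed[k - 1] for k in range(1, max_k + 1)}
    return max(jumps, key=jumps.get), jumps


# Hypothetical data: three well-separated two-dimensional "segments".
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in true_centers for _ in range(100)]

best_k, jumps = jump_method(points, max_k=5)
print("estimated number of clusters:", best_k)
```

On real segmentation data the clusters are rarely this well separated, so the largest jump may be far less pronounced, and the result can depend on how k-means is initialized.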
Copyright information
© 2005 Springer-Verlag Berlin · Heidelberg
About this chapter
Cite this chapter
Wagner, R., Scholz, S.W., Decker, R. (2005). The Number of Clusters in Market Segmentation. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28397-8_19
DOI: https://doi.org/10.1007/3-540-28397-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26007-3
Online ISBN: 978-3-540-28397-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)