Advances in Data Analysis and Classification, Volume 6, Issue 4, pp 253–276

Image data analysis and classification in marketing

  • Daniel Baier
  • Ines Daniel
  • Sarah Frost
  • Robert Naundorf
Regular Article


The spread of smartphones, tablet computers, and other multipurpose devices with high-speed Internet access makes new data types available for data analysis and classification in marketing. For example, it is now possible to collect images/snaps, music, or videos instead of ratings. With appropriate algorithms and software at hand, a marketing researcher could group or classify respondents according to the content of the images/snaps, music, or videos they upload. However, such algorithms and software are so far little known in marketing research. This paper tries to close that gap: algorithms and software from computer science are presented, adapted, and applied to data analysis and classification in marketing, and the new SPSS-like software package IMADAC is introduced.


Keywords

  • Image data analysis
  • Image classification
  • Market segmentation

Mathematics Subject Classification

  • 62H30 Classification and discrimination; cluster analysis
  • 62H35 Image analysis
  • 90B60 Marketing, advertising
  • 91D30 Social networks
  • 68T10 Pattern recognition, speech recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Daniel Baier (1)
  • Ines Daniel (1)
  • Sarah Frost (1)
  • Robert Naundorf (1)

  1. Chair of Marketing and Innovation Management, Brandenburg University of Technology Cottbus, Cottbus, Germany
