Abstract
Because customer relationship management (CRM) plays an increasingly important role in a company’s marketing strategy, the company’s database can be considered a valuable competitive asset. Consequently, companies constantly try to augment their databases, both through their own data collection and through the acquisition of commercially available external data. Until now, little research has been done on the usefulness of these commercially available external databases for CRM. This study presents a methodology, intended for external data vendors and based on random forests predictive modeling techniques, for creating commercial variables that address the shortcomings of a classic transactional database. Using this methodology, we predicted spending pleasure variables, a composite measure of purchasing behavior and attitude, in 26 product categories for more than 3 million respondents. Enhancing a company’s transactional database with these variables can significantly improve the predictive performance of existing CRM models. This is demonstrated in a case study with a magazine publisher that needed to identify prospects for new customer acquisition.
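The augmentation and evaluation steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `pleasure_` column prefix, the default of 0.0 for customers missing from the vendor file, and the choice of AUC as the performance measure are all assumptions made for illustration, and the random forests step that would produce the vendor scores is left out.

```python
from typing import Dict, Sequence


def augment(transactions: Dict[str, Dict[str, float]],
            vendor_scores: Dict[str, Dict[str, float]],
            default: float = 0.0) -> Dict[str, Dict[str, float]]:
    """Left-join predicted vendor variables onto each customer's record.

    `vendor_scores` maps customer id -> {product category: predicted score}.
    Customers absent from the vendor file receive `default` (an assumption).
    """
    categories = sorted({c for s in vendor_scores.values() for c in s})
    augmented = {}
    for cust_id, record in transactions.items():
        scores = vendor_scores.get(cust_id, {})
        row = dict(record)  # copy so the original record is not modified
        for cat in categories:
            row[f"pleasure_{cat}"] = scores.get(cat, default)
        augmented[cust_id] = row
    return augmented


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive case is scored above a random negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one transactional variable, one vendor variable.
transactions = {"c1": {"orders": 3.0}, "c2": {"orders": 1.0}}
vendor_scores = {"c1": {"magazines": 0.8}}
augmented = augment(transactions, vendor_scores)
```

In practice, an existing CRM model would be fitted once on the original records and once on the augmented records, and the two sets of holdout scores would be compared with `auc`.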
Acknowledgements
We would like to thank the anonymous external data provider for supporting this research and for making the data available. We are also grateful to Leo Breiman for his inspiring research and the public availability of the Random Forests software.
Cite this article
Baecke, P., Van den Poel, D. Data augmentation by predicting spending pleasure using commercially available external data. J Intell Inf Syst 36, 367–383 (2011). https://doi.org/10.1007/s10844-009-0111-x
DOI: https://doi.org/10.1007/s10844-009-0111-x