Data augmentation by predicting spending pleasure using commercially available external data

Journal of Intelligent Information Systems

Abstract

Since customer relationship management (CRM) plays an increasingly important role in a company’s marketing strategy, the company’s database can be considered a valuable asset for competing with others. Consequently, companies constantly try to augment their databases, both through their own data collection and through the acquisition of commercially available external data. Until now, little research has been done on the usefulness of these commercially available external databases for CRM. This study presents a methodology for such external data vendors, based on random forests predictive modeling techniques, to create commercial variables that address the shortcomings of a classic transactional database. Using this methodology, we predicted spending pleasure variables, a composite measure of purchasing behavior and attitude, in 26 product categories for more than 3 million respondents. Enhancing a company’s transactional database with these variables can significantly improve the predictive performance of existing CRM models. This is demonstrated in a case study with a magazine publisher that needed to identify prospects for new customer acquisition.
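
The augmentation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses bagged one-split regression stumps with random feature selection as a deliberately simplified stand-in for random forests, and every name and value in it (`survey_x`, `survey_y`, `customers`, `spending_pleasure_pred`, the two external features) is hypothetical and synthetic. The sketch trains on a small "survey" sample where the spending pleasure score is observed, then scores customers for whom only external data is available.

```python
import random
from statistics import mean

def fit_stump(rows, targets, feature):
    """Find the single threshold on one feature that minimises squared error."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        if not left or not right:
            continue
        lm, rm = mean(left), mean(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # feature is constant in this bootstrap sample
        m = mean(targets)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])

def fit_forest(rows, targets, n_trees=50, seed=0):
    """Bagging plus random feature choice: a toy stand-in for random forests."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(n_features)         # random feature per stump
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature)
        )
    return forest

def predict(forest, row):
    """Average the predictions of all stumps."""
    return mean(lm if row[f] <= thr else rm for f, thr, lm, rm in forest)

# Synthetic survey sample (assumption): features are (age_band, income_band).
rng = random.Random(42)
survey_x = [(rng.randint(1, 5), rng.randint(1, 5)) for _ in range(200)]
# Synthetic spending pleasure score for one product category,
# constructed here to depend mainly on income_band.
survey_y = [2.0 * x[1] + rng.gauss(0, 0.5) for x in survey_x]

forest = fit_forest(survey_x, survey_y)

# Customers without a survey answer get the predicted score as a new variable.
customers = [{"id": 1, "external": (2, 1)}, {"id": 2, "external": (4, 5)}]
for c in customers:
    c["spending_pleasure_pred"] = predict(forest, c["external"])
```

In a setting like the one described, the predicted variable would then be appended to the company's transactional database and used as an additional input to an existing CRM model; whether it improves that model has to be checked on held-out data.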

Acknowledgements

We would like to thank the anonymous external data provider for supporting this research and for making the data available. We are also grateful to Leo Breiman for his inspiring research and the public availability of the Random Forests software.

Author information

Corresponding author

Correspondence to Dirk Van den Poel.

Appendix

Table 7 Significant socio-demographic and spending pleasure variables in the acquisition model

About this article

Cite this article

Baecke, P., Van den Poel, D. Data augmentation by predicting spending pleasure using commercially available external data. J Intell Inf Syst 36, 367–383 (2011). https://doi.org/10.1007/s10844-009-0111-x

  • DOI: https://doi.org/10.1007/s10844-009-0111-x
