
Ensembles of Probability Estimation Trees for Customer Churn Prediction

Conference paper in Trends in Applied Intelligent Systems (IEA/AIE 2010)

Abstract

Customer churn prediction is one of the most important elements of a company's Customer Relationship Management (CRM) strategy. In this study, two strategies are investigated to increase the lift performance of ensemble classification models: (i) using probability estimation trees (PETs) instead of standard decision trees as base classifiers, and (ii) implementing alternative fusion rules based on lift weights for combining the ensemble members' outputs. Experiments are conducted for four popular ensemble strategies on five real-life churn data sets. In general, the results demonstrate that lift performance can be substantially improved by using alternative base classifiers and fusion rules, although the effect varies across ensemble strategies. In particular, the results indicate an increase in lift performance for (i) Bagging when C4.4 base classifiers are used, (ii) the Random Subspace Method (RSM) when lift-weighted fusion rules are used, and (iii) AdaBoost when both are combined.
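The abstract's two ideas translate naturally into code. The sketch below is a minimal illustration, not the paper's implementation: the paper's experiments rely on WEKA's C4.4, whereas here scikit-learn's CART is grown without pruning and scored with Laplace-smoothed leaf frequencies as a C4.4-style approximation. The helper names (top_decile_lift, laplace_probs, bagged_pets_lift_weighted), the use of normalized top-decile lift on a held-out validation set as the fusion weight, and the assumption of binary 0/1 labels (1 = churn) are all illustrative choices; the paper's exact lift-weighted fusion rules may differ.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample


def top_decile_lift(y_true, y_score, fraction=0.1):
    """Churn rate among the top `fraction` of cases ranked by score, over the base rate."""
    y_true = np.asarray(y_true)
    k = max(1, int(np.ceil(fraction * len(y_true))))
    top = np.argsort(y_score)[::-1][:k]               # indices of the highest-scored cases
    return y_true[top].mean() / y_true.mean()


def laplace_probs(tree, X):
    """C4.4-style probability estimates: Laplace-smoothed class frequencies per leaf."""
    leaf = tree.apply(X)                              # leaf node id for every row of X
    value = tree.tree_.value[leaf][:, 0, :]           # per-leaf class statistics
    frac = value / value.sum(axis=1, keepdims=True)   # normalise (counts or fractions,
    counts = frac * tree.tree_.n_node_samples[leaf, None]  # depending on sklearn version)
    smoothed = (counts + 1.0) / (counts.sum(axis=1, keepdims=True) + counts.shape[1])
    return smoothed[:, 1]                             # P(class 1) = P(churn)


def bagged_pets_lift_weighted(X_tr, y_tr, X_val, y_val, X_te, n_members=50, seed=0):
    """Bagged unpruned trees fused by lift weights estimated on a validation set."""
    rng = np.random.RandomState(seed)
    members, weights = [], []
    for _ in range(n_members):
        Xb, yb = resample(X_tr, y_tr, random_state=rng)   # bootstrap replicate
        # Fully grown, unpruned tree: the PET idea is to keep the fine-grained leaves
        # that pruning would remove, and repair their estimates with Laplace smoothing.
        t = DecisionTreeClassifier(criterion="entropy", random_state=rng).fit(Xb, yb)
        members.append(t)
        weights.append(top_decile_lift(y_val, laplace_probs(t, X_val)))
    w = np.asarray(weights)
    w /= w.sum()                                      # normalised lift weights
    return sum(wi * laplace_probs(t, X_te) for t, wi in zip(members, w))
```

Calling top_decile_lift(y_te, bagged_pets_lift_weighted(...)) evaluates the fused ensemble by the same criterion used to set the member weights; under the same assumptions, analogous weighting could be applied to RSM members (trained on random feature subsets) or to AdaBoost members.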





Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Bock, K.W., Van den Poel, D. (2010). Ensembles of Probability Estimation Trees for Customer Churn Prediction. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol. 6097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13025-0_7


  • DOI: https://doi.org/10.1007/978-3-642-13025-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13024-3

  • Online ISBN: 978-3-642-13025-0

  • eBook Packages: Computer Science (R0)
