Cost-Sensitive Churn Prediction in Fund Management Services

  • James Brownlow
  • Charles Chu
  • Bin Fu
  • Guandong Xu (corresponding author)
  • Ben Culbert
  • Qinxue Meng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10828)


Churn prediction is vital to companies: it identifies potential churners so that losses can be prevented in advance. Although churn prediction is typically addressed as a classification task, and a variety of models have been employed in practice, fund management services present several special challenges. One is that the financial data is extremely imbalanced, since only a tiny proportion of customers leave each year. Another is a unique cost-sensitive learning problem: the cost of a wrong prediction for a churner should depend on that customer's account balance, while the cost of a wrong prediction for a non-churner should be uniform. To address these issues, this paper proposes a new churn prediction model based on ensemble learning. In our model, multiple classifiers are built on sampled datasets, which tackles the imbalanced-data issue while still exploiting the data fully. Moreover, a novel sampling strategy is proposed to handle the unique cost-sensitive issue. The model has been deployed in one of the leading fund management institutions in Australia, and its effectiveness has been validated in real applications.


Keywords: Customer retention · Churn prediction · Cost-sensitive classification · Imbalanced data
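One reading of the sampling strategy described in the abstract is a balanced-bagging scheme in which each base classifier's training set keeps a balance-weighted draw of churners alongside an equal-sized uniform undersample of non-churners, so that high-balance (high-cost) churners appear in more training sets. The sketch below illustrates this idea; it is a minimal, hypothetical reconstruction, not the paper's exact algorithm, and the function name and parameters are the author's own inventions for illustration.

```python
import numpy as np

def cost_sensitive_ensemble_samples(X, y, balance, n_models=5, seed=None):
    """Build balanced training sets for an ensemble on imbalanced churn data.

    Illustrative sketch (not the paper's exact method): each training set
    pairs churners, drawn with probability proportional to account balance,
    with an equal-sized uniform undersample of non-churners. High-value
    churners therefore appear in more base-classifier training sets,
    approximating balance-proportional misclassification costs.
    """
    rng = np.random.default_rng(seed)
    churn_idx = np.flatnonzero(y == 1)
    stay_idx = np.flatnonzero(y == 0)
    n = len(churn_idx)
    # Sampling weights for churners, proportional to account balance.
    p = balance[churn_idx] / balance[churn_idx].sum()
    datasets = []
    for _ in range(n_models):
        c = rng.choice(churn_idx, size=n, replace=True, p=p)   # cost-weighted draw
        s = rng.choice(stay_idx, size=n, replace=False)        # uniform undersample
        idx = np.concatenate([c, s])
        datasets.append((X[idx], y[idx]))
    return datasets
```

Each returned dataset is exactly balanced between the two classes; a base classifier (e.g. a decision tree) would be fit on each, with predictions aggregated by voting or averaging.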



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • James Brownlow (1, 2)
  • Charles Chu (1, 2)
  • Bin Fu (1)
  • Guandong Xu (2, corresponding author)
  • Ben Culbert (1, 2)
  • Qinxue Meng (1)

  1. Colonial First State, Sydney, Australia
  2. Advanced Analytics Institute, UTS, Sydney, Australia
