Large-scale Ensemble Model for Customer Churn Prediction in Search Ads

Cognitive Computation

Abstract

Customer churn prediction is one of the most important issues in search ads business management, which is a multi-billion-dollar market. The aim of churn prediction is to detect customers with a high propensity to leave the ads platform, so that analysis and retention efforts can be directed at them ahead of time. An ensemble model combines multiple weak models to obtain better predictive performance; it is inspired by the human cognitive system and is widely used in machine learning applications. In this paper, we investigate how to use an ensemble model of gradient boosting decision trees (GBDT) to predict whether a customer will churn in the foreseeable future based on its activities in search ads. We extract two types of features for the GBDT: dynamic features and static features. For dynamic features, we consider a sequence of each customer's activities (e.g., impressions, clicks) over a long period. For static features, we consider the customer's settings (e.g., creation time, customer type). We evaluated the prediction performance on a large-scale customer data set from the Bing Ads platform; the results show that the static and dynamic features are complementary, and combining all features achieves an AUC (area under the ROC curve) of 0.8410 on the test set. The proposed model is useful for predicting which customers will churn in the near future on the ads platform, and it has been successfully deployed to run daily on the Bing Ads platform.
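As a concrete illustration of the pipeline described above, the following is a minimal sketch under assumed, synthetic data: it trains a GBDT classifier on concatenated dynamic features (weekly impressions and clicks) and static features (account age, customer type) and reports the AUC, the metric used in the paper. It uses scikit-learn's GradientBoostingClassifier as a stand-in, since the paper does not specify its GBDT implementation, and all feature names and values here are hypothetical placeholders rather than the authors' actual features.

```python
# Sketch only: GBDT churn classifier on combined dynamic + static features,
# evaluated by AUC. Data and feature names are synthetic placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_customers, n_weeks = 1000, 12

# Dynamic features: weekly activity counts (e.g., impressions, clicks).
impressions = rng.poisson(lam=50, size=(n_customers, n_weeks))
clicks = rng.binomial(n=impressions, p=0.05)

# Static features: customer settings (e.g., account age in days, customer type id).
account_age = rng.integers(30, 2000, size=(n_customers, 1))
customer_type = rng.integers(0, 3, size=(n_customers, 1))

# Concatenate dynamic and static features into one design matrix.
X = np.hstack([impressions, clicks, account_age, customer_type])
y = rng.integers(0, 2, size=n_customers)  # placeholder churn labels (1 = churner)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test AUC: {auc:.4f}")
```

In practice, the dynamic features would be aggregated from real activity logs over the observation window before being concatenated with the static profile columns.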

Notes

  1. https://adwords.google.com/home/

  2. https://bingads.microsoft.com/

  3. https://www.invespcro.com/blog/customer-acquisition-retention/

  4. https://www.quora.com/Why-do-many-research-studies-claim-that-deep-learning-is-a-black-box

  5. Because the cost-per-click (CPC) strategy is widely used in search ads, the number of clicks is commonly used to measure advertiser performance.

  6. Each advertiser's entities in search ads are organized in tree structures consisting of accounts, campaigns, ad groups, and order items [1]; a sketch of this hierarchy is given after these notes.
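
For illustration only, the advertiser hierarchy mentioned in note 6 could be represented with nested records as sketched below; the class and field names are hypothetical and are not Bing Ads API types.

```python
# Illustrative sketch of the advertiser entity hierarchy from note 6:
# accounts -> campaigns -> ad groups -> order items. Hypothetical names.
from dataclasses import dataclass, field
from typing import List

@dataclass
class OrderItem:
    keyword: str
    bid: float  # e.g., a CPC bid

@dataclass
class AdGroup:
    name: str
    order_items: List[OrderItem] = field(default_factory=list)

@dataclass
class Campaign:
    name: str
    ad_groups: List[AdGroup] = field(default_factory=list)

@dataclass
class Account:
    account_id: int
    campaigns: List[Campaign] = field(default_factory=list)

# Example tree for one advertiser account.
account = Account(1, [Campaign("spring-sale",
                               [AdGroup("shoes", [OrderItem("running shoes", 0.8)])])])
```

Activity statistics such as impressions and clicks can then be aggregated bottom-up along this tree when building customer-level features.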

References

  1. Wang Q, Huang K, Li S, Yu W. Adaptive modeling for large-scale advertisers optimization. BMC Big Data Analytics 2017;2:8.

  2. Kim HS, Yoon CH. Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. Telecommun Policy 2004;28(9-10):751–65.

  3. Hadden J, Tiwari A, Roy R, Ruta D. Computer assisted customer churn management: state-of-the-art and future trends. Comput Oper Res 2007;34(10):2902–17.

  4. Yoon S, Koehler J, Ghobarah A. Prediction of advertiser churn for Google AdWords. JSM Proceedings; 2010.

  5. Vafeiadis T, Diamantaras KI, Sarigiannidis G, et al. A comparison of machine learning techniques for customer churn prediction. Simul Model Pract Theory 2015;55:1–9.

  6. Kraljević G, Gotovac S. Modeling data mining applications for prediction of prepaid churn in telecommunication services. Automatika 2010;51(3):275–83.

  7. Jadhav RJ, Pawar UT. Churn prediction in telecommunication using data mining technology. Int J Adv Comput Sci Appl 2011;2(2):17–9. https://doi.org/10.14569/IJACSA.2011.020204.

  8. Kim K, Jun CH, Lee J. Improved churn prediction in telecommunication industry by analyzing a large network. Expert Syst Appl 2014;41(15):6575–84.

  9. Qureshi SA, Rehman AS, Qamar AM, et al. Telecommunication subscribers' churn prediction model using machine learning. 8th International Conference on Digital Information Management. IEEE; 2014. p. 131–136.

  10. Amin A, Anwar S, Adnan A, Nawaz M, Alawfi K, Hussain A, Huang K. Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing 2017;237:242–54.

  11. Xie Y, Xiu L. Churn prediction with linear discriminant boosting algorithm. IEEE International Conference on Machine Learning and Cybernetics; 2008. p. 228–233.

  12. Glady N, Baesens B, Croux C. Modeling churn using customer lifetime value. Eur J Oper Res 2009;197(1):402–11.

  13. Nie G, Wei R, Zhang L, et al. Credit card churn forecasting by logistic regression and decision tree. Expert Syst Appl 2011;38(12):15273–85.

  14. Ali ÖG, Aritürk U. Dynamic churn prediction framework with more effective use of rare event data: the case of private banking. Expert Syst Appl 2014;41(17):7889–903.

  15. Risselada H, Verhoef PC, Bijmolt THA. Staying power of churn prediction models. J Interact Mark 2010;24(3):198–208.

  16. Günther C-C, Tvete IF, Aas K, et al. Modelling and predicting customer churn from an insurance company. Scand Actuar J 2014;1:58–71.

  17. Ngonmang B, Viennet E, Tchuente M. Churn prediction in a real online social network using local community analysis. Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining; 2012. p. 282–288.

  18. Borbora ZH, Srivastava J. User behavior modelling approach for churn prediction in online games. 2012 International Conference on Privacy, Security, Risk and Trust (PASSAT 2012) and 2012 International Conference on Social Computing (SocialCom 2012), Amsterdam, Netherlands; 2012. p. 51–60.

  19. Runge J, Gao P, Garcin F, et al. Churn prediction for high-value players in casual social games. 2014 IEEE Conference on Computational Intelligence and Games; 2014. p. 1–8.

  20. Castro EG, Tsuzuki MSG. Churn prediction in online games using players' login records: a frequency analysis approach. IEEE Transactions on Computational Intelligence and AI in Games 2015;7(3):255–65.

  21. Milošević M, Živić N, Andjelković I. Early churn prediction with personalized targeting in mobile social games. Expert Syst Appl 2017;83:326–32.

  22. Gudivada VN, Irfan MT, Fathi E, Rao DL. Cognitive analytics: going beyond big data analytics and machine learning. Handbook of Statistics 2016;35:169–205.

  23. Wang Q-F, Cambria E, Liu C-L, Hussain A. Common sense knowledge for handwritten Chinese text recognition. Cogn Comput 2013;5(2):234–42.

  24. Yin X-C, Huang K, Hao H-W. DE2: dynamic ensemble of ensembles for learning nonstationary data. Neurocomputing 2015;165:14–22.

  25. Minhas S, Hussain A. From spin to swindle: identifying falsification in financial text. Cogn Comput 2016;8(4):729–45.

  26. Ortín S, Pesquera L. Reservoir computing with an ensemble of time-delay reservoirs. Cogn Comput 2017;9(3):327–36.

  27. Wen GH, Hou Z, Li HH, Li DY, Jiang LJ, Xun EY. Ensemble of deep neural networks with probability-based fusion for facial expression recognition. Cogn Comput 2017;9:597–610.

  28. Ayerdi B, Savio A, Graña M. Meta-ensembles of classifiers for Alzheimer's disease detection using independent ROI features. Natural and Artificial Computation in Engineering and Medical Applications. Springer; 2013. p. 122–130.

  29. Gu Q, Ding YS, Zhang TL. An ensemble classifier based prediction of G-protein-coupled receptor classes in low homology. Neurocomputing 2015;154:110–18.

  30. Mogultay H, Vural FTY. Cognitive learner: an ensemble learning architecture for cognitive state classification. IEEE 25th Signal Processing and Communications Applications Conference; 2017. p. 1–4.

  31. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat 2001;29(5):1189–232.

  32. Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge: MIT Press; 2016.

  33. Duda RO, Hart PE, Stork DG. Pattern classification, 2nd ed. New York: Wiley; 2001.

  34. Coussement K, Van den Poel D. Integrating the voice of customers through call center emails into a decision support system for churn prediction. Information & Management 2008;45(3):164–74.

  35. Lima E, Mues C, Baesens B. Domain knowledge integration in data mining using decision tables: case studies in churn prediction. J Oper Res Soc 2009;60(8):1096–106.

  36. Meher AK, Wilson J, Prashanth R. Towards a large scale practical churn model for prepaid mobile markets. Advances in Data Mining: Applications and Theoretical Aspects; 2017. p. 93–106.

  37. Li R, Wang P, Chen Z. A feature extraction method based on stacked auto-encoder for telecom churn prediction. In: Zhang L, Song X, Wu Y, editors. Theory, Methodology, Tools and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2016, SCS AutumnSim 2016. Communications in Computer and Information Science. Singapore: Springer; 2016.

  38. Chamberlain BP, Cardoso A, Liu CHB, et al. Customer lifetime value prediction using embeddings. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2017. p. 1753–1762.

  39. Coussement K, Van den Poel D. Churn prediction in subscription services: an application of support vector machines while comparing two parameter-selection techniques. Expert Syst Appl 2008;34(1):313–27.

  40. Gordini N, Veglio V. Customers churn prediction and marketing retention strategies. An application of support vector machines based on the AUC parameter-selection technique in B2B e-commerce industry. Ind Mark Manag 2016;62:100–7.

  41. Huang Y, Kechadi T. An effective hybrid learning system for telecommunication churn prediction. Expert Syst Appl 2013;40(14):5635–47.

  42. Hadiji F, Sifa R, Drachen A, et al. Predicting player churn in the wild. 2014 IEEE Conference on Computational Intelligence and Games (CIG). IEEE; 2014. p. 1–8.

  43. Keramati A, Jafari-Marandi R, Aliannejadi M, et al. Improved churn prediction in telecommunication industry using data mining techniques. Appl Soft Comput 2014;24:994–1012.

  44. Lemmens A, Croux C. Bagging and boosting classification trees to predict churn. J Mark Res 2006;43(2):276–86.

  45. Farquad MAH, Ravi V, Raju SN. Churn prediction using comprehensible support vector machine: an analytical CRM application. Appl Soft Comput 2014;19:31–40.

  46. Huang K, Yang H, King I, Lyu MR. Imbalanced learning with biased minimax probability machine. IEEE Trans Syst Man Cybern B 2006;36(4):913–23.

  47. Sun Y, Wong AK, Kamel MS. Classification of imbalanced data: a review. Int J Pattern Recognit Artif Intell 2009;23(4):687–719.

  48. He H, Garcia EA. Learning from imbalanced data. IEEE Trans Knowl Data Eng 2009;21(9):1263–84.

  49. Huang K, Zhang R, Yin X-C. Imbalance learning locally and globally. Neural Process Lett 2015;41(3):311–23.

  50. Xie Y, Xiu L, Ngai E, Ying W. Customer churn prediction using improved balanced random forests. Expert Syst Appl 2009;36(3):5445–9.

  51. Zhu B, Baesens B, Backiel A, et al. Benchmarking sampling techniques for imbalance learning in churn prediction. J Oper Res Soc 2018;69(1):49–65. https://doi.org/10.1057/s41274-016-0176-1.

  52. Wangperawong A, Brun C, Laudy O, et al. Churn analysis using deep convolutional neural networks and autoencoders. 2016. arXiv:1604.05377.

  53. Kasiran Z, Ibrahim Z, Mohd Ribuan MS. Customer churn prediction using recurrent neural network with reinforcement learning algorithm in mobile phone users. Int J Int Inf Process 2014;5(1):1–11.

  54. Spanoudes P, Nguyen T. Deep learning in customer churn prediction: unsupervised feature learning on abstract company independent feature vectors. 2017. arXiv:1703.03869.

  55. Chen T. Introduction to boosted trees. University of Washington; 2014. http://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf.

  56. Gradient boosting. Wikipedia. https://en.wikipedia.org/wiki/Gradient_boosting.

Acknowledgements

We would like to thank all members of the Bing Ads Adinsight team and the PM team at Microsoft for their discussions and help with this work.

Funding

This study was funded by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (nos. 17KJB520041 and 17KJD520010); the Natural Science Foundation of Jiangsu Province (nos. BK20181189 and BK20181190); the Open Project Fund of the National Laboratory of Pattern Recognition (no. 201800020); the Key Program Special Fund in XJTLU (nos. KSF-A-10, KSF-A-01, and KSF-P-02); and the XJTLU Research Development Fund (no. RDF-16-02-49). In addition, A. Hussain was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) grant (AV-COGHEAR, grant reference number: EP/M026981/1).

Author information

Corresponding author

Correspondence to Qiu-Feng Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Qiu-Feng Wang is currently with XJTLU but carried out most of the work described here while affiliated with Microsoft.

About this article

Cite this article

Wang, QF., Xu, M. & Hussain, A. Large-scale Ensemble Model for Customer Churn Prediction in Search Ads. Cogn Comput 11, 262–270 (2019). https://doi.org/10.1007/s12559-018-9608-3
