Intelligent Big Data Analysis to Design Smart Predictor for Customer Churn in Telecommunication Industry

  • Conference paper
Big Data and Smart Digital Environment (ICBDSDE 2018)

Part of the book series: Studies in Big Data (SBD, volume 53)

Abstract

The rapid development of the telecommunications sector requires companies to understand their customers and their expectations. The resulting strong competition pushes these companies to build accurate analysis systems in order to retain customers and increase revenue. This paper introduces an integrated system that helps a telecom company achieve that goal. The proposed system consists of three phases. First phase: understanding the company’s data, which comes in two main parts: data on the company itself (number of employees, number of customers, revenues and expenses) and customer-related data. This phase focuses on the initial processing of data that is fragmented and imbalanced. The company and customer data are first merged using a joiner, and the imbalance problem is then addressed with the DSMOTE algorithm, which adopts the principle of "samples and quadratic" and succeeds in producing real samples, rather than synthetic ones, to treat the imbalance. Second phase: after processing, the data are split into training and testing sets. The training data are used to construct a predictor based on a gradient boosting machine (GBM) in which the decision-making part, a decision tree (DT), is replaced with a genetic algorithm (GA), in order to assign customers to three groups: the customers who most influence the company’s revenues, the medium-importance customers, and the least important customers. Third phase: the accuracy of the predictor’s results is verified using confusion-matrix measures: Accuracy, Precision, Recall, F-measure, and Fβ. The traditional preprocessing method, SMOTE, is compared with DSMOTE in terms of error rate and accuracy.
The developed method performed best when the data were split 40:60, with an error rate of 0.038 and an accuracy of 0.962, whereas the best result for the traditional method was an error rate of 0.198 and an accuracy of 0.802. In addition, GBM and GBM-GA were compared on the confusion-matrix measures: the traditional GBM reached an Accuracy of 0.88, while the developed GBM-GA reached an Accuracy of 0.97. These results support the accuracy of the proposed method.
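
For readers unfamiliar with the confusion-matrix measures named in the abstract, the following is a minimal Python sketch of how they are conventionally computed when one class is treated as positive. It is not the authors' code: the names `ConfusionCounts` and `confusion_measures`, the example counts, and the choice of beta = 2 are assumptions made purely for illustration, and the paper's three-group setting would need these measures computed per group. Error rate is taken here as 1 minus accuracy, which is consistent with the reported pairs (0.038 and 0.962, 0.198 and 0.802).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one class treated as positive (hypothetical helper type)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def confusion_measures(c: ConfusionCounts, beta: float = 2.0) -> dict:
    """Return the standard measures derived from a 2x2 confusion matrix."""
    total = c.tp + c.fp + c.fn + c.tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when a row or column is empty.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0

    def f_score(b: float) -> float:
        # F_beta = (1 + b^2) * P * R / (b^2 * P + R); b = 1 gives the F-measure.
        denom = b * b * precision + recall
        return (1 + b * b) * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_score(1.0),
        "f_beta": f_score(beta),
    }


# Illustrative counts only; these numbers do not come from the paper.
example = confusion_measures(ConfusionCounts(tp=40, fp=10, fn=5, tn=45))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With beta greater than 1 the Fβ score weights recall more heavily than precision, which is a common choice when missing a churning customer is costlier than a false alarm; whether the authors used that weighting is not stated in the abstract.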

Author information

Corresponding author

Correspondence to Samaher Al_Janabi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Al_Janabi, S., Razaq, F. (2019). Intelligent Big Data Analysis to Design Smart Predictor for Customer Churn in Telecommunication Industry. In: Farhaoui, Y., Moussaid, L. (eds) Big Data and Smart Digital Environment. ICBDSDE 2018. Studies in Big Data, vol 53. Springer, Cham. https://doi.org/10.1007/978-3-030-12048-1_26
