Customer Analyst for the Telecom Industry

Abstract

The telecommunications industry is particularly rich in customer data, and telecom companies want to use this data to prevent customer churn and to improve revenue per user through personalization and customer acquisition. Massive-scale analytics tools provide an opportunity to achieve this in a flexible and scalable way. In this context, we have developed IBM Customer Analyst, a component library for analyzing customer behavioral data that enables new insights and business scenarios based on the relationship between users and the content they create and consume. Because of the massive amount of data and the large number of users, this technology is built on IBM InfoSphere BigInsights and Apache Hadoop. In this work, we first describe an efficient user profiling framework, with high user profiling quality guarantees, based on mobile web browsing log analysis. We describe the use of the Open Directory Project categories to generate user profiles. We then describe an end-to-end analysis flow and discuss its challenges. Last, we validate our methods through extensive experiments based on real data sets.
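
The general idea of category-based profiling from browsing logs can be illustrated with a minimal sketch. This is not the chapter's implementation, which is not shown in this preview: the `HOST_CATEGORIES` table is a hypothetical stand-in for a lookup derived from Open Directory Project categories, and `build_profiles` and the `(user_id, url)` record layout are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical host-to-category lookup; a real system would derive this
# from a directory such as the Open Directory Project.
HOST_CATEGORIES = {
    "espn.com": "Sports",
    "bbc.co.uk": "News",
    "imdb.com": "Arts/Movies",
}


def build_profiles(log_records):
    """Turn (user_id, url) records into normalized category weights per user."""
    counts = defaultdict(Counter)
    for user_id, url in log_records:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com alike
        category = HOST_CATEGORIES.get(host)
        if category is not None:  # skip hosts with no known category
            counts[user_id][category] += 1

    profiles = {}
    for user_id, counter in counts.items():
        total = sum(counter.values())
        profiles[user_id] = {cat: n / total for cat, n in counter.items()}
    return profiles


if __name__ == "__main__":
    log = [
        ("u1", "http://www.espn.com/nba"),
        ("u1", "http://espn.com/nfl"),
        ("u1", "http://www.bbc.co.uk/news"),
        ("u2", "http://unknown.example/page"),
    ]
    print(build_profiles(log))
```

In this sketch a profile is simply the share of a user's categorized page views that falls in each category; users whose visits match no known category get no profile. At the data volumes the abstract mentions, the same per-user counting would run as a distributed job rather than in a single process.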


Notes

  1. Some countries in fact require some kind of registration to take place but it is not a technological or business requirement.

  2. http://www.dmoz.org/.

  3. http://lucene.apache.org/.

  4. http://www-01.ibm.com/software/data/infosphere/biginsights/.

  5. http://code.google.com/p/jaql/.

  6. http://rvs.github.com/oozie/index.html.

  7. http://www.json.org/.

  8. http://www.wikipedia.org/.

  9. http://nutch.apache.org/.

  10. http://www-01.ibm.com/software/analytics/spss/.

  11. http://www.w3.org/RDF/.

  12. http://www-01.ibm.com/software/ebusiness/jstart/bigsheets/.

Acknowledgements

The authors would like to thank Shai Erera and Gilad Barkai for useful discussions about implementation issues. We also thank Haggai Roitman for sharing thoughts and ideas. Finally, we thank Matin Jouzdani for his support in making this project a success.

Author information

Corresponding author

Correspondence to David Konopnicki.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Konopnicki, D., Shmueli-Scheuer, M. (2014). Customer Analyst for the Telecom Industry. In: Gkoulalas-Divanis, A., Labbi, A. (eds) Large-Scale Data Analytics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9242-9_4

  • DOI: https://doi.org/10.1007/978-1-4614-9242-9_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-9241-2

  • Online ISBN: 978-1-4614-9242-9

  • eBook Packages: Computer Science, Computer Science (R0)
