
A Non-sequential Representation of Sequential Data for Churn Prediction

  • Conference paper
Knowledge-Based and Intelligent Information and Engineering Systems (KES 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5711)

Abstract

We investigate the event-sequence length that gives the best predictions when a continuous HMM approach is used for churn prediction from sequential data. Motivated by the observation that predictions based on only the few most recent events appear to be the most accurate, a non-sequential dataset is constructed from customer event histories by averaging the features of the last few events. A simple K-nearest neighbor algorithm applied to this dataset is found to give significantly improved performance. It is intuitive that most people react mainly to events in the fairly recent past: telecommunications events occurring months or years ago are unlikely to have a large impact on a customer’s future behaviour, and these results bear this out. Methods that deal with sequential data also tend to be much more complex than those dealing with simple non-temporal data, which gives an added benefit to expressing the recent information in a non-sequential manner.
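
The construction described in the abstract lends itself to a short illustration. The sketch below is a minimal, hypothetical example of the general idea only: the helper name, the data shapes, the choice of averaging the last three events, and the use of scikit-learn's KNeighborsClassifier are illustrative assumptions, not the setup used in the paper.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def to_nonsequential(event_histories, n_recent=3):
        # Collapse each customer's event sequence into a single feature vector
        # by averaging the feature vectors of the last `n_recent` events.
        # `event_histories`: list of arrays, one per customer, each of shape
        # (num_events, num_features) -- a hypothetical input format.
        return np.vstack([h[-n_recent:].mean(axis=0) for h in event_histories])

    # Toy data: three customers with event histories of different lengths,
    # two features per event, and a binary churn label per customer.
    histories = [np.random.rand(5, 2), np.random.rand(8, 2), np.random.rand(4, 2)]
    churned = np.array([0, 1, 0])

    X = to_nonsequential(histories)               # non-sequential representation
    knn = KNeighborsClassifier(n_neighbors=1)     # plain KNN on the averaged features
    knn.fit(X, churned)
    print(knn.predict(to_nonsequential([np.random.rand(6, 2)])))

Keeping only the most recent events and averaging their features removes the temporal dimension, so any standard non-sequential classifier can be applied directly to the resulting fixed-length vectors.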


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Eastwood, M., Gabrys, B. (2009). A Non-sequential Representation of Sequential Data for Churn Prediction. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2009. Lecture Notes in Computer Science, vol 5711. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04595-0_26

  • DOI: https://doi.org/10.1007/978-3-642-04595-0_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04594-3

  • Online ISBN: 978-3-642-04595-0

  • eBook Packages: Computer Science, Computer Science (R0)
