Requirements Engineering

Volume 23, Issue 3, pp 333–355

Customer support ticket escalation prediction using feature engineering

  • Lloyd Montgomery
  • Daniela Damian
  • Tyson Bulmer
  • Shaikh Quader
RE 2017


Understanding and keeping the customer happy is a central tenet of requirements engineering. Strategies to gather, analyze, and negotiate requirements are complemented by efforts to manage customer input after products have been deployed. For the latter, support tickets are key in allowing customers to submit their issues, bug reports, and feature requests. If insufficient attention is given to support issues, however, their escalation to management becomes time-consuming and expensive, especially for large organizations managing hundreds of customers and thousands of support tickets. Our work provides a step toward simplifying the job of support analysts and managers, particularly in predicting the risk of escalating support tickets. In a field study at our large industrial partner, IBM, we used a design science research methodology to characterize the support process and the data available to IBM analysts in managing escalations. Within this methodology, we used feature engineering to translate support analysts' expert knowledge of their customers into features of a support ticket model. We then implemented these features in a machine learning model to predict support ticket escalations. We trained and evaluated our machine learning model on more than 2.5 million support tickets and 10,000 escalations, obtaining a recall of 87.36% and an 88.23% reduction in the workload of support analysts looking to identify support tickets at risk of escalation. Further on-site evaluations, through a prototype tool we developed to implement our machine learning techniques in practice, showed more efficient weekly support ticket management meetings. Finally, beyond these evaluation activities, we compared the performance of our support ticket model with that of a model developed without feature engineering; the engineered features outperformed the non-engineered model.
The artifacts created in this research are designed to serve as a starting place for organizations interested in predicting support ticket escalations, and for future researchers to build on to advance research in escalation prediction.
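The abstract reports two evaluation numbers: recall over escalated tickets and a reduction in analyst workload. As a minimal sketch of how such figures can be computed from a model's binary predictions, the snippet below takes recall as true positives over all true escalations, and assumes "workload reduction" means the fraction of tickets an analyst can skip by reviewing only the tickets the model flags; the paper's exact definition may differ, and the function name and toy data are illustrative only.

```python
def escalation_metrics(y_true, y_pred):
    """Compute recall and analyst workload reduction for binary escalation labels.

    y_true: 1 if the ticket actually escalated, else 0.
    y_pred: 1 if the model flags the ticket as at risk, else 0.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    flagged = sum(y_pred)
    # Recall: share of real escalations the model catches.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumed reading of workload reduction: share of tickets not flagged,
    # i.e., tickets an analyst would not need to review.
    workload_reduction = 1.0 - flagged / len(y_pred)
    return recall, workload_reduction

# Toy example: 10 tickets, 2 of which truly escalated; the model flags 3.
y_true = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 0, 0]
recall, reduction = escalation_metrics(y_true, y_pred)
# recall = 1.0 (both escalations caught), workload reduction ≈ 0.7
```

Under this reading, the paper's 87.36% recall with an 88.23% workload reduction means the model catches most escalations while flagging only a small slice of all tickets for review.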


Keywords: Customer relationship management · Machine learning · Escalation prediction · Customer support ticket · Design science research



Acknowledgements

We thank IBM for their data, advice, and time spent as a collaborator; special thanks to Keith Mackenzie at IBM Victoria for his contribution to this research. We thank Emma Reading for her contribution to the prototype tool. We thank the anonymous referees of both RE17 and the REJ special issue. This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the IBM Center for Advanced Studies (IBM CAS).



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. University of Victoria, Victoria, Canada
  2. Private Cloud Platform Digital Support, IBM, Toronto, Canada
