
Predicting Interaction Quality in Customer Service Dialogs

Chapter in: Advanced Social Interaction with Agents

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 510)

Abstract

In this paper, we apply the Interaction Quality (IQ) dialog evaluation framework to human-computer customer service dialogs. The IQ framework can be used to predict user satisfaction at the utterance level of a dialog. Such a rating framework is useful for online adaptation of dialog system behavior and for increasing user engagement through personalization. We annotated a dataset of 120 human-computer dialogs from two customer service application domains with IQ scores. Our inter-annotator agreement (\(\rho = 0.72/0.66\)) is similar to the agreement observed on the IQ annotations of the publicly available LEGO bus information corpus. An in-domain SVM model trained on a small set of call center domain dialogs achieves a correlation of \(\rho = 0.53/0.56\) with the annotated IQ scores. A generic model built exclusively on public LEGO data achieves 94%/65% of the in-domain model's performance. An adapted model, built by extending the public dataset with a small set of dialogs in a target domain, achieves 102%/81% of the in-domain model's performance.
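As a rough illustration of the evaluation measure used above, the sketch below computes Spearman's \(\rho\) [15] with scipy.stats.spearmanr on hypothetical utterance-level IQ ratings. The ratings and dialog are invented for illustration only; the sketch assumes the standard five-point IQ scale of the IQ annotation scheme [13], not the paper's actual data.

```python
# Minimal sketch: Spearman's rho as used both for inter-annotator agreement
# and for scoring predicted Interaction Quality (IQ) against annotated IQ.
# The rating values below are hypothetical; only the 1-5 IQ scale follows
# the standard IQ annotation scheme [13].
from scipy.stats import spearmanr

# Hypothetical utterance-level IQ ratings from two annotators for one dialog.
annotator_1 = [5, 5, 4, 4, 3, 3, 2, 3, 4, 4]
annotator_2 = [5, 4, 4, 3, 3, 2, 2, 3, 3, 4]

rho, p_value = spearmanr(annotator_1, annotator_2)
print(f"inter-annotator agreement: rho={rho:.2f} (p={p_value:.3f})")

# The same statistic measures how well model predictions track the annotation.
predicted_iq = [5, 5, 4, 3, 3, 3, 2, 2, 4, 4]
rho_model, _ = spearmanr(annotator_1, predicted_iq)
print(f"model vs. annotation: rho={rho_model:.2f}")
```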


Notes

  1. The barge-in feature is GENERIC but not recorded in the INTER dataset.

  2. We use LinearSVC from the sklearn package with the default parameters (a minimal sketch follows these notes).

  3. We report the results on the INTER corpus using LEGO1 for training, as it achieved higher scores than the models trained on LEGO2.

  4. This result is comparable to the result reported in [19] on the LEGO corpus.

  5. The heat map is drawn on a logarithmic scale.

References

  1. Beringer N, Kartal U, Louka K, Schiel F, Türk U, et al (2002) PROMISE: a procedure for multimodal interactive system evaluation. In: Multimodal resources and multimodal systems evaluation workshop, p 14

  2. Cohen J (1968) Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70:213-220

  3. Evanini K, Hunter P, Liscombe J, Suendermann D, Dayanidhi K, Pieraccini R (2008) Caller experience: a method for evaluating dialog systems and its automatic prediction. In: Proceedings of the 2008 IEEE spoken language technology workshop (SLT 2008), pp 129-132

  4. Hartikainen M, Salonen EP, Turunen M (2004) Subjective evaluation of spoken dialogue systems using SERVQUAL method. In: INTERSPEECH

  5. Hone KS, Graham R (2000) Towards a tool for the subjective assessment of speech system interfaces (SASSI). Nat Lang Eng 6(3-4):287-303

  6. Pérez F, Granger BE (2007) IPython: a system for interactive scientific computing. Comput Sci Eng 9(3):21-29

  7. Pragst L, Ultes S, Minker W (2017) Recurrent neural network interaction quality estimation. Springer Singapore, Singapore, pp 381-393

  8. Raux A, Langner B, Black A, Eskenazi M (2005) Let's Go public! Taking a spoken dialog system to the real world. In: Proceedings of Eurospeech

  9. Reichheld FF (2004) The one number you need to grow. Harv Bus Rev 81(12):46-54

  10. Roy S, Mariappan R, Dandapat S, Srivastava S, Galhotra S, Peddamuthu B (2016) QART: a system for real-time holistic quality assurance for contact center dialogues. In: Proceedings of the thirtieth AAAI conference on artificial intelligence

  11. Schmitt A, Hank C, Liscombe J (2008) Detecting problematic dialogs with automated agents. In: Proceedings of the 4th IEEE tutorial and research workshop on perception and interactive technologies for speech-based systems: perception in multimodal dialogue systems. Springer, Berlin, Heidelberg, pp 72-80

  12. Schmitt A, Schatz B, Minker W (2011) Modeling and predicting quality in spoken human-computer interaction. In: Proceedings of the SIGDIAL 2011 conference. Association for Computational Linguistics, pp 173-184

  13. Schmitt A, Ultes S (2015) Interaction quality: assessing the quality of ongoing spoken dialog interaction by experts, and how it relates to user satisfaction. Speech Commun 74:12-36

  14. Schmitt A, Ultes S, Minker W (2012) A parameterized and annotated spoken dialog corpus of the CMU Let's Go bus information system. In: Proceedings of the eighth international conference on language resources and evaluation (LREC'12). European Language Resources Association (ELRA), Istanbul, Turkey

  15. Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15:88-103

  16. Suendermann D, Liscombe J, Pieraccini R (2010) Minimally invasive surgery for spoken dialog systems. In: INTERSPEECH, pp 98-101

  17. Ultes S, Kraus M, Schmitt A, Minker W (2015) Quality-adaptive spoken dialogue initiative selection and implications on reward modelling. In: Proceedings of the 16th annual meeting of the special interest group on discourse and dialogue. Association for Computational Linguistics, Prague, Czech Republic, pp 374-383

  18. Ultes S, Minker W (2014) Interaction quality estimation in spoken dialogue systems using hybrid-HMMs. In: Proceedings of the SIGDIAL 2014 conference, the 15th annual meeting of the special interest group on discourse and dialogue, Philadelphia, PA, USA, pp 208-217

  19. Ultes S, Sánchez MJP, Schmitt A, Minker W (2015) Analysis of an extended interaction quality corpus. In: Natural language dialog systems and intelligent assistants. Springer, pp 41-52

  20. Walker M, Kamm C, Litman D (2000) Towards developing general models of usability with PARADISE. Nat Lang Eng 6(3-4):363-377

  21. Walker MA, Langkilde-Geary I, Hastie HW, Wright JH, Gorin A (2002) Automatically training a problematic dialogue predictor for a spoken dialogue system. J Artif Intell Res 16(1):293-319


Author information

Correspondence to Svetlana Stoyanchev.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter


Cite this chapter

Stoyanchev, S., Maiti, S., Bangalore, S. (2019). Predicting Interaction Quality in Customer Service Dialogs. In: Eskenazi, M., Devillers, L., Mariani, J. (eds) Advanced Social Interaction with Agents. Lecture Notes in Electrical Engineering, vol 510. Springer, Cham. https://doi.org/10.1007/978-3-319-92108-2_16


  • DOI: https://doi.org/10.1007/978-3-319-92108-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92107-5

  • Online ISBN: 978-3-319-92108-2

  • eBook Packages: Engineering (R0)
