User Simulation in the Development of Statistical Spoken Dialogue Systems

  • Simon Keizer
  • Stéphane Rossignol
  • Senthilkumar Chandramohan
  • Olivier Pietquin


Statistical approaches to dialogue management have steadily increased in popularity over the last decade. Recent evaluations of such dialogue managers have shown their feasibility for sizeable domains and their advantage in terms of increased robustness. Moreover, simulated users have been shown to be highly beneficial in the development and testing of dialogue managers and, in particular, for training statistical dialogue managers. Learning the optimal policy of a POMDP dialogue manager is typically done using reinforcement learning (RL), but with the RL algorithms that are commonly used today, this process still relies on the use of a simulated user. Data-driven approaches to user simulation have been developed to train dialogue managers on more realistic user behaviour. This chapter provides an overview of user simulation techniques and evaluation methodologies. In particular, recent developments in agenda-based user simulation, dynamic Bayesian network-based simulations and inverse reinforcement learning-based user simulations are discussed in detail. Finally, we discuss ongoing work and future challenges for user simulation.


User Behaviour; Markov Decision Process; Reward Function; Real User; Dialogue System



The research leading to these results has received partial support from the European Community’s Seventh Framework Programme (FP7) under grant agreement no. 216594 (CLASSiC project), under grant agreement no. 270019 (SpaceBook project), and under grant agreement no. 270435 (JAMES project).



Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Simon Keizer (1)
  • Stéphane Rossignol (2)
  • Senthilkumar Chandramohan (2)
  • Olivier Pietquin (2)

  1. Heriot-Watt University, Edinburgh, UK
  2. SUPELEC, Supélec Campus de Metz, Metz, France
