International Journal of Speech Technology, Volume 17, Issue 4, pp 309–323

Dialogue POMDP components (part I): learning states and observations

  • Hamid R. Chinaei
  • Brahim Chaib-draa

Abstract

The partially observable Markov decision process (POMDP) framework has been applied in dialogue systems as a formal framework that represents uncertainty explicitly while remaining robust to noise. In this context, estimating the dialogue POMDP model components is a significant challenge, as they directly affect the optimized dialogue POMDP policy. To this end, we propose methods for learning dialogue POMDP model components from noisy and unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use them as the dialogue POMDP states, and learn a maximum likelihood POMDP transition model from data. Since it is crucial to reduce the size of the observation space, we then propose two observation models: the keyword model and the intention model. With these two models, the number of observations is reduced significantly while POMDP performance remains high, particularly for the intention POMDP. Learning the states and observations that underlie a POMDP is covered in this first part (part I) and evaluated on dialogues collected by SmartWheeler (an intelligent wheelchair that aims to help persons with disabilities). Part II covers learning the reward model required by the POMDP.
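As a rough illustration of the maximum likelihood transition model mentioned in the abstract, the sketch below estimates T(s' | s, a) by normalizing counts of observed (state, action, next state) triples. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each dialogue turn has already been labeled with an inferred user intention, and the function name `ml_transition_model`, the `smoothing` parameter, and the toy intentions and actions are all hypothetical.

```python
from collections import Counter, defaultdict


def ml_transition_model(trajectories, states, smoothing=0.0):
    """Estimate T(s' | s, a) from (s, a, s') triples by normalized counts.

    `trajectories` is a list of dialogues, each a list of (s, a, s_next)
    triples. `smoothing` is an optional additive constant (an assumption
    of this sketch); with 0.0 the result is the plain ML estimate.
    """
    counts = defaultdict(Counter)
    for dialogue in trajectories:
        for s, a, s_next in dialogue:
            counts[(s, a)][s_next] += 1

    model = {}
    for (s, a), next_counts in counts.items():
        total = sum(next_counts.values()) + smoothing * len(states)
        model[(s, a)] = {
            sp: (next_counts[sp] + smoothing) / total for sp in states
        }
    return model


# Hypothetical toy data: intentions "move" and "stop", system actions
# "ask" and "confirm". These labels are invented for illustration only.
states = ["move", "stop"]
trajectories = [
    [("move", "ask", "move"), ("move", "ask", "stop")],
    [("move", "ask", "move"), ("stop", "confirm", "stop")],
]

T = ml_transition_model(trajectories, states)
for (s, a), dist in sorted(T.items()):
    print(s, a, dist)
```

Only (state, action) pairs that occur in the data receive a distribution here; how unseen pairs should be handled is a design choice this sketch leaves open.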

Keywords

Partially observable Markov decision processes (POMDP) · Unsupervised learning · Learning observations and states · Healthcare dialogue management

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Computer Science Department, Laval University, Quebec, Canada
