Dialogue POMDP components (part I): learning states and observations

International Journal of Speech Technology

Abstract

The partially observable Markov decision process (POMDP) has been applied in dialogue systems as a formal framework that represents uncertainty explicitly while remaining robust to noise. In this context, estimating the dialogue POMDP model components is a significant challenge, because these components directly affect the optimized dialogue POMDP policy. To this end, we propose methods for learning dialogue POMDP model components from noisy, unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use them as the dialogue POMDP states, and learn a maximum likelihood POMDP transition model from data. Because it is crucial to reduce the size of the observation set, we then propose two observation models: the keyword model and the intention model. With these two models, the number of observations is reduced significantly while POMDP performance remains high, particularly for the intention POMDP. This first part (Part I) covers learning the states and observations that underpin a POMDP, with experiments on dialogues collected by SmartWheeler (an intelligent wheelchair that aims to help persons with disabilities). Part II covers learning the reward model required by the POMDP.
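
As an illustration of the maximum likelihood transition model mentioned in the abstract, the following minimal Python sketch estimates T(s' | s, a) by relative frequency. It assumes each dialogue turn has already been labelled with a learned user intention. The function name `ml_transition_model`, the intention and action labels, and the toy data are hypothetical and not taken from the article; the article's own estimator may differ (for example, by applying smoothing).

```python
from collections import Counter, defaultdict


def ml_transition_model(triples):
    """Estimate T(s' | s, a) by relative frequency from (s, a, s') triples.

    Pairs (s, a) that never occur in the data are simply absent from the
    result; no smoothing is applied in this sketch.
    """
    counts = defaultdict(Counter)
    for s, a, s_next in triples:
        counts[(s, a)][s_next] += 1

    model = {}
    for key, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalise the counts so each conditional distribution sums to 1.
        model[key] = {s_next: n / total for s_next, n in next_counts.items()}
    return model


# Hypothetical toy data: (intention, machine action, next intention).
triples = [
    ("move", "confirm", "move"),
    ("move", "confirm", "move"),
    ("move", "confirm", "stop"),
    ("stop", "ask", "stop"),
]
T = ml_transition_model(triples)
```

Here `T[("move", "confirm")]` is the estimated distribution over next intentions after the machine confirms while the user intends "move". With sparse dialogue data, many (s, a) pairs may be unobserved, which is one reason a practical estimator might add smoothing.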

Notes

  1. see http://mi.eng.cam.ac.uk/projects/sacti/corpora/.

  2. at: http://www.cs.cmu.edu/~trey/zmdp/.

  3. Note that these results are based only on the dialogue POMDP simulation, in which there are no user or machine utterances, only simulated actions and observations.

Author information

Correspondence to Brahim Chaib-draa.

Cite this article

Chinaei, H.R., Chaib-draa, B. Dialogue POMDP components (part I): learning states and observations. Int J Speech Technol 17, 309–323 (2014). https://doi.org/10.1007/s10772-014-9244-6

  • DOI: https://doi.org/10.1007/s10772-014-9244-6