Dialogue POMDP components (part I): learning states and observations

Chinaei, Hamid R.; Chaib-draa, Brahim

doi:10.1007/s10772-014-9244-6


Published: 15 October 2014

Volume 17, pages 309–323, (2014)

International Journal of Speech Technology



Abstract

The partially observable Markov decision process (POMDP) framework has been applied to dialogue systems because it represents uncertainty explicitly and is robust to noise. In this context, estimating the dialogue POMDP model components is a significant challenge, as they directly affect the optimized dialogue POMDP policy. We propose methods for learning these components from noisy, unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use these intentions as the dialogue POMDP states, and learn a maximum likelihood POMDP transition model from data. Since it is crucial to reduce the size of the observation set, we then propose two observation models: the keyword model and the intention model. With these two models, the number of observations is reduced significantly while POMDP performance remains high, particularly for the intention POMDP. This first part (part I) covers learning the states and observations that underpin a POMDP, with experiments on dialogues collected by SmartWheeler, an intelligent wheelchair that aims to help persons with disabilities. Part II covers learning the reward model required by the POMDP.
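As a rough illustration of the maximum likelihood transition model mentioned in the abstract, the sketch below estimates T(s, a, s') as the relative frequency count(s, a, s') / count(s, a). It assumes each dialogue turn has already been labeled with a learned intention (the state); the function name `ml_transition_model`, the optional additive `smoothing` parameter, and the toy intentions and actions are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def ml_transition_model(dialogues, states, smoothing=0.0):
    """Estimate T(s, a, s') by relative frequency.

    dialogues: iterable of dialogues, each a list of
               (state, action, next_state) triples.
    states:    list of all possible states (learned intentions).
    smoothing: additive smoothing constant (an assumption of this
               sketch; 0.0 gives the plain maximum likelihood estimate).
    """
    triple_counts = Counter()
    pair_counts = Counter()
    for dialogue in dialogues:
        for s, a, s_next in dialogue:
            triple_counts[(s, a, s_next)] += 1
            pair_counts[(s, a)] += 1

    model = {}
    for (s, a), total in pair_counts.items():
        denom = total + smoothing * len(states)
        model[(s, a)] = {
            s_next: (triple_counts[(s, a, s_next)] + smoothing) / denom
            for s_next in states
        }
    return model


# Toy data with hypothetical intentions "move" and "stop" and action "ask".
dialogues = [
    [("move", "ask", "move"), ("move", "ask", "stop")],
    [("move", "ask", "move")],
]
model = ml_transition_model(dialogues, ["move", "stop"])
print(model[("move", "ask")])
```

Only state-action pairs that actually occur in the data receive a distribution here; how unseen pairs should be handled is a separate design choice that the abstract does not specify.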


Notes

see http://mi.eng.cam.ac.uk/projects/sacti/corpora/.
at: http://www.cs.cmu.edu/~trey/zmdp/.
Note that these results are based only on the dialogue POMDP simulation, in which there are neither user utterances nor machine utterances, only simulated actions and observations.

References

Atrash, A., & Pineau, J. (2010). A Bayesian method for learning POMDP observation parameters for robot interaction management systems. In The POMDP practitioners workshop.
Blei, D. (2012). Introduction to probabilistic topic models. Communications of the ACM, 55(4), 77–84.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
Chinaei, H. R., Chaib-draa, B., & Lamontagne, L. (2009). Learning user intentions in spoken dialogue systems. In Proceedings of the 1st International Conference on Agents and Artificial Intelligence (ICAART’09), Porto, Portugal.
Choi, J., & Kim, K.-E. (2011). Inverse reinforcement learning in partially observable environments. Journal of Machine Learning Research, 12, 691–730.
Daud, A., Li, J., Zhou, L., & Muhammad, F. (2010). Knowledge discovery through directed probabilistic topic models: A survey. Frontiers of Computer Science in China, 4(2), 280–301.
Doshi, F., & Roy, N. (2007). Efficient model learning for dialog management. In Proceedings of the 2nd ACM SIGCHI/SIGART conference on Human-Robot Interaction (HRI’07), Arlington, Virginia, USA.
Doshi, F., & Roy, N. (2008). Spoken language interaction with model uncertainty: An adaptive human-robot interaction system. Connection Science, 20(4), 299–318.
Gašić, M. (2011). Statistical dialogue modelling. PhD thesis, Department of Engineering, University of Cambridge.
Gruber, A., & Popat, A. (2007). Notes regarding computations in open htmm. http://openhtmm.googlecode.com/files/htmm_computations.pdf
Gruber, A., Rosen-Zvi, M., & Weiss, Y. (2007). Hidden topic Markov models. In Artificial intelligence and statistics (AISTATS’07), San Juan, Puerto Rico, USA.
Kaelbling, L., Littman, M., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2), 99–134.
Ko, Y., & Seo, J. (2004). Learning with unlabeled data for text categorization using bootstrapping and feature projection techniques. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics (ACL’04), Barcelona, Spain.
Matsubara, S., Kimura, S., Kawaguchi, N., Yamaguchi, Y., & Inagaki, Y. (2002). Example-based speech intention understanding and its application to in-car spoken dialogue system. In Proceedings of the 19th International Conference on Computational linguistics (Vol. 1), Taipei, Taiwan.
Ng, A. Y., & Russell, S. J. (2000). Algorithms for inverse reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning (ICML’00), Stanford, CA, USA.
Paek, T., & Pieraccini, R. (2008). Automating spoken dialogue management design using machine learning: An industry perspective. Speech Communication, 50(8), 716–729.
Pineau, J., Gordon, G., & Thrun, S. (2003). Point-based value iteration: An anytime algorithm for POMDPs. In International Joint Conference on Artificial Intelligence (IJCAI’03), Acapulco, Mexico.
Pineau, J., West, R., Atrash, A., Villemure, J., & Routhier, F. (2011). On the feasibility of using a standardized test for evaluating a speech-controlled smart wheelchair. International Journal of Intelligent Control and Systems, 16(2), 124–131.
Png, S., & Pineau, J. (2011). Bayesian reinforcement learning for POMDP-based dialogue systems. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’11), Prague, Czech Republic.
Png, S., Pineau, J., & Chaib-Draa, B. (2012). Building adaptive dialogue systems via bayes-adaptive POMDPs. IEEE Journal of Selected Topics in Signal Processing, 6(8), 917–927.
Rabiner, L. R. (1990). A tutorial on hidden Markov models and selected applications in speech recognition. In Readings in speech recognition (pp. 267–296). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Roy, N., Pineau, J., & Thrun, S. (2000). Spoken dialogue management using probabilistic reasoning. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics (ACL’00), Hong Kong.
Thomson, B. (2009). Statistical methods for spoken dialogue management. PhD thesis, Department of Engineering, University of Cambridge.
Weilhammer, K., Williams, J. D., & Young, S. (2004). The SACTI-2 corpus: Guide for research users. Cambridge University. Technical report.
Williams, J. D. (2006). Partially observable Markov decision processes for spoken dialogue management. PhD thesis, Department of Engineering, University of Cambridge.
Williams, J. D., & Young, S. (2005). The SACTI-1 corpus: Guide for research users. Department of Engineering, University of Cambridge. Technical report.
Williams, J. D., & Young, S. (2007). Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 21, 393–422.
Zhang, B., Cai, Q., Mao, J., Chang, E., & Guo, B. (2001a). Spoken dialogue management as planning and acting under uncertainty. In Proceedings of the 9th European Conference on Speech Communication and Technology (Eurospeech’01), Aalborg, Denmark.
Zhang, B., Cai, Q., Mao, J., & Guo, B. (2001b). Planning and acting under uncertainty: A new model for spoken dialogue system. In Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence (UAI’01), Seattle, Washington, USA.


Author information

Authors and Affiliations

Computer Science Department, Laval University, Quebec, QC, Canada
Hamid R. Chinaei & Brahim Chaib-draa


Corresponding author

Correspondence to Brahim Chaib-draa.


About this article

Cite this article

Chinaei, H.R., Chaib-draa, B. Dialogue POMDP components (part I): learning states and observations. Int J Speech Technol 17, 309–323 (2014). https://doi.org/10.1007/s10772-014-9244-6


Received: 09 July 2013
Accepted: 15 July 2014
Published: 15 October 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s10772-014-9244-6
