Using Graphical Models for an Intelligent Mixed-Initiative Dialog Management System
The main goal of dialog management is to gather all the information needed to perform a task, e.g. an SQL query or a navigation task. Two principal approaches to dialog management exist: system-directed and mixed-initiative. In this paper, we combine both approaches in a novel way to address the problem of natural, intuitive dialog management. The objective of our approach is to provide a natural dialog flow. The whole dialog is therefore represented as a finite state machine: the states hold the information gathered during the dialog, and the transitions denote the dialog steps into which the dialog is divided. The information is obtained from each naturally spoken sentence by hierarchical decoding into tags, e.g. the name tag and the address tag. These information tags are gathered during the dialog, either on the user's initiative or through explicit questions from the dialog manager. The models use information from the semantic information tags, the dialog history, and the training corpus. Integrating these sources, we find the best path to the end of the dialog by Viterbi decoding through the transition network after each information step. For the training phase, we extract all 21650 naturally spoken questions, together with the SQL queries that answer them, from the Air Travel Information System (ATIS) database. The experiments were carried out on 200 automatically generated dialog sentences. The system obtains the semantic information in all test sentences and leads the dialogs successfully to the end. In 66.5% of the sample dialogs, the system reaches the end in the minimum number of required dialog steps; the remaining 33.5% of the dialogs take more steps than necessary.
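To illustrate the idea of decoding a best path through a transition network, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written network in which each state names the information tags gathered so far; the state names, the transition probabilities, and the function `viterbi_best_path` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical transition network: each state names the tags gathered so far.
# The probabilities are made-up illustration values, not taken from the paper.
TRANSITIONS = {
    "start": {"name": 0.6, "address": 0.3, "name+address": 0.1},
    "name": {"name+address": 0.9, "name": 0.1},
    "address": {"name+address": 0.8, "address": 0.2},
    "name+address": {"end": 1.0},
    "end": {},
}


def viterbi_best_path(transitions, start, goal, max_steps=10):
    """Return (log_prob, path) of the most probable path from start to goal.

    Returns None if the goal cannot be reached within max_steps transitions.
    """
    # best[s] holds the best (log-probability, path) reaching s in exactly t steps.
    best = {start: (0.0, [start])}
    best_goal = None
    for _ in range(max_steps):
        nxt = {}
        for state, (score, path) in best.items():
            for succ, prob in transitions[state].items():
                cand = score + math.log(prob)
                # Keep only the highest-scoring path into each successor.
                if succ not in nxt or cand > nxt[succ][0]:
                    nxt[succ] = (cand, path + [succ])
        if goal in nxt and (best_goal is None or nxt[goal][0] > best_goal[0]):
            best_goal = nxt[goal]
        best = nxt
        if not best:
            break
    return best_goal


# Decode from the initial state, then again after one information step
# in which the (hypothetical) name tag has been gathered.
log_prob, path = viterbi_best_path(TRANSITIONS, "start", "end")
log_prob_after, path_after = viterbi_best_path(TRANSITIONS, "name", "end")
```

Re-running the decoder from the current state after each information step mirrors the description above: the remaining best path is recomputed whenever new tags have been gathered. In this toy network the most probable path is not the shortest one, which is one way a dialog can end up longer than the minimum number of steps.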
Keywords: dialog management, learning, knowledge management, intelligent systems