Abstract
Previous research suggests that multimodal dialogue systems providing both speech and pen input, and outputting a combination of spoken language and graphics, are more robust than unimodal systems based on speech or graphics alone (André, 2002; Oviatt, 1999). Such systems are complex to build, and significant research and evaluation effort must typically be expended to generate well-tuned modules for each system component. This chapter describes experiments utilising two complementary evaluation methods that can expedite the design process: (1) a Wizard-of-Oz data collection and evaluation using a novel Wizard tool we developed; and (2) an Overhearer evaluation experiment utilising logged interactions with the real system. We discuss the advantages and disadvantages of both methods and summarise how these two experiments have informed our research on dialogue management and response generation for the multimodal dialogue system MATCH.
References
André, E. (2002). Natural language in multimedia/multimodal systems. In Mitkov, R., editor, Handbook of Computational Linguistics, pages 715–734. Oxford University Press.
Bangalore, S. and Johnston, M. (2000). Tight coupling of multimodal language processing with speech recognition. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 126–129, Beijing, China.
Beutnagel, M., Conkie, A., Schroeter, J., Stylianou, Y., and Syrdal, A. (1999). The AT&T next-generation text-to-speech system. In Proceedings of Meeting of ASA/EAA/DAGA, pages 20–24, Berlin, Germany.
Carenini, G. and Moore, J. D. (2000). An empirical study of the influence of argument conciseness on argument effectiveness. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL), pages 150–157, Hong Kong, China.
Carenini, G. and Moore, J. D. (2001). An empirical study of the influence of user tailoring on evaluative argument effectiveness. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pages 1307–1314, Seattle, Washington, USA.
Edwards, W. and Barron, F. H. (1994). SMART and SMARTER: Improved simple methods for multiattribute utility measurement. Organizational Behavior and Human Decision Processes, 60:306–325.
Johnston, M. and Bangalore, S. (2000). Finite-state multimodal parsing and understanding. In Proceedings of International Conference on Computational Linguistics, pages 1200–1208, Saarbrücken, Germany.
Johnston, M. and Bangalore, S. (2001). Finite-state methods for multimodal parsing and integration. In Proceedings of ESSLLI Workshop on Finite-state Methods, European Summer School in Logic, Language and Information, pages 74–80, Helsinki, Finland.
Johnston, M., Bangalore, S., Vasireddy, G., Stent, A., Ehlen, P., Walker, M., Whittaker, S., and Maloor, P. (2002). MATCH: An architecture for multi-modal dialogue systems. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL), pages 376–383, Philadelphia, Pennsylvania, USA.
Keeney, R. and Raiffa, H. (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. John Wiley and Sons, Chichester, United Kingdom.
Levin, E., Narayanan, S., Pieraccini, R., Biatov, K., Bocchieri, E., Fabbrizio, G. D., Eckert, W., Lee, S., Pokrovsky, A., Rahim, M., Ruscitti, P., and Walker, M. (2000). The AT&T DARPA Communicator mixed-initiative spoken dialog system. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 122–125, Beijing, China.
Oviatt, S. (1999). Ten myths of multimodal interaction. Communications of the ACM, 42(11):74–81.
Sharp, R., Bocchieri, E., Castillo, C., Parthasarathy, S., Rath, C., Riley, M., and Rowland, J. (1997). The Watson speech recognition engine. In Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4065–4068, Munich, Germany.
Stent, A., Walker, M., Whittaker, S., and Maloor, P. (2002). User-tailored generation for spoken dialogue: An experiment. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 1281–1284, Denver, Colorado, USA.
Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., and Vasireddy, G. (2002). Speech-plans: Generating evaluative responses in spoken dialogue. In Proceedings of International Conference on Natural Language Generation (INLG), pages 73–80, New York, New York, USA.
Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., and Vasireddy, G. (2004). Generation and evaluation of user tailored responses in dialogue. Cognitive Science, In press.
Whittaker, S., Walker, M., and Moore, J. (2002). Fish or fowl: A Wizard of Oz evaluation of dialogue strategies in the restaurant domain. In Proceedings of International Conference on Language Resources and Evaluation (LREC), pages 1074–1078, Las Palmas, Gran Canaria, Spain.
Copyright information
© 2005 Springer
About this chapter
Cite this chapter
Whittaker, S., Walker, M. (2005). Evaluating Dialogue Strategies in Multimodal Dialogue Systems. In: Minker, W., Bühler, D., Dybkjær, L. (eds) Spoken Multimodal Human-Computer Dialogue in Mobile Environments. Text, Speech and Language Technology, vol 28. Springer, Dordrecht. https://doi.org/10.1007/1-4020-3075-4_14
DOI: https://doi.org/10.1007/1-4020-3075-4_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-3073-4
Online ISBN: 978-1-4020-3075-8
eBook Packages: Computer Science (R0)