Evaluating Dialogue Strategies in Multimodal Dialogue Systems

  • Chapter
Spoken Multimodal Human-Computer Dialogue in Mobile Environments

Part of the book series: Text, Speech and Language Technology (TLTB, volume 28)

Abstract

Previous research suggests that multimodal dialogue systems providing both speech and pen input, and outputting a combination of spoken language and graphics, are more robust than unimodal systems based on speech or graphics alone (André, 2002; Oviatt, 1999). Such systems are complex to build and significant research and evaluation effort must typically be expended to generate well-tuned modules for each system component. This chapter describes experiments utilising two complementary evaluation methods that can expedite the design process: (1) a Wizard-of-Oz data collection and evaluation using a novel Wizard tool we developed; and (2) an Overhearer evaluation experiment utilising logged interactions with the real system. We discuss the advantages and disadvantages of both methods and summarise how these two experiments have informed our research on dialogue management and response generation for the multimodal dialogue system MATCH.

References

  • André, E. (2002). Natural language in multimedia/multimodal systems. In Mitkov, R., editor, Handbook of Computational Linguistics, pages 715–734. Oxford University Press.

  • Bangalore, S. and Johnston, M. (2000). Tight coupling of multimodal language processing with speech recognition. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 126–129, Beijing, China.

  • Beutnagel, M., Conkie, A., Schroeter, J., Stylianou, Y., and Syrdal, A. (1999). The AT&T next-generation text-to-speech system. In Proceedings of Meeting of ASA/EAA/DAGA, pages 20–24, Berlin, Germany.

  • Carenini, G. and Moore, J. D. (2000). An empirical study of the influence of argument conciseness on argument effectiveness. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL), pages 150–157, Hong Kong, China.

  • Carenini, G. and Moore, J. D. (2001). An empirical study of the influence of user tailoring on evaluative argument effectiveness. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pages 1307–1314, Seattle, Washington, USA.

  • Edwards, W. and Barron, F. H. (1994). SMART and SMARTER: Improved simple methods for multiattribute utility measurement. Organizational Behavior and Human Decision Processes, 60:306–325.

  • Johnston, M. and Bangalore, S. (2000). Finite-state multimodal parsing and understanding. In Proceedings of International Conference on Computational Linguistics, pages 1200–1208, Saarbrücken, Germany.

  • Johnston, M. and Bangalore, S. (2001). Finite-state methods for multimodal parsing and integration. In Proceedings of ESSLLI Workshop on Finite-state Methods, European Summer School in Logic, Language and Information, pages 74–80, Helsinki, Finland.

  • Johnston, M., Bangalore, S., Vasireddy, G., Stent, A., Ehlen, P., Walker, M., Whittaker, S., and Maloor, P. (2002). MATCH: An architecture for multi-modal dialogue systems. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL), pages 376–383, Philadelphia, Pennsylvania, USA.

  • Keeney, R. and Raiffa, H. (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. John Wiley and Sons, Chichester, United Kingdom.

  • Levin, E., Narayanan, S., Pieraccini, R., Biatov, K., Bocchieri, E., Fabbrizio, G. D., Eckert, W., Lee, S., Pokrovsky, A., Rahim, M., Ruscitti, P., and Walker, M. (2000). The AT&T DARPA Communicator mixed-initiative spoken dialog system. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 122–125, Beijing, China.

  • Oviatt, S. (1999). Ten myths of multimodal interaction. Communications of the ACM, 42(11):74–81.

  • Sharp, R., Bocchieri, E., Castillo, C., Parthasarathy, S., Rath, C., Riley, M., and Rowland, J. (1997). The Watson speech recognition engine. In Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4065–4068, Munich, Germany.

  • Stent, A., Walker, M., Whittaker, S., and Maloor, P. (2002). User-tailored generation for spoken dialogue: An experiment. In Proceedings of International Conference on Spoken Language Processing (ICSLP), pages 1281–1284, Denver, Colorado, USA.

  • Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., and Vasireddy, G. (2002). Speech-plans: Generating evaluative responses in spoken dialogue. In Proceedings of International Conference on Natural Language Generation (INLG), pages 73–80, New York, New York, USA.

  • Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., and Vasireddy, G. (2004). Generation and evaluation of user tailored responses in dialogue. Cognitive Science, In press.

  • Whittaker, S., Walker, M., and Moore, J. (2002). Fish or fowl: A Wizard of Oz evaluation of dialogue strategies in the restaurant domain. In Proceedings of International Conference on Language Resources and Evaluation (LREC), pages 1074–1078, Las Palmas, Gran Canaria, Spain.

Copyright information

© 2005 Springer

About this chapter

Cite this chapter

Whittaker, S., Walker, M. (2005). Evaluating Dialogue Strategies in Multimodal Dialogue Systems. In: Minker, W., Bühler, D., Dybkjær, L. (eds) Spoken Multimodal Human-Computer Dialogue in Mobile Environments. Text, Speech and Language Technology, vol 28. Springer, Dordrecht. https://doi.org/10.1007/1-4020-3075-4_14

  • DOI: https://doi.org/10.1007/1-4020-3075-4_14

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-3073-4

  • Online ISBN: 978-1-4020-3075-8

  • eBook Packages: Computer Science, Computer Science (R0)
