Empirically Evaluating an Adaptable Spoken Dialogue System

  • Conference paper
UM99 User Modeling

Part of the book series: CISM International Centre for Mechanical Sciences (CISM, volume 407)

Abstract

Recent technological advances have made it possible to build real-time, interactive spoken dialogue systems for a wide variety of applications. However, when users do not respect the limitations of such systems, performance typically degrades. Although users differ with respect to their knowledge of system limitations, and although different dialogue strategies make system limitations more apparent to users, most current systems do not try to improve performance by adapting dialogue behavior to individual users. This paper presents an empirical evaluation of TOOT, an adaptable spoken dialogue system for retrieving train schedules on the web. We conduct an experiment in which 20 users carry out 4 tasks with both adaptable and non-adaptable versions of TOOT, resulting in a corpus of 80 dialogues. The values for a wide range of evaluation measures are then extracted from this corpus. Our results show that adaptable TOOT generally outperforms non-adaptable TOOT, and that the utility of adaptation depends on TOOT’s initial dialogue strategies.
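The corpus size stated in the abstract follows directly from the experimental design: 20 users each carrying out 4 tasks yields 80 dialogues, split between the adaptable and non-adaptable versions of TOOT. A minimal sketch of that arithmetic (variable names are illustrative, not from the paper):

```python
# Corpus size implied by the within-subjects design described in the
# abstract: each user completes all tasks, divided across two versions.
n_users = 20
n_tasks_per_user = 4   # tasks per user, split between the two TOOT versions
n_versions = 2         # adaptable and non-adaptable TOOT

total_dialogues = n_users * n_tasks_per_user
dialogues_per_version = total_dialogues // n_versions

print(total_dialogues)        # → 80 dialogues in the corpus
print(dialogues_per_version)  # → 40 dialogues per system version
```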

We thank J. Chu-Carroll, C. Kamm, D. Lewis, M. Walker, and S. Whittaker for helpful comments.




Copyright information

© 1999 Springer Science+Business Media New York

About this paper

Cite this paper

Litman, D.J., Pan, S. (1999). Empirically Evaluating an Adaptable Spoken Dialogue System. In: Kay, J. (eds) UM99 User Modeling. CISM International Centre for Mechanical Sciences, vol 407. Springer, Vienna. https://doi.org/10.1007/978-3-7091-2490-1_6

  • DOI: https://doi.org/10.1007/978-3-7091-2490-1_6

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-211-83151-9

  • Online ISBN: 978-3-7091-2490-1

  • eBook Packages: Springer Book Archive
