
An Empirical Study of Speech Recognition Errors in Human-Computer Dialogue

Chapter in: Current and New Directions in Discourse and Dialogue

Part of the book series: Text, Speech and Language Technology (TLTB, volume 22)


Abstract

The development of spoken dialogue systems quickly runs up against the limitations of current speech recognition technology, which make recognition errors a recurring problem for any dialogue system. Several studies have suggested that there is no clear correlation between speech recognition scores and user satisfaction, or the ability to complete the tasks underlying spoken dialogue [Yankelovich et al., 1995; Dybkjaer et al., 1997], suggesting that a certain level of errors should not prevent spoken dialogue systems from being successful. However, most studies of speech recognition errors have concentrated either on parsing incomplete utterances or on global dialogue robustness, i.e., at the level of task completion [Allen et al., 1996; Stromback and Jonsson, 1998; Brandt-Pook et al., 1996]. Very few studies have explored the impact of speech recognition errors across the various components of a dialogue system, or specifically evaluated their impact on overall dialogue behaviour.
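The abstract contrasts recognition-level scores with dialogue-level outcomes such as task completion. As a rough illustration only (not taken from the chapter), the minimal Python sketch below computes word error rate (WER), the most common recognition-level score, from a reference transcript and a recogniser hypothesis; the word_error_rate helper and the example utterance are hypothetical.

# Minimal sketch (assumption, not from the chapter): word error rate (WER),
# the usual "speech recognition score" contrasted above with dialogue-level
# outcomes. WER = (substitutions + deletions + insertions) / reference words.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Word-level edit distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution and one deletion in a five-word reference give WER = 0.4,
# even though a dialogue system might still complete the underlying task.
print(word_error_rate("show me flights to boston", "show me flight boston"))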


References

  • James F. Allen, Brad Miller, Eric Ringger, and Teresa Sikorski (1996). Robust Understanding in a Dialogue System. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, San Francisco, pp. 62–70.

  • Jonas Beskow and Scott McGlashan (1997). Olga: A Conversational Agent with Gestures. Proceedings of the IJCAI’97 Workshop on Animated Interface Agents: Making them Intelligent, Nagoya, Japan, August 1997.

  • Manuela Boros, W. Eckert, F. Gallwitz, G. Gorz, G. Hanrieder, and H. Niemann (1996). Towards Understanding Spontaneous Speech: Word Accuracy vs. Concept Accuracy. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 1009–1012.

  • Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, and Gerhard Sagerer (1996). A Robust Dialogue System for Making an Appointment. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 693–696.

  • Marc Cavazza (1998). An Integrated TFG Parser with Explicit Tree Typing. Proceedings of the Fourth TAG+ Workshop, IRCS, University of Pennsylvania, pp. 34–37.

  • Marc Cavazza (2000). From Speech Acts to Search Acts: A Semantic Approach to Speech Act Recognition. Proceedings of GOTALOG 2000, Gothenburg, Sweden, pp. 187–190.

  • Kerstin Fischer and Anton Batliner (2000). What Makes Speakers Angry in Human-Computer Conversation. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 62–67.

  • James Glass, Joseph Polifroni, Stephanie Seneff, and Victor Zue (2000). Data Collection and Performance Evaluation of Spoken Dialogue Systems: The MIT Experience. Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP’00), Beijing, China.

  • Eli Hagen (2000). A Flexible Spoken Dialogue Manager. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 68–73.

  • Bernd Hildebrandt, Heike Rautenstrauch, and Gerhard Sagerer (1996). Evaluation of Spoken Language Understanding and Dialogue Systems. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 685–688.

  • Ian Lewin, Ralph Becket, Johan Boye, David Carter, Manny Rayner, and Mats Wiren (1999). Language Processing for Spoken Dialogue Systems: Is Shallow Parsing Enough? In: Accessing Information in Spoken Audio: Proceedings of the ESCA ETRW Workshop, Cambridge, pp. 37–42.

  • Bernd Ludwig, Martin Klarner, Heinrich Niemann, and Gunther Goerz (2000). Context and Content in Dialogue Systems. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 105–111.

  • Susan Luperfoy and David Duff (1996). Disco: A Four-Step Dialogue Recovery Program. Proceedings of the AAAI Workshop on Detection, Repair and Prevention of Human-Machine Miscommunication, Portland, USA.

  • Elisabeth Maier (1996). Context Construction as Subtask of Dialogue Processing: The VERBMOBIL Case. Proceedings of the Eleventh Twente Workshop on Language Technologies (TWLT-11), Dialogue Management in Natural Language Systems, University of Twente, The Netherlands, pp. 113–122.

  • Katashi Nagao and Akikazu Takeuchi (1994). Speech Dialogue with Facial Displays: Multimodal Human-Computer Conversation. Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (ACL’94), pp. 102–109.

  • Joseph Polifroni and Stephanie Seneff (2000). Galaxy-II as an Architecture for Spoken Dialogue Evaluation. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece, pp. 725–730.

  • Tony Robinson, Mike Hochberg, and Steve Renals (1996). The Use of Recurrent Neural Networks in Continuous Speech Recognition. In: C. H. Lee, K. K. Paliwal, and F. K. Soong (eds.), Automatic Speech and Speaker Recognition: Advanced Topics, Kluwer.

  • David Sadek (1999). Design Considerations on Dialogue Systems: From Theory to Technology, the Case of Artimis. Proceedings of the ESCA Workshop on Interactive Dialogue in Multimodal Systems, Kloster Irsee, Germany, pp. 173–187.

  • Kathleen Stibler and James Denny (2001). A Three-Tiered Evaluation Approach for Interactive Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.

  • Lena Stromback and Arne Jonsson (1998). Robust Interpretation for Spoken Dialogue Systems. Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP’98), Sydney, Australia.

  • David Traum and Elisabeth A. Hinkelman (1992). Conversation Acts in Task-Oriented Spoken Dialogue. Computational Intelligence, vol. 8, no. 3.

  • Markku Turunen and Jaakko Hakulinen (2001). Agent-Based Error Handling in Spoken Dialogue Systems. Proceedings of Eurospeech 2001, Aalborg, Denmark, pp. 2189–2192.

  • Marilyn A. Walker (1996). Inferring Acceptance and Rejection in Dialogue by Default Rules of Inference. Language and Speech, vol. 39, no. 2.

  • Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, and Alicia Abella (1997). PARADISE: A Framework for Evaluating Spoken Dialogue Agents. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pp. 271–280.

  • Marilyn A. Walker and Rebecca Passonneau (2001). DATE: A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.

  • Marilyn A. Walker, Lynette Hirschman, and John Aberdeen (2000). Evaluation for DARPA Communicator Spoken Dialogue Systems. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece.

  • Nicole Yankelovich, Gina-Anne Levow, and Matt Marx (1995). Designing SpeechActs: Issues in Speech User Interfaces. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’95), Denver, USA.


Copyright information

© 2003 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Cavazza, M. (2003). An Empirical Study of Speech Recognition Errors in Human-Computer Dialogue. In: van Kuppevelt, J., Smith, R.W. (eds) Current and New Directions in Discourse and Dialogue. Text, Speech and Language Technology, vol 22. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0019-2_6


  • DOI: https://doi.org/10.1007/978-94-010-0019-2_6

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-1615-8

  • Online ISBN: 978-94-010-0019-2

