Abstract
The development of spoken dialogue systems soon runs up against the limitations of current speech recognition technology, which make recognition errors a recurring problem for any dialogue system. Several studies have suggested that there is no clear correlation between speech recognition scores and user satisfaction, or the ability to complete the tasks underlying spoken dialogue [Yankelovich et al., 1995] [Dybkjaer et al., 1997], which implies that a certain level of errors need not prevent spoken dialogue systems from being successful. However, most studies of speech recognition errors have concentrated either on parsing incomplete utterances or on global dialogue robustness, i.e. at the task completion level [Allen et al., 1996] [Stromback and Jonsson, 1998] [Brandt-Pook et al., 1996]. Very few studies have explored the impact of speech recognition errors across the various components of a dialogue system, or specifically evaluated their impact on overall dialogue behaviour.
References
James F. Allen, Brad Miller, Eric Ringger, and Teresa Sikorski (1996). Robust Understanding in a Dialogue System. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, San Francisco, pp. 62–70.
Jonas Beskow and Scott McGlashan (1997). Olga: A Conversational Agent with Gestures. In: Proceedings of the IJCAI’97 Workshop on Animated Interface Agents — Making them Intelligent, Nagoya, Japan, August 1997.
Manuela Boros, W. Eckert, F. Gallwitz, G. Gorz, G. Hanrieder, and H. Niemann (1996). Towards Understanding Spontaneous Speech: Word Accuracy Vs. Concept Accuracy. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 1009–1012.
Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, and Gerhard Sagerer (1996). A Robust Dialogue System for Making an Appointment. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 693–696.
Marc Cavazza (1998). An Integrated TFG Parser with Explicit Tree Typing. In: Proceedings of the Fourth TAG+ Workshop, IRCS, University of Pennsylvania, pp. 34–37.
Marc Cavazza (2000). From Speech Acts to Search Acts: a Semantic Approach to Speech Act Recognition. Proceedings of GOTALOG 2000, Gothenburg, Sweden, pp. 187–190.
Kerstin Fischer and Anton Batliner (2000). What Makes Speakers Angry in Human-Computer Conversation. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 62–67.
James Glass, Joseph Polifroni, Stephanie Seneff, and Victor Zue (2000). Data Collection and Performance Evaluation of Spoken Dialogue Systems: The MIT Experience. Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP’00), Beijing, China.
Eli Hagen (2000). A Flexible Spoken Dialogue Manager. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 68–73.
Bernd Hildebrandt, Heike Rautenstrauch, Gerhard Sagerer (1996). Evaluation of spoken language understanding and dialogue systems. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 685–688.
Ian Lewin, Ralph Becket, Johan Boye, David Carter, Manny Rayner, and Mats Wiren (1999). Language processing for spoken dialogue systems: is shallow parsing enough? In: Accessing Information in Spoken Audio: Proceedings of ESCA ETRW Workshop, Cambridge, pp. 37–42.
Bernard Ludwig, Martin Klarner, Heinrich Niemann and Gunther Goerz (2000). Context and Content in Dialogue Systems. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 105–111.
Susan Luperfoy and David Duff (1996). Disco: A Four-Step Dialogue Recovery Program. Proceedings of the AAAI Workshop on Detection, Repair and Prevention of Human-Machine Miscommunication, Portland, USA.
Elisabeth Maier (1996). Context Construction as Subtask of Dialogue Processing: the VERBMOBIL Case. Proceedings of the Eleventh Twente Workshop on Language Technologies (TWLT-11), Dialogue Management in Natural Language Systems, University of Twente, The Netherlands, pp. 113–122.
Katashi Nagao and Akikazu Takeuchi (1994). Speech Dialogue with Facial Displays: Multimodal Human-Computer Conversation. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (ACL’94), pp. 102–109.
Joseph Polifroni and Stephanie Seneff (2000). Galaxy-II as an Architecture for Spoken Dialogue Evaluation. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece, pp. 725–730.
Tony Robinson, Mike Hochberg, and Steve Renals (1996). The use of recurrent neural networks in continuous speech recognition. In: C. H. Lee, K. K. Paliwal and F. K. Soong (Eds.), Automatic Speech and Speaker Recognition: Advanced Topics, Kluwer.
David Sadek (1999). Design considerations on dialogue systems: from theory to technology - the case of Artimis. Proceedings of the ESCA Workshop on Interactive Dialogue in Multimodal Systems, Kloster Irsee, Germany, pp. 173–187.
Kathleen Stibler and James Denny (2001). A Three-tiered Evaluation Approach for Interactive Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.
Lena Stromback and Arne Jonsson (1998). Robust interpretation for spoken dialogue systems. Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP’98), Sydney, Australia.
David Traum and Elisabeth A. Hinkelman (1992). Conversation Acts in Task-Oriented Spoken Dialogue. Computational Intelligence, vol. 8, n. 3.
Markku Turunen and Jaakko Hakulinen (2001). Agent-Based Error Handling in Spoken Dialogue Systems. Proceedings of Eurospeech 2001, Aalborg, Denmark, pp. 2189–2192.
Marilyn A. Walker (1996). Inferring Acceptance and Rejection in Dialogue by Default Rules of Inference. Language and Speech, 39–2.
Marilyn A. Walker, Diane J. Litman, Candace A. Kamm and Alicia Abella (1997). PARADISE: A Framework for Evaluating Spoken Dialogue Agents. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pp. 271–280.
Marilyn A. Walker and Rebecca Passonneau (2001). DATE: A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.
Marilyn A. Walker, Lynette Hirschman and John Aberdeen (2000). Evaluation For Darpa Communicator Spoken Dialogue Systems. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece.
Nicole Yankelovich, Gina-Anne Levow, and Matt Marx (1995). Designing Speech Acts: Issues in Speech User Interfaces. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’95), Denver, USA.
Copyright information
© 2003 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Cavazza, M. (2003). An Empirical Study of Speech Recognition Errors in Human-Computer Dialogue. In: van Kuppevelt, J., Smith, R.W. (eds) Current and New Directions in Discourse and Dialogue. Text, Speech and Language Technology, vol 22. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0019-2_6
DOI: https://doi.org/10.1007/978-94-010-0019-2_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-1615-8
Online ISBN: 978-94-010-0019-2
eBook Packages: Springer Book Archive