An Empirical Study of Speech Recognition Errors in Human-Computer Dialogue

Cavazza, Marc

doi:10.1007/978-94-010-0019-2_6


Part of the book series: Text, Speech and Language Technology ((TLTB,volume 22))


Abstract

The development of spoken dialogue systems soon runs up against the limitations of current speech recognition technology, which make recognition errors a recurring problem for any dialogue system. Several studies have suggested that there is no clear correlation between speech recognition scores and user satisfaction, or the ability to complete the tasks underlying spoken dialogue [Yankelovich et al., 1995] [Dybkjaer et al., 1997], which implies that a certain level of errors need not prevent spoken dialogue systems from being successful. However, most studies of speech recognition errors have concentrated either on parsing incomplete utterances or on global dialogue robustness, i.e. at the task completion level [Allen et al., 1996] [Stromback and Jonsson, 1998] [Brandt-Pook et al., 1996]. Very few studies have explored the impact of speech recognition errors across the various components of a dialogue system, or specifically evaluated their impact on overall dialogue behaviour.


References

James F. Allen, Brad Miller, Eric Ringger, and Teresa Sikorski (1996). Robust Understanding in a Dialogue System. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, San Francisco, pp. 62–70.
Jonas Beskow and Scott McGlashan (1997). Olga: A Conversational Agent with Gestures. In: Proceedings of the IJCAI’97 Workshop on Animated Interface Agents: Making them Intelligent, Nagoya, Japan, August 1997.
Manuela Boros, W. Eckert, F. Gallwitz, G. Gorz, G. Hanrieder, and H. Niemann (1996). Towards Understanding Spontaneous Speech: Word Accuracy Vs. Concept Accuracy. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 1009–1012.
Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, and Gerhard Sagerer (1996). A Robust Dialogue System for Making an Appointment. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 693–696.
Marc Cavazza (1998). An Integrated TFG Parser with Explicit Tree Typing. In: Proceedings of the Fourth TAG+ Workshop, IRCS, University of Pennsylvania, pp. 34–37.
Marc Cavazza (2000). From Speech Acts to Search Acts: a Semantic Approach to Speech Act Recognition. Proceedings of GOTALOG 2000, Gothenburg, Sweden, pp. 187–190.
Kerstin Fischer and Anton Batliner (2000). What Makes Speakers Angry in Human-Computer Conversation. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 62–67.
James Glass, Joseph Polifroni, Stephanie Seneff and Victor Zue (2000). Data Collection and Performance Evaluation of Spoken Dialogue Systems: The MIT Experience. Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP’00), Beijing, China.
Eli Hagen (2000). A Flexible Spoken Dialogue Manager. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 68–73.
Bernd Hildebrandt, Heike Rautenstrauch, and Gerhard Sagerer (1996). Evaluation of spoken language understanding and dialogue systems. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP’96), Philadelphia, pp. 685–688.
Ian Lewin, Ralph Becket, Johan Boye, David Carter, Manny Rayner, and Mats Wiren (1999). Language processing for spoken dialogue systems: is shallow parsing enough? In: Accessing Information in Spoken Audio: Proceedings of ESCA ETRW Workshop, Cambridge, pp. 37–42.
Bernard Ludwig, Martin Klarner, Heinrich Niemann and Gunther Goerz (2000). Context and Content in Dialogue Systems. Proceedings of the Third Human-Computer Conversation Workshop (HCCW), Bellagio, Italy, pp. 105–111.
Susan Luperfoy and David Duff (1996). Disco: A Four-Step Dialogue Recovery Program. Proceedings of the AAAI Workshop on Detection, Repair and Prevention of Human-Machine Miscommunication, Portland, USA.
Elisabeth Maier (1996). Context Construction as Subtask of Dialogue Processing: the VERBMOBIL Case. Proceedings of the Eleventh Twente Workshop on Language Technologies (TWLT-11), Dialogue Management in Natural Language Systems, University of Twente, The Netherlands, pp. 113–122.
Katashi Nagao and Akikazu Takeuchi (1994). Speech Dialogue with Facial Displays: Multimodal Human-Computer Conversation. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (ACL’94), pp. 102–109.
Joseph Polifroni and Stephanie Seneff (2000). Galaxy-II as an Architecture for Spoken Dialogue Evaluation. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece, pp. 725–730.
Tony Robinson, Mike Hochberg and Steve Renals (1996). The use of recurrent neural networks in continuous speech recognition. In: C. H. Lee, K. K. Paliwal and F. K. Soong (Eds.), Automatic Speech and Speaker Recognition: Advanced Topics, Kluwer.
David Sadek (1999). Design considerations on dialogue systems: from theory to technology - the case of Artimis. Proceedings of the ESCA Workshop on Interactive Dialogue in Multimodal Systems, Kloster Irsee, Germany, pp. 173–187.
Kathleen Stibler and James Denny (2001). A Three-tiered Evaluation Approach for Interactive Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.
Lena Stromback and Arne Jonsson (1998). Robust interpretation for spoken dialogue systems. Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP’98), Sydney, Australia.
David Traum and Elisabeth A. Hinkelman (1992). Conversation Acts in Task-Oriented Spoken Dialogue. Computational Intelligence, vol. 8, n. 3.
Turunen, M. and Hakulinen, J. (2001). Agent-Based Error Handling in Spoken Dialogue Systems. Proceedings of Eurospeech 2001, Aalborg, Denmark, pp. 2189–2192.
Marilyn A. Walker (1996). Inferring Acceptance and Rejection in Dialogue by Default Rules of Inference. Language and Speech, 39–2.
Marilyn A. Walker, Diane J. Litman, Candace A. Kamm and Alicia Abella (1997). PARADISE: A Framework for Evaluating Spoken Dialogue Agents. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pp. 271–280.
Marilyn A. Walker and Rebecca Passonneau (2001). DATE: A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems. Proceedings of the Human Language Technology Conference 2001, San Diego.
Marilyn A. Walker, Lynette Hirschman and John Aberdeen (2000). Evaluation for DARPA Communicator Spoken Dialogue Systems. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), Athens, Greece.
Nicole Yankelovich, Gina-Anne Levow and Matt Marx (1995). Designing Speech Acts: Issues in Speech User Interfaces. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’95), Denver, USA.


Author information

Authors and Affiliations

School of Computing and Mathematics, University of Teesside, TS1 3BA, Middlesbrough, UK
Marc Cavazza


Editor information

Editors and Affiliations

Stuttgart University, Germany
Jan van Kuppevelt
East Carolina University, USA
Ronnie W. Smith


About this chapter

Cite this chapter

Cavazza, M. (2003). An Empirical Study of Speech Recognition Errors in Human-Computer Dialogue. In: van Kuppevelt, J., Smith, R.W. (eds) Current and New Directions in Discourse and Dialogue. Text, Speech and Language Technology, vol 22. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0019-2_6


DOI: https://doi.org/10.1007/978-94-010-0019-2_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-1615-8
Online ISBN: 978-94-010-0019-2
eBook Packages: Springer Book Archive
