Abstract
List-selection is a common user activity in speech recognition applications of all kinds. Whether selecting from a short list of commands in a telephone-based IVR menu, choosing from a list of proper names in a voice-dialing application, or navigating through the n-best list returned by a large-vocabulary speech recognition event, designers often choose the menu as a well-known list-selection device. Audio menus, however, differ from visual menus in several important ways. The challenges of handling time, short-term memory, and cognitive load are issues that the speech designer must confront. This article studies the spontaneous speech acts of users confronted with verbal lists, offering specific solutions to design problems. When should the list be presented? How should user interruption be handled? How often should the list be repeated? What are the elements of a graphical menu, and how do they differ from those of an audio menu? How should the menu respond to low-confidence or out-of-task user speech? A re-engineered menu device that addresses these audio-only challenges is presented and then discussed.
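The design questions the abstract raises (when to repeat the list, how to handle interruption, what to do with low-confidence or out-of-task speech) can be pictured as a small decision loop. The sketch below is illustrative only, not the chapter's device: the confidence thresholds, the two-repeat limit, and all names are assumptions chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Tuple

class Action(Enum):
    SELECT = auto()       # high-confidence match: commit to the item
    CONFIRM = auto()      # mid-confidence match: ask the user to verify
    REPEAT_LIST = auto()  # silence or unusable speech: replay the menu
    GIVE_UP = auto()      # repeat budget exhausted: escalate or exit

@dataclass
class Recognition:
    text: Optional[str]   # None models silence (a timeout)
    confidence: float

class SpeechMenu:
    """Sketch of an audio list-selection loop; thresholds are assumptions."""

    def __init__(self, items, accept=0.8, reject=0.4, max_repeats=2):
        self.items = list(items)
        self.accept = accept          # at or above: select outright
        self.reject = reject          # between reject and accept: confirm
        self.max_repeats = max_repeats
        self.repeats = 0

    def step(self, heard: Recognition) -> Tuple[Action, Optional[str]]:
        # Silence: repeat the list a bounded number of times, then stop.
        if heard.text is None:
            return self._replay_or_quit()
        # Barge-in with a known item: confidence decides the response.
        if heard.text in self.items:
            if heard.confidence >= self.accept:
                return (Action.SELECT, heard.text)
            if heard.confidence >= self.reject:
                return (Action.CONFIRM, heard.text)
        # Out-of-task or low-confidence speech: fall back to the list.
        return self._replay_or_quit()

    def _replay_or_quit(self) -> Tuple[Action, Optional[str]]:
        if self.repeats < self.max_repeats:
            self.repeats += 1
            return (Action.REPEAT_LIST, None)
        return (Action.GIVE_UP, None)
```

Bounding the repeats is the point of the sketch: an audio menu that replays forever traps the user, so unusable input eventually escalates rather than looping.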
Copyright information
© 1999 Springer Science+Business Media New York
Cite this chapter
Balentine, B. (1999). Re-Engineering the Speech Menu. In: Gardner-Bonneau, D. (eds) Human Factors and Voice Interactive Systems. The Springer International Series in Engineering and Computer Science, vol 498. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-2980-1_10
Print ISBN: 978-1-4757-2982-5
Online ISBN: 978-1-4757-2980-1