Abstract
While speech offers unique advantages and opportunities as an interface modality, the known limitations of speech recognition technology and the cognitive limitations of spoken interaction amplify the importance of usability in the development of speech applications. The competitive business environment, on the other hand, requires sound business justification for any investment in speech technology and proof of its usability and effectiveness. This chapter presents design principles and usability engineering methods that empower practitioners to optimize both the usability and the ROI of telephone speech applications, frequently referred to as telephone Voice User Interface (VUI) or Interactive Voice Response (IVR) systems. The first section discusses limitations of speech user interfaces and their repercussions on design. A short list of guidelines for IVR design is derived from a survey of research and industry know-how. Examples illustrate how to apply these guidelines during the design phase of a telephone speech application. The second section presents a data-driven methodology for optimizing the usability and effectiveness of IVRs. The methodology is grounded in the analysis of live, end-to-end calls, the ultimate field data for telephone speech applications. We describe how to capture end-to-end call data from deployed systems and how to mine this data to measure usability and identify problems. Leveraging end-to-end call data empowers practitioners to build solid business cases, optimize ROI, and justify the cost of IVR usability engineering. Case studies from the consulting practice at BBN Technologies illustrate how these methods were applied in some of the largest US deployments of automated telephone applications.
Copyright information
© 2008 Springer Science + Business Media, LLC
About this chapter
Cite this chapter
Suhm, B. (2008). IVR Usability Engineering Using Guidelines and Analyses of End-to-End Calls. In: Human Factors and Voice Interactive Systems. Signals and Communication Technology. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-68439-0_1
DOI: https://doi.org/10.1007/978-0-387-68439-0_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-25482-1
Online ISBN: 978-0-387-68439-0
eBook Packages: Engineering, Engineering (R0)