Who, Why and How Often? Key Elements for the Design of a Successful Speech Application Taking Account of the Target Groups

  • Chapter
Usability of Speech Dialog Systems

Part of the book series: Signals and Communication Technologies (SCT)

Abstract

Three questions must be answered before designing a speech application: who will use it, why will they use it, and how often will they use it? A designer needs answers to all three to address the needs of the target group as well as possible. This chapter outlines a methodical procedural model that describes the workflow required to build a speech application properly designed for its target groups. The workflow covers requirements analysis, specification, implementation, production, delivery and operation. The chapter also gives an overview of the most important information needed to describe a voice user interface, and where this information can be found. It further surveys current and future technical developments in speech processing and their relevance for the design of future dialogues. We then recommend 11 design features which, in our experience, help the designer of a voice user interface to exploit knowledge about the user and to focus the dialogue design on the user’s abilities, competence, expectations and needs.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Oberle, F. (2008). Who, Why and How Often? Key Elements for the Design of a Successful Speech Application Taking Account of the Target Groups. In: Usability of Speech Dialog Systems. Signals and Communication Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78343-5_1

  • DOI: https://doi.org/10.1007/978-3-540-78343-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78342-8

  • Online ISBN: 978-3-540-78343-5

  • eBook Packages: Engineering, Engineering (R0)
