
Evaluating Users’ Reactions to Human-Like Interfaces

Prosodic and Paralinguistic Features as New Measures of User Satisfaction

Chapter in: From Brows to Trust

Part of the book series: Human-Computer Interaction Series (HCIS, volume 7)

Abstract

An increasing number of dialogue systems are deployed to provide public services in our everyday lives. They are becoming more service-minded, and several of them provide different channels for interaction. The rationale is to make automatic services available in new environments and more attractive to use. From a developer perspective, this increases the complexity of the requirements elicitation activity, as new combinations and variations in end-user interaction need to be considered. The aim of our investigation is to propose new parameters and metrics for evaluating multimodal dialogue systems endowed with embodied conversational agents (ECAs). These new metrics focus on the users rather than on the system. Our assumption is that the intentional use of prosodic variation and the production of communicative non-verbal behaviour by users can indicate their attitude towards the system and might also help to evaluate the users’ overall experience of the interaction. To test our hypothesis, we carried out analyses on different Swedish corpora of interactions between users and multimodal dialogue systems. We analysed the prosodic variation in the way users ended their interactions with the system, and we observed the production of non-verbal communicative expressions by users. Our study supports the idea that observing users’ prosodic variation and production of communicative non-verbal behaviour during interaction with dialogue systems could indicate whether or not users are satisfied with the system’s performance.
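
The chapter reports corpus analyses rather than code, but the kind of prosodic measurement the abstract describes can be illustrated with a short sketch. The Python snippet below is an assumption-laden illustration, not the authors' tooling: it estimates the F0 (fundamental frequency) contour of one recorded utterance with librosa's pYIN tracker and summarises its variation; the file name and pitch bounds are hypothetical.

```python
# A minimal sketch, NOT code from the chapter: simple F0 statistics for one
# utterance, the kind of prosodic-variation measure the authors describe.
# The file name and pitch bounds are illustrative. Requires: pip install librosa
import numpy as np
import librosa

def f0_stats(wav_path):
    """Summarise the F0 contour of a single utterance."""
    y, sr = librosa.load(wav_path, sr=None)      # keep the native sample rate
    f0, voiced, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),           # ~65 Hz floor
        fmax=librosa.note_to_hz("C6"),           # ~1047 Hz ceiling
        sr=sr,
    )
    f0 = f0[voiced]                              # keep voiced frames only
    if f0.size == 0:                             # no pitch detected
        return None
    return {
        "mean_f0_hz": float(np.mean(f0)),
        "f0_range_hz": float(np.ptp(f0)),        # max minus min
        "f0_std_hz": float(np.std(f0)),          # spread as a variation proxy
    }

if __name__ == "__main__":
    # Hypothetical usage: compare a user's closing utterance with that user's
    # earlier utterances; a markedly wider F0 range at closure may signal an
    # emphatic (pleased or irritated) rather than neutral farewell.
    print(f0_stats("user_farewell.wav"))
```

In this spirit, per-utterance statistics like the F0 range of a user's final turn could be tracked across a session and compared against the user's own baseline, which is one plausible way to operationalise the "prosodic variation at closure" signal the authors analyse.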



Authors

Loredana Cerrato, Susanne Ekeklint

Editor information

Zsófia Ruttkay, Catherine Pelachaud


Copyright information

© 2004 Kluwer Academic Publishers

About this chapter

Cite this chapter

Cerrato, L., Ekeklint, S. (2004). Evaluating Users’ Reactions to Human-Like Interfaces. In: Ruttkay, Z., Pelachaud, C. (eds) From Brows to Trust. Human-Computer Interaction Series, vol 7. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2730-3_4


  • DOI: https://doi.org/10.1007/1-4020-2730-3_4

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-2729-1

  • Online ISBN: 978-1-4020-2730-7

  • eBook Packages: Computer Science, Computer Science (R0)
