Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier with Learning

Jonsdottir, Gudny Ragna; Thórisson, Kristinn R.

doi:10.1007/978-3-642-04380-2_49


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5773)

Included in the following conference series:

International Workshop on Intelligent Virtual Agents

Abstract

Several challenges remain in the effort to build software capable of conducting realtime dialogue with people. Part of the problem has been a lack of realtime flexibility, especially with regard to turntaking. We have built a system that can adapt its turntaking behavior in natural dialogue, learning to minimize unwanted interruptions and “awkward silences”. The system learns this dynamically during the interaction, in fewer than 30 turns and without special training sessions. Here we describe the system and its performance when interacting with people in the role of an interviewer. A prior evaluation of the system included 10 interactions with a single artificial agent (a non-learning version of itself); the new data consist of 10 interaction sessions with 10 different humans. Results show performance close to a human’s in natural, polite dialogue, with 20% of the turn transitions taking place in under 300 ms and 60% in under 500 ms. The system works in real-world settings, achieving robust learning in spite of noisy data. The modularity of the architecture gives it significant potential for extensions beyond the interview scenario described here.
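The reported figures are shares of turn transitions that complete within a time limit. As a minimal sketch of how such a share could be computed, assuming each turn transition is logged as a silence gap in milliseconds: the function name `share_under` and the gap values below are hypothetical, invented for illustration, and are not data or code from the paper.

```python
from typing import Sequence


def share_under(gaps_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of turn-transition gaps strictly below threshold_ms."""
    if not gaps_ms:
        raise ValueError("need at least one turn transition")
    return sum(1 for gap in gaps_ms if gap < threshold_ms) / len(gaps_ms)


# Hypothetical gap durations in milliseconds (illustrative only, not from the paper).
example_gaps_ms = [150, 280, 350, 450, 620, 900, 1100, 240]

# Report the share of transitions under each limit mentioned in the abstract.
for limit_ms in (300, 500):
    share = share_under(example_gaps_ms, limit_ms)
    print(f"under {limit_ms} ms: {share:.1%}")
```

Using a strict comparison matches the "under 300 ms" wording; whether the original evaluation treated boundary values this way is not stated in the abstract.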

Author information

Authors and Affiliations

Center for Analysis & Design of Intelligent Agents and School of Computer Science, Reykjavik University, Kringlunni 1, IS-103 Reykjavik, Iceland
Gudny Ragna Jonsdottir & Kristinn R. Thórisson

Editor information

Editors and Affiliations

Human Media Interaction (HMI), University of Twente, EWI (Zilverling), P.O. Box 217, 7500 AE Enschede, The Netherlands
Zsófia Ruttkay
Deutsches Forschungszentrum für künstliche Intelligenz (DFKI), Campus D3.2, 66123 Saarbrücken, Germany
Michael Kipp
Human Media Interaction Group, Dept. of Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Anton Nijholt
Center for Analysis and Design of Intelligent Agents, CADIA, School of Computer Science, Reykjavik University, Kringlan 1, 103 Reykjavik, Iceland
Hannes Högni Vilhjálmsson

Cite this paper

Jonsdottir, G.R., Thórisson, K.R. (2009). Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier with Learning. In: Ruttkay, Z., Kipp, M., Nijholt, A., Vilhjálmsson, H.H. (eds) Intelligent Virtual Agents. IVA 2009. Lecture Notes in Computer Science, vol 5773. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04380-2_49

DOI: https://doi.org/10.1007/978-3-642-04380-2_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04379-6
Online ISBN: 978-3-642-04380-2
eBook Packages: Computer Science, Computer Science (R0)