Abstract
Several challenges remain in the effort to build software capable of conducting realtime dialogue with people. Part of the problem has been a lack of realtime flexibility, especially with regard to turntaking. We have built a system that can adapt its turntaking behavior in natural dialogue, learning to minimize unwanted interruptions and “awkward silences”. The system learns this dynamically during the interaction in fewer than 30 turns, without special training sessions. Here we describe the system and its performance when interacting with people in the role of an interviewer. A prior evaluation of the system included 10 interactions with a single artificial agent (a non-learning version of itself); the new data consist of 10 interaction sessions with 10 different humans. Results show performance to be close to a human’s in natural, polite dialogue, with 20% of the turn transitions taking place in under 300 ms and 60% in under 500 ms. The system works in real-world settings, achieving robust learning in spite of noisy data. The modularity of the architecture gives it significant potential for extensions beyond the interview scenario described here.
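The timing result above is a cumulative proportion: the share of turn transitions whose silence gap falls below a threshold. The following is a minimal sketch of how such a figure could be computed from logged gaps. The function name `fraction_under` and the sample gap values are hypothetical, chosen only for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def fraction_under(gaps_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the share of turn-transition gaps strictly shorter than threshold_ms."""
    if not gaps_ms:
        raise ValueError("need at least one turn transition")
    return sum(1 for gap in gaps_ms if gap < threshold_ms) / len(gaps_ms)


# Hypothetical silence gaps (in ms) between one speaker stopping and the
# other starting. These are made-up values, NOT measurements from the paper.
example_gaps_ms = [250, 280, 350, 420, 450, 480, 520, 610, 700, 900]

print(fraction_under(example_gaps_ms, 300))  # 2 of 10 gaps are under 300 ms
print(fraction_under(example_gaps_ms, 500))  # 6 of 10 gaps are under 500 ms
```

Reporting the proportion at several thresholds, rather than a single mean gap, makes it easier to compare against human turn-transition distributions, which are skewed toward short gaps with a long tail.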
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jonsdottir, G.R., Thórisson, K.R. (2009). Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier with Learning. In: Ruttkay, Z., Kipp, M., Nijholt, A., Vilhjálmsson, H.H. (eds) Intelligent Virtual Agents. IVA 2009. Lecture Notes in Computer Science, vol 5773. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04380-2_49
DOI: https://doi.org/10.1007/978-3-642-04380-2_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04379-6
Online ISBN: 978-3-642-04380-2
eBook Packages: Computer Science (R0)