Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier with Learning

  • Conference paper
Intelligent Virtual Agents (IVA 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5773)

Abstract

Several challenges remain in the effort to build software capable of conducting realtime dialogue with people. Part of the problem has been a lack of realtime flexibility, especially with regard to turntaking. We have built a system that can adapt its turntaking behavior in natural dialogue, learning to minimize unwanted interruptions and “awkward silences”. The system learns this dynamically during the interaction in fewer than 30 turns, without special training sessions. Here we describe the system and its performance when interacting with people in the role of an interviewer. A prior evaluation of the system included 10 interactions with a single artificial agent (a non-learning version of itself); the new data consists of 10 interaction sessions with 10 different humans. Results show performance to be close to a human’s in natural, polite dialogue, with 20% of the turn transitions taking place in under 300 ms and 60% in under 500 ms. The system works in real-world settings, achieving robust learning in spite of noisy data. The modularity of the architecture gives it significant potential for extensions beyond the interview scenario described here.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jonsdottir, G.R., Thórisson, K.R. (2009). Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier with Learning. In: Ruttkay, Z., Kipp, M., Nijholt, A., Vilhjálmsson, H.H. (eds) Intelligent Virtual Agents. IVA 2009. Lecture Notes in Computer Science (LNAI), vol 5773. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04380-2_49

  • DOI: https://doi.org/10.1007/978-3-642-04380-2_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04379-6

  • Online ISBN: 978-3-642-04380-2

  • eBook Packages: Computer Science, Computer Science (R0)
