Probabilistic Inference of Gaze Patterns and Structure of Multiparty Conversations from Head Directions and Utterances

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4012)

Abstract

A novel probabilistic framework is proposed for inferring gaze patterns and the structure of conversation in face-to-face multiparty communication from participants' head directions and the presence or absence of their utterances. First, we define three classes of conversational regimes, each characterized by the topology of the gaze pattern; we assume that the regime indicates the structure of the conversation, i.e., who is talking to whom. Next, the problem is formulated as the joint estimation of the regime state from the gaze pattern and utterances, and of the gaze pattern from head directions. We then devise a dynamic Bayesian network, called the Markov-switching model, in which the regime changes over time according to Markov transitions and controls the dynamics of the gaze patterns and utterances. Furthermore, Bayesian estimation of the regime, gaze pattern, and model parameters is implemented using a Markov chain Monte Carlo method. Experiments on four-person conversations confirm accurate gaze estimation and the effectiveness of the framework for identifying conversation structures.
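
The generative structure described in the abstract (a hidden regime that evolves as a Markov chain and drives gaze and utterances, with head direction as a noisy observation of gaze) can be illustrated with a toy forward simulation. The sketch below is not the authors' model: the regime labels 0 to 2, every probability, the target angles, the Gaussian head-direction noise, and the restriction to a single listener with three possible gaze targets are all assumptions chosen for illustration. It covers sampling only; the MCMC inference step is not shown.

```python
import random

N_REGIMES = 3   # three regime classes, labeled 0..2 here (labels are hypothetical)
N_TARGETS = 3   # in a four-person conversation, one listener has three possible targets

# Assumed "sticky" regime transition matrix: row = current regime, column = next regime.
TRANSITION = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

# Assumed P(gaze target | regime) for the single listener being simulated.
GAZE_PROB = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [1 / 3, 1 / 3, 1 / 3],
]

# Assumed P(utterance present | regime).
UTTER_PROB = [0.8, 0.8, 0.2]

# Assumed head angle (degrees) when looking straight at each target, plus noise level.
TARGET_ANGLE = [-45.0, 0.0, 45.0]
HEAD_NOISE_STD = 10.0


def simulate(num_steps, seed=0):
    """Sample (regime, gaze, utterance, head_angle) tuples from the toy model."""
    rng = random.Random(seed)
    regime = rng.randrange(N_REGIMES)  # uniform initial regime
    trace = []
    for _ in range(num_steps):
        # The regime controls the gaze target and the utterance flag.
        gaze = rng.choices(range(N_TARGETS), weights=GAZE_PROB[regime])[0]
        utterance = int(rng.random() < UTTER_PROB[regime])
        # Head direction is a noisy observation of the gaze direction.
        head_angle = rng.gauss(TARGET_ANGLE[gaze], HEAD_NOISE_STD)
        trace.append((regime, gaze, utterance, head_angle))
        # First-order Markov transition to the next regime.
        regime = rng.choices(range(N_REGIMES), weights=TRANSITION[regime])[0]
    return trace


if __name__ == "__main__":
    for step in simulate(5):
        print(step)
```

In an inference setting only the utterance flags and head angles would be observed; the regime and gaze columns are the hidden quantities that the framework estimates.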



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Otsuka, K., Takemae, Y., Yamato, J., Murase, H. (2006). Probabilistic Inference of Gaze Patterns and Structure of Multiparty Conversations from Head Directions and Utterances. In: Washio, T., Sakurai, A., Nakajima, K., Takeda, H., Tojo, S., Yokoo, M. (eds) New Frontiers in Artificial Intelligence. JSAI 2005. Lecture Notes in Computer Science, vol 4012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780496_38

  • DOI: https://doi.org/10.1007/11780496_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35470-3

  • Online ISBN: 978-3-540-35471-0

  • eBook Packages: Computer Science (R0)
