Action Alignment from Gaze Cues in Human-Human and Human-Robot Interaction

  • Nuno Ferreira Duarte
  • Mirko Raković
  • Jorge Marques
  • José Santos-Victor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)


Cognitive neuroscience experiments show that people intensify the exchange of non-verbal cues when they work on a joint task towards a common goal. When individuals share their intentions, a social interaction emerges that drives the mutual alignment of their actions and behavior. To understand the intentions of others, we rely strongly on gaze cues. Depending on the role each person plays in the interaction, the resulting alignment of body and gaze movements differs. This mechanism is key to understanding and modeling dyadic social interactions.

We focus on the alignment of the leader’s behavior during dyadic interactions. The recorded gaze movements of dyads are used to build a model of the leader’s gaze behavior. We use the follower’s gaze behavior data for two purposes: (i) to determine whether the follower is involved in the interaction, and (ii) to determine whether the follower’s gaze behavior correlates with the type of action under execution. This information is then used to plan the leader’s actions in order to sustain the leader/follower alignment in the social interaction.

The model of the leader’s gaze behavior and the alignment of intentions are evaluated in a human-robot interaction scenario, with the robot acting as the leader and the human as the follower. During the interaction, the robot (i) emits non-verbal cues consistent with the action performed, (ii) predicts the human’s actions, and (iii) aligns its motion with the human’s behavior.


Keywords: Action anticipation · Gaze behavior · Action alignment · Human-robot interaction



We thank all of our colleagues, students and volunteers who supported us in preparing and conducting the experiments.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Nuno Ferreira Duarte (1)
  • Mirko Raković (1, 2)
  • Jorge Marques (1)
  • José Santos-Victor (1)

  1. Vislab, Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
  2. Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
