Intention Estimation and Recommendation System Based on Attention Sharing

  • Sangwook Kim
  • Jehan Jung
  • Swathi Kavuri
  • Minho Lee
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8226)


In human-agent interactions, attention sharing plays a key role in understanding others' intentions without explicit verbal explanation. Deep learning algorithms have recently been used to model these interactions in complex real-world environments. In this paper we propose a deep learning based intention estimation and recommendation system that infers human attention from gestures. Action-object affordances are modeled using a stacked auto-encoder, which represents the relationships between actions and objects. An intention estimation and object recommendation system that follows the estimated human intention is implemented on top of this affordance model. Experimental results demonstrate meaningful intention estimation and recommendation performance in real-world scenarios.
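The abstract gives no architectural details, so the following is only a minimal sketch of the general idea, not the authors' model: a single-hidden-layer auto-encoder (rather than a stacked one) trained on concatenated action and object one-hot vectors, then queried with the object part masked to suggest an object for an observed action. The vocabulary, the layer sizes, the masking scheme, and all names (`TinyAutoencoder`, `encode`, `recommend`) are assumptions made for illustration.

```python
import math
import random

# Hypothetical toy vocabulary; the abstract does not list the real actions or objects.
ACTIONS = ["drink", "write", "call"]
OBJECTS = ["cup", "pen", "phone"]
PAIRS = [("drink", "cup"), ("write", "pen"), ("call", "phone")]


def encode(action=None, obj=None):
    """Concatenate a one-hot action vector and a one-hot object vector."""
    vec = [0.0] * (len(ACTIONS) + len(OBJECTS))
    if action is not None:
        vec[ACTIONS.index(action)] = 1.0
    if obj is not None:
        vec[len(ACTIONS) + OBJECTS.index(obj)] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyAutoencoder:
    """One sigmoid hidden layer and one sigmoid output layer (assumed sizes)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        """One gradient step on squared error; returns the loss before the update."""
        h, y = self.forward(x)
        d_out = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, target)]
        # Hidden deltas are computed before the output weights are changed.
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def train(model, epochs=2000, lr=0.5):
    """Train on full vectors and on copies with the object part masked out."""
    samples = []
    for action, obj in PAIRS:
        full = encode(action, obj)
        samples.append((full, full))            # plain reconstruction
        samples.append((encode(action), full))  # object part masked
    return [sum(model.train_step(x, t, lr) for x, t in samples) for _ in range(epochs)]


def recommend(model, action):
    """Feed the action alone and return the object with the highest reconstructed score."""
    _, y = model.forward(encode(action))
    scores = y[len(ACTIONS):]
    return OBJECTS[scores.index(max(scores))]


model = TinyAutoencoder(n_in=len(ACTIONS) + len(OBJECTS), n_hidden=4)
losses = train(model)
```

Calling `recommend(model, "drink")` returns one of the toy objects; if training has converged on this small data set, it should be the object paired with that action. A stacked variant would train several such layers in sequence, each on the hidden codes of the previous one.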


intention estimation, recommendation system, attention sharing, deep learning, action-object affordance





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sangwook Kim (1)
  • Jehan Jung (2)
  • Swathi Kavuri (1)
  • Minho Lee (1)
  1. School of Electronics Engineering, Kyungpook National University, Taegu, South Korea
  2. Department of Sensor and Display Engineering, Kyungpook National University, Taegu, South Korea
