Volume 2, Issue 2, pp 64–70

Towards attentive robots

  • Simone Frintrop
Review Article


This paper introduces attentive robots: robots that attend to the parts of their sensory input that are currently of most potential interest. The concept of selecting the most promising parts is adopted from human perception, where selective attention allocates the brain's resources to the most interesting parts of the sensory input. We give an overview of current approaches to integrating computational attention into robotic systems, with a focus on biologically inspired visual attention methods. Example applications range from localization with salient landmarks, through object manipulation, to the design of social robots. A brief outlook gives an impression of what future approaches to attentive robots might look like.
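As a concrete illustration of the bottom-up saliency computation such attention systems build on, the following is a minimal sketch of the spectral residual method of Hou and Zhang (CVPR 2007), one of the biologically inspired saliency approaches the survey covers. It is a simplified NumPy-only reimplementation, not the authors' code; the 3×3 box smoothing and the `log1p` amplitude are implementation choices made here for self-containedness.

```python
import numpy as np

def spectral_residual_saliency(image):
    """Bottom-up saliency map via the spectral residual approach
    (after Hou & Zhang, CVPR 2007); `image` is a 2D grayscale
    float array. Simplified sketch, not the original code."""
    f = np.fft.fft2(image)
    log_amp = np.log1p(np.abs(f))   # log spectrum (log1p avoids log(0))
    phase = np.angle(f)
    # Smooth the log spectrum with a 3x3 box filter (edge padding);
    # the residual is whatever sticks out of the local average.
    padded = np.pad(log_amp, 1, mode='edge')
    h, w = log_amp.shape
    avg = sum(padded[i:i + h, j:j + w]
              for i in range(3) for j in range(3)) / 9.0
    residual = log_amp - avg
    # Reconstruct from residual amplitude + original phase: the
    # statistically unexpected structure dominates the result.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return saliency / saliency.max()
```

Applied to an image containing a single small blob on a uniform background, the map peaks around the blob, i.e. the "pop-out" region that a bottom-up attention system would fixate first.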


Keywords: visual attention, saliency, cognitive robots





Copyright information

© Versita Warsaw and Springer-Verlag Wien 2011

Authors and Affiliations

  1. Institute of Computer Science III, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
