Gaze Location Prediction with Depth Features as Auxiliary Information

  • Redwan Abdo A. Mohammed
  • Lars Schwabe
  • Oliver Staadt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8511)


We present the results of a first experimental study on improving the computation of saliency maps using luminance and depth image features. More specifically, we recorded the center of gaze of users while they viewed natural scenes. We used machine learning techniques to train a bottom-up, top-down model of saliency based on 2D and depth features/cues. We found that models trained on the combination of Itti & Koch and depth features outperform models trained on individual features (i.e., only Gabor filter responses or only depth features) or on other combinations of these features. As a consequence, depth features combined with Itti & Koch features improve the prediction of gaze locations. This first characterization of joint luminance and depth features is an important step towards developing models of eye movements that operate well under natural conditions, such as those encountered in HCI settings.
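The abstract's core idea, training a discriminative model on feature values sampled at fixated versus non-fixated image locations, can be illustrated with a minimal, hypothetical sketch. The two-dimensional features (a luminance-based saliency response and a depth value), the synthetic data, and the plain logistic-regression learner below are all assumptions for illustration, not the authors' implementation:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit a logistic regression classifier by stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def accuracy(w, b, X, y):
    preds = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5
             for xi in X]
    return sum(int(p) == yi for p, yi in zip(preds, y)) / len(y)

# Hypothetical training set: each sample is [saliency response, depth].
# Fixated locations (label 1) are assumed to have high saliency and small
# depth; non-fixated locations (label 0) the opposite.
random.seed(0)
X, y = [], []
for _ in range(200):
    X.append([random.gauss(1.0, 0.3), random.gauss(-1.0, 0.3)]); y.append(1)
    X.append([random.gauss(-1.0, 0.3), random.gauss(1.0, 0.3)]); y.append(0)

w, b = train_logreg(X, y)
print(accuracy(w, b, X, y))  # expected to be high on this separable data
```

Evaluating the same learner on single-feature inputs (saliency only, or depth only) versus the combined two-feature input mirrors the paper's comparison of individual and combined feature sets.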




References

  1. Hoover, A., Jean-Baptiste, G., Jiang, X.: An experimental comparison of range image segmentation algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 18, 673–689 (1996)
  2. Borji, A., Itti, L.: State-of-the-art in visual attention modeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 185–207 (2013)
  3. Gao, D., Vasconcelos, N.: Discriminant saliency for visual recognition from cluttered scenes. In: NIPS (2004)
  4. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Advances in Neural Information Processing Systems 19, pp. 545–552. MIT Press (2007)
  5. Horvitz, E., Kadie, C., Paek, T., Hovel, D.: Models of attention in computing and communication: From principles to applications (2003)
  6. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11), 1254–1259 (1998)
  7. Lewis, J.P.: Fast normalized cross-correlation (1995)
  8. Mahadevan, V., Vasconcelos, N.: Spatiotemporal saliency in dynamic scenes. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 171–177 (2010)
  9. Mohammed, R.A.A., Schwabe, L.: Scene-dependence of saliency maps of natural luminance and depth images. In: Fifth Baltic Conference "Human–Computer Interaction" (2011) (to appear)
  10. Mohammed, R.A.A., Schwabe, L.: A brain informatics approach to explain the oblique effect via depth statistics. In: Zanzotto, F.M., Tsumoto, S., Taatgen, N., Yao, Y. (eds.) BI 2012. LNCS, vol. 7670, pp. 97–106. Springer, Heidelberg (2012)
  11. Mohammed, R.A.A., Mohammed, S.A., Schwabe, L.: BatGaze: A new tool to measure depth features at the center of gaze during free viewing. In: Zanzotto, F.M., Tsumoto, S., Taatgen, N., Yao, Y. (eds.) BI 2012. LNCS, vol. 7670, pp. 85–96. Springer, Heidelberg (2012)
  12. Potetz, B., Lee, T.S.: Statistical correlations between 2D images and 3D structures in natural scenes. Journal of the Optical Society of America A 20(7), 1292–1303 (2003)
  13. Reinagel, P., Zador, A.M.: Natural scene statistics at the centre of gaze. Network 10(4), 341–350 (1999)
  14. Roda, C.: Human Attention in Digital Environments. Cambridge University Press, Cambridge (2011)
  15. Saxena, A., Chung, S.H., Ng, A.Y.: Learning depth from single monocular images. In: NIPS 18. MIT Press (2005)
  16. Saxena, A., Sun, M., Ng, A.Y.: Make3D: Learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. 31(5), 824–840 (2009)
  17. Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: ICCV (2009)
  18. Yang, Z., Purves, D.: Image source statistics of surfaces in natural scenes. Network: Computation in Neural Systems 14(3), 371–390 (2003)
  19. Yokoya, N., Levine, M.D.: Range image segmentation based on differential geometry: A hybrid approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 11(6), 643–649 (1989)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Redwan Abdo A. Mohammed (1)
  • Lars Schwabe (1)
  • Oliver Staadt (1)

  1. Institute of Computer Science, University of Rostock, Rostock, Germany
