Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics

  • Matthias Kümmerer
  • Thomas S. A. Wallis
  • Matthias Bethge
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11220)


Dozens of new fixation prediction models are published every year and compared on open benchmarks such as MIT300 and LSUN. However, progress in the field can be difficult to judge because models are compared using a variety of inconsistent metrics. Here we show that no single saliency map can perform well under all metrics. Instead, we propose a principled approach to solve the benchmarking problem by separating the notions of saliency models, maps and metrics. Inspired by Bayesian decision theory, we define a saliency model to be a probabilistic model of fixation density prediction and a saliency map to be a metric-specific prediction derived from the model density, chosen to maximize the expected performance on that metric given the model density. We derive these optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC, NSS, CC, SIM, KL-Div) and show that they can be computed analytically or approximated with high precision. We show that this leads to consistent rankings across all metrics and avoids the penalties incurred by using one saliency map for all metrics. Our method allows researchers to have their model compete on many different metrics against the state of the art in those metrics: “good” models will perform well in all of them.
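The core idea can be illustrated for one metric. Under NSS, a saliency map is z-scored and averaged at fixated locations, so its expected score under a model density is the density-weighted mean of the z-scored map; that expectation is maximized by the density itself. The following sketch is illustrative only (it is not the paper's code; the toy density, grid size, and candidate maps are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy predicted fixation density over a small grid, normalized to sum to 1.
density = rng.random((8, 8))
density /= density.sum()

def expected_nss(saliency_map, density):
    """Expected NSS of a saliency map when fixations are drawn from `density`:
    E[NSS] = sum_x density(x) * (s(x) - mean(s)) / std(s)."""
    z = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float((density * z).sum())

# Candidate saliency maps, all derived from the same model density.
# Monotone transforms preserve AUC but change NSS; the density itself
# yields the highest expected NSS.
candidates = {
    "density itself": density,
    "log-density": np.log(density),
    "sqrt-density": np.sqrt(density),
}
for name, smap in candidates.items():
    print(f"{name}: expected NSS = {expected_nss(smap, density):.4f}")
```

Because z-scoring fixes the mean and variance of the map, maximizing the density-weighted sum is a constrained linear problem whose solution is the z-scored density, so no monotone transform of the density can beat the density itself on expected NSS, even though all of them tie on AUC.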


Saliency benchmarking · Metrics · Fixations · Bayesian decision theory · Model comparison



This study is part of Matthias Kümmerer’s thesis work at the International Max Planck Research School for Intelligent Systems (IMPRS-IS). The research has been funded by the German Science Foundation (DFG; Collaborative Research Centre 1233) and the German Excellence Initiative (EXC307).

Supplementary material

Supplementary material 1: 474218_1_En_47_MOESM1_ESM.pdf (PDF, 494 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
  2. Wilhelm-Schickard Institute for Computer Science (Informatik), University of Tübingen, Tübingen, Germany