Abstract
Saliency prediction can be regarded as modeling an activity of the human brain. Most saliency prediction methods employ features that measure the contrast of an image region relative to its surroundings; however, few studies have investigated how human brain activity affects saliency prediction. In this paper, we propose an enhanced saliency prediction model based on the free energy principle. A new AR-RTV model, which combines the relative total variation (RTV) structure extractor with an autoregressive (AR) operator, is first used to decompose an input image into a predictable component and a surprise component. We then estimate two component saliency maps (sub-saliency maps) from the local entropy of the 'surprise' map and from the gradient magnitude (GM) map, respectively. Finally, inspired by visual error sensitivity, a saliency augmentation operator is designed to enhance the final saliency map obtained by combining the two sub-saliency maps. Experimental results on two benchmark databases demonstrate the superior performance of the proposed method compared to eleven state-of-the-art algorithms.
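The pipeline described in the abstract — decompose the image into a predictable part and a residual 'surprise' part, score the surprise map by local entropy, score the image by gradient magnitude, then fuse the two sub-saliency maps — can be sketched as follows. This is a minimal illustration, not the paper's method: the AR-RTV decomposition is replaced by a caller-supplied predicted image (e.g. a smoothed version), and the saliency augmentation operator is replaced by simple multiplicative fusion; `local_entropy`, `saliency_sketch`, and all parameter choices (window size, bin count) are hypothetical names and values for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def local_entropy(img, win=9, bins=16):
    """Shannon entropy of the grey-level distribution in a sliding window.

    img is assumed to be a float array scaled to [0, 1].
    """
    q = np.clip((img * bins).astype(int), 0, bins - 1)
    ent = np.zeros(img.shape, dtype=float)
    for b in range(bins):
        # Local proportion of pixels falling in bin b.
        p = uniform_filter((q == b).astype(float), size=win)
        nz = p > 0
        ent[nz] -= p[nz] * np.log2(p[nz])
    return ent

def saliency_sketch(img, predicted):
    """Fuse entropy-of-surprise and gradient-magnitude sub-saliency maps."""
    # Residual the predictive model fails to explain (the 'surprise' map).
    surprise = np.abs(img - predicted)
    s1 = local_entropy(surprise)                # 'surprise' sub-saliency
    gx, gy = sobel(img, axis=0), sobel(img, axis=1)
    s2 = np.hypot(gx, gy)                       # gradient-magnitude sub-saliency

    def norm(m):
        return (m - m.min()) / (m.max() - m.min() + 1e-12)

    # Simple multiplicative fusion of the two normalized sub-saliency maps.
    return norm(norm(s1) * norm(s2))
```

A caller would supply, for instance, `predicted = uniform_filter(img, size=5)` as a crude stand-in for the AR-RTV prediction and obtain a saliency map normalized to [0, 1].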
Acknowledgment
This work was supported by the Natural Science Foundation of China under Grants 61671283 and 61301113.
© 2019 Springer Nature Singapore Pte Ltd.
Cite this paper
Ye, P., Wang, Y., Xia, Y., An, P., Zhang, J. (2019). Enhanced Saliency Prediction via Free Energy Principle. In: Zhai, G., Zhou, J., An, P., Yang, X. (eds) Digital TV and Multimedia Communication. IFTC 2018. Communications in Computer and Information Science, vol 1009. Springer, Singapore. https://doi.org/10.1007/978-981-13-8138-6_3
Print ISBN: 978-981-13-8137-9
Online ISBN: 978-981-13-8138-6