
A new deep spatial transformer convolutional neural network for image saliency detection

Published in Design Automation for Embedded Systems

Abstract

In this paper we propose a novel deep spatial transformer convolutional neural network (Spatial Net) framework for detecting salient and abnormal regions in images. The proposed method is general and has three main parts: (1) contextual information in the image is captured by convolutional neural networks (CNNs) that automatically learn high-level features; (2) to better adapt the CNN model to the saliency task, the feature sub-network is redesigned, following the spatial transformer network, to output the six parameters of an affine transformation; several local features that effectively capture edge pixels of the salient area are extracted and embedded into the model to reduce the effect of highlighted background regions; (3) finally, regions of interest are detected through a linear combination of the global and local feature information. Experimental results demonstrate that Spatial Net achieves superior detection performance over state-of-the-art algorithms on two popular datasets, while requiring less memory and computation.
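As a rough illustration of part (2), the following is a minimal, hypothetical PyTorch sketch in which a small localization sub-network regresses the six affine parameters and warps a CNN feature map with a spatial transformer. The layer sizes, the module name SpatialTransformerBlock, and the regression head fc_theta are illustrative assumptions, not the authors' exact Spatial Net architecture.

```python
# A minimal sketch of part (2) of the abstract: a localization sub-network
# predicts a 6-dimensional affine transformation and warps the feature map.
# All layer sizes and names are illustrative assumptions, not the authors'
# exact Spatial Net design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialTransformerBlock(nn.Module):
    def __init__(self, in_channels: int):
        super().__init__()
        # Localization network: a small CNN summarising the input feature map.
        self.localization = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Regress the 6 parameters of the 2x3 affine matrix.
        self.fc_theta = nn.Linear(32 * 4 * 4, 6)
        # Initialise to the identity transform so training starts from
        # an unwarped feature map.
        nn.init.zeros_(self.fc_theta.weight)
        self.fc_theta.bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Predict the 6-dimensional transformation vector and reshape to 2x3.
        theta = self.fc_theta(self.localization(feat).flatten(1))
        theta = theta.view(-1, 2, 3)
        # Build the sampling grid and warp the feature map.
        grid = F.affine_grid(theta, feat.size(), align_corners=False)
        return F.grid_sample(feat, grid, align_corners=False)


if __name__ == "__main__":
    # Toy usage: warp a batch of 64-channel feature maps.
    block = SpatialTransformerBlock(in_channels=64)
    warped = block(torch.randn(2, 64, 56, 56))
    print(warped.shape)  # torch.Size([2, 64, 56, 56])
```

Initialising the regression layer to the identity transform is a common choice for spatial transformers, so that warping only departs from the original feature map as training progresses.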

Acknowledgements

The authors acknowledge support from the National Natural Science Foundation of Shaanxi Province (Grant No. 2016JM6023).

Author information

Corresponding author

Correspondence to Xinsheng Zhang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, X., Gao, T. & Gao, D. A new deep spatial transformer convolutional neural network for image saliency detection. Des Autom Embed Syst 22, 243–256 (2018). https://doi.org/10.1007/s10617-018-9209-0
