Reverse Attention for Salient Object Detection

  • Shuhan Chen
  • Xiuli Tan
  • Ben Wang
  • Xuelong Hu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11213)


Benefiting from the rapid development of deep learning techniques, salient object detection has achieved remarkable progress recently. However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while maintaining accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the currently predicted salient regions from side-output features, the network can eventually explore the missing object parts and details, which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in terms of simplicity, efficiency (45 FPS), and model size (81 MB).
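The top-down refinement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "erasing" means multiplying side-output features by one minus the sigmoid of the upsampled coarse prediction, and that the refined map is the upsampled prediction plus a residual. The names `refine_step`, `reverse_attention_weights`, and `channel_mean` are hypothetical, nearest-neighbour upsampling is a simplification, and `channel_mean` merely stands in for the learned convolutional layers.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def reverse_attention_weights(logits):
    """Assumed form: weights are high where the prediction is NOT salient."""
    return [[1.0 - sigmoid(v) for v in row] for row in logits]


def refine_step(coarse_logits, side_features, residual_fn, factor=2):
    """One top-down step: upsample, erase salient regions, add a residual.

    side_features is a list of channels, each a 2-D list at the upsampled
    resolution. residual_fn is a placeholder for the learned layers.
    """
    up = upsample_nearest(coarse_logits, factor)
    weights = reverse_attention_weights(up)
    # Erase already-predicted salient regions from every feature channel.
    erased = [
        [[w * f for w, f in zip(w_row, f_row)]
         for w_row, f_row in zip(weights, channel)]
        for channel in side_features
    ]
    residual = residual_fn(erased)
    # Residual learning: the side output only corrects the coarse map.
    return [[u + r for u, r in zip(u_row, r_row)]
            for u_row, r_row in zip(up, residual)]


def channel_mean(channels):
    """Hypothetical stand-in for learned convolutions: average the channels."""
    n = len(channels)
    height, width = len(channels[0]), len(channels[0][0])
    return [[sum(c[i][j] for c in channels) / n for j in range(width)]
            for i in range(height)]


# Toy input: left half confidently salient, right half confidently background.
coarse = [[10.0, -10.0]]
features = [[[1.0] * 4 for _ in range(2)]]  # one 2x4 channel of ones
refined = refine_step(coarse, features, channel_mean)
```

In this toy case the residual is almost zero where the coarse map is already confident about saliency and close to the raw feature value elsewhere, which is the intended effect of steering the side output toward regions the current prediction has missed.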


Salient object detection · Reverse attention · Side-output residual learning



This work was supported by the Natural Science Foundation of China (No. 61502412), the Natural Science Foundation for Youths of Jiangsu Province (No. BK20150459), and the Foundation of Yangzhou University (No. 2017CXJ026).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Information Engineering, Yangzhou University, Yangzhou, China
