Multimedia Tools and Applications, Volume 77, Issue 3, pp 3387–3403

Semantic feature based multi-spectral saliency detection

  • Lan Wang
  • Chenqiang Gao
  • Jie Jian
  • Lin Tang
  • Jiang Liu


Saliency detection aims to locate the distinctive regions in images and has a wide range of applications. To date, most effort has gone into visible images, and the resulting methods usually struggle on images with complex backgrounds. In this paper, we propose a semantic feature based multi-spectral saliency detection method that exploits the complementarity of infrared and visible images. We use the thermal infrared image to relieve the difficulty posed by visible images with complex backgrounds, while still utilizing the rich texture and color information in visible images. Specifically, we first use a Convolutional Neural Network to extract high-level features from superpixels obtained by segmenting the visible and infrared images, and then compute an initial saliency map for each spectrum. After that, the two initial saliency maps are fused via a Total Variation (TV) minimization model, and finally the fused result is linearly combined with the enhanced foreground salient object map to obtain the final saliency detection result. Experimental results show that the proposed method outperforms the baseline methods.
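The last two stages described above (TV-based fusion of the two initial saliency maps, then a linear combination with a foreground map) can be illustrated with a minimal sketch. The abstract does not give the energy the authors minimize, so the code below assumes a simple smoothed, ROF-style energy with quadratic data terms. The function names `tv_fuse` and `final_saliency` and the parameters `lam`, `eps`, and `alpha` are hypothetical choices for illustration, not the paper's formulation.

```python
import numpy as np


def tv_fuse(s_vis, s_ir, lam=0.1, step=0.1, iters=200, eps=1e-2):
    """Fuse two saliency maps in [0, 1] by gradient descent on an ASSUMED energy:

        E(f) = 0.5*||f - s_vis||^2 + 0.5*||f - s_ir||^2
               + lam * sum(sqrt(|grad f|^2 + eps))

    This is a generic smoothed-TV model, not the model from the paper.
    """
    f = 0.5 * (s_vis + s_ir)  # start from the plain average
    for _ in range(iters):
        # Forward differences; the last column/row gets a zero gradient.
        gx = np.diff(f, axis=1, append=f[:, -1:])
        gy = np.diff(f, axis=0, append=f[-1:, :])
        norm = np.sqrt(gx**2 + gy**2 + eps)
        px, py = gx / norm, gy / norm
        # Divergence via backward differences (negative adjoint of the gradient).
        div = np.diff(px, axis=1, prepend=0) + np.diff(py, axis=0, prepend=0)
        # Gradient of E: two data terms minus lam * divergence.
        grad = (f - s_vis) + (f - s_ir) - lam * div
        f = f - step * grad
    return np.clip(f, 0.0, 1.0)


def final_saliency(fused, foreground, alpha=0.5):
    """Linearly combine the fused map with a foreground map (alpha is assumed)."""
    return alpha * fused + (1.0 - alpha) * foreground


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s_vis = rng.random((32, 32))   # stand-in for the visible-spectrum map
    s_ir = rng.random((32, 32))    # stand-in for the infrared map
    fg = np.zeros((32, 32))
    fg[8:24, 8:24] = 1.0           # stand-in for the enhanced foreground map
    result = final_saliency(tv_fuse(s_vis, s_ir), fg)
    print(result.shape, float(result.min()), float(result.max()))
```

With the default `lam` and `eps`, a step of 0.1 stays within the usual gradient-descent stability bound for this energy; larger `lam` or smaller `eps` would call for a smaller step.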


Saliency detection, Multi-spectrum, Infrared images



This work is supported by the National Natural Science Foundation of China (No. 61571071), the Wenfeng innovation and start-up project of Chongqing University of Posts and Telecommunications (No. WF201404), the National Social Science Foundation of China (No. 15BGL2729), and the Research Innovation Program for Postgraduate of Chongqing (No. CYS17222). The authors also thank NVIDIA Corporation for the donation of a GTX 980 GPU.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Lan Wang
    • 1
  • Chenqiang Gao
    • 1
  • Jie Jian
    • 1
  • Lin Tang
    • 1
  • Jiang Liu
    • 1
  1. Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing, China
