Top-Down Saliency by Multi-scale Contextual Pooling

Qiu, Yuanyuan; Zhu, Jun; Zhang, Rui; Huang, Jun

doi:10.1007/978-3-642-34778-8_27

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7674)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

Goal-driven, top-down mechanisms play an important role in object detection and recognition. In this paper, we propose a top-down computational model for goal-driven saliency detection based on a coding-based classification framework. It consists of four successive steps: feature extraction, descriptor coding, local pooling, and saliency prediction. In the local pooling step, we investigate the effect of multi-scale contextual information on saliency detection and find that there exists an optimal contextual scale for the patch-level feature representation. On the basis of this observation, we propose an approach for automatic scale selection in the saliency prediction step. Experimental results demonstrate that our method effectively improves the performance of goal-driven saliency detection as well as related object detection.
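The local pooling and scale selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the pooling operator, the classifier, or the selection rule, so max pooling, a linear scorer, and "keep the scale with the highest score per patch" are all assumptions made here for illustration. The names `contextual_pool`, `linear_score`, and `multiscale_saliency` are hypothetical, and the coded descriptors are assumed to be already computed on a regular patch grid.

```python
def contextual_pool(codes, radius):
    """Max-pool code vectors over a (2*radius+1)^2 window around each patch.

    codes is an H x W grid (nested lists) of K-dimensional code vectors.
    Max pooling is an assumption; the abstract does not name the operator.
    """
    h, w, k = len(codes), len(codes[0]), len(codes[0][0])
    pooled = []
    for i in range(h):
        row = []
        for j in range(w):
            # Collect neighbouring code vectors, clipping the window at the grid border.
            window = [codes[y][x]
                      for y in range(max(0, i - radius), min(h, i + radius + 1))
                      for x in range(max(0, j - radius), min(w, j + radius + 1))]
            row.append([max(v[d] for v in window) for d in range(k)])
        pooled.append(row)
    return pooled


def linear_score(feature, weights, bias):
    """Score a pooled feature with a linear classifier (assumed, e.g. a linear SVM)."""
    return sum(f * w for f, w in zip(feature, weights)) + bias


def multiscale_saliency(codes, radii, weights, bias):
    """Return (saliency map, selected-radius map).

    For each patch, the radius whose pooled feature scores highest is kept.
    This selection rule is a hypothetical stand-in for the paper's method.
    """
    h, w = len(codes), len(codes[0])
    pooled_by_radius = {r: contextual_pool(codes, r) for r in radii}
    saliency = [[0.0] * w for _ in range(h)]
    chosen = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            best = max(radii, key=lambda r: linear_score(pooled_by_radius[r][i][j], weights, bias))
            chosen[i][j] = best
            saliency[i][j] = linear_score(pooled_by_radius[best][i][j], weights, bias)
    return saliency, chosen


# Toy usage: a 2 x 2 patch grid with 2-dimensional codes and two context radii.
toy_codes = [[[1, 0], [0, 0]],
             [[0, 0], [0, 2]]]
toy_saliency, toy_radii = multiscale_saliency(toy_codes, radii=[0, 1], weights=[1, -1], bias=0)
```

In this sketch a radius of 0 reduces to the patch's own code, while larger radii fold in more surrounding context; returning the selected radius alongside the score makes it easy to inspect which contextual scale was preferred at each location.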

Author information

Authors and Affiliations

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, P.R. China
Yuanyuan Qiu, Jun Zhu & Rui Zhang
Shanghai Key Laboratory of Digital Media Processing and Transmission, Shanghai Jiao Tong University, Shanghai, P.R. China
Yuanyuan Qiu, Jun Zhu & Rui Zhang
Shanghai Advanced Research Institute, Chinese Academy of Sciences, China
Jun Huang

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
Weisi Lin, Dong Xu, Jianxin Wu, Ying He & Jianfei Cai
Department of Computing, University of Surrey, GU2 7XH, Guildford, UK
Anthony Ho
Department of Computer Science, School of Computing, National University of Singapore, Building AS6, Room #05-06, 117417, Singapore
Mohan Kankanhalli
Department of Electrical Engineering, University of Washington, M418 EE/CSE, Box 352500, 98195, Seattle, WA, USA
Ming-Ting Sun

About this paper

Cite this paper

Qiu, Y., Zhu, J., Zhang, R., Huang, J. (2012). Top-Down Saliency by Multi-scale Contextual Pooling. In: Lin, W., et al. Advances in Multimedia Information Processing – PCM 2012. PCM 2012. Lecture Notes in Computer Science, vol 7674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34778-8_27

DOI: https://doi.org/10.1007/978-3-642-34778-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34777-1
Online ISBN: 978-3-642-34778-8
eBook Packages: Computer Science, Computer Science (R0)
