Visual Attention Modelling in a 3D Context

Abstract

This chapter provides a general framework for visual attention modelling. It describes a combination of state-of-the-art saliency detection approaches, obtained by extending several spatial-domain approaches to the temporal domain. The proposed saliency detection methods, with and without depth information, are applied to video to detect salient regions. Experimental results then validate the quality of the saliency maps against eye-tracking results. The chapter also deals with the integration of visual attention models into video compression algorithms. Together with the eye-tracking data, this use case provides a validation framework for assessing the relevance of saliency extraction methods. Using the proposed saliency models for video coding, we demonstrate substantial performance gains.
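The abstract does not state which score is used to compare saliency maps with eye-tracking results. As an illustration only, the sketch below assumes normalized scanpath saliency (NSS), a commonly used agreement measure: the saliency map is standardized to zero mean and unit variance, and the standardized values are averaged at the fixated pixels. The function name and the data layout (a list of rows, fixations as `(row, col)` pairs) are hypothetical choices for this example, not the chapter's implementation.

```python
from statistics import mean, pstdev


def normalized_scanpath_saliency(saliency, fixations):
    """Return the NSS of a saliency map for a set of fixation points.

    saliency  : 2D list of floats (list of rows), assumed layout.
    fixations : list of (row, col) pixel coordinates from an eye tracker.
    """
    values = [v for row in saliency for v in row]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all pixels
    if sigma == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    # Average the standardized saliency at the fixated pixels.
    return mean((saliency[r][c] - mu) / sigma for r, c in fixations)


# Toy 2x2 map with a single salient pixel in the bottom-right corner.
toy_map = [[0.0, 0.0],
           [0.0, 1.0]]
score_hit = normalized_scanpath_saliency(toy_map, [(1, 1)])
score_miss = normalized_scanpath_saliency(toy_map, [(0, 0)])
```

Under this measure, a positive score means fixations fall on pixels the model rates above average, a score near zero means chance-level agreement, and a negative score means fixations land on regions the model rates as less salient than average.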


Notes

  1. A Matlab implementation of the face detection algorithm is available at http://people.kyb.tuebingen.mpg.de/kienzle/fdlib/fdlib.htm.

  2. http://www.tobii.com/en/eye-tracking-research/global/products/hardware/tobii-tx300-eye-tracker.

  3. http://www.lg.com/de/service-produkt/lg-47LM660S.

  4. http://www.dvs.de/de/produkte/video-systems/clipster.htm.

  5. The Tobii Technology TX300 eye tracker system was used to obtain the heat maps.

  6. http://www.videolan.org/developers/x264.html.


Acknowledgements

This work was supported by the ROMEO project (grant number: 287896), which was funded by the EC FP7 ICT collaborative research program.

Author information

Corresponding author

Correspondence to Haroon Qureshi.

Copyright information

© 2015 Springer Science+Business Media New York

About this chapter

Cite this chapter

Qureshi, H., Tizon, N. (2015). Visual Attention Modelling in a 3D Context. In: Kondoz, A., Dagiuklas, T. (eds) Novel 3D Media Technologies. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2026-6_6

  • DOI: https://doi.org/10.1007/978-1-4939-2026-6_6

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-2025-9

  • Online ISBN: 978-1-4939-2026-6

  • eBook Packages: Engineering, Engineering (R0)
