
Modelling salient visual dynamics in videos


Abstract

Automatic video annotation is a critical step for content-based video retrieval and browsing. Automatically detecting the focus of interest in video frames can ease the tedious manual labeling process. However, delimiting an appropriate extent for visually salient regions in video sequences is a challenging task. In this work, we therefore propose a novel approach for modeling dynamic visual attention based on spatiotemporal analysis. Our model first detects salient points in three-dimensional video volumes and then uses these points as seeds from which to grow salient regions in a novel motion attention map. To determine the extent of the attended regions, we apply maximum-entropy analysis in the spatial domain to the dynamics derived from the spatiotemporal analysis. Our experimental results show that the proposed dynamic visual attention model achieves a high precision of 70% and remains robust across successive video volumes.
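As a rough, self-contained illustration of the pipeline sketched above, the Python fragment below pairs a motion attention map with a maximum-entropy threshold that delimits the extent of attended regions. The flow-magnitude map and all function names are assumptions introduced for illustration only; the paper's actual map comes from its spatiotemporal analysis, so this is a minimal sketch rather than the authors' implementation.

```python
import cv2
import numpy as np

def motion_attention_map(prev_gray, gray):
    """Motion attention map from dense optical-flow magnitude.

    ASSUMPTION: the paper derives its map from spatiotemporal
    analysis; Farneback flow merely stands in for it here.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Rescale to 8-bit so a 256-bin histogram can be built below.
    return cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

def max_entropy_threshold(img):
    """Kapur-style maximum-entropy threshold over an 8-bit map.

    Picks the gray level that maximizes the summed entropies of the
    background and foreground histogram partitions.
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    cum = p.cumsum()
    best_t, best_h = 0, -np.inf
    for t in range(1, 256):
        w_bg, w_fg = cum[t - 1], 1.0 - cum[t - 1]
        if w_bg <= 0 or w_fg <= 0:
            continue
        bg, fg = p[:t], p[t:]
        h_bg = -np.sum(bg[bg > 0] / w_bg * np.log(bg[bg > 0] / w_bg))
        h_fg = -np.sum(fg[fg > 0] / w_fg * np.log(fg[fg > 0] / w_fg))
        if h_bg + h_fg > best_h:
            best_h, best_t = h_bg + h_fg, t
    return best_t

# Hypothetical usage: pixels at or above the threshold delimit the
# extent of the attended region around each salient seed point.
# mask = attention_map >= max_entropy_threshold(attention_map)
```

A maximum-entropy criterion fits the abstract's description because it selects the threshold at which the foreground and background partitions jointly carry the most information, giving a principled cut-off for how far a salient region extends from its seed point.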



Author information


Corresponding author

Correspondence to Duan-Yu Chen.


About this article

Cite this article

Chen, DY. Modelling salient visual dynamics in videos. Multimed Tools Appl 53, 271–284 (2011). https://doi.org/10.1007/s11042-010-0511-5

