Abstract
In this paper we are interested in the saliency of visual content from wearable cameras. Subjective saliency in wearable video is studied first, through a psycho-visual experiment on this content. We then propose a method for computing objective saliency maps, with a specific contribution based on geometrical saliency. Spatial, temporal and geometric cues are fused into an objective saliency map by a multiplicative operator. The resulting objective saliency maps are evaluated against the subjective maps with promising results, highlighting the good performance of the proposed geometric saliency model.
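The abstract describes multiplicative fusion of spatial, temporal and geometric saliency cues. The sketch below is only a minimal illustration of that idea, not the authors' implementation: the map names, the min-max normalization, and the Gaussian center-bias stand-in for the geometric cue are assumptions made for the example.

```python
import numpy as np

def normalize(saliency_map):
    """Scale a saliency map to [0, 1]; a common convention, assumed here."""
    s = saliency_map.astype(np.float64)
    s -= s.min()
    rng = s.max()
    return s / rng if rng > 0 else s

def geometric_saliency(height, width, sigma_ratio=0.2):
    """Hypothetical geometric cue: a 2D Gaussian centered on the frame.
    The paper's geometric model is more specific; this is only a stand-in."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sigma = sigma_ratio * min(height, width)
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def fuse_multiplicative(spatial, temporal, geometric):
    """Multiplicative fusion of the three cues into one objective saliency map."""
    fused = normalize(spatial) * normalize(temporal) * normalize(geometric)
    return normalize(fused)

# Toy usage: random maps stand in for real spatial and temporal cue extractors.
h, w = 288, 352
spatial = np.random.rand(h, w)
temporal = np.random.rand(h, w)
objective_map = fuse_multiplicative(spatial, temporal, geometric_saliency(h, w))
```

In the paper, subjective maps built from gaze data serve as the reference; comparing the fused map against such a subjective map (e.g., with a correlation-style measure) would complete the evaluation step mentioned in the abstract.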
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Boujut, H., Benois-Pineau, J., Megret, R. (2012). Fusion of Multiple Visual Cues for Visual Saliency Extraction from Wearable Camera Settings with Strong Motion. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_44
DOI: https://doi.org/10.1007/978-3-642-33885-4_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4