Visual Content Indexing and Retrieval with Psycho-Visual Models

Le Callet, Patrick; Benois-Pineau, Jenny

doi:10.1007/978-3-319-57687-9_1

Chapter
First Online: 16 October 2017

Part of the book series: Multimedia Systems and Applications (MMSA)

Abstract

The present chapter is an introduction to the book. Interest in its subject has grown sharply over the last decade in the computer vision and multimedia indexing research communities. Models of the Human Visual System (HVS) were first used in video quality assessment, generally to predict where humans will foveate and how they will perceive degradation. These methods have since moved to classical image and video indexing and retrieval tasks, such as the recognition of objects, events, and actions in images and video. In this book we try to give the most complete overview of methods for visual information indexing and retrieval that use prediction of visual attention or saliency. We also consider new approaches specifically designed for these tasks.

Author information

Authors and Affiliations

LS2N UMR CNRS 6004, Université de Nantes, Nantes Cedex 3, France
Patrick Le Callet
LaBRI UMR 5800, Univ. Bordeaux, CNRS, Bordeaux INP, 351, crs de la Liberation, F33405, Talence Cedex, France
Jenny Benois-Pineau

Corresponding author

Correspondence to Patrick Le Callet.

Editor information

Editors and Affiliations

LaBRI UMR 5800, Univ. Bordeaux, CNRS, Bordeaux INP, Talence, France
Jenny Benois-Pineau
LS2N, UMR CNRS 6004, Université de Nantes, Nantes Cedex 3, France
Patrick Le Callet

About this chapter

Cite this chapter

Le Callet, P., Benois-Pineau, J. (2017). Visual Content Indexing and Retrieval with Psycho-Visual Models. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_1

DOI: https://doi.org/10.1007/978-3-319-57687-9_1
Published: 16 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57686-2
Online ISBN: 978-3-319-57687-9
eBook Packages: Computer Science, Computer Science (R0)
