Abstract
This chapter introduces the book. Over the last decade, interest in its subject has grown rapidly in the computer vision and multimedia indexing research communities. Models of the Human Visual System (HVS) were first used in video quality assessment, generally to predict where humans will foveate and how they will perceive degradation. These methods have since moved to classical image and video indexing and retrieval tasks, and to the recognition of objects, events, and actions in images and video. In this book we aim to give a comprehensive overview of methods for visual information indexing and retrieval that use prediction of visual attention or saliency. We also consider new approaches specifically designed for these tasks.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Le Callet, P., Benois-Pineau, J. (2017). Visual Content Indexing and Retrieval with Psycho-Visual Models. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_1
DOI: https://doi.org/10.1007/978-3-319-57687-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57686-2
Online ISBN: 978-3-319-57687-9
eBook Packages: Computer Science (R0)