Visual Content Indexing and Retrieval with Psycho-Visual Models

Part of the book series: Multimedia Systems and Applications ((MMSA))

Abstract

The present chapter is an introduction to the book. The subject we address has attracted rapidly growing interest over the last decade from the research community in computer vision and multimedia indexing. Models of the Human Visual System (HVS) were first used mainly in video quality assessment, to predict where humans will foveate and how they will perceive degradation. These methods have since moved to classical image and video indexing and retrieval tasks, such as the recognition of objects, events, and actions in images and video. In this book we try to give the most complete overview of methods for visual information indexing and retrieval that use prediction of visual attention or saliency. We also consider new approaches specifically designed for these tasks.

Author information

Corresponding author

Correspondence to Patrick Le Callet.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Le Callet, P., Benois-Pineau, J. (2017). Visual Content Indexing and Retrieval with Psycho-Visual Models. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_1

  • DOI: https://doi.org/10.1007/978-3-319-57687-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57686-2

  • Online ISBN: 978-3-319-57687-9

  • eBook Packages: Computer Science (R0)
