Content-based representation and retrieval of visual media: A state-of-the-art review

Multimedia Tools and Applications

Abstract

This paper reviews a number of recently available techniques in content analysis of visual media and their application to the indexing, retrieval, abstracting, relevance assessment, interactive perception, annotation and re-use of visual documents.

Additional information

This work was performed while this author was with the Institute of Systems Science, Singapore.

About this article

Cite this article

Aigrain, P., Zhang, H. & Petkovic, D. Content-based representation and retrieval of visual media: A state-of-the-art review. Multimed Tools Appl 3, 179–202 (1996). https://doi.org/10.1007/BF00393937
