Conceptual Indexing of Television Images Based on Face and Caption Sizes and Locations

  • Remi Ronfard
  • Christophe Garcia
  • Jean Carrive
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1929)


Indexing videos by their image content is an important issue for digital audiovisual archives. While much work has been devoted to classification and indexing methods based on perceptual qualities of images, such as color, shape and texture, there is also a need for classification and indexing of some structural properties of images. In this paper, we present some methods for image classification in video, based on the presence, size and location of faces and captions. We argue that such classifications are highly domain-dependent, and are best handled using flexible knowledge management systems (in our case, a description logic).
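As an illustration of classifying a frame from the presence, size and location of faces and captions, the following is a minimal sketch only. The class labels, the thresholds, and the names Box and classify_frame are hypothetical and are not taken from the paper. The paper handles such rules in a description logic, whereas this sketch swaps in plain hard-coded Python rules for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Axis-aligned region in normalised image coordinates (0..1)."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def center_y(self) -> float:
        return self.y + self.height / 2.0


def classify_frame(faces: List[Box], captions: List[Box]) -> List[str]:
    """Return illustrative labels for one frame.

    All thresholds below are assumptions chosen for the example;
    a real system would tune them per domain.
    """
    labels: List[str] = []

    # Presence and size of faces: the tallest face decides the label.
    if not faces:
        labels.append("no-face")
    else:
        largest = max(faces, key=lambda b: b.height)
        if largest.height >= 0.4:
            labels.append("close-up")
        elif largest.height >= 0.15:
            labels.append("medium-shot")
        else:
            labels.append("long-shot")
        if len(faces) >= 2:
            labels.append("multiple-faces")

    # Location of captions: any caption centred in the lower 30% of the frame.
    if any(c.center_y >= 0.7 for c in captions):
        labels.append("lower-caption")

    return labels


if __name__ == "__main__":
    face = Box(x=0.3, y=0.2, width=0.3, height=0.5)
    caption = Box(x=0.1, y=0.8, width=0.8, height=0.1)
    print(classify_frame([face], [caption]))
```

Because the abstract argues that such classifications are highly domain-dependent, fixed thresholds like these would likely differ from one programme genre to another, which is one reason to keep the rules in a separate, editable knowledge base rather than in code.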


Description Logic, Face Detection, Knowledge Management System, Broadcast News, Text Detection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Remi Ronfard (1)
  • Christophe Garcia (2)
  • Jean Carrive (1)
  1. INA, Bry-sur-Marne, France
  2. ICS-FORTH, Crete, Greece
