Spatio-Temporal Formalization of Video Events

  • Milan Petković
  • Willem Jonker
Part of The Springer International Series in Engineering and Computer Science book series (MMSA, volume 25)


As concluded in the previous chapter, the main gap in video retrieval lies between low-level media features and high-level concepts. Because video is, at the physical level, a temporal sequence of pixel regions, its semantic content is very difficult to explore. Several domain-dependent research efforts have addressed this problem. These approaches use domain knowledge to facilitate the extraction of high-level concepts directly from features. In particular, they mainly use information on object positions and their transitions over time, and relate these to particular events (high-level concepts). For example, methods have been proposed to detect events in football games [1], soccer games [2], tennis [3, 4], wildlife hunts [5], and a static room [6]. Motion (for a review see [7]) and audio are very often used, in isolation, for event recognition. In [8], for example, highlights are extracted from baseball games using audio only.


Keywords: Topological Relation; Dominant Color; Event Description; Minimum Bounding Rectangle; Video Event




References

  [1] S. Intille, A. Bobick, “Visual Tracking Using Closed-Worlds”, Tech. Report No. 294, M.I.T. Media Laboratory, 1994.
  [2] Y. Gong, L. T. Sin, C. H. Chuan, H-J. Zhang, M. Sakauchi, “Automatic Parsing of TV Soccer Programs”, In Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Washington D.C., 1995, pp. 167–174.
  [3] H. Miyamori, S-I. Iisaku, “Video Annotation for Content-based Retrieval using Human Behavior Analysis and Domain Knowledge”, In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France, 2000, pp. 320–325.
  [4] G. Sudhir, J. Lee, A. Jain, “Automatic Classification of Tennis Video for High-level Content-based Retrieval”, IEEE Workshop on Content-based Access and Image and Video Databases, Bombay, India, 1998, pp. 81–90.
  [5] N. Haering, R. J. Qian, M. I. Sezan, “A semantic event-detection approach and its application to detecting hunts in wildlife video”, IEEE Transactions on Circuits and Systems for Video Technology, 10 (6), Sept. 2000, pp. 857–868.
  [6] D. Ayers, M. Shah, “Recognizing Human Actions in a Static Room”, IEEE Workshop on Applications of Computer Vision (WACV), Princeton, NJ, 1998, pp. 42–47.
  [7] M. Shah, R. Jain (eds.), Motion-Based Recognition, Kluwer Academic Publishers, 1997.
  [8] Y. Rui, A. Gupta, A. Acero, “Automatically Extracting Highlights for TV Baseball Programs”, In Proceedings of the ACM Multimedia International Conference, Los Angeles, CA, 2000, pp. 105–115.
  [9] D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie, C. Carson, C. Bregler, “Finding Pictures of Objects in Large Collections of Images”, In Proceedings of the European Conference on Computer Vision (ECCV) ’96 Workshop on Object Representation, Cambridge, April 1996.
  [10] International Organization for Standardization, Overview of the MPEG-4 Standard, R. Koenen (ed.), N2995, Melbourne, October 1999.
  [11] M. Egenhofer, “Reasoning about Binary Topological Relations”, In Proceedings of the Second Symposium on the Design and Implementation of Large Spatial Databases, Springer Verlag LNCS.
  [12] M. Egenhofer, R. Franzosa, “Point-set topological spatial relations”, International Journal of Geographic Information Systems, Vol. 5, No. 2, pp. 161–174.
  [13] J. F. Allen, “Maintaining knowledge about temporal intervals”, Communications of the ACM, 26 (11), 1983, pp. 832–843.
  [14] D. Papadias, Y. Theodoridis, T. Sellis, M. Egenhofer, “Topological Relations in the World of Minimum Bounding Rectangles: A Study with R-Trees”, SIGMOD Record, 24 (2), 1995, pp. 92–103.
  [15] D. Peuquet, Z. Ci-Xiang, “An Algorithm to Determine the Directional Relationship between Arbitrarily-Shaped Polygons in the Plane”, Pattern Recognition, Vol. 20, No. 1, pp. 65–74.
  [16] H-J. Chang, S-K. Chang, “Temporal Modeling and Intermedia Synchronization for Presentation of Multimedia Streams”, In Multimedia Information Storage and Management, S. M. Chung (ed.), Kluwer Academic Publishers, 1996, pp. 373–398.
  [17] T. C. T. Kuo, A. L. P. Chen, “A Content-Based Query Language for Video Databases”, Proc. of IEEE Multimedia Computing Systems, 1996.
  [18] MPEG Requirements Group, MPEG-7 visual part of eXperimentation Model 6.0, ISO/IEC JTC1/SC29/WG11 MPEG2000/N3398, Geneva, CH, June 2000.
  [19] MPEG Requirements Group, Working Draft 2.0 of MPEG-7 Visual, ISO/IEC JTC1/SC29/WG11 MPEG2000/N3322, Noordwijkerhout, NL, March 2000.
  [20] M. Petkovic, W. Jonker, “A Framework for Video Modeling”, In Proceedings of the Eighteenth IASTED International Conference on Applied Informatics, Innsbruck, February 2000, pp. 317–322.
  [21] Z. Zivkovic, F. van der Heijden, M. Petkovic, W. Jonker, “Image processing and feature extraction for recognizing strokes in tennis game videos”, In Proceedings of the Seventh Annual Conference of the Advanced School for Computing and Imaging, the Netherlands, June 2001.
  [22] M. Petkovic, R. Zwol, H. E. Blok, W. Jonker, P. M. G. Apers, M. Windhouwer, M. Kersten, “Content-based Video Indexing for the Support of Digital Library Search”, In Proceedings of the 18th IEEE International Conference on Data Engineering (ICDE), San Jose, USA, February 2002.

Copyright information

© Springer Science+Business Media New York 2004

Authors and Affiliations

  • Milan Petković (1)
  • Willem Jonker (2)
  1. University of Twente, The Netherlands
  2. University of Twente and Philips Research, The Netherlands
