
Video Scene and Event Detection


Synonyms

Video scene and event extraction

Definition

A video scene, also called a logical story unit [7] or simply a story unit, is a series of consecutive, semantically related image frames that conveys a high-level concept, such as an event, topic, object, location, or action, and constitutes a story in a video. An event, in particular, is an incident or situation that occurs in a particular place during a particular interval of time, for example, a home run in a baseball game, an actor’s entrance on stage, or a car explosion on a highway. Under these definitions, video scene and event detection is the task of finding all intervals in a given video that correspond to a specific event.
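The output described above, a set of video intervals for one event, can be illustrated with a minimal sketch. It assumes that some upstream detector (not specified in this entry) has already produced one boolean flag per frame; the function name `frames_to_intervals` and the `fps` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def frames_to_intervals(flags: List[bool], fps: float = 30.0) -> List[Tuple[float, float]]:
    """Group runs of consecutive flagged frames into (start_sec, end_sec) intervals.

    `flags[i]` is True when frame i was judged to belong to the event.
    The per-frame judgment itself is assumed to come from elsewhere.
    """
    intervals: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a new run begins
        elif not flag and start is not None:
            intervals.append((start / fps, i / fps))  # the run ended before frame i
            start = None
    if start is not None:
        # the video ends while the event is still in progress
        intervals.append((start / fps, len(flags) / fps))
    return intervals


if __name__ == "__main__":
    # Six frames at 1 frame per second; the event covers frames 1-2 and frame 5.
    print(frames_to_intervals([False, True, True, False, False, True], fps=1.0))
```

Real detectors usually work with scores and shot boundaries rather than clean per-frame flags, so this only shows the shape of the result, not a detection method.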

Historical Background

Video scene and event detection has been an active research area in the community of multimedia signal processing and computer vision and has attracted much interest in many applications such as multimedia information retrieval, video archive indexing...


Recommended Reading

  1. Adams B, Amir A, Iyengar G, Lin C-Y, Naphade M, Neti C, Smith JR. Semantic indexing of multimedia content using visual, audio and text cues. EURASIP J Appl Signal Proc. 2003;2003(2):1–16.
  2. Babaguchi N, Kawai Y, Kitahashi T. Event based indexing of broadcasted sports video by intermodal collaboration. IEEE Trans Multimed. 2002;4(1):68–75.
  3. Babaguchi N, Nitta N. Intermodal collaboration: a strategy for semantic content analysis for broadcasted sports video. In: Proceedings of the International Conference Image Processing; 2003. p. 13–6.
  4. Chua T-S, Xu H. Fusion of AV features and external information sources for event detection in team sports video. ACM Trans Multimed Comput Commun Appl. 2006;2(1):44–67.
  5. Goh K-S, Miyahara K, Radhakrishnan R, Xiong Z, Divakaran A. Audio-visual event detection based on mining of semantic audio-visual labels. MERL, TR-2004-008. 2004.
  6. Gong Y, Xu W. Machine learning for multimedia content analysis. Berlin: Springer; 2007.
  7. Hanjalic A, Lagendijk RL, Biemond J. Automated high-level movie segmentation for advanced video-retrieval systems. IEEE Trans Circ Syst Video Techn. 1999;9(4):580–8.
  8. Hauptmann AG, Smith MA. Text, speech, and vision for video segmentation: the informedia project. In: Proceedings of the AAAI Symposium on Computational Models for Integrating Language and Vision; 1995. p. 90–5.
  9. Li Y, Kuo C-CJ. Video content analysis using multimodal information: for movie content extraction, indexing and representation. Norwell: Kluwer; 2003.
  10. Lienhart R, Pfeiffer S, Effelsberg W. Video abstracting. Commun ACM. 1997;40(12):55–62.
  11. Merlino A, Morey D, Maybury M. Broadcast news navigation using story segmentation. In: Proceedings of the 5th ACM International Conference on Multimedia; 1997. p. 381–91.
  12. Rui Y, Huang TS, Mehrotra S. Constructing table-of-content for videos. ACM Multimed Syst J. 1999;7(5):359–68.
  13. Sundaram H, Chang S-F. Computable scenes and structures in films. IEEE Trans Multimed. 2002;4(4):482–91.
  14. The National Institute of Standards and Technology (NIST). TREC video retrieval evaluation. 2001–2014. http://www-nlpir.nist.gov/projects/trecvid/
  15. Xie L, Xu P, Chang S-F, Divakaran A, Sun H. Structure analysis of soccer video with domain knowledge and hidden Markov models. Pattern Recogn Lett. 2004;25(7):767–75.


Author information

Corresponding author

Correspondence to Noboru Babaguchi.


Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry


Cite this entry

Babaguchi, N., Nitta, N. (2018). Video Scene and Event Detection. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1022
