Video scene segmentation and semantic representation using a novel scheme

Zhu, Songhao; Liu, Yuncai

doi:10.1007/s11042-008-0233-0

Published: 08 October 2008

Volume 42, pages 183–205, (2009)

Multimedia Tools and Applications

Songhao Zhu¹ &
Yuncai Liu¹


Abstract

Grouping video content into semantic segments and classifying those scenes by type are crucial to content-based video organization, management, and retrieval. This paper proposes a novel approach that automatically segments a video into scenes and represents each scene semantically. First, video shots are detected using a rough-to-fine algorithm. Second, key-frames within each shot are selected adaptively using hybrid features, and redundant key-frames are removed by template matching. Third, spatio-temporally coherent shots are clustered into the same scene based on the temporal constraint of video content and the visual similarity between shot activities. Finally, based on an analysis of the typical characteristics of continuously recorded videos, scene content is represented semantically to meet users' video retrieval needs. The proposed algorithm has been tested on films and TV programs of various genres. The experimental results are promising and suggest that the method supports efficient retrieval of video content of interest.
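The abstract does not give the similarity measure, thresholds, or window size used in the shot-clustering step, so the following is only a minimal sketch of the general idea of time-constrained shot grouping, not the authors' implementation. It assumes each shot is summarised by a normalised colour histogram and compares shots by histogram intersection; the function names, `similarity_threshold`, and `temporal_window` are hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two normalised histograms (assumed shot feature)."""
    return sum(min(x, y) for x, y in zip(a, b))


def group_shots_into_scenes(
    shot_histograms: List[Sequence[float]],
    similarity_threshold: float = 0.7,  # hypothetical value, not from the paper
    temporal_window: int = 3,           # hypothetical: how many earlier shots are compared
) -> List[List[int]]:
    """Group shot indices into scenes.

    A shot is linked to the earliest visually similar shot among the previous
    `temporal_window` shots; every shot between the two is pulled into the
    same scene. A shot with no similar predecessor starts a new scene.
    """
    scene_of: List[int] = []  # scene label per shot, kept non-decreasing
    for i, hist in enumerate(shot_histograms):
        start = max(0, i - temporal_window)
        match: Optional[int] = next(
            (j for j in range(start, i)
             if histogram_intersection(hist, shot_histograms[j]) >= similarity_threshold),
            None,
        )
        if match is None:
            scene_of.append(scene_of[-1] + 1 if scene_of else 0)
        else:
            label = scene_of[match]
            for k in range(match + 1, i):  # merge the intervening shots
                scene_of[k] = label
            scene_of.append(label)

    # Collapse runs of equal labels into lists of shot indices.
    scenes: List[List[int]] = []
    for i, label in enumerate(scene_of):
        if scenes and scene_of[i - 1] == label:
            scenes[-1].append(i)
        else:
            scenes.append([i])
    return scenes
```

With a window of three shots, an alternating pattern such as a dialogue (shot A, cutaway B, shot A again) is merged into one scene because the third shot links back to the first; shrinking the window to one shot only allows links between adjacent shots, so the same sequence splits into separate scenes.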

Acknowledgements

This work is supported by the National High-Tech Research and Development Plan of China (973) under Grant No. 2006CB303103 and by the National Natural Science Foundation of China under Grant No. 60833009.

Author information

Authors and Affiliations

Institute of Image Processing and Pattern Recognition, School of Electronics and Electric Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Songhao Zhu & Yuncai Liu

Corresponding author

Correspondence to Songhao Zhu.

About this article

Cite this article

Zhu, S., Liu, Y. Video scene segmentation and semantic representation using a novel scheme. Multimed Tools Appl 42, 183–205 (2009). https://doi.org/10.1007/s11042-008-0233-0

Published: 08 October 2008
Issue Date: April 2009
DOI: https://doi.org/10.1007/s11042-008-0233-0
