Video Segmentation and Keyframe Extraction

doi:10.1007/978-1-84800-007-0_14

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

The goal of this chapter is to show how clustering techniques are applied to perform video segmentation, i.e. to split videos into segments that are meaningful from a semantic point of view. Segmentation is the first step of any process aimed at extracting high-level information from videos, i.e. information that is not explicitly stated in the data but requires an abstraction process [10] [17] [22]. Video segmentation can be thought of as analogous to the partitioning of a text into chapters, sections and other parts that help the reader to access the content more easily. In more general terms, segmenting a long document (text, video, audio, etc.) into smaller parts addresses the limits of the human mind in dealing with large amounts of information. In fact, humans are known to be more effective when managing five to nine information chunks than when handling a single block of information corresponding to the sum of the chunks [30].

References

M. Abdel-Mottaleb, N. Dimitrova, R. Desai, and J. Martino. CONIVAS: content based image and video access system. In Proceedings of ACM International Conference on Multimedia, pages 427-428, 1996.
A. Aner-Wolf and J. Kender. Video-summaries and cross-referencing through mosaic based representation. Computer Vision and Image Understanding, 95(2):201-237, 2004.
H. Aoki, S. Shimotsuji, and O. Hori. A shot classification method of selecting effective key-frames for video browsing. In Proceedings of ACM International Conference on Multimedia, pages 1-10, 1996.
R. Castagno, T. Ebrahimi, and M. Kunt. Video segmentation based on multiple features for interactive multimedia applications. IEEE Transactions on Circuits and Systems for Video Technology, 8(5):562-571, 1998.
Z. Cernekova, I. Pitas, and C. Nikou. Information theory-based shot cut/fade detection and video summarization. IEEE Transactions on Circuits and Systems for Video Technology, 16(1):82-91, 2006.
H.S. Chang, S. Sull, and S.U. Lee. Efficient video indexing scheme for content-based retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1269-1279, 1999.
R. Collobert, S. Bengio, and J. Mariéthoz. Torch: a modular machine learning software library. Technical Report 02-46, IDIAP, 2002.
P.L. Correia and F. Pereira. Classification of video segmentation application scenarios. IEEE Transactions on Circuits and Systems for Video Technology, 14(5):735-741, 2004.
J.M. Corridoni and A. Del Bimbo. Structured representation and automatic indexing of movie information content. Pattern Recognition, 31(12):2027-2045, 1998.
N. Dimitrova, H.J. Zhang, B. Shahraray, I. Sezan, T. Huang, and A. Zakhor. Application of video-content analysis and retrieval. IEEE Multimedia, 9(3):42-55, 2002.
A.D. Doulamis and N.D. Doulamis. Optimal content-based video decomposition for interactive video navigation. IEEE Transactions on Circuits and Systems for Video Technology, 14(6):757-775, 2004.
X. Du and G. Fan. Joint key-frame extraction and object segmentation for content-based video analysis. IEEE Transactions on Circuits and Systems for Video Technology, 16(7):904-914, 2006.
X. Gao and X. Tang. Unsupervised video-shot segmentation and model-free anchorperson detection for news video story parsing. IEEE Transactions on Circuits and Systems for Video Technology, 12(9):765-776, 2002.
D. Gatica-Perez, A. Loui, and M.T. Sun. Finding structure in home videos by probabilistic hierarchical clustering. IEEE Transactions on Circuits and Systems for Video Technology, 13(6):539-548, 2003.
J.M. Gauch and A. Shivadas. Finding and identifying unknown commercials using repeated video sequence detection. Computer Vision and Image Understanding, 103(1):80-88, 2006.
A. Hampapur, T. Weymouth, and R. Jain. Digital video segmentation. In Proceedings of ACM International Conference on Multimedia, pages 357-364, 1994.
A. Hanjalic. Shot boundary detection: unraveled and resolved? IEEE Transactions on Circuits and Systems for Video Technology, 12(2):90-105, 2002.
A. Hanjalic. Content Based Analysis of Digital Video. Springer-Verlag, 2004.
A. Hanjalic, R.L. Lagendijk, and J. Biemond. Automated high-level movie segmentation for advanced video-retrieval systems. IEEE Transactions on Circuits and Systems for Video Technology, 9(4):580-588, 1999.
A. Hanjalic and H.J. Zhang. An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1280-1289, 1999.
V. Kobla, D. Doermann, and C. Faloutsos. VideoTrails: representing and visualizing structure. In Proceedings of ACM International Conference on Multimedia, pages 335-346, 1997.
I. Koprinska and S. Carrato. Temporal video segmentation: a survey. Signal Processing: Image Communication, 16:477-500, 2001.
J. Lee and B.W. Dickinson. Hierarchical video indexing and retrieval for subband-coded video. IEEE Transactions on Circuits and Systems for Video Technology, 10(5):824-829, 2000.
M.S. Lee, Y.M. Yang, and S.W. Lee. Automatic video parsing using shot boundary detection and camera operation analysis. Pattern Recognition, 34(3):711-719, 2001.
R. Leonardi, P. Migliorati, and M. Prandini. Semantic indexing of soccer audio-visual sequences: a multimodal approach based on controlled Markov chains. IEEE Transactions on Circuits and Systems for Video Technology, 14(5):634-643, 2004.
Y. Li and J. Kuo. Video Content Analysis Using Multimodal Information. Springer-Verlag, 2003.
L. Liu and G. Fan. Combined key-frame extraction and object-based video segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 15(7):869-884, 2005.
S.D. MacArthur, C.E. Brodley, A.C. Kak, and L.S. Broderick. Interactive content-based image retrieval using relevance feedback. Computer Vision and Image Understanding, 88(2):55-75, 2002.
T. Meier and K.N. Ngan. Video segmentation for content-based coding. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1190-1203, 1999.
G.A. Miller. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63:81-97, 1956.
C.W. Ngo, T.C. Pong, and R.T. Chin. Video partitioning by temporal slice coherency. IEEE Transactions on Circuits and Systems for Video Technology, 11(8):941-953, 2001.
N.V. Patel and I.K. Sethi. Video shot detection and characterization for video databases. Pattern Recognition, 30(4):583-592, 1997.
M.J. Pickering and S. Rüger. Evaluation of key-frame based retrieval techniques for video. Computer Vision and Image Understanding, 92(2-3):217-235, 2003.
S. Porter, M. Mirmehdi, and B. Thomas. Temporal video segmentation and classification of edit effects. Image and Vision Computing, 21(13-14):1097-1106, 2003.
K.M. Pua, J.M. Gauch, S.E. Gauch, and J.Z. Miadowicz. Real-time repeated video sequence identification. Computer Vision and Image Understanding, 93(3):310-327, 2004.
E. Sahouria and A. Zakhor. Content analysis of video using principal component analysis. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1290-1298, 1999.
F. Schaffalitzky and A. Zisserman. Automated location matching in movies. Computer Vision and Image Understanding, 92(2-3):217-235, 2003.
M.A. Smith and M.G. Christel. Automating the creation of a digital video library. In Proceedings of ACM International Conference on Multimedia, pages 357-358, 1995.
M.A. Smith and T. Kanade. Multimodal Video Characterization and Summarization. Springer-Verlag, 2004.
C.G.M. Snoek and M. Worring. Multimodal video indexing: a review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5-35, 2005.
K.W. Sze, K.M. Lam, and G. Qiu. A new key frame representation for video segment retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 15(9):1148-1155, 2005.
Y. Taniguchi, A. Akutsu, Y. Tonomura, and H. Hamada. An intuitive and efficient access interface to real-time incoming video based on automatic indexing. In Proceedings of ACM International Conference on Multimedia, pages 25-33, 1995.
B.T. Truong, S. Venkatesh, and C. Dorai. Scene extraction in motion pictures. IEEE Transactions on Circuits and Systems for Video Technology, 13(1):5-15, 2003.
S. Tsekeridou and I. Pitas. Content-based video parsing and indexing based on audio-visual interaction. IEEE Transactions on Circuits and Systems for Video Technology, 11(4):522-535, 2001.
D. Wang. Unsupervised video segmentation based on watersheds and temporal tracking. IEEE Transactions on Circuits and Systems for Video Technology, 8(5):539-546, 1998.
B.L. Yeo and B. Liu. Rapid scene analysis on compressed video. IEEE Transactions on Circuits and Systems for Video Technology, 5(6):533-544, 1995.
M. Yeung, B.L. Yeo, and B. Liu. Segmentation of video by clustering and graph analysis. Computer Vision and Image Understanding, 71(1):94-109, 1998.
M.M. Yeung and B.L. Yeo. Video visualization for compact presentation and fast browsing of pictorial content. IEEE Transactions on Circuits and Systems for Video Technology, 7(5):771-785, 1997.
H. Yi, D. Rajan, and L.T. Chia. A motion-based scene tree for compressed video content management. Image and Vision Computing, 24(2):131-142, 2006.
H.H. Yu and W. Wolf. A hierarchical multiresolution video shot transition detection scheme. Computer Vision and Image Understanding, 75(1-2):196-213, 1999.
H.J. Zhang, C.Y. Low, S.W. Smoliar, and J.H. Wu. Video parsing, retrieval and browsing: an integrated and content-based solution. In Proceedings of ACM International Conference on Multimedia, pages 15-24, 1995.
H.J. Zhang, J. Wu, D. Zhong, and S.W. Smoliar. An integrated system for content-based video retrieval and browsing. Pattern Recognition, 30(4):643-658, 1997.
H.J. Zhang, J.H. Wu, C.Y. Low, and S.W. Smoliar. A video parsing, indexing and retrieval system. In Proceedings of ACM International Conference on Multimedia, pages 359-360, 1995.
Y.J. Zhang and H.B. Lu. A hierarchical organization scheme for video data. Pattern Recognition, 35(11):2381-2387, 2002.

(2008). Video Segmentation and Keyframe Extraction. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84800-007-0_14

DOI: https://doi.org/10.1007/978-1-84800-007-0_14
Publisher Name: Springer, London
Print ISBN: 978-1-84800-006-3
Online ISBN: 978-1-84800-007-0
eBook Packages: Computer Science, Computer Science (R0)
