Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

The goal of this chapter is to show how clustering techniques are applied to video segmentation, i.e. to splitting videos into semantically meaningful segments. Segmentation is the first step of any process aimed at extracting high-level information from videos, i.e. information that is not explicitly stated in the data but requires an abstraction process [10] [17] [22]. Video segmentation can be thought of as analogous to partitioning a text into chapters, sections and other parts that help the reader access the content. More generally, segmenting a long document (text, video, audio, etc.) into smaller parts addresses the limits of the human mind in dealing with large amounts of information: humans are known to be more effective when managing five to nine information chunks than when managing a single block corresponding to the sum of the chunks [30].

References

  1. M. Abdel-Mottaleb, N. Dimitrova, R. Desai, and J. Martino. CONIVAS: content based image and video access system. In Proceedings of ACM International Conference on Multimedia, pages 427-428, 1996.

  2. A. Aner-Wolf and J. Kender. Video-summaries and cross-referencing through mosaic based representation. Computer Vision and Image Understanding, 95(2):201-237, 2004.

  3. H. Aoki, S. Shimotsuji, and O. Hori. A shot classification method of selecting effective key-frames for video browsing. In Proceedings of ACM International Conference on Multimedia, pages 1-10, 1996.

  4. R. Castagno, T. Ebrahimi, and M. Kunt. Video segmentation based on multiple features for interactive multimedia applications. IEEE Transactions on Circuits and Systems for Video Technology, 8(5):562-571, 1998.

  5. Z. Cernekova, I. Pitas, and C. Nikou. Information theory-based shot cut/fade detection and video summarization. IEEE Transactions on Circuits and Systems for Video Technology, 16(1):82-91, 2006.

  6. H.S. Chang, S. Sull, and S.U. Lee. Efficient video indexing scheme for content-based retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1269-1279, 1999.

  7. R. Collobert, S. Bengio, and J. Mariéthoz. Torch: a modular machine learning software library. Technical Report 02-46, IDIAP, 2002.

  8. P.L. Correia and F. Pereira. Classification of video segmentation application scenarios. IEEE Transactions on Circuits and Systems for Video Technology, 14(5):735-741, 2004.

  9. J.M. Corridoni and A. Del Bimbo. Structured representation and automatic indexing of movie information content. Pattern Recognition, 31(12):2027-2045, 1998.

  10. N. Dimitrova, H.J. Zhang, B. Shahraray, I. Sezan, T. Huang, and A. Zakhor. Application of video-content analysis and retrieval. IEEE Multimedia, 9(3):42-55, 2002.

  11. A.D. Doulamis and N.D. Doulamis. Optimal content-based video decomposition for interactive video navigation. IEEE Transactions on Circuits and Systems for Video Technology, 14(6):757-775, 2004.

  12. X. Du and G. Fan. Joint key-frame extraction and object segmentation for content-based video analysis. IEEE Transactions on Circuits and Systems for Video Technology, 16(7):904-914, 2006.

  13. X. Gao and X. Tang. Unsupervised video-shot segmentation and model-free anchorperson detection for news video story parsing. IEEE Transactions on Circuits and Systems for Video Technology, 12(9):765-776, 2002.

  14. D. Gatica-Perez, A. Loui, and M.T. Sun. Finding structure in home videos by probabilistic hierarchical clustering. IEEE Transactions on Circuits and Systems for Video Technology, 13(6):539-548, 2003.

  15. J.M. Gauch and A. Shivadas. Finding and identifying unknown commercials using repeated video sequence detection. Computer Vision and Image Understanding, 103(1):80-88, 2006.

  16. A. Hampapur, T. Weymouth, and R. Jain. Digital video segmentation. In Proceedings of ACM International Conference on Multimedia, pages 357-364, 1994.

  17. A. Hanjalic. Shot boundary detection: unraveled and resolved? IEEE Transactions on Circuits and Systems for Video Technology, 12(2):90-105, 2002.

  18. A. Hanjalic. Content Based Analysis of Digital Video. Springer-Verlag, 2004.

  19. A. Hanjalic, R.L. Lagendijk, and J. Biemond. Automated high-level movie segmentation for advanced video-retrieval systems. IEEE Transactions on Circuits and Systems for Video Technology, 9(4):580-588, 1999.

  20. A. Hanjalic and H.J. Zhang. An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1280-1289, 1999.

  21. V. Kobla, D. Doermann, and C. Faloutsos. VideoTrails: representing and visualizing structure. In Proceedings of ACM International Conference on Multimedia, pages 335-346, 1997.

  22. I. Koprinska and S. Carrato. Temporal video segmentation: a survey. Signal Processing: Image Communication, 16:477-500, 2001.

  23. J. Lee and B.W. Dickinson. Hierarchical video indexing and retrieval for subband-coded video. IEEE Transactions on Circuits and Systems for Video Technology, 10(5):824-829, 2000.

  24. M.S. Lee, Y.M. Yang, and S.W. Lee. Automatic video parsing using shot boundary detection and camera operation analysis. Pattern Recognition, 34(3):711-719, 2001.

  25. R. Leonardi, P. Migliorati, and M. Prandini. Semantic indexing of soccer audio-visual sequences: a multimodal approach based on controlled Markov chains. IEEE Transactions on Circuits and Systems for Video Technology, 14(5):634-643, 2004.

  26. Y. Li and J. Kuo. Video Content Analysis Using Multimodal Information. Springer-Verlag, 2003.

  27. L. Lije and G. Fan. Combined key-frame extraction and object-based video segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 15(7):869-884, 2005.

  28. S.D. MacArthur, C.E. Brodley, A.C. Kak, and L.S. Broderick. Interactive content-based image retrieval using relevance feedback. Computer Vision and Image Understanding, 88(2):55-75, 2002.

  29. T. Meier and K.N. Ngan. Video segmentation for content-based coding. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1190-1203, 1999.

  30. G.A. Miller. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63:81-97, 1956.

  31. C.W. Ngo, T.C. Pong, and R.T. Chin. Video partitioning by temporal slice coherency. IEEE Transactions on Circuits and Systems for Video Technology, 11(8):941-953, 2001.

  32. N.V. Patel and I.K. Sethi. Video shot detection and characterization for video databases. Pattern Recognition, 30(4):583-592, 1997.

  33. M.J. Pickering and S. Rüger. Evaluation of key-frame based retrieval techniques for video. Computer Vision and Image Understanding, 92(2-3):217-235, 2003.

  34. S. Porter, M. Mirmehdi, and B. Thomas. Temporal video segmentation and classification of edit effects. Image and Vision Computing, 21(13-14):1097-1106, 2003.

  35. K.M. Pua, J.M. Gauch, S.E. Gauch, and J.Z. Miadowicz. Real-time repeated video sequence identification. Computer Vision and Image Understanding, 93(3):310-327, 2004.

  36. E. Sahouria and A. Zakhor. Content analysis of video using principal component analysis. IEEE Transactions on Circuits and Systems for Video Technology, 9(8):1290-1298, 1999.

  37. F. Schaffalitzky and A. Zisserman. Automated location matching in movies. Computer Vision and Image Understanding, 92(2-3):217-235, 2003.

  38. M.A. Smith and M.G. Christel. Automating the creation of a digital video library. In Proceedings of ACM International Conference on Multimedia, pages 357-358, 1995.

  39. M.A. Smith and T. Kanade. Multimodal Video Characterization and Summarization. Springer-Verlag, 2004.

  40. C.G.M. Snoek and M. Worring. Multimodal video indexing: a review of the state-of-the-art. Multimedia Tools and Applications, 25(1):5-35, 2005.

  41. K.W. Sze, K.M. Lam, and G. Qiu. A new key frame representation for video segment retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 15(9):1148-1155, 2005.

  42. Y. Taniguchi, A. Akutsu, Y. Tonomura, and H. Hamada. An intuitive and efficient access interface to real-time incoming video based on automatic indexing. In Proceedings of ACM International Conference on Multimedia, pages 25-33, 1995.

  43. B.T. Truong, S. Venkatesh, and C. Dorai. Scene extraction in motion pictures. IEEE Transactions on Circuits and Systems for Video Technology, 13(1):5-15, 2003.

  44. S. Tsekeridou and I. Pitas. Content-based video parsing and indexing based on audio-visual interaction. IEEE Transactions on Circuits and Systems for Video Technology, 11(4):522-535, 2001.

  45. D. Wang. Unsupervised video segmentation based on watersheds and temporal tracking. IEEE Transactions on Circuits and Systems for Video Technology, 8(5):539-546, 1998.

  46. B.L. Yeo and B. Liu. Rapid scene analysis on compressed video. IEEE Transactions on Circuits and Systems for Video Technology, 5(6):533-544, 1995.

  47. M. Yeung, B.L. Yeo, and B. Liu. Segmentation of video by clustering and graph analysis. Computer Vision and Image Understanding, 71(1):94-109, 1998.

  48. M.M. Yeung and B.L. Yeo. Video visualization for compact presentation and fast browsing of pictorial content. IEEE Transactions on Circuits and Systems for Video Technology, 7(5):771-785, 1997.

  49. H. Yi, D. Rajan, and L.T. Chia. A motion-based scene tree for compressed video content management. Image and Vision Computing, 24(2):131-142, 2006.

  50. H.H. Yu and W. Wolf. A hierarchical multiresolution video shot transition detection scheme. Computer Vision and Image Understanding, 75(1-2):196-213, 1999.

  51. H.J. Zhang, C.Y. Low, S.W. Smoliar, and J.H. Wu. Video parsing, retrieval and browsing: an integrated and content-based solution. In Proceedings of ACM International Conference on Multimedia, pages 15-24, 1995.

  52. H.J. Zhang, J. Wu, D. Zhong, and S.W. Smoliar. An integrated system for content-based video retrieval and browsing. Pattern Recognition, 30(4):643-658, 1997.

  53. H.J. Zhang, J.H. Wu, C.Y. Low, and S.W. Smoliar. A video parsing, indexing and retrieval system. In Proceedings of ACM International Conference on Multimedia, pages 359-360, 1995.

  54. Y.J. Zhang and H.B. Lu. A hierarchical organization scheme for video data. Pattern Recognition, 35(11):2381-2387, 2002.

Copyright information

© 2008 Springer

About this chapter

Cite this chapter

(2008). Video Segmentation and Keyframe Extraction. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84800-007-0_14

  • DOI: https://doi.org/10.1007/978-1-84800-007-0_14

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84800-006-3

  • Online ISBN: 978-1-84800-007-0

  • eBook Packages: Computer Science, Computer Science (R0)
