Video Segmentation and Keyframe Extraction

Machine Learning for Audio, Image and Video Analysis

Abstract

What the reader should know to understand this chapter:

  • Basic notions of image processing (Chap. 3).
  • Clustering techniques (Chap. 6).

Notes

  1. Figure 14.4 covers the whole video, which is why it shows four shot boundaries rather than the single boundary in Fig. 14.1, which covers only the first 20 s.

  2. At the time of writing, the package can be downloaded from http://torch3vision.idiap.ch.

  3. At the time of writing, the package is publicly available at http://www.torch.ch.

Author information

Corresponding author

Correspondence to Francesco Camastra.

Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Camastra, F., Vinciarelli, A. (2015). Video Segmentation and Keyframe Extraction. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-6735-8_14

  • DOI: https://doi.org/10.1007/978-1-4471-6735-8_14

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6734-1

  • Online ISBN: 978-1-4471-6735-8

  • eBook Packages: Computer Science, Computer Science (R0)
