Abstract
Segmenting news video into single-story semantic units and classifying those units is a challenging problem. This research proposes a two-level, multi-modal framework to tackle it. The video is analyzed at the shot level and at the story unit (or scene) level using a variety of features and techniques. At the shot level, we employ a decision tree to classify each shot into one of 13 predefined categories. At the scene level, we perform Hidden Markov Model (HMM) analysis to eliminate shot classification errors and to locate story boundaries. We test the performance of our system using two days of news video obtained from MediaCorp of Singapore. Our initial results indicate that the system achieves an accuracy of over 95% for shot classification. The HMM analysis further improves the shot classification accuracy and achieves over 89% accuracy on story segmentation.
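To make the second level concrete, the following is a minimal sketch of how an HMM can smooth noisy per-shot labels using Viterbi decoding. It is an illustration only, not the authors' implementation: it uses two hypothetical categories ("anchor", "report") instead of the 13 mentioned above, treats the shot classifier's output labels as the HMM observations, and uses made-up start, transition, and emission probabilities. The function name `viterbi` and all parameter names are assumptions introduced for this example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # Log-probabilities avoid numeric underflow on long shot sequences.
    prev = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        curr, pointers = {}, {}
        for s in states:
            # Best predecessor state for landing in state s at this step.
            best = max(states, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            curr[s] = (
                prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][obs])
            )
            pointers[s] = best
        backpointers.append(pointers)
        prev = curr
    # Trace back from the best final state.
    path = [max(states, key=lambda s: prev[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical model: shots tend to stay in the same category (0.9), and the
# shot classifier is assumed to be right 80% of the time.
STATES = ["anchor", "report"]
START_P = {"anchor": 0.5, "report": 0.5}
TRANS_P = {
    "anchor": {"anchor": 0.9, "report": 0.1},
    "report": {"anchor": 0.1, "report": 0.9},
}
EMIT_P = {
    "anchor": {"anchor": 0.8, "report": 0.2},
    "report": {"anchor": 0.2, "report": 0.8},
}

# Classifier output with one isolated, probably wrong, "anchor" label.
noisy = ["report", "report", "report", "anchor", "report", "report", "report"]
smoothed = viterbi(noisy, STATES, START_P, TRANS_P, EMIT_P)
print(smoothed)  # every shot is decoded as "report"

# A sustained change of label survives the smoothing.
changed = ["anchor"] * 3 + ["report"] * 4
kept = viterbi(changed, STATES, START_P, TRANS_P, EMIT_P)
```

With these assumed probabilities, an isolated label that disagrees with its neighbours is overridden, while a sustained change is preserved. That is the sense in which a sequence model can correct individual shot classification errors.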
Copyright information
© 2002 Springer Science+Business Media New York
About this chapter
Cite this chapter
Chaisorn, L., Chua, TS. (2002). The Segmentation and Classification of Story Boundaries in News Video. In: Zhou, X., Pu, P. (eds) Visual and Multimedia Information Management. VDB 2002. IFIP — The International Federation for Information Processing, vol 88. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35592-4_8
DOI: https://doi.org/10.1007/978-0-387-35592-4_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-6935-7
Online ISBN: 978-0-387-35592-4
eBook Packages: Springer Book Archive