Video Summarization with Visual and Semantic Features

Dong, Pei; Wang, Zhiyong; Zhuo, Li; Feng, Dagan

doi:10.1007/978-3-642-15702-8_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6297)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

Video summarization aims to provide a condensed yet informative version of the original footage so as to facilitate content comprehension, browsing, and delivery; multi-modal features play an important role in differentiating the individual segments of a video. In this paper, we present a method that combines visual and semantic features. Rather than using domain-specific or heuristic textual features as semantic features, we assign semantic concepts to video segments through automatic video annotation. The semantic coherence between the accompanying text and the high-level concepts of each video segment is then exploited to characterize the segment's importance. Visual features (e.g., motion and face), which have been widely used in summarization based on user attention models, are integrated with the proposed semantic coherence to obtain the final summary. Experiments on a half-hour sample video from the TRECVID 2006 dataset demonstrate that semantic coherence is very helpful for video summarization when fused with different visual features.
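The fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the features are combined, so everything below is an assumption made for illustration only: the per-segment scores are taken as given, each feature is min-max normalized, the fusion is a weighted average, and segments are picked greedily under a duration budget. The function names (`min_max_normalize`, `fuse_scores`, `select_segments`) and the weights are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse_scores(
    features: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of normalized per-segment feature scores.

    Assumption: linear fusion. The paper's actual fusion scheme is not
    stated in the abstract.
    """
    names = list(features)
    n_segments = len(features[names[0]])
    normalized = {name: min_max_normalize(features[name]) for name in names}
    total_weight = sum(weights[name] for name in names)
    return [
        sum(weights[name] * normalized[name][i] for name in names) / total_weight
        for i in range(n_segments)
    ]


def select_segments(
    scores: Sequence[float],
    durations: Sequence[float],
    budget: float,
) -> List[int]:
    """Greedily keep the highest-scoring segments that fit the budget.

    Returns the chosen segment indices in temporal order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    used = 0.0
    for i in order:
        if used + durations[i] <= budget:
            chosen.append(i)
            used += durations[i]
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical scores for four segments (not data from the paper).
    features = {
        "motion": [0.2, 0.8, 0.5, 0.1],
        "face": [0.0, 1.0, 0.0, 1.0],
        "semantic_coherence": [0.9, 0.1, 0.7, 0.3],
    }
    weights = {"motion": 1.0, "face": 1.0, "semantic_coherence": 1.0}
    fused = fuse_scores(features, weights)
    print(select_segments(fused, durations=[10, 10, 10, 10], budget=20))
```

Changing the weights is a simple way to explore how much the semantic score shifts the selection relative to the visual scores alone, which is the kind of comparison the abstract reports.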

Author information

Authors and Affiliations

School of Information Technologies, University of Sydney, Australia
Pei Dong, Zhiyong Wang & Dagan Feng
Signal and Information Processing Laboratory, Beijing University of Technology, Beijing, China
Pei Dong & Li Zhuo
Dept. of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong
Dagan Feng

Editor information

Editors and Affiliations

School of Computer Science, University of Nottingham, Jubilee Campus, NG8 1BB, Nottingham, UK
Guoping Qiu
The Centre for Multimedia Signal Processing, The Hong Kong Polytechnic University, Hong Kong, China
Kin Man Lam
Faculty of System Design, Tokyo Metropolitan University, 6-6, Asahigaoka, 191-0065, Hino-city, Tokyo
Hitoshi Kiya
Shanghai Key Laboratory of Intelligent Information Processing, Department of Computer Science & Engineering, Fudan University, Shanghai, China
Xiang-Yang Xue
Department of Electrical Engineering, University of Southern California, 90089-2564, Los Angeles, CA
C.-C. Jay Kuo
LIACS Media Lab, Leiden University,
Michael S. Lew

About this paper

Cite this paper

Dong, P., Wang, Z., Zhuo, L., Feng, D. (2010). Video Summarization with Visual and Semantic Features. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_19

DOI: https://doi.org/10.1007/978-3-642-15702-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15701-1
Online ISBN: 978-3-642-15702-8
eBook Packages: Computer Science (R0)
