Abstract
A content-based video retrieval scheme via visual feature pooling is proposed in this paper. Since the visual words represent local features extracted from frame images, spatio-temporal constraints are applied to resolve the ambiguity of the model towards effective retrieval of semantic video clips. Both shot-level and segment-level processing are employed, and the latter is found more robust in dealing with complex scenes where accurate video segmentation may fail. Our experimental results show that the constrained scheme helps to improve average matching accuracy by 5 %. In addition, the results suggest that videos summarized at 25–30 % of their original size can still maintain a viewing quality of 70–80 %, enabling fast content delivery.
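The abstract describes pooling visual words (quantized local frame descriptors) over video segments, with a temporal constraint used to suppress ambiguous words. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' method: it builds a toy k-means codebook, computes per-frame word histograms, and keeps only words that recur within a small temporal window before pooling them into one segment descriptor. All function names, the neighbourhood rule, and the parameters (`k`, `window`) are assumptions for illustration.

```python
import numpy as np

def build_codebook(descriptors, k=8, iters=10, seed=0):
    """Toy k-means codebook over local descriptors: each centroid is a 'visual word'."""
    rng = np.random.default_rng(seed)
    centroids = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest centroid, then update centroids.
        dist = np.linalg.norm(descriptors[:, None] - centroids[None], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def pool_segment(frame_descriptors, codebook, window=2):
    """Pool per-frame visual-word histograms over a segment, keeping only words
    that also occur within a `window`-frame temporal neighbourhood -- a crude
    stand-in for a spatio-temporal consistency constraint."""
    k = len(codebook)
    per_frame = []
    for desc in frame_descriptors:
        dist = np.linalg.norm(desc[:, None] - codebook[None], axis=2)
        per_frame.append(np.bincount(dist.argmin(axis=1), minlength=k))
    per_frame = np.array(per_frame)
    pooled = np.zeros(k)
    for t, hist in enumerate(per_frame):
        lo, hi = max(0, t - window), min(len(per_frame), t + window + 1)
        # A word counts only if it is also seen in neighbouring frames.
        support = per_frame[lo:hi].sum(axis=0) - hist
        pooled += hist * (support > 0)
    total = pooled.sum()
    return pooled / total if total else pooled
```

In this sketch, a word appearing in a single isolated frame (likely noise or a transient detection) contributes nothing to the segment descriptor, which mirrors the paper's motivation of disambiguating local features via temporal context.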
Copyright information
© 2016 Springer Science+Business Media Singapore
Cite this paper
Ren, J., Ren, J. (2016). Feature Pooling Using Spatio-Temporal Constrain for Video Summarization and Retrieval. In: Park, J., Jin, H., Jeong, YS., Khan, M. (eds) Advanced Multimedia and Ubiquitous Engineering. Lecture Notes in Electrical Engineering, vol 393. Springer, Singapore. https://doi.org/10.1007/978-981-10-1536-6_50
Print ISBN: 978-981-10-1535-9
Online ISBN: 978-981-10-1536-6