Scalable video summarization via sparse dictionary learning and selection simultaneously

Etezadifar, Pouriya; Farsi, Hassan

doi:10.1007/s11042-016-3433-z

Published: 22 March 2016

Volume 76, pages 7947–7971 (2017)

Multimedia Tools and Applications

Abstract

Every day, a huge amount of video data is generated worldwide, and processing it demands substantial time, manpower, and hardware. Video summarization methods have therefore been proposed to help viewers grasp video content quickly. Recently, methods based on sparse formulations have been found to handle larger volumes of video than other approaches. In this paper, we propose a new method that casts video summarization as a simultaneous sparse dictionary learning and selection problem. The proposed method is shown to improve the summarization of large amounts of video data relative to other methods. Its performance is compared with state-of-the-art methods on standard data sets in which the key frames have been manually tagged, and the results show that it improves video summarization over those methods.
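
The abstract does not state the optimization problem, so the following is only a minimal sketch of the general idea behind sparse dictionary selection, not the authors' formulation. It assumes each frame is described by a feature column of a matrix X and that key frames are the frames whose rows of a self-representation matrix W stay non-zero under a row-sparsity penalty. The function names, the penalty weight lam, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def frame_scores(X, lam=0.1, n_iter=300):
    """Score frames by approximately solving (assumed objective)
        min_W 0.5 * ||X - X W||_F^2 + lam * sum_i ||W[i, :]||_2
    with proximal gradient descent. X has shape (d, n): one column per frame.
    """
    n = X.shape[1]
    G = X.T @ X                                    # Gram matrix of the frames
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)  # 1 / Lipschitz constant
    W = np.zeros((n, n))
    for _ in range(n_iter):
        V = W - step * (G @ W - G)                 # gradient step on the fit term
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        # Row-wise soft thresholding: proximal operator of the row-sparsity penalty.
        W = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12)) * V
    return np.linalg.norm(W, axis=1)               # larger norm = more representative

def top_k_frames(X, k, lam=0.1):
    """Return the indices of the k highest-scoring frames, in temporal order."""
    scores = frame_scores(X, lam)
    return np.sort(np.argsort(scores)[::-1][:k])
```

Calling top_k_frames(X, k) on a feature matrix returns k frame indices in temporal order. A larger lam drives more rows of W to zero, so fewer frames receive a non-zero score.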

Author information

Authors and Affiliations

Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran
Pouriya Etezadifar & Hassan Farsi

Corresponding author

Correspondence to Hassan Farsi.

About this article

Cite this article

Etezadifar, P., Farsi, H. Scalable video summarization via sparse dictionary learning and selection simultaneously. Multimed Tools Appl 76, 7947–7971 (2017). https://doi.org/10.1007/s11042-016-3433-z

Received: 07 October 2015
Revised: 20 February 2016
Accepted: 03 March 2016
Published: 22 March 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s11042-016-3433-z
