Multimedia Tools and Applications, Volume 77, Issue 20, pp 26635–26655

Deep Event Learning boosT-up Approach: DELTA

  • Krishan Kumar
  • Deepti D. Shrimankar


Video surveillance systems are now ubiquitous and essential for monitoring ATMs, airports, railway stations, and other crowded places. In multi-view video systems, multiple cameras produce a huge amount of video content around the clock, which makes fast browsing, retrieval, and analysis difficult. Accessing and managing such data in real time is challenging because of inter-view dependencies, illumination changes, and the presence of many inactive frames. This work presents an accurate and efficient technique that detects and summarizes events in multi-view surveillance videos using boosting, a machine learning algorithm, to address these issues. Inter-view dependencies across the multiple views of a video are captured by the weak classifiers of the boosting algorithm. Illumination changes and still frames are handled by a deep learning framework that relies on the moving objects in a frame. This helps the model decide correctly whether a frame is active or inactive, without any prior information about the number of issues in a video. Objective as well as subjective ratings clearly indicate the effectiveness of the proposed DELTA model, which successfully reduces the video data while retaining the important information as events.
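As a rough illustration of how boosting can combine weak classifiers into an active/inactive frame decision, the sketch below implements a textbook discrete AdaBoost with decision stumps. It is not the DELTA implementation: the per-view motion scores, the toy data, and all function names are hypothetical, and the deep learning stage that would produce such features is omitted.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best one-feature rule."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign

def adaboost_fit(X, y, rounds=3):
    """Discrete AdaBoost: reweight frames so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero for a perfect stump
        if err >= 0.5:              # no better than chance: stop adding learners
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical toy data: one row per time step, one motion score per camera view.
# Label +1 = active frame (motion in at least one view), -1 = inactive frame.
X = [[0.9, 0.1], [0.1, 0.8], [0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.0, 0.0]]
y = [1, 1, -1, -1, 1, -1]

model = adaboost_fit(X, y, rounds=3)
print([adaboost_predict(model, row) for row in X])
```

On this toy set no single stump separates the two classes, because motion may appear in either view; the weighted vote of three stumps does. Frames predicted as inactive would be the candidates for removal in a summary.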


AdaBoosting; Deep learning; Event summarization; Key-frame; Multi-view video



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology Nagpur, Nagpur, India
  2. National Institute of Technology Uttarakhand, Srinagar, India
