Fusion of gradient and feature similarity for Keyframe extraction

Abstract

Computer vision applications such as e-learning, video editing, video compression, video-on-demand and surveillance have become popular in recent years. Most of these applications require videos to be retrieved and processed regularly. The first step towards video retrieval and management is keyframe extraction, and accurate identification of shot transition boundaries is crucial for extracting keyframes. In this article, a framework for shot transition detection and keyframe extraction is proposed. The proposed method is efficient and simple, and it does not require supervision, which makes it attractive. It establishes shot transition boundaries by estimating the feature similarity (FSIM) between the gradient magnitudes of consecutive frames. The frame with the highest mean and standard deviation is then chosen as the keyframe of that shot. If one feature fails to establish a shot transition boundary, the other feature may still succeed in establishing it at the proper frame location in the video. The proposed algorithm is tested on four publicly available datasets: one developed by us, two well-known standard datasets for evaluating keyframe extraction algorithms, and one standard surveillance video dataset. Performance is evaluated in terms of figure of merit, detection percentage, accuracy and missing factor. The experimental results show that the proposed method outperforms other state-of-the-art methods.
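
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes grayscale frames held as 2-D NumPy arrays, uses only the gradient-magnitude similarity term of FSIM (the phase congruency term is omitted), and combines mean and standard deviation by simple addition, which the abstract does not specify. The function names, the constant c = 160 and the threshold of 0.8 are hypothetical choices made for illustration.

```python
import numpy as np

def gradient_magnitude(frame):
    """Gradient magnitude of a 2-D grayscale frame (finite differences)."""
    gy, gx = np.gradient(frame.astype(np.float64))
    return np.hypot(gx, gy)

def gm_similarity(f1, f2, c=160.0):
    """Mean gradient-magnitude similarity of two frames, in (0, 1].

    Only the gradient term of FSIM is used; phase congruency is omitted.
    c is a stabilising constant (assumed value).
    """
    g1, g2 = gradient_magnitude(f1), gradient_magnitude(f2)
    sim = (2.0 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)
    return float(sim.mean())

def detect_boundaries(frames, threshold=0.8):
    """Indices i at which a new shot starts, i.e. frames i-1 and i differ.

    threshold is a hypothetical cut-off, not a value from the article.
    """
    return [i for i in range(1, len(frames))
            if gm_similarity(frames[i - 1], frames[i]) < threshold]

def select_keyframes(frames, boundaries):
    """One keyframe index per shot: the frame maximising mean + std.

    Adding mean and standard deviation is an assumed way of combining them.
    """
    starts = [0] + list(boundaries)
    ends = list(boundaries) + [len(frames)]
    keyframes = []
    for s, e in zip(starts, ends):
        scores = [float(f.mean() + f.std()) for f in frames[s:e]]
        keyframes.append(s + int(np.argmax(scores)))
    return keyframes
```

A fixed threshold is the simplest possible decision rule. A full implementation would also need frame decoding and colour-to-grayscale conversion, which are left out here.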

References

  1. Ayadi T, Ellouze M, Hamdani TM, Alimi AM (2013) Movie scenes detection with MIGSOM based on shots semi-supervised clustering. Neural Comput Appl 22(7–8):1387–1396
  2. Birinci M, Kiranyaz S (2014) A perceptual scheme for fully automatic video shot boundary detection. Signal Process: Image Commun 29(3):410–423
  3. Bommisetty RM, Prakash O, Khare A (2019) Keyframe extraction using Pearson correlation coefficient and color moments. Multimedia Systems:1–33. https://doi.org/10.1007/s00530-019-00642-8
  4. Chen J, Ren J, Jiang J (2011) Modelling of content-aware indicators for effective determination of shot boundaries in compressed MPEG videos. Multimed Tools Appl 54(2):219–239
  5. Dutta D, Saha SK, Chanda B (2016) A shot detection technique using linear regression of shot transition pattern. Multimed Tools Appl 75(1):93–113
  6. Fei M, Jiang W, Mao W, Song Z (2016) New fusional framework combining sparse selection and clustering for key frame extraction. IET Comput Vis 10(4):280–288
  7. Ferreira L, da Silva Cruz LA, Assuncao P (2016) Towards key-frame extraction methods for 3D video: a review. EURASIP J Image Video Process 2016(1):28
  8. Gao G, Ma H (2014) To accelerate shot boundary detection by reducing detection region and scope. Multimed Tools Appl 71(3):1749–1770
  9. Hannane R, Elboushaki A, Afdel K, Naghabhushan P, Javed M (2016) An efficient method for video shot boundary detection and keyframe extraction using SIFT-point distribution histogram. Int J Multimed Inform Retriev 5(2):89–104
  10. https://computervisiononline.com/dataset/ (2020) Accessed 18 Aug 2020
  11. https://www.sites.google.com/site/vsummsite/download. Accessed 15 Jan 2021
  12. Hu W, Jin Y, Wen Y, Wang Z, Sun L (2017) Towards wi-fi ap-assisted content prefetching for on-demand tv series: A learning-based approach. IEEE Trans Circuits Syst Video Technol 28(7):1665–1676
  13. Huang CR, Lee HP, Chen CS (2014) Shot change detection via local keypoint matching. IEEE Trans Multimed 10(6):1097–1108
  14. Ioannidis A, Chasanis V, Likas A (2016) Weighted multi-view key-frame extraction. Pattern Recognition Lett 72:52–61
  15. Jadhav MP, Jadhav DS (2015) Video Summarization Using Higher Order Color Moments (VSUHCM). Procedia Comput Sci 45:275–281
  16. Kovesi P (1999) Image features from phase congruency. Videre: J Comp Vis Res 1(3):1–26
  17. Kumar K, Shrimankar DD, Singh N (2018) Eratosthenes sieve based key-frame extraction technique for event summarization in videos. Multimed Tools Appl 77(6):7383–7404
  18. Lee (2018) VirtualDub home page. http://www.virtualdub.org/index.html. Accessed 27 Sept 2018
  19. Lee H, Yu J, Im Y, Gil JM, Park D (2011) A unified scheme of shot boundary detection and anchor shot detection in news video story parsing. Multimed Tools Appl 51(3):1127–1145
  20. Li Z, Liu G (2008) A novel scene change detection algorithm based on the 3D wavelet transform. In: 15th IEEE International Conference on Image Processing (ICIP 2008). IEEE, pp 1536–1539
  21. Liu H, Hao H (2014) Key frame extraction based on improved hierarchical clustering algorithm. In: 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). IEEE, pp 793–797
  22. Liu H, Meng W, Liu Z (2012) Key frame extraction of online video based on optimized frame difference. In: 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). IEEE, pp 1238–1242
  23. Liu H, Pan L, Meng W (2012) Key frame extraction from online video based on improved frame difference optimization. In: 2012 IEEE 14th International Conference on Communication Technology (ICCT). IEEE, pp 940–944
  24. Liu XM, Hao AM, Zhao D (2013) Optimization-based key frame extraction for motion capture animation. Visual Comput 29(1):85–95
  25. Lu ZM, Shi Y (2013) Fast video shot boundary detection based on SVD and pattern matching. IEEE Trans Image Process 22(12):5136–5145
  26. Lu G, Zhou Y, Li X, Yan P (2017) Unsupervised, efficient and scalable key-frame selection for automatic summarization of surveillance videos. Multimed Tools Appl 76(5):6309–6331
  27. Mohanta PP, Saha SK, Chanda B (2012) A model-based shot boundary detection technique using frame transition parameters. IEEE Trans Multimed 14(1):223–233
  28. Mounika (2020) https://sites.google.com/site/mounikabrv3/research-profile. Accessed 19 Aug 2020
  29. Mundur P, Rao Y, Yesha Y (2006) Keyframe-based video summarization using Delaunay clustering. Int J Digital Lib 6(2):219–232
  30. Poornima K, Kanchana R (2012) A method to align images using image segmentation. Int J Soft Comput Eng 2(1):294–298
  31. Shaker IF, Abd-Elrahman A, Abdel-Gawad AK, Sherief MA (2011) Building extraction from high resolution space images in high density residential areas in the Great Cairo region. Remote Sens 3(4):781–791
  32. Sheena CV, Narayanan NK (2015) Key-frame extraction by analysis of histograms of video frames using statistical methods. Procedia Comput Sci 70:36–40
  33. Shi Y, Yang H, Gong M, Liu X, Xia Y (2017) A fast and robust key frame extraction method for video copyright protection. J Electric Comput Eng 2017(3):1–7
  34. Thakre KS, Rajurkar AM, Manthalkar RR (2016) Video Partitioning and Secured Keyframe Extraction of MPEG Video. Procedia Comput Sci 78:790–798
  35. Warhade KK, Merchant SN, Desai UB (2011) Shot boundary detection in the presence of fire flicker and explosion using stationary wavelet transform. Signal Image Video Process 5(4):507–515
  36. Yu L, Cao J, Chen M, Cui X (2018) Key frame extraction scheme based on sliding window and features. Peer-to-Peer Netw Appl 11(5):1141–1152
  37. Zhang L, Zhang L, Mou X, Zhang D (2011) FSIM: A feature similarity index for image quality assessment. IEEE Trans Image Process 20(8):2378–2386

Author information

Corresponding author

Correspondence to Ashish Khare.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 6 Total number of frames and number of keyframes in ground truth for Dataset-1
Table 7 Total number of frames and number of keyframes in ground truth for Dataset-2
Table 8 Total number of frames and number of keyframes in ground truth for Dataset-3
Table 9 Figure of merit (F-measure) obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-2
Table 10 Figure of merit (F-measure) obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-3
Table 11 Detection percentage obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-2
Table 12 Detection percentage obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-3
Table 13 Accuracy obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-2
Table 14 Accuracy obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-3
Table 15 Missing factor obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-2
Table 16 Missing factor obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-3
Table 17 Figure of merit (F-measure) obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-4
Table 18 Detection percentage obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-4
Table 19 Accuracy obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-4
Table 20 Missing factor obtained by the proposed method and the other methods [9, 15, 26, 32, 34, 36] for different videos of Dataset-4
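
As an illustration of the figure of merit named in the table captions, the following sketch computes a standard F-measure (the harmonic mean of precision and recall) for detected keyframe indices against a ground-truth list. The greedy one-to-one matching and the `tolerance` parameter are assumptions made for this example. The exact matching rule used to produce the tables is not stated here.

```python
def f_measure(detected, ground_truth, tolerance=0):
    """F-measure of detected keyframe indices against ground-truth indices.

    A detection is counted as correct when it lies within `tolerance`
    frames of a ground-truth index that has not been matched yet
    (greedy matching; an assumed rule).
    """
    unmatched = sorted(ground_truth)
    correct = 0
    for d in sorted(detected):
        match = next((g for g in unmatched if abs(g - d) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            correct += 1
    if correct == 0:
        return 0.0
    precision = correct / len(detected)
    recall = correct / len(ground_truth)
    return 2 * precision * recall / (precision + recall)
```

For example, with detections at frames 10, 52 and 90, ground truth at frames 10, 50, 120 and 200, and a tolerance of 2 frames, two detections are correct, so precision is 2/3, recall is 1/2 and the F-measure is 4/7.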

About this article

Cite this article

Mounika Bommisetty, R., Khare, A., Siddiqui, T.J. et al. Fusion of gradient and feature similarity for Keyframe extraction. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-020-10390-x

Keywords

  • Shot boundary detection
  • Keyframe extraction
  • Gradient magnitude and feature similarity