Geometrically robust video hashing based on ST-PCT for video copy detection

  • Wu Tang
  • Yan Wo
  • Guoqiang Han


Copied videos flooding the network infringe video copyright and increase the storage pressure on video service servers. This creates strong demand for video copy detection techniques that can accurately and quickly detect copies of a video in a very large video database. This paper aims to generate a video hash that is both highly discriminative and robust against geometric and spatial-temporal transformations. First, we propose the Spatial-Temporal Polar Cosine Transform (ST-PCT), which treats a video as a three-dimensional matrix and applies a two-dimensional Polar Cosine Transform (PCT) after applying a one-dimensional Discrete Cosine Transform (DCT) to the video. This transformation extracts features from the spatial-temporal domain and is geometrically invariant. Then, based on ST-PCT, we propose a geometrically robust video hashing method for video copy detection. The video features generated by ST-PCT are compressed and quantized into a compact binary hash code. Experimental results show that, compared with state-of-the-art methods, the proposed method is more robust, more accurate, and faster to compute.
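The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: the DCT is taken along the frame (temporal) axis, the PCT uses the commonly cited kernel cos(πnr²)·e^{ilθ} on the unit disk, "compression" is reduced to keeping a few low-order coefficients, and quantization is a median threshold. The function names `pct_magnitudes`, `st_pct_hash` and `hamming_distance`, and the parameters `n_temporal` and `max_order`, are hypothetical.

```python
import numpy as np
from scipy.fft import dct


def pct_magnitudes(frame, max_order=3):
    """Magnitudes |M_nl| of a discretised 2D Polar Cosine Transform.

    Assumed kernel: cos(pi * n * r^2) * exp(i * l * theta) on the unit disk.
    """
    h, w = frame.shape
    # Map pixel centres onto [-1, 1] x [-1, 1]; only the inscribed disk is used.
    y, x = np.meshgrid(
        (2 * np.arange(h) + 1) / h - 1,
        (2 * np.arange(w) + 1) / w - 1,
        indexing="ij",
    )
    r2 = x**2 + y**2
    theta = np.arctan2(y, x)
    inside = r2 <= 1.0
    pixel_area = (2.0 / h) * (2.0 / w)  # area of one pixel in disk coordinates
    mags = []
    for n in range(max_order + 1):
        omega = 1.0 / np.pi if n == 0 else 2.0 / np.pi  # normalisation factor
        radial = np.cos(np.pi * n * r2)
        for l in range(max_order + 1):
            kernel_conj = radial * np.exp(-1j * l * theta)
            m = omega * np.sum(frame[inside] * kernel_conj[inside]) * pixel_area
            mags.append(abs(m))  # the magnitude discards the rotation phase
    return np.array(mags)


def st_pct_hash(video, n_temporal=4, max_order=3):
    """video: grayscale array of shape (T, H, W). Returns a 1D array of bits."""
    video = np.asarray(video, dtype=np.float64)
    # Step 1 (assumed axis): 1D DCT along the temporal axis.
    temporal = dct(video, type=2, axis=0, norm="ortho")
    # Step 2: 2D PCT magnitudes of the lowest temporal-frequency slices.
    feats = np.concatenate(
        [pct_magnitudes(temporal[k], max_order) for k in range(n_temporal)]
    )
    # Step 3 (assumed quantiser): one bit per feature via a median threshold.
    return (feats > np.median(feats)).astype(np.uint8)


def hamming_distance(a, b):
    """Number of differing bits between two equal-length hashes."""
    return int(np.count_nonzero(a != b))
```

In a copy-detection setting, two videos would be compared by the Hamming distance between their hashes, with a threshold chosen empirically. Because only the magnitudes of the PCT coefficients are kept, rotating a frame about its centre changes the coefficient phases but, up to discretisation error, not the features.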


Video hashing; Video copy detection; Spatial-temporal polar cosine transform; Geometric invariant



The authors would like to thank Qiangjiang Wang, College of Computer Science and Engineering, South China University of Technology, for collecting data and for participating in the programming and the writing of the manuscript. We would also like to thank the anonymous reviewers for their insightful suggestions. This work is supported by National Natural Science Foundation of Guangdong [Grant No. 2016A030313472, 2017A030312008, 2018A030313994]; National Natural Science Foundation of China [Grant No. 61472145]; Science and Technology Planning Project of Guangdong Province, China [Grant No. 2016B090918042, 2016B010127003]; Young Creative Talents Project of Guangdong Provincial Education Department [Grant No. 2016KQNCX092].



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. College of Computer Science and Engineering, South China University of Technology, Guangzhou, China
