3D Research, 9:14 (2018)

Stereoscopic Video Clip Matching Algorithm Based on Incidence Matrix of Similar Key Frames

  • Feng-feng Duan
  • Si-yao Duan
3DR Express


Clip matching is key to content-based video retrieval. To improve both accuracy and efficiency, a stereoscopic video clip matching algorithm based on an incidence matrix of similar key frames is proposed, which integrates depth information into feature extraction. The algorithm first divides stereoscopic videos into sub-sequences through shot and sub-shot segmentation, then extracts key frames and their features. From these key-frame features, three measures are derived: visual similarity based on the incidence matrix of similar key frames, order similarity based on the maximum common sub-sequence under linear fitting, and semantic similarity based on the cross similarity of similar key frames. A linear expression over these measures is then built to realize the similarity calculation and effective matching of stereoscopic video clips. In the clip matching experiments, compared with the existing typical stereoscopic video retrieval algorithm SVR and the video clip matching algorithm VTD-LFF respectively, the mean average recall (MAR) is increased by 3.69% and 0.29%, the mean average precision (MAP) is increased by 10.17% and 8.10%, and the mean average matching time (MAMT) is decreased by 44.35% and 38.60%. The experimental results show that the proposed algorithm matches stereoscopic video clips more accurately and efficiently.
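The pipeline described in the abstract can be sketched in outline: build a binary incidence matrix marking which key-frame pairs are similar, derive a visual similarity from the matrix, derive an order similarity from the longest order-preserving chain of matches, and combine them linearly. The following Python sketch illustrates this structure only; the cosine measure, the 0.8 similarity threshold, the normalisations, and the combination weights are all illustrative assumptions, not the formulas or values used in the paper (which also includes a semantic similarity term omitted here).

```python
import numpy as np

def incidence_matrix(query_feats, target_feats, threshold=0.8):
    """Binary incidence matrix M with M[i, j] = 1 when query key frame i
    and target key frame j are deemed similar. Cosine similarity with a
    fixed threshold is an illustrative choice, not the paper's measure."""
    q = np.asarray(query_feats, dtype=float)
    t = np.asarray(target_feats, dtype=float)
    # Normalise rows, then compare every key-frame pair at once.
    qn = q / np.linalg.norm(q, axis=1, keepdims=True)
    tn = t / np.linalg.norm(t, axis=1, keepdims=True)
    return (qn @ tn.T >= threshold).astype(int)

def visual_similarity(M):
    """Fraction of query key frames with at least one similar counterpart
    in the target clip (one plausible matrix-based visual similarity)."""
    return float(np.mean(M.max(axis=1)))

def order_similarity(M):
    """Length of the longest order-preserving chain of similar key-frame
    pairs (an LCS-style dynamic program), normalised by the query length.
    This stands in for the paper's maximum common sub-sequence with
    linear fitting."""
    m, n = M.shape
    dp = np.zeros((m + 1, n + 1), dtype=int)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if M[i - 1, j - 1]:
                dp[i, j] = dp[i - 1, j - 1] + 1
            else:
                dp[i, j] = max(dp[i - 1, j], dp[i, j - 1])
    return dp[m, n] / m

def clip_similarity(query_feats, target_feats, w_visual=0.6, w_order=0.4):
    """Linear combination of the component similarities; the weights are
    placeholders for the paper's learned/tuned coefficients."""
    M = incidence_matrix(query_feats, target_feats)
    return w_visual * visual_similarity(M) + w_order * order_similarity(M)
```

In a retrieval setting, `clip_similarity` would be evaluated between the query clip and each candidate clip's key-frame features, and candidates ranked by the combined score.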


Keywords: Clip matching · Stereoscopic video · Similar key frames · Incidence matrix · Content-based video retrieval



This study was funded by the Base Project of Hunan Province Social Science (No. 14JD38).



Copyright information

© 3D Research Center, Kwangwoon University and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. School of Journalism and Communication, Hunan Normal University, Changsha, China
  2. Hunan Social Public Opinion Monitoring and Network Public Opinion Research Center, Changsha, China
