Efficient Storage Support for Real-Time Near-Duplicate Video Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8631)

Abstract

Real-time near-duplicate video retrieval is important for offering efficient storage services, and it becomes more challenging as the volume of multimedia video grows rapidly. Existing work fails to address this problem efficiently because it overlooks the storage properties of massive video collections. To bridge the gap between storage system organization and application-aware videos, we propose FastVR, a cost-effective real-time video retrieval scheme that supports fast near-duplicate video retrieval. FastVR is both space- and time-efficient in large-scale storage systems. The idea behind FastVR is to leverage a space-efficient indexing structure and a compact feature representation to facilitate keyframe-based matching. For the compact feature representation, FastVR transforms frames into feature vectors in the Hamming space. The indexing structure uses Locality Sensitive Hashing (LSH) to support fast similarity search by grouping similar videos together. Conventional LSH, however, is space-inefficient, which FastVR addresses with a cuckoo hashing scheme. FastVR uses a semi-random choice to improve on the random selection in cuckoo hashing. We implemented FastVR and evaluated its performance using a real-world dataset. The experimental results demonstrate its efficiency and significant performance improvements.
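
The abstract names three building blocks: binary codes in the Hamming space, LSH buckets, and cuckoo hashing as the bucket store. The sketch below illustrates how such pieces could fit together; it is not FastVR's implementation. All class and parameter names (`CuckooTable`, `HammingLSH`, `n_tables`, `max_kicks`) are hypothetical, the LSH variant is plain bit sampling over disjoint bit subsets, and the cuckoo table uses an ordinary random choice of starting table, since the abstract does not describe the semi-random choice.

```python
import random


class CuckooTable:
    """Two-table cuckoo hash map: every key has one candidate slot per table."""

    def __init__(self, capacity=8, max_kicks=32, seed=0):
        self.capacity = capacity
        self.max_kicks = max_kicks
        self.rng = random.Random(seed)
        self.slots = [[None] * capacity, [None] * capacity]

    def _index(self, which, key):
        # Salting the key with the table number gives two different hash functions.
        return hash((which, key)) % self.capacity

    def get(self, key):
        for which in (0, 1):
            entry = self.slots[which][self._index(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key, value):
        for which in (0, 1):  # overwrite if the key is already stored
            i = self._index(which, key)
            if self.slots[which][i] is not None and self.slots[which][i][0] == key:
                self.slots[which][i] = (key, value)
                return
        entry = (key, value)
        which = self.rng.randrange(2)  # plain random choice (assumption, not FastVR's rule)
        for _ in range(self.max_kicks):
            i = self._index(which, entry[0])
            entry, self.slots[which][i] = self.slots[which][i], entry
            if entry is None:
                return
            which = 1 - which  # the evicted entry moves to its slot in the other table
        self._grow(entry)

    def _grow(self, pending):
        # Too many evictions: double the capacity and reinsert everything.
        entries = [e for table in self.slots for e in table if e is not None]
        self.capacity *= 2
        self.slots = [[None] * self.capacity, [None] * self.capacity]
        for k, v in entries + [pending]:
            self.put(k, v)


def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


class HammingLSH:
    """Bit-sampling LSH: each table keys a code on its own subset of bit positions."""

    def __init__(self, n_bits=64, n_tables=4, seed=0):
        positions = list(range(n_bits))
        random.Random(seed).shuffle(positions)
        # Disjoint subsets: codes differing in fewer than n_tables bits
        # must agree on at least one subset, so they share a bucket.
        self.subsets = [positions[t::n_tables] for t in range(n_tables)]
        self.tables = [CuckooTable(seed=seed + t) for t in range(n_tables)]

    def _key(self, code, subset):
        return tuple((code >> p) & 1 for p in subset)

    def add(self, item_id, code):
        for subset, table in zip(self.subsets, self.tables):
            key = self._key(code, subset)
            bucket = table.get(key)
            if bucket is None:
                table.put(key, [(item_id, code)])
            else:
                bucket.append((item_id, code))

    def query(self, code, max_dist):
        hits = set()
        for subset, table in zip(self.subsets, self.tables):
            for item_id, cand in table.get(self._key(code, subset)) or []:
                if hamming(code, cand) <= max_dist:  # verify each candidate
                    hits.add(item_id)
        return sorted(hits)


index = HammingLSH(n_bits=64, n_tables=4)
original = random.Random(1).getrandbits(64)
index.add("clip-a", original)
index.add("clip-b", original ^ ((1 << 64) - 1))  # every bit flipped
print(index.query(original ^ 0b101, max_dist=3))  # -> ['clip-a']
```

Each cuckoo lookup probes exactly two slots, which is what makes the structure attractive as a compact bucket store. Using disjoint bit subsets is a design choice made here so the example behaves deterministically; randomly sampled, overlapping subsets would only find near-duplicates with some probability.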



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Nie, Z., Hua, Y., Feng, D., Li, Q., Sun, Y. (2014). Efficient Storage Support for Real-Time Near-Duplicate Video Retrieval. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_24

  • DOI: https://doi.org/10.1007/978-3-319-11194-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11193-3

  • Online ISBN: 978-3-319-11194-0

  • eBook Packages: Computer Science, Computer Science (R0)
