Abstract
Real-time near-duplicate video retrieval is important for efficient storage services, and it becomes more challenging as the volume of multimedia video grows rapidly. Existing work fails to address this problem efficiently because it overlooks the storage properties of massive video collections. To bridge the gap between storage system organization and application-aware video data, we propose a cost-effective real-time video retrieval scheme, called FastVR, which supports fast near-duplicate video retrieval. FastVR is both space- and time-efficient in large-scale storage systems. The idea behind FastVR is to combine a space-efficient indexing structure with a compact feature representation to facilitate keyframe-based matching. For the compact feature representation, FastVR transforms frames into feature vectors in Hamming space. The indexing structure uses Locality Sensitive Hashing (LSH) to support fast similarity search by grouping similar videos together. Conventional LSH is space-inefficient, a problem that FastVR addresses with a cuckoo hashing scheme. FastVR uses a semi-random choice to improve the performance of the random selection step in cuckoo hashing. We implemented FastVR and evaluated its performance on a real-world dataset. The experimental results demonstrate its efficiency and significant performance improvements.
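The following Python sketch illustrates, under stated assumptions, how the three ingredients named in the abstract could fit together: binary codes compared by Hamming distance, an LSH function that groups similar codes under one bucket key, and a cuckoo hash table that stores the buckets compactly. It is not the FastVR implementation. Bit sampling is assumed as the LSH family because it is the standard choice for Hamming space, the eviction step uses a plain random choice (the semi-random choice of FastVR is not specified in the abstract and is not reproduced here), and all names (`BitSamplingLSH`, `CuckooTable`, `hamming`, the clip identifiers) are hypothetical.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


class BitSamplingLSH:
    """Bit-sampling LSH (assumed family): the bucket key is a fixed subset of bits."""

    def __init__(self, n_bits: int, n_sampled: int, seed: int = 0):
        rng = random.Random(seed)
        self.positions = sorted(rng.sample(range(n_bits), n_sampled))

    def key(self, code: int) -> tuple:
        return tuple((code >> p) & 1 for p in self.positions)


class CuckooTable:
    """Two-choice cuckoo table mapping a bucket key to a list of video ids."""

    def __init__(self, n_slots: int, max_kicks: int = 32, seed: int = 0):
        self.slots = [None] * n_slots
        self.max_kicks = max_kicks
        self.rng = random.Random(seed)

    def _candidates(self, key: tuple) -> tuple:
        # Two hash functions derived by salting the key with 1 and 2.
        n = len(self.slots)
        return hash((1, key)) % n, hash((2, key)) % n

    def lookup(self, key: tuple):
        # A lookup probes at most two slots, whatever the load.
        for s in self._candidates(key):
            entry = self.slots[s]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def add(self, key: tuple, video_id: str) -> bool:
        bucket = self.lookup(key)
        if bucket is not None:
            bucket.append(video_id)
            return True
        entry = (key, [video_id])
        for _ in range(self.max_kicks):
            i, j = self._candidates(entry[0])
            for s in (i, j):
                if self.slots[s] is None:
                    self.slots[s] = entry
                    return True
            # Both slots are taken: evict one occupant at random and re-place it.
            victim = self.rng.choice((i, j))
            self.slots[victim], entry = entry, self.slots[victim]
        # Give up; the entry still held here is dropped. A real system would rehash.
        return False


def demo() -> None:
    lsh = BitSamplingLSH(n_bits=16, n_sampled=4, seed=1)
    table = CuckooTable(n_slots=64)
    original = 0b1010110011110000
    # Flip one bit that is not sampled, so the near-duplicate shares the bucket key.
    free_bit = next(p for p in range(16) if p not in lsh.positions)
    near_dup = original ^ (1 << free_bit)
    codes = {"clip_a": original, "clip_a_copy": near_dup}
    for vid, code in codes.items():
        table.add(lsh.key(code), vid)
    candidates = table.lookup(lsh.key(original)) or []
    # Verify candidates by exact Hamming distance before reporting a match.
    print([v for v in candidates if hamming(codes[v], original) <= 2])


if __name__ == "__main__":
    demo()
```

In this sketch the cuckoo table bounds each lookup to two slot probes, while the LSH key keeps near-duplicate codes in the same bucket; candidates are then verified by exact Hamming distance.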
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Nie, Z., Hua, Y., Feng, D., Li, Q., Sun, Y. (2014). Efficient Storage Support for Real-Time Near-Duplicate Video Retrieval. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_24
DOI: https://doi.org/10.1007/978-3-319-11194-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11193-3
Online ISBN: 978-3-319-11194-0
eBook Packages: Computer Science (R0)