Compressing Visual Descriptors of Image Sequences
Recent years have seen significant progress in developing more compact visual descriptors, typically by aggregating local descriptors. However, all of these methods produce descriptors for still images, and are typically applied independently to (key) frames when used in tasks such as instance search in video. They therefore do not exploit the temporal redundancy of video, which increases both descriptor size and matching complexity. We propose a compressed descriptor for image sequences that encodes a segment of video using a single descriptor. The proposed approach is a framework that can be used with different local descriptors, including compact descriptors. We describe the extraction and matching processes for the descriptor and provide evaluation results on a large video data set.
The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007–2013) under grant agreement no. 610370, ICoSOLE, and from the Austrian Research Promotion Agency under the KIRAS grant E.V.A.