Abstract
Recently, spatial partitioning approaches such as spatial pyramid matching (SPM) are commonly used in image classification to collect the global and local features of the images. They divide the input image into small sub-regions (typically in a hierarchical manner) and generate a feature vector for each of them. Although the codes for the descriptors are assigned softly in modern image feature representation techniques, the codes must fall into only a single sub-region when forming the feature vector. In other words, the soft code assignment is used in the descriptor space but the codes are still “hard” voted from the view point of the image space. This paper proposes a spatial soft voting method, in which the existence of the codes are expressed by a Gaussian function and the maps of the existence are sampled to form a feature vector. The generated feature vectors are “soft” both in the descriptor space and the image space. In addition, extra computational cost as compared to SPM is negligibly small. The concept of the spatial soft voting is general and can be applied to most hard spatial partitioning approaches.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: ICCV, pp. 1470–1477 (2003), BoF
Csurka, G., Dance, C., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV (2004)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: improving particular object retrieval in large scale image databases. In: CVPR (2008)
van Gemert, J.C., Geusebroek, J.-M., Veenman, C.J., Smeulders, A.W.M.: Kernel Codebooks for Scene Categorization. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 696–709. Springer, Heidelberg (2008)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR (2009)
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality constrained linear coding for image classification. In: CVPR (2010)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
Farquhar, J., Szedmak, S., Meng, H., Shawe-Taylor, J.: Improving “bag-of-keypoints” image categorisation. Technical report, University of Southampton (2005)
Yu, K., Lin, Y., Lafferty, J.: Learning image representations from the pixel level via hierarchical sparse coding. In: CVPR (2011)
Jiang, Z., Lin, Z., Davis, L.S.: Learning a discriminative dictionary for sparse coding via label consistent k-svd. In: CVPR (2011)
Zhang, C., Liu, J., Tian, Q., Xu, C., Lu, H., Ma, S.: Image classification by non-negative sparse coding, low-rank and sparse decomposition. In: CVPR (2011)
Kulkarni, N., Li, B.: Discriminative affine sparse codes for image classification. In: CVPR (2011)
Marszalek, M., Schmid, C., Harzallah, H., van de Weijer, J.: Learning object representations for visual object class recognition. In: Visual Recog. Challange Workshop (2007)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher Kernel for Large-Scale Image Classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
Yang, J., Yu, K., Huang, T.: Efficient Highly Over-Complete Sparse Coding Using a Mixture Model. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 113–126. Springer, Heidelberg (2010)
Zhou, X., Yu, K., Zhang, T., Huang, T.S.: Image Classification Using Super-Vector Coding of Local Image Descriptors. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 141–154. Springer, Heidelberg (2010)
Grauman, K., Darrell, T.: The pyramid match kernel: discriminative classification with sets of image features. In: ICCV, vol. 2, pp. 1458–1465 (2005), Pyramid match kernel, base of SPM
Harada, T., Ushiku, Y., Yamashita, Y., Kuniyoshi, Y.: Discriminative spatial pyramid. In: CVPR (2011)
Feng, J., Ni, B., Tian, Q., Yan, S.: Geometric l p -norm feature pooling for image classification. In: CVPR (2011)
Marszaek, M., Schmid, C.: Spatial weighting for bag-of-features. In: CVPR (2006), Spatial relation but segmentation is required
Cao, Y., Wang, C., Li, Z., Zhang, L., Zhang, L.: Spatial bag-of-features. In: CVPR (2010)
Wang, X., Bai, X., Liu, W., Latecki, L.J.: Feature context for image classification and object detection. In: CVPR (2011)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE TPAMI 24, 509–522 (2002)
Huang, Y., Huang, K., Wang, C., Tan, T.: Exploring relations of visual codes for image classification. In: CVPR (2011)
Krapacy, J., Verbeek, J., Jurie, F.: Modeling spatial layout with fisher vectors for image categorization. In: ICCV (2011)
Boureau, Y., Roux, N.L., Bach, F., Ponce, J., LeCun, Y.: Ask the locals: Multi-way local pooling for image recognition. In: ICCV (2011)
Jia, Y., Huang, C.: Beyond spatial pyramids: Receptive field learning for pooled image features. In: CVPR (2012)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
Li, F.-F., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories. In: CVPR Workshop on Generative-Model Based Vision (2004)
Zhang, H., Berg, A., Maire, M., Malik, J.: Svm-knn: Discriminative nearest neighbor classification for visual category recognition. In: CVPR (2006)
Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
Karayev, S., Fritz, M., Fidler, S., Darrell, T.: A probabilistic model for recursive factorized image features. In: CVPR (2011)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2011 (VOC 2011) Results
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yamasaki, T., Chen, T. (2013). Beyond Spatial Pyramid Matching: Spatial Soft Voting for Image Classification. In: Park, JI., Kim, J. (eds) Computer Vision - ACCV 2012 Workshops. ACCV 2012. Lecture Notes in Computer Science, vol 7729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37484-5_41
Download citation
DOI: https://doi.org/10.1007/978-3-642-37484-5_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37483-8
Online ISBN: 978-3-642-37484-5
eBook Packages: Computer ScienceComputer Science (R0)