Beyond Spatial Pyramid Matching: Spatial Soft Voting for Image Classification

Yamasaki, Toshihiko; Chen, Tsuhan

doi:10.1007/978-3-642-37484-5_41


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7729)

Abstract

Spatial partitioning approaches such as spatial pyramid matching (SPM) are now commonly used in image classification to collect both the global and the local features of an image. They divide the input image into small sub-regions (typically in a hierarchical manner) and generate a feature vector for each sub-region. Although modern image feature representation techniques assign the codes for the descriptors softly, each code must still fall into exactly one sub-region when the feature vector is formed. In other words, soft code assignment is used in the descriptor space, but the codes are still “hard” voted from the viewpoint of the image space. This paper proposes a spatial soft voting method, in which the existence of the codes is expressed by a Gaussian function and the resulting existence maps are sampled to form a feature vector. The generated feature vectors are “soft” in both the descriptor space and the image space. In addition, the extra computational cost compared to SPM is negligibly small. The concept of spatial soft voting is general and can be applied to most hard spatial partitioning approaches.
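
The contrast between hard and soft voting in the image space can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state where the existence maps are sampled, how the Gaussian width is chosen, or how the codes are pooled. The sketch assumes sum pooling, an isotropic Gaussian of width `sigma`, and one sample per grid-cell centre; the names `hard_vote`, `soft_vote`, `grid`, and `sigma` are hypothetical.

```python
import math

def hard_vote(points, codes, width, height, grid):
    """SPM-style pooling: each code is summed into exactly one grid cell."""
    k = len(codes[0])
    cells = [[0.0] * k for _ in range(grid * grid)]
    for (x, y), code in zip(points, codes):
        col = min(int(x / width * grid), grid - 1)
        row = min(int(y / height * grid), grid - 1)
        cell = cells[row * grid + col]
        for j, v in enumerate(code):
            cell[j] += v
    # Concatenate the per-cell histograms into one feature vector.
    return [v for cell in cells for v in cell]

def soft_vote(points, codes, width, height, grid, sigma):
    """Assumed soft variant: every code spreads a Gaussian existence map
    around its position, and the maps are sampled at the cell centres."""
    k = len(codes[0])
    feature = []
    for row in range(grid):
        for col in range(grid):
            cx = (col + 0.5) * width / grid   # sample point (assumption)
            cy = (row + 0.5) * height / grid
            cell = [0.0] * k
            for (x, y), code in zip(points, codes):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                w = math.exp(-d2 / (2.0 * sigma ** 2))
                for j, v in enumerate(code):
                    cell[j] += w * v
            feature.extend(cell)
    return feature

# One descriptor with code [1, 0], placed just left or just right of the
# vertical boundary of a 2x2 grid over a 100x100 image.
code = [[1.0, 0.0]]
hard_a = hard_vote([(49, 50)], code, 100, 100, 2)
hard_b = hard_vote([(51, 50)], code, 100, 100, 2)
soft_a = soft_vote([(49, 50)], code, 100, 100, 2, sigma=25.0)
soft_b = soft_vote([(51, 50)], code, 100, 100, 2, sigma=25.0)
```

In this toy case a two-pixel shift moves the whole vote from one cell to another under hard voting, whereas the soft-voted vectors change only slightly, which is the behaviour the abstract describes as being “soft” in the image space.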



Author information

Authors and Affiliations

The University of Tokyo, Japan
Toshihiko Yamasaki & Tsuhan Chen
Cornell University, USA
Toshihiko Yamasaki & Tsuhan Chen


Editor information

Editors and Affiliations

Computer Science and Engineering, Hanyang University, 222 Wangshimni-ro, Seongdong-gu, 133-791, Seoul, South Korea
Jong-Il Park
Department of Electrical Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, 305-701, Daejeon, South Korea
Junmo Kim


About this paper

Cite this paper

Yamasaki, T., Chen, T. (2013). Beyond Spatial Pyramid Matching: Spatial Soft Voting for Image Classification. In: Park, JI., Kim, J. (eds) Computer Vision - ACCV 2012 Workshops. ACCV 2012. Lecture Notes in Computer Science, vol 7729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37484-5_41


DOI: https://doi.org/10.1007/978-3-642-37484-5_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37483-8
Online ISBN: 978-3-642-37484-5
eBook Packages: Computer Science, Computer Science (R0)
