Contextual-Guided Bag-of-Visual-Words Model for Multi-class Object Categorization

Mirza-Mohammadi, Mehdi; Escalera, Sergio; Radeva, Petia

doi:10.1007/978-3-642-03767-2_91

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5702)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns

Abstract

The bag-of-words (BOW) model is inspired by the text classification problem, where a document is represented by an unordered set of the words it contains. Analogously, in the object categorization problem, an image is represented by an unordered set of discrete visual words (bag-of-visual-words, BOVW). In these models, relations among visual words are considered after dictionary construction. However, spatially close object regions can have distant descriptions in the feature space and thus be grouped as different visual words. In this paper, we present a method that takes the geometrical information of visual words into account in the dictionary construction step. Object interest regions are obtained by means of the Harris-Affine detector and then described using the SIFT descriptor. Afterward, a contextual-space and a feature-space are defined, and a merging process is used to fuse feature words based on their proximity in the contextual-space. Moreover, we use the Error Correcting Output Codes framework to learn the new dictionary in order to perform multi-class classification. Results show significant classification improvements when spatial information is taken into account in the dictionary construction step.
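
The abstract does not spell out the merging rule, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes each descriptor already carries a feature-space word label (for example from k-means over SIFT descriptors) and a contextual position (for example coordinates normalized to the object region). Feature words whose mean contextual positions lie within a radius are fused, and an image histogram is then built over the fused dictionary. The function names, the `radius` parameter, and the centroid-distance criterion are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def merge_words_by_context(word_labels, context_points, radius):
    """Map each feature word to the representative of its fused group.

    Assumed criterion (hypothetical): two feature words are fused when the
    centroids of their contextual positions are at most `radius` apart.
    Fusing is transitive, implemented with a small union-find.
    """
    # Collect the contextual positions observed for each feature word.
    groups = {}
    for word, point in zip(word_labels, context_points):
        groups.setdefault(word, []).append(point)
    centers = {word: centroid(pts) for word, pts in groups.items()}

    parent = {word: word for word in centers}

    def find(word):
        # Follow parent links to the group representative (with path halving).
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    words = sorted(centers)
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if dist(centers[a], centers[b]) <= radius:
                ra, rb = find(a), find(b)
                if ra != rb:
                    # Keep the smaller label as the representative.
                    parent[max(ra, rb)] = min(ra, rb)
    return {word: find(word) for word in words}


def bovw_histogram(word_labels, mapping):
    """Count descriptors per fused visual word for one image."""
    counts = {}
    for word in word_labels:
        fused = mapping[word]
        counts[fused] = counts.get(fused, 0) + 1
    return counts


# Toy data: words 0 and 1 occur in nearby object regions, word 2 far away.
labels = [0, 0, 1, 1, 2, 2]
positions = [(0.1, 0.1), (0.2, 0.1), (0.15, 0.2), (0.25, 0.2),
             (0.9, 0.9), (0.8, 0.9)]
mapping = merge_words_by_context(labels, positions, radius=0.2)
histogram = bovw_histogram(labels, mapping)
```

In this toy example words 0 and 1 are fused because their contextual centroids are close, while word 2 stays separate. Note that the sketch compares the original per-word centroids and does not recompute them after each fusion, which is a simplification. The fused histograms would then serve as input to the multi-class classifier.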

Author information

Authors and Affiliations

Dept. Matemàtica Aplicada i Anàlisi, Gran Via 585, 08007, Barcelona, Spain
Mehdi Mirza-Mohammadi, Sergio Escalera & Petia Radeva
Computer Vision Center, Campus UAB, Edifici O, 08193, Bellaterra, Barcelona
Sergio Escalera & Petia Radeva

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Münster, Einsteinstrasse 62, 48149, Münster, Germany
Xiaoyi Jiang
Institute of Mathematics and Computing Science, University of Groningen, Nijenborgh 9, 9747 AG, Groningen, The Netherlands
Nicolai Petkov

About this paper

Cite this paper

Mirza-Mohammadi, M., Escalera, S., Radeva, P. (2009). Contextual-Guided Bag-of-Visual-Words Model for Multi-class Object Categorization. In: Jiang, X., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2009. Lecture Notes in Computer Science, vol 5702. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03767-2_91

DOI: https://doi.org/10.1007/978-3-642-03767-2_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03766-5
Online ISBN: 978-3-642-03767-2
eBook Packages: Computer Science, Computer Science (R0)
