Open Issues on Codebook Generation in Image Classification Tasks

Piras, Luca; Giacinto, Giorgio

doi:10.1007/978-3-319-08979-9_25

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8556)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

In recent years the bag-of-features approach, also known as the codebook approach, has become very popular among researchers in image classification because of the high performance it has exhibited. A wide variety of image classification, scene recognition, and, more generally, computer vision problems have been addressed with this paradigm in the recent literature. Although some papers have questioned its real effectiveness, most works in the literature follow the same procedure for codebook creation, making it a de facto standard, without critically investigating whether that procedure suits the problem at hand. The most widespread pipeline consists of four steps: image patch detection by dense sampling; description of the patches with SIFT; clustering of the patch descriptors with the k-means algorithm in order to select a small number of representative descriptors; and classification with an SVM, where images are described by a codebook whose vocabulary is made up of the selected representative descriptors. In this paper, we critically review the third step of this process to assess whether clustering is really useful for producing effective codebooks for image classification tasks. The reported results clearly show that a codebook built by purely random selection of patch descriptors from the set of descriptors extracted from the images in a dataset improves classification performance with respect to codebooks created by clustering.
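The contrast between the two codebook strategies discussed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptors are synthetic random vectors standing in for SIFT, the function names (`random_codebook`, `kmeans_codebook`, `bof_histogram`), the codebook size of 16, and the plain Lloyd-style k-means are all assumptions made for illustration. The SVM step is omitted; the resulting histograms are the per-image features such a classifier would receive.

```python
import numpy as np


def random_codebook(descriptors, k, rng):
    """Pick k descriptors uniformly at random, without replacement."""
    idx = rng.choice(len(descriptors), size=k, replace=False)
    return descriptors[idx]


def nearest_codeword(descriptors, codebook):
    """Index of the closest codeword (squared Euclidean) for each descriptor."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_codebook(descriptors, k, rng, n_iter=10):
    """Plain Lloyd iterations, initialised from a random codebook."""
    centers = random_codebook(descriptors, k, rng).astype(float)
    for _ in range(n_iter):
        labels = nearest_codeword(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers


def bof_histogram(descriptors, codebook):
    """L1-normalised histogram of codeword assignments for one image."""
    labels = nearest_codeword(descriptors, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


rng = np.random.default_rng(0)
# Synthetic stand-ins for SIFT descriptors (assumption):
# 5 images, 200 patches each, 128 dimensions per descriptor.
images = [rng.random((200, 128)) for _ in range(5)]
pool = np.vstack(images)

# Codebook by purely random selection vs. codebook by clustering.
cb_random = random_codebook(pool, k=16, rng=rng)
cb_kmeans = kmeans_codebook(pool, k=16, rng=rng)

# One bag-of-features vector per image under each codebook.
features_random = np.array([bof_histogram(img, cb_random) for img in images])
features_kmeans = np.array([bof_histogram(img, cb_kmeans) for img in images])
print(features_random.shape, features_kmeans.shape)
```

Under either strategy each image ends up as a fixed-length histogram over the codebook; the only difference is whether the codewords are actual descriptors drawn at random or cluster centers. The sketch says nothing about which choice classifies better; that comparison is what the paper reports.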

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, University of Cagliari, Piazza D’armi, 09123, Cagliari, Italy
Luca Piras & Giorgio Giacinto

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, IBaI, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Piras, L., Giacinto, G. (2014). Open Issues on Codebook Generation in Image Classification Tasks. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_25

DOI: https://doi.org/10.1007/978-3-319-08979-9_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08978-2
Online ISBN: 978-3-319-08979-9
eBook Packages: Computer Science, Computer Science (R0)
