Skip to main content

Open Issues on Codebook Generation in Image Classification Tasks

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2014)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8556))

  • 2365 Accesses

Abstract

In the last years the use of the so-called bag-of-features approach, often referred to also as the codebook approach, has extensively gained large popularity among researchers in the image classification field, as it exhibited high levels of performance. A large variety of image classification, scene recognition, and more in general computer vision problems have been addressed according to this paradigm in the recent literature. Despite the fact that some papers questioned the real effectiveness of the paradigm, most of the works in the literature follows the same approach for codebook creation, making it a standard “de facto”, without any critical investigation on the suitability of the employed procedure to the problem at hand. The most widespread structure for codebook creation is made up of four steps: dense sampling image patch detection; use of SIFT as patch descriptors; use of the k-means algorithms for clustering patch descriptors in order to select a small number of representative descriptors; use of the SVM classifier, where images are described by a codebook whose vocabulary is made up of the selected representative descriptors. In this paper, we will focus on a critical review of the third step of this process, to see if the clustering step is really useful to produce effective codebooks for image classification tasks. Reported results clearly show that a codebook created according to a purely random extraction of the patch descriptors from the set of descriptors extracted from the images in a dataset, is able to improve classification performances with respect to the performances attained with codebooks created by the clustering process.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ballan, L., Bertini, M., Del Bimbo, A., Serain, A.M., Serra, G., Zaccone, B.F.: Combining generative and discriminative models for classifying social images from 101 object categories. In: Proc. of International Conference on Pattern Recognition (ICPR), Tsukuba, Japan (November 2012) (Poster)

    Google Scholar 

  2. Bay, H., Ess, A., Tuytelaars, T., Gool, L.J.V.: Speeded-up robust features (surf). Computer Vision and Image Understanding 110(3), 346–359 (2008)

    Article  Google Scholar 

  3. Becker, J.H., Tuytelaars, T., Gool, L.J.V.: Codebook-free exemplar models for object detection. In: WIAMIS, pp. 1–4. IEEE (2012)

    Google Scholar 

  4. Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer (October 2006), http://www.worldcat.org/isbn/0387310738

  5. Boiman, O., Shechtman, E., Irani, M.: In defense of nearest-neighbor based image classification. In: CVPR. IEEE Computer Society (2008)

    Google Scholar 

  6. Chang, C.C., Lin, C.J.: Libsvm: A library for support vector machines. ACM TIST 2(3), 27 (2011)

    Google Scholar 

  7. Chang, S.F., Sikora, T., Puri, A.: Overview of the mpeg-7 standard. IEEE Trans. Circuits Syst. Video Techn., 688–695 (2001)

    Google Scholar 

  8. Chatzichristofis, S.A., Boutalis, Y.S.: Fcth: Fuzzy color and texture histogram - a low level feature for accurate image retrieval. In: Proceedings of the 2008 Ninth International Workshop on Image Analysis for Multimedia Interactive Services, pp. 191–196. IEEE Computer Society (2008)

    Google Scholar 

  9. Chavez, A., Gustafson, D.: Building an effective visual codebook: Is k-means clustering useful? In: Bebis, G., et al. (eds.) ISVC 2012, Part II. LNCS, vol. 7432, pp. 517–525. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  10. Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)

    Article  Google Scholar 

  11. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press (2000)

    Google Scholar 

  12. Crowley, J.L., Sanderson, A.C.: Multiple resolution representation and probabilistic matching of 2-d gray-scale shape. IEEE Trans. Pattern Anal. Mach. Intell. 9(1), 113–121 (1987)

    Article  Google Scholar 

  13. Csurka, G., Dance, C.R., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV, pp. 1–22 (2004)

    Google Scholar 

  14. Deselaers, T., Keysers, D., Ney, H.: Features for image retrieval: an experimental comparison. Inf. Retr. 11(2), 77–107 (2008)

    Article  Google Scholar 

  15. Estabrooks, A., Jo, T., Japkowicz, N.: A multiple resampling method for learning from imbalanced data sets. Computational Intelligence 20(1), 18–36 (2004)

    Article  MathSciNet  Google Scholar 

  16. Grana, C., Serra, G., Manfredi, M., Cucchiara, R.: Image classification with multivariate gaussian descriptors. In: Petrosino (ed.) [36], pp. 111–120

    Google Scholar 

  17. Joachims, T.: Text categorization with suport vector machines: Learning with many relevant features. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 137–142. Springer, Heidelberg (1998)

    Google Scholar 

  18. Jurie, F., Triggs, B.: Creating efficient codebooks for visual recognition. In: ICCV, pp. 604–610. IEEE Computer Society (2005)

    Google Scholar 

  19. Ke, Y., Sukthankar, R.: Pca-sift: A more distinctive representation for local image descriptors. In: CVPR (2), pp. 506–513 (2004)

    Google Scholar 

  20. Kohonen, T.: The self-organizing map. Neurocomputing 21(1-3), 1–6 (1998)

    Article  MATH  Google Scholar 

  21. Koikkalainen, P., Oja, E.: Self-organizing hierarchical feature maps. In: 1990 IJCNN International Joint Conference on Neural Networks, vol. 2, pp. 279–284 (1990)

    Google Scholar 

  22. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR (2), pp. 2169–2178. IEEE Computer Society (2006)

    Google Scholar 

  23. Li, F.F., Perona, P.: A bayesian hierarchical model for learning natural scene categories. In: CVPR (2), pp. 524–531. IEEE Computer Society (2005)

    Google Scholar 

  24. Linde, Y., Buzo, A., Gray, R.: An algorithm for vector quantizer design. IEEE Transactions on Communications 28(1), 84–95 (1980)

    Article  Google Scholar 

  25. Liu, B.D., Wang, Y.X., Zhang, Y.J., Shen, B.: Learning dictionary on manifolds for image classification. Pattern Recognition 46(7), 1879–1890 (2013)

    Article  Google Scholar 

  26. Lowe, D.G.: Object recognition from local scale-invariant features. In: ICCV, pp. 1150–1157 (1999)

    Google Scholar 

  27. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)

    Article  Google Scholar 

  28. Martínez-Muñoz, G., Delgado, N.L., Mortensen, E.N., Zhang, W., Yamamuro, A., Paasch, R., Payet, N., Lytle, D.A., Shapiro, L.G., Todorovic, S., Moldenke, A., Dietterich, T.G.: Dictionary-free categorization of very similar objects via stacked evidence trees. In: CVPR, pp. 549–556. IEEE (2009)

    Google Scholar 

  29. Meyerson, A.: Online facility location. In: FOCS, pp. 426–431. IEEE Computer Society (2001)

    Google Scholar 

  30. Mikolajczyk, K., Schmid, C.: Indexing based on scale invariant interest points. In: ICCV, pp. 525–531 (2001)

    Google Scholar 

  31. Mikolajczyk, K., Schmid, C.: Scale & affine invariant interest point detectors. International Journal of Computer Vision 60(1), 63–86 (2004)

    Article  Google Scholar 

  32. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10), 1615–1630 (2005)

    Article  Google Scholar 

  33. Moosmann, F., Triggs, B., Jurie, F.: Fast discriminative visual codebooks using randomized clustering forests. In: Schölkopf, B., Platt, J.C., Hoffman, T. (eds.) NIPS, pp. 985–992. MIT Press (2006), http://eprints.pascal-network.org/archive/00002438/01/nips.pdf

  34. Nowak, E., Jurie, F., Triggs, B.: Sampling strategies for bag-of-features image classification. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part IV. LNCS, vol. 3954, pp. 490–503. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  35. Penatti, O.A.B., Silva, F.B., Valle, E., Gouet-Brunet, V., Torres, R.d.S.: Visual word spatial arrangement for image retrieval and classification. Pattern Recognition 47(2), 705–720 (2014)

    Article  Google Scholar 

  36. Petrosino, A. (ed.): ICIAP 2013, Part II. LNCS, vol. 8157, pp. 2013–2017. Springer, Heidelberg (2013)

    Google Scholar 

  37. Pillai, I., Fumera, G., Roli, F.: Threshold optimisation for multi-label classifiers. Pattern Recognition 46(7), 2055–2065 (2013), http://www.sciencedirect.com/science/article/pii/S0031320313000320

    Article  Google Scholar 

  38. Piras, L., Tronci, R., Giacinto, G.: Diversity in ensembles of codebooks for visual concept detection. In: Petrosino (ed.) [36], pp. 399–408

    Google Scholar 

  39. Ramanan, A., Niranjan, M.: A review of codebook models in patch-based visual object recognition. Journal of Signal Processing Systems 68(3), 333–352 (2012)

    Article  Google Scholar 

  40. van Rijsbergen, C.J.: Information Retrieval. Butterworth (1979)

    Google Scholar 

  41. van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1582–1596 (2010)

    Article  Google Scholar 

  42. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. 34(1), 1–47 (2002)

    Article  Google Scholar 

  43. Sivic, J., Zisserman, A.: A text retrieval approach to object matching in videos. In: ICCV, pp. 1470–1477. IEEE Computer Society (2003)

    Google Scholar 

  44. Thomee, B., Popescu, A.: Overview of the imageclef 2012 flickr photo annotation and retrieval task. Tech. rep., CLEF 2012 working notes, Rome, Italy (2012)

    Google Scholar 

  45. Tsoumakas, G., Katakis, I., Vlahavas, I.P.: Mining multi-label data. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 667–685. Springer (2010)

    Google Scholar 

  46. Tuytelaars, T., Mikolajczyk, K.: Local invariant feature detectors: A survey. Foundations and Trends in Computer Graphics and Vision 3(3), 177–280 (2007)

    Article  Google Scholar 

  47. Viitaniemi, V., Laaksonen, J.: Experiments on selection of codebooks for local image feature histograms. In: Sebillo, M., Vitiello, G., Schaefer, G. (eds.) VISUAL 2008. LNCS, vol. 5188, pp. 126–137. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  48. Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3360–3367 (2010)

    Google Scholar 

  49. Wu, J., Rehg, J.M.: Beyond the euclidean distance: Creating effective visual codebooks using the histogram intersection kernel. In: ICCV, pp. 630–637. IEEE (2009)

    Google Scholar 

  50. Yang, Y.: A study on thresholding strategies for text categorization. In: ACM (ed.) Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 137–145 (2001)

    Google Scholar 

  51. Zhang, C., Wang, S., Liang, C., Liu, J., Huang, Q., Li, H., Tian, Q.: Beyond bag of words: image representation in sub-semantic space. In: Jaimes, A., Sebe, N., Boujemaa, N., Gatica-Perez, D., Shamma, D.A., Worring, M., Zimmermann, R. (eds.) ACM Multimedia, pp. 497–500. ACM (2013)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Piras, L., Giacinto, G. (2014). Open Issues on Codebook Generation in Image Classification Tasks. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science(), vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_25

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-08979-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08978-2

  • Online ISBN: 978-3-319-08979-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics