Kernels for Visual Words Histograms

  • Radu Tudor Ionescu
  • Marius Popescu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8156)


Computer vision researchers have developed several learning methods based on the bag-of-words model for image-related tasks, such as image retrieval or image categorization. In such an approach, images are represented as histograms of visual words from a codebook that is usually obtained with a simple clustering method. Kernel methods are then used to compare these histograms. Popular choices, besides the linear SVM, are the intersection, Hellinger’s, χ² and Jensen-Shannon kernels.
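As a rough illustration of the histogram kernels named above, the following sketch uses their commonly cited additive forms. The function names and the assumption that histograms are L1-normalized lists of non-negative floats are illustrative choices, not taken from the paper.

```python
import math

# Assumption: x and y are equal-length, non-negative, L1-normalized histograms.

def intersection_kernel(x, y):
    # Sum of bin-wise minima.
    return sum(min(a, b) for a, b in zip(x, y))

def hellinger_kernel(x, y):
    # Sum of bin-wise geometric means (Bhattacharyya coefficient).
    return sum(math.sqrt(a * b) for a, b in zip(x, y))

def chi2_kernel(x, y):
    # Additive chi-squared kernel; bins that are empty in both histograms are skipped.
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > 0)

def jensen_shannon_kernel(x, y):
    # Additive Jensen-Shannon kernel with base-2 logarithms.
    total = 0.0
    for a, b in zip(x, y):
        if a > 0:
            total += 0.5 * a * math.log2((a + b) / a)
        if b > 0:
            total += 0.5 * b * math.log2((a + b) / b)
    return total

x = [0.5, 0.5, 0.0]
y = [0.25, 0.25, 0.5]
for kernel in (intersection_kernel, hellinger_kernel, chi2_kernel, jensen_shannon_kernel):
    print(kernel.__name__, round(kernel(x, y), 4))
```

With these forms, each kernel returns 1 when an L1-normalized histogram is compared with itself and 0 when the two histograms share no occupied bins.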

This paper introduces a kernel for histograms of visual words, namely the PQ kernel. The kernel is inspired by a class of similarity measures for ordinal variables, more precisely Goodman and Kruskal’s gamma and Kendall’s tau. A proof that PQ is actually a kernel is also given; it is based on building the kernel’s feature map.
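Both ordinal measures mentioned above are built from counts of concordant pairs (P) and discordant pairs (Q). The sketch below computes these counts and the standard forms of Kendall's tau (the tau-a variant) and Goodman and Kruskal's gamma. The abstract does not state the PQ kernel's formula, so `pq_score` (the unnormalized difference P − Q) is a hypothetical illustration of the idea, not the authors' definition.

```python
from itertools import combinations

def count_pairs(x, y):
    """Return (P, Q): the numbers of concordant and discordant index pairs."""
    p = q = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            p += 1      # both sequences order bins i and j the same way
        elif s < 0:
            q += 1      # the sequences order bins i and j in opposite ways
        # s == 0 means a tie in x or y; such pairs count toward neither P nor Q
    return p, q

def kendall_tau_a(x, y):
    # (P - Q) divided by the total number of pairs n(n - 1)/2.
    n = len(x)
    p, q = count_pairs(x, y)
    return (p - q) / (n * (n - 1) / 2)

def goodman_kruskal_gamma(x, y):
    # (P - Q) divided by the number of untied pairs; 0.0 if every pair is tied.
    p, q = count_pairs(x, y)
    return (p - q) / (p + q) if p + q else 0.0

def pq_score(x, y):
    # Hypothetical unnormalized similarity P - Q (an assumption, see lead-in).
    p, q = count_pairs(x, y)
    return p - q

x = [3, 1, 2, 0]
y = [2, 1, 3, 0]
print(count_pairs(x, y), kendall_tau_a(x, y), goodman_kruskal_gamma(x, y))
```

The pairwise loop is quadratic in the number of histogram bins, which is acceptable for a sketch but may be slow for large codebooks.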

Object recognition experiments are conducted to compare the PQ kernel with other state-of-the-art kernels on two benchmark datasets. The PQ kernel achieves the best mean average precision (AP) on both datasets. In one of the experiments, the PQ and Jensen-Shannon kernels are combined to improve the mean AP score even further. In conclusion, the PQ kernel can be used successfully, alone or in combination with other kernels, for image retrieval, image classification or other related tasks.


kernel method, rank correlation measure, ordinal measure, ordinal data, visual words, histograms, bag-of-words, BoW model


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Radu Tudor Ionescu (1)
  • Marius Popescu (1)
  1. Faculty of Mathematics and Computer Science, University of Bucharest, Bucharest, Romania
