Image Classification with Multivariate Gaussian Descriptors

  • Costantino Grana
  • Giuseppe Serra
  • Marco Manfredi
  • Rita Cucchiara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8157)


Techniques based on the Bag of Words approach represent images by quantizing local descriptors and summarizing their distribution in a histogram. In contrast, in this paper we describe an image as a multivariate Gaussian distribution estimated over the extracted local descriptors. The estimated distribution is mapped to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix onto the Euclidean space tangent to the Riemannian manifold. To deal with large-scale datasets and high-dimensional feature spaces, the Stochastic Gradient Descent solver is adopted. The experimental results on Caltech-101 and ImageCLEF2011 show that the method obtains performance competitive with state-of-the-art approaches.
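
The mapping from a set of local descriptors to a single image vector can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the tangent space is taken at the identity matrix, so the projection of the covariance reduces to its matrix logarithm, and the symmetric result is vectorized with off-diagonal entries scaled by sqrt(2), a common convention for covariance descriptors. The function name `gaussian_descriptor` and the regularization parameter `eps` are hypothetical; the paper's actual choices may differ.

```python
import numpy as np


def gaussian_descriptor(local_descriptors, eps=1e-6):
    """Map an (n, d) array of local descriptors to one image-level vector.

    Sketch only: assumes the tangent space at the identity, so the
    covariance is projected with the matrix logarithm.
    """
    X = np.asarray(local_descriptors, dtype=float)
    n, d = X.shape

    # Parameters of the multivariate Gaussian estimated over the descriptors.
    mean = X.mean(axis=0)
    # eps * I keeps the covariance strictly positive definite (assumed choice).
    cov = np.cov(X, rowvar=False) + eps * np.eye(d)

    # Matrix logarithm of a symmetric positive definite matrix
    # through its eigendecomposition: log(C) = V diag(log(w)) V^T.
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(eigvals)) @ eigvecs.T

    # Keep the upper triangle only; scaling off-diagonal entries by sqrt(2)
    # makes the Euclidean norm of the vector equal the Frobenius norm of log_cov.
    rows, cols = np.triu_indices(d)
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    cov_vector = weights * log_cov[rows, cols]

    # Final descriptor: mean vector followed by the projected covariance.
    return np.concatenate([mean, cov_vector])
```

For d-dimensional local descriptors the resulting vector has d + d(d+1)/2 entries, so it grows quadratically with d. A vector of this form can be passed to a linear classifier trained with stochastic gradient descent.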


Keywords: image retrieval, image classification, multi-class, multi-label, stochastic gradient descent



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Costantino Grana (1)
  • Giuseppe Serra (1)
  • Marco Manfredi (1)
  • Rita Cucchiara (1)
  1. Università degli Studi di Modena e Reggio Emilia, Modena, Italy
