Efficient Object Category Recognition Using Classemes

  • Lorenzo Torresani
  • Martin Szummer
  • Andrew Fitzgibbon
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6311)


We introduce a new descriptor for images which allows the construction of efficient and compact classifiers with good accuracy on object category recognition. The descriptor is the output of a large number of weakly trained object category classifiers on the image. The trained categories are selected from an ontology of visual concepts, but the intention is not to encode an explicit decomposition of the scene. Rather, we accept that existing object category classifiers often encode not the category per se but ancillary image characteristics; and that these ancillary characteristics can combine to represent visual classes unrelated to the constituent categories’ semantic meanings.
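As a rough illustration of the idea, the descriptor can be thought of as the vector of scores produced by a bank of pre-trained category classifiers applied to one image. The sketch below is not the authors' implementation. It assumes, purely for illustration, that each base classifier is a linear function of a low-level feature vector; the function name `classeme_descriptor` and the toy data are hypothetical.

```python
import random


def classeme_descriptor(features, classifiers):
    """Return one real-valued score per base classifier.

    features:    low-level feature vector of the image (list of floats)
    classifiers: list of (weights, bias) pairs, one per trained category
                 (linear classifiers are an assumption of this sketch)
    """
    return [
        sum(w * x for w, x in zip(weights, features)) + bias
        for weights, bias in classifiers
    ]


# Hypothetical toy setup: 5 base classifiers over 8-dimensional features.
rng = random.Random(0)
toy_classifiers = [
    ([rng.uniform(-1, 1) for _ in range(8)], rng.uniform(-1, 1))
    for _ in range(5)
]
toy_features = [rng.random() for _ in range(8)]

# The descriptor has one entry per base classifier, regardless of what
# the image actually depicts.
descriptor = classeme_descriptor(toy_features, toy_classifiers)
```

The individual entries are not meant to be read as reliable category labels; they serve only as coordinates in which a new classifier can later be trained.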

The advantage of this descriptor is that it allows object-category queries to be made against image databases using efficient classifiers (efficient at test time) such as linear support vector machines, and allows these queries to be for novel categories. Even when the representation is reduced to 200 bytes per image, classification accuracy on object category recognition is comparable with the state of the art (36% versus 42%), but at orders of magnitude lower computational cost.
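To make the storage and test-time cost concrete, the sketch below shows one way a descriptor could be reduced to 200 bytes and then scored by a linear classifier for a new category. It assumes the scores are binarized at a threshold and packed 8 per byte, so that 1600 bits occupy 200 bytes; the threshold, the descriptor length of 1600, and the helper names `pack_bits` and `linear_score` are assumptions of this sketch rather than details stated in the abstract.

```python
def pack_bits(scores, threshold=0.0):
    """Binarize scores at `threshold` and pack them 8 per byte."""
    bits = [1 if s > threshold else 0 for s in scores]
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def linear_score(packed, weights, bias=0.0):
    """Score a packed binary descriptor with a linear classifier.

    With binary inputs the dot product reduces to summing the weights
    of the bits that are set, so no multiplications are needed.
    """
    total = bias
    for i, w in enumerate(weights):
        if (packed[i // 8] >> (i % 8)) & 1:
            total += w
    return total


# Hypothetical descriptor of 1600 classifier scores for one image.
scores = [0.3 if i % 2 == 0 else -0.3 for i in range(1600)]
packed = pack_bits(scores)  # 1600 bits stored in 200 bytes
```

Querying a database for a novel category then amounts to training one weight vector on a few labeled descriptors and evaluating `linear_score` once per stored image.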


Keywords: Training Image, Category Label, Image Search, Linear Support Vector Machine, Multiple Kernel Learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Lorenzo Torresani (1)
  • Martin Szummer (2)
  • Andrew Fitzgibbon (2)
  1. Dartmouth College, Hanover, USA
  2. Microsoft Research, Cambridge, United Kingdom
