Learning Compositional Categorization Models

Ommer, Björn; Buhmann, Joachim M.

doi:10.1007/11744078_25


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3953)

Included in the following conference series:

European Conference on Computer Vision


Abstract

This contribution proposes a compositional approach to visual object categorization of scenes. Compositions are learned from the Caltech 101 database and form intermediate abstractions of images that are semantically situated between low-level representations and the high-level categorization. Salient regions, which are described by localized feature histograms, are detected as image parts. Subsequently, compositions are formed as bags of parts with a locality constraint. After a spatial binding of compositions by means of a shape model, coupled probabilistic kernel classifiers are applied to establish the final image categorization. In contrast to the discriminative training of the categorizer, intermediate compositions are learned in a generative manner, yielding relevant part agglomerations, i.e. groupings that appear frequently in the dataset while simultaneously supporting the discrimination between sets of categories. Consequently, compositionality simplifies the learning of a complex categorization model for complete scenes by splitting it up into simpler, sharable compositions. The architecture is evaluated on the highly challenging Caltech 101 database, which exhibits large intra-category variations. Our compositional approach shows competitive retrieval rates of 53.6 ± 0.88% or, with a multi-scale feature set, 57.8 ± 0.79%.
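The abstract does not spell out how compositions are built, so the following is only a minimal sketch of the "bags of parts with a locality constraint" idea, not the authors' implementation. It assumes that each detected part has already been quantized to a codebook index, that locality means a Euclidean radius around a part, and that a composition is the normalized histogram of codewords inside that radius. The function name `form_compositions` and the parameters `radius` and `codebook_size` are hypothetical.

```python
import math
from collections import Counter


def form_compositions(parts, radius, codebook_size):
    """Pool nearby parts into one bag-of-parts histogram per part.

    parts: list of (x, y, codeword_index) tuples (assumed input format).
    radius: assumed locality constraint, as a Euclidean distance.
    Returns a list of normalized histograms, one per part.
    """
    compositions = []
    for cx, cy, _ in parts:
        # Count the codewords of all parts within `radius` of the current part.
        counts = Counter(
            w for x, y, w in parts
            if math.hypot(x - cx, y - cy) <= radius
        )
        # The current part always lies within its own radius, so total >= 1.
        total = sum(counts.values())
        compositions.append(
            [counts.get(k, 0) / total for k in range(codebook_size)]
        )
    return compositions


# Toy example: two neighbouring parts and one isolated part.
parts = [(0.0, 0.0, 0), (1.0, 0.0, 1), (10.0, 10.0, 1)]
comps = form_compositions(parts, radius=2.0, codebook_size=2)
```

In this toy example the two neighbouring parts share the same mixed histogram, while the isolated part yields a histogram concentrated on its own codeword. The shape model and the coupled kernel classifiers mentioned in the abstract are not covered by this sketch.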



Author information

Authors and Affiliations

Institute of Computational Science, ETH Zurich, 8092, Zurich, Switzerland
Björn Ommer & Joachim M. Buhmann


Editor information

Editors and Affiliations

University of Ljubljana, Slovenia
Aleš Leonardis
Institute for Computer Graphics and Vision, TU Graz, Inffeldgasse 16, 8010, Graz, Austria
Horst Bischof
Vision-based Measurement Group, Inst. of El. Measurement and Meas. Sign. Proc. Graz, University of Technology, Austria
Axel Pinz


About this paper

Cite this paper

Ommer, B., Buhmann, J.M. (2006). Learning Compositional Categorization Models. In: Leonardis, A., Bischof, H., Pinz, A. (eds) Computer Vision – ECCV 2006. ECCV 2006. Lecture Notes in Computer Science, vol 3953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11744078_25


DOI: https://doi.org/10.1007/11744078_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33836-9
Online ISBN: 978-3-540-33837-6
eBook Packages: Computer Science, Computer Science (R0)
