Abstract
This contribution proposes a compositional approach to visual object categorization of scenes. Compositions are learned from the Caltech 101 database and form intermediate abstractions of images that are semantically situated between low-level representations and the high-level categorization. Salient regions, which are described by localized feature histograms, are detected as image parts. Subsequently, compositions are formed as bags of parts with a locality constraint. After a spatial binding of compositions by means of a shape model, coupled probabilistic kernel classifiers are applied to establish the final image categorization. In contrast to the discriminative training of the categorizer, intermediate compositions are learned in a generative manner, yielding relevant part agglomerations, i.e., groupings that appear frequently in the dataset while simultaneously supporting the discrimination between sets of categories. Consequently, compositionality simplifies the learning of a complex categorization model for complete scenes by splitting it up into simpler, sharable compositions. The architecture is evaluated on the highly challenging Caltech 101 database, which exhibits large intra-category variations. Our compositional approach achieves competitive retrieval rates of 53.6 ± 0.88% or, with a multi-scale feature set, 57.8 ± 0.79%.
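As a rough illustration of what "bags of parts with a locality constraint" could look like, the following minimal sketch groups each detected part with its spatial neighbors and summarizes the group as a normalized histogram over part labels. It is not the authors' implementation: the function name `form_compositions`, the representation of a part as an `(x, y, codeword)` tuple, the fixed `radius`, and the use of a discrete codebook are all assumptions made for this example, since the abstract does not specify these details.

```python
from math import hypot


def form_compositions(parts, codebook_size, radius):
    """Build one bag-of-parts histogram per anchor part (hypothetical sketch).

    parts: list of (x, y, codeword) tuples, where codeword is an integer
           label in [0, codebook_size) assigned to the part's descriptor.
    radius: assumed locality constraint; only parts within this distance
            of the anchor contribute to its composition.
    """
    compositions = []
    for ax, ay, _ in parts:
        hist = [0.0] * codebook_size
        for x, y, codeword in parts:
            if hypot(x - ax, y - ay) <= radius:
                hist[codeword] += 1.0
        # The anchor always lies within its own neighborhood, so total >= 1.
        total = sum(hist)
        compositions.append([h / total for h in hist])
    return compositions


# Toy input: two nearby parts and one isolated part.
parts = [(0.0, 0.0, 0), (1.0, 0.0, 1), (10.0, 10.0, 2)]
comps = form_compositions(parts, codebook_size=3, radius=2.0)
```

In this toy input the two nearby parts share a composition that mixes their labels, while the isolated part forms a composition containing only itself. The spatial binding by a shape model and the coupled kernel classifiers mentioned in the abstract are not covered by this sketch.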
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ommer, B., Buhmann, J.M. (2006). Learning Compositional Categorization Models. In: Leonardis, A., Bischof, H., Pinz, A. (eds) Computer Vision – ECCV 2006. ECCV 2006. Lecture Notes in Computer Science, vol 3953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11744078_25
DOI: https://doi.org/10.1007/11744078_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33836-9
Online ISBN: 978-3-540-33837-6
eBook Packages: Computer Science (R0)