Hierarchical Learning of Dominant Constellations for Object Class Recognition

  • Nathan Mekuz
  • John K. Tsotsos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4843)

Abstract

The importance of spatial configuration information for object class recognition is widely recognized. Single isolated local appearance codes are often ambiguous. On the other hand, object classes are often characterized by groups of local features appearing in a specific spatial structure. Learning these structures can provide additional discriminant cues and boost recognition performance. However, the problem of learning such features automatically from raw images remains largely uninvestigated. In contrast to previous approaches which require accurate localization and segmentation of objects to learn spatial information, we propose learning by hierarchical voting to identify frequently occurring spatial relationships among local features directly from raw images. The method is resistant to common geometric perturbations in both the training and test data. We describe a novel representation developed to this end and present experimental results that validate its efficacy by demonstrating the improvement in class recognition results realized by including the additional learned information.
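The voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' representation, which this page does not describe. It assumes a hypothetical feature layout of (visual word id, x, y, scale, orientation) and shows only a single level of voting over feature pairs: each image casts one vote per quantized relative geometry, and relationships seen in enough images are kept. The bin counts, the log-distance quantization, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical feature layout (an assumption, not taken from the paper):
# (visual_word_id, x, y, scale, orientation_in_radians)

def pair_key(a, b, n_angle_bins=8):
    """Quantized geometry of b relative to a.

    The offset is measured in a's own frame (divided by a's scale,
    rotated by a's orientation), so the key does not change when the
    image is rotated or uniformly rescaled.
    """
    dx, dy = b[1] - a[1], b[2] - a[2]
    dist = math.hypot(dx, dy)
    if dist == 0 or a[3] <= 0:
        return None
    angle = (math.atan2(dy, dx) - a[4]) % (2 * math.pi)
    angle_bin = int(angle / (2 * math.pi) * n_angle_bins) % n_angle_bins
    dist_bin = round(math.log2(dist / a[3]))
    return (a[0], b[0], angle_bin, dist_bin)

def vote_pairs(images, n_angle_bins=8):
    """Count in how many images each quantized pairwise relationship occurs."""
    votes = Counter()
    for feats in images:
        keys = set()  # one vote per relationship per image
        for f, g in combinations(feats, 2):
            if f[0] == g[0]:
                continue  # same-word pairs skipped: no canonical reference feature
            a, b = (f, g) if f[0] < g[0] else (g, f)
            key = pair_key(a, b, n_angle_bins)
            if key is not None:
                keys.add(key)
        votes.update(keys)
    return votes

def dominant_pairs(votes, min_support):
    """Keep relationships that received votes from at least min_support images."""
    return {k for k, v in votes.items() if v >= min_support}

# The second image contains the word-1/word-2 pair of the first image,
# rotated by 90 degrees and scaled by 2; word 3 appears only once.
image_1 = [(1, 0.0, 0.0, 1.0, 0.0), (2, 3.0, 1.0, 1.0, 0.0), (3, 5.0, 5.0, 1.0, 0.0)]
image_2 = [(1, 10.0, 10.0, 2.0, math.pi / 2), (2, 8.0, 16.0, 2.0, math.pi / 2)]

votes = vote_pairs([image_1, image_2])
print(dominant_pairs(votes, min_support=2))  # {(1, 2, 0, 2)}
```

In this toy setting, a hierarchical scheme could treat each surviving pair as a new feature and repeat the vote to build larger constellations; that step is left out here because the abstract gives no details of it.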

Keywords

Local Feature, Visual Word, Constellation Representation, Class Recognition, Vote Space
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Nathan Mekuz (1)
  • John K. Tsotsos (1)
  1. Center for Vision Research (CVR) and Department of Computer Science and Engineering, York University, Toronto, M3J 1P3, Canada
