A Sparse Object Category Model for Efficient Learning and Complete Recognition

Fergus, Rob; Perona, Pietro; Zisserman, Andrew

doi:10.1007/11957959_23

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

We present a “parts and structure” model for object category recognition that can be learnt efficiently and in a weakly-supervised manner: the model is learnt from example images containing category instances, without requiring segmentation from background clutter.

The model is a sparse representation of the object, and consists of a star topology configuration of parts modeling the output of a variety of feature detectors. The optimal choice of feature types (whose repertoire includes interest points, curves and regions) is made automatically.

In recognition, the model may be applied efficiently in a complete manner, bypassing the need for feature detectors, to give the globally optimal match within a query image. The approach is demonstrated on a wide variety of categories, and delivers both successful classification and localization of the object within the image.
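The abstract does not spell out the matching procedure, so the following is only a minimal sketch of why a star topology permits an exhaustive, globally optimal search. It assumes that every part has an appearance cost at each of N candidate locations and that each non-landmark part pays a quadratic penalty for deviating from its mean offset relative to the landmark. The function name `star_match`, the array layout, and the quadratic penalty are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def star_match(locations, landmark_cost, part_costs, mean_offsets, weight=1.0):
    """Exhaustive minimum-cost match for a star-structured part model (illustrative).

    locations:     (N, 2) candidate positions, e.g. every pixel or grid cell
    landmark_cost: (N,)   appearance cost of the landmark part at each position
    part_costs:    (P, N) appearance cost of each non-landmark part at each position
    mean_offsets:  (P, 2) assumed mean displacement of each part from the landmark
    """
    locations = np.asarray(locations, dtype=float)
    landmark_cost = np.asarray(landmark_cost, dtype=float)
    part_costs = np.asarray(part_costs, dtype=float)
    mean_offsets = np.asarray(mean_offsets, dtype=float)

    # rel[l, j] is the displacement from landmark candidate l to candidate j.
    rel = locations[None, :, :] - locations[:, None, :]

    total = landmark_cost.copy()
    choices = []
    for p in range(part_costs.shape[0]):
        # Quadratic deformation penalty (an assumption made for this sketch).
        deform = weight * np.sum((rel - mean_offsets[p]) ** 2, axis=-1)
        cost = deform + part_costs[p][None, :]   # rows: landmark, columns: part
        # Given the landmark, each part can be optimised independently.
        best = np.argmin(cost, axis=1)
        choices.append(best)
        total += cost[np.arange(len(best)), best]

    l_star = int(np.argmin(total))
    assignment = [int(c[l_star]) for c in choices]
    return l_star, assignment, float(total[l_star])

# Toy usage: two parts expected one unit right of and one unit above the landmark.
locs = [[0, 0], [1, 0], [0, 1], [5, 5]]
print(star_match(locs, [0, 1, 1, 1], [[1, 0, 1, 1], [1, 1, 0, 1]], [[1, 0], [0, 1]]))
# -> (0, [1, 2], 0.0)
```

Because the non-landmark parts are conditionally independent given the landmark, this search costs on the order of P·N² operations rather than the N^(P+1) needed to enumerate every joint configuration, which is what makes evaluating all locations, instead of only detector outputs, tractable in a sketch like this.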

Author information

Authors and Affiliations

Dept. of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, U.K.
Rob Fergus & Andrew Zisserman
Dept. of Electrical Engineering, California Institute of Technology, MC 136–93, Pasadena, CA, 91125, U.S.A.
Pietro Perona

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Fergus, R., Perona, P., Zisserman, A. (2006). A Sparse Object Category Model for Efficient Learning and Complete Recognition. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_23

DOI: https://doi.org/10.1007/11957959_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)