View Synthesis for Recognizing Unseen Poses of Object Classes

Savarese, Silvio; Fei-Fei, Li

doi:10.1007/978-3-540-88690-7_45

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5304)

Included in the following conference series:

European Conference on Computer Vision

Abstract

An important task in object recognition is to enable algorithms to categorize objects under arbitrary poses in a cluttered 3D world. A recent paper by Savarese & Fei-Fei [1] proposed a novel representation for modeling 3D object classes. In this representation, stable parts of objects from one class are linked together to capture both the appearance and shape properties of the object class. We propose to extend this framework and improve the model's ability to recognize poses that were not seen in training. Inspired by work on single-object view synthesis (e.g., Seitz & Dyer [2]), our new representation allows the model to synthesize novel views of an object class at recognition time. This mechanism is incorporated into a novel two-step algorithm that can classify objects under arbitrary and/or unseen poses. We compare our results on pose categorization with the model and dataset presented in [1]. In a second experiment, we collect a new, more challenging dataset of 8 object classes by crawling the web. In both experiments, our model shows competitive performance compared to [1] for classifying objects in unseen poses.
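
As a rough illustration of the view-synthesis idea the abstract refers to, the sketch below linearly interpolates the image positions of matched parts between two known views, which is the basic operation behind view morphing for parallel (rectified) views. It is a minimal sketch under stated assumptions, not the model proposed in the paper: the function name `interpolate_view`, the part names, and the coordinates are all hypothetical, and appearance is ignored entirely.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def interpolate_view(view_a: Dict[str, Point],
                     view_b: Dict[str, Point],
                     s: float) -> Dict[str, Point]:
    """Linearly interpolate matched part positions between two views.

    s = 0 reproduces view_a, s = 1 reproduces view_b, and values in
    between give an intermediate (synthesized) arrangement of parts.
    Only parts present in both views are interpolated.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    shared = view_a.keys() & view_b.keys()
    return {
        name: ((1 - s) * view_a[name][0] + s * view_b[name][0],
               (1 - s) * view_a[name][1] + s * view_b[name][1])
        for name in shared
    }


# Hypothetical part centers (pixel coordinates) in two training views.
front = {"wheel": (10.0, 40.0), "door": (30.0, 20.0)}
side = {"wheel": (20.0, 40.0), "door": (50.0, 24.0)}

# Part layout for a pose halfway between the two training views.
halfway = interpolate_view(front, side, 0.5)
```

A recognizer could, in principle, compare a query image against such interpolated layouts as well as the training views themselves, which is one way to reason about poses that were never observed directly.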

References

1. Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: IEEE Int. Conf. on Computer Vision, Rio de Janeiro, Brazil (October 2007)
2. Seitz, S., Dyer, C.: View morphing. In: SIGGRAPH, pp. 21–30 (1996)
3. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proc. Computer Vision and Pattern Recognition (2001)
4. Weber, M., Welling, M., Perona, P.: Unsupervised learning of models for recognition. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1842, pp. 18–32. Springer, Heidelberg (2000)
5. Dance, C., Willamowski, J., Fan, L., Bray, C., Csurka, G.: Visual categorization with bags of keypoints. In: ECCV International Workshop on Statistical Learning in Computer Vision, Prague (2004)
6. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proc. Comp. Vis. and Pattern Recogn. (2003)
7. Grauman, K., Darrell, T.: The pyramid match kernel: Discriminative classification with sets of image features. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China (2005)
8. Leibe, B., Schiele, B.: Combined object categorization and segmentation with an implicit shape model. In: Proc. Workshop on Statistical Learning in Computer Vision, Prague, Czech Republic (2004)
9. Berg, A., Berg, T., Malik, J.: Shape matching and object recognition using low distortion correspondences. In: Proc. Computer Vis. and Pattern Recog. (2005)
10. Todorovic, S., Ahuja, N.: Extracting subimages of an unknown category from a set of images. In: CVPR (2006)
11. Schneiderman, H., Kanade, T.: A statistical approach to 3D object detection applied to faces and cars. In: Proc. CVPR, pp. 746–751 (2000)
12. Weber, M., Einhaeuser, W., Welling, M., Perona, P.: Viewpoint-invariant learning and detection of human heads. In: Int. Conf. Autom. Face and Gesture Rec. (2000)
13. Torralba, A., Murphy, K., Freeman, W.: Sharing features: efficient boosting procedures for multiclass object detection. In: Proc. Conference on Computer Vision and Pattern Recognition (CVPR) (2004)
14. Beier, T., Neely, S.: Feature-based image metamorphosis. In: SIGGRAPH (1992)
15. Chen, S., Williams, L.: View interpolation for image synthesis. Computer Graphics 27, 279–288 (1993)
16. Szeliski, R.: Video mosaics for virtual environments. Computer Graphics and Applications 16, 22–30 (1996)
17. Avidan, S., Shashua, A.: Novel view synthesis in tensor space. In: Proc. Computer Vision and Pattern Recognition, vol. 1, pp. 1034–1040 (1997)
18. Laveau, S., Faugeras, O.: 3-d scene representation as a collection of images. In: Proc. International Conference on Pattern Recognition (1994)
19. Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)
20. Xiao, J., Shah, M.: Tri-view morphing. CVIU 96 (2004)
21. Brown, M., Lowe, D.: Unsupervised 3D object recognition and reconstruction in unordered datasets. In: 5th International Conference on 3D Imaging and Modelling (3DIM 2005), Ottawa, Canada (2005)
22. Lowe, D.: Object recognition from local scale-invariant features. In: Proc. International Conference on Computer Vision, pp. 1150–1157 (1999)
23. Ullman, S., Basri, R.: Recognition by linear combination of models. Technical report, Cambridge, MA, USA (1989)
24. Rothganger, F., Lazebnik, S., Schmid, C., Ponce, J.: 3d object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. IJCV 66(3), 231–259 (2006)
25. Ferrari, V., Tuytelaars, T., Van Gool, L.: Simultaneous object recognition and segmentation from single or multiple model views. IJCV (2006)
26. Lazebnik, S., Schmid, C., Ponce, J.: Semi-local affine parts for object recognition. In: Proceedings of BMVC, Kingston, UK, vol. 2, pp. 959–968 (2004)
27. Bart, E., Byvatov, E., Ullman, S.: View-invariant recognition using corresponding object fragments. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3022, pp. 152–165. Springer, Heidelberg (2004)
28. Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., Van Gool, L.: Towards multi-view object class detection. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1589–1596 (2006)
29. Kushal, A., Schmid, C., Ponce, J.: Flexible object models for category-level 3d object recognition. In: Proc. Conf. on Comp. Vis. and Patt. Recogn. (2007)
30. Hoiem, D., Rother, C., Winn, J.: 3D LayoutCRF for multi-view object class recognition and segmentation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (2007)
31. Yan, P., Khan, D., Shah, M.: 3d model based object class detection in an arbitrary view. In: ICCV (2007)
32. Chiu, H., Kaelbling, L., Lozano-Perez, T.: Virtual training for multi-view object class recognition. In: CVPR (2007)
33. http://vangogh.ai.uiuc.edu/silvio/3ddataset.html
34. Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: a database and web-based tool for image annotation. Int. Journal of Computer Vision (2007)
35. http://vangogh.ai.uiuc.edu/silvio/3ddataset2.html

Author information

Authors and Affiliations

Department of Electrical Engineering, University of Michigan at Ann Arbor, USA
Silvio Savarese
Department of Computer Science, Princeton University, USA
Li Fei-Fei

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, IL 61801, Urbana, USA
David Forsyth
Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this paper

Cite this paper

Savarese, S., Fei-Fei, L. (2008). View Synthesis for Recognizing Unseen Poses of Object Classes. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_45

DOI: https://doi.org/10.1007/978-3-540-88690-7_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88689-1
Online ISBN: 978-3-540-88690-7
eBook Packages: Computer Science, Computer Science (R0)
