Abstract
An important task in object recognition is to enable algorithms to categorize objects under arbitrary poses in a cluttered 3D world. A recent paper by Savarese & Fei-Fei [1] proposed a novel representation to model 3D object classes. In this representation, stable parts of objects from one class are linked together to capture both the appearance and shape properties of the object class. We propose to extend this framework and improve the ability of the model to recognize poses that have not been seen in training. Inspired by work on single-object view synthesis (e.g., Seitz & Dyer [2]), our new representation allows the model to synthesize novel views of an object class at recognition time. This mechanism is incorporated in a novel two-step algorithm that is able to classify objects under arbitrary and/or unseen poses. We compare our results on pose categorization with the model and dataset presented in [1]. In a second experiment, we collect a new, more challenging dataset of 8 object classes by crawling the web. In both experiments, our model shows competitive performance compared to [1] in classifying objects in unseen poses.
References
Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: IEEE Int. Conf. on Computer Vision, Rio de Janeiro, Brazil (October 2007)
Seitz, S., Dyer, C.: View morphing. In: SIGGRAPH, pp. 21–30 (1996)
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proc. Computer Vision and Pattern Recognition (2001)
Weber, M., Welling, M., Perona, P.: Unsupervised learning of models for recognition. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1842, pp. 18–32. Springer, Heidelberg (2000)
Dance, C., Willamowski, J., Fan, L., Bray, C., Csurka, G.: Visual categorization with bags of keypoints. In: ECCV International Workshop on Statistical Learning in Computer Vision, Prague (2004)
Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proc. Comp. Vis. and Pattern Recogn. (2003)
Grauman, K., Darrell, T.: The pyramid match kernel: Discriminative classification with sets of image features. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China (2005)
Leibe, B., Schiele, B.: Combined object categorization and segmentation with an implicit shape model. In: Proc. Workshop on statistical learning in computer vision, Prague, Czech Republic (2004)
Berg, A., Berg, T., Malik, J.: Shape matching and object recognition using low distortion correspondences. In: Proc. Computer Vis. and Pattern Recog. (2005)
Todorovic, S., Ahuja, N.: Extracting subimages of an unknown category from a set of images. In: CVPR (2006)
Schneiderman, H., Kanade, T.: A statistical approach to 3D object detection applied to faces and cars. In: Proc. CVPR, pp. 746–751 (2000)
Weber, M., Einhaeuser, W., Welling, M., Perona, P.: Viewpoint-invariant learning and detection of human heads. In: Int. Conf. Autom. Face and Gesture Rec. (2000)
Torralba, A., Murphy, K., Freeman, W.: Sharing features: efficient boosting procedures for multiclass object detection. In: Proc. Conference on Computer Vision and Pattern Recognition (CVPR) (2004)
Beier, T., Neely, S.: Feature-based image metamorphosis. In: SIGGRAPH (1992)
Chen, S., Williams, L.: View interpolation for image synthesis. Computer Graphics 27, 279–288 (1993)
Szeliski, R.: Video mosaics for virtual environments. Computer Graphics and Applications 16, 22–30 (1996)
Avidan, S., Shashua, A.: Novel view synthesis in tensor space. In: Proc. Computer Vision and Pattern Recognition, vol. 1, pp. 1034–1040 (1997)
Laveau, S., Faugeras, O.: 3-d scene representation as a collection of images. In: Proc. International Conference on Pattern Recognition (1994)
Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)
Xiao, J., Shah, M.: Tri-view morphing. CVIU 96 (2004)
Brown, M., Lowe, D.: Unsupervised 3D object recognition and reconstruction in unordered datasets. In: 5th International Conference on 3D Imaging and Modelling (3DIM 2005), Ottawa, Canada (2005)
Lowe, D.: Object recognition from local scale-invariant features. In: Proc. International Conference on Computer Vision, pp. 1150–1157 (1999)
Ullman, S., Basri, R.: Recognition by linear combination of models. Technical report, Cambridge, MA, USA (1989)
Rothganger, F., Lazebnik, S., Schmid, C., Ponce, J.: 3d object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. IJCV 66(3), 231–259 (2006)
Ferrari, V., Tuytelaars, T., Van Gool, L.: Simultaneous object recognition and segmentation from single or multiple model views. IJCV (2006)
Lazebnik, S., Schmid, C., Ponce, J.: Semi-local affine parts for object recognition. In: Proceedings of BMVC, Kingston, UK, vol. 2, pp. 959–968 (2004)
Bart, E., Byvatov, E., Ullman, S.: View-invariant recognition using corresponding object fragments. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3022, pp. 152–165. Springer, Heidelberg (2004)
Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., Van Gool, L.: Towards multi-view object class detection. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1589–1596 (2006)
Kushal, A., Schmid, C., Ponce, J.: Flexible object models for category-level 3d object recognition. In: Proc. Conf. on Comp. Vis. and Patt. Recogn. (2007)
Hoiem, D., Rother, C., Winn, J.: 3D LayoutCRF for multi-view object class recognition and segmentation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (2007)
Yan, P., Khan, D., Shah, M.: 3d model based object class detection in an arbitrary view. In: ICCV (2007)
Chiu, H., Kaelbling, L., Lozano-Perez, T.: Virtual training for multi-view object class recognition. In: CVPR (2007)
Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: a database and web-based tool for image annotation. Int. Journal of Computer Vision (2007)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Savarese, S., Fei-Fei, L. (2008). View Synthesis for Recognizing Unseen Poses of Object Classes. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88690-7_45
DOI: https://doi.org/10.1007/978-3-540-88690-7_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88689-1
Online ISBN: 978-3-540-88690-7
eBook Packages: Computer Science (R0)