3D²PM – 3D Deformable Part Models

Pepik, Bojan; Gehler, Peter; Stark, Michael; Schiele, Bernt

doi:10.1007/978-3-642-33783-3_26

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7577)

Included in the following conference series:

European Conference on Computer Vision

Abstract

As objects are inherently 3-dimensional, they were modeled in 3D in the early days of computer vision. Due to the ambiguities that arise when mapping 2D features to 3D models, 2D feature-based models are the predominant paradigm in object recognition today. While such models have shown competitive bounding box (BB) detection performance, they are clearly limited in their capability for fine-grained 3D reasoning and continuous viewpoint estimation, as required for advanced tasks such as 3D scene understanding. This work extends the deformable part model [1] to a 3D object model. It consists of multiple parts modeled in 3D and a continuous appearance model. As a result, the model generalizes beyond BB-oriented object detection and can be jointly optimized in a discriminative fashion for object detection and viewpoint estimation. Our 3D Deformable Part Model (3D²PM) leverages CAD data of the object class as a 3D geometry proxy.
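
To illustrate why parts modeled in 3D lend themselves to continuous viewpoint estimation, the following minimal sketch projects a set of 3D part anchors into the image plane for an arbitrary azimuth angle. It is not the authors' formulation: the function name `project_parts`, the anchor coordinates in `PARTS`, the azimuth-only rotation, and the orthographic projection are all assumptions made for illustration.

```python
import math


def project_parts(parts_3d, azimuth_deg):
    """Rotate 3D part anchors about the vertical (y) axis, then drop depth.

    Orthographic projection and azimuth-only rotation are simplifying
    assumptions of this sketch, not details taken from the paper.
    """
    a = math.radians(azimuth_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    projected = []
    for x, y, z in parts_3d:
        x_rot = cos_a * x + sin_a * z  # rotate in the x-z plane
        projected.append((x_rot, y))   # orthographic: discard depth
    return projected


# Hypothetical part anchors in an object-centered frame (arbitrary units).
PARTS = [(1.0, 0.0, 0.0), (0.0, 0.5, 1.0)]

if __name__ == "__main__":
    # Any real-valued azimuth is valid, not only a fixed set of view bins.
    for azimuth in (0.0, 37.5, 90.0):
        print(azimuth, project_parts(PARTS, azimuth))
```

Because the 2D part layout is a function of the viewpoint rather than a per-view lookup, a model of this kind can in principle be evaluated at viewpoints that lie between the discrete views seen in training.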

References

[1] Felzenszwalb, P.F., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. PAMI (2010)
[2] Marr, D., Nishihara, H.: Representation and recognition of the spatial organization of three-dimensional shapes. Proc. Roy. Soc. London B 200, 269–294 (1978)
[3] Brooks, R.: Symbolic reasoning among 3-d models and 2-d images. Artificial Intelligence 17, 285–348 (1981)
[4] Pentland, A.: Perceptual organization and the representation of natural form. Artificial Intelligence 28 (1986)
[5] Lowe, D.: Three-dimensional object recognition from single two-dimensional images. Artificial Intelligence (1987)
[6] Stark, L., Hoover, A., Goldgof, D., Bowyer, K.: Function-based recognition from incomplete knowledge of shape. In: WQV 1993 (1993)
[7] Green, K., Eggert, D., Stark, L., Bowyer, K.: Generic recognition of articulated objects through reasoning about potential function. CVIU (1995)
[8] Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: CVPR (2003)
[9] Leibe, B., Leonardis, A., Schiele, B.: Robust object detection with interleaved categorization and segmentation. IJCV 77, 259–289 (2008)
[10] Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
[11] Hoiem, D., Efros, A., Hebert, M.: Putting objects in perspective. IJCV (2008)
[12] Ess, A., Leibe, B., Schindler, K., Van Gool, L.: Robust multi-person tracking from a mobile platform. PAMI (2009)
[13] Wojek, C., Roth, S., Schindler, K., Schiele, B.: Monocular 3D Scene Modeling and Inference: Understanding Multi-Object Traffic Scenes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 467–481. Springer, Heidelberg (2010)
[14] Thomas, A., Ferrari, V., Leibe, B., Tuytelaars, T., Schiele, B., Van Gool, L.: Towards multi-view object class detection. In: CVPR (2006)
[15] Savarese, S., Fei-Fei, L.: 3D generic object categorization, localization and pose estimation. In: ICCV (2007)
[16] Yan, P., Khan, S., Shah, M.: 3D model based object class detection in an arbitrary view. In: ICCV (2007)
[17] Su, H., Sun, M., Fei-Fei, L., Savarese, S.: Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories. In: ICCV (2009)
[18] Liebelt, J., Schmid, C.: Multi-view object class detection with a 3D geometric model. In: CVPR (2010)
[19] Stark, M., Goesele, M., Schiele, B.: Back to the future: Learning shape models from 3D CAD data. In: BMVC (2010)
[20] Zia, Z., Stark, M., Schindler, K., Schiele, B.: Revisiting 3D geometric models for accurate object shape and pose. In: 3dRR 2011 (2011)
[21] Payet, N., Todorovic, S.: From contours to 3D object detection and pose estimation. In: ICCV (2011)
[22] Glasner, D., Galun, M., Alpert, S., Basri, R., Shakhnarovich, G.: Viewpoint-aware object detection and pose estimation. In: ICCV (2011)
[23] Lopez-Sastre, R.J., Tuytelaars, T., Savarese, S.: DPM revisited: A performance evaluation for object category pose estimation. In: ICCV-WS CORP (2011)
[24] Bao, S.Y., Savarese, S.: Semantic structure from motion. In: CVPR (2011)
[25] Ozuysal, M., Lepetit, V., Fua, P.: Pose estimation for category specific multiview object localization. In: CVPR (2009)
[26] Gu, C., Ren, X.: Discriminative Mixture-of-Templates for Viewpoint Classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 408–421. Springer, Heidelberg (2010)
[27] Pepik, B., Stark, M., Gehler, P., Schiele, B.: Teaching 3D geometry to deformable part models. In: CVPR (2012)
[28] Arie-Nachimson, M., Basri, R.: Constructing implicit 3D shape models for pose estimation. In: ICCV (2009)
[29] Sun, M., Bradski, G., Xu, B.-X., Savarese, S.: Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 658–671. Springer, Heidelberg (2010)
[30] Yu, C.N.J., Joachims, T.: Learning structural SVMs with latent variables. In: International Conference on Machine Learning, ICML (2009)
[31] Blaschko, M.B., Lampert, C.H.: Learning to Localize Objects with Structured Output Regression. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 2–15. Springer, Heidelberg (2008)
[32] Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL VOC 2007 Results (2007)
[33] Lehmann, A., Gehler, P., Van Gool, L.: Branch&rank: Non-linear object detection. In: BMVC (2011)

Author information

Authors and Affiliations

Max Planck Institute for Informatics, Germany
Bojan Pepik, Michael Stark & Bernt Schiele
Stanford University, USA
Michael Stark
Max Planck Institute for Intelligent Systems, Germany
Peter Gehler

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

Cite this paper

Pepik, B., Gehler, P., Stark, M., Schiele, B. (2012). 3D²PM – 3D Deformable Part Models. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33783-3_26

DOI: https://doi.org/10.1007/978-3-642-33783-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33782-6
Online ISBN: 978-3-642-33783-3
eBook Packages: Computer Science, Computer Science (R0)
