Object Proposals Estimation in Depth Image Using Compact 3D Shape Manifolds

Zheng, Shuai; Prisacariu, Victor Adrian; Averkiou, Melinos; Cheng, Ming-Ming; Mitra, Niloy J.; Shotton, Jamie; Torr, Philip H. S.; Rother, Carsten

doi:10.1007/978-3-319-24947-6_16

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9358)

Included in the following conference series:

German Conference on Pattern Recognition

Abstract

Man-made objects, such as chairs, often exhibit very large shape variations, which makes them challenging to detect. In this work we investigate the task of finding particular object shapes in a single depth image. We tackle this task by exploiting the inherently low dimensionality of object shape variations, which we discover and encode as a compact shape space. Starting from any collection of 3D models, we first train a low-dimensional Gaussian Process Latent Variable Shape Space. We then sample this space, producing an effectively unlimited number of shape variations, which are used for training. Additionally, to support fast and accurate inference, we improve the standard 3D object category proposal generation pipeline by adding a filtering stage based on a shallow convolutional neural network. This combination leads to considerable improvements in proposal generation, in both speed and accuracy. We compare our full system to previous state-of-the-art approaches on four different shape classes and show a clear improvement.
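The idea of sampling a learned low-dimensional shape space to generate new shape variations can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-number "shapes", the hand-assigned 1D latent coordinates, the RBF kernel settings, and all function names (`rbf`, `solve`, `fit_decoder`, `decode`, `sample_shapes`) are assumptions made for illustration. A real Gaussian Process Latent Variable Model would also optimize the latent coordinates; here they are fixed, and only the decoding step (the Gaussian process predictive mean from latent point to shape) is shown.

```python
import math
import random


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1D latent points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fit_decoder(latents, shapes, noise=1e-6):
    """Precompute alpha = (K + noise * I)^{-1} Y for the GP predictive mean."""
    n = len(latents)
    K = [[rbf(latents[i], latents[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, shapes)


def decode(x_new, latents, alpha):
    """GP predictive mean at latent point x_new: k(x_new, X) @ alpha."""
    k = [rbf(x_new, x) for x in latents]
    dim = len(alpha[0])
    return [sum(k[i] * alpha[i][d] for i in range(len(latents)))
            for d in range(dim)]


def sample_shapes(latents, alpha, count, seed=0):
    """Draw latent points uniformly within the training range and decode them."""
    rng = random.Random(seed)
    lo, hi = min(latents), max(latents)
    return [decode(rng.uniform(lo, hi), latents, alpha) for _ in range(count)]


# Toy training set (hypothetical): each shape is (width, depth, height),
# with a hand-assigned 1D latent coordinate per shape.
latents = [-1.0, 0.0, 1.0]
shapes = [[0.4, 0.4, 0.8], [0.5, 0.5, 1.0], [0.6, 0.6, 1.2]]

alpha = fit_decoder(latents, shapes)
new_shapes = sample_shapes(latents, alpha, count=5)
```

Because any latent point can be decoded, the number of distinct shapes that can be generated is limited only by how many samples are drawn, which is the property the abstract relies on for producing training data.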

S. Zheng, V.A. Prisacariu, M.-M. Cheng and P.H.S. Torr—This work has been supported by UK EPSRC EP/I001107/2 and EP/J014990 (VAP).

M. Averkiou and N.J. Mitra—This work has been supported by Starting Grant SmartGeometry (StG-2013-335373) and Melinos Averkiou is grateful for a scholarship from the Rabin Ezra Scholarship Trust.

M.-M. Cheng—This work has been partially supported by Youth Leader Program of Nankai University.

Notes

1. https://github.com/bittnt/Objectness
2. Experiments were carried out on a machine with an Intel Xeon E5-2687W (32 cores).

Author information

Authors and Affiliations

University of Oxford, Oxford, UK
Shuai Zheng, Victor Adrian Prisacariu, Ming-Ming Cheng & Philip H. S. Torr
University College London, London, UK
Melinos Averkiou & Niloy J. Mitra
Microsoft Research, Cambridge, UK
Jamie Shotton
TU Dresden, Dresden, Germany
Carsten Rother
Nankai University, Tianjin, China
Ming-Ming Cheng

Corresponding author

Correspondence to Shuai Zheng.

Editor information

Editors and Affiliations

Institute of Computer Science III, University of Bonn, Bonn, Germany
Juergen Gall
MPI for Intelligent Systems, University of Tübingen, Tübingen, Germany
Peter Gehler
Computer Vision Group, Visual Computing Institute, RWTH Aachen, Aachen, Germany
Bastian Leibe

About this paper

Cite this paper

Zheng, S. et al. (2015). Object Proposals Estimation in Depth Image Using Compact 3D Shape Manifolds. In: Gall, J., Gehler, P., Leibe, B. (eds) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_16

DOI: https://doi.org/10.1007/978-3-319-24947-6_16
Published: 03 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24946-9
Online ISBN: 978-3-319-24947-6
eBook Packages: Computer Science, Computer Science (R0)