Single View 3D Reconstruction with Category Information Learning

  • Weihong Cao
  • Fei Hu
  • Long Ye
  • Qin Zhang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1181)

Abstract

3D reconstruction from a single image is a classical problem in computer vision. Because a single image does not contain enough information to recover a full 3D shape, existing models reconstruct 3D shapes poorly. To tackle this problem, we propose a novel model that uses the category information of objects to improve the network's performance on single view 3D reconstruction. Our model consists of two parts: a rough shape generation network (RSGN) and a category comparison network (CCN). Through the CCN, the RSGN learns the characteristics shared by objects of the same category. We verify the feasibility of our model in experiments on the ShapeNet dataset, and the results support our framework.

Keywords

Single view 3D reconstruction; Adversarial learning

Notes

Acknowledgment

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61971383 and 61631016 and by the Fundamental Research Funds for the Central Universities under Grant Nos. 2018XNG1824, YLSZ180226 and 2018XNG1825.

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Key Laboratory of Media Audio and Video, Communication University of China, Ministry of Education, Beijing, China
