A Deeper Look at 3D Shape Classifiers

  • Jong-Chyi Su (corresponding author)
  • Matheus Gadelha
  • Rui Wang
  • Subhransu Maji
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)

Abstract

We investigate the role of representations and architectures for classifying 3D shapes in terms of their computational efficiency, generalization, and robustness to adversarial transformations. By varying the number of training examples and employing cross-modal transfer learning we study the role of initialization of existing deep architectures for 3D shape classification. Our analysis shows that multiview methods continue to offer the best generalization even without pretraining on large labeled image datasets, and even when trained on simplified inputs such as binary silhouettes. Furthermore, the performance of voxel-based 3D convolutional networks and point-based architectures can be improved via cross-modal transfer from image representations. Finally, we analyze the robustness of 3D shape classifiers to adversarial transformations and present a novel approach for generating adversarial perturbations of a 3D shape for multiview classifiers using a differentiable renderer. We find that point-based networks are more robust to point position perturbations while voxel-based and multiview networks are easily fooled with the addition of imperceptible noise to the input.
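The claim that voxel-based and multiview networks are easily fooled by imperceptible noise rests on the standard gradient-based attack idea (the fast gradient sign method of Goodfellow et al.): nudge each input dimension a small step in the direction that increases the classifier's loss. The paper's own attack backpropagates through a differentiable renderer; as a minimal, self-contained sketch of the underlying principle only, the toy example below applies the same sign-gradient step to a hypothetical logistic classifier with random weights (all names and parameters here are illustrative, not from the paper).

```python
import numpy as np

# Toy logistic classifier: p(y=1 | x) = sigmoid(w . x + b)
rng = np.random.default_rng(0)
w = rng.normal(size=100)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x + b)

def fgsm(x, y_true, epsilon=0.05):
    """FGSM-style perturbation: step each coordinate by +/- epsilon
    in the direction that increases the loss for the true label."""
    p = predict(x)
    # For binary cross-entropy, dL/dz = (p - y_true) and dz/dx = w,
    # so the input gradient is (p - y_true) * w.
    grad_x = (p - y_true) * w
    return x + epsilon * np.sign(grad_x)

x = rng.normal(size=100)
y = 1.0 if predict(x) > 0.5 else 0.0  # use the model's own label as "truth"
x_adv = fgsm(x, y)

# The perturbation is bounded by epsilon per coordinate, yet the
# model's confidence in the original label drops markedly.
print(predict(x), predict(x_adv))
```

Because the per-coordinate change is at most epsilon, the perturbed input looks nearly identical to the original, yet the summed logit shift (epsilon times the L1 norm of the gradient) is large enough to move the prediction away from the correct label; this is the same imperceptible-noise effect the abstract reports for voxel and multiview classifiers.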

Acknowledgment

We acknowledge support from NSF (#1617917, #1749833) and the MassTech Collaborative grant for funding the UMass GPU cluster.

Supplementary material

Supplementary material 1 (PDF, 5,139 KB)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jong-Chyi Su (1), corresponding author
  • Matheus Gadelha (1)
  • Rui Wang (1)
  • Subhransu Maji (1)

  1. University of Massachusetts, Amherst, USA