Occlusion Resistant Object Rotation Regression from Point Cloud Segments

  • Ge Gao
  • Mikko Lauri
  • Jianwei Zhang
  • Simone Frintrop
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11129)


Rotation estimation of known rigid objects is important for robotic applications such as dexterous manipulation. Most existing methods for rotation estimation use intermediate representations such as templates, global or local feature descriptors, or object coordinates, which require multiple steps to infer the object pose. We propose to directly regress a pose vector from point cloud segments using a convolutional neural network. Experimental results show that our method achieves competitive performance compared to a state-of-the-art method, while showing greater robustness to occlusion. Our method does not require any post-processing such as refinement with the iterative closest point algorithm.


Keywords: 6D pose estimation · Convolutional neural network · Point cloud · Lie algebra
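The Lie algebra keyword points at the rotation parameterisation typically used for direct regression: the network predicts a 3-vector in so(3), which the exponential map (Rodrigues' formula) turns into a rotation matrix. A minimal sketch of that mapping is below; the function name and the use of plain NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def so3_exp(w):
    """Map an axis-angle vector w in so(3) to a rotation matrix in SO(3)
    via Rodrigues' formula. A 3-vector like w is a common regression
    target for rotation networks (illustrative; not the paper's code)."""
    theta = np.linalg.norm(w)      # rotation angle in radians
    if theta < 1e-8:
        return np.eye(3)           # near-zero rotation: identity
    k = w / theta                  # unit rotation axis
    # Skew-symmetric cross-product matrix of the axis k
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# A 90-degree rotation about the z-axis rotates x onto y.
R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))
```

Because every 3-vector maps to a valid rotation, a network can output w unconstrained, avoiding the gimbal-lock issues of Euler angles and the unit-norm constraint of quaternions.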



This work was partially funded by the German Science Foundation (DFG) in project Crossmodal Learning, TRR 169.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Informatics, University of Hamburg, Hamburg, Germany
