Next Best View Planning via Reinforcement Learning for Scanning of Arbitrary 3D Shapes

Abstract

Reconstructing 3D objects from scanned measurements is a fundamental task in computer vision. A central factor in the effectiveness of 3D reconstruction is the selection of sensor views for scanning. This remains an open problem in 3D geometry processing, known as the next-best-view planning problem, and is commonly approached by combinatorial or greedy methods. In this work, we propose a reinforcement learning-based approach to sequential next-best-view planning. The method is implemented on top of an OpenAI Gym environment that includes 3D reconstruction, next-best-scan planning, and image acquisition features. We demonstrate that this method outperforms the baselines both in the number of required scans and in the accuracy of the resulting 3D mesh reconstruction.
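The paper's actual environment (ray-traced depth maps over CAD models) is not reproduced here; as a rough illustration of the Gym-style interface such a setup exposes, the sketch below uses an invented toy visibility model in which each candidate view observes a random subset of surface patches, and the reward for choosing a view is the number of newly observed patches. The names `NextBestViewEnv` and `fibonacci_sphere`, and all numeric parameters, are hypothetical and not taken from the paper.

```python
import math, random

def fibonacci_sphere(n):
    """Candidate camera positions evenly distributed on a unit sphere."""
    pts = []
    golden = math.pi * (3.0 - math.sqrt(5.0))  # golden angle
    for i in range(n):
        y = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - y * y)
        pts.append((math.cos(golden * i) * r, y, math.sin(golden * i) * r))
    return pts

class NextBestViewEnv:
    """Toy Gym-style environment: each action selects an unused candidate
    view; the reward is the number of newly observed surface patches."""
    def __init__(self, n_views=32, n_patches=200, seed=0):
        rng = random.Random(seed)
        self.views = fibonacci_sphere(n_views)
        # Invented visibility model: each view sees a random patch subset
        # (the paper instead ray-traces depth maps of the scanned shape).
        self.visible = [frozenset(rng.sample(range(n_patches), n_patches // 6))
                        for _ in range(n_views)]
        self.n_patches = n_patches
        self.reset()

    def reset(self):
        self.covered = set()   # patches observed so far
        self.used = set()      # views already taken
        return (frozenset(self.covered), frozenset(self.used))

    def step(self, action):
        assert action not in self.used
        gain = len(self.visible[action] - self.covered)
        self.covered |= self.visible[action]
        self.used.add(action)
        done = len(self.covered) >= 0.95 * self.n_patches
        return (frozenset(self.covered), frozenset(self.used)), float(gain), done, {}

# Greedy baseline: always pick the view with the largest immediate gain.
env = NextBestViewEnv()
obs, done, scans = env.reset(), False, 0
while not done and scans < len(env.views):
    gains = {a: len(env.visible[a] - env.covered)
             for a in range(len(env.views)) if a not in env.used}
    obs, reward, done, _ = env.step(max(gains, key=gains.get))
    scans += 1
print("scans:", scans, "coverage:", len(env.covered) / env.n_patches)
```

The greedy loop shown at the end plays the role of a baseline; in the paper's setting a trained RL policy would choose the action in its place, aiming to finish with fewer scans.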



Funding

The work was supported by the Ministry of Education and Science of the Russian Federation under grant no. 14.615.21.0004 (grant code RFMEFI61518X0004).

Author information

Corresponding author

Correspondence to E. V. Burnaev.

About this article

Cite this article

Potapova, S.G., Artemov, A.V., Sviridov, S.V. et al. Next Best View Planning via Reinforcement Learning for Scanning of Arbitrary 3D Shapes. J. Commun. Technol. Electron. 65, 1484–1490 (2020). https://doi.org/10.1134/S1064226920120141

Keywords:

  • 3D model
  • next best view
  • depth map
  • CAD model
  • reinforcement learning
  • mesh