Planar Pose Estimation Using Object Detection and Reinforcement Learning

Nørby Rasmussen, Frederik; Terp Andersen, Sebastian; Grossmann, Bjarne; Boukas, Evangelos; Nalpantidis, Lazaros

doi:10.1007/978-3-030-34995-0_32

Conference paper
First Online: 23 November 2019

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11754)

Abstract

Pose estimation concerns systems or models that determine the pose of a static object, in this case using vision. This paper approaches the problem with an active vision-based solution that integrates both perception and action in the same model. The problem is solved using a combination of neural networks for object detection and a reinforcement learning architecture for moving a camera and estimating the pose. A robotic implementation of the proposed active vision system is used for testing, with promising results. Experiments show that our approach not only solves the simple task of planar visual pose estimation, but also exhibits robustness to changes in the environment.
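The perception-action loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-dimensional discretised camera axis instead of a full planar pose, replaces the neural object detector with a hypothetical `detect_offset` function that returns the object's signed offset from the image centre, and uses tabular Q-learning in place of a deep reinforcement learning architecture. All names, rewards, and hyperparameters are illustrative assumptions.

```python
import random

N_CELLS = 7            # hypothetical number of discrete camera positions
ACTIONS = (-1, +1, 0)  # move left, move right, stop and report the pose


def detect_offset(camera_x, object_x):
    """Stand-in for an object detector: signed offset of the object from the image centre."""
    return object_x - camera_x


def step(camera_x, object_x, action):
    """Apply one action and return (new_camera_x, reward, done)."""
    if action == 0:  # stopping is rewarded only when the camera is centred on the object
        return camera_x, (1.0 if camera_x == object_x else -1.0), True
    new_x = min(max(camera_x + action, 0), N_CELLS - 1)
    return new_x, -0.05, False  # small cost per move favours short paths


def train(episodes=20000, alpha=0.2, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning where the state is the detector offset."""
    rng = random.Random(seed)
    q = {(o, a): 0.0 for o in range(-N_CELLS + 1, N_CELLS) for a in ACTIONS}
    for _ in range(episodes):
        object_x = rng.randrange(N_CELLS)
        camera_x = rng.randrange(N_CELLS)
        for _ in range(4 * N_CELLS):
            s = detect_offset(camera_x, object_x)
            if rng.random() < epsilon:  # explore
                a = rng.choice(ACTIONS)
            else:                       # exploit the current estimate
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            camera_x, r, done = step(camera_x, object_x, a)
            s2 = detect_offset(camera_x, object_x)
            target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            if done:
                break
    return q


def estimate_pose(q, camera_x, object_x, max_steps=4 * N_CELLS):
    """Follow the greedy policy; the camera position at 'stop' is the pose estimate."""
    for _ in range(max_steps):
        s = detect_offset(camera_x, object_x)
        a = max(ACTIONS, key=lambda act: q[(s, act)])
        camera_x, _, done = step(camera_x, object_x, a)
        if done:
            return camera_x
    return None  # the policy never chose to stop


if __name__ == "__main__":
    q_table = train()
    print(estimate_pose(q_table, camera_x=0, object_x=5))
```

In this sketch the agent never observes the object's position directly; it only sees the simulated detector output and has to learn both how to move the camera and when to stop, which mirrors the idea of integrating perception and action in one model.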

F. N. Rasmussen and S. T. Andersen contributed equally to this work.

Author information

Authors and Affiliations

Department of Materials and Production, Aalborg University, Copenhagen, Denmark
Frederik Nørby Rasmussen, Sebastian Terp Andersen & Bjarne Grossmann
Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
Evangelos Boukas & Lazaros Nalpantidis

Corresponding author

Correspondence to Lazaros Nalpantidis.

Editor information

Editors and Affiliations

Centre for Research and Technology Hellas (CERTH-ITI), Thessaloniki, Greece
Dimitrios Tzovaras
Centre for Research and Technology Hellas (CERTH-ITI), Thessaloniki, Greece
Dimitrios Giakoumis
Vienna University of Technology, Vienna, Austria
Markus Vincze
Foundation for Research and Technology Hellas (FORTH), Heraklion, Greece
Antonis Argyros

About this paper

Cite this paper

Nørby Rasmussen, F., Terp Andersen, S., Grossmann, B., Boukas, E., Nalpantidis, L. (2019). Planar Pose Estimation Using Object Detection and Reinforcement Learning. In: Tzovaras, D., Giakoumis, D., Vincze, M., Argyros, A. (eds) Computer Vision Systems. ICVS 2019. Lecture Notes in Computer Science, vol 11754. Springer, Cham. https://doi.org/10.1007/978-3-030-34995-0_32

DOI: https://doi.org/10.1007/978-3-030-34995-0_32
Published: 23 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34994-3
Online ISBN: 978-3-030-34995-0
eBook Packages: Computer Science, Computer Science (R0)
