A Hybrid Deep Reinforcement Learning Algorithm for Intelligent Manipulation

  • Chao Ma
  • Jianfei Li
  • Jie Bai
  • Yaobing Wang (corresponding author)
  • Bin Liu
  • Jing Sun
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11743)

Abstract

Conventional collaborative robots can solve complex problems through programmed behaviors, but many of today's tasks are varied and non-repetitive and therefore cannot be handled by conventional programming methods. Deep reinforcement learning provides a framework for solving robotic control tasks with machine learning techniques. However, existing model-free deep reinforcement learning algorithms lack a unified framework for trading off sample efficiency against final performance. In this paper, a hybrid deep reinforcement learning framework, together with its application to robot control, is proposed on the basis of existing model-free deep reinforcement learning algorithms. In the acting process, distributed actors interacting with the environment acquire the data, while prior actors address the algorithm's cold-start problem. In the learning process, prioritized experience replay and multi-step learning are designed to improve the final performance. Simulations demonstrate the practicality and potential of the proposed algorithm. The results show that the hybrid deep reinforcement learning algorithm proposed in this paper achieves a significant improvement in final performance and sample efficiency while ensuring stability and convergence.
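The learning-process components named above can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the proportional-priority sampling scheme, and the truncated n-step return below are assumptions based on the standard formulations of prioritized experience replay and multi-step learning that the abstract references.

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay: transitions are sampled
    with probability proportional to priority**alpha."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.data = []        # stored transitions
        self.priorities = []  # priority**alpha per transition
        self.pos = 0          # ring-buffer write index

    def add(self, transition, priority=1.0):
        # Overwrite the oldest transition once the buffer is full.
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority ** self.alpha)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority ** self.alpha
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sample indices in proportion to stored priorities.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        return [self.data[i] for i in idx], idx

    def update_priorities(self, indices, new_priorities):
        # Called after a learning step with the fresh TD errors.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p ** self.alpha


def n_step_return(rewards, gamma, n, bootstrap_value):
    """Truncated n-step return:
    G = r_0 + gamma*r_1 + ... + gamma^(n-1)*r_{n-1} + gamma^n * V(s_n)."""
    steps = min(n, len(rewards))
    g = sum((gamma ** k) * rewards[k] for k in range(steps))
    return g + (gamma ** steps) * bootstrap_value
```

In a distributed setup of the kind the abstract describes, the actors would push transitions into such a buffer while the learner samples prioritized batches and updates priorities from the resulting TD errors.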

Keywords

Deep reinforcement learning · Robot control · Data flow · Hybrid deep reinforcement learning

Notes

Acknowledgment

This research was supported, in part, by the National Natural Science Foundation of China (No. 51875393) and by the China Advance Research for Manned Space Project (No. 030601).


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Chao Ma (1)
  • Jianfei Li (1)
  • Jie Bai (1)
  • Yaobing Wang (1) (corresponding author)
  • Bin Liu (1)
  • Jing Sun (1)
  1. Beijing Key Laboratory of Intelligent Space Robotic Systems Technology and Applications, Beijing Institute of Spacecraft System Engineering, Beijing, China
