
Accelerating Deep Q Network by Weighting Experiences

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11301)


Abstract

Deep Q Network (DQN) is a reinforcement learning methodology that uses deep neural networks to approximate the Q-function. The literature shows that DQN can select better responses than humans. However, DQN requires a long time to learn appropriate actions from tuples of state, action, reward, and next state, called "experiences", sampled from its memory. DQN samples experiences uniformly at random, but their distribution is skewed, which slows learning: frequent experiences are sampled redundantly while infrequent ones are rarely drawn. This work mitigates the problem by weighting experiences according to their frequency and manipulating their sampling probabilities. In a video game environment, the proposed method learned appropriate responses faster than DQN.
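The abstract does not spell out the exact weighting scheme, so the following is only a minimal sketch of the general idea, assuming a replay memory that counts how often (a discretised version of) each state has been seen and samples stored transitions with probability inversely proportional to that count, so that rare experiences are drawn more often than redundant ones. All names here (WeightedReplayBuffer, _bucket, bucket_size) are hypothetical and are not taken from the authors' implementation.

```python
# Illustrative sketch only: the paper weights experiences by their frequency
# and manipulates their sampling probability; the exact scheme is not given in
# the abstract, so this buffer simply samples each stored transition with
# probability inversely proportional to how often its (discretised) state has
# been seen. Names such as WeightedReplayBuffer and _bucket are hypothetical.

from collections import Counter, deque

import numpy as np


class WeightedReplayBuffer:
    """Replay memory that favours infrequent experiences over frequent ones."""

    def __init__(self, capacity=10000, bucket_size=0.1):
        self.buffer = deque(maxlen=capacity)
        self.counts = Counter()          # how often each state bucket was seen
        self.bucket_size = bucket_size   # coarseness of the state discretisation

    def _bucket(self, state):
        # Discretise a continuous state so that its "frequency" is countable.
        return tuple(np.round(np.asarray(state) / self.bucket_size).astype(int))

    def add(self, state, action, reward, next_state, done):
        key = self._bucket(state)
        self.counts[key] += 1
        self.buffer.append((state, action, reward, next_state, done, key))

    def sample(self, batch_size):
        # Weight each experience by 1 / frequency of its state bucket, so
        # redundant experiences are picked less often than rare ones.
        weights = np.array([1.0 / self.counts[e[5]] for e in self.buffer])
        probs = weights / weights.sum()
        idx = np.random.choice(len(self.buffer), size=batch_size, p=probs)
        batch = [self.buffer[i] for i in idx]
        states, actions, rewards, next_states, dones, _ = zip(*batch)
        return (np.array(states), np.array(actions), np.array(rewards),
                np.array(next_states), np.array(dones))


if __name__ == "__main__":
    buf = WeightedReplayBuffer(capacity=1000)
    # Fill the buffer with many copies of a "frequent" state and one rare state.
    for _ in range(500):
        buf.add([0.0, 0.0], 0, 0.0, [0.0, 0.0], False)
    buf.add([5.0, 5.0], 1, 1.0, [5.0, 5.0], True)
    states, *_ = buf.sample(32)
    # The rare state should appear far more often than its 1/501 share.
    print("rare-state fraction in batch:", float((states[:, 0] == 5.0).mean()))
```

A DQN training loop would swap its uniform replay memory for such a buffer and otherwise stay unchanged; running the script shows the rare experience appearing in a mini-batch far more often than its raw share of the memory would suggest.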



Author information

Corresponding author

Correspondence to Kazuhiro Murakami.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Murakami, K., Moriyama, K., Mutoh, A., Matsui, T., Inuzuka, N. (2018). Accelerating Deep Q Network by Weighting Experiences. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11301. Springer, Cham. https://doi.org/10.1007/978-3-030-04167-0_19

  • DOI: https://doi.org/10.1007/978-3-030-04167-0_19

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04166-3

  • Online ISBN: 978-3-030-04167-0

  • eBook Packages: Computer Science, Computer Science (R0)
