Abstract
In real-world environments, rewards are often sparse because the state space is huge. Reinforcement learning agents must acquire exploration skills to obtain rewards in such environments. Curiosity, defined as an internally generated reward for state-prediction error, can encourage agents to explore. However, when a robot learns its policy by reinforcement learning, abrupt changes in the policy's outputs cause jerky motion due to inertia. This jerkiness prevents the state prediction from converging, which in turn makes policy learning unstable. In this paper, we propose Arbitrable Intrinsically Motivated Exploration (AIME), which enables robots to learn curiosity-based exploration stably. AIME uses the Accumulator Based Arbitration Model (ABAM), an ensemble learning method inspired by the prefrontal cortex that we previously proposed. ABAM adjusts motor controls to improve the stability of both reward generation and reinforcement learning. In experiments, we show that a robot equipped with AIME can explore a simulated environment that provides no extrinsic reward.
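The curiosity signal described above can be sketched as a reward proportional to the error of a learned forward model's state prediction. This is a minimal illustration, not the paper's implementation: the function names and the scaling factor `eta` are assumptions introduced here for clarity.

```python
# Minimal sketch (illustrative, not the paper's code): curiosity as an
# internally generated reward equal to a scaled state-prediction error.

def prediction_error(predicted, actual):
    """Squared error between the predicted and observed next state."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

def intrinsic_reward(predicted_next, actual_next, eta=0.5):
    """Curiosity bonus: scaled state-prediction error (eta is assumed)."""
    return eta * prediction_error(predicted_next, actual_next)

# A perfectly predicted transition yields no curiosity reward,
# while a surprising (poorly predicted) one yields a positive bonus,
# pushing the agent toward states its forward model cannot yet predict.
r_known = intrinsic_reward([0.0, 0.0], [0.0, 0.0])
r_novel = intrinsic_reward([0.0, 0.0], [1.0, 1.0])
```

In practice the forward model is trained online, so this bonus shrinks for familiar transitions; the jerky motion discussed above is harmful precisely because it keeps these prediction errors from settling.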
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Seno, T., Osawa, M., Imai, M. (2019). An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration. In: Samsonovich, A. (eds) Biologically Inspired Cognitive Architectures 2018. BICA 2018. Advances in Intelligent Systems and Computing, vol 848. Springer, Cham. https://doi.org/10.1007/978-3-319-99316-4_37
DOI: https://doi.org/10.1007/978-3-319-99316-4_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99315-7
Online ISBN: 978-3-319-99316-4
eBook Packages: Intelligent Technologies and Robotics (R0)