Flexibility of Emulation Learning from Pioneers in Nonstationary Environments

  • Conference paper
Advances in Artificial Intelligence (JSAI 2019)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1128))

Abstract

This is an extended version of a paper selected from JSAI 2019. Social learning is crucial to the acquisition of intelligent behaviors in humans and many kinds of animals, as it makes behavior learning far more efficient than pure trial-and-error. In imitation learning, a representative form of social learning, the agent observes specific state-action pair sequences produced by another agent (the expert) and reflects them in its own actions. One of its implementations in reinforcement learning is inverse reinforcement learning. We propose another form of social learning, emulation learning, which requires much less information about the other agent (the pioneer). In emulation learning, the agent is given only a certain achievement level attained by the other agent, i.e., a record. In this study, we implement emulation learning in the reinforcement learning setting by applying a model of satisficing action policy. We show that the emulation learning algorithm works well in both stationary and non-stationary reinforcement learning tasks, breaking the often-observed trade-off-like relationship between efficiency and flexibility.
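The key idea above, that a scalar record can stand in for the expert's full trajectories, can be illustrated with a minimal satisficing bandit sketch. This is not the authors' exact algorithm (which builds on a cognitive satisficing value function); it is a simplified illustration in which the aspiration level `aleph` plays the role of the pioneer's record, and the agent exploits any action whose estimated value meets that level, exploring otherwise.

```python
import random

class SatisficingAgent:
    """Illustrative satisficing bandit agent (simplified sketch).

    `aleph` is the aspiration level, here standing in for the
    pioneer's record: the agent receives only this scalar, not the
    pioneer's state-action trajectories.
    """

    def __init__(self, n_arms, aleph):
        self.aleph = aleph
        self.q = [0.0] * n_arms  # running value estimate per arm
        self.n = [0] * n_arms    # pull count per arm

    def select(self):
        # Exploit the best arm whose estimate meets the aspiration
        # level; if none does, explore the least-tried arm.
        satisfying = [i for i, q in enumerate(self.q) if q >= self.aleph]
        if satisfying:
            return max(satisfying, key=lambda i: self.q[i])
        return min(range(len(self.q)), key=lambda i: self.n[i])

    def update(self, arm, reward):
        # Incremental sample-average update of the value estimate.
        self.n[arm] += 1
        self.q[arm] += (reward - self.q[arm]) / self.n[arm]

# Two-armed Bernoulli bandit: arm 1 pays off with p=0.8, arm 0 with p=0.3.
random.seed(0)
probs = [0.3, 0.8]
agent = SatisficingAgent(n_arms=2, aleph=0.7)  # record set between the arms
for _ in range(2000):
    a = agent.select()
    agent.update(a, 1.0 if random.random() < probs[a] else 0.0)
print(agent.n)  # the arm that can sustain the record dominates the pulls
```

Because only arm 1 can sustain an average payoff above the record of 0.7, the agent's exploration concentrates on it without ever observing the pioneer's actions; with a record that no arm can meet, the same rule keeps the agent exploring, which hints at the flexibility in non-stationary tasks discussed in the paper.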



Acknowledgments

This work was supported by JSPS KAKENHI Grant Number 17H04696.

Author information


Correspondence to Tatsuji Takahashi.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Shinriki, M., Wakabayashi, H., Kono, Y., Takahashi, T. (2020). Flexibility of Emulation Learning from Pioneers in Nonstationary Environments. In: Ohsawa, Y., et al. Advances in Artificial Intelligence. JSAI 2019. Advances in Intelligent Systems and Computing, vol 1128. Springer, Cham. https://doi.org/10.1007/978-3-030-39878-1_9
