Abstract
For a robot to learn long and complicated action sequences in the real world, autonomous learning of multi-step discrete state transitions is essential. Realizing multi-step discrete state transitions in a neural network is generally considered difficult, because the network must hold a state while also performing transitions between states when needed. In this paper, it is shown that multi-step discrete state transitions emerge in a continuous state-action space purely through reinforcement learning with rewards and punishments, using a simple learning system consisting of a recurrent neural network (RNN). In a two-switch task, a two-state transition, represented by two types of hidden nodes, emerged through learning. In addition, it is shown that the dynamics arising from the interaction between the RNN and the environment, grounded in the discrete state transitions, leads to an interesting repetitive behavior when no reward is given at the goal.
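The setup the abstract describes can be illustrated with a minimal sketch, under our own assumptions rather than the authors' exact architecture: an Elman-type RNN whose recurrent hidden state lets an actor-critic agent hold an internal state across time steps while producing a continuous action and a value estimate, updated by a TD error derived from rewards and punishments. All names, sizes, and learning rates below are illustrative.

```python
import numpy as np

# Hypothetical sketch (not the paper's exact system): an Elman-type RNN
# with a feedback hidden state, driving an actor output (continuous action)
# and a critic output (state value), trained by a TD error.
rng = np.random.default_rng(0)

N_IN, N_HID = 3, 8                        # sensor inputs, hidden/context nodes
W_in  = rng.normal(0, 0.5, (N_HID, N_IN))
W_rec = rng.normal(0, 0.5, (N_HID, N_HID))
w_actor  = rng.normal(0, 0.5, N_HID)      # weights for the continuous action
w_critic = rng.normal(0, 0.5, N_HID)      # weights for the state-value output

def step(x, h_prev):
    """One RNN step: the hidden state depends on both the current input
    and the previous hidden state, which is what allows a discrete internal
    state to be held between transitions."""
    h = np.tanh(W_in @ x + W_rec @ h_prev)
    action = np.tanh(w_actor @ h)         # continuous action in [-1, 1]
    value = w_critic @ h                  # critic's state-value estimate
    return h, action, value

def td_update(x, h_prev, r, v_next, gamma=0.9, lr=0.1):
    """One actor-critic style update; for brevity only the critic weights
    are adjusted here, using the TD error from reward/punishment r."""
    global w_critic
    h, action, v = step(x, h_prev)
    td_err = r + gamma * v_next - v
    w_critic = w_critic + lr * td_err * h
    return h, action, td_err
```

In a task like the two-switch task, the agent would run `step` at every time step, carrying `h` forward so that which switch has already been pressed can be encoded in the hidden state, and call `td_update` when a reward or punishment arrives.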
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Samsudin, M.F., Sawatsubashi, Y., Shibata, K. (2012). Emergence of Multi-step Discrete State Transition through Reinforcement Learning with a Recurrent Neural Network. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7664. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34481-7_71
Print ISBN: 978-3-642-34480-0
Online ISBN: 978-3-642-34481-7