Learning a DFT-based sequence with reinforcement learning: a NAO implementation
Implementing sequence learning on robotic platforms poses several challenges. Deciding when to stop one action and move on to the next requires balancing the stability of sensory information against knowledge of which action is required next. The work presented here proposes a starting point for the successful execution and learning of dynamic sequences. Using the NAO humanoid platform, we propose a mathematical model based on dynamic field theory and reinforcement learning methods for obtaining and performing a sequence of elementary motor behaviors. We compare two reinforcement learning methods applied to sequence generation and report results for both simulation and implementation on the robot.
Keywords: sequences, neural dynamics, reinforcement learning, humanoid