Paladyn, Volume 3, Issue 4, pp 181–187

Learning a DFT-based sequence with reinforcement learning: a NAO implementation

Research Article

Abstract

The implementation of sequence learning on robotic platforms poses several challenges. Deciding when to stop one action and move on to the next requires a balance between the stability of sensory information and knowledge of which action is required next. The work presented here proposes a starting point for the successful execution and learning of dynamic sequences. Using the NAO humanoid platform, we propose a mathematical model based on dynamic field theory and reinforcement learning for acquiring and performing a sequence of elementary motor behaviors. Results from a comparison of two reinforcement learning methods applied to sequence generation, for both simulation and the robot implementation, are provided.
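To make the model described above more concrete, the sketch below pairs a one-dimensional Amari-type dynamic neural field, the standard building block of dynamic field theory, with a tabular Q-learning update that selects the next elementary motor behavior. This is a minimal illustration under stated assumptions: the function names, parameter values, Euler integration, and the choice of Q-learning are assumptions for illustration only, and the abstract does not name the two reinforcement learning methods the paper actually compares.

    import numpy as np

    # Amari-type dynamic neural field (1-D), integrated with a forward Euler step:
    #   tau * du/dt = -u + h + s(x, t) + integral w(x - x') f(u(x', t)) dx'
    def field_step(u, stimulus, kernel, h=-5.0, tau=10.0, dt=1.0):
        f = 1.0 / (1.0 + np.exp(-u))                  # sigmoidal output nonlinearity
        interaction = np.convolve(f, kernel, mode="same")
        return u + (dt / tau) * (-u + h + stimulus + interaction)

    # Hypothetical tabular Q-learning update for ordering elementary motor
    # behaviors: once the active field peak destabilizes, the learned Q-values
    # would choose which behavior to activate next.
    def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
        target = reward + gamma * np.max(Q[next_state])
        Q[state, action] += alpha * (target - Q[state, action])
        return Q

In this kind of architecture the field dynamics keep the currently executing behavior stable against sensory fluctuations, while the reinforcement-learning component learns the serial order of the behaviors.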

Keywords

sequences, neural dynamics, reinforcement learning, humanoid

Copyright information

© Versita Warsaw and Springer-Verlag Wien 2013

Authors and Affiliations

  1. Interaction Lab, University of Skövde, Skövde, Sweden
