Volume 3, Issue 4, pp 181–187

Learning a DFT-based sequence with reinforcement learning: a NAO implementation

  • Boris Durán
  • Gauss Lee
  • Robert Lowe
Research Article


The implementation of sequence learning on robotic platforms offers several challenges. Deciding when to stop one action and continue to the next requires a balance between the stability of sensory information and knowledge of which action is required next. The work presented here proposes a starting point for the successful execution and learning of dynamic sequences. Using the NAO humanoid platform, we propose a mathematical model based on dynamic field theory and reinforcement learning for acquiring and performing a sequence of elementary motor behaviors. Results comparing two reinforcement learning methods applied to sequence generation, in both simulation and on the physical robot, are provided.
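The dynamic field theory component of such a model builds on Amari-style neural fields, in which a localized input drives a self-stabilizing peak of activation that can signal "the current action is ready". As a minimal illustration of that mechanism (not the paper's actual model; all parameter values and function names here are illustrative assumptions), a one-dimensional field can be simulated by Euler integration of the field equation tau * du/dt = -u + h + S(x) + conv(w, f(u)):

```python
import numpy as np

def sigmoid(u, beta=4.0):
    # Sigmoidal output nonlinearity f(u) of the field.
    return 1.0 / (1.0 + np.exp(-beta * u))

def interaction_kernel(n, dx, c_exc=2.0, sigma_exc=0.5, c_inh=0.5):
    # Lateral interaction w(x): local excitation minus global inhibition.
    xs = (np.arange(n) - n // 2) * dx
    return c_exc * np.exp(-xs**2 / (2.0 * sigma_exc**2)) - c_inh

def step_field(u, stim, kernel, h=-2.0, tau=10.0, dx=0.1):
    # One Euler step of tau * du/dt = -u + h + S(x) + conv(w, f(u)).
    interaction = np.convolve(sigmoid(u), kernel, mode="same") * dx
    return u + (-u + h + stim + interaction) / tau

# A localized Gaussian stimulus at x = 5 drives a localized activation peak.
n, dx = 101, 0.1
xs = np.arange(n) * dx
u = np.full(n, -2.0)                        # field starts at resting level h
stim = 4.0 * np.exp(-(xs - 5.0)**2 / 0.5)   # Gaussian input centered at x = 5
kernel = interaction_kernel(n, dx)
for _ in range(300):
    u = step_field(u, stim, kernel, dx=dx)
peak_location = xs[np.argmax(u)]
```

After relaxation the field is above threshold only near the stimulus center, while global inhibition keeps the rest of the field suppressed; instabilities of such peaks are what drive transitions between sequence elements in DFT accounts of serial order.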


Keywords: sequences · neural dynamics · reinforcement learning · humanoid





Copyright information

© Versita Warsaw and Springer-Verlag Wien 2013

Authors and Affiliations

  1. Interaction Lab, University of Skövde, Skövde, Sweden
