Learning a DFT-based sequence with reinforcement learning: a NAO implementation

Abstract

Implementing sequence learning on robotic platforms poses several challenges. Deciding when to stop one action and move on to the next requires balancing the stability of sensory information against knowledge of which action is required next. The work presented here proposes a starting point for the successful execution and learning of dynamic sequences. Using the NAO humanoid platform, we propose a mathematical model based on dynamic field theory and reinforcement learning methods for acquiring and performing a sequence of elementary motor behaviors. We report results comparing two reinforcement learning methods applied to sequence generation, for both simulation and implementation.

Author information

Corresponding author

Correspondence to Boris Durán.

About this article

Cite this article

Durán, B., Lee, G. & Lowe, R. Learning a DFT-based sequence with reinforcement learning: a NAO implementation. Paladyn 3, 181–187 (2012). https://doi.org/10.2478/s13230-013-0109-5

Keywords

  • sequences
  • neural dynamics
  • reinforcement learning
  • humanoid