Learning a DFT-based sequence with reinforcement learning: a NAO implementation

Durán, Boris; Lee, Gauss; Lowe, Robert

doi:10.2478/s13230-013-0109-5

Research Article
Published: 25 April 2013

Paladyn, Volume 3, pages 181–187 (2012)

Abstract

The implementation of sequence learning in robotic platforms poses several challenges. Deciding when to stop one action and continue to the next requires a balance between the stability of sensory information and knowledge of which action is required next. The work presented here proposes a starting point for the successful execution and learning of dynamic sequences. Using the NAO humanoid platform, we propose a mathematical model based on dynamic field theory and reinforcement learning methods for obtaining and performing a sequence of elementary motor behaviors. Results from the comparison of two reinforcement learning methods applied to sequence generation, in both simulation and implementation, are provided.

Author information

Authors and Affiliations

Interaction Lab, University of Skövde, Skövde, Sweden
Boris Durán, Gauss Lee & Robert Lowe

Corresponding author

Correspondence to Boris Durán.

About this article

Cite this article

Durán, B., Lee, G. & Lowe, R. Learning a DFT-based sequence with reinforcement learning: a NAO implementation. Paladyn 3, 181–187 (2012). https://doi.org/10.2478/s13230-013-0109-5

Received: 18 December 2012
Accepted: 27 March 2013
Published: 25 April 2013
Issue Date: December 2012
DOI: https://doi.org/10.2478/s13230-013-0109-5
