Abstract
Reinforcement learning is one of the main adaptive mechanisms that is both well documented in animal behaviour and has given rise to computational studies in animats and robots. In this paper, we present TeXDYNA, an algorithm designed to solve large reinforcement learning problems with unknown structure by integrating the hierarchical abstraction techniques of Hierarchical Reinforcement Learning with the factorization techniques of Factored Reinforcement Learning. We validate our approach on the LIGHT BOX problem.
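TeXDYNA's name points to the DYNA architecture it builds on, in which an agent interleaves learning from real experience with planning over a learned model of the environment. As background, a minimal tabular DYNA-Q sketch on a hypothetical toy chain problem (not the paper's LIGHT BOX domain, and without the hierarchical or factored machinery TeXDYNA adds) could look like:

```python
import random

# Toy deterministic chain MDP: states 0..4, actions 0 (left) / 1 (right);
# reaching state 4 yields reward 1 and ends the episode.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    model = {}  # (s, a) -> (r, s2): learned deterministic model
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # Direct RL update from the real transition
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            model[(s, a)] = (r, s2)
            # Planning: replay simulated transitions drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
            s = s2
    return Q

Q = dyna_q()
# Greedy policy over the learned Q-values: 1 (right) moves toward the goal.
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(GOAL)]
```

The planning loop is what makes DYNA sample-efficient: each real transition is reused many times through the model. TeXDYNA replaces this flat tabular model with a factored one and organizes the policy hierarchically, but the learn/plan interleaving is the same principle.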
This work was funded by CIFRE convention - 1032/2006.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Kozlova, O., Sigaud, O., Meyer, C. (2010). TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs. In: Doncieux, S., Girard, B., Guillot, A., Hallam, J., Meyer, J.A., Mouret, J.B. (eds) From Animals to Animats 11. SAB 2010. Lecture Notes in Computer Science, vol 6226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15193-4_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15192-7
Online ISBN: 978-3-642-15193-4