Abstract
In this paper we describe the Trajectory Tree, or TTree, algorithm. TTree uses a small set of supplied policies to help solve a Semi-Markov Decision Problem (SMDP). The algorithm uses a learned, tree-based discretization of the state space as an abstract state description, and both user-supplied and auto-generated policies as temporally abstract actions. It uses a generative model of the world to sample the transition function of the abstract SMDP defined by those state and temporal abstractions, and then finds a policy for that abstract SMDP. This abstract policy can then be mapped back to a policy for the base SMDP, solving the supplied problem. We present the TTree algorithm and give empirical comparisons to other SMDP algorithms demonstrating its effectiveness.
This research was sponsored by the United States Air Force under Agreement Nos. F30602-00-2-0549 and F30602-98-2-0135. The content of this publication does not necessarily reflect the position of the funding agencies and no official endorsement should be inferred.
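The high-level loop in the abstract — sample abstract-SMDP transitions with a generative model, then solve the abstract problem over the supplied policies — can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's actual TTree algorithm: the state abstraction is passed in as a plain function rather than a learned tree, the function and variable names (`solve_abstract_smdp`, `sample_step`) are invented for this sketch, and the toy chain domain below is purely illustrative.

```python
import random
from collections import defaultdict

def solve_abstract_smdp(abstract, policies, sample_step, states,
                        n_samples=200, gamma=0.9, iters=100, seed=0):
    """Estimate the abstract SMDP's transition and reward model by sampling
    each supplied policy from each state with a generative model, then solve
    the abstract SMDP by value iteration.
    Returns a map: abstract state -> index of the best supplied policy."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(int))  # (a, pi) -> next a -> count
    rewards = defaultdict(float)                    # (a, pi) -> summed reward
    totals = defaultdict(int)                       # (a, pi) -> sample count
    for s in states:
        a = abstract(s)
        for pi_idx, pi in enumerate(policies):
            for _ in range(n_samples):
                s2, r = sample_step(s, pi, rng)     # generative-model call
                counts[(a, pi_idx)][abstract(s2)] += 1
                rewards[(a, pi_idx)] += r
                totals[(a, pi_idx)] += 1
    abstract_states = {abstract(s) for s in states}
    V = {a: 0.0 for a in abstract_states}
    best = {}
    for _ in range(iters):                          # value iteration
        for a in abstract_states:
            q = {}
            for pi_idx in range(len(policies)):
                n = totals[(a, pi_idx)]
                if n == 0:
                    continue
                exp_r = rewards[(a, pi_idx)] / n
                exp_v = sum(c / n * V[a2]
                            for a2, c in counts[(a, pi_idx)].items())
                q[pi_idx] = exp_r + gamma * exp_v
            if q:
                best[a] = max(q, key=q.get)
                V[a] = q[best[a]]
    return best

# Toy usage: a 6-state chain with reward on reaching state 5, and two
# trivial "policies" that move one step left or right. With the identity
# abstraction, the solver should prefer "right" (index 1) everywhere.
def step(s, pi, rng):
    s2 = max(0, min(5, s + (1 if pi == "right" else -1)))
    return s2, (1.0 if s2 == 5 else 0.0)

policy = solve_abstract_smdp(abstract=lambda s: s,
                             policies=["left", "right"],
                             sample_step=step,
                             states=range(6))
```

The resulting `policy` maps each abstract state to a supplied policy index; in the base SMDP, one would then execute the chosen policy while within that abstract state, which is the mapping-back step the abstract describes.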
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Uther, W.T.B., Veloso, M.M. (2002). TTree: Tree-Based State Generalization with Temporally Abstract Actions. In: Koenig, S., Holte, R.C. (eds) Abstraction, Reformulation, and Approximation. SARA 2002. Lecture Notes in Computer Science, vol 2371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45622-8_24
DOI: https://doi.org/10.1007/3-540-45622-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43941-7
Online ISBN: 978-3-540-45622-3