Symbolic Bounded Real-Time Dynamic Programming

Delgado, Karina Valdivia; Fang, Cheng; Sanner, Scott; de Barros, Leliane Nunes

doi:10.1007/978-3-642-16138-4_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6404)

Included in the following conference series:

Brazilian Symposium on Artificial Intelligence

Abstract

Real-time dynamic programming (RTDP) solves Markov decision processes (MDPs) when the initial state is restricted. By visiting (and updating) only a fraction of the state space, this approach can be used to solve problems with intractably large state spaces. To improve the performance of RTDP, a variant based on symbolic representation, named sRTDP, was proposed. Traditional RTDP approaches work best on problems with sparse transition matrices, where they can often efficiently achieve ε-convergence without visiting all states; however, on problems with dense transition matrices, where most states are reachable in one step, the sRTDP approach shows an advantage over traditional RTDP of up to three orders of magnitude, as we demonstrate in this paper. We also specify a new variant of sRTDP based on BRTDP, named sBRTDP, which converges quickly compared to RTDP variants, since it performs fewer updates by making a better choice of the next state to be visited.
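The idea of updating only the states met along trials from the initial state can be illustrated with a minimal sketch of plain (enumerative) RTDP. This is not the symbolic sRTDP or sBRTDP of the paper. The toy corridor problem, the action names, the probabilities, and the unit costs below are all assumptions chosen for illustration, as is the cost-minimizing formulation with values initialized to zero.

```python
import random

# Hypothetical toy problem (an assumption, not from the paper): a corridor
# with N_STATES nominal states, start state 0 and an absorbing goal state.
N_STATES = 1000
START, GOAL = 0, 10

# Each action costs 1 and either advances one state or stays put.
# ACTIONS maps an (invented) action name to its probability of advancing.
ACTIONS = {"reliable": 0.9, "unreliable": 0.5}


def transitions(state, action):
    """Return [(probability, next_state), ...] for a non-goal state."""
    p = ACTIONS[action]
    return [(p, state + 1), (1.0 - p, state)]


def q_value(values, state, action):
    """One-step lookahead cost of taking `action` in `state`."""
    expected = sum(p * values.get(nxt, 0.0) for p, nxt in transitions(state, action))
    return 1.0 + expected


def rtdp(num_trials, seed=0):
    """Run RTDP trials from START and return the sparse value table."""
    rng = random.Random(seed)
    values = {}  # only visited states get an entry; all others default to 0
    for _ in range(num_trials):
        state = START
        while state != GOAL:
            # Bellman backup at the current state only.
            best = min(ACTIONS, key=lambda a: q_value(values, state, a))
            values[state] = q_value(values, state, best)
            # Sample the successor under the greedy action.
            probs, successors = zip(*transitions(state, best))
            state = rng.choices(successors, weights=probs)[0]
    return values


values = rtdp(num_trials=200)
print(f"backed-up states: {len(values)} of {N_STATES}")
print(f"V(start) = {values[START]:.4f}")
```

In this assumed problem only the ten states between the start and the goal ever receive a backup, and the value at the start approaches the optimal expected cost of 10/0.9 (about 11.11). The transition structure here is sparse, which is the setting where the abstract says traditional RTDP works best.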

Author information

Authors and Affiliations

University of São Paulo, São Paulo, Brazil
Karina Valdivia Delgado & Leliane Nunes de Barros
University of Sydney, Sydney, Australia
Cheng Fang
National ICT Australia, Canberra, Australia
Scott Sanner

Editor information

Editors and Affiliations

FURG, Centro de Ciências Computacionais, Universidade Federal do Rio Grande, Av. Itália, km 8 – Campus Carreiros, 96.201-900, Rio Grande, RS, Brazil
Antônio Carlos da Rocha Costa
UFRGS, Instituto de Informática, Universidade Federal do Rio Grande do Sul, Av. Bento Gonçalves 9.500, 91501-970, Porto Alegre, RS, Brazil
Rosa Maria Vicari
Departamento de Ciência da Computação, Centro Universitário da FEI, Av. Humberto A. C. Branco 3972, 09850-901, São Bernardo do Campo, SP, Brazil
Flavio Tonidandel

About this paper

Cite this paper

Delgado, K.V., Fang, C., Sanner, S., de Barros, L.N. (2010). Symbolic Bounded Real-Time Dynamic Programming. In: da Rocha Costa, A.C., Vicari, R.M., Tonidandel, F. (eds) Advances in Artificial Intelligence – SBIA 2010. SBIA 2010. Lecture Notes in Computer Science, vol 6404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16138-4_20

DOI: https://doi.org/10.1007/978-3-642-16138-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16137-7
Online ISBN: 978-3-642-16138-4
eBook Packages: Computer Science, Computer Science (R0)