Adaptive Real-Time Dynamic Programming

Barto, Andrew G.

doi:10.1007/978-1-4899-7502-7_5-1

Synonyms

ARTDP

Definition

Adaptive Real-Time Dynamic Programming (ARTDP) is an algorithm that allows an agent to improve its behavior while interacting over time with an incompletely known dynamic environment. It can also be viewed as a heuristic search algorithm for finding shortest paths in incompletely known stochastic domains. ARTDP is based on Dynamic Programming (DP), but whereas conventional DP consists of off-line algorithms, ARTDP is an on-line algorithm because it uses the agent's behavior to guide its computation. ARTDP is adaptive because it does not need a complete and accurate model of the environment; instead, it learns a model from data collected during agent–environment interaction. When a good model is available, Real-Time Dynamic Programming (RTDP), which is ARTDP without the model-learning component, is applicable.
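
The definition above can be illustrated with a minimal Python sketch. It is not taken from the entry and makes several assumptions: the function names `artdp` and `chain_step`, the `step(state, action, rng)` environment interface, the five-state corridor, the zero initial cost-to-go estimates, and the epsilon-greedy exploration rule are all hypothetical choices made for illustration. The sketch keeps the three ingredients the definition names: a model learned from observed transitions, a DP backup applied only at the state the agent currently occupies, and action selection guided by the current estimates.

```python
import random
from collections import defaultdict


def artdp(step, actions, start, goal, episodes=200, max_steps=100,
          epsilon=0.1, seed=0):
    """Illustrative ARTDP-style loop for an undiscounted shortest-path task.

    step(state, action, rng) -> (next_state, cost) is a hypothetical
    environment interface; the agent never inspects its internals.
    """
    rng = random.Random(seed)
    V = defaultdict(float)                          # cost-to-go estimates, start at 0
    counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {next state: visits}
    cost_sum = defaultdict(float)                   # (s, a) -> total observed cost

    def q(s, a):
        # Expected cost of (s, a) under the learned maximum-likelihood model.
        n = sum(counts[(s, a)].values())
        if n == 0:
            return 0.0  # assumption: untried actions look optimistic
        expected_next = sum(c * V[s2] for s2, c in counts[(s, a)].items()) / n
        return cost_sum[(s, a)] / n + expected_next

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            if s == goal:
                break
            # DP backup only at the state the agent is currently in.
            V[s] = min(q(s, a) for a in actions)
            # Mostly greedy action choice, with occasional exploration.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = min(actions, key=lambda b: q(s, b))
            s2, c = step(s, a, rng)
            # Model learning from the observed transition.
            counts[(s, a)][s2] += 1
            cost_sum[(s, a)] += c
            s = s2
    return V, q


def chain_step(s, a, rng):
    # Hypothetical corridor with states 0..4: a move succeeds with
    # probability 0.8, otherwise the agent stays put. Every step costs 1.
    if rng.random() < 0.8:
        s = min(4, max(0, s + a))
    return s, 1.0


V, q = artdp(chain_step, actions=[-1, 1], start=0, goal=4)
policy = {s: min([-1, 1], key=lambda a: q(s, a)) for s in range(4)}
```

Replacing the learned counts with known transition probabilities in `q` would turn this sketch into the model-based RTDP case mentioned at the end of the definition.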

Motivation and Background

RTDP combines strengths of heuristic search and DP. Like heuristic search – and unlike conventional DP – it does not have to evaluate the...

Author information

Authors and Affiliations

College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, MA, USA
Andrew G. Barto

Corresponding author

Correspondence to Andrew G. Barto.

Editor information

Editors and Affiliations

Clayton, VIC, Australia
Dinh Phung
Monash University School of Computer Science & Software Engineering, Melbourne, VIC, Australia
Geoffrey I. Webb
University of New South Wales School of Computer Science & Engineering (CSE), Sydney, NSW, Australia
Claude Sammut

About this entry

Cite this entry

Barto, A.G. (2024). Adaptive Real-Time Dynamic Programming. In: Phung, D., Webb, G.I., Sammut, C. (eds) Encyclopedia of Machine Learning and Data Science. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7502-7_5-1

DOI: https://doi.org/10.1007/978-1-4899-7502-7_5-1
Published: 28 September 2023
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4899-7502-7
Online ISBN: 978-1-4899-7502-7
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
