On the complexity of finite memory policies for Markov decision processes

Beauquier, Danièle; Burago, Dima; Slissenko, Anatol

doi:10.1007/3-540-60246-1_125

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 969)

Included in the following conference series:

International Symposium on Mathematical Foundations of Computer Science

Abstract

We consider some complexity questions concerning a model of uncertainty known as Markov decision processes. Our results concern the problem of constructing optimal policies under a criterion of optimality defined in terms of constraints on the behavior of the process. The constraints are described by regular languages, and the motivation comes from robot motion planning. It is known that, in the case of perfect information, optimal policies under the traditional cost criteria can be found among Markov policies and in polynomial time. We show, firstly, that under the behavior criterion optimal policies need not be Markovian, for finite as well as infinite horizon. On the other hand, optimal policies in this case lie in the class of finite memory policies defined in the paper, and can be found in polynomial time. We remark that in the case of partial information, finite memory policies cannot be optimal in general. Nevertheless, the class of finite memory policies seems to be of interest for probabilistic policies: although probabilistic policies are no better than deterministic ones in the general class of history-remembering policies, they can be better within the class of finite memory policies.
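
The contrast between Markov and finite memory policies can be illustrated with a toy example. The sketch below is not taken from the paper: it uses the standard product of an MDP with a deterministic finite automaton (DFA) and backward induction over a finite horizon, and every name in it (`solve_product`, the states `S`, `H`, `B`, `C`, the actions `toB`, `toC`) is a hypothetical choice made for illustration. The assumed constraint is the regular language of state sequences that contain both `B` and `C`.

```python
from itertools import product


def solve_product(states, actions, trans, dfa_delta, dfa_accept, horizon):
    """Backward induction on the product of an MDP and a DFA.

    trans[s][a]       -> {next_state: probability}
    dfa_delta[(q, s)] -> DFA state after reading MDP state s from q
    value[t][(s, q)] is the maximal probability that the DFA ends in an
    accepting state, given MDP state s and DFA state q at time t
    (q has already read s). policy[t][(s, q)] is a maximizing action.
    """
    memory = {q for (q, _) in dfa_delta} | set(dfa_delta.values())
    nodes = list(product(states, memory))
    value = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for s, q in nodes:
        value[horizon][(s, q)] = 1.0 if q in dfa_accept else 0.0
    for t in range(horizon - 1, -1, -1):
        for s, q in nodes:
            best_a, best_v = None, -1.0
            for a in actions:
                v = sum(p * value[t + 1][(s2, dfa_delta[(q, s2)])]
                        for s2, p in trans[s][a].items())
                if v > best_v:
                    best_a, best_v = a, v
            value[t][(s, q)] = best_v
            policy[t][(s, q)] = best_a
    return value, policy


# Hypothetical toy MDP: from S the process jumps to B or C at random,
# B and C lead back to the hub H, and in H the controller picks B or C.
states = ["S", "H", "B", "C"]
actions = ["toB", "toC"]
trans = {
    "S": {a: {"B": 0.5, "C": 0.5} for a in actions},
    "B": {a: {"H": 1.0} for a in actions},
    "C": {a: {"H": 1.0} for a in actions},
    "H": {"toB": {"B": 1.0}, "toC": {"C": 1.0}},
}

# DFA memory: which of B and C have been seen so far.
seen_after = {
    "B": {"none": "b", "b": "b", "c": "both", "both": "both"},
    "C": {"none": "c", "c": "c", "b": "both", "both": "both"},
}
dfa_delta = {}
for q in ["none", "b", "c", "both"]:
    for s in states:
        dfa_delta[(q, s)] = seen_after[s][q] if s in seen_after else q

value, policy = solve_product(states, actions, trans, dfa_delta, {"both"}, 3)
print(value[0][("S", "none")])   # success probability with DFA memory
print(policy[2][("H", "b")], policy[2][("H", "c")])
```

In this toy instance the policy computed on the product chooses `toC` in `H` after having seen `B`, and `toB` after having seen `C`, so the constraint is met with probability 1. A policy that depends only on the time step and the current state must commit to one action in `H` at time 2 and therefore succeeds with probability 0.5. The product has |states| × |DFA states| nodes, which is why this kind of computation stays polynomial in the sizes of the MDP, the automaton and the horizon.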

The research of this author was supported by DRET and Armines contract 920171.00.1013.

The research of this author was partially supported by DRET contract 91/1061.

Author information

Danièle Beauquier & Anatol Slissenko
Present address: Equipe d'Informatique Fondamentale, Université Paris-12, 61, Ave. du Général de Gaulle, 94010, Créteil, France
Dima Burago
Present address: Dept. of Mathematics, University of Pennsylvania, 19104, Philadelphia, PA, USA

Authors and Affiliations

Institut Blaise Pascal, Université Paris-12 and L.I.T.P., Paris, France
Danièle Beauquier
Laboratory for Theory of Algorithms, St-Petersburg Inst. for Informatics and Automation of the Acad. Sci. of Russia, St-Petersburg, Russia
Dima Burago & Anatol Slissenko
LRI, Université Paris-Sud, France
Dima Burago
Institut Blaise Pascal, Université Paris-12 and L.I.T.P., Paris, France
Anatol Slissenko

Editor information

Jiří Wiedermann, Petr Hájek

About this paper

Cite this paper

Beauquier, D., Burago, D., Slissenko, A. (1995). On the complexity of finite memory policies for Markov decision processes. In: Wiedermann, J., Hájek, P. (eds) Mathematical Foundations of Computer Science 1995. MFCS 1995. Lecture Notes in Computer Science, vol 969. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60246-1_125

DOI: https://doi.org/10.1007/3-540-60246-1_125
Published: 02 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60246-0
Online ISBN: 978-3-540-44768-9
eBook Packages: Springer Book Archive
