Concurrent MDPs with Finite Markovian Policies

  • Peter Buchholz
  • Dimitri Scheftelowitsch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12040)

Abstract

The recently defined class of Concurrent Markov Decision Processes (CMDPs) allows one to describe scenario-based uncertainty in sequential decision problems such as scheduling or admission problems. The resulting optimization problem of computing an optimal policy is NP-hard. This paper introduces a new class of policies for CMDPs on infinite horizons. A mixed integer linear program and an efficient approximation algorithm based on policy iteration are defined for the computation of optimal policies. The proposed approximation algorithm also improves the available approximate value iteration algorithm for the finite horizon case.

Keywords

Concurrent Markov Decision Processes, Optimal policies, Robust optimization, Integer linear programming, Local search heuristics

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Informatik IV, TU Dortmund, Dortmund, Germany
