Skip to main content

Parameter-Independent Strategies for pMDPs via POMDPs

  • Conference paper
  • First Online:
Book cover Quantitative Evaluation of Systems (QEST 2018)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11024))

Included in the following conference series:

Abstract

Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances.

We study pMDPs with reachability objectives where the parameter values are unknown and impossible to measure directly during execution, but there is a probability distribution known over the parameter values. We study for the first time computing parameter-independent strategies that are expectation optimal, i.e., optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal strategies in POMDPs.

We evaluate our method experimentally on several benchmarks: a motivating (repeated) learner model; a series of benchmarks of varying configurations of a robot moving on a grid; and a consensus protocol.

This work was supported by the Austrian FWF (National Research Network RiSE/SHiNE S11405-N23, S11407-N23 and S11411-N23) and by the RTG 2236 UnRAVeL funded by the German Research Foundation.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    We can deal with objectives beyond reachability as long as they are induced by a reward structure (for applicability of the available tools), see Sects. 3 and 4.

  2. 2.

    The discounted accumulated reward objective is defined in a similar way, by adding a factor \(\gamma ^i\) to the i-th summand in (1) with \(\gamma \in [0,1)\) being the discount factor. For solving reachability objectives, undiscounted rewards are sufficient.

  3. 3.

    We use the abbreviation pMDP rather than PMDP as it is common in the recent literature, see (e.g. [17, 35] and as it reminds of the parameter p.).

References

  1. Arming, S., Bartocci, E., Chatterjee, K., Katoen, J., Sokolova, A.: Parameter-independent strategies for pMDPs via POMDPs. arXiv 1806.05126 (2018). http://arxiv.org/abs/1806.05126

  2. Arming, S., Bartocci, E., Sokolova, A.: SEA-PARAM: exploring schedulers in parametric MDPs. In: Proceedings of the QAPL 2017. EPTCS, vol. 250, pp. 25–38 (2017)

    Google Scholar 

  3. Aspnes, J., Herlihy, M.: Fast randomized consensus using shared memory. J. Algorithms 11(3), 441–461 (1990)

    Article  MathSciNet  Google Scholar 

  4. Baier, C., Größer, M., Bertrand, N.: Probabilistic \(\omega \)-automata. J. ACM 59(1), 1:1–1:52 (2012)

    Article  MathSciNet  Google Scholar 

  5. Baier, C., Katoen, J.: Principles of Model Checking. MIT Press, Cambridge (2008)

    MATH  Google Scholar 

  6. Baldi, M., et al.: A probabilistic small model theorem to assess confidentiality of dispersed cloud storage. In: Bertrand, N., Bortolussi, L. (eds.) QEST 2017. LNCS, vol. 10503, pp. 123–139. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66335-7_8

    Chapter  Google Scholar 

  7. Bargiacchi, E.: AI-Toolbox. https://github.com/Svalorzen/AI-Toolbox/

  8. Bartocci, E., Grosu, R., Katsaros, P., Ramakrishnan, C.R., Smolka, S.A.: Model repair for probabilistic systems. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 326–340. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19835-9_30

    Chapter  MATH  Google Scholar 

  9. Beyer, D., Löwe, S., Wendler, P.: Benchmarking and resource measurement. In: Fischer, B., Geldenhuys, J. (eds.) SPIN 2015. LNCS, vol. 9232, pp. 160–178. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23404-5_12

    Chapter  Google Scholar 

  10. Cassandra, A.R., Littman, M.L., Zhang, N.L.: Incremental pruning - a simple, fast, exact method for partially observable Markov decision processes. In: Proceedings of the UAI 1997, pp. 54–61 (1997)

    Google Scholar 

  11. Chatterjee, K., Doyen, L., Henzinger, T.A.: Qualitative analysis of partially-observable Markov decision processes. In: Hliněný, P., Kučera, A. (eds.) MFCS 2010. LNCS, vol. 6281, pp. 258–269. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15155-2_24

    Chapter  MATH  Google Scholar 

  12. Chatterjee, K., Chmelik, M.: POMDPs under probabilistic semantics. Artif. Intell. 221, 46–72 (2015)

    Article  MathSciNet  Google Scholar 

  13. Chatterjee, K., Chmelik, M., Davies, J.: A symbolic SAT-based algorithm for almost-sure reachability with small strategies in POMDPs. In: Proceedings of the AAAI 2016, pp. 3225–3232 (2016)

    Google Scholar 

  14. Chatterjee, K., Chmelik, M., Gupta, R., Kanodia, A.: Optimal cost almost-sure reachability in POMDPs. Artif. Intell. 234, 26–48 (2016)

    Article  MathSciNet  Google Scholar 

  15. Chatterjee, K., Doyen, L., Gimbert, H., Henzinger, T.A.: Randomness for free. In: Hliněný, P., Kučera, A. (eds.) MFCS 2010. LNCS, vol. 6281, pp. 246–257. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15155-2_23

    Chapter  Google Scholar 

  16. Chen, T., Hahn, E.M., Han, T., Kwiatkowska, M.Z., Qu, H., Zhang, L.: Model repair for Markov decision processes. In: Proceedings of the TASE 2013, pp. 85–92 (2013)

    Google Scholar 

  17. Cubuktepe, M.: Sequential convex programming for the efficient verification of parametric MDPs. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 133–150. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_8

    Chapter  Google Scholar 

  18. Daws, C.: Symbolic and parametric model checking of discrete-time Markov chains. In: Liu, Z., Araki, K. (eds.) ICTAC 2004. LNCS, vol. 3407, pp. 280–294. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-31862-0_21

    Chapter  MATH  Google Scholar 

  19. Dehnert, C., et al.: PROPhESY: a probabilistic parameter synthesis tool. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 214–231. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21690-4_13

    Chapter  Google Scholar 

  20. Dehnert, C., Junges, S., Katoen, J.-P., Volk, M.: A Storm is coming: a modern probabilistic model checker. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 592–600. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_31

    Chapter  Google Scholar 

  21. Hahn, E.M., Han, T., Zhang, L.: Probabilistic reachability for parametric Markov models. STTT 13(1), 3–19 (2011)

    Article  Google Scholar 

  22. Hahn, E.M., Han, T., Zhang, L.: Synthesis for PCTL in parametric Markov decision processes. In: Bobaru, M., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NFM 2011. LNCS, vol. 6617, pp. 146–161. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20398-5_12

    Chapter  Google Scholar 

  23. Hahn, E.M., Hermanns, H., Zhang, L., Wachter, B.: PARAM case studies (2015). https://depend.cs.uni-saarland.de/tools/param/casestudies

  24. Halmos, P.R.: Measure Theory. Springer, New York (1974). https://doi.org/10.1007/978-1-4684-9440-2

    Book  MATH  Google Scholar 

  25. Jansen, N., et al.: Accelerating parametric probabilistic verification. In: Norman, G., Sanders, W. (eds.) QEST 2014. LNCS, vol. 8657, pp. 404–420. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10696-0_31

    Chapter  Google Scholar 

  26. Junges, S., Jansen, N., Wimmer, R., Quatmann, T., Winterer, L., Katoen, J., Becker, B.: Finite-state controllers of POMDPs via parameter synthesis. In: Proceedings of the UAI 2018 (2018)

    Google Scholar 

  27. Kreinovich, V., Lakeyev, A., Rohn, J., Kahl, P.: Computational Complexity and Feasibility of Data Processing and Interval Computations, Applied Optimization, vol. 10. Springer, Boston (1998). https://doi.org/10.1007/978-1-4757-2793-7

    Book  MATH  Google Scholar 

  28. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_47

    Chapter  Google Scholar 

  29. Lanotte, R., Maggiolo-Schettini, A., Troina, A.: Parametric probabilistic transition systems for system design and analysis. Form. Asp. Comput. 19(1), 93–109 (2007)

    Article  Google Scholar 

  30. Lukina, A., et al.: ARES: adaptive receding-horizon synthesis of optimal plans. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 286–302. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_17

    Chapter  Google Scholar 

  31. Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artif. Intell. 147(1–2), 5–34 (2003)

    Article  MathSciNet  Google Scholar 

  32. Medina Ayala, A.I., Andersson, S.B., Belta, C.: Probabilistic control from time-bounded temporal logic specifications in dynamic environments. In: Proceedings of the ICRA 2012, pp. 4705–4710. IEEE (2012)

    Google Scholar 

  33. Pathak, S., Ábrahám, E., Jansen, N., Tacchella, A., Katoen, J.-P.: A greedy approach for the efficient repair of stochastic models. In: Havelund, K., Holzmann, G., Joshi, R. (eds.) NFM 2015. LNCS, vol. 9058, pp. 295–309. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-17524-9_21

    Chapter  Google Scholar 

  34. Pineau, J., Gordon, G.J., Thrun, S.: Point-based value iteration - an anytime algorithm for POMDPs. In: Proceedings of the IJCAI 2003, pp. 1025–1032 (2003)

    Google Scholar 

  35. Polgreen, E., Wijesuriya, V.B., Haesaert, S., Abate, A.: Automated experiment design for data-efficient verification of parametric Markov decision processes. In: Bertrand, N., Bortolussi, L. (eds.) QEST 2017. LNCS, vol. 10503, pp. 259–274. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66335-7_16

    Chapter  MATH  Google Scholar 

  36. Quatmann, T., Dehnert, C., Jansen, N., Junges, S., Katoen, J.-P.: Parameter synthesis for Markov models: faster than ever. In: Artho, C., Legay, A., Peled, D. (eds.) ATVA 2016. LNCS, vol. 9938, pp. 50–67. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46520-3_4

    Chapter  Google Scholar 

  37. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River (2009)

    MATH  Google Scholar 

  38. Sennott, L.I.: Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley, New York (1998)

    Book  Google Scholar 

  39. Spaan, M.T.J., Vlassis, N.: Perseus: randomized point-based value iteration for POMDPs. J. Artif. Intell. Res. 24, 195–220 (2011)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sebastian Arming .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Arming, S., Bartocci, E., Chatterjee, K., Katoen, JP., Sokolova, A. (2018). Parameter-Independent Strategies for pMDPs via POMDPs. In: McIver, A., Horvath, A. (eds) Quantitative Evaluation of Systems. QEST 2018. Lecture Notes in Computer Science(), vol 11024. Springer, Cham. https://doi.org/10.1007/978-3-319-99154-2_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-99154-2_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99153-5

  • Online ISBN: 978-3-319-99154-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics