Skip to main content

Verification of General Markov Decision Processes by Approximate Similarity Relations and Policy Refinement

  • Conference paper
  • First Online:
Quantitative Evaluation of Systems (QEST 2016)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9826))

Included in the following conference series:

Abstract

In this work we introduce new approximate similarity relations that are shown to be key for policy (or control) synthesis over general Markov decision processes. The models of interest are discrete-time Markov decision processes, endowed with uncountably-infinite state spaces and metric output (or observation) spaces. The new relations, underpinned by the use of metrics, allow in particular for a useful trade-off between deviations over probability distributions on states, and distances between model outputs. We show that the new probabilistic similarity relations can be effectively employed over general Markov decision processes for verification purposes, and specifically for control refinement from abstract models.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The index \(\mathbb {X}\) in \(\mathcal F_\mathbb {X}\) distinguishes the given \(\sigma \)-algebra on \(\mathbb {X}\) from that on \(\mathbb {Y}\), which is denoted as \(\mathcal F_\mathbb {Y}\). Whenever possible this index will be dropped.

  2. 2.

    \(\mathcal B(\mathbb {X}_1) \otimes \mathcal {B}( \mathbb {X}_2)\) denotes the product \(\sigma \)-algebra of \(\mathcal B(\mathbb {X}_1)\) and \(\mathcal {B}( \mathbb {X}_2)\).

  3. 3.

    Alternatively, if \(\mathbf {M}_2\) with non-deterministic input \(\tilde{e}\in D\) is an \(\epsilon _a\)- alternating bisimulation [23] of \(\mathbf {M}_3\) then \(\mathbf {M}_2\approx _{\epsilon _a}^0\mathbf {M}_3\).

References

  1. Abate, A., Kwiatkowska, M., Norman, G., Parker, D.: Probabilistic model checking of labelled markov processes via finite approximate bisimulations. In: van Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Horizons of the Mind. LNCS, vol. 8464, pp. 40–58. Springer, Heidelberg (2014)

    Google Scholar 

  2. Abate, A., Redig, F., Tkachev, I.: On the effect of perturbation of conditional probabilities in total variation. Stat. Probab. Lett. 88, 1–8 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  3. Abate, A.: Approximation metrics based on probabilistic bisimulations for general state-space Markov processes: a survey. Electron. Notes Theor. Comput. Sci. 297, 3–25 (2013)

    Article  MATH  Google Scholar 

  4. Abate, A., Prandini, M., Lygeros, J., Sastry, S.: Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica 44(11), 2724–2734 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  5. Bertsekas, D.P., Shreve, S.E.: Stochastic Optimal Control: The Discrete Time Case. Athena Scientific, Belmont (1996)

    MATH  Google Scholar 

  6. Bogachev, V.I.: Measure Theory. Springer Science & Business Media, Heidelberg (2007)

    Book  MATH  Google Scholar 

  7. Borkar, V.S.: Probability Theory: An Advanced Course. Springer Science & Business Media, Heidelberg (2012)

    MATH  Google Scholar 

  8. Desharnais, J., Gupta, V., Jagadeesan, R., Panangaden, P.: Metrics for labelled Markov processes. Theor. Comput. Sci. 318(3), 323–354 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  9. Desharnais, J., Laviolette, F., Tracol, M.: Approximate analysis of probabilistic processes: logic, simulation and games. In: QEST, pp. 264–273, September 2008

    Google Scholar 

  10. D’Innocenzo, A., Abate, A., Katoen, J.P.: Robust PCTL model checking. In: Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control, pp. 275–285 (2012)

    Google Scholar 

  11. Esmaeil Zadeh Soudjani, S., Gevaerts, C., Abate, A.: FAUST\(^{2}\): Formal Abstractions of Uncountable-STate STochastic processes. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 272–286. Springer, Heidelberg (2015)

    Google Scholar 

  12. Esmaeil Zadeh Soudjani, S., Abate, A.: Adaptive and sequential gridding procedures for the abstraction and verification of stochastic processes. SIAM J. Appl. Dyn. Syst. 12(2), 921–956 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  13. Girard, A., Pappas, G.J.: Hierarchical control system design using approximate simulation. Automatica 45(2), 566–571 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  14. Haesaert, S., Esmaeil Zadeh Soudjani, S., Abate, A.: Verification of general Markov decision processes by approximate similarity relations and policy refinement, May 2016. arXiv preprint arXiv:1605.09557

  15. Haesaert, S., Abate, A., Van den Hof, P.M.J.: Correct-by-design output feedback of LTI systems. In: Conference on Decision and Control, pp. 6159–6164 (2015)

    Google Scholar 

  16. Julius, A.A., Pappas, G.J.: Approximations of stochastic hybrid systems. IEEE Trans. Autom. Control 54(6), 1193–1203 (2009)

    Article  MathSciNet  Google Scholar 

  17. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Inf. Comput. 94(1), 1–28 (1991)

    Article  MathSciNet  MATH  Google Scholar 

  18. Lindvall, T.: Lectures on the Coupling Method. Courier Corporation, North Chelmsford (2002)

    MATH  Google Scholar 

  19. Mazo Jr., M., Davitian, A., Tabuada, P.: PESSOA: a tool for embedded controller synthesis. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 566–569. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  20. Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Communications and Control Engineering Series. Springer, London (1993)

    Book  MATH  Google Scholar 

  21. Segala, R.: Modeling and verification of randomized distributed real-time systems. Ph.D. thesis, Massachusetts Institute of Technology (1995)

    Google Scholar 

  22. Skala, H.J.: The existence of probability measures with given marginals. Ann. Probab. 21, 136–142 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  23. Tabuada, P.: Verification and Control of Hybrid Systems. Springer, Heidelberg (2009)

    Book  MATH  Google Scholar 

  24. Zamani, M., Esfahani, P.M., Majumdar, R., Abate, A., Lygeros, J.: Symbolic control of stochastic systems via approximately bisimilar finite abstractions. IEEE Trans. Autom. Control 59(12), 3135–3150 (2014)

    Article  MathSciNet  Google Scholar 

  25. Zhang, C., Pang, J.: On probabilistic alternating simulations. In: Calude, C.S., Sassone, V. (eds.) TCS 2010. IFIP AICT, vol. 323, pp. 71–85. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

Download references

Acknowledgement

The research of Sofie Haesaert is supported by DISC through a personal grant from the NWO graduate program.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sofie Haesaert .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Haesaert, S., Abate, A., Van den Hof, P.M.J. (2016). Verification of General Markov Decision Processes by Approximate Similarity Relations and Policy Refinement. In: Agha, G., Van Houdt, B. (eds) Quantitative Evaluation of Systems. QEST 2016. Lecture Notes in Computer Science(), vol 9826. Springer, Cham. https://doi.org/10.1007/978-3-319-43425-4_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-43425-4_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43424-7

  • Online ISBN: 978-3-319-43425-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics