Continuous-Time Markov Decisions Based on Partial Exploration

Ashok, Pranav; Butkova, Yuliya; Hermanns, Holger; Křetínský, Jan

doi:10.1007/978-3-030-01090-4_19

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 11138))

Included in the following conference series:

International Symposium on Automated Technology for Verification and Analysis

1364 Accesses
10 Citations

Abstract

We provide a framework for speeding up algorithms for time-bounded reachability analysis of continuous-time Markov decision processes. The principle is to find a small, but almost equivalent subsystem of the original system and only analyse the subsystem. Candidates for the subsystem are identified through simulations and iteratively enlarged until runs are represented in the subsystem with high enough probability. The framework is thus dual to that of abstraction refinement. We instantiate the framework in several ways with several traditional algorithms and experimentally confirm orders-of-magnitude speed ups in many cases.

This research was supported in part by Deutsche Forschungsgemeinschaft (DFG) through the TUM International Graduate School of Science and Engineering (IGSSE) project 10.06 PARSEC, the ERC through Advanced Grant 695614 (POWVER), the Czech Science Foundation grant No. 18-11193S, and the DFG project 383882557 Statistical Unbounded Verification.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Measurable with respect to the standard \(\sigma \)-algebra on the set of paths [NZ10].
2.
This is due to the fact that for the reachability value, only what happens before the first arrival to the goal matters, and everything that happens afterwards is irrelevant.
3.
Translated to explicit state format by the tool Scoop [Tim11].

References

Ashok, P., Butkova, Y., Hermanns, H., Křetínský, J.: Continuous-Time Markov Decisions Based on Partial Exploration. ArXiv e-prints (2018). https://arxiv.org/abs/1807.09641
Ashok, P., Chatterjee, K., Daca, P., Kretínský, J., Meggendorfer, T.: Value iteration for long-run average reward in Markov decision processes. In: CAV (2017)
Google Scholar
Aziz, A., Sanwal, K., Singhal, V., Brayton, R.K.: Verifying continuous time Markov chains. In: CAV (1996)
Google Scholar
Bartocci, E., Bortolussi, L., Brázdil, T., Milios, D., Sanguinetti, G.: Policy learning in continuous-time Markov decision processes using gaussian processes. Perform. Eval. 116, 84–100 (2017)
Article Google Scholar
Brázdil, T., et al.: Verification of Markov decision processes using learning algorithms. In: ATVA (2014)
Google Scholar
Bruno, J.L., Downey, P.J., Frederickson, G.N.: Sequencing tasks with exponential service times to minimize the expected flow time or makespan. J. ACM 28(1), 100–113 (1981)
Article MathSciNet Google Scholar
Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. II. Athena Scientific (1995)
Google Scholar
Brázdil, T., Forejt, V., Krčál, J., Křetínský, J., Kučera, A.: Continuous-time stochastic games with time-bounded reachability. In: FSTTCS (2009)
Google Scholar
Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.: Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes. In: TACAS (2004)
Google Scholar
Butkova, Y., Hatefi, H., Hermanns, H., Krcál, J.: Optimal continuous time Markov decisions. In: ATVA (2015)
Google Scholar
Buchholz, P., Schulz, I.: Numerical analysis of continuous time Markov decision processes over finite horizons. Comput. OR 38(3), 651–659 (2011)
Article MathSciNet Google Scholar
Eisentraut, C., Hermanns, H., Katoen, J., Zhang, L.: A semantics for every GSPN. In: Petri Nets (2013)
Chapter Google Scholar
Feinberg, E.A.: Continuous time discounted jump Markov decision processes: a discrete-event approach. Math. Oper. Res. 29(3), 492–524 (2004)
Article MathSciNet Google Scholar
Fearnley, J., Rabe, M., Schewe, S., Zhang, L.: Efficient approximation of optimal control for continuous-time Markov games. In: FSTTCS (2011)
Google Scholar
Ghemawat, S., Gobioff, H., Leung, S.: The google file system. In: SOSP (2003)
Google Scholar
Guck, D., Hatefi, H., Hermanns, H., Katoen, J., Timmer, M.: Modelling, reduction and analysis of Markov automata. In: QEST (2013)
Google Scholar
Guck, D., Han, T., Katoen, J., Neuhäußer, M.R.: Quantitative timed analysis of interactive Markov chains. In: NFM (2012)
Google Scholar
Haverkort, B.R., Cloth, L., Hermanns, H., Katoen, J., Baier, C.: Model checking performability properties. In: DSN (2002)
Google Scholar
Hatefi, H., Hermanns, H.: Improving time bounded reachability computations in interactive Markov chains. In: FSEN (2013)
Chapter Google Scholar
Haverkort, B.R., Hermanns, H., Katoen, J.: On the use of model checking techniques for dependability evaluation. In: SRDS’00 (2000)
Google Scholar
Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_47
Chapter Google Scholar
Lefèvre, C.: Optimal control of a birth and death epidemic process. Oper. Res. 29(5), 971–982 (1981)
Article MathSciNet Google Scholar
McMahan, H.B., Likhachev, M., Gordon, G.J.: Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees. In: ICML (2005)
Google Scholar
Neuhäußer, M.R.: Model checking nondeterministic and randomly timed systems. Ph.D. thesis, RWTH Aachen University (2010)
Google Scholar
Neuhäußer, M.R., Zhang, L.: Time-bounded reachability probabilities in continuous-time Markov decision processes. In: QEST (2010)
Google Scholar
Pavese, E., Braberman, V.A., Uchitel, S.: Automated reliability estimation over partial systematic explorations. In: ICSE, pp. 602–611 (2013)
Google Scholar
Qiu, Q., Qu, Q., Pedram, M.: Stochastic modeling of a power-managed system-construction and optimization. IEEE Trans. CAD Integr. Circuits Syst. 20(10), 1200–1217 (2001)
Article Google Scholar
Sennott, L.I.: Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley-Interscience, New York (1999)
MATH Google Scholar
Timmer, M.: Scoop: a tool for symbolic optimisations of probabilistic processes. In: QEST (2011)
Google Scholar
Timmer, M., Katoen, J.-P., van de Pol, J., Stoelinga, M.I.A.: Efficient modelling and generation of Markov automata. In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 364–379. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32940-1_26
Chapter Google Scholar
Timmer, M., van de Pol, J., Stoelinga, M.I.A.: Confluence reduction for Markov automata. In: Braberman, V., Fribourg, L. (eds.) FORMATS 2013. LNCS, vol. 8053, pp. 243–257. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40229-6_17
Chapter Google Scholar
Zhang, L., Neuhäußer, M.R.: Model checking interactive Markov chains. In: Esparza, J., Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 53–68. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12002-2_5
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Technical University of Munich, Munich, Germany
Pranav Ashok & Jan Křetínský
Saarland University, Saarbrücken, Germany
Yuliya Butkova & Holger Hermanns

Authors

Pranav Ashok
View author publications
You can also search for this author in PubMed Google Scholar
Yuliya Butkova
View author publications
You can also search for this author in PubMed Google Scholar
Holger Hermanns
View author publications
You can also search for this author in PubMed Google Scholar
Jan Křetínský
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jan Křetínský .

Editor information

Editors and Affiliations

Microsoft Research, Redmond, WA, USA
Shuvendu K. Lahiri
University of Southern California, Los Angeles, CA, USA
Chao Wang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Ashok, P., Butkova, Y., Hermanns, H., Křetínský, J. (2018). Continuous-Time Markov Decisions Based on Partial Exploration. In: Lahiri, S., Wang, C. (eds) Automated Technology for Verification and Analysis. ATVA 2018. Lecture Notes in Computer Science(), vol 11138. Springer, Cham. https://doi.org/10.1007/978-3-030-01090-4_19

Download citation

DOI: https://doi.org/10.1007/978-3-030-01090-4_19
Published: 30 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01089-8
Online ISBN: 978-3-030-01090-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics