Synthesis for Multi-objective Stochastic Games: An Application to Autonomous Urban Driving

  • Taolue Chen
  • Marta Kwiatkowska
  • Aistis Simaitis
  • Clemens Wiltsche
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8054)


We study strategy synthesis for stochastic two-player games with multiple objectives expressed as a conjunction of LTL and expected total reward goals. For stopping games, the strategies are constructed from the Pareto frontiers that we compute via value iteration. Since, in general, infinite memory is required for deterministic winning strategies in such games, our construction takes advantage of randomised memory updates in order to provide compact strategies. We implement our methods in PRISM-games, a model checker for stochastic multi-player games, and present a case study motivated by the DARPA Urban Challenge, illustrating how our methods can be used to synthesise strategies for high-level control of autonomous vehicles.


Road Segment Markov Decision Process Reward Function Pareto Frontier Linear Temporal Logic 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Bagnara, R., Hill, P.M., Zaffanella, E.: The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. 72(1-2), 3–21 (2008)MathSciNetCrossRefGoogle Scholar
  2. 2.
    Brázdil, T., Brozek, V., Chatterjee, K., Forejt, V., Kučera, A.: Two views on multiple mean-payoff objectives in Markov decision processes. In: LICS, pp. 33–42 (2011)Google Scholar
  3. 3.
    Campbell, M., Egerstedt, M., How, J.P., Murray, R.M.: Autonomous driving in urban environments: approaches, lessons and challenges. Phil. Trans. R. Soc. A 368(1928), 4649–4672 (2010)CrossRefGoogle Scholar
  4. 4.
    Chatterjee, K., Doyen, L., Henzinger, T.A., Raskin, J.F.: Generalized mean-payoff and energy games. In: FSTTCS. LIPIcs, vol. 8, pp. 505–516 (2010)Google Scholar
  5. 5.
    Chatterjee, K., Majumdar, R., Henzinger, T.A.: Markov decision processes with multiple objectives. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884, pp. 325–336. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  6. 6.
    Chatterjee, K., Randour, M., Raskin, J.-F.: Strategy synthesis for multi-dimensional quantitative objectives. In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 115–131. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  7. 7.
    Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: PRISM-games: A model checker for stochastic multi-player games. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 185–191. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  8. 8.
    Chen, T., Forejt, V., Kwiatkowska, M., Simaitis, A., Trivedi, A., Ummels, M.: Playing stochastic games precisely. In: Koutny, M., Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 348–363. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  9. 9.
    Chen, T., Forejt, V., Kwiatkowska, M., Simaitis, A., Wiltsche, C.: On stochastic games with multiple objectives. In: MFCS (accepted, 2013)Google Scholar
  10. 10.
    Courcoubetis, C., Yannakakis, M.: Markov decision processes and regular events. IEEE Trans. Autom. Control 43(10), 1399–1418 (1998)MathSciNetzbMATHCrossRefGoogle Scholar
  11. 11.
    DARPA. Urban Challenge (2007) (online accessed March 8, 2013)Google Scholar
  12. 12.
    Dziembowski, S., Jurdzinski, M., Walukiewicz, I.: How much memory is needed to win infinite games? In: LICS, pp. 99–110 (1997)Google Scholar
  13. 13.
    Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. LMCS 4(4) (2008)Google Scholar
  14. 14.
    Fahrenberg, U., Juhl, L., Larsen, K.G., Srba, J.: Energy games in multiweighted automata. In: Cerone, A., Pihlajasaari, P. (eds.) ICTAC 2011. LNCS, vol. 6916, pp. 95–115. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  15. 15.
    Forejt, V., Kwiatkowska, M., Norman, G., Parker, D., Qu, H.: Quantitative multi-objective verification for probabilistic systems. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 112–127. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  16. 16.
    OpenStreetMap (2013) (online; accessed March 8, 2013)Google Scholar
  17. 17.
    Urmson, C., Anhalt, J., Bagnell, D., Baker, C., Bittner, R., Clark, M.N., Dolan, J., Duggins, D., Galatali, T., Geyer, C., et al.: Autonomous driving in urban environments: Boss and the urban challenge. J. Field Robot. 25(8), 425–466 (2008)CrossRefGoogle Scholar
  18. 18.
    Velner, Y., Chatterjee, K., Doyen, L., Henzinger, T.A., Rabinovich, A., Raskin, J.F.: The complexity of multi-mean-payoff and multi-energy games. CoRR, abs/1209.3234 (2012)Google Scholar
  19. 19.
    Wongpiromsarn, T., Frazzoli, E.: Control of probabilistic systems under dynamic, partially known environments with temporal logic specifications. In: CDC, pp. 7644–7651 (2012)Google Scholar
  20. 20.
    Wongpiromsarn, T., Topcu, U., Murray, R.M.: Receding horizon temporal logic planning. IEEE Trans. Automat. Contr. 57(11), 2817–2830 (2012)MathSciNetCrossRefGoogle Scholar
  21. 21.
    Wongpiromsarn, T., Ulusoy, A., Belta, C., Frazzoli, E., Rus, D.: Incremental synthesis of control policies for heterogeneous multi-agent systems with linear temporal logic specification. In: ICRA (accepted, 2013)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Taolue Chen
    • 1
  • Marta Kwiatkowska
    • 1
  • Aistis Simaitis
    • 1
  • Clemens Wiltsche
    • 1
  1. 1.Department of Computer ScienceUniversity of OxfordUnited Kingdom

Personalised recommendations