Sandboxing Controllers for Stochastic Cyber-Physical Systems

Zhong, Bingzhuo; Zamani, Majid; Caccamo, Marco

doi:10.1007/978-3-030-29662-9_15

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11750))

Included in the following conference series:

International Conference on Formal Modeling and Analysis of Timed Systems

639 Accesses
1 Citations

Abstract

Current cyber-physical systems (CPS) are expected to accomplish complex tasks. To achieve this goal, high performance, but unverified controllers (e.g. deep neural network, black-box controllers from third parties) are applied, which makes it very challenging to keep the overall CPS safe. By sandboxing these controllers, we are not only able to use them but also to enforce safety properties over the controlled physical systems at the same time. However, current available solutions for sandboxing controllers are just applicable to deterministic (a.k.a. non-stochastic) systems, possibly affected by bounded disturbances. In this paper, for the first time we propose a novel solution for sandboxing unverified complex controllers for CPS operating in noisy environments (a.k.a. stochastic CPS). Moreover, we also provide probabilistic guarantees on their safety. Here, the unverified control input is observed at each time instant and checked whether it violates the maximal tolerable probability of reaching the unsafe set. If this probability exceeds a given threshold, the unverified control input will be rejected, and the advisory input provided by the optimal safety controller will be used to maintain the probabilistic safety guarantee. The proposed approach is illustrated empirically and the results indicate that the expected safety probability is guaranteed.

This work was supported in part by the H2020 ERC Starting Grant AutoCPS (grant agreement No 804639) and German Research Foundation (DFG) through the grants ZA 873/1-1 and ZA 873/4-1. Marco Caccamo was supported by an Alexander von Humboldt Professorship endowed by the German Federal Ministry of Education and Research. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the Alexander von Humboldt Foundation.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
No input needed to be provided at $t=H$ since it is the end of the execution.

References

Abate, A., Prandini, M., Lygeros, J., Sastry, S.: Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica 44(11), 2724–2734 (2008). https://doi.org/10.1016/j.automatica.2008.03.027
Article MathSciNet MATH Google Scholar
Abdi, F., Chen, C.Y., Hasan, M., Liu, S., Mohan, S., Caccamo, M.: Preserving physical safety under cyber attacks. IEEE Internet Things J. (2018). https://doi.org/10.1109/JIOT.2018.2889866
Article Google Scholar
Abdi, F., Tabish, R., Rungger, M., Zamani, M., Caccamo, M.: Application and system-level software fault tolerance through full system restarts. In: 2017 ACM/IEEE 8th International Conference on Cyber-Physical Systems (ICCPS), pp. 197–206. IEEE (2017). https://doi.org/10.1145/3055004.3055012
Alshiekh, M., Bloem, R., Ehlers, R., Könighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Google Scholar
Bak, S., Johnson, T.T., Caccamo, M., Sha, L.: Real-time reachability for verified simplex design. In: 2014 IEEE Real-Time Systems Symposium, pp. 138–148. IEEE (2014). https://doi.org/10.1109/RTSS.2014.21
Bak, S., Manamcheri, K., Mitra, S., Caccamo, M.: Sandboxing controllers for cyber-physical systems. In: 2011 IEEE/ACM Second International Conference on Cyber-Physical Systems, pp. 3–12. IEEE (2011). https://doi.org/10.1109/ICCPS.2011.25
Bloem, R., Könighofer, B., Könighofer, R., Wang, C.: Shield synthesis: runtime enforcement for reactive systems. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 533–548. Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_51
Chapter Google Scholar
Crenshaw, T.L., Gunter, E., Robinson, C.L., Sha, L., Kumar, P.: The simplex reference model: limiting fault-propagation due to unreliable components in cyber-physical system architectures. In: 28th IEEE International Real-Time Systems Symposium, RTSS 2007, pp. 400–412. IEEE (2007). https://doi.org/10.1109/RTSS.2007.34
Esmaeil Zadeh Soudjani, S.: Formal abstractions for automated verification and synthesis of stochastic systems. Ph.D. thesis, Technical University of Delft (2014). https://doi.org/10.4233/uuid:201d5145-0717-4dea-b0d0-c018e510fdaa
Hernández-Lerma, O., Lasserre, J.B.: Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer, New York (1996). https://doi.org/10.1007/978-1-4612-0729-0
Book MATH Google Scholar
Hovakimyan, N., Cao, C., Kharisov, E., Xargay, E., Gregory, I.M.: L 1 adaptive control for safety-critical systems. IEEE Control Syst. Mag. 31(5), 54–104 (2011). https://doi.org/10.1109/MCS.2011.941961
Article MathSciNet MATH Google Scholar
Humphrey, L., Könighofer, B., Könighofer, R., Topcu, U.: Synthesis of admissible shields. In: Bloem, R., Arbel, E. (eds.) HVC 2016. LNCS, vol. 10028, pp. 134–151. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49052-6_9
Chapter Google Scholar
Kupferman, O., Vardi, M.Y.: Model checking of safety properties. Form. Methods Syst. Des. 19(3), 291–314 (2001). https://doi.org/10.1023/A:1011254632723
Article MATH Google Scholar
Lavaei, A., Soudjani, S., Zamani, M.: From dissipativity theory to compositional construction of finite Markov decision processes. In: Proceedings of the 21st International Conference on Hybrid Systems: Computation and Control (part of CPS Week), pp. 21–30. ACM (2018). https://doi.org/10.1145/3178126.3178135
Lavaei, A., Soudjani, S., Zamani, M.: Compositional synthesis of large-scale stochastic systems: a relaxed dissipativity approach. arXiv preprint arXiv:1902.01223 (2019)
Monahan, G.E.: State of the art–a survey of partially observable markov decision processes: theory, models, and algorithms. Manag. Sci. 28(1), 1–16 (1982). https://doi.org/10.1287/mnsc.28.1.1
Article MATH Google Scholar
Reis, C., Barth, A., Pizano, C.: Browser security: lessons from google chrome. Commun. ACM 52(8), 45–49 (2009). https://doi.org/10.1145/1536616.1536634
Article Google Scholar
Sha, L.: Using simplicity to control complexity. IEEE Softw. 20–28 (2001). https://doi.org/10.1109/MS.2001.936213
Article Google Scholar
Tkachev, I., Mereacre, A., Katoen, J.P., Abate, A.: Quantitative automata-based controller synthesis for non-autonomous stochastic hybrid systems. In: Proceedings of the 16th International Conference on Hybrid Systems: Computation and Control, pp. 293–302. ACM (2013). https://doi.org/10.1145/2461328.2461373
Wang, X., Hovakimyan, N., Sha, L.: L1simplex: fault-tolerant control of cyber-physical systems. In: 2013 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), pp. 41–50. IEEE (2013). https://doi.org/10.1145/2502524.2502531
Wang, X., Hovakimyan, N., Sha, L.: RSimplex: a robust control architecture for cyber and physical failures. ACM Trans. Cyber Phys. Syst. 2(4), 27 (2018). https://doi.org/10.1145/3121428
Article Google Scholar
Yao, J., Liu, X., Zhu, G., Sha, L.: Netsimplex: controller fault tolerance architecture in networked control systems. IEEE Trans. Ind. Inform. 9(1), 346–356 (2013). https://doi.org/10.1109/TII.2012.2219060
Article Google Scholar

Download references

Acknowledgements

The authors would like to thank Abolfazl Lavaei for the discussions on synthesizing optimal safety controllers for stochastic systems.

Author information

Authors and Affiliations

Mechanical Engineering Department, Technical University of Munich, Munich, Germany
Bingzhuo Zhong & Marco Caccamo
Computer Science Department, University of Colorado Boulder, Boulder, USA
Majid Zamani
Computer Science Department, Ludwig Maximilian University of Munich, Munich, Germany
Majid Zamani

Authors

Bingzhuo Zhong
View author publications
You can also search for this author in PubMed Google Scholar
Majid Zamani
View author publications
You can also search for this author in PubMed Google Scholar
Marco Caccamo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bingzhuo Zhong .

Editor information

Editors and Affiliations

Université Paris 13, Villetaneuse, France
Étienne André
University of Twente, Enschede, The Netherlands
Mariëlle Stoelinga

Appendix: Proof of Theorem 2

The Proof of Theorem 2 is done with the help of the following lemma.

Lemma 1

Given a finite MDP $\mathfrak {M}\,=\,\{\tilde{X},\,\tilde{U},\,\tilde{T}\}$ and a Markov policy $\mu \,=\,(\mu _0, \mu _1,\ldots ,\mu _{H-1})$ in a finite time horizon $\overline{0,H}$, we have

$$\begin{aligned} 1-\tilde{V}_{n+1}(\tilde{x}) = \sum _{\tilde{y}\in \tilde{X}\backslash \{\phi \}}(1-\tilde{V}_{n}(\tilde{y}))\tilde{T}(\tilde{y}|\tilde{x},\mu _{H-n-1}(\tilde{x})) \end{aligned}$$

where $\tilde{V}_{n}(\tilde{x})$ is the value function for the reach-avoid problem and $\tilde{x}\in \tilde{X}$.

The proof can be readily derived based on Theorem 1 and the definition of $\tilde{T}$. Let $\mu '$ be the Markov policy used to control the system when the unverified controller is accepted at some states at some time instants. Here, we use $\tilde{X}_s$ to represent $\tilde{X}\backslash \{\phi \}$. Let’s define:

$$\begin{aligned} f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k))) = 1-\sum _{\tilde{x}(k+1)\in \tilde{X}_s}\tilde{V}_{*,H-k-1}(\tilde{x}(k+1))\tilde{T}(\tilde{x}(k+1)|\tilde{x}(k),\mu '_{k}(\tilde{x}(k))), \end{aligned}$$

and

$$\begin{aligned} g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))=\tilde{T}(\tilde{x}(k)|\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1))). \end{aligned}$$

Given initial state $s_0\in \tilde{X}_s$, at each time instant $t=k$ where $k\in \overline{0,H-1}$, we have

$$\begin{aligned} \begin{aligned}&1-\tilde{V}^{\mu '}_{H}(s_0)\\ =&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \sum _{\tilde{x}(k)\in \tilde{X}_s}f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k)))g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))\right) \right. \right. \\&\left. \left. \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \end{aligned} \end{aligned}$$

$$\begin{aligned} \begin{aligned} \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\sum _{\tilde{x}(k)\in \tilde{X}_s}g(\tilde{x}(k-1),\mu '_{k-1}(\tilde{x}(k-1)))\right) \right. \right. \\&\left. \left. \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \left( \sum _{\tilde{x}(k)\in \tilde{X}_s}g(\underline{\tilde{x}(k-1)},\underline{\mu '_{k-1}(\tilde{x}(k-1))}))\right) f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\right. \right. \right. \\&\left. \left. \left. \sum _{\tilde{x}(k-1)\in \tilde{X}_s}g(\tilde{x}(k-2),\mu '_{k-2}(\tilde{x}(k-2)))\right) \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\ \ge&\sum _{\tilde{x}(1)\in \tilde{X}_s}\left( \sum _{\tilde{x}(2)\in \tilde{X}_s}\left( \ldots \left( \left( \sum _{\tilde{x}(k-1)\in \tilde{X}_s}g(\underline{\tilde{x}(k-2)},\underline{\mu '_{k-2}(\tilde{x}(k-2))})\right) \right. \right. \right. \\&\left( \sum _{\tilde{x}(k)\in \tilde{X}_s}g(\underline{\tilde{x}(k-1)},\underline{\mu '_{k-1}(\tilde{x}(k-1))})\right) f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))})\\&\left. \left. \left. \sum _{\tilde{x}(k-2)\in \tilde{X}_s}g(\tilde{x}(k-3),\mu '_{k-3}(\tilde{x}(k-3)))\right) \ldots \right) g(\tilde{x}(1),\mu '_{1}(\tilde{x}(1)))\right) g(s_0,\mu '_{0}(s_0))\\&\ldots \\ \ge&\prod _{t=1}^{k}\sum _{\tilde{x}(t)\in \tilde{X}_s}g(\underline{\tilde{x}(t-1)}, \underline{\mu _{t-1}(\tilde{x}(t-1))}) (f(\underline{\tilde{x}(k)},\underline{\mu '_{k}(\tilde{x}(k))}) \end{aligned} \end{aligned}$$

where

$$\begin{aligned} (\underline{\tilde{x}(t-1)},\,\underline{\mu _{t-1}(\tilde{x}(t-1))}) = \mathop {\arg \min }_{\begin{array}{c} \tilde{x}(t-1)\in \tilde{X}_s\\ \mu _{t-1}(\tilde{x}(t-1)) \end{array}}\sum _{\tilde{x}(t)\in \tilde{X}_s}g(\tilde{x}(t-1), \mu _{t-1}(\tilde{x}(t-1))) \end{aligned}$$

for all $t\in \overline{0,k}$, and

$$\begin{aligned} (\underline{\tilde{x}(k)},\,\underline{\mu _{k}(\tilde{x}(k))}) = \mathop {\arg \min }_{\begin{array}{c} \tilde{x}(k)\in \tilde{X}_s\\ \mu _{k}(\tilde{x}(k)) \end{array}}f(\tilde{x}(k),\mu '_{k}(\tilde{x}(k))). \end{aligned}$$

Noted that $\omega = (\underline{\tilde{x}(0)}, \underline{\mu _0(\tilde{x}(0))},\underline{\tilde{x}(1)}, \underline{\mu _1(\tilde{x}(1))}\ldots \,\underline{\tilde{x}(k)})$ is one of the paths up to time instant k which can be generated by the system controlled by the Markov policy $\mu '$, and the History-based Supervisor ensures that for all paths $\omega $ up to arbitrary time instant $k\in \overline{0,H}$,

$$\begin{aligned} \prod _{t=1}^{k}\sum _{\tilde{x}\in \tilde{X}_s}g(\omega _x(t-1), \omega _u(t-1))\left( f(\omega _x(k),u_{uc}(\omega _x(k),k))\right) \ge 1-\rho . \end{aligned}$$

Note that we have $1-\tilde{V}^{\mu '}_{H}(s_0)\ge 1-\rho $, i.e. $p^{\mu '}_{s_0}(\diamond ^{\le H}\mathcal {A}^c) =\tilde{V}^{\mu '}_{H}(s_0) \le \rho $.

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhong, B., Zamani, M., Caccamo, M. (2019). Sandboxing Controllers for Stochastic Cyber-Physical Systems. In: André, É., Stoelinga, M. (eds) Formal Modeling and Analysis of Timed Systems. FORMATS 2019. Lecture Notes in Computer Science(), vol 11750. Springer, Cham. https://doi.org/10.1007/978-3-030-29662-9_15

Download citation

DOI: https://doi.org/10.1007/978-3-030-29662-9_15
Published: 13 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29661-2
Online ISBN: 978-3-030-29662-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Sandboxing Controllers for Stochastic Cyber-Physical Systems

Abstract

Access this chapter

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Appendix: Proof of Theorem 2

Appendix: Proof of Theorem 2

Lemma 1

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation