Tail Probabilities for Randomized Program Runtimes via Martingales for Higher Moments
Abstract
Programs with randomization constructs is an active research topic, especially after the recent introduction of martingalebased analysis methods for their termination and runtimes. Unlike most of the existing works that focus on proving almostsure termination or estimating the expected runtime, in this work we study the tail probabilities of runtimes—such as “the execution takes more than 100 steps with probability at most 1%.” To this goal, we devise a theory of supermartingales that overapproximate higher moments of runtime. These higher moments, combined with a suitable concentration inequality, yield useful upper bounds of tail probabilities. Moreover, our vectorvalued formulation enables automated templatebased synthesis of those supermartingales. Our experiments suggest the method’s practical use.
1 Introduction
The important roles of randomization in algorithms and software systems are nowadays wellrecognized. In algorithms, randomization can bring remarkable speed gain at the expense of small probabilities of imprecision. In cryptography, many encryption algorithms are randomized in order to conceal the identity of plaintexts. In software systems, randomization is widely utilized for the purpose of fairness, security and privacy.
Embracing randomization in programming languages has therefore been an active research topic for a long time. Doing so does not only offer a solid infrastructure that programmers and system designers can rely on, but also opens up the possibility of languagebased, static analysis of properties of randomized algorithms and systems.
The current paper’s goal is to analyze imperative programs with randomization constructs—the latter come in two forms, namely probabilistic branching and assignment from a designated, possibly continuous, distribution. We shall refer to such programs as randomized programs.^{1}
Runtime and Termination Analysis of Randomized Programs. The runtime of a randomized program is often a problem of our interest; so is almostsure termination, that is, whether the program terminates with probability 1. In the programming language community, these problems have been taken up by many researchers as a challenge of both practical importance and theoretical interest.
Most of the existing works on runtime and termination analysis follow either of the following two approaches.

Martingalebased methods, initiated with a notion of ranking supermartingale in [4] and extended [1, 6, 7, 11, 13], have their origin in the theory of stochastic processes. They can also be seen as a probabilistic extension of ranking functions, a standard proof method for termination of (nonrandomized) programs. Martingalebased methods have seen remarkable success in automated synthesis using templates and constraint solving (like LP or SDP).

The predicatetransformer approach, pursued in [2, 17, 19], uses a more syntaxguided formalism of program logic and emphasizes reasoning by invariants.
The essential difference between the two approaches is not big: an invariant notion in the latter is easily seen to be an adaptation of a suitable notion of supermartingale. The work [33] presents a comprehensive account on the ordertheoretic foundation behind these techniques.
These existing works are mostly focused on the following problems: deciding almostsure termination, computing termination probabilities, and computing expected runtime. (Here “computing” includes giving upper/lower bounds.) See [33] for a comparison of some of the existing martingalebased methods.
 Input:

a randomized program \(\varGamma \), and a deadline \(d\in \mathbb {N}\)
 Output:

an upper bound of the tail probability \(\mathrm {Pr}({T_{\mathrm {run}} \ge d})\), where \(T_{\mathrm {run}}\) is the runtime of \(\varGamma \)
An example of our target program is in Fig. 1. It is an imperative program with randomization: in Line 3, the value of z is sampled from the uniform distribution over the interval \([2, 1]\). The symbol \(*\) in the line 4 stands for a nondeterministic Boolean value; in our analysis, it is resolved so that the runtime becomes the longest.
Our Method: Concentration Inequalities, Higher Moments, and VectorValued Supermartingales. Towards the goal of computing tail probabilities, our approach is to use concentration inequalities, a technique from probability theory that is commonly used for overapproximating various tail probabilities. There are various concentration inequalities in the literature, and each of them is applicable in a different setting, such as a nonnegative random variable (Markov’s inequality), known mean and variance (Chebyshev’s inequality), a differencebounded martingale (Azuma’s inequality), and so on. Some of them were used for analyzing randomized programs [5] (see Sect. 7 for comparison).
In this paper, we use a specific concentration inequality that uses higher moments \(\mathbb {E}[{T_{\mathrm {run}}}], \dots , \mathbb {E}[{(T_{\mathrm {run}})^K}]\) of runtimes \(T_{\mathrm {run}}\), up to a choice of the maximum degree K. The concentration inequality is taken from [3]; it generalizes Markov’s and Chebyshev’s. We observe that a higher moment yields a tighter bound of the tail probability, as the deadline d grows bigger. Therefore it makes sense to strive for computing higher moments.
For computing higher moments of runtimes, we systematically extend the existing theory of ranking supermartingales, from the expected runtime (i.e. the first moment) to higher moments. The theory features a vectorvalued supermartingale, which not only generalizes easily to degrees up to arbitrary \(K\in \mathbb {N}\), but also allows automated synthesis much like usual supermartingales.
We also claim that the soundness of these vectorvalued supermartingales is proved in a mathematically clean manner. Following our previous work [33], our arguments are based on the ordertheoretic foundation of fixed points (namely the KnasterTarski, Cousot–Cousot and Kleene theorems), and we give upper bounds of higher moments by suitable least fixed points.
Overall, our workflow is as shown in Fig. 2. We note that the step 2 in Fig. 2 is computationally much cheaper than the step 1: in fact, the step 2 yields a symbolic expression for an upper bound in which d is a free variable. This makes it possible to draw graphs like the ones in Fig. 3. It is also easy to find a deadline d for which \(\mathrm {Pr}({T_{\mathrm {run}} \ge d})\) is below a given threshold \(p\in [0,1]\).
We implemented a prototype that synthesizes vectorvalued supermartingales using linear and polynomial templates. The resulting constraints are solved by LP and SDP solvers, respectively. Experiments show that our method can produce nontrivial upper bounds in reasonable computation time. We also experimentally confirm that higher moments are useful in producing tighter bounds.
Our Contributions. Summarizing, the contribution of this paper is as follows.

We extend the existing theory of ranking supermartingales from expected runtimes (i.e. the first moment) to higher moments. The extension has a solid foundation of ordertheoretic fixed points. Moreover, its clean presentation by vectorvalued supermartingales makes automated synthesis as easy as before. Our target randomized programs are rich, embracing nondeterminism and continuous distributions.

We study how these vectorvalued supermartingales (and the resulting upper bounds of higher moments) can be used to yield upper bounds of tail probabilities of runtimes. We identify a concentration lemma that suits this purpose. We show that higher moments indeed yield tighter bounds.

Overall, we present a comprehensive languagebased framework for overapproximating tail probabilities of runtimes of randomized programs (Fig. 2). It has been implemented, and our experiments suggest its practical use.
2 Preliminaries
We present some preliminary materials, including the definition of pCFGs (we use them as a model of randomized programs) and the definition of runtime.
Given topological spaces X and Y, let \(\mathcal {B}(X)\) be the set of Borel sets on X and \(\mathcal {B}(X, Y)\) be the set of Borel measurable functions \(X \rightarrow Y\). We assume that the set \(\mathbb {R}\) of reals, a finite set L and the set \([0, \infty ]\) are equipped with the usual topology, the discrete topology, and the order topology, respectively. We use the induced Borel structures for these spaces. Given a measurable space X, let \(\mathcal {D}(X)\) be the set of probability measures on X. For any \(\mu \in \mathcal {D}(X)\), let \(\mathrm {supp}(\mu )\) be the support of \(\mu \). We write \(\mathbb {E}[{X}]\) for the expectation of a random variable X.
Our use of pCFGs follows recent works including [1].
Definition 2.1
(pCFG). A probabilistic control flow graph (pCFG) is a tuple \(\varGamma = (L, V, l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}}, {\mapsto }, \mathrm {Up}, \mathrm {Pr}, G)\) that consists of the following.

A finite set L of locations. It is a disjoint union of sets \(L_D\), \(L_P\), \(L_n\) and \(L_A\) of deterministic, probabilistic, nondeterministic and assignment locations.

A finite set V of program variables.

An initial location \(l_{\mathrm {init}} \in L\). – An initial valuation \(\varvec{x}_{\mathrm {init}} \in \mathbb {R}^{V}\)

A transition relation \({\mapsto } \subseteq L \times L\) which is total (i.e. \(\forall l.\, \exists l'.\,l \mapsto l'\)).

An update function \(\mathrm {Up} : L_A \rightarrow V \times \bigl (\,\mathcal {B}(\mathbb {R}^V, \mathbb {R}) \cup \mathcal {D}(\mathbb {R}) \cup \mathcal {B}(\mathbb {R})\,\bigr )\) for assignment.

A family \(\mathrm {Pr} = (\mathrm {Pr}_l)_{l \in L_P}\) of probability distributions, where \(\mathrm {Pr}_l \in \mathcal {D}(L)\), for probabilistic locations. We require that \(l' \in \mathrm {supp}(\mathrm {Pr}_l)\) implies \(l \mapsto l'\).

A guard function \(G : L_D \times L \rightarrow \mathcal {B}(\mathbb {R}^V)\) such that for each \(l \in L_D\) and \(\varvec{x} \in \mathbb {R}^V\), there exists a unique location \(l' \in L\) satisfying \(l \mapsto l'\) and \(\varvec{x} \in G(l, l')\).
The update function can be decomposed into three functions \(\mathrm {Up}_D : L_{AD} \rightarrow V \times \mathcal {B}(\mathbb {R}^V, \mathbb {R})\), \(\mathrm {Up}_P : L_{AP} \rightarrow V \times \mathcal {D}(\mathbb {R})\) and \(\mathrm {Up}_N : L_{AN} \rightarrow V \times \mathcal {B}(\mathbb {R})\), under a suitable decomposition \(L_A = L_{AD} \cup L_{AP} \cup L_{AN}\) of assignment locations. The elements of \(L_{AD}\), \(L_{AP} \) and \(L_{AN}\) represent deterministic, probabilistic and nondeterministic assignments, respectively. See e.g. [33].
An example of a pCFG is shown on the right. It models the program in Fig. 1. The node \(l_4\) is a nondeterministic location. \(\mathrm {Unif}(2, 1)\) is the uniform distribution on the interval \([2, 1]\).
A configuration of a pCFG \(\varGamma \) is a pair \((l, \varvec{x}) \in L \times \mathbb {R}^V\) of a location and a valuation. We regard the set \(S= L \times \mathbb {R}^V\) of configurations is equipped with the product topology where L is equipped with the discrete topology. We say a configuration \((l', \varvec{x}')\) is a successor of \((l, \varvec{x})\), if \(l\mapsto l'\) and the following hold.

If \(l \in L_D\), then \(\varvec{x}' = \varvec{x}\) and \(\varvec{x} \in G(l, l')\). – If \(l \in L_N \cup L_P\), then \(\varvec{x}' = \varvec{x}\).

If \(l \in L_A\), then \(\varvec{x}' = {\varvec{x}}({x_j} \leftarrow {a})\), where \({\varvec{x}}({x_j} \leftarrow {a})\) denotes the vector obtained by replacing the \(x_j\)component of \(\varvec{x}\) by a. Here \(x_{j}\) is such that \(\mathrm {Up}(l) = (x_j, u)\), and a is chosen as follows: (1) \(a = u(\varvec{x})\) if \(u \in \mathcal {B}(\mathbb {R}^V, \mathbb {R})\); (2) \(a \in \mathrm {supp}(u)\) if \(u \in \mathcal {D}(\mathbb {R})\); and (3) \(a \in u\) if \(u \in \mathcal {B}(\mathbb {R})\).
An invariant of a pCFG \(\varGamma \) is a measurable set \(I \in \mathcal {B}(S)\) such that \((l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}}) \in I\) and I is closed under taking successors (i.e. if \(c \in I\) and \(c'\) is a successor of c then \(c' \in I\)). Use of invariants is a common technique in automated synthesis of supermartingales [1]: it restricts configuration spaces and thus makes the constraints on supermartingales weaker. It is also common to take an invariant as a measurable set [1]. A run of \(\varGamma \) is an infinite sequence of configurations \(c_0 c_1 \dots \) such that \(c_0\) is the initial configuration \((l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}})\) and \(c_{i+1}\) is a successor of \(c_i\) for each i. Let \(\mathrm {Run}({\varGamma })\) be the set of runs of \(\varGamma \).
A scheduler resolves nondeterminism: at a location in \(L_{N}\cup L_{AN}\), it chooses a distribution of next configurations depending on the history of configurations visited so far. Given a pCFG \(\varGamma \) and a scheduler \(\sigma \) of \(\varGamma \), a probability measure \(\nu _{\sigma }^{\varGamma }\) on \(\mathrm {Run}({\varGamma })\) is defined in the usual manner. See [22, Appendix B] for details.
Definition 2.2
(reaching time \(T_{C}^{\varGamma }, T_{C,\sigma }^{\varGamma }\)). Let \(\varGamma \) be a pCFG and \(C\subseteq S\) be a set of configurations called a destination. The reaching time to C is a function \(T_C^{\varGamma } : \mathrm {Run}({\varGamma }) \rightarrow [0, \infty ]\) defined by \( (T_C^{\varGamma })(c_{0}c_{1}\cdots ) = {\mathrm {argmin}}_{i\in \mathbb {N}} (c_{i}\in C) \). Fixing a scheduler \(\sigma \) makes \(T_{C}^{\varGamma }\) a random variable, since \(\sigma \) determines a probability measure \(\nu _{\sigma }^{\varGamma }\) on \(\mathrm {Run}({\varGamma })\). It is denoted by \(T_{C,\sigma }^{\varGamma }\).
Runtimes of pCFGs are a special case of reaching times, namely to the set of terminating configurations.
The following higher moments are central to our framework. Recall that we are interested in demonic schedulers, i.e. those which make runtimes longer.
Definition 2.3
(\( \mathbb {M}_{C, \sigma }^{\varGamma , k}\) and \(\overline{\mathbb {M}}_{C}^{\varGamma , k}\)). Assume the setting of Definition 2.2, and let \(k\in \mathbb {N}\) and \(c \in S\). We write \(\mathbb {M}_{C, \sigma }^{\varGamma , k}(c)\) for the kth moment of the reaching time of \(\varGamma \) from c to C under the scheduler \(\sigma \), i.e. that is, \( \mathbb {M}_{C, \sigma }^{\varGamma , k}(c) = \mathbb {E}[{(T^{\varGamma _c}_{C,\sigma })^k}] = \int (T^{\varGamma _c}_C)^k\, \mathrm {d} \nu _{\sigma }^{\varGamma _c} \) where \(\varGamma _c\) is a pCFG obtained from \(\varGamma \) by changing the initial configuration to c. Their supremum under varying \(\sigma \) is denoted by \( \overline{\mathbb {M}}_{C}^{\varGamma , k} := \sup _{\sigma } \mathbb {M}_{C, \sigma }^{\varGamma , k} \).
3 Ranking Supermartingale for Higher Moments
We introduce one of the main contributions in the paper, a notion of ranking supermartingale that overapproximates higher moments. It is motivated by the following observation: martingalebased reasoning about the second moment must concur with one about the first moment. We conduct a systematic theoretical extension that features an ordertheoretic foundation and vectorvalued supermartingales. The theory accommodates nondeterminism and continuous distributions, too. We omit some details and proofs; they are in [22, Appendix C].
The fully general theory for higher moments will be presented in Sect. 3.2; we present its restriction to the second moments in Sect. 3.1 for readability.
Prior to these, we review the existing theory of ranking supermartingales, through the lens of ordertheoretic fixed points. In doing so we follow [33].
Definition 3.1
(“nexttime” operation \(\overline{\mathbb {X}}\) (preexpectation)). Given \(\eta : S\rightarrow [0, \infty ]\), let \(\overline{\mathbb {X}} \eta : S\rightarrow [0, \infty ]\) be the function defined as follows.

If \(l \in L_D\) and \(\varvec{x} \vDash G(l, l')\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \eta (l', \varvec{x})\).

If \(l \in L_P\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \sum _{l \mapsto l'} \mathrm {Pr}_l(l') \eta (l', \varvec{x})\).

If \(l \in L_N\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \sup _{l \mapsto l'} \eta (l', \varvec{x})\).

If \(l \in L_A\), \(\mathrm {Up}(l) = (x_j, u)\) and \(l \mapsto l'\), if \(u \in \mathcal {B}(\mathbb {R}^V, \mathbb {R})\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \eta (l', {\varvec{x}}({x_j} \leftarrow {u(\varvec{x})}))\); if \(u \in \mathcal {D}(\mathbb {R})\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \int _{\mathbb {R}} \eta (l', {\varvec{x}}({x_j} \leftarrow {y}))\, \mathrm {d} u(y)\); and if \(u \in \mathcal {B}(\mathbb {R})\), then \((\overline{\mathbb {X}} \eta ) (l, \varvec{x}) = \sup _{y \in u} \eta (l', {\varvec{x}}({x_j} \leftarrow {y}))\).
Intuitively, \(\overline{\mathbb {X}} \eta \) is the expectation of \(\eta \) after one transition. Nondeterminism is resolved by the maximal choice.
The following theorem is fundamental in theoretical computer science.
Theorem 3.2
(Knaster–Tarski, [34]). Let \((L, \le )\) be a complete lattice and \(f : L \rightarrow L\) be a monotone function. The least fixed point \(\mu f\) is the least prefixed point, i.e. \(\mu f = \min \{ l \in L \mid f(l) \le l \}\). \(\square \)
The significance of the KnasterTarski theorem in verification lies in the induced proof rule: \(f(l)\le l\Rightarrow \mu f\le l\). Instantiating to the expected reaching time \( \overline{\mathbb {M}}^{\varGamma , 1}_{C}\!=\! \mu F_{1}\), it means \(F_1 (\eta ) \!\le \! \eta \Rightarrow \overline{\mathbb {M}}^{\varGamma , 1}_{C} \!\le \! \eta \), i.e. an arbitrary prefixed point of \(F_{1}\)—which coincides with the notion of ranking supermartingale [4]—overapproximates the expected reaching time. This proves soundness of ranking supermartingales.
3.1 Ranking Supermartingales for the Second Moments
We extend ranking supermartingales to the second moments. It paves the way to a fully general theory (up to the Kth moments) in Sect. 3.2.
The key in the martingalebased reasoning of expected reaching times (i.e. first moments) was that they are characterized as the least fixed point of a function \(F_1\). Here it is crucial that for an arbitrary random variable T, we have \(\mathbb {E}[{T+1}]=\mathbb {E}[{T}]+1\) and therefore we can calculate \(\mathbb {E}[{T+1}]\) from \(\mathbb {E}[{T}]\). However, this is not the case for second moments. As \(\mathbb {E}[{(T+1)^2}] = \mathbb {E}[{T^2}] + 2 \mathbb {E}[{T}] + 1\), calculating the second moment requires not only \(\mathbb {E}[{T^2}]\) but also \(\mathbb {E}[{T}]\). This encourages us to define a vectorvalued supermartingale.
Definition 3.3
(timeelapse function \(\mathrm {El}_{1}\)). A function \(\mathrm {El}_{1} \!:\! [0, \infty ]^2 \rightarrow [0, \infty ]^2\) is defined by \(\mathrm {El}_{1}(x_1, x_2) = (x_1 + 1, x_2 + 2 x_1 + 1)\).
Then, an extension of \(F_1\) for second moments can be defined as a combination of the timeelapse function \(\mathrm {El}_{1}\) and the preexpectation \(\overline{\mathbb {X}}\).
Definition 3.4
We can extend the complete lattice structure of \([0, \infty ]\) to the function space \(S \rightarrow [0, \infty ]^2\) in a pointwise manner. It is a routine to prove that \(F_2\) is monotone with respect to this complete lattice structure. Hence \(F_2\) has the least fixed point. In fact, while \(\overline{\mathbb {M}}^{\varGamma , 1}_{C}\) was characterized as the least fixed point of \(F_1\), a tuple \((\overline{\mathbb {M}}^{\varGamma , 1}_{C}, \overline{\mathbb {M}}^{\varGamma , 2}_{C})\) is not the least fixed point of \(F_2\) (cf. Example 3.8 and Theorem 3.9). However, the least fixed point of \(F_2\) overapproximates the tuple of moments.
Theorem 3.5
For any configuration \(c \in I\), \((\mu F_2) (c) \ge (\overline{\mathbb {M}}^{\varGamma , 1}_{C} (c), \overline{\mathbb {M}}^{\varGamma , 2}_{C} (c))\). \(\square \)
Like ranking supermartingale for first moments, ranking supermartingale for second moments is defined as a prefixed point of \(F_2\), i.e. a function \(\eta \) such that \(\eta \ge F_2 (\eta )\). However, we modify the definition for the sake of implementation.
Definition 3.6
(ranking supermartingale for second moments). A ranking supermartingale for second moments is a function \(\eta : S\rightarrow \mathbb {R}^2\) such that: (i) \(\eta (c) \ge (\overline{\mathbb {X}}(\mathrm {El}_{1} \mathrel {\circ }\eta ))(c)\) for each \(c \in I \setminus C\); and (ii) \(\eta (c) \ge 0\) for each \(c \in I\).
Here, the timeelapse function \(\mathrm {El}_{1}\) captures a positive decrease of the ranking supermartingale. Even though we only have inequality in Theorem 3.5, we can prove the following desired property of our supermartingale notion.
Theorem 3.7
If \(\eta : S\rightarrow \mathbb {R}^2\) is a supermartingale for second moments, then \(\bigl (\overline{\mathbb {M}}^{\varGamma , 1}_{C}(c), \overline{\mathbb {M}}^{\varGamma , 2}_{C}(c)\bigr ) \le \eta (c)\) for each \(c \in I\). \(\square \)
The following example and theorem show that we cannot replace \(\ge \) with \(=\) in Theorem 3.5 in general, but it is possible in the absence of nondeterminism.
Example 3.8
The figure on the right shows a pCFG such that \(l_2 \in L_P\) and all the other locations are in \(L_N\), the initial location is \(l_0\) and \(l_{12}\) is a terminating location. For the pCFG, the lefthand side of the inequality in
Theorem 3.5 is \(\mu F_2 (l_0) = \left( 6, 37.5 \right) \). In contrast, if a scheduler \(\sigma \) takes a transition from \(l_1\) to \(l_2\) with probability p, \((\mathbb {M}^{\varGamma , 1}_{C, \sigma }(l_0), \mathbb {M}^{\varGamma , 2}_{C, \sigma }(l_0)) = \left( 6  \frac{1}{2} p, 36  \frac{5}{2} p \right) \). Hence the righthand side is \((\overline{\mathbb {M}}^{\varGamma , 1}_{C}(l_0), \overline{\mathbb {M}}^{\varGamma , 2}_{C}(l_0)) = (6, 36)\).
Theorem 3.9
If \(L_N = L_{AN} = \emptyset \), \(\forall c\in I.\, (\mu F_2) (c) = (\overline{\mathbb {M}}^{\varGamma , 1}_{C}(c), \overline{\mathbb {M}}^{\varGamma , 2}_{C}(c))\). \(\square \)
3.2 Ranking Supermartingales for the Higher Moments
We extend the result in Sect. 3.1 to moments higher than second.
Firstly, the timeelapse function \(\mathrm {El}_{1}\) is generalized as follows.
Definition 3.10
(timeelapse function \(\mathrm {El}_{1}^{K, k}\)). For \(K\!\in \!\mathbb {N}\) and \(k\!\in \!\{1, \dots , K\}\), a function \(\mathrm {El}_{1}^{K, k} : [0, \infty ]^K \rightarrow [0, \infty ]\) is defined by \(\mathrm {El}_{1}^{K, k}(x_1, \dots , x_K) = 1 + \sum _{j=1}^k \left( {\begin{array}{c}k\\ j\end{array}}\right) x_j\). Here \(\left( {\begin{array}{c}k\\ j\end{array}}\right) \) is the binomial coefficient.
Again, a monotone function \(F_K\) is defined as a combination of the timeelapse function \(\mathrm {El}_{1}^{K, k}\) and the preexpectation \(\overline{\mathbb {X}}\).
Definition 3.11
As in Definition 3.6, we define a supermartingale as a prefixed point of \(F_K\).
Definition 3.12
(ranking supermartingale for Kth moments). We define \(\eta _1, \dots , \eta _K : S\rightarrow \mathbb {R}\) by \((\eta _1(c), \dots , \eta _K(c)) = \eta (c)\). A ranking supermartingale for Kth moments is a function \(\eta : S\rightarrow \mathbb {R}^K\) such that for each k, (i) \(\eta _k(c) \ge (\overline{\mathbb {X}} (\mathrm {El}_{1}^{K, k} \mathrel {\circ }\eta _k))(c)\) for each \(c \in I \setminus C\); and (ii) \(\eta _k(c) \ge 0\) for each \(c \in I\).
For higher moments, we can prove an analogous result to Theorem 3.7.
Theorem 3.13
If \(\eta \) is a supermartingale for Kth moments, then for each \(c \in I\), \((\overline{\mathbb {M}}^{\varGamma , 1}_{C}(c), \dots , \overline{\mathbb {M}}^{\varGamma , K}_{C}(c)) \le \eta (c)\). \(\square \)
4 From Moments to Tail Probabilities
We discuss how to obtain upper bounds of tail probabilities of runtimes from upper bounds of higher moments of runtimes. Combined with the result in Sect. 3, it induces a martingalebased method for overapproximating tail probabilities.
We use a concentration inequality. There are many choices of concentration inequalities (see e.g. [3]), and we use a variant of Markov’s inequality. We prove that the concentration inequality is not only sound but also complete in a sense.
Formally, our goal is to calculate is an upper bound of \(\mathrm {Pr}({T^{\varGamma }_{C, \sigma } \ge d})\) for a given deadline \(d > 0\), under the assumption that we know upper bounds \(u_1, \dots , u_K\) of moments \(\mathbb {E}[{T^{\varGamma }_{C, \sigma }}], \dots , \mathbb {E}[{(T^{\varGamma }_{C, \sigma })^K}]\). In other words, we want to overapproximate \(\sup _{\mu } \mu ([d, \infty ])\) where \(\mu \) ranges over the set of probability measures on \([0, \infty ]\) satisfying \(\left( \int x\, \mathrm {d}\mu (x), \dots , \int x^K\, \mathrm {d}\mu (x) \right) \le (u_1, \dots , u_K)\).
To answer this problem, we use a generalized form of Markov’s inequality.
Proposition 4.1
By letting \(\phi (x) = x^k\) in Proposition 4.1, we obtain the following inequality. It gives an upper bound of the tail probability that is “tight.”
Proposition 4.2
Proof
The former part is immediate from Proposition 4.1. For the latter part, consider \(\mu = p \delta _d + (1p) \delta _0\) where \(\delta _x\) is the Dirac measure at x and p is the value of the righthand side of (1). \(\square \)
By combining Theorem 3.13 with Proposition 4.2, we obtain the following corollary. We can use it for overapproximating tail probabilities.
Corollary 4.3
Note that if \(K = 1\), Corollary 4.3 is essentially the same as [5, Thm 4]. Note also that for each K there exists \(d>0\) such that \(\frac{\eta _K(l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}})}{d^K}=\min _{0 \le k \le K} \frac{\eta _k(l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}})}{d^k}\). Hence higher moments become useful in overapproximating tail probabilities as d gets large. Later in Sect. 6, we demonstrate this fact experimentally.
5 TemplateBased Synthesis Algorithm
We discuss an automated synthesis algorithm that calculates an upper bound for the kth moment of the runtime of a pCFG using a supermartingale in Definitions 3.6 or 3.12. It takes a pCFG \(\varGamma \), an invariant I, a set \(C \subseteq I\) of configurations, and a natural number K as input and outputs an upper bound of Kth moment.
Our algorithm is adapted from existing templatebased algorithms for synthesizing a ranking supermartingale (for first moments) [4, 6, 7]. It fixes a linear or polynomial template with unknown coefficients for a supermartingale and using numerical methods like linear programming (LP) or semidefinite programming (SDP), calculate a valuation of the unknown coefficients so that the axioms of ranking supermartingale for Kth moments are satisfied.
We hereby briefly explain the algorithms. See [22, Appendix D] for details.
Linear Template. Our linear templatebased algorithm is adapted from [4, 7]. We should assume that \(\varGamma \), I and C are all “linear” in the sense that expressions appearing in \(\varGamma \) are all linear and I and C are represented by linear inequalities. To deal with assignments from a distribution like \(x:=\mathrm {Norm}(0,1)\), we also assume that expected values of distributions appearing in \(\varGamma \) are known.
The algorithm first fixes a template for a supermartingale: for each location l, it fixes a Ktuple \( \bigl ( \sum _{j=1}^{V} a^l_{j, 1} x_j + b^l_{1}, \ldots , \sum _{j}^{V} a^l_{j, K} x_j + b^l_{K} \bigr ) \) of linear formulas. Here each \(a^l_{j,i}\) and \(b^l_i\) are unknown variables called parameters. The algorithm next collects conditions on the parameters so that the tuples constitute a ranking supermartingale for Kth moments. It results in a conjunction of formulas of a form \(\varphi _1\ge 0\wedge \cdots \wedge \varphi _m\ge 0\,\Rightarrow \, \psi \ge 0\). Here \(\varphi _1,\ldots ,\varphi _m\) are linear formulas without parameters and \(\psi \) is a linear formula where parameters linearly appear in the coefficients. By Farkas’ lemma (see e.g. [29, Cor 7.1h]) we can turn such formulas into linear inequalities over parameters by adding new variables. Its feasibility is efficiently solvable with an LP solver. We naturally wish to minimize an upper bound of the Kth moment, i.e. the last component of \(\eta (l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}})\). We can minimize it by setting it to the objective function of the LP problem.
Polynomial Template. The polynomial templatebased algorithm is based on [6]. This time, \(\varGamma \), I and C can be “polynomial.” To deal with assignments of distributions, we assume that the nth moments of distributions in \(\varGamma \) are easily calculated for each \(n\in \mathbb {N}\). It is similar to the linear templatebased one.
It first fixes a polynomial template for a supermartingale, i.e. it assigns each location l a Ktuple of polynomial expressions with unknown coefficients. Likewise the linear templatebased algorithm, the algorithm reduces the axioms of supermartingale for higher moments to a conjunction of formulas of a form \(\varphi _1\ge 0\wedge \cdots \wedge \varphi _m\ge 0\,\Rightarrow \,\psi \ge 0\). This time, each \(\varphi _i\) is a polynomial formula without parameters and \(\psi \) is a polynomial formula whose coefficients are linear formula over the parameters. In the polynomial case, a conjunction of such formula is reduced to an SDP problem using a theorem called Positivstellensatz (we used a variant called Schmüdgen’s Positivstellensatz [28]). We solve the resulting problem using an SDP solver setting \(\eta (l_{\mathrm {init}}, \varvec{x}_{\mathrm {init}})\) as the objective function.
6 Experiments
We implemented two programs in OCaml to synthesize a supermartingale based on (a) a linear template and (b) a polynomial template. The programs translate a given randomized program to a pCFG and output an LP or SDP problem as described in Sect. 5. An invariant I and a terminal configuration C for the input program are specified manually. See e.g. [20] for automatic synthesis of an invariant. For linear templates, we have used GLPK (v4.65) [12] as an LP solver. For polynomial templates, we have used SOSTOOLS (v3.03) [31] (a sums of squares optimization tool that internally uses an SDP solver) on Matlab (R2018b). We used SDPT3 (v4.0) [30] as an SDP solver. The experiments were carried out on a Surface Pro 4 with an Intel Core i56300U (2.40 GHz) and 8 GB RAM. We tested our implementation for the following two programs and their variants, which were also used in the literature [7, 19]. Their code is in [22, Appendix E].
Coupon collector’s problem. A probabilistic model of collecting coupons enclosed in cereal boxes. There exist n types of coupons, and one repeatedly buy cereal boxes until all the types of coupons are collected. We consider two cases: (11) \(n=2\) and (12) \(n=4\). We tested the linear template program for them.
Random walk. We used three variants of 1dimensional random walks: (21) integervalued one, (22) realvalued one with assignments from continuous distributions, (23) with adversarial nondeterminism; and two variants of 2dimensional random walks (24) and (25) with assignments from continuous distributions and adversarial nondeterminism. We tested both the linear and the polynomial template programs for these examples.
Upper bounds of the moments of runtimes. “” indicates that the LP or SDP solver returned “infeasible”. The “degree” column shows the degree of the polynomial template used in the experiments.
moment  (a) linear template  (b) polynominal template  

upper bound  time (sec)  upper bound  time (sec)  degree  
(11)  1st  13  0.012  
2nd  201  0.019  
3rd  3829  0.023  
(12)  1st  68  0.024  
2nd  3124  0.054  
3rd  171932  0.089  
4th  12049876  0.126  
5th  1048131068  0.191  
(21)  1st  20  0.024  20.0  24.980  2 
2nd    0.013  2320.0  37.609  2  
3rd    0.017    30.932  3  
(22)  1st  75  0.009  75.0  33.372  2 
2nd    0.014  8375.0  73.514  2  
3rd    0.021    170.416  3  
(23)  1st  62  0.020  62.0  40.746  2 
2nd  28605.4  0.038  6710.0  97.156  2  
3rd  19567043.36  0.057    35.427  3  
(24)  1st  96  0.020  95.95  157.748  2 
2nd    0.029  10944.0  361.957  2  
(25)  1st  90  0.022    143.055  2 
2nd    0.042    327.202  2 
We can see that our method is applicable even with nondeterministic branching ((23), (24) and (25)) or assignments from continuous distributions ((22), (24) and (25)). We can use a linear template for bounding higher moments as long as there exists a supermartingale for higher moments representable by linear expressions ((11), (12) and (23)). In contrast, for (21), (22) and (24), only a polynomial template program found a supermartingale for second moments.
It is expectable that the polynomial template program gives a better bound than the linear one because a polynomial template is more expressive than a linear one. However, it did not hold for some test cases, probably because of numerical errors of the SDP solver. For example, (21) has a supermartingale for third moments that can be checked by a hand calculation, but the SDP solver returned “infeasible” in the polynomial template program. It appears that our program fails when large numbers are involved (e.g. the third moments of (21), (22) and (23)). We have also tested a variant of (21) where the initial position is multiplied by 10000. Then the SDP solver returned “infeasible” in the polynomial template program while the linear template program returns a nontrivial bound. Hence it seems that numerical errors are likely to occur to the polynomial template program when large numbers are involved.
Figure 3 shows that the bigger the deadline d is, the more useful higher moments become (cf. a remark just after Corollary 4.3). For example, in (12), an upper bound of \(\mathrm {Pr}({T^{\varGamma }_{C, \sigma } \ge 100})\) calculated from the upper bound of the first moment is 0.680 while that of the fifth moment is 0.105.
To show the merit of our method compared with samplingbased methods, we calculated a tail probability bound for a variant of (22) (shown in Fig. 4 on p. 12) with a deadline \(d = 10^{11}\). Because of its very long expected runtime, a samplingbased method would not work for it. In contrast, the linear templatebased program gave an upper bound \(\mathrm {Pr}({T^{\varGamma }_{C, \sigma } \ge 10^{11}}) \le 5000000025 / 10^{11} \approx 0.05\) in almost the same execution time as (22) (\(<0.02\) s).
7 Related Work
MartingaleBased Analysis of Randomized Programs. Martingalebased methods are widely studied for the termination analysis of randomized programs. One of the first is ranking supermartingales, introduced in [4] for proving almost sure termination. The theory of ranking supermartingales has since been extended actively: accommodating nondeterminism [1, 6, 7, 11], syntaxoriented composition of supermartingales [11], proving properties beyond termination/reachability [13], and so on. Automated templatebased synthesis of supermartingales by constraint solving has been pursued, too [1, 4, 6, 7].
Other martingalebased methods that are fundamentally different from ranking supermartingales have been devised, too. They include: different notions of repulsing supermartingales for refuting termination (in [8, 33]; also studied in control theory [32]); and multiplyscaled submartingales for underapproximating reachability probabilities [33, 36]. See [33] for an overview.
In the literature on martingalebased methods, the one closest to this work is [5]. Among its contribution is the analysis of tail probabilities. It is done by either of the following combinations: (1) differencebounded ranking supermartingales and the corresponding Azuma’s concentration inequality; and (2) (not necessarily differencebounded) ranking supermartingales and Markov’s concentration inequality. When we compare these two methods with ours, the first method requires repeated martingale synthesis for different parameter values, which can pose a performance challenge. The second method corresponds to the restriction of our method to the first moment; recall that we showed the advantage of using higher moments, theoretically (Sect. 4) and experimentally (Sect. 6). See [22, Appendix F.1] for detailed discussions. Implementation is lacking in [5], too.
We use Markov’s inequality to calculate an upper bound of \(\mathrm {Pr}({T_{\mathrm {run}} \ge d})\) from a ranking supermartingale. In [7], Hoeffding’s and Bernstein’s inequalities are used for the same purpose. As the upper bounds obtained by these inequalities are exponentially decreasing with respect to d, they are asymptotically tighter than our bound obtained by Markov’s inequality, assuming that we use the same ranking supermartingale. However, Hoeffding’s and Bernstein’s inequalities are applicable to limited classes of ranking supermartingales (socalled differencebounded and incremental ones, respectively). There exists a randomized program whose tail probability for runtimes is decreasing only polynomially (not exponentially, see [22, Appendix G]); this witnesses that there are cases where the methods in [7] do not apply but ours can.
The work [1] is also close to ours in that their supermartingales are vectorvalued. The difference is in the orders: in [1] they use the lexicographic order between vectors, and they aim to prove almost sure termination. In contrast, we use the pointwise order between vectors, for overapproximating higher moments.
The PredicateTransformer Approach to Runtime Analysis. In the runtime/termination analysis of randomized programs, another principal line of work uses predicate transformers [2, 17, 19], following the precedent works on probabilistic predicate transformers such as [21, 25]. In fact, from the mathematical point of view, the main construct for witnessing runtime/termination in those predicate transformer calculi (called invariants, see e.g. in [19]) is essentially the same thing as ranking supermartingales. Therefore the difference between the martingalebased and predicatetransformer approaches is mostly the matter of presentation—the predicatetransformer approach is more closely tied to program syntax and has a stronger deductive flavor. It also seems that there is less work on automated synthesis in the predicatetransformer approach.
In the predicatetransformer approach, the work [17] is the closest to ours, in that it studies variance of runtimes of randomized programs. The main differences are as follows: (1) computing tail probabilities is not pursued [17]; (2) their extension from expected runtimes to variance involves an additional variable \(\tau \), which poses a challenge in automated synthesis as well as in generalization to even higher moments; and (3) they do not pursue automated analysis. See Appendix F.2 of the extended version [22] for further details.
Higher Moments of Runtimes. Computing and using higher moments of runtimes of probabilistic systems—generalizing randomized programs—has been pursued before. In [9], computing moments of runtimes of finitestate Markov chains is reduced to a certain linear equation. In the study of randomized algorithms, the survey [10] collects a number of methods, among which are some tail probability bounds using higher moments. Unlike ours, none of these methods are languagebased static ones. They do not allow automated analysis.
Other Potential Approaches to Tail Probabilities. We discuss potential approaches to estimating tail probabilities, other than the martingalebased one.
Sampling is widely employed for approximating behaviors of probabilistic systems; especially so in the field of probabilistic programming languages, since exact symbolic reasoning is hard in presence of conditioning. See e.g. [35]. We also used sampling to estimate tail probabilities in (22), Fig. 3. The main advantages of our current approach over sampling are threefold: (1) our upper bounds come with a mathematical guarantee, while the sampling bounds can always be erroneous; (2) it requires ingenuity to sample programs with nondeterminism; and (3) programs whose execution can take millions of years can still be analyzed by our method in a reasonable time, without executing them. The latter advantage is shared by static, languagebased analysis methods in general; see e.g. [2].
Another potential method is probabilistic model checkers such as PRISM [23]. Their algorithms are usually only applicable to finitestate models, and thus not to randomized programs in general. Nevertheless, fixing a deadline d can make the reachable part \(S_{\le d}\) of the configuration space S finite, opening up the possibility of use of model checkers. It is an open question how to do so precisely, and the following challenges are foreseen: (1) if the program contains continuous distributions, the reachable part \(S_{\le d}\) becomes infinite; (2) even if \(S_{\le d}\) is finite, one has to repeat (supposedly expensive) runs of a model checker for each choice of d. In contrast, in our method, an upper bound for the tail probability \(\mathrm {Pr}({T_{\mathrm {run}} \ge d})\) is symbolically expressed as a function of d (Proposition 4.2). Therefore, estimating tail probabilities for varying d is computationally cheap.
8 Conclusions and Future Work
We provided a technique to obtain an upper bound of the tail probability of runtimes given a randomized algorithm and a deadline. We first extended the ordinary ranking supermartingale notion using the ordertheoretic characterization so that it can calculate upper bounds of higher moments of runtimes for randomized programs. Then by using a suitable concentration inequality, we introduced a method to calculate an upper bound of tail probabilities from upper bounds of higher moments. Our method is not only sound but also complete in a sense. Our method was obtained by combining our supermartingale and the concentration inequality. We also implemented an automated synthesis algorithm and demonstrated the applicability of our framework.
Future Work. Example 3.8 shows that our supermartingale is not complete: it sometimes fails to give a tight bound for higher moments. Studying and improving the incompleteness is one possible direction of future work. For example, the following questions would be interesting: Can bounds given by our supermartingale be arbitrarily bad? Can we remedy the completeness by restricting the type of nondeterminism? Can we define a complete supermartingale?
Making our current method compositional is another direction of future research. Use of continuations, as in [18], can be a technical solution.
We are also interested in improving the implementation. The polynomial template program failed to give an upper bound for higher moments because of numerical errors (see Sect. 6). We wish to remedy this situation. There exist several studies for using numerical solvers for verification without affected by numerical errors [14, 15, 16, 26, 27]. We might make use of these works for improvements.
Footnotes
 1.
With the rise of statistical machine learning, probabilistic programs attract a lot of attention. Randomized programs can be thought of as a fragment of probabilistic programs without conditioning (or observation) constructs. In other words, the Bayesian aspect of probabilistic programs is absent in randomized programs.
 2.
Notes
Acknowledgement
We thank the anonymous referees for useful comments. The authors are supported by JST ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), the JSPSINRIA Bilateral Joint Research Project “CRECOGI,” and JSPS KAKENHI Grant No. 15KT0012 & 15K11984. Natsuki Urabe is supported by JSPS KAKENHI Grant No. 16J08157.
References
 1.Agrawal, S., Chatterjee, K., Novotný, P.: Lexicographic ranking supermartingales: an efficient approach to termination of probabilistic programs. PACMPL 2(POPL), 34:1–34:32 (2018)Google Scholar
 2.Batz, K., Kaminski, B.L., Katoen, J.P., Matheja, C.: How long, O Bayesian network, will I sample thee?  A program analysis perspective on expected sampling times. In: Ahmed, A. (ed.) ESOP 2018. LNCS, vol. 10801, pp. 186–213. Springer, Cham (2018). https://doi.org/10.1007/9783319898841_7CrossRefGoogle Scholar
 3.Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)CrossRefGoogle Scholar
 4.Chakarov, A., Sankaranarayanan, S.: Probabilistic program analysis with martingales. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 511–526. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642397998_34CrossRefGoogle Scholar
 5.Chatterjee, K., Fu, H.: Termination of nondeterministic recursive probabilistic programs. CoRR, abs/1701.02944 (2017)Google Scholar
 6.Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz’s. In: Chaudhuri, S., Farzan, A. (eds.) CAV 2016. LNCS, vol. 9779, pp. 3–22. Springer, Cham (2016). https://doi.org/10.1007/9783319415284_1CrossRefGoogle Scholar
 7.Chatterjee, K., Fu, H., Novotný, P., Hasheminezhad, R.: Algorithmic analysis of qualitative and quantitative termination problems for affine probabilistic programs. ACM Trans. Program. Lang. Syst. 40(2), 7:1–7:45 (2018)CrossRefGoogle Scholar
 8.Chatterjee, K., Novotný, P., Zikelic, D.: Stochastic invariants for probabilistic termination. In: POPL, pp. 145–160. ACM (2017)Google Scholar
 9.Dayar, T., Akar, N.: Computing moments of first passage times to a subset of states in markov chains. SIAM J. Matrix Anal. Appl. 27(2), 396–412 (2005)MathSciNetCrossRefGoogle Scholar
 10.Doerr, B.: Probabilistic tools for the analysis of randomized optimization heuristics. CoRR, abs/1801.06733 (2018)Google Scholar
 11.Ferrer Fioriti, L.M., Hermanns, H.: Probabilistic termination: soundness, completeness, and compositionality. In: POPL, pp. 489–501. ACM (2015)Google Scholar
 12.The GNU linear programming kit. https://www.gnu.org/software/glpk/
 13.Jagtap, P., Soudjani, S., Zamani, M.: Temporal logic verification of stochastic systems using barrier certificates. In: Lahiri and Wang [24], pp. 177–193Google Scholar
 14.Jansson, C.: Termination and verification for illposed semidefinite programming problems. Optimization Online (2005)Google Scholar
 15.Jansson, C.: VSDP: a MATLAB software package for verified semidefinite programming. In: NOLTA, pp. 327–330 (2006)Google Scholar
 16.Jansson, C., Chaykin, D., Keil, C.: Rigorous error bounds for the optimal value in semidefinite programming. SIAM J. Numer. Anal. 46(1), 180–200 (2007)MathSciNetCrossRefGoogle Scholar
 17.Kaminski, B.L., Katoen, J.P., Matheja, C.: Inferring covariances for probabilistic programs. In: Agha, G., Van Houdt, B. (eds.) QEST 2016. LNCS, vol. 9826, pp. 191–206. Springer, Cham (2016). https://doi.org/10.1007/9783319434254_14CrossRefGoogle Scholar
 18.Kaminski, B.L., Katoen, J.P., Matheja, C., Olmedo, F.: Weakest precondition reasoning for expected run–times of probabilistic programs. In: Thiemann, P. (ed.) ESOP 2016. LNCS, vol. 9632, pp. 364–389. Springer, Heidelberg (2016). https://doi.org/10.1007/9783662494981_15CrossRefGoogle Scholar
 19.Kaminski, B.L., Katoen, J.P., Matheja, C., Olmedo, F.: Weakest precondition reasoning for expected runtimes of randomized algorithms. J. ACM 65(5), 30:1–30:68 (2018)MathSciNetCrossRefGoogle Scholar
 20.Katoen, J.P., McIver, A., Meinicke, L., Morgan, C.C.: Linearinvariant generation for probabilistic programs: automated support for proofbased methods. In: Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 390–406. Springer, Heidelberg (2010). https://doi.org/10.1007/9783642157691_24CrossRefGoogle Scholar
 21.Kozen, D.: Semantics of probabilistic programs. J. Comput. Syst. Sci. 22(3), 328–350 (1981)MathSciNetCrossRefGoogle Scholar
 22.Kura, S., Urabe, N., Hasuo, I.: Tail probabilities for randomized program runtimes via martingales for higher moments. CoRR, abs/1811.06779 (2018)Google Scholar
 23.Kwiatkowska, M.Z., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic realtime systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/9783642221101_47CrossRefGoogle Scholar
 24.Lahiri, S.K., Wang, C. (eds.): ATVA 2018. LNCS, vol. 11138. Springer, Cham (2018). https://doi.org/10.1007/9783030010904CrossRefGoogle Scholar
 25.Morgan, C., McIver, A., Seidel, K.: Probabilistic predicate transformers. ACM Trans. Program. Lang. Syst. 18(3), 325–353 (1996)CrossRefGoogle Scholar
 26.Roux, P., Iguernlala, M., Conchon, S.: A nonlinear arithmetic procedure for controlcommand software verification. In: Beyer, D., Huisman, M. (eds.) TACAS 2018. LNCS, vol. 10806, pp. 132–151. Springer, Cham (2018). https://doi.org/10.1007/9783319899633_8CrossRefGoogle Scholar
 27.Roux, P., Voronin, Y.L., Sankaranarayanan, S.: Validating numerical semidefinite programming solvers for polynomial invariants. Form. Methods Syst. Des. 53(2), 286–312 (2018)CrossRefGoogle Scholar
 28.Schmüdgen, K.: The kmoment problem for compact semialgebraic sets. Math. Ann. 289(1), 203–206 (1991)MathSciNetCrossRefGoogle Scholar
 29.Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York (1986)zbMATHGoogle Scholar
 30.
 31.SOSTOOLS. http://sysos.eng.ox.ac.uk/sostools/
 32.Steinhardt, J., Tedrake, R.: Finitetime regional verification of stochastic nonlinear systems. Int. J. Robot. Res. 31(7), 901–923 (2012)CrossRefGoogle Scholar
 33.Takisaka, T., Oyabu, Y., Urabe, N., Hasuo, I.: Ranking and repulsing supermartingales for reachability in probabilistic programs. In: Lahiri and Wang [24], pp. 476–493Google Scholar
 34.Tarski, A.: A latticetheoretical fixpoint theorem and its applications. Pac. J. Math. 5, 285–309 (1955)MathSciNetCrossRefGoogle Scholar
 35.Tolpin, D., van de Meent, J.W., Yang, H., Wood, F.D.: Design and implementation of probabilistic programming language anglican. In: IFL, pp. 6:1–6:12. ACM (2016)Google Scholar
 36.Urabe, N., Hara, M., Hasuo, I.: Categorical liveness checking by corecursive algebras. In: Proceedings of LICS 2017, pp. 1–12. IEEE Computer Society (2017)Google Scholar
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.