1 Introduction

We study the applicability of convex optimization to the formal verification of systems that exhibit randomness or stochastic uncertainties. Such systems are formally represented by so-called parametric Markov models.

Many real-world systems exhibit random behavior and stochastic uncertainties. One major example is the field of robotics, where the presence of measurement noise or input disturbances requires special controller synthesis techniques [39] that achieve robustness of robot actions against uncertainties in the robot model and the environment. Formal verification, in turn, offers methods for rigorously proving or disproving properties about the system behavior, and for synthesizing strategies that satisfy these properties. In particular, model checking [36] is a well-studied technique that provides guarantees on appropriate behavior for all possible events and scenarios.

Model checking can be applied to systems with stochastic uncertainties, including discrete-time Markov chains (MCs), Markov decision processes (MDPs), and their continuous-time counterparts [31]. Probabilistic model checkers are able to verify reachability properties like “the probability of reaching a set of unsafe states is \(\le 10\%\)” and expected costs properties like “the expected cost of reaching a goal state is \(\le 20\).” A rich set of properties, specified by linear- and branching-time logics, reduces to such properties [31]. Tools like PRISM  [15], STORM  [29], and iscasMc  [22] are probabilistic model checkers capable of handling a wide range of large-scale problems.

Key requirements for applying model checking are a reliable system model and formal specifications of desired or undesired behaviors. Most approaches therefore assume that models of the stochastic uncertainties are precisely given. For example, if a system description includes an environmental disturbance, the mean of that disturbance should be known before formal statements can be made about expected system behavior. However, in many applications, uncertainty measures (e.g., faultiness, reliability, reaction rates, packet loss ratio) are not exactly known at design time, which gives rise to parametric probabilistic models [1, 30]. Here, transition probabilities are expressed as functions over system parameters, i.e., descriptions of uncertainties. In this setting, parameter synthesis addresses the problem of computing parameter instantiations that lead to the satisfaction of system specifications. More precisely, parameters are mapped to concrete probabilities such that the resulting instantiated model satisfies the specifications. A direct application is model repair [13], where a concrete model (without parameters) is changed (repaired) such that the specifications are satisfied.

Dedicated tools like PARAM  [11], PRISM  [15], or PROPhESY  [25] compute rational functions over parameters that express reachability probabilities or expected costs in a parametric Markov chain (pMC). These optimized tools work with millions of states but are restricted to a few parameters, as the necessary computation of greatest common divisors does not scale well with the number of parameters. Moreover, the resulting functions are inherently nonlinear and often of high degree. Evaluation by an SMT solver over nonlinear arithmetic such as Z3  [17] suffers from the fact that the solving procedures are exponential in the degree of polynomials and the number of variables.

This paper takes an alternative perspective. We discuss a general nonlinear programming formulation for the verification of parametric Markov decision processes (pMDPs). The powerful modeling capabilities of nonlinear programs (NLPs) enable incorporating multi-objective properties and penalties on the parameters of the pMDP. However, because of their generality, solving NLPs to find a global optimum is difficult. Even feasible solutions (satisfying the constraints) cannot always be computed efficiently [5, 37]. In contrast, for the class of NLPs called convex optimization problems, efficient methods to compute feasible solutions and global optima even for large-scale problems are available [38].

We therefore propose a novel automated method of utilizing convex optimization for pMDPs. Many NLP problems for pMDPs belong to the class of signomial programs (SGPs), a certain class of nonconvex optimization problems. For instance, all benchmarks available at the PARAM–webpage [26] belong to this class. Restricting the general pMDP problem accordingly yields a direct and efficient synthesis method—formulated as an NLP—for a large class of pMDP problems. We list the two main technical results of this paper:

  1. We relax nonconvex constraints in SGPs and apply a simple transformation to the parameter functions. The resulting programs are geometric programs (GPs) [7], a class of convex programs. We show that a solution to the relaxed GP induces feasibility (satisfaction of all specifications) in the original pMDP problem. Note that solving GPs is polynomial in the number of variables.

  2. Given an initial feasible solution, we use a technique called sequential convex programming [7] to improve a signomial objective. This local optimization method for nonconvex problems leverages convex optimization by solving a sequence of convex approximations (GPs) of the original SGP.

Sequential convex programming is known to efficiently find a feasible solution with good, though not necessarily globally optimal, objective values [7, 8]. We initialize the sequence with a feasible solution (obtained from the GP) of the original problem and compute a trust region. Inside this region, the optimal value of the approximation of the SGP is at least as good as the objective value at the feasible solution of the GP. The optimal solution of the approximation is then the initial point of the next iteration with a new trust region. This procedure is iterated to approximate a local optimum of the original problem.

Utilizing our results, we discuss the concrete problems of parameter synthesis and model repair for multiple specifications for pMDPs. Experimental results with a prototype implementation show the applicability of our optimization methods to benchmarks of up to \(10^5\) states. As solving GPs is polynomial in the number of variables, our approaches are relatively insensitive to the number of parameters in pMDPs. This is an improvement over state-of-the-art approaches that leverage SMT, which, for our class of problems, scale exponentially in the number of variables and the degree of the polynomials. This is substantiated by our experiments.

Related Work. Several approaches exist for pMCs [11, 12, 23, 25], while approaches for pMDPs [12, 33] are limited. Ceska et al. [21] synthesize rate parameters in stochastic biochemical networks. Multi-objective model checking of non-parametric MDPs [9] is a convex problem [14]. Bortolussi et al. [28] developed a Bayesian statistical algorithm for properties on stochastic population models. Convex uncertainties in MDPs without parameter dependencies are discussed in [20]. Parametric probabilistic models are used to rank patches in the repair of software [32] and to compute perturbation bounds [24, 34].

2 Preliminaries

A probability distribution over a finite or countably infinite set \(X\) is a function \(\mu :X\rightarrow [0,\, 1]\subseteq \mathbb {R}\) with \(\sum _{x\in X}\mu (x)=1\). The set of all distributions on \(X\) is denoted by \( Distr (X)\).

Definition 1

(Monomial, Posynomial, Signomial). Let \(V=\{x_1,\ldots ,x_n\}\) be a finite set of strictly positive real-valued variables. A monomial over V is an expression of the form

$$\begin{aligned} g=c\cdot x_{1}^{a_{1}}\cdots x_{n}^{a_{n}}\ , \end{aligned}$$

where \(c\in \mathbb {R}_{>0} \) is a positive coefficient, and \(a_i\in \mathbb {R}\) are exponents for \(1\le i\le n\). A posynomial over V is a sum of one or more monomials:

$$\begin{aligned} f=\sum _{k=1}^K c_k\cdot x_1^{a_{1k}}\cdots x_n^{a_{nk}} \ . \end{aligned}$$
(1)

If \(c_k\) is allowed to be a negative real number for any \(1\le k\le K\), then the expression (1) is a signomial. The sets of all monomials, posynomials, and signomials over V are denoted by \( Mon_{V} \), \( Pos_{V} \), and \( Sig_{V} \), respectively.
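As a concrete illustration of Definitions 1 and 2, the following sketch represents signomials as lists of (coefficient, exponent-map) terms and applies a valuation by substitution; the representation and the variable names are our own, chosen for illustration.

```python
from math import prod

# A signomial over V is represented as a list of terms (c, {var: exponent}),
# mirroring Definition 1; a posynomial has only positive coefficients, and a
# monomial is a single positively-weighted term.

def evaluate(sig, u):
    """Apply a valuation u (dict: variable -> positive float) to a signomial."""
    return sum(c * prod(u[x] ** a for x, a in exps.items())
               for c, exps in sig)

def is_posynomial(sig):
    return all(c > 0 for c, _ in sig)

def is_monomial(sig):
    return len(sig) == 1 and sig[0][0] > 0

# 1 - p: a signomial that is not a posynomial.
one_minus_p = [(1.0, {}), (-1.0, {"p": 1})]
assert not is_posynomial(one_minus_p)
assert is_monomial([(2.0, {"x": 0.5, "y": -1.0})])
assert abs(evaluate(one_minus_p, {"p": 0.4}) - 0.6) < 1e-12
```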

This definition of monomials differs from the standard algebraic definition where exponents are positive integers with no restriction on the coefficient sign. A sum of monomials is then called a polynomial. Our definitions are consistent with [7].

Definition 2

(Valuation). For a set of real-valued variables V, a valuation u over V is a function \(u:V \rightarrow \mathbb {R}\). The set of all valuations over V is \( Val ^V\).

Applying valuation u to monomial \(g\) over V yields a real number \(g[u]\in \mathbb {R}\) by replacing each occurrence of variables \(x\in V\) in \(g\) by u(x); the procedure is analogous for posynomials and signomials using standard arithmetic operations.

Definition 3

(pMDP and pMC). A parametric Markov decision process (pMDP) is a tuple \(\mathcal {M}{}=(S{},\,s_{ I }{}, Act ,{V},\mathcal {P}{})\) with a finite set S of states, an initial state \(s_{ I }\in S\), a finite set \( Act \) of actions, a finite set of real-valued variables \({V} \), and a transition function \(\mathcal {P}:S \times Act \times S \rightarrow Sig_{{V}} \) satisfying for all \(s\in S: Act (s) \ne \emptyset \), where \( Act (s) = \{\alpha \in Act \mid \exists s'\in S.\,\mathcal {P}(s,\,\alpha ,\,s') \ne 0\}\). If for all \(s\in S\) it holds that \(| Act (s)| = 1\), \(\mathcal {M}\) is called a parametric discrete-time Markov chain (pMC).

\( Act (s)\) is the set of enabled actions at state s; as \( Act (s) \ne \emptyset \), there are no deadlock states. Costs are defined using a state–action cost function \({c}:S \times Act \rightarrow \mathbb {R}_{\ge 0}\).

Remark 1

Largely due to algorithmic reasons, the transition probabilities in the literature [12, 25, 33] are polynomials or rational functions, i.e., fractions of polynomials. Our restriction to signomials is realistic; all benchmarks from the PARAM–webpage [26] contain only signomial transition probabilities.

A pMDP \(\mathcal {M}\) is a Markov decision process (MDP) if the transition function yields valid probability distributions, i.e., \(\mathcal {P}:S \times Act \times S \rightarrow [0, 1]\) and \(\sum _{s'\in S}\mathcal {P}(s,\alpha ,s') = 1\) for all \(s \in S\) and all \(\alpha \in Act (s)\). Analogously, a Markov chain (MC) is a special case of a pMC; a model is parameter-free if all probabilities are constant. Applying a valuation u to a pMDP, denoted \(\mathcal {M}[u]\), replaces each signomial f in \(\mathcal {M}\) by the value f[u]; we call \(\mathcal {M}[u]\) the instantiation of \(\mathcal {M}\) at u. A valuation u is well-defined for \(\mathcal {M}\) if the replacement yields probability distributions at all states; in that case the resulting model \(\mathcal {M}[u]\) is an MDP or an MC.

Example 1

(pMC). Consider a variant of the Knuth–Yao model of a die [2], where a six-sided die is simulated by successive coin flips. We alternate flipping two biased coins, which result in heads with probabilities defined by the monomials p and q, respectively. Consequently, the probability for tails is given by the signomials \(1-p\) and \(1-q\), respectively. The corresponding pMC is depicted in Fig. 1(a); and the instantiated MC for \(p = 0.4\) and \(q = 0.7\) is given in Fig. 1(b). Note that we omit actions, as the model is deterministic.

Fig. 1. A variant of the Knuth–Yao die for unfair coins.
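The instantiation and well-definedness checks of this example can be sketched as follows; the state space below is a schematic fragment in the spirit of Fig. 1, not the exact chain, and the transition functions are encoded as Python callables over the parameters p and q.

```python
# A pMC fragment: transition probabilities are functions of the
# parameters p and q (schematic states, made up for illustration).
pmc = {
    "s0": {"s1": lambda p, q: p, "s2": lambda p, q: 1 - p},
    "s1": {"s3": lambda p, q: q, "s4": lambda p, q: 1 - q},
    "s2": {"s5": lambda p, q: q, "s6": lambda p, q: 1 - q},
}

def instantiate(pmc, p, q):
    """Apply the valuation u = {p, q}: replace each function f by f[u]."""
    return {s: {t: f(p, q) for t, f in row.items()} for s, row in pmc.items()}

def well_defined(mc, eps=1e-9):
    """u is well-defined iff every row is a probability distribution."""
    return all(abs(sum(row.values()) - 1) < eps
               and all(0 <= v <= 1 for v in row.values())
               for row in mc.values())

assert well_defined(instantiate(pmc, 0.4, 0.7))       # valuation of Fig. 1(b)
assert not well_defined(instantiate(pmc, 1.4, 0.7))   # p outside [0, 1]
```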

In order to define a probability measure and expected cost on MDPs, nondeterministic choices are resolved by so-called schedulers. For practical reasons we restrict ourselves to memoryless schedulers; details can be found in [36].

Definition 4

(Scheduler). A (randomized) scheduler for an MDP \(\mathcal {M}\) is a function \(\sigma :S\rightarrow Distr ( Act )\) such that \(\sigma (s)(\alpha ) > 0\) implies \(\alpha \in Act (s)\). The set of all schedulers over \(\mathcal {M}\) is denoted by \( Sched ^\mathcal {M}\).

Applying a scheduler to an MDP yields a so-called induced Markov chain.

Definition 5

(Induced MC). Let MDP \(\mathcal {M}{}=(S{},s_{ I }{}, Act ,\mathcal {P}{})\) and scheduler \(\sigma \in Sched ^\mathcal {M}\). The MC induced by \(\mathcal {M}\) and \(\sigma \) is \(\mathcal {M}^\sigma =(S,s_{ I }, Act ,\mathcal {P}^\sigma )\) where for all \(s,s'\in S\),

$$\begin{aligned} \mathcal {P}^\sigma (s,s')=\sum _{\alpha \in Act (s)} \sigma (s)(\alpha )\cdot \mathcal {P}(s,\alpha ,s'). \end{aligned}$$
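Definition 5 translates directly to code. The sketch below folds a randomized scheduler into the transition function of a small MDP; the MDP, state names, and scheduler are made up for illustration.

```python
def induced_mc(P, sched):
    """P: state -> action -> {succ: prob}; sched: state -> {action: prob}.
    Returns the induced MC per Definition 5."""
    return {
        s: {
            t: sum(sched[s].get(a, 0.0) * dist.get(t, 0.0)
                   for a, dist in acts.items())
            for t in {t for dist in acts.values() for t in dist}
        }
        for s, acts in P.items()
    }

# Two actions in s0; the scheduler mixes them 50/50.
P = {
    "s0": {"a": {"s1": 1.0}, "b": {"s1": 0.2, "s2": 0.8}},
    "s1": {"a": {"s1": 1.0}},
    "s2": {"a": {"s2": 1.0}},
}
sched = {"s0": {"a": 0.5, "b": 0.5}, "s1": {"a": 1.0}, "s2": {"a": 1.0}}
mc = induced_mc(P, sched)
assert abs(mc["s0"]["s1"] - 0.6) < 1e-12   # 0.5*1.0 + 0.5*0.2
assert abs(mc["s0"]["s2"] - 0.4) < 1e-12   # 0.5*0.8
```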

We consider reachability properties and expected cost properties. For MC \(\mathcal {D}\) with states S, let \(\mathrm {Pr}^{\mathcal {D}}_{s}(\lozenge T)\) denote the probability of reaching a set of target states \(T \subseteq S\) from state \(s\in S\); simply \(\mathrm {Pr}^{\mathcal {D}}(\lozenge T)\) denotes the probability for initial state \(s_{ I }\). We use the standard probability measure as in [36, Chap. 10]. For threshold \(\lambda \in [0, 1]\), the reachability property asserting that a target state is to be reached with probability at most \(\lambda \) is denoted \(\varphi = \mathbb {P}_{\le \lambda }(\lozenge T)\). The property is satisfied by \(\mathcal {D}\), written \(\mathcal {D}\models \varphi \), iff \(\mathrm {Pr}^{\mathcal {D}}(\lozenge T)\le \lambda \).

The cost of a path through MC \(\mathcal {D}\) until a set of goal states \(G\subseteq S\) is the sum of the action costs along the path. The expected cost of a finite path is the product of its probability and its cost. For \(\mathrm {Pr}^{\mathcal {D}}(\lozenge G) = 1\), the expected cost of reaching G is the sum of the expected costs of all paths leading to G. An expected cost property \(\mathrm {EC}_{\le \kappa }(\lozenge G)\) is satisfied if the expected cost of reaching G is bounded by a threshold \(\kappa \in \mathbb {R}\). Formal definitions are given in, e.g., [36].
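Both kinds of properties can be evaluated on an instantiated (parameter-free) MC by iterating the standard fixed-point equations from below (value iteration); the small MCs and numbers below are made up for illustration.

```python
# Reachability probability: iterate p_s = sum_{s'} P(s,s') * p_{s'}
# with p_s = 1 on the target set T; iteration from 0 converges to the
# least fixed point, i.e., the actual reachability probabilities.
P = {"s0": {"s1": 0.4, "s2": 0.6},
     "s1": {"s1": 1.0},            # target, absorbing
     "s2": {"s2": 1.0}}            # sink, absorbing
T = {"s1"}

def reach_prob(P, T, iters=1000):
    p = {s: 1.0 if s in T else 0.0 for s in P}
    for _ in range(iters):
        p = {s: 1.0 if s in T else
             sum(pr * p[t] for t, pr in P[s].items()) for s in P}
    return p

assert abs(reach_prob(P, T)["s0"] - 0.4) < 1e-9   # Pr(◇T) from s0

# Expected cost: cost 1 per step in s0 until the goal g; the exact value
# solves c = 1 + 0.5*c, i.e., c = 2.
Pc = {"s0": {"s0": 0.5, "g": 0.5}, "g": {"g": 1.0}}

def exp_cost(P, G, cost, iters=1000):
    c = {s: 0.0 for s in P}
    for _ in range(iters):
        c = {s: 0.0 if s in G else
             cost[s] + sum(pr * c[t] for t, pr in P[s].items()) for s in P}
    return c

assert abs(exp_cost(Pc, {"g"}, {"s0": 1.0, "g": 0.0})["s0"] - 2.0) < 1e-9
```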

If multiple specifications \(\varphi _1,\ldots ,\varphi _q\) are given, which are either reachability properties or expected cost properties of the aforementioned forms, we write the satisfaction of all specifications \(\varphi _1,\ldots ,\varphi _q\) for an MC \(\mathcal {D}\) as \(\mathcal {D}\models \varphi _1\wedge \ldots \wedge \varphi _q\).

An MDP \(\mathcal {M}\) satisfies the specifications \(\varphi _1,\ldots ,\varphi _q\), iff for all schedulers \(\sigma \in Sched ^\mathcal {M}\) it holds that \(\mathcal {M}^\sigma \models \varphi _1\wedge \ldots \wedge \varphi _q\). The verification of multiple specifications is also referred to as multi-objective model checking [9, 16]. We are also interested in the so-called scheduler synthesis problem, where the aim is to find a scheduler \(\sigma \) such that the specifications are satisfied (although other schedulers may not satisfy the specifications).

3 Nonlinear Programming for pMDPs

In this section we formally state a general pMDP parameter synthesis problem and describe how it can be formulated using nonlinear programming.

3.1 Formal Problem Statement

Problem 1. Given a pMDP \(\mathcal {M}=(S,\,s_{ I }, Act ,{V},\mathcal {P})\), specifications \(\varphi _1,\ldots ,\varphi _q\) that are either reachability properties or expected cost properties, and an objective function f over the variables \({V}\), compute a well-defined valuation \(u\in Val ^{{V}}\) and a randomized scheduler \(\sigma \in Sched ^{\mathcal {M}}\) such that (a) \(\mathcal {M}^\sigma [u]\models \varphi _1\wedge \ldots \wedge \varphi _q\) and (b) f is minimized.

Intuitively, we wish to compute a parameter valuation and a scheduler such that all specifications are satisfied, and the objective is globally minimized. We refer to a valuation–scheduler pair \((u,\sigma )\) that satisfies condition (a), i.e., only guarantees satisfaction of the specifications but does not necessarily minimize the objective f, as a feasible solution to the pMDP synthesis problem. If both (a) and (b) are satisfied, the pair is an optimal solution to the pMDP synthesis problem.

3.2 Nonlinear Encoding

We now provide an NLP encoding of Problem 1. A general NLP over a set of real-valued variables \(\mathcal {V} \) can be written as

$$\begin{aligned} \text {minimize}&\quad f \end{aligned}$$
(2)
$$\begin{aligned} \text {subject to}&\nonumber \\ \forall i.\, 1\le i\le m&\quad g_i \le 0, \end{aligned}$$
(3)
$$\begin{aligned} \forall j.\, 1\le j\le p&\quad h_j = 0, \end{aligned}$$
(4)

where f, \(g_i\), and \(h_j\) are arbitrary functions over \(\mathcal {V} \), and m and p are the number of inequality and equality constraints of the program respectively. Tools like IPOPT  [10] solve small instances of such problems.

Consider a pMDP \(\mathcal {M}{}=(S{},\,s_{ I }{}, Act ,{V},\mathcal {P}{})\) with specifications \(\varphi _1=\mathbb {P}_{\le \lambda }(\lozenge T)\) and \(\varphi _2=\mathrm {EC}_{\le \kappa }(\lozenge G)\). We will discuss how additional specifications of either type can be encoded. The set \(\mathcal {V} = {V} \cup W\) of variables of the NLP consists of the variables \({V} \) that occur in the pMDP as well as a set W of additional variables:

  • \(\{ \sigma ^{s,\alpha } \mid s \in S, \alpha \in Act (s) \}\), which define the randomized scheduler \(\sigma \) by \(\sigma (s)(\alpha )=\sigma ^{s,\alpha }\).

  • \(\{ p_s \mid s \in S \}\), where \(p_s\) is the probability of reaching the target set \(T\subseteq S\) from state s under scheduler \(\sigma \), and

  • \(\{ c_s \mid s \in S \}\), where \(c_s\) is the expected cost to reach \(G\subseteq S\) from s under \(\sigma \).

A valuation over \(\mathcal {V} \) consists of a valuation \(u\in Val ^V\) over the pMDP variables and a valuation \(w\in Val ^W\) over the additional variables.

$$\begin{aligned} \text{ minimize }&\quad f \end{aligned}$$
(5)
$$\begin{aligned} \text{ subject } \text{ to }\nonumber \\&\quad p_{s_{ I }}\le \lambda , \end{aligned}$$
(6)
$$\begin{aligned}&\quad c_{s_{ I }}\le \kappa , \end{aligned}$$
(7)
$$\begin{aligned} \forall s\in S.&\quad \sum _{\alpha \in Act (s)}\sigma ^{s,\alpha }=1, \end{aligned}$$
(8)
$$\begin{aligned} \forall s\in S\,~\forall \alpha \in Act (s).&\quad 0 \le \sigma ^{s,\alpha } \le 1, \end{aligned}$$
(9)
$$\begin{aligned} \forall s\in S\,~\forall \alpha \in Act (s).&\quad \sum _{s'\in S}\mathcal {P}(s,\alpha ,s')=1, \end{aligned}$$
(10)
$$\begin{aligned} \forall s, s'\in S\,~\forall \alpha \in Act (s).&\quad 0 \le \mathcal {P}(s,\alpha ,s') \le 1, \end{aligned}$$
(11)
$$\begin{aligned} \forall s\in T.&\quad p_s=1, \end{aligned}$$
(12)
$$\begin{aligned} \forall s\in S\setminus T.&\quad p_s=\sum _{\alpha \in Act (s)}\sigma ^{s,\alpha }\cdot \sum _{s'\in S} \mathcal {P}(s,\alpha ,s')\cdot p_{s'}, \end{aligned}$$
(13)
$$\begin{aligned} \forall s\in G.&\quad c_s=0, \end{aligned}$$
(14)
$$\begin{aligned} \forall s\in S\setminus G.&\quad c_s= \sum _{\alpha \in Act (s)} \sigma ^{s,\alpha } \cdot \Bigl (c(s,\alpha ) + \sum _{s'\in S} \mathcal {P}(s,\alpha ,s') \cdot c_{s'}\Bigr ). \end{aligned}$$
(15)

The NLP (5)–(15) encodes Problem 1 in the following way. The objective function f in (5) is any real-valued function over the variables \(\mathcal {V} \). The constraints (6) and (7) encode the specifications \(\varphi _1\) and \(\varphi _2\), respectively. The constraints (8)–(9) ensure that the scheduler obtained is well-defined by requiring that the scheduler variables at each state sum to unity. Similarly, the constraints (10)–(11) ensure that for all states, parameters from \({V} \) are instantiated such that probabilities sum up to one. (These constraints are included if not all probabilities at a state are constant.) The probability of reaching the target for all states in the target set is set to one using (12). The reachability probabilities in each state depend on the reachability of the successor states and the transition probabilities to those states through (13). Analogously to the reachability probabilities, the cost for each goal state \(G\subseteq S\) must be zero, thereby precluding the collection of infinite cost at absorbing states, as enforced by (14). Finally, the expected cost for all states except target states is given by the equation (15), where according to the strategy \(\sigma \) the cost of each action is added to the expected cost of the successors.
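A feasibility check against this encoding can be sketched for a one-parameter pMC (where the scheduler constraints (8)–(9) are trivial); the states, numbers, and threshold below are made up for illustration.

```python
lam = 0.5   # threshold λ of φ = P_{<=λ}(◇T); made up

def constraints_hold(u, lam, eps=1e-9):
    """Check constraints (6) and (10)-(13) at candidate valuation u."""
    p_par = u["p"]
    # transition probabilities at u: well-definedness, constraints (10)-(11)
    rows = {"s0": {"s1": p_par, "s2": 1 - p_par},
            "s1": {"s1": 1.0}, "s2": {"s2": 1.0}}
    if any(abs(sum(r.values()) - 1) > eps or
           any(not (0 <= v <= 1) for v in r.values())
           for r in rows.values()):
        return False
    # probability variables p_s, constraints (12)-(13) with T = {s1};
    # s2 cannot reach T, so its (least) solution is 0
    p = {"s1": 1.0, "s2": 0.0}
    p["s0"] = rows["s0"]["s1"] * p["s1"] + rows["s0"]["s2"] * p["s2"]
    return p["s0"] <= lam + eps          # threshold constraint (6)

assert constraints_hold({"p": 0.4}, lam)       # feasible valuation
assert not constraints_hold({"p": 0.9}, lam)   # violates (6)
```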

We can readily extend the NLP to include more specifications. If another reachability property \(\varphi '=\mathbb {P}_{\le \lambda '}(\lozenge T')\) is given, we add the set of probability variables \(\{ p'_s \mid s \in S\}\) to W, and duplicate the constraints (12)–(13) accordingly. To ensure satisfaction of \(\varphi '\), we also add the constraint \(p'_{s_{ I }}\le \lambda '\). The procedure is similar for additional expected cost properties. By construction, we have the following result relating the NLP encoding and Problem 1.

Theorem 1

The NLP (5)–(15) is sound and complete with respect to Problem 1.

We refer to soundness in the sense that each variable assignment satisfying the constraints induces a scheduler and a valuation of the parameters that form a feasible solution of the problem. Moreover, any optimal solution to the NLP induces an optimal solution of the problem. Completeness means that all possible solutions of the problem can be encoded by this NLP; unsatisfiability of the NLP therefore means that no such solution exists, i.e., the problem is infeasible.

Signomial Programs. By Definitions 1 and 3, all constraints in the NLP consist of signomial functions. A special class of NLPs known as signomial programs (SGPs) is of the form (2)–(4) where f, \(g_i\) and \(h_j\) are signomials over \(\mathcal {V} \), see Definition 1. Therefore, we observe that the NLP (5)–(15) is an SGP. We will refer to the NLP as an SGP in what follows.

SGPs with equality constraints consisting of functions that are not affine are not convex in general. In particular, the SGP (5)–(15) is not necessarily convex. Consider a simple pMC only having transition probabilities of the form p and \(1-p\), as in Example 1. The function in the equality constraint (13) of the corresponding SGP encoding is not affine in parameter p and the probability variable \(p_s\) for some state \(s\in S\). More generally, the equality constraints (10), (13), and (15) involving \(\mathcal {P}\) are not necessarily affine, and thus the SGP may not be a convex program [38]. Whereas for convex programs global optimal solutions can be found efficiently [38], such guarantees are not given for SGPs. However, we can efficiently obtain local optimal solutions for SGPs in our setting, as shown in the following sections.

4 Convexification

We investigate how to transform the SGP (5)–(15) into a convex program by relaxing equality constraints and a lifting of variables of the SGP. A certain subclass of SGPs called geometric programs (GPs) can be transformed into convex programs [7, Sect. 2.5] and solved efficiently. A GP is an SGP of the form (2)–(4) where \(f,g_i\in Pos_{\mathcal {V}} \) and \(h_j\in Mon_{\mathcal {V}} \). We will refer to a constraint with posynomial or monomial function as a posynomial or monomial constraint, respectively.

4.1 Transformation and Relaxation of Equality Constraints

As discussed before, the SGP (5)–(15) is not convex because of the presence of non-affine equality constraints. First observe the following transformation [7]:

$$\begin{aligned} f\le h\Longleftrightarrow \frac{f}{h}\le 1, \end{aligned}$$
(16)

for \(f\in Pos_{\mathcal {V}} \) and \(h\in Mon_{\mathcal {V}} \). Note that monomials are strictly positive (Definition 1). This (division-)transformation of \(f\le h\) yields a posynomial inequality constraint.

We relax all equality constraints of SGP (5)–(15) that are not monomials to inequalities, then we apply the division-transformation wherever possible. Constraints (6), (7), (8), (10), (13), and (15) are transformed to

$$\begin{aligned}&\quad \frac{p_{s_{ I }}}{\lambda }\le 1, \end{aligned}$$
(17)
$$\begin{aligned}&\quad \frac{c_{s_{ I }}}{\kappa }\le 1, \end{aligned}$$
(18)
$$\begin{aligned} \forall s\in S.&\quad \sum _{\alpha \in Act (s)}\sigma ^{s,\alpha }\le 1, \end{aligned}$$
(19)
$$\begin{aligned} \forall s\in S\, \forall \alpha \in Act (s).&\quad \sum _{s'\in S}\mathcal {P}(s,\alpha ,s')\le 1, \end{aligned}$$
(20)
$$\begin{aligned} \forall s\in S\setminus T.&\quad \frac{\sum \limits _{\alpha \in Act (s)}\sigma ^{s,\alpha }\cdot \sum \limits _{s'\in S} \mathcal {P}(s,\alpha ,s')\cdot p_{s'}}{p_s}\le 1, \end{aligned}$$
(21)
$$\begin{aligned} \forall s\in S\setminus G.&\quad \frac{\sum \limits _{\alpha \in Act (s)} \sigma ^{s,\alpha } \cdot \Bigl (c(s,\alpha ) + \sum \limits _{s'\in S} \mathcal {P}(s,\alpha ,s') \cdot c_{s'}\Bigr )}{c_s}\le 1 . \end{aligned}$$
(22)

These constraints are not necessarily posynomial inequality constraints because (as in Definition 3) we allow signomial expressions in the transition probability function \(\mathcal {P}\). Therefore, replacing (6), (7), (8), (10), (13), and (15) in the SGP with (17)–(22) does not by itself convert the SGP to a GP.

Fig. 2. Lifting of a signomial transition probability function.

4.2 Convexification by Lifting

The relaxed equality constraints (20)–(22) involving \(\mathcal {P}\) are signomial, rather than posynomial, because the parameters enter Problem 1 in signomial form. Specifically, consider the relaxed equality constraint (21) at \(s_0\) in Example 1,

$$\begin{aligned} \frac{p\cdot p_{s_1} + (1-p)\cdot p_{s_2}}{p_{s_0}} \le 1. \end{aligned}$$
(23)

The term \((1-p)\cdot p_{s_2}\) is signomial in p and \(p_{s_2}\). We lift by introducing a new variable \(\bar{p} = 1-p\), and rewrite (23) as a posynomial inequality constraint and an equality constraint in the lifted variables:

$$\begin{aligned} \frac{p\cdot p_{s_1} + \bar{p} \cdot p_{s_2}}{p_{s_0}} \le 1, \quad \bar{p} = 1-p. \end{aligned}$$
(24)

We relax the (non-monomial) equality constraint to \(p + \bar{p} \le 1\). More generally, we restrict the way parameters occur in \(\mathcal {P}\) as follows. Refer to Fig. 2(a). For every state \(s\in S\) and every action \(\alpha \in Act (s)\) we require that there exists at most one state \(\bar{s}\in S\) such that \(\mathcal {P}(s,\alpha ,\bar{s})\in Sig_{{V}} \) and \(\mathcal {P}(s,\alpha ,s')\in Pos_{{V}} \) for all \(s'\in S\setminus \{\bar{s}\}\). In particular, we require that

$$\begin{aligned} \underbrace{\mathcal {P}(s,\alpha ,\bar{s})}_{\in Sig_{{V}} } = 1 - \sum _{s'\in S\setminus \{\bar{s}\}} \underbrace{\mathcal {P}({s,\alpha ,s'})}_{\in Pos_{{V}} }\ . \end{aligned}$$

This requirement is met by all benchmarks available at the PARAM–webpage [26]. In general, we lift by introducing a new variable \(\bar{p}_{s,\alpha ,\bar{s}}=\mathcal {P}(s,\alpha ,\bar{s})\) for each such state \(s\in S\); refer to Fig. 2(b). We denote this set of lifting variables by L. Lifting as explained above then creates a new transition probability function \(\bar{\mathcal {P}}\) where for every \(s,s'\in S\) and \(\alpha \in Act \) we have \(\bar{\mathcal {P}}(s,\alpha ,s')\in Pos_{{V} \cup L} \).
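The effect of lifting on the constraint (23) can be checked numerically; the positive values below are made up for illustration.

```python
# Lifting step (23) -> (24): introducing p̄ = 1 - p turns the signomial
# constraint into a posynomial inequality plus the (relaxed) constraint
# p + p̄ <= 1.
p, ps1, ps2, ps0 = 0.4, 0.8, 0.3, 0.9    # made-up positive values

lhs_signomial = (p * ps1 + (1 - p) * ps2) / ps0    # constraint (23)
p_bar = 1 - p                                      # lifting variable
lhs_lifted = (p * ps1 + p_bar * ps2) / ps0         # constraint (24)

assert abs(lhs_signomial - lhs_lifted) < 1e-12     # same constraint value
assert p + p_bar <= 1 + 1e-12                      # relaxed equality holds
assert lhs_lifted <= 1                             # the inequality itself
```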

We call the set of constraints obtained from the SGP (6)–(15) through the transformation, relaxation, and lifting of every constraint as shown above the convexified constraints. Any posynomial objective subject to the convexified constraints forms by construction a GP over the pMDP parameters \({V} \), the additional SGP variables W, and the lifting variables L.

4.3 Tightening the Constraints

A solution of the GP as obtained in the previous section does not directly relate to the original SGP (5)–(15). In particular, a solution to the GP may not satisfy the relaxed constraints with equality: if (19) or (20) holds strictly, the induced parameter valuation and scheduler are not well-defined, i.e., the probabilities may not sum to one. We therefore need to relate the relaxed and lifted GP to Problem 1. By defining a regularization function F over all parameter and scheduler variables, we ensure that these constraints are satisfied with equality, enforcing well-defined probability distributions:

$$\begin{aligned} F = \sum _{p\in {V}} \frac{1}{p} + \sum _{\bar{p}\in L} \frac{1}{\bar{p}} + \sum _{s\in S,\alpha \in Act (s)} \frac{1}{\sigma ^{s,\alpha }} \ . \end{aligned}$$
(25)

The function F is monotone in all its variables. We discard the original objective f in (5) and form a GP with the regularization objective F (25):

$$\begin{aligned} \text {minimize }&\quad F \end{aligned}$$
(26)
$$\begin{aligned} \text {subject to}\nonumber \\&\quad \frac{p_{s_{ I }}}{\lambda }\le 1, \end{aligned}$$
(27)
$$\begin{aligned}&\quad \frac{c_{s_{ I }}}{\kappa }\le 1, \end{aligned}$$
(28)
$$\begin{aligned} \forall s\in S.&\quad \sum _{\alpha \in Act (s)}\sigma ^{s,\alpha }\le 1, \end{aligned}$$
(29)
$$\begin{aligned} \forall s\in S\, \forall \alpha \in Act (s).&\quad \sigma ^{s,\alpha } \le 1 , \end{aligned}$$
(30)
$$\begin{aligned} \forall s\in S\, \forall \alpha \in Act (s).&\quad \sum _{s'\in S}\bar{\mathcal {P}}(s,\alpha ,s')\le 1, \end{aligned}$$
(31)
$$\begin{aligned} \forall s, s'\in S\, \forall \alpha \in Act (s).&\quad \bar{\mathcal {P}}(s,\alpha ,s') \le 1, \end{aligned}$$
(32)
$$\begin{aligned} \forall s\in T.&\quad p_s=1, \end{aligned}$$
(33)
$$\begin{aligned} \forall s\in S\setminus T.&\quad \frac{\sum \limits _{\alpha \in Act (s)}\sigma ^{s,\alpha }\cdot \sum \limits _{s'\in S} \bar{\mathcal {P}}(s,\alpha ,s')\cdot p_{s'}}{p_s}\le 1, \end{aligned}$$
(34)
$$\begin{aligned} \forall s\in S\setminus G.&\quad \frac{\sum \limits _{\alpha \in Act (s)} \sigma ^{s,\alpha } \cdot \Bigl (c(s,\alpha ) + \sum \limits _{s'\in S} \bar{\mathcal {P}} (s,\alpha ,s') \cdot c_{s'}\Bigr )}{c_s}\le 1 . \end{aligned}$$
(35)

Since the objective F (25) and the inequality constraints (29) and (31) are monotone in \({V} \), L, and the scheduler variables, each optimal solution for a feasible problem satisfies them with equality. We obtain a well-defined scheduler \(\sigma \) and a valuation u as in Problem 1. Note that variables from (14) are explicitly excluded from the GP by treating them as constants.

The reachability probability constraints (34) and cost constraints (35) need not be satisfied with equality. However, (34) is equivalent to

$$\begin{aligned} {p_s} \ge {\sum \limits _{\alpha \in Act (s)}\sigma ^{s,\alpha }\cdot \sum \limits _{s'\in S} \bar{\mathcal {P}}(s,\alpha ,s')\cdot p_{s'}} \end{aligned}$$

for all \(s\in S\setminus T\) and \(\alpha \in Act \). The probability variables \(p_s\) are assigned upper bounds on the actual probability of reaching the target states T under scheduler \(\sigma \) and valuation u. Put differently, the \(p_s\) variables cannot be assigned values lower than the actual probabilities. This ensures that \(\sigma \) and u induce satisfaction of the specification given by (27) whenever the problem is feasible and \(\sigma \) and u are well-defined. An analogous reasoning applies to the expected cost computation (35). A solution consisting of a scheduler or valuation that is not well-defined occurs only if Problem 1 itself is infeasible, and such a solution is easy to identify. These facts allow us to state the main result of this section.

Theorem 2

A solution of the GP (26)–(35) inducing well-defined scheduler \(\sigma \) and valuation u is a feasible solution to Problem 1.

Note that the actual probabilities induced by \(\sigma \) and u for the given pMDP \(\mathcal {M}\) are given by the MC \(\mathcal {M}^\sigma [u]\) induced by \(\sigma \) and instantiated by u. Since all variables are implicitly positive in a GP, no transition probability function will be instantiated to probability zero. The case of a scheduler variable being zero to induce the optimum can be excluded by a previous graph analysis.

5 Sequential Geometric Programming

We showed how to efficiently obtain a feasible solution for Problem 1 by solving the GP (26)–(35). We now propose a sequential convex programming trust-region method that, following [7, Sect. 9.1], computes a local optimum of the SGP (5)–(15) by solving a sequence of GPs. We obtain each GP by replacing the signomial functions in the equality constraints of the SGP (5)–(15) with monomial approximations of these functions.

Definition 6

(Monomial approximation). Given a posynomial \(f \in Sig_{\mathcal {V}} \), variables \(\mathcal {V} = \{x_1,\dots ,x_n\}\), and a valuation \(u \in Val ^\mathcal {V} \), a monomial approximation \(\hat{f}\in Mon_{\mathcal {V}} \) for f near u is

$$\begin{aligned} \hat{f} = f[u] \prod _{i=1}^{n}\Bigg (\dfrac{x_{i}}{u(x_i)}\Bigg )^{a_{i}}, \quad \text {where } a_{i}=\dfrac{u(x_i)}{f[u]}\dfrac{\partial f}{\partial x_{i}}[u] \text { for } 1 \le i \le n. \end{aligned}$$

Intuitively, we compute a linearization \(\hat{f}\) of \(f\in Sig_{\mathcal {V}} \) around a fixed valuation u. We enforce the fidelity of monomial approximation \(\hat{f}\) of \(f \in Sig_{\mathcal {V}} \) by restricting valuations to remain within a set known as trust region. We define the following constraints on the variables \(\mathcal {V} \) with \(t > 1\) determining the size of the trust region:

$$\begin{aligned} \frac{1}{t}\cdot u(x) \;\le \; x \;\le \; t\cdot u(x) \quad \text {for all } x\in \mathcal {V}. \end{aligned}$$
(36)
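As a quick numerical sanity check of Definition 6 (our own illustration; the posynomial and the expansion point are arbitrary), the monomial approximation agrees with f at u and remains accurate for valuations close to u, i.e., within a small trust region:

```python
import math

def monomial_approx(f, grad_f, u):
    """Monomial approximation f_hat of f near u (per Definition 6):
    f_hat(x) = f(u) * prod_i (x_i / u_i)**a_i,
    with a_i = (u_i / f(u)) * df/dx_i(u)."""
    fu = f(u)
    a = [ui / fu * gi for ui, gi in zip(u, grad_f(u))]
    def f_hat(x):
        return fu * math.prod((xi / ui) ** ai
                              for xi, ui, ai in zip(x, u, a))
    return f_hat

# Example posynomial f(x, y) = x*y + x^2 (all coefficients positive).
f = lambda v: v[0] * v[1] + v[0] ** 2
grad = lambda v: [v[1] + 2 * v[0], v[0]]

u = [0.5, 0.25]
f_hat = monomial_approx(f, grad, u)

assert abs(f_hat(u) - f(u)) < 1e-12                       # exact at u
assert abs(f_hat([0.51, 0.26]) - f([0.51, 0.26])) < 1e-3  # close nearby
```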

For a given valuation u, we approximate the SGP (5)–(15) by a local GP as follows. First, we apply the lifting procedure (Sect. 4.2) to the SGP, ensuring that all constraints consist of posynomial functions. The resulting posynomial inequality constraints are included in the local GP. After replacing the posynomials in every equality constraint by their monomial approximations near u, the resulting monomial equality constraints are included as well. Finally, we add trust region constraints (36) for the scheduler and parameter variables. The objective function is the same as for the SGP. The optimal solution of the local GP is not necessarily a feasible solution to the SGP. Therefore, we first normalize the scheduler and parameter values to obtain well-defined probability distributions. These normalized values are then used to compute precise probabilities and expected costs with PRISM. Together, these steps provide a feasible solution of the SGP.
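The normalization step can be sketched as follows (data layout and names are our own; the actual implementation operates on the solver output):

```python
def normalize(weights):
    """Rescale raw per-state action weights (or parameter-induced branch
    probabilities) from a local-GP solution into a distribution."""
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}

# After the monomial approximation, scheduler weights may sum to != 1.
raw = {"alpha": 0.58, "beta": 0.47}
dist = normalize(raw)
assert abs(sum(dist.values()) - 1.0) < 1e-12
```

The normalized distribution is then fixed, and the precise induced probabilities are recomputed by exact model checking.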

We use such approximations to obtain a sequence of feasible solutions to the SGP that approaches a local optimum of the SGP. First, we compute a feasible solution \(u^{(0)}\) for Problem 1 (Sect. 4), forming the initial point of a sequence of solutions \(u^{(0)},\ldots , u^{(N)}\), \(N \in \mathbb {N}\). The solution \(u^{(k)}\) for \(1\le k\le N\) is obtained from the local GP defined around \(u^{(k-1)}\) as explained above.

The trust-region parameter t for iteration k is determined from its value at the previous iteration and the ratio of \( f\left[ u^{(k-1)}\right] \) to \(f\left[ u^{(k-2)}\right] \), where f is the objective function in (5). The iterations stop as soon as \(\left| f\left[ u^{(k)}\right] - f\left[ u^{(k-1)}\right] \right| < \epsilon \). Intuitively, \(\epsilon \) defines the improvement on the objective value required in each iteration; once there is not enough improvement, the process terminates.
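The overall iteration can be sketched as follows. Here `solve_local_gp` and `evaluate` are placeholders for the local-GP solver and the exact model-checking step, and the concrete trust-region update is an assumption on our part (the text only fixes that t depends on the ratio of successive objective values):

```python
def sequential_gp(solve_local_gp, evaluate, u0, t=1.5, eps=1e-3, max_iter=50):
    """Trust-region sequential GP loop (sketch, minimization assumed)."""
    u, f = u0, evaluate(u0)
    for _ in range(max_iter):
        u_new = solve_local_gp(u, t)   # local GP around u, trust region t
        f_new = evaluate(u_new)        # precise objective for the new iterate
        # Assumed heuristic: shrink the trust region when there is no
        # improvement, enlarge it slightly otherwise.
        t = 1.0 + (t - 1.0) * (0.5 if f_new >= f else 1.1)
        if abs(f_new - f) < eps:       # stopping criterion from the text
            return u_new, f_new
        u, f = u_new, f_new
    return u, f
```

For instance, with the dummy steps `solve_local_gp = lambda u, t: 0.5 * u` and `evaluate = lambda u: u`, the loop terminates once two successive objective values differ by less than eps.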

6 Applications

We discuss two applications of the general SGP (5)–(15) and the restrictions they require.

Model Repair. For an MC \(\mathcal {D}\) and a specification \(\varphi \) with \(\mathcal {D}\not \models \varphi \), the model repair problem [13] is to transform \(\mathcal {D}\) into \(\mathcal {D}'\) such that \(\mathcal {D}'\models \varphi \). The transformation involves a change of transition probabilities, and a cost function measures this change. The natural underlying model is a pMC where parameters are added to the probabilities. The cost function is minimized subject to constraints that induce satisfaction of \(\varphi \). In [13], the problem is given as an NLP; heuristic [27] and simulation-based [19] methods (the latter for MDPs) have also been presented.

Leveraging our results, one can readily encode model repair problems for MDPs, multiple objectives, and restrictions on probability or cost changes directly as NLPs. The encoding from [13] involves signomial constraints and is therefore handled by our method from Sect. 5. We now propose a more efficient approach, which encodes the change of probabilities using monomial functions. Consider an MDP \(\mathcal {M}{}=(S{},s_{ I }{}, Act ,\mathcal {P}{})\) and specifications \(\varphi _1,\ldots ,\varphi _q\) with \(\mathcal {M}\not \models \varphi _1\wedge \ldots \wedge \varphi _q\). For each probability \(\mathcal {P}(s,\alpha ,s')=a\in \mathbb {R}\) that may be changed, introduce a parameter p, forming the parameter set \({V} \). We define a parametric transition probability function by \(\mathcal {P}'(s,\alpha ,s')=p\cdot a\in Mon_{V} \). A quadratic cost function is, for instance, \(f=\sum _{p\in V} p^2\in Pos_{V} \); by minimizing the sum of squares of the parameters (with some regularization), the change of probabilities is minimized.
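A minimal sketch of this encoding (state names, data layout, and the choice of repairable transitions are illustrative, not the paper's implementation):

```python
# Each repairable probability a = P(s, alpha, s') gets a scaling parameter p,
# so the parametric probability P'(s, alpha, s') = p * a is a monomial in p.
repairable = {("s0", "alpha", "s1"): 0.3,
              ("s0", "alpha", "s2"): 0.7}
params = {trans: "p%d" % i for i, trans in enumerate(sorted(repairable))}

def instantiated_prob(trans, valuation):
    """P'(s, alpha, s') = p * a under a parameter valuation."""
    return valuation[params[trans]] * repairable[trans]

def cost(valuation):
    """Quadratic cost from the text: sum of squared parameters."""
    return sum(v ** 2 for v in valuation.values())

u = {"p0": 1.0, "p1": 1.0}            # identity valuation: model unchanged
assert instantiated_prob(("s0", "alpha", "s1"), u) == 0.3
assert cost(u) == 2.0
```

Since each \(p\cdot a\) is a monomial and the cost is a posynomial, the repair constraints fit the GP fragment directly.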

By incorporating these modifications into SGP (5)–(15), our approach is directly applicable. Either we restrict the cost function f to an upper bound, and efficiently solve a feasibility problem (Sect. 4), or we compute a local minimum of the cost function (Sect. 5). In contrast to [13], our approach works for MDPs and has an efficient solution. While [19] uses fast simulation techniques, we can directly incorporate multiple objectives and restrictions on the results while offering an efficient numerical solution of the problem.

Parameter Space Partitioning. For pMDPs, tools like PRISM  [15] or PROPhESY  [25] aim at partitioning the parameter space into regions with respect to a specification. A parameter region is given by a convex polytope defined by linear inequalities over the parameters, restricting valuations to the region. For a pMDP \(\mathcal {M}\), a region is safe with respect to a specification \(\varphi \) if no valuation u inside the region and no scheduler \(\sigma \) induce \(\mathcal {M}^\sigma [u]\not \models \varphi \). Vice versa, a region is unsafe if there is no valuation and no scheduler such that the specification is satisfied. In [25], this certification is performed using SMT solving; more efficiency is achieved by an approximation method [33].

Certifying regions to be unsafe is directly possible with our approach. Assume a pMDP \(\mathcal {M}\), specifications \(\varphi _1,\ldots ,\varphi _q\), and a region candidate defined by a set of linear inequalities. We incorporate the inequalities into the NLP (5)–(15). If the feasibility problem (Sect. 4) has no solution, the region is unsafe. This yields the first efficient numerical method for this problem of which we are aware. Proving that a region is safe is more involved. Given a single specification \(\varphi =\mathbb {P}_{\le \lambda }(\lozenge T)\), we maximize the probability to reach T; if this probability is at most \(\lambda \), the region is safe. To use our method from Sect. 5 for this purpose, one needs domain-specific knowledge to show that a local optimum is a global optimum.
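To make the region candidates concrete (the bounds below are made up), a rectangular region over two parameters is simply a set of linear inequalities \(A x \le b\):

```python
import numpy as np

# Region candidate 0.3 <= p1 <= 0.6 and 0.2 <= p2 <= 0.5, written as A x <= b.
A = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]])
b = np.array([-0.3, 0.6, -0.2, 0.5])

def in_region(x):
    """Check whether a parameter valuation lies in the polytope A x <= b."""
    return bool(np.all(A @ x <= b + 1e-12))

assert in_region(np.array([0.4, 0.3]))       # inside the candidate region
assert not in_region(np.array([0.7, 0.3]))   # p1 upper bound violated
```

Rows of this kind are what gets added as extra constraints to the NLP (5)–(15) before the feasibility check.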

7 Experiments

We implemented a prototype using the Python interfaces of the probabilistic model checker STORM  [29] and the optimization solver MOSEK  [35]. All experiments were run on a 2.6 GHz machine with 32 GB RAM. We used PRISM  [15] to correct approximation errors as explained before. We evaluated our approaches mainly on examples from the PARAM webpage [26] and from PRISM  [18]. We considered several parametric instances of the Bounded Retransmission Protocol (BRP) [4], NAND Multiplexing [6], and the Consensus protocol (CONS) [3]. For BRP, we have a pMC and a pMDP version; NAND is a pMC and CONS is a pMDP. For feasibility problems, we compare to the SMT solver Z3  [17]. For additional optimality criteria, no comparison to another tool is possible, as IPOPT  [10] already fails for the smallest instances we consider.

Figure 3(a) states for each benchmark instance the number of states (#states) and the number of parameters (#par). We defined two specifications consisting of an expected cost property (\(\mathrm {EC}\)) and a reachability property (\(\mathbb {P}\)). For some benchmarks, we also maximized the probability to reach a set of “good states” (\(*\)). We list the times taken by MOSEK; for optimality problems we also list the times PRISM took to compute precise probabilities or costs (Sect. 5). For feasibility problems we list the times of Z3. The timeout (TO) is 90 min.

Fig. 3. Experiments.

We observe that, both for feasibility and for optimality criteria, we can handle most benchmarks of up to \(10^5\) states within the timeout, while we ran into a timeout for CONS. The number of iterations N in the sequential convex programming is less than 12 for all benchmarks with \(\epsilon =10^{-3}\). As expected, simply solving feasibility problems is faster by at least one order of magnitude. Raising the number of parameters from 2 to 4 for BRP does not cause a major performance hit, contrary to existing tools. For all benchmarks except NAND, Z3 only delivered results for the smallest instances within the timeout.

To demonstrate the insensitivity of our approach to the number of parameters, we considered a pMC of rolling multiple Knuth–Yao dice with 156 states and 522 transitions, in instances with up to 8 different parameters. The timeout is 100 s. In Fig. 3(b) we compare our encoding in MOSEK for this benchmark to the mere computation of a rational function using PROPhESY  [25], and again to Z3. PROPhESY already runs into a timeout for 4 parameters. Z3 needs around 15 s for most of the tests. Using GPs with MOSEK proves far more efficient, needing less than one second for all instances.

In addition, we tested model repair (Sect. 6) on a BRP instance with 17415 states for \(\varphi =\mathbb {P}_{\le 0.9}(\lozenge T)\). The initial parameter instantiation violates \(\varphi \), so we performed model repair towards satisfaction of \(\varphi \). The resulting probability of reaching T is 0.79 and the associated cost is 0.013; the computation time is 21.93 s. We compare this to an implementation of [19], where the probability of reaching T is 0.58 and the associated cost is 0.064. However, the simulation-based method takes only 2.4 s, highlighting the expected trade-off between optimality and computation time for the two methods.

Finally, we encoded model repair for the small pMC from Example 1 in IPOPT, cf. [13]. For \(\psi =\mathbb {P}_{\le 0.125}(\lozenge T)\), where T represents the outcome of the die being 2, the initial instantiation induces probability 1/6. With our method, the probability of satisfying \(\psi \) is 0.1248 and the cost is 0.0128. With IPOPT, the probability is 0.125 with cost 0.1025, showing that our result is nearly optimal.

8 Conclusion and Future Work

We presented a way to use convex optimization in the field of parameter synthesis for parametric Markov models. Using our results, many NLP encodings of related problems now have a direct and efficient solution.

Future work will concern the integration of these methods into mature tools like PRISM or PROPhESY to enable large-scale benchmarking by state space reduction techniques and advanced data structures. Moreover, we will explore extensions to richer models like continuous-time Markov chains [31].