An Iterative Decision-Making Scheme for Markov Decision Processes and Its Application to Self-adaptive Systems
Abstract
Software is often governed by and thus adapts to phenomena that occur at runtime. Unlike traditional decision problems, where a decision-making model is determined for reasoning, the adaptation logic of such software is concerned with empirical data and is subject to practical constraints. We present an Iterative Decision-Making Scheme (IDMS) that infers both point and interval estimates for the undetermined transition probabilities in a Markov Decision Process (MDP) based on sampled data, and iteratively computes a confidently optimal scheduler from a given finite subset of schedulers. The most important feature of IDMS is the flexibility for adjusting the criterion of confident optimality and the sample size within the iteration, leading to a trade-off between accuracy, data usage and computational overhead. We apply IDMS to the existing self-adaptation framework Rainbow and conduct a case study using a Rainbow system to demonstrate the flexibility of IDMS.
Keywords
Goal State · Markov Decision Process · Computational Overhead · Cumulative Cost · Integer Matrices
1 Introduction
Software is often governed by and thus adapts to phenomena that occur at runtime [22]. One typical example is the control software of autonomous systems, such as driverless vehicles. Because the occurrence of runtime phenomena is asynchronous with respect to the flow of the application logic, because not all information about the phenomena is available at design time, and because the specification of the adaptive behavior may evolve over time, it is advantageous to gather the complex adaptation logic into a component separated from the application logic. In contrast to traditional decision problems, where a decision model, such as a Markov Decision Process (MDP) [27], is determined for reasoning, the adaptation logic of autonomous systems is governed by empirical data and is subject to practical constraints. In many situations, one has to sacrifice the optimality of an adaptive solution to a certain extent in order to satisfy various Quality-of-Service (QoS) constraints.
Consider a Web system that provides news content services. At some moment, the system may detect high latency of content delivery. Suppose that the system can lower the content fidelity (such as delivering multimedia content in text mode) and/or increase the server pool size, and that the benefits or costs of these operations are measured quantitatively. Further, to achieve more sophisticated effects, operations can be combined to form a strategy. For example, one simple strategy could be the following: once "high latency" is detected, increase the number of Virtual Machine (VM) instances by one; if "high latency" persists, switch from multimedia mode to text mode. Because multiple strategies built into the adaptation logic may be triggered by the same condition, an additional mechanism is required to select one of them.
A key challenge of strategy selection for the Web system is that some probability parameters, such as the success chances of operations, are not fixed. For example, if the VM number is increased by one, the probability that latency will drop below the threshold may increase, but it still has to be estimated based on runtime data. While the idealized goal is to select an optimal strategy, it is important to take into account the practical constraints. For example, obsolete data no longer reflects the current environmental situation; the time frame of data sampling may be constrained by the tolerance of adaptation delay; the sampling frequency may be restricted because of its performance overhead on the network; and, last but not least, the adaptation should not degrade the functional performance of the system by consuming too much computational capacity (e.g., CPU and RAM). In short, besides decision accuracy, runtime decision-making has to address the limitations on data and computational resources.
The first contribution of this paper is IDMS, whose iteration consists of three basic steps:
 1.
We infer both point and interval estimates of transition probabilities for the MDP decision model based on the data structure for runtime data.
 2.
Next, we compute a scheduler that minimizes the cumulative cost for a given reachability problem.
 3.
We then determine whether this scheduler meets a criterion called confident optimality. If yes, or if the maximal number of iterative steps is reached, the iteration terminates; otherwise, the iteration returns to data sampling.
We formalize three metrics for IDMS: (i) the probability that a confidently optimal scheduler is truly optimal, namely accuracy; (ii) the average sample size of the iteration, which is a direct metric of data usage; and (iii) the average time of iteration, which measures computational overhead conveniently. The trade-off among these three metrics is realized by adjusting the criterion of confident optimality and the sample size during the iteration. The core method of IDMS is a value-iteration algorithm developed from probabilistic model checking [19].
The second contribution of this paper is an application of IDMS to self-adaptive systems. Several high-level frameworks and approaches based on probabilistic model checking have been proposed to aid the design of self-adaptive systems, but with emphasis on different aspects of the adaptation [3, 4, 18, 20, 23]. However, none of these works addresses the problem of making the aforementioned trade-off in the adaptation. We demonstrate that IDMS can be naturally embedded into the Rainbow framework [11], which employs a standard, point-valued MDP as its decision model, and thus extends the adaptation function of the latter. We present a case study on a Rainbow system and empirical evidence that demonstrates the flexibility of IDMS.
The remainder of the paper is organized as follows. Section 2 presents the formal models and core method. Section 3 presents the IDMS scheme. Section 4 describes the application to selfadaptive systems. Section 5 presents the case study. Section 6 reports the related work. Section 7 concludes the paper.
2 Formal Model and Value-Iteration Method
In this section, we present our formal models and value-iteration method. The position of our method in the state of the art is discussed in Sect. 6.
Definition 1
A Markov Decision Process (MDP) is a tuple \(\mathcal {M}=(S, Act, \mathcal {P}, \alpha , C)\) where

S is a finite, nonempty state space,

Act is a finite, nonempty set of actions,

\(\alpha \) is the initial distribution over S,

\(\mathcal {P}= \{\mathcal {P}_a\}_{a\in Act}\) is a family of transition probability matrices indexed by \(a\in Act\), and

\(C:S\rightarrow \mathbf {R}_{\ge 0}\) is a cost function.
We require that, for each \(a\in Act\) and \(s\in S\), \(\mathcal {P}_a[s,t]\ge 0\) for all \(t\in S\) and \(\sum _{t\in S}\mathcal {P}_a[s,t]\in \{0,1\}\). We say action a is enabled at s if \(\sum _{t\in S}\mathcal {P}_a[s,t]=1\).
Schedulers play a crucial role in the analysis of MDPs. For our purposes, it suffices to consider simple schedulers, in which, for each state s, the scheduler fixes one of the enabled actions at s and selects the same action every time the system resides in s. Formally, a simple scheduler is a function \(\sigma :S\rightarrow Act\) such that \(\sigma (s)\) is one of the actions enabled at state s. In our setting, instead of considering the whole set of schedulers, we work only with a (finite) subset of simple schedulers \(\varSigma \) specified by the user. A path in \(\mathcal {M}\) under \(\sigma \) is an infinite sequence of states \(\rho =s_0s_1\cdots \) such that, for all \(i\ge 0\), \(\mathcal {P}_a[s_i,s_{i+1}]>0\) for \(a= \sigma (s_i)\). Let \(Path_{\mathcal {M},\sigma }\) be the set of paths in \(\mathcal {M}\) under \(\sigma \). Let \(Path_{\mathcal {M},\sigma }(s)\) be the subset of paths that start from s. Let \(Pr_{\mathcal {M},\sigma }\) be the standard probability distribution over \(Path_{\mathcal {M},\sigma }\) as defined in the literature [1, Chap. 10].
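In code, a simple scheduler merely selects one transition matrix row per state, collapsing the MDP to a Markov chain. A minimal sketch; the dictionary representation of \(\mathcal {P}\) and the state and action names are illustrative assumptions, not the paper's notation:

```python
def induced_chain(P, sigma):
    """Collapse an MDP transition family to the Markov chain induced by
    a simple scheduler. P[a][s] maps state s to its successor
    distribution {t: probability} under action a; sigma[s] is the
    (enabled) action that the scheduler fixes at state s."""
    return {s: dict(P[sigma[s]][s]) for s in sigma}
```

For example, a scheduler that always chooses action "inc" at s0 and "txt" at s1 induces the chain whose rows are exactly the "inc" row of s0 and the "txt" row of s1.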
The expected cumulative cost, or simply cumulative cost, of reaching a set \(G\subseteq S\) of goal states (called G-states hereafter) in \(\mathcal {M}\) under \(\sigma \), denoted \(C_{\mathcal {M},\sigma }(G)\), is defined as follows: First, let \(C_{\mathcal {M},\sigma }(s,G)\) be the expected value of the random variable \(X: Path_{\mathcal {M},\sigma }(s) \rightarrow \mathbf {R}_{\ge 0}\) such that (i) if \(s\in G\) then \(X(\rho )=0\), (ii) if \(\rho [i]\notin G\) for all \(i\ge 0\) then \(X(\rho ) = \infty \), and (iii) otherwise \(X(\rho ) =\sum _{i=0}^{n-1} C(s_i)\) where \(s_i=\rho [i]\), \(s_n\in G\) and \(s_j\notin G\) for all \(j<n\). Then, let \(C_{\mathcal {M},\sigma }(G) = \sum _{s\in S} \alpha (s)\cdot C_{\mathcal {M},\sigma }(s,G)\).
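The cumulative cost under a fixed simple scheduler can be computed by standard value iteration over the induced chain. A minimal sketch, assuming a dictionary representation of the induced transition probabilities and that the G-states are reached with probability 1 (otherwise the cost is infinite and the iteration would not converge); this is not the authors' implementation:

```python
def cumulative_cost(P, C, goal, alpha, eps=1e-10):
    """Expected cumulative cost of reaching `goal` in the Markov chain
    induced by an MDP and a fixed simple scheduler.
    P[s] is the successor distribution {t: probability} at state s
    (already resolved by the scheduler), C[s] the state cost, and
    alpha the initial distribution."""
    states = list(P)
    x = {s: 0.0 for s in states}  # x[s] approximates C(s, G)
    while True:
        delta = 0.0
        for s in states:
            if s in goal:
                continue  # goal states contribute no further cost
            new = C[s] + sum(p * x[t] for t, p in P[s].items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < eps:
            break
    return sum(alpha[s] * x[s] for s in states)
```

For a state s0 with cost 1 that reaches the goal with probability 0.5 and loops otherwise, the fixed point solves x = 1 + 0.5x, giving cost 2.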
Interval-valued MDPs (IMDPs) are MDPs in which some of the transition probabilities are specified as real intervals.
Definition 2
An IMDP is a tuple \(\mathcal {M}^{I}=(S, Act, \mathcal {P}^+, \mathcal {P}^-, \alpha , C)\) where

S, Act, \(\alpha \) and \(C\) are defined the same as in Definition 1, and

\(\mathcal {P}^+ = \{\mathcal {P}^+_a\}_{a\in Act}\) and \(\mathcal {P}^- = \{\mathcal {P}^-_a\}_{a\in Act}\) are two families of nonnegative matrices indexed by \(a\in Act\), giving the upper and lower bounds of the transition probabilities, respectively. Further, for each \(a\in Act\), \(\mathcal {P}^+_a\) and \(\mathcal {P}^-_a\) have the same corresponding 0- and 1-entries.
With \(\mathcal {M}^{I}=(S, Act, \mathcal {P}^+, \mathcal {P}^-,\alpha , C)\) we associate a set of MDPs \(\llbracket \mathcal {M}^{I}\rrbracket \) such that \(\mathcal {M}=(S, Act, \mathcal {P}, \alpha , C) \in \llbracket \mathcal {M}^{I}\rrbracket \) if and only if, for each \(a\in Act\), \(\mathcal {P}^-_a \le \mathcal {P}_a\le \mathcal {P}^+_a\), where \(\le \) is interpreted entrywise. We call an \(\mathcal {M}\in \llbracket \mathcal {M}^{I}\rrbracket \) an instance of \(\mathcal {M}^{I}\).
Proposition 1
\(C^{\min }_{\mathcal {M}^{I}, \sigma }(G) = \sum _{s\in S} \alpha (s)\, x_s\), where \((x_s)_{s\in S}\) is the solution of the corresponding value-iteration equation system.
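The quantity \(C^{\min }_{\mathcal {M}^{I}, \sigma }(G)\) can be computed by a value iteration that, in each sweep, picks the cost-minimizing transition probabilities within the interval bounds: every successor starts at its lower bound, and the remaining probability mass is assigned greedily to the successors with the smallest current values. The sketch below shows this standard order-based scheme; the data layout is an illustrative assumption, and the code is not the authors' implementation:

```python
def min_cumulative_cost(lo, hi, C, goal, alpha, eps=1e-10):
    """Value iteration for the minimal expected cumulative cost of
    reaching `goal`, over all instances of an IMDP under a fixed
    scheduler. lo[s][t] and hi[s][t] bound the probability s -> t."""
    states = list(lo)
    x = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in goal:
                continue
            # Start every successor at its lower bound, then spend the
            # remaining mass on the cheapest successors first.
            probs = dict(lo[s])
            rest = 1.0 - sum(probs.values())
            for t in sorted(probs, key=lambda u: x[u]):
                bump = min(hi[s][t] - probs[t], rest)
                probs[t] += bump
                rest -= bump
            new = C[s] + sum(p * x[t] for t, p in probs.items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < eps:
            break
    return sum(alpha[s] * x[s] for s in states)
```

The maximal cost \(C^{\max }\) is obtained symmetrically by pushing the mass toward the successors with the largest current values.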
3 Iterative Decision-Making Scheme
3.1 IDMS Preview and Example
IDMS is an iterative process that contains one pre-stage and five runtime stages (i.e., Stages 1 to 5), as depicted in Fig. 1. The pre-stage builds up a parametric MDP with transition probability parameters at design time. At runtime, Stage 1 collects data samples and Stage 2 infers point and interval estimates based on the samples. By instantiating the parameters with the point and interval estimates, Stage 3 builds up a (concrete) MDP and an IMDP. Stage 4 attempts to compute a confidently optimal scheduler. Then the process either moves to Stage 5, where a decision is made, or goes back to Stage 1. The process terminates when either a confidently optimal scheduler is returned or the maximal time of iteration (namely, the maximal number of steps within the iteration) is reached. Note that, as the decision making may need to be repeated periodically at runtime, Stage 5 may be followed by Stage 1.
3.2 Data Structure and Parameter Estimation
The data structure is used to estimate parameters in the parametric MDP. IDMS adopts two forms of estimation, namely point estimation and interval estimation, which we illustrate using \(M_a\). Note that \(M_a\) is used to estimate parameters \(\theta _1\) and \(\theta _2\). For point estimation, \(\theta _1\) is estimated as the numerical value \(N_{1,2}/(N_{1,2}+N_{1,G})\) and \(\theta _2\) as \(N_{2,3}/(N_{2,3}+N_{2,G})\). For interval estimation, IDMS assumes that \(\theta _1\) (resp., \(\theta _2\)) is the mean of a Bernoulli distribution and that \((N_{1,2}, N_{1,G})\) (resp., \((N_{2,3}, N_{2,G})\)) forms a random sample of the distribution. In other words, \((N_{1,2}, N_{1,G})\) denotes a random sample containing \(N_{1,2}\) copies of 1 and \(N_{1,G}\) copies of 0, and \((N_{2,3}, N_{2,G})\) has a similar meaning. Therefore, one can employ the standard statistical inference method to derive a confidence interval for \(\theta _1\) and one for \(\theta _2\). By the law of large numbers, if \(N_{1,2}+N_{1,G}\) (resp., \(N_{2,3}+N_{2,G}\)) increases, then the width of the resulting confidence interval for \(\theta _1\) (resp., \(\theta _2\)) likely decreases (when the confidence level is fixed).
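For illustration, the point estimate and an approximate confidence interval for a Bernoulli mean such as \(\theta _1\) can be derived from the counts as follows. The paper does not prescribe a particular interval construction, so this sketch uses the common Wald (normal-approximation) interval; the function name and interface are assumptions:

```python
import math

def bernoulli_estimates(successes, failures, confidence=0.95):
    """Point estimate and Wald confidence interval for a Bernoulli
    mean, e.g. successes = N_{1,2} and failures = N_{1,G} for
    parameter theta_1. Uses the normal approximation, with the
    z-quantile hard-coded for 95% or 99% confidence."""
    n = successes + failures
    p = successes / n
    z = 1.96 if confidence == 0.95 else 2.5758
    half = z * math.sqrt(p * (1 - p) / n)  # half-width shrinks as n grows
    return p, (max(0.0, p - half), min(1.0, p + half))
```

Consistent with the remark above, quadrupling the sample size roughly halves the interval width at a fixed confidence level.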
3.3 Confident Optimality
By instantiating the transition probability parameters in the parametric MDP with the corresponding point estimates and interval estimates, one obtains a concrete MDP \(\mathcal {M}\) and an IMDP \(\mathcal {M}^{I}\). Note that if \([p,q]\subset [0,1]\) instantiates a parameter \(\theta \) then, equivalently, \([1-q,1-p]\) instantiates \(1-\theta \). Clearly, \(\mathcal {M}\) and \(\mathcal {M}^{I}\) share the same state space S, initial distribution \(\alpha \) and cost function C. Moreover, \(\mathcal {M}\) is an instance of \(\mathcal {M}^{I}\), namely, \(\mathcal {M}\in \llbracket \mathcal {M}^{I}\rrbracket \). From now on, for given \(\mathcal {M}\) and \(\mathcal {M}^{I}\), we always assume \(\mathcal {M}\in \llbracket \mathcal {M}^{I}\rrbracket \). A key decision-making criterion in IDMS is formalized as follows:
Definition 3
A scheduler \({\sigma ^*}\in \varSigma \) is confidently optimal if, for each \(\sigma \in \varSigma \) with \(\sigma \ne {\sigma ^*}\), the following two conditions hold:

The cumulative cost (of reaching G-states) in \(\mathcal {M}\) under \({\sigma ^*}\) is not larger than the cumulative cost in \(\mathcal {M}\) under \(\sigma \);

The \((1/\gamma )\)-portion of the difference between the maximum cumulative cost in \(\mathcal {M}^{I}\) under \({\sigma ^*}\) and the minimum cumulative cost in \(\mathcal {M}^{I}\) under \(\sigma \) is not larger than the maximum-minimum difference of cumulative cost in \(\mathcal {M}^{I}\) under \({\sigma ^*}\).
An illustrative example is presented later in the text. It is noteworthy that, unlike in a standard MDP problem, a subset of schedulers is explicitly given in our definition.
The parameter \(\gamma \), which is specified by the user, has the function of adjusting the criterion of confident optimality. A confidently optimal scheduler may not exist for the given MDP and IMDP; in some rare cases, there may be more than one confidently optimal scheduler. Note that if a sufficiently large value for \(\gamma \) is selected, then the second condition of Definition 3 is guaranteed to be true. If so, the definition degenerates to the standard definition of optimal cumulative costs for MDPs with point-valued transition probabilities.
The following procedure attempts to compute a confidently optimal scheduler from \(\mathcal {M}\), \(\mathcal {M}^{I}\) and \(\varSigma \):
 1.
Compute \(C_{\sigma }\) for all \(\sigma \in \varSigma \), and compute \(\varSigma _1\subseteq \varSigma \) such that \(\sigma _1\in \varSigma _1\) if and only if \(C_{\sigma _1}= \min _{\sigma \in \varSigma } C_{\sigma }\).
 2.
Compute \(C_{\sigma _1}^{\max }\) for all \(\sigma _1\in \varSigma _1\), and compute \(C_{\sigma }^{\min }\) for all \(\sigma \in \varSigma \).
 3.
If there is \({\sigma ^*}\in \varSigma _1\) such that \(C_{{\sigma ^*}}^{\max }\le C_{\sigma }^{\min } +\gamma \cdot C_{{\sigma ^*}}^{\mathrm {dif}}\) for all \(\sigma \ne {\sigma ^*}\), then return \({\sigma ^*}\); otherwise, return "no confidently optimal scheduler".
The procedure relies on the core value-iteration method presented in Sect. 2. The computational complexity of the procedure depends on the core value-iteration method and the size of \(\varSigma \). Note that although the number of all schedulers in an MDP increases exponentially as the size of the MDP increases, in our case a specific subset of schedulers \(\varSigma \) is predefined by the model builder. If we suppose that the value iteration takes constant time (e.g., the model is fixed), then the time complexity of the procedure is linear in the size of \(\varSigma \).
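The three-step procedure can be sketched directly over precomputed cost values. The dictionary-based interface below is an illustrative assumption, and \(C^{\mathrm {dif}}_{\sigma }\) is taken as \(C^{\max }_{\sigma }-C^{\min }_{\sigma }\), consistent with the maximum-minimum difference in Definition 3:

```python
def select_confidently_optimal(cost, cost_min, cost_max, gamma):
    """Steps 1-3 of the selection procedure.
    cost[s]     : cumulative cost of scheduler s in the point-valued MDP
    cost_min[s] : minimal cumulative cost of s over the IMDP instances
    cost_max[s] : maximal cumulative cost of s over the IMDP instances
    Returns a confidently optimal scheduler, or None if none exists."""
    best = min(cost.values())
    sigma1 = [s for s in cost if cost[s] == best]      # step 1
    for cand in sigma1:                                # steps 2-3
        dif = cost_max[cand] - cost_min[cand]
        if all(cost_max[cand] <= cost_min[s] + gamma * dif
               for s in cost if s != cand):
            return cand
    return None
```

As noted above, raising gamma relaxes the test in step 3, so a candidate that fails for a small gamma may pass for a larger one.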
We present an example to explain how IDMS is affected by \(\gamma \) and the sample size. Suppose after instantiating \(\mathbf {\theta }\) of \(\mathcal {M}_{eg}(\mathbf {\theta })\) with point estimates and interval estimates, the cumulative cost intervals for schedulers \(\sigma _a\) and \(\sigma _b\) are \([l_1, u_1]\) and \([l_2,u_2]\), respectively. The positions of \(l_1, u_1, l_2\) and \( u_2\) are illustrated on the left side of the following drawing (where \(0\le p <q\)).
If \(u_1\le l_2+\gamma (u_1-l_1)\), the above procedure returns \(\sigma _a\). But if \(u_1>l_2+\gamma (u_1-l_1)\), neither \(\sigma _a\) nor \(\sigma _b\) is confidently optimal, and so the procedure returns "no confidently optimal scheduler". If one raises the value of \(\gamma \) and/or increases the sample size (the latter usually shrinking the computed cost intervals, as depicted on the right side of the above drawing), there is a higher probability that a confidently optimal scheduler (namely \(\sigma _a\)) is returned from the procedure and the iteration of IDMS terminates.
3.4 Metrics and Trade-off

IDMS involves the following three metrics:

Accuracy is the probability that a confidently optimal scheduler is optimal.

Data usage is the average size of sampled data used in the iteration.

Computational overhead is measured by the average iteration time (namely, the average number of iterative steps).
Ideally, one wants to maximize the first metric while minimizing the latter two. However, according to the laws of statistics, this is impossible. To obtain high accuracy in a statistical process (including IDMS), a large sample has to be used; although it is possible to set a high accuracy threshold and then try to infer the result using a sample whose size is as small as possible, this usually leads to a costly iterative process. Therefore, a practical solution is to achieve a suitable trade-off among the three metrics. In IDMS, to realize this trade-off, one can adjust the constant \(\gamma \) and the sample size within the iteration.
4 Application to Self-adaptive Systems
In this section, we describe an application of IDMS to self-adaptive systems. A variety of frameworks have been proposed to aid the design of self-adaptive systems [12, 13, 25]; we focus on the Rainbow framework.
4.1 Rainbow Framework
In Rainbow, the adaptation is specified as strategies in its customized language Stitch [9]. A strategy is a tree structure consisting of tactics, which in turn contain operations. Figure 4 specifies two strategies a and b, guarded by a common condition cond, where SNo and MaxSNo refer to the current server number and the maximal server number, respectively.^{1} If strategy a is selected, operation enlistServer[1] in tactic s1 is executed first. Next, if the variable hiLatency is true, then enlistServer[1] in tactic s2 is executed; otherwise strategy a terminates. Last, if hiLatency remains true, then switchToTextMode in tactic s3 is executed; otherwise strategy a terminates. Strategy b is specified in a similar style.
Costs of operations in strategies a and b

Utility Dimension  op(s1)  op(s2)  op(s3)  op(s4)  op(s5)  op(s6)  op(s7)
Content            0       0       1       0       1       1       0
Budget             1       1       0       2       0       0       2
Rainbow characterizes uncertainty in the detection of guarding conditions (such as hiLatency in tactic s2) as probabilities called likelihoods. The likelihoods in strategies a and b are specified in Table 2. Note that because there is one likelihood parameter in each tactic (except s1) in Fig. 4, we also use tactic names to label likelihoods, as we do for operations: the likelihood corresponding to a tactic s is denoted lk(s). We explain how these likelihoods are elicited in Rainbow later; for now, they are viewed as undetermined parameters.
It is not hard to observe a correspondence between the adaptation specification of Z.com and an MDP model, where operations are represented by actions and strategies are represented by schedulers. Indeed, the Stitch specification under consideration can be translated into \(\mathcal {M}_{eg}(\mathbf {\theta })\). Therefore, the adaptation problem in Rainbow is an instance of the problem of selecting a strategy that minimizes the cumulative cost (of reaching the goal states in the MDP).
4.2 Embedding IDMS into Rainbow
Likelihood parameters in strategies a and b

Likelihood  Interpretation as a conditional probability
lk(s2)      Pr(hiLatency=true | SNo=MaxSNo-1 & textMode=true)
lk(s3)      Pr(hiLatency=true | SNo=MaxSNo & textMode=true)
lk(s4)      Pr(hiLoad=true | hiLatency=true & SNo=MaxSNo-2 & textMode=true)
lk(s5)      Pr(hiLoad=false | hiLatency=true & SNo=MaxSNo-2 & textMode=true)
lk(s6)      Pr(hiLatency=true | hiLoad=true & SNo=MaxSNo & textMode=true)
lk(s7)      Pr(hiLatency=true | hiLoad=true & SNo=MaxSNo-2 & textMode=false)
Rainbow exploits point estimates for likelihoods, as its decision model is a standard MDP. Because the runtime data set cannot be arbitrarily large, point estimates may be error-prone. Poor strategy selection often causes some extra cost and reduced benefit. Even worse, the extra cost and reduced benefit may accumulate if a non-optimal strategy is selected repeatedly. In view of this, the interval estimation method in IDMS can complement the point estimation method in Rainbow and lead to more stable decision-making outputs. Another, more important benefit of applying IDMS to Rainbow is the possibility of making a trade-off between accuracy, data usage and computational overhead, thus improving the adaptation function of Rainbow.
5 Simulation-Based Experiment
5.1 Methodology and Setting
The general experimental methodology we adopt is simulation. Recall that IDMS assumes that the likelihood parameters in Z.com are means of Bernoulli distributions. We use Matlab to simulate the generation and collection of runtime data. To this end, we need to fix the expected values of the Bernoulli random variables, namely the true values of \(\mathbf {\theta }\) of \(\mathcal {M}_{eg}(\mathbf {\theta })\). We let \(\theta _1= \frac{2}{3}\), \(\theta _2=\frac{4}{7}\), \(\theta _3=\frac{1}{3}\), \(\theta _4=\frac{4}{9}\) and \(\theta _5=\frac{4}{9}\). As the true values of \(\mathbf {\theta }\) are given, we also know which scheduler is optimal. Indeed, by computation, the overall cost of strategy a is 2.0476 and that of strategy b is 2.0741; thus, strategy a is optimal. The difference between the two overall costs may seem small, but it is non-negligible, because the costs are scaled by the weights of the utility dimensions, which may be large in some cases, and because the extra cost may accumulate if the adaptation is triggered repeatedly.
To evaluate the flexibility of IDMS for making the intended trade-off, we implement the computing procedure presented in Sect. 3.3 in Matlab. Given a sample of a specific size for estimating each parameter \(\theta _i\) of \(\mathbf {\theta }\), and given a specific value of \(\gamma \), IDMS terminates with a certain probability, called the termination probability in the experiment. Based on the termination probability, we can immediately calculate the data usage and the computational overhead. Upon termination, with a certain probability, the selected scheduler is strategy a. This probability, called the correctness rate in the experiment, is equal to the metric of accuracy. Since we can simulate IDMS (applied to \(\mathcal {M}_{eg}(\mathbf {\theta })\)), we can estimate the correctness rate and the termination probability using standard Monte Carlo estimation. In this experiment, we estimate the two quantities for different sample sizes and values of \(\gamma \). Note that the confidence level of interval estimation is fixed in IDMS, and we set it to 95% in the experiment. The Matlab source code and data are available at http://www.comp.nus.edu.sg/~sugx/fase16/.
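The Monte Carlo estimation amounts to repeating a complete IDMS run many times and tallying the outcomes. A Python sketch of this estimation loop (the experiment itself is in Matlab; the run_idms_once interface is a hypothetical stand-in for one complete IDMS run):

```python
import random

def monte_carlo_rates(run_idms_once, trials=10000, seed=0):
    """Estimate termination probability and correctness rate.
    run_idms_once(rng) is a hypothetical callback simulating one IDMS
    run: it returns the selected scheduler name, or None if no
    confidently optimal scheduler was found within the iteration."""
    rng = random.Random(seed)
    terminated = correct = 0
    for _ in range(trials):
        result = run_idms_once(rng)
        if result is not None:
            terminated += 1
            if result == "a":  # strategy a is the true optimum here
                correct += 1
    term_prob = terminated / trials
    corr_rate = correct / terminated if terminated else float("nan")
    return term_prob, corr_rate
```

The two returned rates correspond directly to the termination probability and the correctness rate read off in the experiment.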
5.2 Experimental Data and Concrete Trade-offs
Priorities of metrics in three different cases

Metric                  A       B       C
Accuracy                high    medium  low
Data usage              low     medium  high
Computational overhead  low     high    high
An important implication of Fig. 5 is that, by adjusting the value of \(\gamma \) and the sample size in different ways, one is able to achieve different trade-offs between accuracy, data usage and computational overhead. To illustrate this flexibility, Table 3 describes three cases in which the three metrics have different priorities. Based on Fig. 5, by selecting different pairs of \(\gamma \) and sample size, we obtain three examples of iteration schemes, depicted in Fig. 6. Each marker in Fig. 6 refers to an iterative step with a specific value of \(\gamma \) and a specific sample size. For example, setting \(\gamma =0.5\) and the sample size to 1,400, according to Fig. 5, we obtain the leftmost marker of Scheme A in Fig. 6. The other markers in Fig. 6 are identified in the same way. All three schemes terminate with probability 1 by the time the sample size reaches 5,000. It is easy to observe that the schemes reflect the metric priorities of the corresponding cases in Table 3. For example, Scheme A has a high correctness rate compared with the other two schemes, because the priority of accuracy is high in Case A; it has a low average termination probability and a high number of markers, because the priorities of both data usage and computational overhead are low in Case A.
6 Related Work
Probabilistic model checking is a relatively mature technique that has been successfully applied to a wide range of domains, and we refer the readers to Forejt et al. [19] for a survey. The IMDP model considered in this paper falls into the class of probabilistic models with uncertainty, which have received substantial attention. For instance, in AI research, IMDPs were considered with different objectives such as discounted sum and limiting average [21, 30]. The motivation of those works is to come up with an abstract framework, which is different from our motivation of runtime decisionmaking. In robust control theory, IMDPs or MDPs with more general forms of uncertainty are advocated to address the robustness of the controller under potential perturbation of the system [24, 32]. In this paper, we consider expected cumulative costs of reachability properties. On the computational aspect, some of the mentioned approaches [21, 24, 30] also employed a valueiteration method. However, they mostly rely on ordering between intervals which is not needed in our case. Furthermore, Puggelli et al. [26] proposed polynomial algorithms for Markov chains with uncertainty based on optimization techniques, but only for reachability and PCTL properties.
Several high-level frameworks and approaches based on probabilistic model checking have been proposed for self-adaptive systems recently, but with emphasis on different aspects of the adaptation, such as QoS management and optimization [4], adaptation decisions [20], verification with information of confidence intervals [3], runtime verification efficiency and sensitivity analysis [18], and proactive verification and adaptation latency [23]. None of those works addressed the problem of making a practical trade-off similar to the one supported by IDMS. Rainbow [9] supports the computation of cumulative costs and/or rewards when the likelihood parameters in the adaptation strategies are explicitly specified. Subsequent work [5, 6] employs a combination of a simulation method and probabilistic model checking to evaluate properties such as resilience and adaptation latency. As mentioned, IDMS can be economically embedded into Rainbow and extend the adaptation function of the latter.
We mention some other existing approaches to the design of self-adaptive systems that rely on mathematical methods related to probability theory and statistics. Esfahani et al. [15, 16] presented a general definition of adaptation optimality using fuzzy mathematics, which accounts for not only the current utility but also the optimal consequence of future operations; in contrast, IDMS estimates the probability parameters based on runtime data. Epifani et al. [14] presented the KAMI framework to deal with the inaccuracy of parameters related to the non-functional aspects of the system (such as reliability and performance), and Bencomo et al. [2] presented a Bayesian network for modeling self-adaptive systems. These two approaches rely on the Bayesian (point) estimation method, while IDMS exploits both point and interval estimates from frequentist statistical theory. Finally, Filieri et al. [17] constructed approximate dynamic models of a self-adaptive system and synthesized, from those models, a suitable controller that guarantees prescribed multiple non-functional system requirements. The method they used is from control theory, which is quite different from the theory of MDPs.
7 Conclusions
We have presented IDMS, an iterative framework that supports a trade-off among three important metrics in practical runtime decision-making problems: accuracy, data usage and computational overhead. We have also instantiated IDMS on the Rainbow framework and presented a simulation-based evaluation.
For future work, we plan to enhance IDMS with a mechanism for automatically adjusting the criterion of confident optimality and the sample size based on the given priorities of the three metrics. Another interesting topic is a generalization of the value-iteration method for IMDPs to synthesize a scheduler that minimizes the cumulative cost, without prescribing a subset of schedulers.
Footnotes
 1.
For simplicity, the specification does not strictly follow the syntax of Stitch.
References
 1.Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press, Cambridge (2008)zbMATHGoogle Scholar
 2.Bencomo, N., Belaggoun, A., Issarny, V.: Dynamic decision networks for decisionmaking in selfadaptive systems: A case study. In: Proceedings of the 8th International Symposium on Software Engineering for Adaptive and SelfManaging Systems, SEAMS 2013, pp. 113–122. IEEE Press, Piscataway, NJ, USA (2013)Google Scholar
 3.Calinescu, R., Ghezzi, C., Johnson, K., Pezzé, M., Rafiq, Y., Tamburrelli, G.: Formal verification with confidence intervals: A new approach to establishing the qualityofservice properties of software systems. IEEE Trans. Reliab. 99, 1–19 (2015)Google Scholar
 4.Calinescu, R., Grunske, L., Kwiatkowska, M., Mirandola, R., Tamburrelli, G.: Dynamic QoS management and optimization in servicebased systems. IEEE Trans. Softw. Eng. 37(3), 387–409 (2011)CrossRefGoogle Scholar
 5.Camára, J., de Lemos, R.: Evaluation of resilience in selfadaptive systems using probabilistic modelchecking. In: 2012 ICSE Workshop on Software Engineering for Adaptive and SelfManaging Systems (SEAMS), pp. 53–62, June 2012Google Scholar
 6.Cámara, J., Moreno, G.A., Garlan, D.: Stochastic game analysis and latency awareness for proactive selfadaptation. In: Proceedings of the 9th International Symposium on Software Engineering for Adaptive and SelfManaging Systems, SEAMS, pp. 155–164. ACM, New York, NY, USA (2014)Google Scholar
 7.Celiku, O., Garlan, D., Schmerl, B.: Augmenting architectural modeling to cope with uncertainty. In: Proceedings of the International Workshop on Living with Uncertainty (IWLU 2007), Atlanta, Georgia, USA (2007)Google Scholar
 8.Chen, T., Forejt, V., Kwiatkowska, M.Z., Parker, D., Simaitis, A.: Automatic verification of competitive stochastic systems. Formal Method Syst. Des. 43(1), 61–92 (2013)CrossRefzbMATHGoogle Scholar
 9.Cheng, S.W.: Rainbow: CostEffective Software Architecturebased Self Adaptation. Ph.D. thesis, Carnegie Mellon University (2008)Google Scholar
 10.Cheng, S.W., Garlan, D.: Handling uncertainty in autonomic systems. In: Proceedings of the International Workshop on Living with Uncertainty (IWLU 2007), Atlanta, Georgia, USA (2007)Google Scholar
 11.Cheng, S.W., Garlan, D., Schmerl, B.: Architecturebased selfadaptation in the presence of multiple objectives. In: ICSE Workshop on Software Engineering for Adaptive and SelfManaging Systems (SEAMS 2006), Shanghai, China (2006)Google Scholar
 12.Cooray, D., Malek, S., Roshandel, R., Kilgore, D.: RESISTing reliability degradation through proactive reconfiguration. In: Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, ASE 2010, pp. 83–92. ACM, New York, NY, USA (2010)Google Scholar
 13.Elkhodary, A., Esfahani, N., Malek, S.: FUSION: A framework for engineering selftuning selfadaptive software systems. In: Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2010, pp. 7–16. ACM, New York, NY, USA (2010)Google Scholar
 14. Epifani, I., Ghezzi, C., Mirandola, R., Tamburrelli, G.: Model evolution by runtime parameter adaptation. In: Proceedings of the 31st International Conference on Software Engineering, ICSE 2009, pp. 111–121. IEEE Computer Society, Washington, DC, USA (2009)
 15. Esfahani, N., Kouroshfar, E., Malek, S.: Taming uncertainty in self-adaptive software. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE 2011, pp. 234–244. ACM, New York, NY, USA (2011)
 16. Esfahani, N., Malek, S.: Uncertainty in self-adaptive software systems. In: de Lemos, R., Giese, H., Müller, H.A., Shaw, M. (eds.) Software Engineering for Self-Adaptive Systems. LNCS, vol. 7475, pp. 214–238. Springer, Heidelberg (2013)
 17. Filieri, A., Hoffmann, H., Maggio, M.: Automated multi-objective control for self-adaptive software design. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pp. 13–24 (2015)
 18. Filieri, A., Tamburrelli, G., Ghezzi, C.: Supporting self-adaptation via quantitative verification and sensitivity analysis at run time. IEEE Trans. Softw. Eng. 42, 75–99 (2015)
 19. Forejt, V., Kwiatkowska, M., Norman, G., Parker, D.: Automated verification techniques for probabilistic systems. In: Bernardo, M., Issarny, V. (eds.) SFM 2011. LNCS, vol. 6659, pp. 53–113. Springer, Heidelberg (2011)
 20. Ghezzi, C., Pinto, L.S., Spoletini, P., Tamburrelli, G.: Managing non-functional uncertainty via model-driven adaptivity. In: Proceedings of the International Conference on Software Engineering, ICSE 2013, pp. 33–42. IEEE Press (2013)
 21. Givan, R., Leach, S.M., Dean, T.L.: Bounded-parameter Markov Decision Processes. Artif. Intell. 122(1–2), 71–109 (2000)
 22. Huebscher, M.C., McCann, J.A.: A survey of autonomic computing – degrees, models, and applications. ACM Comput. Surv. 40(3), 7:1–7:28 (2008)
 23. Moreno, G.A., Cámara, J., Garlan, D., Schmerl, B.: Proactive self-adaptation under uncertainty: A probabilistic model checking approach. In: Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2015, pp. 1–12. ACM, New York, NY, USA (2015)
 24. Nilim, A., Ghaoui, L.E.: Robust control of Markov Decision Processes with uncertain transition matrices. Oper. Res. 53(5), 780–798 (2005)
 25. Poladian, V., Garlan, D., Shaw, M., Satyanarayanan, M., Schmerl, B., Sousa, J.: Leveraging resource prediction for anticipatory dynamic configuration. In: First International Conference on Self-Adaptive and Self-Organizing Systems, SASO 2007, pp. 214–223 (2007)
 26. Puggelli, A., Li, W., Sangiovanni-Vincentelli, A.L., Seshia, S.A.: Polynomial-time verification of PCTL properties of MDPs with convex uncertainties. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 527–542. Springer, Heidelberg (2013)
 27. Puterman, M.L.: Markov decision processes. Handb. Oper. Res. Manage. Sci. 2, 331–434 (1990)
 28. Su, G., Feng, Y., Chen, T., Rosenblum, D.S.: Asymptotic perturbation bounds for probabilistic model checking with empirically determined probability parameters. IEEE Trans. Softw. Eng. 99, 1–19 (2015)
 29. Su, G., Rosenblum, D.S., Tamburrelli, G.: Reliability of runtime quality-of-service evaluation using parametric model checking. In: Proceedings of the 38th International Conference on Software Engineering, ICSE 2016. ACM, New York, NY, USA (2016)
 30. Tewari, A., Bartlett, P.L.: Bounded parameter Markov decision processes with average reward criterion. In: Bshouty, N.H., Gentile, C. (eds.) COLT. LNCS (LNAI), vol. 4539, pp. 263–277. Springer, Heidelberg (2007)
 31. Varga, R.S.: Matrix Iterative Analysis. Springer Series in Computational Mathematics. Springer, Heidelberg (2009)
 32. Wiesemann, W., Kuhn, D., Rustem, B.: Robust Markov decision processes. Math. Oper. Res. 38(1), 153–183 (2013)