Keywords

figure a

1 Introduction

Population protocols [1, 3, 4] are a model of distributed computing in which replicated, mobile agents with limited computational power interact stochastically to achieve a common task. They provide a simple and elegant formalism to model, e.g., networks of passively mobile sensors [1, 5], trust propagation [13], evolutionary dynamics [14], and chemical systems, under the name chemical reaction networks [12, 16, 19].

Population protocols are parameterized: the number of agents does not change during the execution of the protocol, but is a priori unbounded. A protocol is correct if it behaves correctly for all of its infinitely many initial configurations. For this reason, it is challenging to design correct and efficient protocols.

In this paper we introduce PeregrineFootnote 1, the first tool for the parameterized analysis of population protocols. Peregrine is intended for use by researchers in distributed computing and systems biology. It allows the user to specify protocols either through an editor or as simple scripts, and to analyze them via a graphical interface. The analysis features of Peregrine include manual step-by-step simulation; automatic sampling; statistics generation of average convergence speed; detection of incorrect executions through simulation; and formal verification of correctness. The first four features are supported for all protocols, while verification is supported for silent protocols, a large subclass of protocols [6]. Verification is performed automatically over all of the infinitely many initial configurations using the recent approach of [6] for solving the so-called well-specification problem.

Related Work. The problem of automatically verifying that a population protocol conforms to its specification for one fixed initial configuration has been considered in [10, 11, 17, 20]. In [10], ad hoc search algorithms are used. In [11, 17], the authors show how to model the problem in the probabilistic model checker Prism, and under certain conditions in Spin. In [20], the problem is modeled with the Pat toolkit for model checking under fairness assumptions. All these tools increase our confidence in the correctness of a protocol. However, compared to Peregrine, they are not visual tools, they do not offer simulation capabilities, and they can only verify the correctness of a protocol for a finite number of initial configurations, with typically a small number of agents. Peregrine proves correctness for all of the infinitely many initial configurations, with an arbitrarily large number of agents.

As mentioned in the introduction, population protocols are isomorphic to chemical reaction networks (CRNs), a popular model in natural computing. Cardelli et al. have recently developed model checking techniques and analysis algorithms for stochastic CRNs [7,8,9]. The problems studied therein are incomparable to the parameterized questions addressed by Peregrine.

The verification algorithm of Peregrine is based on [6], where a novel approach for the parameterized verification of silent population protocols has been presented. The command-line tool of [6] only offers support for proving correctness, with no functionality for visualization or simulation. Further, contrary to Peregrine, the tool cannot produce counterexamples when correctness fails.

2 Population Protocols

We introduce population protocols through a simple example and then briefly formalize the model. We refer the reader to [4] for a more thorough but still intuitive presentation. Suppose anonymous and mobile agents wish to take a majority vote. Intuitively, anonymous means that agents have no identity, and mobile that agents are “wandering around”, and can only interact whenever they bump into each other. In order to vote, all agents conduct the following protocol. Each agent is in one out of four states \(\{Y,N, y, n\}\). Initially all agents are in the states Y or N, corresponding to how they want to vote (states yn are auxiliary states). Agents repeatedly interact pairwise according to the following rules:

$$\begin{aligned} a :YN \mapsto yn \qquad b :Yn \mapsto Yy \qquad c :Ny \mapsto Nn \qquad d :yn \mapsto yy \end{aligned}$$

For example, if the population initially has two agents of opinion “yes” and one agent of opinion “no”, then a possible execution is:

(1)

where e.g. denotes the multiset with two agents in state Y and one agent in state N.

The goal of every population protocol is to ensure that the agents eventually reach a lasting consensus, i.e., a multiset in which (1) either all agents are in “yes”-states, or all agents are in “no”-states, and (2) further interactions do not destroy the consensus. On top of this universal specification, each protocol has an individual goal, determining which initial configurations should reach the “yes” and the “no” lasting consensus. In the majority protocol above, the agents should reach a “yes”-consensus iff 50% or more agents vote “yes”.

Execution (1) above leads to a lasting “yes”-consensus; further, the consensus is the right one, since 2 out of 3 agents voted “yes”. In fact, assuming agents interact uniformly and independently at random, the above protocol is correct: executions almost surely reach a correct lasting consensus.

More formally, a population protocol is a tuple (QTIO) where Q is a finite set of states, \(T \subseteq Q^2 \times Q^2\) is a set of transitions, \(I \subseteq Q\) are the initial states and \(O :Q \rightarrow \{0, 1\}\) is the output mapping. A configuration is a non-empty multiset over Q, an initial configuration is a non-empty multiset over I, and a configuration is terminal if it cannot be altered by any transition. A configuration is in a consensus if all of its states map to the same output under O.

An execution is a finite or infinite sequence \(C_0 \xrightarrow {t_1} C_1 \xrightarrow {t_2} \cdots \) such that \(C_i\) is obtained from applying transition \(t_i\) to \(C_{i-1}\). A fair execution is either a finite execution that reaches a terminal configuration, or an infinite execution such that if \(\{i \in \mathbb {N}: C_i \xrightarrow {*} D\}\) is infinite, then \(\{i \in \mathbb {N}: C_i = D\}\) is infinite for any configuration D. In other words, fairness ensures that a configuration cannot be avoided forever if it is reachable infinitely often. Fairness is an abstraction of the random interactions occurring within a population. A configuration C is in a lasting consensus if every execution from C only leads to configurations of the same consensus.

If for every initial configuration C, all fair executions from C lead to a lasting consensus \(\varphi (C) \in \{0,1\}\), then we say that the protocol computes the predicate \(\varphi \). For example, the above majority protocol with \(O(Y) = O(y) = 1\) and \(O(N) = O(n) = 0\) computes the predicate \(C[Y] \ge C[N]\), where C[x] denotes the number of occurrences of state x in C. A protocol does not necessarily compute a predicate. For example, if we alter the majority protocol by removing transition d, then is a fair execution, but is not in a consensus. In other words, transition d acts as a tie-breaker which allows to reach the consensus configuration . A protocol that computes a predicate is said to be well-specified. It is well-known that well-specified population protocols compute precisely the predicates definable in Presburger arithmetic [3]. On top of different majority protocols for the predicate \(C[x] \ge C[y]\), the literature contains, e.g., different families of so-called flock-of-birds protocols for the predicates \(C[x] \ge c\), where c is an integer constant, and families of threshold protocols for the predicates \(a_1 \cdot C[x_1] + \cdots + a_n \cdot C[x_n] \ge c\), where \(a_1, \ldots , a_n, c\) are integer constants and \(x_1, \ldots , x_n\) are initial states.

3 Analyzing Population Protocols

Peregrine is a web tool with a JavaScript frontend and a Haskell backend. The backend makes use of the SMT solver Z3 [15] to test satisfiability of Presburger arithmetic formulas. The user has access to four main features through the graphical frontend. We present these features in the remainder of the section.

Protocol Description. Peregrine offers a description language for both single protocols and families of protocols depending on some parameters. Single protocols are described either through a graphical editor or as simple Python scripts. Families of protocols (called parametric protocols) can only be specified as scripts, but Peregrine assists the user by generating a code skeleton.

Simulation. Population protocols can be simulated through a graphical player depicted in Fig. 1. The user can pick an initial configuration and simulate the protocol by either manual selection of interactions, or by letting a scheduler pick interactions uniformly at random. The simulator keeps a history of the execution which can be rewound at any time, making it easy to experiment with the different behaviours of a protocol. Configurations can be displayed in two ways: either as explicit populations, as illustrated in Fig. 1, or as bar charts of the states count, more convenient for large populations.

Fig. 1.
figure 1

Simulation of the majority protocol from the initial configuration .

Statistics. Peregrine can generate statistics from batch simulations. The user provides four parameters: \(s_\text {min}, s_\text {max}, m\) and n. Peregrine generates n random executions as follows. For each execution, a number s is picked uniformly at random from \([s_\text {min}, s_\text {max}]\), and an initial configuration of size s is then picked uniformly at random. Each step of an execution is picked uniformly at random among enabled interactions. If no terminal configuration is reached within m steps, then the simulation halts. In the end, n executions of length at most m are gathered. Peregrine classifies the generated executions according to their consensus, and computes statistics on the convergence speed (see the next two paragraphs). The results can be visualized in different ways, and the raw data can be exported as a JSON file.

Consensus. For each random execution, Peregrine checks whether the last configuration of an execution is in a consensus and, if so, whether the consensus corresponds to the expected output of the protocol. Peregrine reports which percentage of the executions reach a consensus, and whether the consensus is correct and/or lasting. In normal mode, Peregrine only classifies an execution as lasting consensus if it ends in a terminal configuration. In the increased accuracy mode, if the execution ends in a configuration C of consensus \(b \in \{0, 1\}\), then the model checker LoLA [18] is used to determine whether there exists a configuration \(C'\) such that \(C \xrightarrow {*} C'\) and \(C'\) is not of consensus b. If it is not the case, then Peregrine concludes that C is in a lasting consensus. Peregrine plots the percentage of executions in each category as a function of the population size, as illustrated on the left of Fig. 2.

Average Convergence Speed. Peregrine also provides statistics on the convergence speed of a protocol. Let \(C_0 \xrightarrow {t_1} C_1 \xrightarrow {t_2} \cdots \xrightarrow {t_\ell } C_\ell \) be an execution such that \(C_\ell \) is in a consensus \(b \in \{0, 1\}\). The number of steps to convergence of the execution is defined as 0 if all configurations are of consensus b, and otherwise as \(i + 1\), where i is the largest index such that \(C_i\) is not in consensus b. For each population size, Peregrine computes the average number of steps to convergence of all consensus executions of that population size, and plots the information as illustrated on the right of Fig. 2.

Fig. 2.
figure 2

Statistics for 5000 random executions of the approximate majority protocol of [2], of length at most 40, from initial configurations of size at most 25. The left plot shows the percentage of executions reaching a consensus (dark green: lasting correct, light green: correct, light red: incorrect, dark red: lasting incorrect) and no consensus (orange). In this example the occurrences of light red are negligible. The right plot shows the average number of steps to convergence. (Color figure online)

Fig. 3.
figure 3

Verification of the majority protocol of Sect. 2 without transition \(d :yn \mapsto yy\).

Verification. Peregrine can automatically verify that a population protocol computes a given predicate. Predicates can be specified by the user in quantifier-free Presburger arithmetic extended with the family of predicates \(\{x \equiv y\ (\text {mod } c)\}_{c \ge 2}\), which is equivalent to Presburger arithmetic. For example, for the majority protocol of Sect. 2, the user simply specifies C[Y]>= C[N].

Peregrine implements the approach of [6] to verify correctness of protocols which are silent. A protocol is said to be silent if from every initial configuration, every fair execution leads to a terminal configuration. The majority protocol of Sect. 2 and most existing protocols from the literature are silent [6]. We briefly describe the approach of [6] and how it is integrated into Peregrine.

Suppose we are given a population protocol \(\mathcal {P}\) and we wish to determine whether it computes a predicate \(\varphi \). The procedure first tries to prove that \(\mathcal {P}\) is silent. This is done by verifying a more restricted condition called layered termination. Verifying the latter property reduces to testing satisfiability of a Presburger arithmetic formula. If this formula holds, then the protocol is silent, otherwise no conclusion is derived. However, essentially all existing silent protocols satisfy layered termination [6].

Once \(\mathcal {P}\) is proven to be silent, the procedure attempts to prove that no “bad execution” exists. More precisely, it checks whether there exist configurations \(C_0\) and \(C_1\) such that \(C_0 \xrightarrow {*} C_1\), \(C_0\) is initial, \(C_1\) is terminal, and \(C_1\) is not in consensus \(\varphi (C_0) \in \{0,1\}\). Since reachability is not definable in Presburger arithmetic, a Presburger-definable over-approximation of reachability, borrowed from Petri net theory, is used instead. We obtain the following formula \(\varPhi _\text {bad-exec}\):

If \(\varPhi _\text {bad-exec}\) is unsatisfiable, then \(\mathcal {P}\) is correct. Otherwise, no conclusion is reached, and \(\varPhi _\text {bad-exec}\) is iteratively strengthened by enriching the over-approximation . Whenever \(\varPhi _\text {bad-exec}\) is satisfied by \((C_0, C_1)\), Peregrine calls the model-checker LoLA to test whether \(C_1\) is indeed reachable from \(C_0\). If so, then Peregrine reports \(\mathcal {P}\) to be incorrect, and generates a counter-example execution, which can be replayed or exported as a JSON file (see Fig. 3).

Currently Peregrine can verify protocols with up to a hundred states and a few thousands transitions. The bottleneck is the size of the constraint system. Due to lack of space, we refer the reader to [6] for detailed experimental results.