On the Complexity of Checking Consistency for Replicated Data Types
Abstract
Recent distributed systems have introduced variations of familiar abstract data types (ADTs) like counters, registers, flags, and sets, that provide high availability and partition tolerance. These conflict-free replicated data types (CRDTs) utilize mechanisms to resolve the effects of concurrent updates to replicated data. Naturally these objects weaken their consistency guarantees to achieve availability and partition-tolerance, and various notions of weak consistency capture those guarantees.
In this work we study the tractability of CRDT-consistency checking. To capture guarantees precisely, and facilitate symbolic reasoning, we propose novel logical characterizations. By developing novel reductions from propositional satisfiability problems, and novel consistency-checking algorithms, we discover both positive and negative results. In particular, we show intractability for replicated flags, sets, counters, and registers, yet tractability for replicated growable arrays. Furthermore, we demonstrate that tractability can be redeemed for registers when each value is written at most once, for counters when the number of replicas is fixed, and for sets and flags when the number of replicas and variables is fixed.
1 Introduction
Recent distributed systems have introduced variations of familiar abstract data types (ADTs) like counters, registers, flags, and sets, that provide high availability and partition tolerance. These conflict-free replicated data types (CRDTs) [33] efficiently resolve the effects of concurrent updates to replicated data. Naturally they weaken consistency guarantees to achieve availability and partition-tolerance, and various notions of weak consistency capture such guarantees [8, 11, 29, 35, 36].
In this work we study the tractability of CRDT consistency checking; Fig. 1 summarizes our results. In particular, we consider runtime verification: deciding whether a given execution of a CRDT is consistent with its ADT specification. This problem is particularly relevant as distributed-system testing tools like Jepsen [25] are appearing; without efficient, general consistency-checking algorithms, such tools could be limited to specialized classes of errors like node crashes.
Our study proceeds in three parts. First, to precisely characterize the consistency of various CRDTs, and facilitate symbolic reasoning, we develop novel logical characterizations to capture their guarantees. Our logical models are built on a notion of abstract execution, which relates the operations of a given history with three separate relations: a read-from relation, governing the observations from which a given operation constitutes its own return value; a happens-before relation, capturing the causal relationships among operations; and a linearization relation, capturing any necessary arbitration among non-commutative effects which are executed concurrently, e.g., following a last-writer-wins policy. Accordingly, we capture data type specifications with logical axioms interpreted over the read-from, happens-before, and linearization relations of abstract executions, reducing the consistency problem to: does there exist an abstract execution over the given history which satisfies the axioms of the given data type?
Second, we demonstrate the intractability of several replicated data types by reduction from propositional satisfiability (SAT) problems. In particular, we consider the 1-in-3 SAT problem [19], which asks for a truth assignment to the variables of a given set of clauses such that exactly one literal per clause is assigned true. Our reductions essentially simulate the existential choice of a truth assignment with the existential choice of the read-from and happens-before relations of an abstract execution. For a given 1-in-3 SAT instance, we construct a history of replicas obeying carefully-tailored synchronization protocols, which is consistent exactly when the corresponding SAT instance is positive.
Third, we develop tractable consistency-checking algorithms for individual data types and special cases: replicated growing arrays; multi-value and last-writer-wins registers, when each value is written only once; counters, when replicas are bounded; and sets and flags, when their sizes are also bounded. While the algorithms for each case are tailored to the algebraic properties of the data types they handle, they essentially all function by constructing abstract executions incrementally, processing replicas’ operations in prefix order.
In summary, this work makes the following contributions:
 1. We develop novel logical characterizations of consistency for the replicated register, flag, set, counter, and array data types (Sect. 2);
 2. We develop novel reductions from propositional satisfiability problems to consistency checking to demonstrate intractability for replicated flags, sets, counters, and registers (Sect. 3); and
 3. We develop tractable consistency-checking algorithms for replicated growable arrays; for registers, when written values are unique; for counters, when replicas are bounded; and for sets and flags, when their sizes are also bounded (Sects. 4–6).
2 A Logical Characterization of Replicated Data Types
In this section we describe an axiomatic framework for defining the semantics of replicated data types. We consider a set of method names \(\mathbb {M}\), and assume that each method \(\mathsf {m}\in \mathbb {M}\) has a number of arguments and a return value sampled from a data domain \(\mathbb {D}\). We will use operation labels of the form \(\mathsf {m}(a) \overset{i}{\Rightarrow } b\) to represent the call of a method \(\mathsf {m}\in \mathbb {M}\), with argument \(a\in \mathbb {D}\), and resulting in the value \(b\in \mathbb {D}\). Since there might be multiple calls to the same method with the same arguments and result, labels are tagged with a unique identifier i. We will ignore identifiers when unambiguous.
The interaction between a data type implementation and a client is represented by a history \({h}={\left\langle {\mathsf {Op}, \mathsf {ro}}\right\rangle }\) which consists of a set of operation labels \(\mathsf {Op}\) and a partial replica order \(\mathsf {ro}\) ordering operations issued by the client on the same replica. Usually, \(\mathsf {ro}\) is a union of sequences, each sequence representing the operations issued on the same replica, and the width of \(\mathsf {ro}\), i.e., the maximum number of mutually-unordered operations, gives the number of replicas in a given history.
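To make the definitions concrete, a two-replica history might be encoded as follows. This is a sketch: the Python representation and the helper `num_replicas` are our own assumptions, not from the paper.

```python
# A history h = <Op, ro>: operation labels keyed by unique identifier,
# plus a replica order ro given as a set of ordered identifier pairs.
history = {
    "Op": {
        1: ("add", "x", None),        # add(x), identifier 1
        2: ("contains", "x", True),   # contains(x) => true, identifier 2
        3: ("remove", "x", None),     # remove(x), issued on a second replica
    },
    "ro": {(1, 2)},                   # ops 1 and 2 on the same replica
}

def num_replicas(h):
    """When ro is a union of sequences, its width equals the number of
    chains; we count chain heads, i.e., ops with no ro-predecessor."""
    has_pred = {b for (_, b) in h["ro"]}
    return sum(1 for o in h["Op"] if o not in has_pred)
```

For the history above, `num_replicas` reports two replicas: operations 1 and 2 form one chain, operation 3 the other.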
The consistency of a history is captured by an abstract execution, which enriches the history’s operations with three relations:
a read-from binary relation \(\textsf {rf} \) over operations in \(\mathsf {Op}\), which identifies the set of updates needed to “explain” a certain return value, e.g., a write operation explaining the return value of a read,

a strict partial happens-before order \(\textsf {hb}\), which includes \(\mathsf {ro}\) and \(\textsf {rf} \), representing the causality constraints in an execution, and

a strict total linearization order \(\mathsf {lin}\), which includes \(\textsf {hb}\), used to model conflict resolution policies based on timestamps.
A replicated data type is defined by a set of first-order axioms \(\varPhi \) characterizing the relations in an abstract execution. A history \({h}\) is admitted by a data type when there exists an abstract execution \({e}\) such that \({\left\langle {{h},{e}}\right\rangle }\,\models \,\varPhi \). The satisfaction relation \(\models \) is defined as usual in first-order logic. The admissibility problem is the problem of checking whether a history \({h}\) is admitted by a given data type.
In the following, we define the replicated data types with respect to which we study the complexity of the admissibility problem. The axioms used to define them are listed in Figs. 2 and 3. These axioms use the function symbols method, argument, and return interpreted over operation labels, whose semantics is self-explanatory.
2.1 Replicated Sets and Flags
The Add-Wins Set and Remove-Wins Set [34] are two implementations of a replicated set with operations add(x), remove(x), and contains(x) for adding, removing, and checking membership of an element x. Although the meaning of these methods is self-evident from their names, the result of conflicting concurrent operations is not evident. When concurrent add(x) and remove(x) operations are delivered to a certain replica, the Add-Wins Set chooses to keep the element x in the set, so every subsequent invocation of contains(x) on this replica returns \( true \), while the Remove-Wins Set makes the dual choice of removing x from the set.
The definition of the Remove-Wins Set is similar, except for the parameters of RetvalSet, which become \( \textsc {RetvalSet}(\mathsf{contains}, false ,\mathsf{remove}), \) i.e., a contains(x) returns \( false \) iff it reads-from at least one remove(x).
The Enable-Wins Flag and Disable-Wins Flag are implementations of a set of flags with operations: enable(x), disable(x), and read(x), where enable(x) turns the flag x to true, disable(x) turns x to false, while read(x) returns the state of the flag x. Their semantics is similar to the Add-Wins Set and Remove-Wins Set, respectively, where enable(x), disable(x), and read(x) play the role of add(x), remove(x), and contains(x), respectively. Their axioms are defined as above.
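The add-wins policy can be sketched as a small decision procedure: contains(x) returns true exactly when some hb-maximal add/remove(x) predecessor of the read is an add. The helper below is hypothetical — the paper formalizes this via the RetvalSet axiom — and represents operations as (method, identifier) pairs and hb as a set of ordered pairs.

```python
def contains_addwins(preds, hb):
    """preds: the add(x)/remove(x) operations that happen-before a given
    contains(x); hb: the happens-before relation as a set of pairs.
    Add-wins return value: true iff some add(x) is hb-maximal in preds."""
    maximal = [o for o in preds if not any((o, p) in hb for p in preds)]
    return any(m == "add" for (m, _) in maximal)
```

With a concurrent add and remove (no hb edge between them) both are maximal and the add wins; when the remove happens-after the add, only the remove is maximal and contains returns false.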
2.2 Replicated Registers
We consider two variations of replicated registers called Multi-Value Register (MVR) and Last-Writer-Wins Register (LWW) [34] which maintain a set of registers and provide write(x,v) operations for writing a value v on a register x and read(x) operations for reading the content of a register x (the domain of values is kept unspecified since it is irrelevant). While a read(x) operation of MVR returns all the values written by concurrent writes which are maximal among its happens-before predecessors, thereby leaving the responsibility for resolving conflicts between concurrent writes to the client, a read(x) operation of LWW returns a single value chosen using a conflict-resolution policy based on timestamps. Each written value is associated with a timestamp, and a read operation returns the most recent value w.r.t. the timestamps. This order between timestamps is modeled using the linearization order of an abstract execution.
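The two read semantics can be sketched side by side. These are hypothetical helpers (not the paper's axioms): writes are (identifier, value) pairs, `hb` is a set of ordered pairs, and `lin` is a list of operations in linearization order.

```python
def read_mvr(writes, hb):
    """MVR read: the set of values of hb-maximal write(x,_) predecessors."""
    maximal = [w for w in writes if not any((w, u) in hb for u in writes)]
    return {v for (_, v) in maximal}

def read_lww(writes, lin):
    """LWW read: the value of the lin-greatest write(x,_) predecessor,
    i.e., the write with the most recent timestamp."""
    return max(writes, key=lin.index)[1]
```

For two concurrent writes, MVR returns both values while LWW arbitrates via the linearization; once one write happens-before the other, MVR also returns a single value.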
LWW is obtained from the definition of MVR by replacing ReadAllMaximals with the axiom LinLWW which ensures that every write(x,_) operation which happens-before a read(x) operation is linearized before the write(x,_) operation from where the read(x) takes its value (when these two write operations are different). This definition of LWW is inspired by the “bad-pattern” characterization in [6], corresponding to their causal convergence criterion.
2.3 Replicated Counters
The replicated counter datatype [34] maintains a set of counters interpreted as integers (the counters can become negative). This datatype provides operations inc(x) and dec(x) for incrementing and decrementing a counter x, and read(x) operations to read the value of the counter x. The semantics of the replicated counter is quite standard: a read(x) operation returns the value computed as the difference between the number of inc(x) operations and dec(x) operations among its happens-before predecessors. The axioms defined below will enforce the fact that a read(x) operation reads-from all its happens-before predecessors which are inc(x) or dec(x) operations.
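The counter read value follows directly from this description; a hypothetical helper, with operations represented as (method, variable) pairs:

```python
def read_counter(x, preds):
    """RetvalCounter, informally: a read(x) returns the number of inc(x)
    minus the number of dec(x) among its happens-before predecessors."""
    inc = sum(1 for (m, y) in preds if m == "inc" and y == x)
    dec = sum(1 for (m, y) in preds if m == "dec" and y == x)
    return inc - dec
```

Note that operations on other counters are simply ignored: two inc(x), one dec(x), and one inc(y) among the predecessors yield the value 1 for x.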
2.4 Replicated Growable Array
The Replicated Growing Array (RGA) [32] is a replicated list used for text-editing applications. RGA supports three operations: addAfter(a,b) which adds the character b immediately after the occurrence of the character a assumed to be present in the list, remove(a) which removes a assumed to be present in the list, and read() which returns the list contents. It is assumed that a character is added at most once^{2}. Conflicts between concurrent addAfter operations that add a character immediately after the same character are resolved using timestamps (i.e., each added character is associated with a timestamp and the order between characters depends on the order between the corresponding timestamps), which in the axioms below are modeled by the linearization order.
The axiom ReadFromRGA ensures that:
every addAfter(a,b) operation reads-from the addAfter(_,a) adding the character a, except when \(a=\circ \) which denotes the “root” element of the list^{3},

every remove(a) operation reads-from the operation adding a, and

every read operation returning a list containing a reads-from the operation addAfter(_,a) adding a.
Then, RetvalRGA ensures that a read operation \(o_1\) happening-after an operation adding a character a reads-from a remove(a) operation when a doesn’t occur in the list returned by \(o_1\) (the history must contain a remove(a) operation because otherwise, a should have occurred in the list returned by the read).
Finally, LinRGA models the conflict resolution policy by constraining the linearization order between addAfter(a,_) operations adding some character immediately after the same character a. As a particular case, LinRGA enforces that addAfter(a,b) is linearized before addAfter(a,c) when a read operation returns a list where c precedes b (addAfter(a,b) results in the list \(a\cdot b\) and applying addAfter(a,c) on \(a\cdot b\) results in the list \(a\cdot c\cdot b\)). However, this is not sufficient: assume that the history contains the two operations addAfter(a,b) and addAfter(a,c) along with two operations remove(b) and addAfter(b,d). Then, a read operation returning the list \(a\cdot c\cdot d\) must enforce that addAfter(a,b) is linearized before addAfter(a,c) because this is the only order between these two operations that can lead to the result \(a\cdot c\cdot d\), i.e., executing addAfter(a,b), addAfter(b,d), remove(b), addAfter(a,c) in this order. LinRGA deals with any scenario where arbitrarily many characters can be removed from the list: \(\textsf {rf} ^*_\mathsf{addAfter}\) is the reflexive and transitive closure of the projection of \(\textsf {rf} \) on \(\mathsf{addAfter}\) operations and \(<_{o_5}\) denotes the order between characters in the list returned by the read operation \(o_5\).
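The scenario above can be replayed concretely by executing operations in a candidate linearization order, where addAfter(a,b) inserts b immediately after a. This is a sketch with our own representation (the paper formalizes the constraint via the LinRGA axiom rather than by replay):

```python
ROOT = "∘"  # the root element; never returned by read operations

def interpret_rga(ops):
    """Replay addAfter/remove operations in linearization order; read()
    is modeled as the final list contents without the root."""
    lst = [ROOT]
    for op in ops:
        if op[0] == "addAfter":
            _, a, b = op
            lst.insert(lst.index(a) + 1, b)  # b lands immediately after a
        else:                                 # ("remove", a)
            lst.remove(op[1])
    return [c for c in lst if c != ROOT]

# The only order of the text's four operations that yields a·c·d:
order = [("addAfter", ROOT, "a"), ("addAfter", "a", "b"),
         ("addAfter", "b", "d"), ("remove", "b"), ("addAfter", "a", "c")]
```

Replaying `order` yields a·c·d, and replaying addAfter(a,b) followed by addAfter(a,c) alone yields a·c·b, matching the examples in the text.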
3 Intractability for Registers, Sets, Flags, and Counters
In this section we demonstrate that checking consistency is intractable for many widely-used data types. While this is not completely unexpected, since some related consistency-checking problems like sequential consistency are also intractable [20], it contrasts with recent tractability results for checking strong consistency (i.e., linearizability) of common non-replicated data types like sets, maps, and queues [15]. In fact, in many cases we show that intractability holds even if the number of replicas is fixed.
Our proofs of intractability follow the general structure of Gibbons and Korach’s proofs for the intractability of checking sequential consistency (SC) for atomic registers with read and write operations [20]. In particular, we reduce a specialized type of NP-hard propositional satisfiability (SAT) problem to checking whether histories are admitted by a given data type. While our construction borrows from Gibbons and Korach’s, the adaptation from SC to CRDT consistency requires a significant extension to handle the consistency relaxation represented by abstract executions: rather than a direct sequencing of threads’ operations, CRDT consistency requires the construction of three separate relations: read-from, happens-before, and linearization.
Theorem 1
The admissibility problem is NP-hard when the number of replicas is fixed for the following data types: Add-Wins Set, Remove-Wins Set, Enable-Wins Flag, Disable-Wins Flag, Multi-Value Register, and Last-Writer-Wins Register.
Proof
We demonstrate a reduction from the 1-in-3 SAT problem. For a given problem \(p = \bigwedge _{i=1}^{m} (\alpha _i \vee \beta _i \vee \gamma _i)\) over variables \(x_1, \ldots , x_n\), we construct a 3-replica history \(h_p\) of the flag data type — either enable- or disable-wins — as illustrated in Fig. 4. The encoding includes a flag variable \(x_j\) for each propositional variable \(x_j\), along with a per-replica flag variable \(y_j\) used to implement synchronization barriers. Intuitively, executions of \(h_p\) proceed in \(m+1\) rounds: the first round corresponds to the assignment of a truth valuation, while subsequent rounds check the validity of each clause given the assignment. The reductions to sets and registers are slight variations on this proof, in which the Read, Enable, and Disable operations are replaced with Contains, Add, and Remove operations, respectively, and with Reads and Writes of values 1 and 0, respectively.
In the “only if” direction, the happens-before relation of any abstract execution admitting \(h_p\) must order:
 1. every pair of operations in distinct rounds — due to barriers; and
 2. every operation in a given round, over all replicas, without interleaving the operations of distinct replicas within the same round — since a replica’s reads in a given round are only consistent with the other replicas’ after the re-enabling and disabling of flag variables.
In other words, replicas appear to execute atomically per round, in a round-robin fashion. Furthermore, since all operations in a given round happen before the operations of subsequent rounds, the values of flag variables are consistent across rounds — i.e., as read by the first replica to execute in a given round — and determined in the initial round either by conflict resolution — i.e., enable- or disable-wins — or by happens-before, in case conflict resolution would have been inconsistent with subsequent reads.
In the “if” direction, let \(\varvec{r} \in {\{{0,1,2}\}}^m\) be the positions of positively-assigned variables in each clause, e.g., \(r_i = 0\) implies \(\alpha _i = true \) and \(\beta _i = \gamma _i = false \). We construct an abstract execution \(e_{\varvec{r}}\) in which the happens-before relation sequences the operations of replica \(r_i\) before those of replica \(r_i + 1 \bmod 3\), and those in turn before replica \(r_i + 2 \bmod 3\). In other words, the replicas in round i appear to execute in left-to-right order starting with the replica \(r_i\), whose reads correspond to the satisfying assignment of \((\alpha _i \vee \beta _i \vee \gamma _i)\). The read-from relation of \(e_{\varvec{r}}\) relates each \(\mathrm {Read}(x_j) = true \) operation to the most recent \(\mathrm {Enable}(x_j)\) operation in happens-before order, which is unique since happens-before sequences the operations of all rounds; the case for \(\mathrm {Read}(x_j) = false \) and \(\mathrm {Disable}(x_j)\) is symmetric. It is then straightforward to verify that \(e_{\varvec{r}}\) satisfies the axioms of the enable- or disable-wins flag, and thus \(h_p\) is admitted. \(\square \)
Theorem 1 establishes intractability of consistency checking for the aforementioned sets, flags, and registers, independently of the number of replicas. In contrast, our proof of Theorem 2 for counter data types depends on the number of replicas, since our encoding requires two replicas per propositional variable. Intuitively, since counter increments and decrements are commutative, the initial round in the previous encoding would have fixed all counter values to zero. Instead, the next encoding isolates initial increments and decrements to independent replicas. The weaker result is indeed tight since checking counter consistency with a fixed number of replicas is polynomial time, as Sect. 5 demonstrates.
Theorem 2
The admissibility problem for the Counter data type is NP-hard.
Proof
We demonstrate a reduction from the 1-in-3 SAT problem. For a given problem \(p = \bigwedge _{i=1}^{m} (\alpha _i \vee \beta _i \vee \gamma _i)\) over variables \(x_1, \ldots , x_n\), we construct a history \(h_p\) of the counter data type over \(2n+3\) replicas, as illustrated in Fig. 5.
Besides the differences imposed by the commutativity of counter increments and decrements, our reduction follows the same strategy as in the proof of Theorem 1: the happens-before relation of \(h_p\)’s abstract executions orders every pair of operations in distinct rounds (of Replicas 0–2), and every operation in a given (non-initial) round. As before, Replicas 0–2 appear to execute atomically per round, in a round-robin fashion, and counter variables are consistent across rounds. The key difference is that here abstract executions’ happens-before relations only relate the operations of either Replica \(2j\!+\!1\) or \(2j\!+\!2\), for each \(j = 1, \ldots , n\), to operations in subsequent rounds: the other’s operations are never observed by other replicas. Our encoding ensures that exactly one of each is observed by ensuring that the counter y is incremented exactly n times — and by relying on the fact that every variable appears in some clause, so that a read that observed neither or both would yield the value zero, which is inconsistent with \(h_p\). Otherwise, our reasoning follows the proof of Theorem 1, in which the read-from relation selects all increments and decrements of the same counter variable in happens-before order. \(\square \)
4 Polynomial-Time Algorithms for Registers and Arrays
ComputeRF computes the read-from relation \(\textsf {rf} \) satisfying the ReadFrom_ and Retval_ axioms. In the case of LWW and MVR, it defines \(\textsf {rf} \) as the set of all pairs formed of write(x,v) and read(x) operations where v belongs to the return value of the read. By Retval_, each read(x) operation must be associated to at least one write(x,_) operation. Also, the fact that each value is written at most once implies that this \(\textsf {rf} \) relation is uniquely defined, e.g., for LWW, it is not possible to find two write operations that could be \(\textsf {rf} \)-related to the same read operation. In general, if there exists no \(\textsf {rf} \) relation satisfying these axioms, then ComputeRF returns a distinguished value \(\bot \) to signal a consistency violation. Note that the computation of the read-from relation for LWW and MVR takes quadratic time^{4} since the constraints imposed by the axioms relate only to operation labels — the methods they invoke and their arguments. The case of RGA is slightly more involved because the axiom RetvalRGA introduces more read-from constraints based on the happens-before order, which includes \(\mathsf {ro}\) and \(\textsf {rf} \) itself. In this case, the computation of \(\textsf {rf} \) relies on a fixpoint computation, which converges in at most quadratic time (the maximal size of \(\textsf {rf} \)), described in Algorithm 2. Essentially, we use the axiom ReadFromRGA to populate the read-from relation and then apply the axiom RetvalRGA iteratively, using the read-from constraints added in previous steps, until the computation converges.
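For LWW on differentiated histories, the read-from computation is a direct lookup. A minimal sketch with our own representation — operations are (method, variable, value) triples keyed by identifier, with `None` standing for \(\bot \):

```python
def compute_rf_lww(ops):
    """Read-from for LWW when each value is written at most once: map each
    read(x)=v to the unique write(x,v); return None when no such write
    exists, signaling that the Retval axiom cannot be satisfied."""
    writes = {(x, v): i for i, (m, x, v) in ops.items() if m == "write"}
    rf = set()
    for i, (m, x, v) in ops.items():
        if m == "read":
            if (x, v) not in writes:
                return None          # no write explains this return value
            rf.add((writes[(x, v)], i))
    return rf
```

Because every value is written at most once, the dictionary lookup makes \(\textsf {rf} \) unique; a read of a never-written value is immediately rejected.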
After computing the read-from relation, our algorithm defines the happens-before relation \(\textsf {hb}\) as the transitive closure of \(\mathsf {ro}\) union \(\textsf {rf} \). This is sound because none of the axioms of these datatypes enforce happens-before constraints that are not already captured by \(\mathsf {ro}\) and \(\textsf {rf} \). Then, it checks whether the \(\textsf {hb}\) defined this way is acyclic and satisfies the datatype’s axioms that constrain \(\textsf {hb}\), i.e., ReadFromMaximal and ReadAllMaximals (when they are present).
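This step can be sketched as a naive transitive closure followed by a cycle check. The helper is our own; a production implementation would use a linear-time topological sort rather than repeated pairwise joins.

```python
def happens_before(ro, rf):
    """hb = transitive closure of ro ∪ rf over operation identifiers;
    returns None when hb is cyclic, i.e., the history is inconsistent."""
    hb = set(ro) | set(rf)
    changed = True
    while changed:                       # saturate: add (a,d) for (a,b),(b,d)
        changed = False
        for (a, b) in list(hb):
            for (c, d) in list(hb):
                if b == c and (a, d) not in hb:
                    hb.add((a, d))
                    changed = True
    if any(a == b for (a, b) in hb):     # a self-loop witnesses a cycle
        return None
    return hb
```

On \(\mathsf {ro}=\{(1,2)\}\) and \(\textsf {rf} =\{(2,3)\}\) the closure adds (1,3); swapping the \(\textsf {rf} \) edge to (2,1) creates a cycle and the check fails.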
Finally, in the case of LWW and RGA, the algorithm computes a (partial) linearization order that satisfies the corresponding Lin_ axioms. Starting from an initial linearization order which is exactly the happensbefore, it computes new constraints by instantiating the universally quantified axioms LinLWW and LinRGA. Since these axioms are not “recursive”, i.e., they don’t enforce linearization order constraints based on other linearization order constraints, a standard instantiation of these axioms is enough to compute a partial linearization order such that any extension to a total order satisfies the datatype’s axioms.
Theorem 3
Algorithm 1 returns \( true \) iff the input history is consistent.
The following corollary holds because Algorithm 1 runs in polynomial time — the degree of the polynomial depends on the number of quantifiers in the datatype’s axioms. Indeed, Algorithm 1 represents a least fixpoint computation which converges in at most a quadratic number of iterations (the maximal size of \(\textsf {rf} \)).
Corollary 1
The admissibility problem is polynomial time for RGA, and for LWW and MVR on differentiated histories.
5 Polynomial-Time Algorithms for Replicated Counters
Let \({h}= (\mathsf {Op}, \mathsf {ro})\) be a history. To simplify the notation, we assume that the replica order is a union of sequences, each sequence representing the operations issued on the same replica. Therefore, each operation \(o\in \mathsf {Op}\) is associated with a replica identifier \(\mathsf{rep}(o)\in [1..n_{h}]\), where \(n_{h}\) is the number of replicas in \({h}\).
A prefix of \({h}\) is a set of operations \(\mathsf {Op}'\subseteq \mathsf {Op}\) such that all the \(\mathsf {ro}\) predecessors of operations in \(\mathsf {Op}'\) are also in \(\mathsf {Op}'\), i.e., \(\forall {o}\in \mathsf {Op}'.\ \mathsf {ro}^{-1}({o})\subseteq \mathsf {Op}'\). Note that the union of two prefixes of \({h}\) is also a prefix of \({h}\). The last operation of replica i in a prefix \(\mathsf {Op}'\) is the \(\mathsf {ro}\)-maximal operation o with \(\mathsf{rep}(o)=i\) included in \(\mathsf {Op}'\). A prefix \(\mathsf {Op}'\) is called valid if \((\mathsf {Op}',\mathsf {ro}')\), where \(\mathsf {ro}'\) is the projection of \(\mathsf {ro}\) on \(\mathsf {Op}'\), is admitted by the replicated counter.
A prefix map is a mapping m which associates a prefix of \({h}\) to each replica \(i\in [1..n_{h}]\). Intuitively, a prefix map defines for each replica i the set of operations which are “known” to i, i.e., happen-before the last operation of i in its prefix. Formally, a prefix map m is included in a happens-before relation \(\textsf {hb}\), denoted by \(m\subseteq \textsf {hb}\), if for each replica \(i\in [1..n_{h}]\), \(\textsf {hb}(o,o_i)\) for each operation \(o\in m(i)\setminus \{o_i\}\), where \(o_i\) is the last operation of i in m(i). We call \(o_i\) the last operation of i in m, and denote it by \(\mathsf{last}_i(m)\). A prefix map m is valid if it associates a valid prefix to each replica, and complete if it associates the whole history \({h}\) to each replica i.
Algorithm 3 lists our algorithm for checking consistency of replicated counter histories. It is defined as a recursive procedure \(\mathsf {checkCounter}\) that searches for a sequence of valid extensions of a given prefix map (initially, this prefix map is empty) until it becomes complete. The axiom RetvalCounter is enforced whenever extending the prefix map with a new read operation (when the last operation of a replica i is “advanced” to a read operation). The following theorem states the correctness of the algorithm.
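A simplified rendering of this search — our own Python sketch, not the paper's Algorithm 3. Replicas are lists of (method, variable[, value]) operations; a state records, per replica, the index of its next local operation and the set of operation identifiers it knows; and, for brevity, a prefix map counts as complete once every replica has advanced past its own operations.

```python
def check_counter(history):
    """Depth-first search over prefix-map extensions: either advance a
    replica to its next local operation (checking RetvalCounter on reads)
    or merge another replica's known prefix into it."""
    n = len(history)
    seen = set()                         # memoize visited states

    def value(known, x):                 # #inc(x) - #dec(x) among known ops
        ops = [history[r][k] for (r, k) in known]
        return sum(1 for o in ops if o == ("inc", x)) \
             - sum(1 for o in ops if o == ("dec", x))

    def search(state):
        if state in seen:
            return False
        seen.add(state)
        if all(state[r][0] == len(history[r]) for r in range(n)):
            return True                  # every replica consumed its ops
        for i in range(n):
            nxt, known = state[i]
            if nxt < len(history[i]):    # (a) advance replica i
                op = history[i][nxt]
                if op[0] != "read" or op[2] == value(known, op[1]):
                    s = list(state)
                    s[i] = (nxt + 1, known | {(i, nxt)})
                    if search(tuple(s)):
                        return True
            for j in range(n):           # (b) merge j's prefix into i's
                if j != i and not state[j][1] <= known:
                    s = list(state)
                    s[i] = (nxt, known | state[j][1])
                    if search(tuple(s)):
                        return True
        return False

    return search(tuple((0, frozenset()) for _ in range(n)))
```

For example, two replicas each performing inc(x) and then a read — one returning 2, the other 1 — are consistent: the first replica merges the second's increment before reading. A lone read(x) = 1 with no increments anywhere is rejected.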
Theorem 4
\(\mathsf {checkCounter}({h},\emptyset ,\emptyset )\) returns \( true \) iff the input history is consistent.
When the number of replicas is fixed, the number of prefix maps becomes polynomial in the size of the history. This follows from the fact that prefixes are uniquely defined by their \(\mathsf {ro}\)maximal operations, whose number is fixed.
Corollary 2
The admissibility problem for replicated counters is polynomialtime when the number of replicas is fixed.
6 Polynomial-Time Algorithms for Sets and Flags
While Theorem 1 shows that the admissibility problem is NP-hard for replicated sets and flags even if the number of replicas is fixed, we show that this problem becomes polynomial time when, additionally, the number of values added to the set, or the number of flags, is also fixed. Note that this doesn’t limit the number of operations in the input history, which can still be arbitrarily large. In the following, we focus on the Add-Wins Set, the other cases being very similar.
We propose an algorithm for checking consistency which is actually an extension of the one presented in Sect. 5 for replicated counters. The additional complexity in checking consistency for the Add-Wins Set comes from the validity of contains(x) return values, which requires identifying the maximal predecessors in the happens-before relation that add or remove x (which are not necessarily the maximal \(\textsf {hb}\)-predecessors altogether). In the case of counters, it was enough just to count happens-before predecessors. Therefore, we extend the algorithm for replicated counters such that along with the prefix map, we also keep track of the \(\textsf {hb}\)-maximal add(x) and remove(x) operations for each element x and each replica i. When extending a prefix map with a contains operation, these \(\textsf {hb}\)-maximal operations (which define a witness for the read-from relation) are enough to verify the RetvalSet axiom. Extending the prefix of a replica with an add or remove operation (issued on the same replica), or by merging the prefix of another replica, may require an update of these \(\textsf {hb}\)-maximal predecessors.
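The bookkeeping on merges can be sketched as follows. This is a hypothetical helper: `max_i` maps each element x to replica i's currently tracked hb-maximal add(x)/remove(x) operations, and `hb` is the happens-before relation as a set of pairs.

```python
def merge_maximals(max_i, max_j, hb):
    """Merge replica j's tracked hb-maximal add/remove operations into
    replica i's, keeping per element only the operations that are not
    dominated in happens-before by another candidate."""
    merged = {}
    for x in set(max_i) | set(max_j):
        cand = max_i.get(x, set()) | max_j.get(x, set())
        merged[x] = {o for o in cand if not any((o, p) in hb for p in cand)}
    return merged
```

If replica i tracks an add(x) and replica j tracks a remove(x) that happens-after it, the merge keeps only the remove; when the two are concurrent, both survive as maximal, and a subsequent contains(x) on the merged prefix returns true under the add-wins policy.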
When the number of replicas and elements are fixed, the number of read-from maps is polynomial in the size of the history — recall that the number of operations associated by a read-from map to a replica and set element is bounded by the number of replicas. Combined with the number of prefix maps being polynomial when the number of replicas is fixed, we obtain the following result.
Theorem 5
Checking whether a history is admitted by the Add-Wins Set, Remove-Wins Set, Enable-Wins Flag, or the Disable-Wins Flag is polynomial time provided that the number of replicas and elements/flags is fixed.
7 Related Work
Many have considered consistency models applicable to CRDTs, including causal consistency [26], sequential consistency [27], linearizability [24], session consistency [35], eventual consistency [36], and happens-before consistency [29]. Burckhardt et al. [8, 11] propose a unifying framework to formalize these models. Many have also studied the complexity of verifying datatype-agnostic notions of consistency, including serializability, sequential consistency and linearizability [1, 2, 4, 18, 20, 22, 30], as well as causal consistency [6]. Our definition of the replicated LWW register corresponds to the notion of causal convergence in [6]. That work studies the complexity of the admissibility problem for the replicated LWW register, showing that it is NP-complete in general and polynomial time when each value is written only once. Our NP-completeness result is stronger since it assumes a fixed number of replicas, and our algorithm for the case of unique values is more general, applying uniformly to MVR and RGA. While Bouajjani et al. [5, 14] consider the complexity for individual linearizable collection types, we are the first to establish (in)tractability of individual replicated data types. Others have developed effective consistency checking algorithms for sequential consistency [3, 9, 23, 31], serializability [12, 17, 18, 21], linearizability [10, 16, 28, 37], and even weaker notions like eventual consistency [7] and sequential happens-before consistency [13, 15]. In contrast, we are the first to establish precise polynomial-time algorithms for runtime verification of replicated data types.
8 Conclusion
By developing novel logical characterizations of replicated data types, reductions from propositional satisfiability checking, and tractable algorithms, we have established a frontier of tractability for checking consistency of replicated data types. As far as we are aware, our results are the first to characterize the asymptotic complexity of consistency checking for CRDTs.
Footnotes
 1. For simplicity, we assume that every history contains a set of write operations writing the initial values of variables, which precede every other operation in replica order.
 2. In a practical context, this can be enforced by tagging characters with replica identifiers and sequence numbers.
 3. This element is not returned by read operations.
 4. Assuming constant-time lookup/insert operations (e.g., using hash maps), this complexity is linear time.
References
 1. Alur, R., McMillan, K.L., Peled, D.A.: Model-checking of correctness conditions for concurrent objects. Inf. Comput. 160(1–2), 167–188 (2000). https://doi.org/10.1006/inco.1999.2847
 2. Bingham, J.D., Condon, A., Hu, A.J.: Toward a decidable notion of sequential consistency. In: Rosenberg, A.L., auf der Heide, F.M. (eds.) SPAA 2003: Proceedings of the Fifteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, California, USA (part of FCRC 2003), 7–9 June 2003, pp. 304–313. ACM (2003). https://doi.org/10.1145/777412.777467
 3. Bingham, J., Condon, A., Hu, A.J., Qadeer, S., Zhang, Z.: Automatic verification of sequential consistency for unbounded addresses and data values. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 427–439. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27813-9_33
 4. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Verifying concurrent programs against sequential specifications. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 290–309. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37036-6_17
 5. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: On reducing linearizability to state reachability. Inf. Comput. 261(Part), 383–400 (2018). https://doi.org/10.1016/j.ic.2018.02.014
 6. Bouajjani, A., Enea, C., Guerraoui, R., Hamza, J.: On verifying causal consistency. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, 18–20 January 2017, pp. 626–638. ACM (2017). http://dl.acm.org/citation.cfm?id=3009888
 7. Bouajjani, A., Enea, C., Hamza, J.: Verifying eventual consistency of optimistic replication systems. In: Jagannathan, S., Sewell, P. (eds.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2014, San Diego, CA, USA, 20–21 January 2014, pp. 285–296. ACM (2014). https://doi.org/10.1145/2535838.2535877
 8. Burckhardt, S.: Principles of eventual consistency. Found. Trends Program. Lang. 1(1–2), 1–150 (2014). https://doi.org/10.1561/2500000011
 9. Burckhardt, S., Alur, R., Martin, M.M.K.: CheckFence: checking consistency of concurrent data types on relaxed memory models. In: Ferrante, J., McKinley, K.S. (eds.) Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, San Diego, California, USA, 10–13 June 2007, pp. 12–21. ACM (2007). https://doi.org/10.1145/1250734.1250737
 10. Burckhardt, S., Dern, C., Musuvathi, M., Tan, R.: Line-up: a complete and automatic linearizability checker. In: Zorn, B.G., Aiken, A. (eds.) Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2010, Toronto, Ontario, Canada, 5–10 June 2010, pp. 330–340. ACM (2010). https://doi.org/10.1145/1806596.1806634
 11. Burckhardt, S., Gotsman, A., Yang, H., Zawirski, M.: Replicated data types: specification, verification, optimality. In: Jagannathan, S., Sewell, P. (eds.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2014, San Diego, CA, USA, 20–21 January 2014, pp. 271–284. ACM (2014). https://doi.org/10.1145/2535838.2535848
 12. Cohen, A., O'Leary, J.W., Pnueli, A., Tuttle, M.R., Zuck, L.D.: Verifying correctness of transactional memories. In: Proceedings of the 7th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2007, Austin, Texas, USA, 11–14 November 2007, pp. 37–44. IEEE Computer Society (2007). https://doi.org/10.1109/FAMCAD.2007.40
 13. Emmi, M., Enea, C.: Monitoring weak consistency. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018, Part I. LNCS, vol. 10981, pp. 487–506. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96145-3_26
 14. Emmi, M., Enea, C.: Sound, complete, and tractable linearizability monitoring for concurrent collections. PACMPL 2(POPL), 25:1–25:27 (2018). https://doi.org/10.1145/3158113
 15. Emmi, M., Enea, C.: Weak-consistency specification via visibility relaxation. PACMPL 3(POPL), 60:1–60:28 (2019). https://dl.acm.org/citation.cfm?id=3290373
 16. Emmi, M., Enea, C., Hamza, J.: Monitoring refinement via symbolic reasoning. In: Grove, D., Blackburn, S. (eds.) Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, 15–17 June 2015, pp. 260–269. ACM (2015). https://doi.org/10.1145/2737924.2737983
 17. Emmi, M., Majumdar, R., Manevich, R.: Parameterized verification of transactional memories. In: Zorn, B.G., Aiken, A. (eds.) Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2010, Toronto, Ontario, Canada, 5–10 June 2010, pp. 134–145. ACM (2010). https://doi.org/10.1145/1806596.1806613
 18. Farzan, A., Madhusudan, P.: Monitoring atomicity in concurrent programs. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 52–65. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-70545-1_8
 19. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)
 20. Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput. 26(4), 1208–1244 (1997). https://doi.org/10.1137/S0097539794279614
 21. Guerraoui, R., Henzinger, T.A., Jobstmann, B., Singh, V.: Model checking transactional memories. In: Gupta, R., Amarasinghe, S.P. (eds.) Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, Tucson, AZ, USA, 7–13 June 2008, pp. 372–382. ACM (2008). https://doi.org/10.1145/1375581.1375626
 22. Hamza, J.: On the complexity of linearizability. In: Bouajjani, A., Fauconnier, H. (eds.) NETYS 2015. LNCS, vol. 9466, pp. 308–321. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26850-7_21
 23. Henzinger, T.A., Qadeer, S., Rajamani, S.K.: Verifying sequential consistency on shared-memory multiprocessor systems. In: Halbwachs, N., Peled, D. (eds.) CAV 1999. LNCS, vol. 1633, pp. 301–315. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48683-6_27
 24. Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990). https://doi.org/10.1145/78969.78972
 25. Kingsbury, K.: Jepsen: distributed systems safety research (2016). https://jepsen.io
 26. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978). https://doi.org/10.1145/359545.359563
 27. Lamport, L.: How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. 28(9), 690–691 (1979). https://doi.org/10.1109/TC.1979.1675439
 28. Lowe, G.: Testing for linearizability. Concurr. Comput. Pract. Exp. 29(4) (2017). https://doi.org/10.1002/cpe.3928
 29. Manson, J., Pugh, W., Adve, S.V.: The Java memory model. In: Palsberg, J., Abadi, M. (eds.) Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2005, Long Beach, California, USA, 12–14 January 2005, pp. 378–391. ACM (2005). https://doi.org/10.1145/1040305.1040336
 30. Papadimitriou, C.H.: The serializability of concurrent database updates. J. ACM 26(4), 631–653 (1979). https://doi.org/10.1145/322154.322158
 31. Qadeer, S.: Verifying sequential consistency on shared-memory multiprocessors by model checking. IEEE Trans. Parallel Distrib. Syst. 14(8), 730–741 (2003). https://doi.org/10.1109/TPDS.2003.1225053
 32. Roh, H., Jeon, M., Kim, J., Lee, J.: Replicated abstract data types: building blocks for collaborative applications. J. Parallel Distrib. Comput. 71(3), 354–368 (2011). https://doi.org/10.1016/j.jpdc.2010.12.006
 33. Shapiro, M., Preguiça, N., Baquero, C., Zawirski, M.: Conflict-free replicated data types. In: Défago, X., Petit, F., Villain, V. (eds.) SSS 2011. LNCS, vol. 6976, pp. 386–400. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24550-3_29
 34. Shapiro, M., Preguiça, N.M., Baquero, C., Zawirski, M.: Convergent and commutative replicated data types. Bull. EATCS 104, 67–88 (2011). http://eatcs.org/beatcs/index.php/beatcs/article/view/120
 35. Terry, D.B., Demers, A.J., Petersen, K., Spreitzer, M., Theimer, M., Welch, B.B.: Session guarantees for weakly consistent replicated data. In: Proceedings of the Third International Conference on Parallel and Distributed Information Systems (PDIS 1994), Austin, Texas, USA, 28–30 September 1994, pp. 140–149. IEEE Computer Society (1994). https://doi.org/10.1109/PDIS.1994.331722
 36. Terry, D.B., Theimer, M., Petersen, K., Demers, A.J., Spreitzer, M., Hauser, C.: Managing update conflicts in Bayou, a weakly connected replicated storage system. In: Jones, M.B. (ed.) Proceedings of the Fifteenth ACM Symposium on Operating System Principles, SOSP 1995, Copper Mountain Resort, Colorado, USA, 3–6 December 1995, pp. 172–183. ACM (1995). https://doi.org/10.1145/224056.224070
 37. Wing, J.M., Gong, C.: Testing and verifying concurrent objects. J. Parallel Distrib. Comput. 17(1–2), 164–182 (1993). https://doi.org/10.1006/jpdc.1993.1015
 38. Wolper, P.: Expressing interesting properties of programs in propositional temporal logic. In: Conference Record of the Thirteenth Annual ACM Symposium on Principles of Programming Languages, St. Petersburg Beach, Florida, USA, January 1986, pp. 184–193. ACM Press (1986). https://doi.org/10.1145/512644.512661
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.