Checking Robustness Against Snapshot Isolation
Abstract
Transactional access to databases is an important abstraction allowing programmers to consider blocks of actions (transactions) as executing in isolation. The strongest consistency model is serializability, which ensures the atomicity abstraction of transactions executing over a sequentially consistent memory. Since ensuring serializability carries a significant penalty on availability, modern databases provide weaker consistency models, one of the most prominent being snapshot isolation. In general, the correctness of a program relying on serializable transactions may be broken when using weaker models. However, certain programs may also be insensitive to consistency relaxations, i.e., all their properties holding under serializability are preserved even when they are executed over a weakly consistent database and without additional synchronization.
In this paper, we address the issue of verifying if a given program is robust against snapshot isolation, i.e., all its behaviors are serializable even if it is executed over a database ensuring snapshot isolation. We show that this verification problem is polynomial-time reducible to a state reachability problem in transactional programs over a sequentially consistent shared memory. This reduction opens the door to the reuse of the classic verification technology for reasoning about weakly-consistent programs. In particular, we show that it can be used to derive a proof technique based on Lipton's reduction theory that allows proving programs robust.
1 Introduction
Transactions simplify concurrent programming by enabling computations on shared data that are isolated from other concurrent computations and resilient to failures. Modern databases provide transactions in various forms corresponding to different tradeoffs between consistency and availability. The strongest consistency level is achieved with serializable transactions [21] whose outcome in concurrent executions is the same as if the transactions were executed atomically in some order. Since serializability carries a significant penalty on availability, modern databases often provide weaker consistency models, one of the most prominent being snapshot isolation (SI) [5]. Then, an important issue is to ensure that the level of consistency needed by a given program coincides with the one that is guaranteed by its infrastructure, i.e., the database it uses. One way to tackle this issue is to investigate the problem of checking robustness of programs against consistency relaxations: Given a program P and two consistency models S and W such that S is stronger than W, we say that P is robust for S against W if for every two implementations \(I_S\) and \(I_W\) of S and W respectively, the set of computations of P when running with \(I_S\) is the same as its set of computations when running with \(I_W\). This means that P is not sensitive to the consistency relaxation from S to W. Therefore, it is possible to reason about the behaviors of P assuming that it is running over S, and no additional synchronization is required when P runs over the weak model W: P maintains all the properties it satisfies under S.
In this paper, we address the problem of verifying robustness of transactional programs for serializability, against snapshot isolation. Under snapshot isolation, any transaction t reads values from a snapshot of the database taken at its start and t can commit only if no other committed transaction has written to a location that t wrote to, since t started. Robustness is a form of program equivalence between two versions of the same program, obtained using two semantics, one more permissive than the other. It ensures that this permissiveness has no effect on the program under consideration. The difficulty in checking robustness is to capture the extra behaviors due to the relaxed model w.r.t. the strong model. This requires a priori reasoning about complex order constraints between operations in arbitrarily long computations, which may require maintaining unbounded ordered structures and can make robustness checking hard or even undecidable.
Our first contribution is to show that verifying robustness of transactional programs against snapshot isolation can be reduced in polynomial time to the reachability problem in concurrent programs under sequential consistency (SC). This allows (1) to avoid explicit handling of the snapshots from where transactions read along computations (since this may imply memorizing unbounded information), and (2) to leverage available tools for verifying invariants/reachability problems on concurrent programs. This also implies that the robustness problem is decidable for finite-state programs, PSPACE-complete when the number of sites is fixed, and EXPSPACE-complete otherwise. This is the first result on the decidability and complexity of the problem of verifying robustness in the context of transactional programs. The problem of verifying robustness has been considered in the literature for several models, including eventual and causal consistency [6, 10, 11, 12, 20]. These works provide (over- or under-)approximate analyses for checking robustness, but none of them provides precise (sound and complete) algorithmic verification methods for solving this problem.
Based on this reduction, our second contribution is a proof methodology for establishing robustness which builds on Lipton’s reduction theory [18]. We use the theory of movers to establish whether the relaxations allowed by SI are harmless, i.e., they don’t introduce new behaviors compared to serializability.
We applied the proposed verification techniques on 10 challenging applications extracted from previous work [2, 6, 11, 14, 16, 19, 24]. These techniques were enough for proving or disproving the robustness of these applications.
2 Overview
In this section, we give an overview of our approach for checking robustness against snapshot isolation. While serializability enforces that transactions are atomic and that conflicting transactions, i.e., those which read or write a common location, cannot commit concurrently, SI [5] allows conflicting transactions to commit in parallel as long as they don't have a write-write conflict, i.e., write to a common location. Moreover, under SI, each transaction reads from a snapshot of the database taken at its start. These relaxations permit the "anomaly" known as Write Skew (WS), shown in Fig. 1a, where an anomaly is a program execution which is allowed by SI but not by serializability. The execution of Write Skew under SI allows the reads of \(\mathtt{x}\) and \(\mathtt{y}\) to return 0 although this cannot happen under serializability. These values are possible since each transaction is executed locally (starting from the initial snapshot) without observing the writes of the other transaction.
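To make the anomaly concrete, the following Python sketch simulates the two Write Skew transactions under a snapshot-based semantics. The encoding (dicts for the central memory and per-transaction snapshots) is ours, not the paper's formal model:

```python
# A minimal sketch of Write Skew under SI: each transaction reads from a
# snapshot taken at its start, so both reads of x and y can return the
# initial value 0, which is impossible under serializability.

def run_write_skew_under_si():
    db = {"x": 0, "y": 0}          # central memory

    # t1: r1 := y; x := 1        t2: r2 := x; y := 1
    snap1 = dict(db)               # snapshot for t1 (taken at its start)
    snap2 = dict(db)               # snapshot for t2 (starts before t1 commits)

    r1 = snap1["y"]                # t1 reads y from its snapshot
    writes1 = {"x": 1}             # t1's local, not-yet-visible write

    r2 = snap2["x"]                # t2 reads x from its snapshot
    writes2 = {"y": 1}             # t2's local write

    # Commit: SI only aborts on a write-write conflict; the write sets
    # {x} and {y} are disjoint, so both transactions commit.
    assert not (writes1.keys() & writes2.keys())
    db.update(writes1)
    db.update(writes2)
    return r1, r2, db

print(run_write_skew_under_si())   # (0, 0, {'x': 1, 'y': 1})
```

Both reads return 0 even though both writes take effect, which is exactly the happens-before cycle discussed next.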
Execution Trace. Our notion of program robustness is based on an abstract representation of executions called trace. Informally, an execution trace is a set of events, i.e., accesses to shared variables and transaction begin/commit events, along with several standard dependency relations between events recording the dataflow. The transitive closure of the union of all these dependency relations is called happens-before. An execution is an anomaly if the happens-before of its trace is cyclic. Figure 1b shows the happens-before of the Write Skew anomaly; notice that this happens-before order is cyclic.
Above, \(\textsf {begin}({p}_1,{t}_1)\) stands for starting a new transaction \({t}_1\) by process \({p}_1\), \(\textsf {ld}\) represents read (load) actions, while \(\textsf {isu}\) denotes write actions that are visible only to the current transaction (not yet committed). The writes performed during \({t}_1\) become visible to all processes once the commit event \(\textsf {com}({p}_1,{t}_1)\) takes place.
misses the conflict (happens-before dependency) between the issue event of \({t}_1\) and \({t}_2\). Therefore, the events of \({t}_2\) can be reordered to the left of \({t}_1\), obtaining an equivalent execution where \(\textsf {st}({p}_1,{t}_1,x,r_{x})\) occurs immediately after \(\textsf {st}({p}_1,{t}_1,r_{x},1)\). In this case, \({t}_1\) is no longer delayed and this execution is possible under serializability (without the instrumentation).
Proving Robustness Using Commutativity Dependency Graphs. Based on the reduction above, we also devise an approximate method for checking robustness based on the concept of mover in Lipton's reduction theory [18]. An event is a left (resp., right) mover if it commutes to the left (resp., right) of another event (from a different process) while preserving the computation. We use the notion of mover to characterize happens-before dependencies between transactions. Roughly, there exists a happens-before dependency between two transactions in some execution if one doesn't commute to the left/right of the other one. We define a commutativity dependency graph which summarizes the happens-before dependencies in all executions of a given program between transactions \({t}\) as they appear in the program, transactions \({t}\setminus \{w\}\) where the writes of \({t}\) are deactivated (i.e., their effects are not visible outside the transaction), and
transactions \({t}\setminus \{r\}\) where the reads of \({t}\) obtain nondeterministic values. The transactions \({t}\setminus \{w\}\) are used to simulate issue events of delayed transactions (where writes are not yet visible) while the transactions \({t}\setminus \{r\}\) are used to simulate commit events of delayed transactions (which only write to the shared memory). Two transactions a and b are linked by an edge iff a cannot move to the right of b (or b cannot move to the left of a), or if they are related by the program order (i.e., issued in some order in the same process). Then a program is robust if for every transaction \({t}\), this graph doesn't contain a path from \({t}\setminus \{w\}\) to \({t}\setminus \{r\}\) formed of transactions that don't write to a variable that \({t}\) writes to (the latter condition is enforced by SI since two concurrent transactions cannot commit at the same time when they write to a common variable). For example, Fig. 2 shows the commutativity dependency graph of the modified WS program where the read of y is removed from \({t}_1\). The fact that it doesn't contain any such path implies that it is robust.
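As an illustration, the robustness condition on the commutativity dependency graph amounts to a simple path search. The node naming (`"t1\\w"` for \({t}_1\setminus \{w\}\)), the edge encoding, and the `writes` map below are hypothetical conveniences, not notation from the paper:

```python
from collections import deque

def has_violating_path(edges, writes, t):
    """Search for a path from t\\w to t\\r whose intermediate transactions
    do not write to a variable written by t (the SI write-write restriction)."""
    src, dst = t + "\\w", t + "\\r"
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    seen, queue = {src}, deque([src])
    while queue:
        n = queue.popleft()
        if n == dst:
            return True          # violating path found: not robust
        for m in adj.get(n, []):
            base = m.split("\\")[0]
            # intermediate transactions must not write to t's write set
            if m != dst and base != t and writes.get(base, set()) & writes[t]:
                continue
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return False
```

For instance, with edges `[("t1\\w", "t2"), ("t2", "t1\\r")]` and disjoint write sets, the search finds a violating path; if `t2` wrote to a variable of `t1`, the path would be filtered out and the program could still be robust.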
3 Programs
A program is a parallel composition of processes distinguished using a set of identifiers \(\mathbb {P}\). Each process is a sequence of transactions and each transaction is a sequence of labeled instructions. Each transaction starts with a \(\mathtt {begin}\) instruction and finishes with a \(\mathtt {commit}\) instruction. Every other instruction is either an assignment to a process-local register from a set \(\mathbb {R}\) or to a shared variable from a set \(\mathbb {V}\), or an \(\mathtt {assume}\) statement. The read/write assignments use values from a data domain \(\mathbb {D}\). An assignment to a register \(\langle reg\rangle := \langle var\rangle \) is called a read of the shared variable \(\langle var\rangle \) and an assignment to a shared variable \(\langle var\rangle := \langle reg\text{-}expr\rangle \) is called a write to \(\langle var\rangle \) (\(\langle reg\text{-}expr\rangle \) is an expression over registers whose syntax we leave unspecified since it is irrelevant for our development). The statement \(\mathtt {assume}\) \(\langle bexpr\rangle \) blocks the process if the Boolean expression \(\langle bexpr\rangle \) over registers is false; it is used to model conditionals as usual. We use \(\mathtt {goto}\) statements to model arbitrary control-flow: the same label can be assigned to multiple instructions and multiple \(\mathtt {goto}\) statements can direct the control to the same label, which allows mimicking imperative constructs like loops and conditionals. To simplify the technical exposition, our syntax includes simple read/write instructions. However, our results apply as well to instructions that include SQL (select/update) queries. The experiments reported in Sect. 7 consider programs with SQL-based transactions.
The semantics of a program under SI is defined as follows. The shared variables are stored in a central memory and each process keeps a replicated copy of the central memory. A process starts a transaction by discarding its local copy and fetching the values of the shared variables from the central memory. When a process commits a transaction, it merges its local copy of the shared variables with the one stored in the central memory in order to make its updates visible to all processes. During the execution of a transaction, the process stores writes to shared variables only in its local copy and reads only from its local copy. When a process merges its local copy with the centralized one, it is required that no concurrent update to a shared variable written by the current transaction has occurred since the last fetch from the central memory. Otherwise, the transaction is aborted and its effects discarded.
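The described semantics can be sketched under a simple assumption: the central memory keeps a per-variable update counter, and a commit aborts when a variable written by the transaction was updated since the transaction's snapshot. The class and method names are ours, not the paper's formalization:

```python
# A sketch of the SI semantics with a versioned central memory: begin takes
# a snapshot, writes stay local, and commit aborts on a write-write conflict
# with a concurrently committed transaction.

class SIMemory:
    def __init__(self, init):
        self.central = dict(init)
        self.version = {x: 0 for x in init}   # per-variable update counter

    def begin(self):
        # local copy of the central memory plus the versions it saw
        return {"snap": dict(self.central),
                "seen": dict(self.version), "writes": {}}

    def read(self, txn, x):
        return txn["writes"].get(x, txn["snap"][x])   # own writes first

    def write(self, txn, x, v):
        txn["writes"][x] = v                  # visible only locally

    def commit(self, txn):
        for x in txn["writes"]:
            if self.version[x] != txn["seen"][x]:
                return False                  # concurrent update: abort
        for x, v in txn["writes"].items():    # merge into central memory
            self.central[x] = v
            self.version[x] += 1
        return True
```

Two concurrent transactions writing the same variable cannot both commit (the second aborts), while the Write Skew pair, with disjoint write sets, commits in full.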
An execution of program \(\mathcal {P}\), under snapshot isolation, is a sequence of events \( ev _1\cdot ev _2\cdot \ldots \) corresponding to a run of \([\mathcal {P}]_{\mathtt{CM}{}}\). The set of executions of \(\mathcal {P}\) under SI is denoted by \(\mathbb {E}\textsf {x}_{\mathtt{SI}{}}(\mathcal {P})\).
4 Robustness Against SI
A trace abstracts the order in which shared variables are accessed inside a transaction and the order between transactions accessing different variables. Formally, the trace of an execution \(\rho \) is obtained by (1) replacing each subsequence of transitions in \(\rho \) corresponding to the same transaction, but excluding the \(\textsf {com}\) transition, with a single "macro-event" \(\textsf {isu}({p},{t})\), and (2) adding several standard relations between these macro-events \(\textsf {isu}({p},{t})\) and commit events \(\textsf {com}({p},{t})\) to record the dataflow in \(\rho \), e.g. which transaction wrote the value read by another transaction. The sequence of \(\textsf {isu}({p},{t})\) and \(\textsf {com}({p},{t})\) events obtained in the first step is called a summary of \(\rho \). We say that a transaction \({t}\) in \(\rho \) performs an external read of a variable \(x\) if \(\rho \) contains an event \(\textsf {ld}({p},{t},x,v)\) which is not preceded by a write on \(x\) of \({t}\), i.e., an event \(\textsf {isu}({p},{t},x,v)\). Also, we say that a transaction \({t}\) writes a variable \(x\) if \(\rho \) contains an event \(\textsf {isu}({p},{t},x,v)\), for some \(v\).
The trace \(\mathsf {tr}(\rho ) = (\tau , {\mathsf {PO}}, {\mathsf {WR}}, {\mathsf {WW}}, {\mathsf {RW}}, {\mathsf {STO}})\) of an execution \(\rho \) consists of the summary \(\tau \) of \(\rho \) along with the following relations. The program order \({\mathsf {PO}}\) relates any two issue events \(\textsf {isu}({p},{t})\) and \(\textsf {isu}({p},{t}')\) that occur in this order in \(\tau \). The write-read relation \({\mathsf {WR}}\) (also called read-from) relates any two events \(\textsf {com}({p},{t})\) and \(\textsf {isu}({p}',{t}')\) that occur in this order in \(\tau \) such that \({t}'\) performs an external read of \(x\), and \(\textsf {com}({p},{t})\) is the last event in \(\tau \) before \(\textsf {isu}({p}',{t}')\) that writes to \(x\) (to mark the variable \(x\), we may use \({\mathsf {WR}}(x)\)). The write-write order \({\mathsf {WW}}\) (also called store order) relates any two store events \(\textsf {com}({p},{t})\) and \(\textsf {com}({p}',{t}')\) that occur in this order in \(\tau \) and write to the same variable \(x\) (to mark the variable \(x\), we may use \({\mathsf {WW}}(x)\)). The read-write relation \({\mathsf {RW}}\) (also called conflict) relates any two events \(\textsf {isu}({p},{t})\) and \(\textsf {com}({p}',{t}')\) that occur in this order in \(\tau \) such that \({t}\) reads a value that is overwritten by \({t}'\). Finally, the same-transaction relation \({\mathsf {STO}}\) relates the issue event with the commit event of the same transaction. The read-write relation \({\mathsf {RW}}\) is formally defined as \({\mathsf {RW}}(x)={\mathsf {WR}}^{-1}(x);{\mathsf {WW}}(x)\) (we use ; to denote the standard composition of relations) and \({\mathsf {RW}}=\bigcup \limits _{x\in \mathbb {V}} {\mathsf {RW}}(x)\).
If a transaction \({t}\) reads the initial value of \(x\) then \({\mathsf {RW}}(x)\) relates \(\textsf {isu}({p},{t})\) to \(\textsf {com}({p}',{t}')\) of any other transaction \({t}'\) which writes to \(x\) (i.e., \((\textsf {isu}({p},{t}),\textsf {com}({p}',{t}'))\in {\mathsf {RW}}(x)\)) (note that in the above relations, \({p}\) and \({p}'\) might designate the same process).
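For illustration, the relation-algebraic definition \({\mathsf {RW}}(x)={\mathsf {WR}}^{-1}(x);{\mathsf {WW}}(x)\) can be computed directly, representing each relation as a set of event pairs (a hypothetical encoding; event names are ours):

```python
# Computing the read-write (conflict) relation from its definition
# RW(x) = WR^{-1}(x) ; WW(x), with relations as sets of event pairs.

def compose(r1, r2):
    # standard composition of relations: (a, c) iff (a, b) in r1, (b, c) in r2
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def inverse(r):
    return {(b, a) for (a, b) in r}

def rw(wr, ww, variables):
    # wr[x], ww[x]: the WR(x) and WW(x) relations per variable x
    out = set()
    for x in variables:
        out |= compose(inverse(wr.get(x, set())), ww.get(x, set()))
    return out
```

For example, if \({t}_2\) reads \(x\) from \({t}_1\) and \({t}_3\) overwrites \({t}_1\)'s write to \(x\), then the issue event of \({t}_2\) is RW-related to the commit event of \({t}_3\).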
Since we reason about only one trace at a time, to simplify the writing, we may say that a trace is simply a sequence \(\tau \) as above, keeping the relations \({\mathsf {PO}}\), \({\mathsf {WR}}\), \({\mathsf {WW}}\), \({\mathsf {RW}}\), and \({\mathsf {STO}}\) implicit. The set of traces of executions of a program \(\mathcal {P}\) under SI is denoted by \(\mathbb {T}\textsf {r}(\mathcal {P})_{\mathtt{SI}{}}\).
Serializability Semantics. The semantics of a program under serializability can be defined using a transition system where the configurations keep a single shared-variable valuation (accessed by all processes) with the standard interpretation of read and write statements. Each transaction executes in isolation. Alternatively, the serializability semantics can be defined as a restriction of \([\mathcal {P}]_{\mathtt{SI}{}}\) to the set of executions where each transaction is immediately delivered when it starts, i.e., the start and commit time of a transaction coincide: \({t}.{st}={t}.{ct}\). Such executions are called serializable and the set of serializable executions of a program \(\mathcal {P}\) is denoted by \(\mathbb {E}\textsf {x}_{\mathtt{SER}{}}(\mathcal {P})\). The latter definition is easier to reason about when relating executions under snapshot isolation and serializability, respectively.
Serializable Trace. A trace \( tr \) is called serializable if it is the trace of a serializable execution. Let \(\mathbb {T}\textsf {r}_{\mathtt{SER}{}}(\mathcal {P})\) denote the set of serializable traces. Given a serializable trace \( tr =(\tau , {\mathsf {PO}}, {\mathsf {WR}}, {\mathsf {WW}},{\mathsf {RW}}, {\mathsf {STO}})\) we have that every event \(\textsf {isu}({p},{t})\) in \(\tau \) is immediately followed by the corresponding \(\textsf {com}({p},{t})\) event.
Happens-Before Order. Since multiple executions may have the same trace, it is possible that an execution \(\rho \) produced by snapshot isolation has a serializable trace \(\mathsf {tr}(\rho )\) even though \(\textsf {isu}({p},{t})\) events may not be immediately followed by \(\textsf {com}({p},{t})\) actions. However, \(\rho \) would be equivalent, up to reordering of "independent" (or commutative) transitions, to a serializable execution. To check whether the trace of an execution is serializable, we introduce the happens-before relation on the events of a given trace as the transitive closure of the union of all the relations in the trace, i.e., \({\mathsf {HB}}= ({\mathsf {PO}}\cup {\mathsf {WW}}\cup {\mathsf {WR}}\cup {\mathsf {RW}}\cup {\mathsf {STO}})^{+}\).
Finally, the happens-before relation between events is extended to transactions as follows: a transaction \({t}_1\) happens-before another transaction \({t}_2\ne {t}_1\) if the trace \( tr \) contains an event of transaction \({t}_1\) which happens-before an event of \({t}_2\). The happens-before relation between transactions is denoted by \({\mathsf {HB}}_t\) and called transactional happens-before. The following characterizes serializable traces.
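As a concrete illustration of this characterization, the following sketch computes \({\mathsf {HB}}\) as a transitive closure and checks acyclicity of the transactional happens-before. The trace encoding (event names, a `txn_of` map) is ours:

```python
# A trace is serializable iff the transactional happens-before HB_t,
# obtained by lifting HB = (PO u WW u WR u RW u STO)^+ to transactions,
# is acyclic.

def transitive_closure(edges):
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (b2, c) in list(closure):
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure

def serializable(po, wr, ww, rw, sto, txn_of):
    hb = transitive_closure(po | wr | ww | rw | sto)
    # lift happens-before from events to (distinct) transactions
    hb_t = {(txn_of[a], txn_of[b]) for (a, b) in hb if txn_of[a] != txn_of[b]}
    hb_t_closed = transitive_closure(hb_t)
    return all((t, t) not in hb_t_closed for t in txn_of.values())
```

On the Write Skew trace, the two mutual read-write dependencies make \({\mathsf {HB}}_t\) cyclic, so the check returns false; dropping one of them restores acyclicity.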
A program is called robust if it produces the same set of traces as the serializability semantics.
Definition 1
A program \(\mathcal {P}\) is called robust against SI iff \(\mathbb {T}\textsf {r}_{\mathtt{SI}{}}(\mathcal {P}) = \mathbb {T}\textsf {r}_{\mathtt{SER}{}}(\mathcal {P})\).
Since \(\mathbb {T}\textsf {r}_{\mathtt{SER}{}}(\mathcal {P}) \subseteq \mathbb {T}\textsf {r}_{\mathtt{SI}{}}(\mathcal {P})\), the problem of checking robustness of a program \(\mathcal {P}\) is reduced to checking whether there exists a trace \( tr \in \mathbb {T}\textsf {r}_{\mathtt{SI}{}}(\mathcal {P}) \setminus \mathbb {T}\textsf {r}_{\mathtt{SER}{}}(\mathcal {P}).\)
5 Reducing Robustness Against SI to SC Reachability
A trace which is not serializable must contain at least one issue event and the corresponding commit event that don't occur one after the other, even after reordering of "independent" events. Thus, there must exist an event occurring between the two which is related to both via the happens-before relation, forbidding the issue and commit from being adjacent. Otherwise, we could build another trace with the same happens-before where events are reordered such that the issue is immediately followed by the corresponding commit; the latter is a serializable trace, which contradicts the initial assumption. We define a program instrumentation which mimics the delay of transactions by doing the writes on auxiliary variables which are not visible to other transactions. After the delay of a transaction, we track happens-before dependencies until we execute a transaction that does a "read" on one of the variables that the delayed transaction writes to (this would expose a read-write dependency to the commit event of the delayed transaction). While tracking happens-before dependencies we cannot execute a transaction that writes to a variable that the delayed transaction writes to, since SI forbids write-write conflicts between concurrent transactions.
Concretely, given a program \(\mathcal {P}\), we define an instrumentation of \(\mathcal {P}\) such that \(\mathcal {P}\) is not robust against SI iff the instrumentation reaches an error state under serializability. The instrumentation uses auxiliary variables in order to simulate a single delayed transaction, which we prove is enough for deciding robustness. Let \(\textsf {isu}({p},{t})\) be the issue event of the only delayed transaction. The process \({p}\) that delayed \({t}\) is called the Attacker. When the attacker finishes executing the delayed transaction it stops. Other processes that execute transactions afterwards are called Happens-Before Helpers.
The instrumentation uses two copies of the set of shared variables in the original program to simulate the delayed transaction. We use primed variables \(x'\) to denote the second copy. Thus, when a process becomes the attacker, it will only write to the second copy that is not visible to other processes including the happensbefore helpers. The writes made by the other processes including the happensbefore helpers are made visible to all processes.
When the attacker delays the transaction \({t}\), it keeps track of the variables it accesses: it stores the name of one of the variables it writes to, x, and it tracks every variable y that it reads from and every variable z that it writes to. When the attacker finishes executing \({t}\), and some other process wants to execute some other transaction, the underlying transaction must contain a write to a variable y that the attacker reads from. Also, the underlying transaction must not write to a variable that \({t}\) writes to. We say that this process has joined the happens-before helpers through the underlying transaction. While executing this transaction, we keep track of each variable that was accessed and the type of operation, whether it is a read or a write. Afterward, in order for some other transaction to "join" the happens-before path, it must not write to a variable that \({t}\) writes to (so it does not violate the fact that SI forbids write-write conflicts), and it has to satisfy one of the following conditions in order to ensure the continuity of the happens-before dependencies: (1) the transaction is issued by a process that already has another transaction in the happens-before dependency (program order dependency), (2) the transaction reads from a shared variable that was updated by a previous transaction in the happens-before dependency (write-read dependency), (3) the transaction writes to a shared variable that was read by a previous transaction in the happens-before dependency (read-write dependency), or (4) the transaction writes to a shared variable that was updated by a previous transaction in the happens-before dependency (write-write dependency). We introduce a flag for each shared variable to mark the fact that the variable was read or written by a previous transaction.
Processes continue executing transactions as part of the chain of happens-before dependencies, until a transaction does a read on the variable x that \({t}\) wrote to. In this case, we reach an error state which signals that we have found a cycle in the transactional happens-before relation.
The instrumentation uses three varieties of flags: a) global flags (i.e., \(\mathsf {HB}_{}\), \(a_{\textsf {tr}_{\textsf {A}}}\), \(a_{\textsf {st}_{\textsf {A}}}\)), b) flags local to a process (i.e., p.a and p.hbh), and c) flags per shared variable (i.e., \(x.event\), \(x.event'\), and \(x.eventI\)). We will explain the meaning of these flags along with the instrumentation. At the start of the execution, all flags are initialized to null (\(\perp \)).
5.1 Instrumentation of the Attacker
Figure 3 lists the instrumentation of the write and read instructions of the attacker. Each process passes through an initial phase where it executes transactions that are visible immediately to all the other processes (i.e., they are not delayed), and then nondeterministically it can choose to delay a transaction, at which point it sets the flag \(a_{\textsf {tr}_{\textsf {A}}}\) to \(\mathsf {true}\). During the delayed transaction it chooses nondeterministically a write instruction to a variable x and stores the name of this variable in the flag \(a_{\textsf {st}_{\textsf {A}}}\) (line (5)). The values written during the delayed transaction are stored in the primed variables and are visible only to the current transaction, in case the transaction reads its own writes. For example, given a variable z, all writes to z from the original program are transformed into writes to the primed version \(z'\) (line (3)). Each time the attacker writes to z, it sets the flag \(z.event'=1\). This flag is used later by transactions from happens-before helpers to avoid writing to variables that the delayed transaction writes to.
A read of a variable y in the delayed transaction takes its value from the primed version \(y'\). For every read in the delayed transaction, we set the flag y.event to \(\textsf {ld}\) (line (1)), to be used later in order for a process to join the happens-before helpers. Afterward, the attacker starts the happens-before path, and it sets the variable \(\mathsf {HB}_{}\) to \(\mathsf {true}\) (line (2)) to mark the start of the happens-before path. When the flag \(\mathsf {HB}_{}\) is set to \(\mathsf {true}\), the attacker stops executing new transactions.
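The attacker's instrumented accesses can be sketched as follows, assuming a dict-based state where `x'` plays the role of the primed copy and `flags` holds \(a_{\textsf {tr}_{\textsf {A}}}\), \(a_{\textsf {st}_{\textsf {A}}}\), \(\mathsf {HB}\), and the per-variable flags. This is our simplified rendering of the shape of Fig. 3 (e.g., \(\mathsf {HB}\) is set eagerly on every delayed read), not the paper's exact code:

```python
import random

def attacker_write(state, flags, x, v):
    if not flags["a_trA"]:
        state[x] = v                       # before the delay: a normal write
        return
    state[x + "'"] = v                     # delayed write goes to the copy
    flags[x + ".event'"] = 1               # helpers must not write to x
    if flags["a_stA"] is None and random.random() < 0.5:
        flags["a_stA"] = x                 # nondeterministically pick x

def attacker_read(state, flags, y):
    if not flags["a_trA"]:
        return state[y]
    flags[y + ".event"] = "ld"             # lets a helper join later
    flags["HB"] = True                     # start the happens-before path
    return state.get(y + "'", state[y])    # reads see the txn's own writes
```

The delayed writes never reach the unprimed variables, so the other processes keep observing the pre-transaction values, which is exactly the delay being simulated.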
5.2 Instrumentation of the HappensBefore Helpers
The remaining processes, which are not the attacker, can become happens-before helpers. Figure 4 lists the instrumentation of the write and read instructions of a happens-before helper. In a first phase, each process executes the original code until the flag \(a_{\textsf {tr}_{\textsf {A}}}\) is set to \(\mathsf {true}\) by the attacker. This flag signals the "creation" of the secondary copy of the shared variables, which can be observed only by the attacker. At this point, the flag \(\mathsf {HB}_{}\) is set to \(\mathsf {true}\), and the happens-before helper process chooses nondeterministically a first transaction through which it wants to join the set of happens-before helpers, i.e., continue the happens-before dependency created by the existing happens-before helpers. When a process chooses a transaction, it makes a pledge (while executing the \(\mathtt {begin}\) instruction) that during this transaction it will either read from a variable that was written to by another happens-before helper, write to a variable that was accessed (read or written) by another happens-before helper, or write to a variable that was read from in the delayed transaction. When the pledge is met, the process sets the flag p.hbh to \(\textsf {true}\) (lines (7) and (11)). The execution is blocked if a process does not keep its pledge (i.e., the flag p.hbh is null) at the end of the transaction. Note that the first process to join the happens-before helpers has to execute a transaction \({t}\) which writes to a variable that was read from in the delayed transaction, since this is the only way to build a happens-before between \({t}\) and the delayed transaction (\({\mathsf {PO}}\) is not possible since \({t}\) is not from the attacker, \({\mathsf {WR}}\) is not possible since \({t}\) does not see the writes of the delayed transaction, and \({\mathsf {WW}}\) is not possible since \({t}\) cannot write to a variable that the delayed transaction writes to).
We use a flag x.event for each variable x to record the type (read \(\textsf {ld}\) or write \(\textsf {st}\)) of the last access made by a happens-before helper (lines (8) and (10)). During the execution of a transaction that is part of the happens-before dependency, we must ensure that the transaction does not write to a variable y where \(y.event'\) is set to 1. Otherwise, the execution is blocked (line (9)).
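A helper's instrumented accesses can be sketched in the same style. This is our simplified rendering of the shape of Fig. 4: the program-order join condition is omitted, the x.event flag doubles for the delayed transaction's reads, and the error state is raised as an exception:

```python
def helper_write(state, flags, proc, x, v):
    if flags.get(x + ".event'") == 1:       # delayed txn wrote x: block
        raise RuntimeError("blocked: SI forbids write-write conflicts")
    if flags.get(x + ".event") is not None: # x was accessed before: join the
        flags[proc + ".hbh"] = True         # chain (pledge met)
    flags[x + ".event"] = "st"              # record the access for later txns
    state[x] = v

def helper_read(state, flags, proc, y):
    # a chained read of the variable stored in a_stA closes the HB cycle
    if flags.get(proc + ".hbh") and y == flags.get("a_stA"):
        raise RuntimeError("error state: happens-before cycle found")
    if flags.get(y + ".event") == "st":     # written by a previous helper
        flags[proc + ".hbh"] = True
    flags[y + ".event"] = "ld"              # record the access
    return state[y]
```

A first helper joins by writing a variable the delayed transaction read; once a chained transaction reads the variable stored in \(a_{\textsf {st}_{\textsf {A}}}\), the error state is reached.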
5.3 Correctness
The role of a process in an execution is chosen nondeterministically at runtime. Therefore, the final instrumentation of a given program \(\mathcal {P}\), denoted by \([\![\mathcal {P}]\!]\), is obtained by replacing each labeled instruction \(\langle linst\rangle \) with the concatenation of the instrumentations corresponding to the attacker and the happensbefore helpers, i.e., \(\ \ \ [\![\langle linst\rangle ]\!] \,{::=}\, [\![\langle linst\rangle ]\!]_{\textsf {A}} [\![\langle linst\rangle ]\!]_{\textsf {HbH}} \)
The following theorem states the correctness of the instrumentation.
Theorem 2
\(\mathcal {P}\) is not robust against SI iff \([\![\mathcal {P}]\!]\) reaches the error state.
If a program is not robust, then some execution of the program under SI produces a trace where the happens-before is cyclic, which is possible only if the execution contains at least one delayed transaction. In the proof of this theorem, we show that it is sufficient to search for executions that contain a single delayed transaction.
Notice that in the instrumentation of the attacker, the delayed transaction must contain read and write instructions on different variables. Also, the transactions of the happens-before helpers must not contain a write to a variable that the delayed transaction writes to. The following corollary states the complexity of checking robustness for finite-state programs against snapshot isolation. It is a direct consequence of Theorem 2 and of previous results concerning the reachability problem in concurrent programs running over a sequentially consistent memory, with a fixed [17] or parametric number of processes [22].
Corollary 1
Checking robustness of finite-state programs against snapshot isolation is PSPACE-complete when the number of processes is fixed and EXPSPACE-complete otherwise.
The instrumentation can be extended to SQL (select/update) queries where a statement may include expressions over a finite/infinite set of variables, e.g., by manipulating a set of flags x.event for each statement instead of only one.
6 Proving Program Robustness
As a more pragmatic alternative to the reduction in the previous section, we define an approximate method for proving robustness, inspired by Lipton's reduction theory [18].
Movers. Given an execution \(\tau = ev _1\cdot \ldots \cdot ev _n\) of a program \(\mathcal {P}\) under serializability (where each event \( ev _i\) corresponds to executing an entire transaction), we say that the event \( ev _i\) moves right (resp., left) in \(\tau \) if \( ev _1\cdot \ldots \cdot ev _{i-1}\cdot ev _{i+1}\cdot ev _i\cdot ev _{i+2}\cdot \ldots \cdot ev _n\) (resp., \( ev _1\cdot \ldots \cdot ev _{i-2}\cdot ev _i\cdot ev _{i-1}\cdot ev _{i+1}\cdot \ldots \cdot ev _n\)) is also a valid execution of \(\mathcal {P}\), the process of \( ev _i\) is different from the process of \( ev _{i+1}\) (resp., \( ev _{i-1}\)), and both executions reach the same end state \(\sigma _n\). For an execution \(\tau \), let \(\mathsf {instOf}_\tau ( ev _i)\) denote the transaction that generated the event \( ev _i\). A transaction \({t}\) of a program \(\mathcal {P}\) is a right (resp., left) mover if for all executions \(\tau \) of \(\mathcal {P}\) under serializability, the event \( ev _i\) with \(\mathsf {instOf}_\tau ( ev _i) = {t}\) moves right (resp., left) in \(\tau \).
If a transaction \({t}\) is not a right mover, then there must exist an execution \(\tau \) of \(\mathcal {P}\) under serializability and an event \( ev _i\) of \(\tau \) with \(\mathsf {instOf}_\tau ( ev _i) = {t}\) that does not move right. This implies that there must exist another event \( ev _{i+1}\) of \(\tau \) which caused \( ev _i\) to not move right. Since \( ev _i\) and \( ev _{i+1}\) do not commute, this must be because of either a write-read, write-write, or read-write dependency. If \({t}'=\mathsf {instOf}_\tau ( ev _{i+1})\), we say that \({t}\) is not a right mover because of \({t}'\) and some dependency that is either write-read, write-write, or read-write. Notice that when \({t}\) is not a right mover because of \({t}'\), then \({t}'\) is not a left mover because of \({t}\).
We define \({\mathsf {M_{WR}}}\) as a binary relation between transactions such that \(({t},{t}') \in {\mathsf {M_{WR}}}\) when \({t}\) is not a right mover because of \({t}'\) and a write-read dependency. We define the relations \({\mathsf {M_{WW}}}\) and \({\mathsf {M_{RW}}}\) corresponding to write-write and read-write dependencies in a similar way.
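As a rough illustration, the three relations can be over-approximated syntactically from read and write sets: \(({t},{t}')\in{\mathsf {M_{WR}}}\) when a write of \({t}\) may reach a read of \({t}'\), and analogously for the other two. Note this is only our sketch; the paper's relations are semantic and checked with a verifier, and the read/write sets here are supplied explicitly.

```python
def non_mover_relations(txns):
    """txns: dict name -> (read_set, write_set).
    Returns the over-approximated (M_WR, M_WW, M_RW) as sets of pairs."""
    m_wr, m_ww, m_rw = set(), set(), set()
    for t, (r1, w1) in txns.items():
        for u, (r2, w2) in txns.items():
            if t == u:
                continue
            if w1 & r2:
                m_wr.add((t, u))   # write-read dependency
            if w1 & w2:
                m_ww.add((t, u))   # write-write dependency
            if r1 & w2:
                m_rw.add((t, u))   # read-write dependency
    return m_wr, m_ww, m_rw
```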
Read/Write-free Transactions. Given a transaction \({t}\), we define \({t}\setminus \{r\}\) as a variation of \({t}\) where all the reads from shared variables are replaced with nondeterministic reads, i.e., \(\langle reg\rangle := \langle var\rangle \) statements are replaced with \(\langle reg\rangle := \star \) where \(\star \) denotes nondeterministic choice. We also define \({t}\setminus \{w\}\) as a variation of \({t}\) where all the writes to shared variables in \({t}\) are disabled. Intuitively, recalling the reduction to SC reachability in Sect. 5, \({t}\setminus \{w\}\) simulates the delay of a transaction by the Attacker, i.e., the writes are not made visible to other processes, and \({t}\setminus \{r\}\) approximates the commit of the delayed transaction which only applies a set of writes.
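The two variations are simple syntactic transformations. In the sketch below (the instruction encoding is our own), a transaction is a list of tuples: ('read', reg, var), ('write', var, reg), or ('local', reg, value); a 'havoc' instruction stands for the nondeterministic read \(\star \).

```python
def read_free(txn):
    """t \ {r}: every shared read <reg> := <var> becomes <reg> := *."""
    out = []
    for op in txn:
        if op[0] == 'read':
            out.append(('havoc', op[1]))   # nondeterministic choice
        else:
            out.append(op)
    return out

def write_free(txn):
    """t \ {w}: writes to shared variables are disabled (dropped)."""
    return [op for op in txn if op[0] != 'write']
```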
Commutativity Dependency Graph. Given a program \(\mathcal {P}\), we define the commutativity dependency graph as a graph where vertices represent transactions and their read/write-free variations. Two vertices which correspond to the original transactions in \(\mathcal {P}\) are related by a program order edge, if they belong to the same process. The other edges in this graph represent the “non-mover” relations \({\mathsf {M_{WR}}}\), \({\mathsf {M_{WW}}}\), and \({\mathsf {M_{RW}}}\).
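Constructing the graph is mechanical once the non-mover relations are known. A minimal sketch, assuming the relations are given as precomputed sets of pairs (vertex and edge-label names are illustrative):

```python
def build_cdg(processes, m_wr, m_ww, m_rw):
    """processes: dict pid -> list of transaction names in program order.
    Returns (vertices, edges) with labeled edges (t, u, label)."""
    vertices, edges = set(), set()
    for pid, txns in processes.items():
        for t in txns:
            # each transaction plus its read-free (t\r) and
            # write-free (t\w) variations is a vertex
            vertices.update({t, t + '\\r', t + '\\w'})
        for t, u in zip(txns, txns[1:]):
            edges.add((t, u, 'PO'))          # program order edge
    for label, rel in (('WR', m_wr), ('WW', m_ww), ('RW', m_rw)):
        for t, u in rel:
            edges.add((t, u, label))         # non-mover edge
    return vertices, edges
```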
A non-mover cycle for a transaction \({t}_0\) is a sequence of transactions \({t}''_0, {t}_1, \ldots , {t}_n, {t}'_0\) such that:
 (a)
\(({t}''_0,{t}_1) \in {\mathsf {M_{RW}}}\), where \({t}''_0\) is the write-free variation of \({t}_0\) and \({t}_1\) does not write to a variable that \({t}_0\) writes to;
 (b)
for all \(i\in [1,n-1]\), \(({t}_i,{t}_{i+1})\in ({\mathsf {PO}}\cup {\mathsf {M_{WR}}}\cup {\mathsf {M_{WW}}}\cup {\mathsf {M_{RW}}})\), and \({t}_i\) and \({t}_{i+1}\) do not write to a shared variable that \({t}_0\) writes to;
 (c)
\(({t}_n,{t}'_0) \in {\mathsf {M_{RW}}}\), where \({t}'_0\) is the read-free variation of \({t}_0\) and \({t}_n\) does not write to a variable that \({t}_0\) writes to.
A non-mover cycle approximates an execution of the instrumentation defined in Sect. 5 in between the moment that the Attacker delays a transaction \({t}_0\) (which here corresponds to the write-free variation \({t}''_0\)) and the moment where \({t}_0\) gets committed (the read-free variation \({t}'_0\)).
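Detecting such a cycle amounts to a path search from \({t}''_0\) to \({t}'_0\) that avoids transactions writing to \({t}_0\)'s write set. A sketch under our own representation: `edges` holds the pairs in \({\mathsf {PO}}\cup {\mathsf {M_{WR}}}\cup {\mathsf {M_{WW}}}\cup {\mathsf {M_{RW}}}\), `m_rw` the \({\mathsf {M_{RW}}}\) pairs, `writes` maps a transaction to its write set, and `wf`/`rf` name the write-free/read-free variations.

```python
def has_non_mover_cycle(t0, txns, edges, m_rw, writes, wf, rf):
    """Search for t0'' -RW-> t1 -> ... -> tn -RW-> t0' per (a)-(c),
    where no transaction on the path writes to a variable of t0."""
    w0 = writes[t0]
    starts = [t for t in txns
              if (wf(t0), t) in m_rw and not (writes[t] & w0)]   # (a)
    ends = {t for t in txns
            if (t, rf(t0)) in m_rw and not (writes[t] & w0)}     # (c)
    stack = [(s, {s}) for s in starts]
    while stack:                                                 # (b): DFS
        t, seen = stack.pop()
        if t in ends:
            return True
        for u in txns:
            if u not in seen and (t, u) in edges and not (writes[u] & w0):
                stack.append((u, seen | {u}))
    return False
```

A program is then declared robust when no transaction admits such a cycle, matching Theorem 3 below.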
The following theorem shows that the acyclicity of the commutativity dependency graph of a program implies the robustness of the program. Actually, the notion of robustness in this theorem relies on a slightly different notion of trace where store-order and write-order dependencies take values into account, i.e., the store order relates only writes writing different values, and the write order relates a read to the oldest write (w.r.t. execution order) writing its value. This relaxation helps in avoiding some harmless robustness violations due to, for instance, two transactions writing the same value to some variable.
Theorem 3
For a program \(\mathcal {P}\), if the commutativity dependency graph of \(\mathcal {P}\) does not contain non-mover cycles, then \(\mathcal {P}\) is robust.
7 Experiments
Table 1. An overview of the analysis results. CDG stands for commutativity dependency graph. The columns PO and PT show the number of proof obligations and the proof time in seconds, respectively. T stands for trivial, when the application has only read-only transactions.
Application       #Transactions  Robust  Reachability analysis  CDG analysis
                                         PO     PT              PO     PT
Auction           4              yes     70     0.3             20     0.5
Courseware        5              no      59     0.37            na     na
FusionTicket      4              yes     72     0.3             34     0.5
SmallBank         5              no      48     0.28            na     na
TPCC              5              yes     54     0.7             82     3.7
CassieqCore       8              yes     173    0.55            104    2.9
CurrencyExchange  6              yes     88     0.35            26     3.5
PlayList          14             yes     99     4.63            236    7.3
RoomStore         5              yes     85     0.3             22     0.5
ShoppingCart      4              yes     58     0.25            T      T
A first experiment concerns the reduction of robustness checking to SC reachability. For each application, we have constructed a client (i.e., a program composed of transactions defined within that application) with a fixed number of processes (at most 3) and a fixed number of transactions (between 3 and 7 transactions per process). We have encoded the instrumentation of this client, defined in Sect. 5, in the Boogie programming language [3] and used the Civl verifier [15] in order to check whether the assertions introduced by the instrumentation are violated (which would represent a robustness violation). Note that since clients are of fixed size, this requires no additional assertions/invariants (it is an instance of bounded model checking). The results are reported in Table 1. We have found two of the applications, Courseware and SmallBank, to not be robust against snapshot isolation. The violation in Courseware is caused by transactions RemoveCourse and EnrollStudent that execute concurrently, RemoveCourse removing a course that has no registered student and EnrollStudent registering a new student to the same course. We get an invalid state where a student is registered for a course that was removed. SmallBank’s violation involves transactions Balance, TransactSaving, and WriteCheck. One process executes WriteCheck, where it withdraws an amount from the checking account after checking that the sum of the checking and savings accounts is bigger than this amount. Concurrently, a second process executes TransactSaving, where it withdraws an amount from the savings account after checking that it is smaller than the balance of the savings account. Afterwards, the second process checks the contents of both the checking and savings accounts. We get an invalid state where the sum of the checking and savings accounts is negative.
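The Courseware violation is a classic write skew, which can be reproduced with a toy model of snapshot isolation. The model below is our own illustration (not the paper's benchmark code): each transaction reads from a snapshot taken at begin, buffers its writes, and commits only if no concurrent committed transaction wrote the same variables (first-committer-wins); since RemoveCourse and EnrollStudent have disjoint write sets, both commit.

```python
class SIStore:
    """Toy snapshot-isolation store with first-committer-wins commits."""
    def __init__(self, data):
        self.data = dict(data)
        self.version = {k: 0 for k in data}

    def begin(self):
        # snapshot of values and versions, plus an empty write buffer
        return {'snap': dict(self.data), 'ver': dict(self.version), 'buf': {}}

    def commit(self, t):
        if any(self.version[k] != t['ver'][k] for k in t['buf']):
            return False                     # write-write conflict: abort
        for k, v in t['buf'].items():
            self.data[k] = v
            self.version[k] += 1
        return True

db = SIStore({'course_open': True, 'enrolled': False})

# RemoveCourse: remove the course only if no student is registered
t1 = db.begin()
if not t1['snap']['enrolled']:
    t1['buf']['course_open'] = False

# EnrollStudent, concurrently: enroll only if the course still exists
t2 = db.begin()
if t2['snap']['course_open']:
    t2['buf']['enrolled'] = True

# Disjoint write sets, so both commit under SI; no serial order of the
# two transactions can reach this state.
assert db.commit(t1) and db.commit(t2)
assert db.data == {'course_open': False, 'enrolled': True}
```

Under serializability, whichever transaction runs first disables the other's guard, so the state where a student is enrolled in a removed course is unreachable.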
Since in the first experiment we consider fixed clients, the lack of assertion violations does not imply that the application is robust (this instantiation of our reduction can only be used to reveal robustness violations). Thus, a second experiment concerns the robustness proof method based on commutativity dependency graphs (Sect. 6). For the applications that were not identified as non-robust by the previous method, we have used Civl to construct their commutativity dependency graphs, i.e., to identify the “non-mover” relations \({\mathsf {M_{WR}}}\), \({\mathsf {M_{WW}}}\), and \({\mathsf {M_{RW}}}\) (Civl allows checking whether some code fragment is a left/right mover). In all cases, the graph did not contain non-mover cycles, which allows us to conclude that the applications are robust.
The experiments show that our results can be used for finding violations and proving robustness, and that they apply to a large set of interesting examples. Note that the reduction to SC and the proof method based on commutativity dependency graphs are valid for programs with SQL (select/update) queries.
8 Related Work
Robustness has been previously investigated for weak memory models such as TSO [7, 9]. However, the proof techniques developed in that context do not carry over to snapshot isolation, for several reasons:
- SI and TSO admit different sets of relaxations, and SI is a model of transactional databases.
- We use a different notion of measure: the measure in [7] counts the number of events between a write issue and a write commit, while our notion of measure counts the number of delayed transactions. This is a first reason for which the proof techniques in [7] do not extend to our context.
- Transactions induce more complex traces: two transactions might be related by several dependency relations, since each transaction may contain multiple reads and writes to different locations. In TSO, each action is a read or a write to some location, and two events are related by a single dependency relation. Also, the number of dependencies between two transactions depends on the execution, since the set of reads/writes in a transaction evolves dynamically.
As far as we know, our work provides the first results concerning the decidability and the complexity of robustness checking in the context of transactions. The existing work on the verification of robustness for transactional programs provides either over- or under-approximate analyses. Our commutativity dependency graphs are similar to the static dependency graphs used in [6, 10, 11, 12], but they are more precise, i.e., they reduce the number of false alarms. The static dependency graphs record happens-before dependencies between transactions based on a syntactic approximation of the variables accessed by a transaction. For example, our techniques are able to prove that the program in Fig. 5 is robust, while this is not possible using static dependency graphs. The latter would contain a dependency from transaction \({t}_1\) to \({t}_2\) and one from \({t}_2\) to \({t}_1\) just because, syntactically, each of the two transactions reads both variables and may write to one of them. Our dependency graphs take into account the semantics of these transactions and do not include this happens-before cycle. Other over- and under-approximate analyses have been proposed in [20]. They are based on encoding executions into first-order logic, bounded model checking for the under-approximate analysis, and, for the over-approximate analysis, a sound check for proving a cutoff bound on the size of the happens-before cycles possible in the executions of a program. The latter is strictly less precise than our method based on commutativity dependency graphs. For instance, extending the TPCC application with three additional transactions makes the method in [20] fail, while our method succeeds in proving robustness (the three transactions are for adding a new product, adding a new warehouse based on the number of customers and warehouses, and adding a new customer, respectively).
Finally, the idea of using Lipton’s reduction theory for checking robustness has also been used in the context of the TSO memory model [8], but the techniques are completely different; e.g., the TSO technique considers each update in isolation and does not consider non-mover cycles like the ones in our commutativity dependency graphs.
Footnotes
 1.
Programs with a bounded number of variables taking values from a bounded domain.
References
 1. Adya, A.: Weak consistency: a generalized theory and optimistic implementations for distributed transactions. Ph.D. thesis (1999)
 2. Alomari, M., Cahill, M.J., Fekete, A., Röhm, U.: The cost of serializability on platforms that use snapshot isolation. In: Alonso, G., Blakeley, J.A., Chen, A.L.P. (eds.) Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, 7–12 April 2008, Cancún, Mexico, pp. 576–585. IEEE Computer Society (2008)
 3. Barnett, M., Chang, B.Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: a modular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 364–387. Springer, Heidelberg (2006). https://doi.org/10.1007/11804192_17
 4. Beillahi, S.M., Bouajjani, A., Enea, C.: Checking robustness against snapshot isolation. CoRR, abs/1905.08406 (2019)
 5. Berenson, H., Bernstein, P.A., Gray, J., Melton, J., O’Neil, E.J., O’Neil, P.E.: A critique of ANSI SQL isolation levels. In: Carey, M.J., Schneider, D.A. (eds.) Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, USA, 22–25 May 1995, pp. 1–10. ACM Press (1995)
 6. Bernardi, G., Gotsman, A.: Robustness against consistency models with atomic visibility. In: Desharnais, J., Jagadeesan, R. (eds.) 27th International Conference on Concurrency Theory, CONCUR 2016, 23–26 August 2016, Québec City, Canada. LIPIcs, vol. 59, pp. 7:1–7:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2016)
 7. Bouajjani, A., Derevenetc, E., Meyer, R.: Checking and enforcing robustness against TSO. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 533–553. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37036-6_29
 8. Bouajjani, A., Enea, C., Mutluergil, S.O., Tasiran, S.: Reasoning about TSO programs using reduction and abstraction. In: Chockler, H., Weissenbacher, G. (eds.) CAV 2018. LNCS, vol. 10982, pp. 336–353. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96142-2_21
 9. Bouajjani, A., Meyer, R., Möhlmann, E.: Deciding robustness against total store ordering. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) ICALP 2011. LNCS, vol. 6756, pp. 428–440. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22012-8_34
 10. Brutschy, L., Dimitrov, D., Müller, P., Vechev, M.T.: Serializability for eventual consistency: criterion, analysis, and applications. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, 18–20 January 2017, pp. 458–472. ACM (2017)
 11. Brutschy, L., Dimitrov, D., Müller, P., Vechev, M.T.: Static serializability analysis for causal consistency. In: Foster, J.S., Grossman, D. (eds.) Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018, Philadelphia, PA, USA, 18–22 June 2018, pp. 90–104. ACM (2018)
 12. Cerone, A., Gotsman, A.: Analysing snapshot isolation. J. ACM 65(2), 11:1–11:41 (2018)
 13. Derevenetc, E., Meyer, R.: Robustness against Power is PSpace-complete. In: Esparza, J., Fraigniaud, P., Husfeldt, T., Koutsoupias, E. (eds.) ICALP 2014. LNCS, vol. 8573, pp. 158–170. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43951-7_14
 14. Gotsman, A., Yang, H., Ferreira, C., Najafzadeh, M., Shapiro, M.: ‘cause i’m strong enough: reasoning about consistency choices in distributed systems. In: Bodík, R., Majumdar, R. (eds.) Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, 20–22 January 2016, pp. 371–384. ACM (2016)
 15. Hawblitzel, C., Petrank, E., Qadeer, S., Tasiran, S.: Automated and modular refinement reasoning for concurrent programs. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9207, pp. 449–465. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21668-3_26
 16. Holt, B., Bornholt, J., Zhang, I., Ports, D.R.K., Oskin, M., Ceze, L.: Disciplined inconsistency with consistency types. In: Aguilera, M.K., Cooper, B., Diao, Y. (eds.) Proceedings of the Seventh ACM Symposium on Cloud Computing, Santa Clara, CA, USA, 5–7 October 2016, pp. 279–293. ACM (2016)
 17. Kozen, D.: Lower bounds for natural proof systems. In: 18th Annual Symposium on Foundations of Computer Science, Providence, Rhode Island, USA, 31 October–1 November 1977, pp. 254–266. IEEE Computer Society (1977)
 18. Lipton, R.J.: Reduction: a method of proving properties of parallel programs. Commun. ACM 18(12), 717–721 (1975)
 19. Nagar, K., Jagannathan, S.: Automated detection of serializability violations under weak consistency. In: Schewe, S., Zhang, L. (eds.) 29th International Conference on Concurrency Theory, CONCUR 2018, 4–7 September 2018, Beijing, China. LIPIcs, vol. 118, pp. 41:1–41:18. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2018)
 20. Nagar, K., Jagannathan, S.: Automatic detection of serializability violations under weak consistency. In: 29th International Conference on Concurrency Theory (CONCUR 2018), September 2018
 21. Papadimitriou, C.H.: The serializability of concurrent database updates. J. ACM 26(4), 631–653 (1979)
 22. Rackoff, C.: The covering and boundedness problems for vector addition systems. Theoret. Comput. Sci. 6, 223–231 (1978)
 23. Shasha, D.E., Snir, M.: Efficient and correct execution of parallel programs that share memory. ACM Trans. Program. Lang. Syst. 10(2), 282–312 (1988)
 24. TPC: Technical report, Transaction Processing Performance Council, February 2010. http://www.tpc.org/tpc_documents_current_versions/pdf/tpcc_v5.11.0.pdf
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.