Fragment Abstraction for Concurrent Shape Analysis
Abstract
A major challenge in automated verification is to develop techniques that are able to reason about fine-grained concurrent algorithms that consist of an unbounded number of concurrent threads, which operate on an unbounded domain of data values, and use unbounded dynamically allocated memory. Existing automated techniques consider the case where shared data is organized into singly-linked lists. We present a novel shape analysis for automated verification of fine-grained concurrent algorithms that can handle heap structures which are more complex than just singly-linked lists, in particular skip lists and arrays of singly-linked lists, while at the same time handling an unbounded number of concurrent threads, an unbounded domain of data values (including timestamps), and an unbounded shared heap. Our technique is based on a novel shape abstraction, which represents a set of heaps by a set of fragments. A fragment is an abstraction of a pair of heap cells that are connected by a pointer field. We have implemented our approach and applied it to automatically verify correctness, in the sense of linearizability, of most linearizable concurrent implementations of sets, stacks, and queues known to us in the literature, which employ singly-linked lists, skip lists, or arrays of singly-linked lists with timestamps.
1 Introduction
Concurrent algorithms with an unbounded number of threads that concurrently access a dynamically allocated shared state are of central importance in a large number of software systems. They provide efficient concurrent realizations of common interface abstractions, and are widely used in libraries, such as the Intel Threading Building Blocks or the java.util.concurrent package. They are notoriously difficult to get correct and to verify, since they often employ fine-grained synchronization and avoid locking when possible. A number of bugs in published algorithms have been reported [13, 30]. Consequently, significant research efforts have been directed towards developing techniques to verify correctness of such algorithms. One widely-used correctness criterion is that of linearizability, meaning that each method invocation can be considered to occur atomically at some point between its call and return. Many of the developed verification techniques require significant manual effort for constructing correctness proofs (e.g., [25, 41]), in some cases with the support of an interactive theorem prover (e.g., [11, 35, 40]). Development of automated verification techniques remains a difficult challenge.
A major challenge for the development of automated verification techniques is that such techniques must be able to reason about fine-grained concurrent algorithms that are infinite-state in many dimensions: they consist of an unbounded number of concurrent threads, which operate on an unbounded domain of data values, and use unbounded dynamically allocated memory. Perhaps the hardest of these challenges is that of handling dynamically allocated memory. Consequently, existing techniques that can automatically prove correctness of such fine-grained concurrent algorithms restrict attention to the case where heap structures represent shared data by singly-linked lists [1, 3, 18, 36, 42]. Furthermore, many of these techniques impose additional restrictions on the considered verification problem, such as bounding the number of accessing threads [4, 43, 45]. However, in many concurrent data structure implementations the heap represents more sophisticated structures, such as skiplists [16, 22, 38] and arrays of singly-linked lists [12]. There are no techniques that have been applied to automatically verify concurrent algorithms that operate on such data structures.
Contributions. In this paper, we present a technique for automatic verification of concurrent data structure implementations that operate on dynamically allocated heap structures which are more complex than just singly-linked lists. Our framework is the first that can automatically verify concurrent data structure implementations that employ singly-linked lists, skiplists [16, 22, 38], as well as arrays of singly-linked lists [12], at the same time as handling an unbounded number of concurrent threads, an unbounded domain of data values (including timestamps), and an unbounded shared heap.
Our technique is based on a novel shape abstraction, called fragment abstraction, which in a simple and uniform way is able to represent several different classes of unbounded heap structures. Its main idea is to represent a set of heap states by a set of fragments. A fragment represents two heap cells that are connected by a pointer field. For each of its cells, the fragment represents the contents of its non-pointer fields, together with information about how the cell can be reached from the program’s global pointer variables. The latter information consists of both: (i) local information, saying which pointer variables point directly to the cell, and (ii) global information, saying how the cell can reach, and be reached from (by following chains of pointers), heap cells that are globally significant, typically because some global variable points to them. A set of fragments represents the set of heap states in which any pair of pointer-connected cells is represented by some fragment in the set. Thus, a set of fragments describes the set of heaps that can be formed by “piecing together” fragments in the set. The combination of local and global information in fragments supports reasoning about the sequence of cells that can be accessed by threads that traverse the heap by following pointer fields in cells and pointer variables: the local information captures properties of the cell fields that can be accessed as a thread dereferences a pointer variable or a pointer field; the global information also captures whether certain significant accesses will be possible at all by following a sequence of pointer fields. This support for reasoning about patterns of cell accesses enables automated verification of reachability and other functional properties.
Fragment abstraction can (and should) be combined, in a natural way, with data abstractions for handling unbounded data domains and with thread abstractions for handling an unbounded number of threads. For the latter we adapt the successful thread-modular approach [5], which represents the local state of a single, but arbitrary, thread, together with the part of the global state and heap that is accessible to that thread. Our combination of fragment abstraction, thread abstraction, and data abstraction results in a finite abstract domain, thereby guaranteeing termination of our analysis.
We have implemented our approach and applied it to automatically verify correctness, in the sense of linearizability, of a large number of concurrent data structure algorithms, described in a C-like language. More specifically, we have automatically verified linearizability of most linearizable concurrent implementations of sets, stacks, queues, and priority queues, which employ singly-linked lists, skiplists, or arrays of timestamped singly-linked lists, known to us in the literature on concurrent data structures. For this verification, we specify linearizability using the simple and powerful technique of observers [1, 7, 9], which reduces the criterion of linearizability to a simple reachability property. To verify implementations of stacks and queues, the application of observers can be done completely automatically without any manual steps, whereas for implementations of sets, the verification relies on lightweight user annotation of how linearization points are placed in each method [3].
The fact that our fragment abstraction has been able to automatically verify all supplied concurrent algorithms, including those that employ skiplists or arrays of singly-linked lists (SLLs), indicates that fragment abstraction is a simple mechanism for capturing both the local and global information about heap cells that is necessary for verifying correctness, in particular for concurrent algorithms where an unbounded number of threads interact via a shared heap.
Outline. In the next section, we illustrate our fragment abstraction on the verification of a skiplist-based concurrent set implementation. In Sect. 3 we introduce our model for programs and our use of observers for specifying linearizability. In Sect. 4 we describe in more detail our fragment abstraction for skiplists; note that singly-linked lists can be handled as a simple special case of skiplists. In Sect. 5 we describe how fragment abstraction applies to arrays of singly-linked lists with timestamp fields. Our implementation and experiments are reported in Sect. 6, followed by conclusions in Sect. 7.
Related Work. A large number of techniques have been developed for representing heap structures in automated analysis, including, e.g., separation logic and various related graph formalisms [10, 15, 47], other logics [33], automata [23], or graph grammars [19]. Most works apply these to sequential programs.
Approaches for automated verification of concurrent algorithms are limited to the case of singly-linked lists [1, 3, 18, 36, 42]. Furthermore, many of these techniques impose additional restrictions on the considered verification problem, such as bounding the number of accessing threads [4, 43, 45].
In [1], concurrent programs operating on SLLs are analyzed using an adaptation of a transitive closure logic [6], combined with tracking of simple sortedness properties between data elements; the approach cannot represent the patterns observed by threads when following sequences of pointers inside the heap, and so has not been applied to concurrent set implementations. In our recent work [3], we extended this approach to handle SLL implementations of concurrent sets by adapting a well-known abstraction of singly-linked lists [28] for concurrent programs. The resulting technique is specifically tailored to singly-linked lists. Our fragment abstraction is significantly simpler conceptually, and can therefore be adapted also for other classes of heap structures. The approach of [3] is the only one with a shape representation strong enough to verify the concurrent set implementations we consider that are based on sorted and non-sorted singly-linked lists with non-optimistic contains (or lookup) operations, such as the lock-free sets of HM [22], Harris [17], or Michael [29], or the unordered set of [48]. As shown in Sect. 6, our fragment abstraction can handle these, as well as algorithms employing skiplists and arrays of singly-linked lists.
There is no previous work on automated verification of skiplist-based concurrent algorithms. Verification of sequential skiplist algorithms has been addressed under restrictions, such as limiting the number of levels to two or three [2, 23]. The work [34] generates verification conditions for statements in sequential skiplist implementations. All these works assume that skiplists have the well-formedness property that any higher-level list is a sublist of any lower-level list, which is true for sequential skiplist algorithms, but false for several concurrent ones, such as [22, 26].
Concurrent algorithms based on arrays of SLLs with timestamps, such as those of [12], have proven rather challenging to verify. Only recently has the TS stack been verified by non-automated techniques [8] using a non-trivial extension of forward simulation, and the TS queue been verified manually by a new technique based on partial orders [24, 37]. We have verified both these algorithms automatically using fragment abstraction.
2 Overview
In this section, we illustrate our technique on the verification of correctness, in the sense of linearizability, of a concurrent set data structure based on skiplists, namely the Lock-Free Concurrent Skiplist from [22, Sect. 14.4]. Skiplists provide expected logarithmic time search while avoiding some of the complications of tree structures. Informally, a skiplist consists of a collection of sorted linked lists, each of which is located at a level, ranging from 1 up to a maximum value. Each skiplist node has a key value and participates in the lists at levels 1 up to its height. The skiplist has sentinel head and tail nodes with maximum heights and key values \(-\infty \) and \(+\infty \), respectively. The lowest-level list (at level 1) constitutes an ordered list of all nodes in the skiplist. Higher-level lists are increasingly sparse sublists of the lowest-level list, and serve as shortcuts into lower-level lists. Figure 1 shows an example of a skiplist of height 3. It has head and tail nodes of height 3, two nodes of height 2, and one node of height 1.
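The shortcut structure can be illustrated by a small sequential search sketch (ours, in Python for brevity; `Node` and `search` are illustrative names, not code from [22], and concurrency is ignored here):

```python
class Node:
    """A skiplist node: a key and one forward pointer per level 1..height."""
    def __init__(self, key, height):
        self.key = key
        self.next = {k: None for k in range(1, height + 1)}

def search(head, key, max_level):
    """Descend from the top level, moving right while the next key is
    smaller; the node with `key`, if present, is found at level 1."""
    pred = head
    for k in range(max_level, 0, -1):
        while pred.next[k] is not None and pred.next[k].key < key:
            pred = pred.next[k]
    curr = pred.next[1]
    return curr if curr is not None and curr.key == key else None
```

Each level visited skips over all nodes of smaller height, which is the source of the expected logarithmic search time.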
In the algorithm, each heap node has a \(\mathtt{key}\) field, a \(\mathtt{height}\), an array of \(\mathtt{next}\) pointers indexed from 1 up to its \(\mathtt{height}\), and an array of \(\mathtt{marked}\) fields which are true if the node has been logically removed at the corresponding level. Removal of a node (at a certain level \(\mathtt{k}\)) occurs in two steps: first the node is logically removed by setting its \(\mathtt{marked}\) flag at level \(\mathtt{k}\) to \(\mathtt{true}\), thereafter the node is physically removed by unlinking it from the level-\(\mathtt{k}\) list. The algorithm must be able to update the \(\mathtt{next[k]}\) pointer and \(\mathtt{marked[k]}\) field together as one atomic operation; this is standardly implemented by encoding them in a single word. The head and tail nodes of the skiplist are pointed to by global pointer variables \(\mathtt{H}\) and \(\mathtt{T}\), respectively. The \(\mathtt{find}\) method traverses the list at decreasing levels using two local variables \(\mathtt{pred}\) and \(\mathtt{curr}\), starting at the head and at the maximum level (lines 5–6). At each level \(\mathtt{k}\) it sets \(\mathtt{curr}\) to \(\mathtt{pred.next[k]}\) (line 7). During the traversal, the pointer variable \(\mathtt{succ}\) and boolean variable \(\mathtt{marked}\) are atomically assigned the values of \(\mathtt{curr.next[k]}\) and \(\mathtt{curr.marked[k]}\), respectively (lines 9 and 14). After that, the method repeatedly removes marked nodes at the current level (lines 10 to 14). This is done by using a \(\mathtt{CompareAndSwap}\) \(\mathtt{(CAS)}\) command (line 11), which tests whether \(\mathtt{pred.next[k]}\) and \(\mathtt{pred.marked[k]}\) are equal to \(\mathtt{curr}\) and \(\mathtt{false}\) respectively. If this test succeeds, it replaces them with \(\mathtt{succ}\) and \(\mathtt{false}\) and returns \(\mathtt{true}\); otherwise, the \(\mathtt{CAS}\) returns \(\mathtt{false}\). 
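The requirement that \(\mathtt{next[k]}\) and \(\mathtt{marked[k]}\) be updated together can be modeled as a CAS on a pair; the following sketch is our own modeling (a lock stands in for the word-level atomicity that the single-word encoding provides in hardware):

```python
import threading

class Node:
    """Skiplist node; next_marked[k] holds the pair (successor, marked-bit)
    that the algorithm packs into a single machine word at level k."""
    def __init__(self, key, height):
        self.key = key
        self.height = height
        self.next_marked = {k: (None, False) for k in range(1, height + 1)}
        self._lock = threading.Lock()  # models word-level atomicity

    def cas(self, k, exp_next, exp_marked, new_next, new_marked):
        """Atomically: if next_marked[k] == (exp_next, exp_marked),
        replace it with (new_next, new_marked) and return True."""
        with self._lock:
            if self.next_marked[k] == (exp_next, exp_marked):
                self.next_marked[k] = (new_next, new_marked)
                return True
            return False
```

A failed CAS signals interference by another thread, which is exactly what triggers the removal retry loop in \(\mathtt{find}\).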
During the traversal at level \(\mathtt{k}\), \(\mathtt{pred}\) and \(\mathtt{curr}\) are advanced until \(\mathtt{pred}\) points to a node with the largest key at level \(\mathtt{k}\) which is smaller than \(\mathtt{x}\) (lines 15–18). Thereafter, the resulting values of \(\mathtt{pred}\) and \(\mathtt{curr}\) are recorded into \(\mathtt{preds[k]}\) and \(\mathtt{succs[k]}\) (lines 19, 20), whereafter traversal continues one level below until it reaches the bottom level. Finally, the method returns \(\mathtt{true}\) if the \(\mathtt{key}\) value of \(\mathtt{curr}\) is equal to \(\mathtt{x}\); otherwise, it returns \(\mathtt{false}\) meaning that a node with key \(\mathtt{x}\) is not found.
The \(\mathtt{add}\) method uses \(\mathtt{find}\) to check whether a node with key \(\mathtt{x}\) is already in the list. If so it returns \(\mathtt{false}\); otherwise, a new node is created with randomly chosen height \(\mathtt{h}\) (line 7), and with \(\mathtt{next}\) pointers at levels from 1 to \(\mathtt{h}\) initialized to corresponding elements of \(\mathtt{succ}\) (lines 8–9). Thereafter, the new node is added into the list by linking it into the bottom-level list between the \(\mathtt{preds[1]}\) and \(\mathtt{succs[1]}\) pointers returned by \(\mathtt{find}\). This is achieved by using a \(\mathtt{CAS}\) to make \(\mathtt{preds[1].next[1]}\) point to the new node (line 13). If the \(\mathtt{CAS}\) fails, the \(\mathtt{add}\) method will restart from the beginning (line 3) by calling \(\mathtt{find}\) again, etc. Otherwise, \(\mathtt{add}\) proceeds with linking the new node into the list at increasingly higher levels (lines 16 to 22). For each higher level \(\mathtt{k}\), it makes \(\mathtt{preds[k].next[k]}\) point to the new node if it is still valid (line 20); otherwise \(\mathtt{find}\) is called again to recompute \(\mathtt{preds[k]}\) and \(\mathtt{succs[k]}\) on the remaining unlinked levels (line 22). Once all levels are linked, the method returns \(\mathtt{true}\).
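The restart-on-failed-CAS pattern can be sketched at the bottom level alone (a simplified, single-level model with our own names; the real \(\mathtt{add}\) also links the higher levels and recomputes \(\mathtt{preds}\)/\(\mathtt{succs}\) via \(\mathtt{find}\)):

```python
class ListNode:
    """Minimal list node for the bottom-level list."""
    def __init__(self, key):
        self.key = key
        self.next = None

    def cas_next(self, expected, new):
        # Models CAS(pred.next, succ, node); atomic under the GIL here.
        if self.next is expected:
            self.next = new
            return True
        return False

def find_window(head, key):
    """Return (found, pred, succ) with pred.key < key <= succ.key
    (succ may be None at the end of the list)."""
    pred, succ = head, head.next
    while succ is not None and succ.key < key:
        pred, succ = succ, succ.next
    return (succ is not None and succ.key == key), pred, succ

def add(head, key):
    """Retry loop: relocate the window and restart whenever the CAS fails."""
    while True:
        found, pred, succ = find_window(head, key)
        if found:
            return False          # key already present
        node = ListNode(key)
        node.next = succ
        if pred.cas_next(succ, node):
            return True
        # CAS failed: another thread changed pred.next; retry from find
```

Once the node is linked at level 1 it is logically in the set; linking at higher levels only restores the shortcut structure.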
To prepare for verification, we add a specification which expresses that the skiplist algorithm of Fig. 2 is a linearizable implementation of a set data structure, using the technique of observers [1, 3, 7, 9]. For our skiplist algorithm, the user first instruments statements in each method that correspond to linearization points (LPs), so that their execution announces the corresponding atomic set operation. In Fig. 2, the LP of a successful \(\mathtt{add}\) operation is at line 15 of the \(\mathtt{add}\) method (denoted by a blue dot) when the \(\mathtt{CAS}\) succeeds, whereas the LP of an unsuccessful \(\mathtt{add}\) operation is at line 13 of the \(\mathtt{find}\) method (denoted by a red dot). We must now verify that in any concurrent execution of a collection of method calls, the sequence of announced operations satisfies the semantics of the set data structure. This check is performed by an observer, which monitors the sequence of announced operations. The observer for the set data structure utilizes a register, which is initialized with a single, arbitrary \(\mathtt{key}\) value. It checks that operations on this particular value follow set semantics, i.e., that successful \(\mathtt{add}\) and \(\mathtt{remove}\) operations on an element alternate and that \(\mathtt{contains}\) operations are consistent with them. We form the cross-product of the program and the observer, synchronizing on operation announcements. This reduces the problem of checking linearizability to the problem of checking that in this cross-product, regardless of the initial observer register value, the observer cannot reach a state where the semantics of the set data structure has been violated.
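For a fixed register value \(\mathtt{x}\), the observer's check can be sketched as a small monitor (our own rendering; in the analysis, the error location must be unreachable for an arbitrary choice of \(\mathtt{x}\)):

```python
def set_observer_accepts(trace, x):
    """Return True iff the sequence of announced operations on key x
    violates set semantics, i.e., the observer reaches its error location."""
    in_set = False
    for (op, key, result) in trace:
        if key != x:
            continue                     # observer ignores other keys
        if op == 'add' and result:       # successful add: x must be absent
            if in_set:
                return True
            in_set = True
        elif op == 'remove' and result:  # successful remove: x must be present
            if not in_set:
                return True
            in_set = False
        elif op == 'contains':           # contains must report membership
            if result != in_set:
                return True
    return False
```

Because the register is initialized with an arbitrary value, establishing that no choice of \(\mathtt{x}\) makes the monitor accept covers operations on every key.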
To verify that the observer cannot reach a state where a violation is reported, we compute a symbolic representation of an invariant that is satisfied by all reachable configurations of the cross-product of a program and an observer. This symbolic representation combines thread abstraction, data abstraction and our novel fragment abstraction to represent the heap state. Our thread abstraction adapts the thread-modular approach by representing only the view of a single, but arbitrary, thread \(\mathtt{th}\). Such a view consists of the local state of thread \(\mathtt{th}\), including the value of the program counter, the state of the observer, and the part of the heap that is accessible to thread \(\mathtt{th}\) via pointer variables (local to \(\mathtt{th}\) or global). Our data abstraction represents variables and cell fields that range over small finite domains by their concrete values, whereas variables and fields that range over the same domain as \(\mathtt{key}\) fields are abstracted to constraints over their relative ordering (w.r.t. \(<\)).
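The data abstraction can be sketched as follows (our own illustration): a finite-domain field such as \(\mathtt{marked}\) keeps its concrete value, while the unbounded \(\mathtt{key}\) field is mapped to its ordering relative to each tracked value (such as an observer register), yielding a finite abstract domain:

```python
def abstract_fields(cell, tracked):
    """Data abstraction of a cell's non-pointer fields: finite-domain fields
    (here `marked`) keep their concrete value; the unbounded `key` field is
    abstracted to its ordering w.r.t. each tracked value."""
    rel = lambda a, b: '<' if a < b else ('=' if a == b else '>')
    return {
        'marked': cell['marked'],
        'key': tuple(rel(cell['key'], t) for t in tracked),
    }
```

Since only finitely many tuples of order relations exist, two cells with different concrete keys but the same relations to the tracked values become indistinguishable, which is what makes the abstract domain finite.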
Our fragment abstraction represents the accessible heap by a set of fragments, each built from tags. A tag abstracts a heap cell \({\mathbbm {c}}\) by the following components:

- \(\mathtt {dabs}\) represents the non-pointer fields of the cell under the applied data abstraction,
- \(\mathtt {pvars}\) is the set of (local to \(\mathtt{th}\) or global) pointer variables that point to the cell,
- \(\mathtt {reachfrom}\) is the set of (i) global pointer variables from which the cell represented by the tag is reachable via a (possibly empty) sequence of \(\mathtt{next[1]}\) pointers, and (ii) observer registers \(\mathtt {x}_i\) such that the cell is reachable from some cell whose data value equals that of \(\mathtt {x}_i\),
- \(\mathtt {reachto}\) is the corresponding information, but now considering cells that are reachable from the cell represented by the tag, and
- \(\mathtt {private}\) is \(\mathtt{true}\) only if \({\mathbbm {c}}\) is private to \(\mathtt{th}\).
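Under these definitions, a tag and a fragment can be rendered directly as datatypes (our own encoding; \(\phi \) denotes the ordering constraint between the keys of the two tags, cf. Sect. 2):

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Tag:
    """Abstraction of one heap cell, per the components listed above."""
    dabs: tuple                 # abstracted non-pointer fields
    pvars: FrozenSet[str]       # pointer variables pointing directly to the cell
    reachfrom: FrozenSet[str]   # global vars / registers that reach the cell
    reachto: FrozenSet[str]     # global vars / registers the cell can reach
    private: bool               # True only if the cell is private to th

@dataclass(frozen=True)
class Fragment:
    """Two pointer-connected tags: input tag i and output tag o, plus an
    ordering constraint phi between the keys of the two cells."""
    i: Tag
    o: Tag
    phi: FrozenSet[str]         # subset of {'<', '=', '>'}
```

Frozen dataclasses make tags and fragments hashable, so a shape is naturally a finite *set* of fragments, as the abstraction requires.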
Let us illustrate how fragment abstraction applies to the skiplist algorithm. Figure 4 shows an example heap state of the skiplist algorithm with three levels. Each heap cell is shown with the values of its fields as described in Fig. 3. In addition, each cell is labeled by the pointer variables that point to it; we use \(\mathtt{preds(i)[k]}\) to denote the local variable \(\mathtt{preds[k]}\) of thread \(\mathtt{th}_\mathtt{i}\), and the same for other local variables. In the heap state of Fig. 4, thread \(\mathtt{th}_1\) is trying to add a new node of height 1 with key 9, and has reached line 8 of the \(\mathtt{add}\) method. Thread \(\mathtt{th}_2\) is trying to add a new node with key 20 and it has done its first iteration of the \(\mathtt{for}\) loop in the \(\mathtt{find}\) method. The variables \(\mathtt{preds(2)[3]}\) and \(\mathtt{currs(2)[3]}\) have been assigned so that the new node (which has not yet been created) will be inserted between node 5 and the tail node. The observer is not shown, but the value of the observer register is 9; thus it currently tracks the \(\mathtt{add}\) operation of \(\mathtt{th}_1\).
Figure 5 shows a set of fragments that is sufficient to represent the part of the heap that is accessible to \(\mathtt{th}_1\) in the configuration in Fig. 4. There are 11 fragments, named \(\mathtt {v}_1\), ..., \(\mathtt {v}_{11}\). Three of these (\(\mathtt {v}_6\), \(\mathtt {v}_7\), and \(\mathtt {v}_{11}\)) consist of a single tag that points to \(\bot \). All other fragments consist of a pair of pointer-connected tags. The fragments \(\mathtt {v}_1\), ..., \(\mathtt {v}_{6}\) are level-1 fragments, whereas \(\mathtt {v}_7\), ..., \(\mathtt {v}_{11}\) are higher-level fragments. The \(\mathtt{private}\) field of the input tag of \(\mathtt {v}_7\) is \(\mathtt{true}\), whereas the \(\mathtt{private}\) fields of all other tags are \(\mathtt{false}\).
Verifying the algorithm requires the symbolic representation to capture invariants of the reachable heap states, such as:
 1. the bottom-level list is strictly sorted in \(\mathtt{key}\) order,
 2. a higher-level pointer from a globally reachable node is a shortcut into the level-1 list, i.e., it points to a node that is reachable by a sequence of \(\mathtt{next[1]}\) pointers,
 3. all nodes which are unreachable from the head of the list are marked, and
 4. the variable \(\mathtt{pred}\) points to a cell whose \(\mathtt{key}\) field is never larger than the input parameter of its \(\mathtt{add}\) method.
Let us illustrate how such invariants are captured by our fragment abstraction. (1) All level-1 fragments are strictly sorted, implying that the bottom-level list is strictly sorted. (2) For each higher-level fragment \(\mathtt{v}\), if \(\mathtt{H} \in \mathtt{v.i.}\mathtt {reachfrom}\) then also \(\mathtt{H} \in \mathtt{v.o.}\mathtt {reachfrom}\), implying (together with \(\mathtt{v.}\phi = \left\{ <\right\} \)) that the cell represented by \(\mathtt{v.o}\) is reachable from that represented by \(\mathtt{v.i}\) by a sequence of \(\mathtt{next[1]}\) pointers. (3) This is verified by inspecting each tag: \(\mathtt {v}_{3}\) contains the only unreachable tag, and it is also marked. (4) The fragments express this property in the case where the value of \(\mathtt{key}\) is the same as the value of the observer register \(\mathtt{x}\). Since the invariant holds for any value of \(\mathtt{x}\), this property is sufficiently represented for purposes of verification.
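Invariant (2), for instance, becomes a purely local check over each fragment. The following sketch is ours, against a hypothetical dictionary encoding of fragments, with `reachfrom` as a set of names and `phi` as a set of order relations:

```python
def shortcut_invariant(fragments, level_of):
    """Check invariant (2): in every higher-level fragment whose input tag
    is reachable from H, the output tag is also reachable from H and the
    keys are strictly ordered, so the pointer is a shortcut into the
    level-1 list."""
    for f in fragments:
        if level_of(f) > 1 and 'H' in f['i']['reachfrom']:
            if 'H' not in f['o']['reachfrom'] or f['phi'] != {'<'}:
                return False
    return True
```

Because the check inspects one fragment at a time, it fits naturally into an abstract fixpoint computation over sets of fragments.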
3 Concurrent Data Structure Implementations
In this section, we introduce our representation of concurrent data structure implementations, define the correctness criterion of linearizability, and introduce observers and their use for specifying linearizability.
3.1 Concurrent Data Structure Implementations
We first introduce (sequential) data structures. A data structure \(\mathtt{DS}\) is a pair \(\left\langle \mathbb {D},\mathbb {M}\right\rangle \), where \(\mathbb {D}\) is a (possibly infinite) data domain and \(\mathbb {M}\) is an alphabet of method names. An operation \({ op}\) is of the form \(\mathtt{m}(d^{ in},d^{ out})\), where \(\mathtt{m}\in \mathbb {M}\) is a method name, and \(d^{ in},d^{ out}\) are the input and output values, respectively, each of which is either in \(\mathbb {D}\) or in some small finite domain \(\mathbb {F}\), which includes the booleans. For some method names, the input or output value is absent from the operation. A trace of \(\mathtt{DS}\) is a sequence of operations. The (sequential) semantics of a data structure \(\mathtt{DS}\) is given by a set \([\![{\mathtt{DS}}]\!]\) of allowed traces. For example, a \(\mathtt{Set}\) data structure has method names \(\mathtt{add}\), \(\mathtt{remove}\), and \(\mathtt{contains}\). An example of an allowed trace is \(\mathtt{add(3,true)\ contains(4,false)\ contains(3,true) \ remove(3,true)}\).
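Membership of a trace in \([\![{\mathtt{Set}}]\!]\) can be decided by replaying it against a reference set; the following checker is purely illustrative and not part of the formalism:

```python
def valid_set_trace(trace):
    """Replay a trace of (method, d_in, d_out) operations against a
    reference set; the trace is allowed iff every reported output is
    correct for the state built up by the preceding operations."""
    s = set()
    for (m, d_in, d_out) in trace:
        if m == 'add':
            ok = d_in not in s      # add succeeds iff the key was absent
            s.add(d_in)
        elif m == 'remove':
            ok = d_in in s          # remove succeeds iff the key was present
            s.discard(d_in)
        elif m == 'contains':
            ok = d_in in s
        else:
            return False            # unknown method name
        if d_out != ok:
            return False
    return True
```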
A concurrent data structure implementation operates on a shared state consisting of shared global variables and a shared heap. It assigns, to each method name, a method which performs operations on the shared state. It also comes with a method named \(\mathtt{init}\), which initializes its shared state.
A heap (state) \({\mathcal {H}}\) consists of a finite set \({\mathbb C}\) of cells, including the two special cells \(\mathtt{null}\) and \(\bot \) (dangling). Heap cells have a fixed set \(\mathcal {F}\) of fields, namely non-pointer fields that assume values in \(\mathbb {D}\) or \(\mathbb {F}\), and possibly lock fields. We use the term \(\mathbb {D}\)-field for a non-pointer field that assumes values in \(\mathbb {D}\), and the terms \(\mathbb {F}\)-field and lock field with analogous meaning. Furthermore, each cell has one or several named pointer fields. For instance, in data structure implementations based on singly-linked lists, each heap cell has a pointer field named \(\mathtt{next}\); in implementations based on skiplists there is an array of pointer fields named \(\mathtt{next[k]}\) where \(\mathtt{k}\) ranges from 1 to a maximum level.
Each method declares local variables and a method body. The set of local variables includes the input parameter of the method and the program counter \(\mathtt{pc}\). A local state \(\mathtt{loc}\) of a thread \(\mathtt{th}\) defines the values of its local variables. The global variables can be accessed by all threads, whereas local variables can be accessed only by the thread which is invoking the corresponding method. Variables are either pointer variables (to heap cells), locks, or data variables assuming values in \(\mathbb {D}\) or \(\mathbb {F}\). We assume that all global variables are pointer variables. The body is built in the standard way from atomic commands, using standard control flow constructs (sequential composition, selection, and loop constructs). Atomic commands include assignments between variables, or between fields of cells pointed to by a pointer variable. Method execution is terminated by executing a \(\mathtt{return}\) command, which may return a value. The command \(\mathtt{new\ Node()}\) allocates a new structure of type \(\mathtt{Node}\) on the heap, and returns a reference to it. The compare-and-swap command \(\mathtt{CAS(a,b,c)}\) atomically compares the values of \(\mathtt{a}\) and \(\mathtt{b}\). If they are equal, it assigns the value of \(\mathtt{c}\) to \(\mathtt{a}\) and returns \(\mathtt{true}\); otherwise, it leaves \(\mathtt{a}\) unchanged and returns \(\mathtt{false}\). We assume a memory management mechanism, which automatically collects garbage, and ensures that a new cell is fresh, i.e., has not been used before; this avoids the so-called ABA problem (e.g., [31]).
We define a program \({\mathcal {P}}\) (over a concurrent data structure) to consist of an arbitrary number of concurrently executing threads, each of which executes a method that performs an operation on the data structure. The shared state is initialized by the \(\mathtt{init}\) method prior to the start of program execution. A configuration of a program \({\mathcal {P}}\) is a tuple \(c_{{\mathcal {P}}}= \left\langle \mathtt{T},\mathtt{LOC},{\mathcal {H}}\right\rangle \) where \(\mathtt{T}\) is a set of threads, \({\mathcal {H}}\) is a heap, and \(\mathtt{LOC}\) maps each thread \(\mathtt{th}\in \mathtt{T}\) to its local state \(\mathtt{LOC}\left( \mathtt{th}\right) \). We assume concurrent execution according to the sequentially consistent memory model. The behavior of a thread \(\mathtt{th}\) executing a method can be formalized as a transition relation \({\xrightarrow {}{}}_{\!\!\mathtt{th}}\) on pairs \(\left\langle \mathtt{loc},{\mathcal {H}}\right\rangle \) consisting of a local state \(\mathtt{loc}\) and a heap state \({\mathcal {H}}\). The behavior of a program \({\mathcal {P}}\) can be formalized by a transition relation \({\xrightarrow {}{}}_{\!\!{\mathcal {P}}}\) on program configurations; each step corresponds to a move of a single thread. That is, there is a transition of the form \(\left\langle \mathtt{T},\mathtt{LOC},{\mathcal {H}}\right\rangle {\xrightarrow {}{}}_{\!\!{\mathcal {P}}} \left\langle \mathtt{T},\mathtt{LOC}[\mathtt{th}\leftarrow \mathtt{loc}'],{\mathcal {H}}'\right\rangle \) whenever some thread \(\mathtt{th}\in \mathtt{T}\) has a transition \(\left\langle \mathtt{loc},{\mathcal {H}}\right\rangle {\xrightarrow {}{}}_{\!\!\mathtt{th}}\left\langle \mathtt{loc}',{\mathcal {H}}'\right\rangle \) with \(\mathtt{LOC}(\mathtt{th}) = \mathtt{loc}\).
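The interleaving semantics of \({\xrightarrow {}{}}_{\!\!{\mathcal {P}}}\) can be prototyped as a one-step successor function (a schematic harness with our own naming, abstracting each thread's transition relation as a function):

```python
def program_steps(threads, LOC, heap):
    """One-step successors of a configuration <T, LOC, H>: each thread th
    contributes the configurations reached by its own transition relation.
    `threads` maps each thread id to a step function
    (loc, heap) -> list of (loc', heap') successor pairs."""
    succs = []
    for th, step in threads.items():
        for loc2, heap2 in step(LOC[th], heap):
            LOC2 = dict(LOC)
            LOC2[th] = loc2        # only th's local state changes
            succs.append((LOC2, heap2))
    return succs
```

A reachability analysis then iterates this successor function from the initial configuration, which is the concrete semantics that the fragment abstraction over-approximates.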
3.2 Linearizability
In a concurrent data structure implementation, we represent the calling of a method by a call action \(\mathtt {call}_{\mathtt{o}}\; \mathtt{m}\left( d^{ in}\right) \), and the return of a method by a return action \(\mathtt {ret}_{\mathtt{o}}\; \mathtt{m}\left( d^{ out}\right) \), where \(\mathtt{o}\in \mathbb {N}\) is an action identifier, which links the call and return of each method invocation. A history \(h\) is a sequence of actions such that (i) different occurrences of return actions have different action identifiers, and (ii) for each return action \(a_2\) in \(h\) there is a unique matching call action \(a_1\) with the same action identifier and method name, which occurs before \(a_2\) in \(h\). A call action which does not match any return action in \(h\) is said to be pending. A history without pending call actions is said to be complete. A completed extension of \(h\) is a complete history \(h'\) obtained from \(h\) by appending (at the end) zero or more return actions that are matched by pending call actions in \(h\), and thereafter removing the call actions that are still pending. For action identifiers \(\mathtt{o}_1,\mathtt{o}_2\), we write \(\mathtt{o}_1\preceq _\mathtt{h}\mathtt{o}_2\) to denote that the return action with identifier \(\mathtt{o}_1\) occurs before the call action with identifier \(\mathtt{o}_2\) in \(h\). A complete history is sequential if it is of the form \(a_1a'_1a_2a'_2\cdots a_na'_n\) where \(a'_i\) is the matching action of \(a_i\) for all \(i:1\le i\le n\), i.e., each call action is immediately followed by its matching return action. 
We identify a sequential history of the above form with the corresponding trace \({ op}_1{ op}_2\cdots { op}_n\) where \({ op}_i=\mathtt{m}(d^{ in}_i,d^{ out}_i)\), \(a_i=\mathtt {call}_{\mathtt{o}_i}\; \mathtt{m}\left( d^{ in}_{i}\right) \), and \(a'_i=\mathtt {ret}_{\mathtt{o}_i}\; \mathtt{m}\left( d^{ out}_{i}\right) \), i.e., we merge each call action together with the matching return action into one operation. A complete history \(h'\) is a linearization of \(h\) if (i) \(h'\) is a permutation of \(h\), (ii) \(h'\) is sequential, and (iii) \(\mathtt{o}_1\preceq _{\mathtt{h}'}\mathtt{o}_2\) if \(\mathtt{o}_1\preceq _{\mathtt{h}}\mathtt{o}_2\) for each pair of action identifiers \(\mathtt{o}_1\) and \(\mathtt{o}_2\). A sequential history \(h'\) is valid wrt. \(\mathtt{DS}\) if the corresponding trace is in \([\![{\mathtt{DS}}]\!]\). We say that \(h\) is linearizable wrt. \(\mathtt{DS}\) if there is a completed extension of \(h\), which has a linearization that is valid wrt. \(\mathtt{DS}\). We say that a program \({\mathcal {P}}\) is linearizable wrt. \(\mathtt{DS}\) if, in each possible execution, the sequence of call and return actions is linearizable wrt. \(\mathtt{DS}\).
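For small complete histories, this definition can be executed directly: enumerate the sequential permutations, keep those that preserve the precedence order \(\preceq _\mathtt{h}\), and test validity against the data structure semantics. The following brute-force sketch (ours, exponential, for illustration only) encodes each operation together with the positions of its call and return actions:

```python
from itertools import permutations

def linearizable(history, valid):
    """history: list of (op, call_pos, ret_pos) entries, where op is a
    merged (m, d_in, d_out) operation and call_pos/ret_pos give the
    positions of its call and return actions in the complete history.
    `valid` decides membership of an operation trace in [[DS]]."""
    for perm in permutations(history):
        # (iii) respect precedence: if b returned before a was called,
        # b must precede a in the linearization
        if all(not (b[2] < a[1]) for i, a in enumerate(perm)
               for b in perm[i + 1:]):
            if valid([op for (op, _, _) in perm]):
                return True
    return False
```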
We specify linearizability using the technique of observers [1, 3, 7, 9]. Depending on the data structure, we apply it in two different ways.

For implementations of sets and priority queues, the user instruments each method so that it announces a corresponding operation precisely when the method executes its linearization point (LP), either directly or with lightweight instrumentation using the technique of linearization policies [3]. We represent such announcements by labels on the program transition relation \({\xrightarrow {}{}}_{\!\!{\mathcal {P}}}\), resulting in transitions of form \(c_{{\mathcal {P}}}{\xrightarrow {\mathtt{m}(d^{ in},d^{ out})}{}}_{\!\!{\mathcal {P}}}c_{{\mathcal {P}}}'\). Thereafter, an observer is constructed, which monitors the sequence of operations that is announced by the instrumentation; it reports (by moving to an accepting error location) whenever this sequence violates the (sequential) semantics of the data structure.

For stacks and queues, we use a recent result [7, 9] that the set of linearizable histories, i.e., sequences of call and return actions, can be exactly specified by an observer. Thus, linearizability can be specified without any user-supplied instrumentation, by using an observer which monitors the sequences of call and return actions and reports violations of linearizability.
Formally, an observer \({\mathcal {O}}\) is a tuple \(\left\langle S^{\mathcal {O}},s^{\mathcal {O}}_\mathtt{init},\mathtt{X}^{{\mathcal {O}}},\varDelta ^{\mathcal {O}},s^{\mathcal {O}}_\mathtt{acc}\right\rangle \) where \(S^{\mathcal {O}}\) is a finite set of observer locations including the initial location \(s^{\mathcal {O}}_\mathtt{init}\) and the accepting location \(s^{\mathcal {O}}_\mathtt{acc}\), \(\mathtt{X}^{{\mathcal {O}}}\) is a finite set of registers, and \(\varDelta ^{\mathcal {O}}\) is a finite set of transitions. For observers that monitor sequences of operations, transitions are of the form \(\left\langle s_1,\mathtt{m}(x^{ in},x^{ out}),s_2\right\rangle \), where \(\mathtt{m}\) is a method name and \(x^{ in}\) and \(x^{ out}\) are either registers or constants, i.e., transitions are labeled by operations whose input or output data may be parameterized on registers. The observer processes a sequence of operations one operation at a time. If there is a transition, whose label (after replacing registers by their values) matches the operation, such a transition is performed. If there is no such transition, the observer remains in its current location. The observer accepts a sequence if it can be processed in such a way that an accepting location is reached. The observer is defined in such a way that it accepts precisely those sequences that are not in \([\![{\mathtt{DS}}]\!]\). Figure 6 depicts an observer for the set data structure.
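The observer semantics can be sketched in a few lines of Python. This is an illustrative prototype, not the paper's implementation: transitions and operations use an ad-hoc tuple encoding, register values are guessed only from the data occurring in the monitored sequence, and when several transitions match, one is chosen deterministically.

```python
from itertools import product

def observer_accepts(trans, s_init, s_acc, registers, seq):
    """Does the observer accept the operation sequence seq?
    trans: set of (s1, (m, x_in, x_out), s2), where x_in/x_out are
    register names (strings) or concrete values. Registers are assigned
    nondeterministically; we approximate this by trying every assignment
    built from the data values occurring in seq."""
    data = {d for (m, din, dout) in seq for d in (din, dout)}
    def matches(lbl, op, rho):
        m, xin, xout = lbl
        deref = lambda x: rho.get(x, x) if isinstance(x, str) else x
        return (m, deref(xin), deref(xout)) == op
    for vals in product(sorted(data, key=repr), repeat=len(registers)):
        rho = dict(zip(sorted(registers), vals))
        s = s_init
        for op in seq:
            # take a matching transition if one exists, else stay put
            nxt = [s2 for (s1, lbl, s2) in trans
                   if s1 == s and matches(lbl, op, rho)]
            s = nxt[0] if nxt else s
        if s == s_acc:
            return True
    return False
```

A toy observer in this encoding, with one register \(\mathtt{x}\), can for instance flag two successful \(\mathtt{rmv}(\mathtt{x})\) operations with no intervening successful \(\mathtt{add}(\mathtt{x})\), which violates set semantics.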
To check that no execution of the program announces a sequence of labels that can drive the observer to an accepting location, we form the cross-product \({\mathcal {S}}={\mathcal {P}}\otimes {\mathcal {O}}\) of the program \({\mathcal {P}}\) and the observer \({\mathcal {O}}\), synchronizing on common transition labels. Thus, configurations of \({\mathcal {S}}\) are of the form \(\left\langle c_{{\mathcal {P}}},\left\langle s,\rho \right\rangle \right\rangle \), consisting of a program configuration \(c_{{\mathcal {P}}}\), an observer location \(s\), and an assignment \(\rho \) of data values to the observer registers. Transitions of \({\mathcal {S}}\) are of the form \(\left\langle c_{{\mathcal {P}}},\left\langle s,\rho \right\rangle \right\rangle {\xrightarrow {}{}}_{\!\!{\mathcal {S}}}\left\langle {c_{{\mathcal {P}}}}',\left\langle s',\rho \right\rangle \right\rangle \), obtained from a transition \(c_{{\mathcal {P}}}{\xrightarrow {\lambda }{}}_{\!\!{\mathcal {P}}} {c_{{\mathcal {P}}}}'\) of the program with some (possibly empty) label \(\lambda \), where the observer makes a transition \(s{\xrightarrow {\lambda }{}} {s}'\) if it can perform such a matching transition, otherwise \(s' = s\). Note that the observer registers are not changed. We also add straightforward instrumentation to check that each method invocation announces exactly one operation, whose input and output values agree with the method’s parameters and return value. This reduces the problem of checking linearizability to the problem of checking that in this cross-product, the observer cannot reach an accepting error location.
4 Verification Using Fragment Abstraction for Skiplists
In the previous section, we reduced the problem of verifying linearizability to the problem of verifying that, in any execution of the cross-product of a program and an observer, the observer cannot reach an accepting location. We perform this verification by computing a symbolic representation of an invariant that is satisfied by all reachable configurations of the cross-product, using an abstract interpretation-based fixpoint procedure: starting from a symbolic representation of the set of initial configurations, we repeatedly perform symbolic postcondition computations that extend the symbolic representation by the effect of any execution step of the program, until convergence.
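The shape of this fixpoint procedure can be sketched generically. The sketch below assumes only that symbolic elements are hashable and that `post` over-approximates the effect of one execution step; the actual postcondition operation of Sect. 4.2 is far more involved.

```python
def reach_fixpoint(init, post):
    """Least-fixpoint loop: start from the symbolic representation of
    the initial configurations and keep adding the symbolic
    postcondition of the current representation until it stops growing."""
    rep = frozenset(init)
    while True:
        new = rep | frozenset(post(rep))
        if new == rep:          # convergence: rep is closed under post
            return rep
        rep = new
```

For example, with `post = lambda S: {(x + 1) % 5 for x in S}` and initial set `{0}`, the loop converges to the full invariant `{0, 1, 2, 3, 4}`.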
In Sect. 4.1, we define in more detail our symbolic representation for skiplists, focusing in particular on the use of fragment abstraction, and thereafter (in Sect. 4.2) describe the symbolic postcondition computation. Since singly-linked lists are a trivial special case of skiplists, we can use the relevant part of this technique also for programs based on singly-linked lists.
4.1 Symbolic Representation
This subsection contains a more detailed description of our symbolic representation for programs that operate on skiplists, which was introduced in Sect. 2. We first describe the data abstraction, thereafter the fragment abstraction, and finally their combination into a symbolic representation.
Data Abstraction. Our data abstraction is defined by assigning an abstract domain to each concrete domain of data values, as follows.

For small concrete domains (including that of the program counter, and of the observer location), the abstract domain is the same as the concrete one.

For locks, the abstract domain is \(\left\{ me , other , free \right\} \), meaning that the lock is held by the concerned thread, held by some other thread, or is free, respectively.

For the concrete domain of data values, the abstract domain is the set of mappings from observer registers and local variables ranging over data values to subsets of \(\left\{ <,=,>\right\} \). A mapping in this abstract domain represents the set of data values \(\mathtt{d}\) such that it maps each local variable and observer register with value \(\mathtt{d}'\) to a set which includes a relation \(\sim \) such that \(\mathtt{d} \sim \mathtt{d}'\).
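As a sketch, the most precise such abstraction of a concrete value, and the corresponding concretization check, can be written as follows (the dict encoding and variable names are illustrative; abstract elements in general carry larger relation sets than the singletons produced here).

```python
def abstract_value(d, env):
    """Most precise abstraction of concrete value d relative to a
    valuation env of observer registers and data-valued local variables:
    map each variable to the subset of {'<', '=', '>'} relating d to it."""
    return {v: {'<' if d < dv else '>' if d > dv else '='}
            for v, dv in env.items()}

def concretizes(dabs, d, env):
    """Does concrete value d (under valuation env) satisfy the abstract
    mapping dabs? d must relate to each variable's value by some
    relation in the corresponding set."""
    rel = lambda a, b: '<' if a < b else '>' if a > b else '='
    return all(rel(d, env[v]) in rels for v, rels in dabs.items())
```

For instance, the value 5 under the valuation `{'x': 3, 'y': 5}` abstracts to `{'x': {'>'}, 'y': {'='}}`.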
Since the number of levels is unbounded, we define an abstraction for levels. Let \(\mathtt{k}\) be a level. Define the abstraction of a pointer variable of form \(\mathtt{p[k]}\), denoted \(\widehat{\mathtt {p[k]}}\), to be \(\mathtt{p[1]}\) if \(\mathtt{k}= 1\), and to be \(\mathtt{p[higher]}\) if \(\mathtt{k}\ge 2\). That is, this abstraction does not distinguish different higher levels.
A tag is a tuple \(\mathtt {tag}= \left\langle \mathtt {dabs},\mathtt {pvars},\mathtt {reachfrom},\mathtt {reachto},\mathtt {private}\right\rangle \), where (i) \(\mathtt {dabs}\) is a mapping from non-pointer fields to their corresponding abstract domains; if a non-pointer field is an array indexed by levels, then the abstract domain is that for single elements: e.g., the abstract domain for the array \(\mathtt{marked}\) in Fig. 2 is simply the set of booleans, (ii) \(\mathtt {pvars}\) is a set of abstracted pointer variables, (iii) \(\mathtt {reachfrom}\) and \(\mathtt {reachto}\) are sets of global pointer variables and observer registers, and (iv) \(\mathtt {private}\) is a boolean value. Such a tag abstracts a cell \({\mathbbm {c}}\) wrt. a thread \(\mathtt{th}\) and a level \(\mathtt{k}\) as follows:

\(\mathtt {dabs}\) is an abstraction of the concrete values of the nonpointer fields of \({\mathbbm {c}}\); for array fields \(\mathtt{f}\) we use the concrete value \(\mathtt{f[k]}\),

\(\mathtt {pvars}\) is the set of abstractions of pointer variables (global or local to \(\mathtt{th}\)) that point to \({\mathbbm {c}}\),

\(\mathtt {reachfrom}\) is the set of (i) abstractions of global pointer variables from which \({\mathbbm {c}}\) is reachable via a (possibly empty) sequence of \(\mathtt{next[1]}\) pointers, and (ii) observer registers \(\mathtt {x}_i\) such that \({\mathbbm {c}}\) is reachable from some \(\mathtt {x}_i\)cell (via a sequence of \(\mathtt{next[1]}\) pointers),

\(\mathtt {reachto}\) is the set of (i) abstractions of global pointer variables pointing to a cell that is reachable (via a sequence of \(\mathtt{next[1]}\) pointers) from \({\mathbbm {c}}\), and (ii) observer registers \(\mathtt {x}_i\) such that some \(\mathtt {x}_i\)cell is reachable from \({\mathbbm {c}}\).

\(\mathtt {private}\) is \(\mathtt{true}\) only if \({\mathbbm {c}}\) is not accessible to any other thread than \(\mathtt{th}\).
Note that the global information represented by the fields \(\mathtt {reachfrom}\) and \(\mathtt {reachto}\) concerns only reachability via level1 pointers.
A skiplist fragment \(\mathtt {v}\) (or just fragment) is a triple of form \(\left\langle \mathtt {i},\mathtt {o},\phi \right\rangle \), or a pair of form \(\left\langle \mathtt {i},\mathtt{null}\right\rangle \) or \(\left\langle \mathtt {i},\bot \right\rangle \), where \(\mathtt {i}\) and \(\mathtt {o}\) are tags and \(\phi \) is a subset of \(\left\{ <, =, >\right\} \). Each skiplist fragment additionally has a type, which is either level-1 or higher-level (note that a level-1 fragment can otherwise be identical to a higher-level fragment). For a cell \({\mathbbm {c}}\) which is accessible to thread \(\mathtt{th}\), and a fragment \(\mathtt {v}\) of form \(\left\langle \mathtt {i},\mathtt {o},\phi \right\rangle \), let \({\mathbbm {c}}\lhd _{\mathtt{th},\mathtt{k}}^{c_{\mathcal {S}}}\mathtt {v}\) denote that the \(\mathtt{next[k]}\) field of \({\mathbbm {c}}\) points to a cell \({\mathbbm {c}}'\) such that \({\mathbbm {c}}\lhd _{\mathtt{th},\mathtt{k}}^{c_{\mathcal {S}}} \mathtt {i}\), and \({\mathbbm {c}}' \lhd _{\mathtt{th},\mathtt{k}}^{c_{\mathcal {S}}} \mathtt {o}\), and \({\mathbbm {c}}.\mathtt {data} \sim {\mathbbm {c}}'.\mathtt {data}\) for some \(\sim \in \phi \). The definition of \({\mathbbm {c}}\lhd _{\mathtt{th},\mathtt{k}}^{c_{\mathcal {S}}}\mathtt {v}\) is adapted to fragments of form \(\left\langle \mathtt {i},\mathtt{null}\right\rangle \) and \(\left\langle \mathtt {i},\bot \right\rangle \) in the obvious way. For a fragment \(\mathtt {v}= \left\langle \mathtt {i},\mathtt {o},\phi \right\rangle \), we often use \(\mathtt {v}.\mathtt {i}\) for \(\mathtt {i}\) and \(\mathtt {v}.\mathtt {o}\) for \(\mathtt {o}\), etc. For a set \(V\) of fragments, we write \(c_{\mathcal {S}}\models _{\mathtt{th}}^{ heap } V\) to denote that the following conditions hold:

for any cell \({\mathbbm {c}}\) that is accessible to \(\mathtt{th}\) (different from \(\mathtt{null}\) and \(\bot \)), there is a level-1 fragment \(\mathtt {v}\in V\) such that \({\mathbbm {c}}\lhd _{{\mathtt{th}},1}^{c_{\mathcal {S}}} \mathtt {v}\), and

for all levels \(\mathtt{k}\) from 2 up to the height of \({\mathbbm {c}}\), there is a higher-level fragment \(\mathtt {v}\in V\) such that \({\mathbbm {c}}\lhd _{{\mathtt{th}},\mathtt{k}}^{c_{\mathcal {S}}} \mathtt {v}\).
Intuitively, a set of fragments represents the set of heap states in which each pair of cells connected by a \(\mathtt{next[1]}\) pointer is represented by a level-1 fragment, and each pair of cells connected by a \(\mathtt{next[k]}\) pointer for \(\mathtt{k}\ge 2\) is represented by a higher-level fragment, which abstracts the array fields of cells at index \(\mathtt{k}\).
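A possible encoding of tags and fragments, together with the satisfaction check for a pair of cells connected by a \(\mathtt{next[k]}\) pointer, is sketched below. The concrete-cell representation (dicts with a `data` field) and the abstraction function `tag_of` are assumptions of this sketch, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Tag:
    dabs: tuple                 # abstracted non-pointer fields, e.g. (('marked', False),)
    pvars: FrozenSet[str]       # abstracted pointer variables pointing to the cell
    reachfrom: FrozenSet[str]   # globals/registers from which the cell is reachable
    reachto: FrozenSet[str]     # globals/registers reachable from the cell
    private: bool

@dataclass(frozen=True)
class Fragment:
    i: Tag
    o: object                   # a Tag, or the special values 'null' / 'bot'
    phi: FrozenSet[str]         # subset of {'<', '=', '>'}
    level1: bool                # type: level-1 or higher-level

def pair_satisfies(cell, succ, frag, tag_of):
    """Does the pair (cell, succ), where succ is cell.next[k], satisfy
    frag? tag_of abstracts a concrete cell into a Tag; its definition
    depends on the chosen heap encoding."""
    if frag.o in ('null', 'bot'):
        return succ == frag.o and tag_of(cell) == frag.i
    rel = ('<' if cell['data'] < succ['data'] else
           '>' if cell['data'] > succ['data'] else '=')
    return (tag_of(cell) == frag.i and tag_of(succ) == frag.o
            and rel in frag.phi)
```

Frozen dataclasses make tags and fragments hashable, so fragment sets can be stored in the symbolic representation directly.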
Symbolic Representation. We can now define our abstract symbolic representation.
Define a local symbolic configuration \(\sigma \) to be a mapping from local non-pointer variables (including the program counter) to their corresponding abstract domains. We let \(c_{\mathcal {S}}\models _{\mathtt{th}}^{ loc } \sigma \) denote that in the global configuration \(c_{\mathcal {S}}\), the local configuration of thread \(\mathtt{th}\) satisfies the local symbolic configuration \(\sigma \), defined in the natural way. For a local symbolic configuration \(\sigma \), an observer location \(s\), a set \(V\) of fragments, and a thread \(\mathtt{th}\), we write \(c_{\mathcal {S}}\models _{\mathtt{th}} \left\langle \sigma ,s,V\right\rangle \) to denote that (i) \(c_{\mathcal {S}}\models _{\mathtt{th}}^{ loc } \sigma \), (ii) the observer is in location \(s\), and (iii) \(c_{\mathcal {S}}\models _{\mathtt{th}}^{ heap } V\).
Definition 1
A symbolic representation \(\varPsi \) is a partial mapping from pairs of local symbolic configurations and observer locations to sets of fragments. A system configuration \(c_{\mathcal {S}}\) satisfies a symbolic representation \(\varPsi \), denoted \(c_{\mathcal {S}} \text{ sat } \varPsi \), if for each thread \(\mathtt{th}\), the domain of \(\varPsi \) contains a pair \(\left\langle \sigma ,s\right\rangle \) such that \(c_{\mathcal {S}}\models _{\mathtt{th}} \left\langle \sigma ,s,\varPsi (\left\langle \sigma ,s\right\rangle )\right\rangle \).
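Definition 1 translates into a short check. In the sketch below, \(\varPsi \) is a dict from \(\left\langle \sigma ,s\right\rangle \) pairs to frozensets of fragments, and the relation \(c_{\mathcal {S}}\models _{\mathtt{th}} \left\langle \sigma ,s,V\right\rangle \) is supplied as a callback `models`, since its implementation depends on the heap encoding; the configuration layout `c['threads']` is likewise an assumption of the sketch.

```python
def sat(c, psi, models):
    """c sat Psi: for every thread th of configuration c, the domain of
    Psi contains a pair (sigma, s) such that th satisfies
    <sigma, s, Psi(sigma, s)> in c. `models` encapsulates that
    satisfaction check."""
    return all(any(models(c, th, sigma, s, V)
                   for (sigma, s), V in psi.items())
               for th in c['threads'])
```

Note the quantifier structure: *each* thread must be covered by *some* entry of \(\varPsi \), which is exactly what makes the representation thread-modular.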
4.2 Symbolic Postcondition Computation
The symbolic postcondition computation must ensure that the symbolic representation of the reachable configurations of a program is closed under execution of a statement by some thread. That is, given a symbolic representation \(\varPsi \), the symbolic postcondition operation must produce an extension \(\varPsi '\) of \(\varPsi \), such that whenever \(c_{\mathcal {S}} \text{ sat } \varPsi \) and \(c_{\mathcal {S}}{\xrightarrow {}{}}_{\!\!{\mathcal {S}}}c_{\mathcal {S}}'\) then \({c_{\mathcal {S}}}' \text{ sat } \varPsi '\). Let \(\mathtt{th}\) be an arbitrary thread. Then \(c_{\mathcal {S}} \text{ sat } \varPsi \) means that \(Dom(\varPsi )\) contains some pair \(\left\langle \sigma ,s\right\rangle \) with \(c_{\mathcal {S}}\models _{\mathtt{th}} \left\langle \sigma ,s,\varPsi (\left\langle \sigma ,s\right\rangle )\right\rangle \). The symbolic postcondition computation must ensure that \(Dom(\varPsi ')\) contains a pair \(\left\langle \sigma ',s'\right\rangle \) such that \(c_{\mathcal {S}}' \models _{\mathtt{th}} \left\langle \sigma ',s',\varPsi '(\left\langle \sigma ',s'\right\rangle )\right\rangle \). In the thread-modular approach, there are two cases to consider, depending on which thread causes the step from \(c_{\mathcal {S}}\) to \({c_{\mathcal {S}}}'\).

Local Steps: The step is caused by \(\mathtt{th}\) itself executing a statement which may change its local state, the location of the observer, and the state of the heap. In this case, we first compute a local symbolic configuration \(\sigma '\), an observer location \(s'\), and a set \(V'\) of fragments such that \(c_{\mathcal {S}}' \models _{\mathtt{th}} \left\langle \sigma ',s',V'\right\rangle \), and then (if necessary) extend \(\varPsi \) so that \(\left\langle \sigma ',s'\right\rangle \in Dom(\varPsi )\) and \( V' \subseteq \varPsi (\left\langle \sigma ',s'\right\rangle )\).

Interference Steps: The step is caused by another thread \(\mathtt{th}_2\), executing a statement which may change the location of the observer (to \(s'\)) and the heap. By \(c_{\mathcal {S}} \text{ sat } \varPsi \) there is a local symbolic configuration \(\sigma _2\) with \(\left\langle \sigma _2,s\right\rangle \in Dom(\varPsi )\) such that \(c_{\mathcal {S}}\models _{\mathtt{th}_2} \left\langle \sigma _2,s,\varPsi (\left\langle \sigma _2,s\right\rangle )\right\rangle \). For any such \(\sigma _2\) and statement of \(\mathtt{th}_2\), we must compute a set \(V'\) of fragments such that the resulting configuration \({c_{\mathcal {S}}}'\) satisfies \(c_{\mathcal {S}}' \models _{\mathtt{th}}^{ heap } V'\) and ensure that \(\left\langle \sigma ,s'\right\rangle \in Dom(\varPsi )\) and \(V' \subseteq \varPsi (\left\langle \sigma ,s'\right\rangle )\). To do this, we first combine the local symbolic configurations \(\sigma \) and \(\sigma _2\) and the sets of fragments \(\varPsi (\left\langle \sigma ,s\right\rangle )\) and \(\varPsi (\left\langle \sigma _2,s\right\rangle )\), using an operation called intersection, into a joint local symbolic configuration of \(\mathtt{th}\) and \(\mathtt{th}_2\) and a set \(V_{1,2}\) of fragments that represents the cells accessible to either \(\mathtt{th}\) or \(\mathtt{th}_2\). We thereafter symbolically compute the postcondition of the statement executed by \(\mathtt{th}_2\), in the same way as for local steps, and finally project the set of resulting fragments back onto \(\mathtt{th}\) to obtain \(V'\).
In the following, we first describe the symbolic postcondition computation for local steps, and thereafter the intersection operation.
Symbolic Postcondition Computation for Local Steps. Let \(\mathtt{th}\) be an arbitrary thread, assume that \(\left\langle \sigma ,s\right\rangle \in Dom(\varPsi )\), and let \(V= \varPsi (\left\langle \sigma ,s\right\rangle )\). For each statement that \(\mathtt{th}\) can execute in a configuration \(c_{\mathcal {S}}\) with \(c_{\mathcal {S}}\models _{\mathtt{th}} \left\langle \sigma ,s,V\right\rangle \), we must compute a local symbolic configuration \(\sigma '\), a new observer location \(s'\), and a set \(V'\) of fragments such that the resulting configuration \({c_{\mathcal {S}}}'\) satisfies \(c_{\mathcal {S}}' \models _{\mathtt{th}} \left\langle \sigma ',s',V'\right\rangle \). This computation is done differently for each statement. For statements that do not affect the heap or pointer variables, this computation is standard, and affects only the local symbolic configuration, the observer location, and the \(\mathtt {dabs}\) component of tags. We therefore describe here how to compute the effect of statements that update pointer variables or pointer fields of heap cells, since these are the most interesting cases. In this computation, the set \(V'\) is constructed in two steps: (1) first, the level-1 fragments of \(V'\) are computed, based on the level-1 fragments in \(V\); (2) thereafter, the higher-level fragments of \(V'\) are computed, based on the higher-level fragments in \(V\) and on how fragments in \(V\) are transformed when entered into \(V'\). We first describe the construction of level-1 fragments, and thereafter the construction of higher-level fragments. For fragments \(\mathtt {v}_1, \mathtt {v}_2 \in V\), we use the following relations:

let \(\mathtt {v}_1 \hookrightarrow _{V} \mathtt {v}_2\) denote that \(\mathtt {v}_1.\mathtt {o}\) and \(\mathtt {v}_2.\mathtt {i}\) are consistent, and

let \(\mathtt {v}_1 \leftrightarrow _{V} \mathtt {v}_2\) denote that \(\mathtt {v}_1.\mathtt {o}\) and \(\mathtt {v}_2.\mathtt {o}\) are consistent, and that either \(\mathtt {v}_1.\mathtt {i}.\mathtt {pvars} \cap \mathtt {v}_2.\mathtt {i}.\mathtt {pvars} = \emptyset \) or the global variables in \(\mathtt {v}_1.\mathtt {i}.\mathtt {reachfrom}\) are disjoint from those in \(\mathtt {v}_2.\mathtt {i}.\mathtt {reachfrom}\).

\(\overset{+}{\hookrightarrow }_{V}\) denotes the transitive closure, and \(\overset{*}{\hookrightarrow }_{V}\) the reflexive transitive closure, of \(\hookrightarrow _{V}\),

\(\mathtt {v}_1 \! \overset{**}{\leftrightarrow }_{V} \! \mathtt {v}_2\) denotes that \(\exists \mathtt {v}_1',\mathtt {v}_2' \!\in \! V\) with \(\mathtt {v}_1' \! \leftrightarrow _{V} \! \mathtt {v}_2'\) where \(\mathtt {v}_1 \! \overset{*}{\hookrightarrow }_{V} \mathtt {v}_1' \!\) and \(\mathtt {v}_2 \overset{*}{\hookrightarrow }_{V} \! \mathtt {v}_2'\),

\(\mathtt {v}_1 \! \overset{*+}{\leftrightarrow }_{V} \! \mathtt {v}_2\) denotes that \(\exists \mathtt {v}_1', \mathtt {v}_2' \!\in \! V\) with \(\mathtt {v}_1' \! \leftrightarrow _{V} \! \mathtt {v}_2'\) where \(\mathtt {v}_1 \! \overset{*}{\hookrightarrow }_{V} \mathtt {v}_1' \!\) and \(\mathtt {v}_2 \overset{+}{\hookrightarrow }_{V} \! \mathtt {v}_2'\),

\(\mathtt {v}_1 \! \overset{*\circ }{\leftrightarrow }_{V} \! \mathtt {v}_2\) denotes that \(\exists \mathtt {v}_1' \in V\) with \(\mathtt {v}_1' \! \leftrightarrow _{V} \! \mathtt {v}_2\) where \(\mathtt {v}_1 \! \overset{*}{\hookrightarrow }_{V} \mathtt {v}_1'\),

\(\mathtt {v}_1 \! \overset{++}{\leftrightarrow }_{V} \! \mathtt {v}_2\) denotes that \(\exists \mathtt {v}_1',\mathtt {v}_2' \!\in \! V\) with \(\mathtt {v}_1' \! \leftrightarrow _{V} \! \mathtt {v}_2'\) where \(\mathtt {v}_1 \! \overset{+}{\hookrightarrow }_{V} \mathtt {v}_1' \!\) and \(\mathtt {v}_2 \overset{+}{\hookrightarrow }_{V} \! \mathtt {v}_2'\),

\(\mathtt {v}_1 \! \overset{+\circ }{\leftrightarrow }_{V} \! \mathtt {v}_2\) denotes that \(\exists \mathtt {v}_1' \in V\) with \(\mathtt {v}_1' \! \leftrightarrow _{V} \! \mathtt {v}_2\) where \(\mathtt {v}_1 \! \overset{+}{\hookrightarrow }_{V} \mathtt {v}_1'\).
We sometimes use, e.g., \(\mathtt {v}_2 \! \overset{+*}{\leftrightarrow }_{V} \! \mathtt {v}_1\) for \(\mathtt {v}_1 \! \overset{*+}{\leftrightarrow }_{V} \! \mathtt {v}_2\). We say that \(\mathtt {v}_1\) and \(\mathtt {v}_2\) are compatible if \(\mathtt {v}_1 \overset{*}{\hookrightarrow }_{V} \mathtt {v}_2\), or \(\mathtt {v}_2 \overset{*}{\hookrightarrow }_{V} \mathtt {v}_1\), or \(\mathtt {v}_1 \overset{**}{\leftrightarrow }_{V} \mathtt {v}_2\). Intuitively, if \(\mathtt {v}_1\) and \(\mathtt {v}_2\) are satisfied by two cells in the same heap state, then they must be compatible.
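These derived relations can be computed by standard closure iterations. The sketch below takes the basic relations \(\hookrightarrow _{V}\) and \(\leftrightarrow _{V}\) as explicit sets of fragment pairs; how those basic relations are decided from tags is left abstract here.

```python
def closures(V, hook, bihook):
    """Derived relations over fragment set V from the basic relations
    hook (the one-step relation) and bihook (the convergence relation),
    both given as sets of pairs. Returns the reflexive-transitive
    closure, the transitive closure, the **-relation, and the
    compatibility relation."""
    # reflexive-transitive closure of hook
    star = {(v, v) for v in V} | set(hook)
    while True:
        new = {(a, d) for (a, b) in star for (c, d) in star if b == c}
        if new <= star:
            break
        star |= new
    # transitive closure: at least one hook step, then star
    plus = {(a, c) for (a, b) in hook for (b2, c) in star if b == b2}
    # v1 **-related to v2 iff v1 ->* v1', v2 ->* v2' with v1' <-> v2'
    sym_bihook = set(bihook) | {(b, a) for (a, b) in bihook}
    bistar = {(v1, v2) for (v1, a) in star for (v2, b) in star
              if (a, b) in sym_bihook}
    compatible = star | {(b, a) for (a, b) in star} | bistar
    return star, plus, bistar, compatible
```

Since a symbolic representation holds only finitely many fragments, these iterations always terminate.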
Figure 7 illustrates the above relations for a heap state with 13 heap cells. The figure depicts, in green, four pairs of heap cells connected by a \(\mathtt{next[1]}\) pointer, which satisfy the four fragments \(\mathtt {v}_1\), \(\mathtt {v}_2\), \(\mathtt {v}_3\), and \(\mathtt {v}_4\), respectively. At the bottom are depicted the transitiveclosure like relations that hold between these fragments.
We can now describe the symbolic postcondition computation for statements that affect pointer variables or fields. This is a case analysis, and for space reasons we only include some representative cases.
First, consider a statement of form \(\mathtt{x := y}\), where \(\mathtt {x}\) and \(\mathtt {y}\) are local (to thread \(\mathtt{th}\)) or global pointer variables. We must compute a set \(V'\) of fragments which are satisfied by the configuration after the statement. We first compute the level-1 fragments in \(V'\) as follows (higher-level fragments will be computed later). We observe that for any cell \({\mathbbm {c}}\) which is accessible to \(\mathtt{th}\) after the statement, there must be some level-1 fragment \(\mathtt {v}'\) in \(V'\) with \({\mathbbm {c}}\lhd _{\mathtt{th},1}^{c_{\mathcal {S}}} \mathtt {v}'\). By assumption, \({\mathbbm {c}}\) satisfies some fragment \(\mathtt {v}\) in \(V\) before the statement, and is in the same heap state as the cell pointed to by \(\mathtt {y}\). This implies that \(\mathtt {v}\) must be compatible with some fragment \(\mathtt {v}_y\in V\) such that \(\widehat{\mathtt {y}} \in \mathtt {v}_y.\mathtt {i}.\mathtt {pvars}\) (recall that \(\widehat{\mathtt {y}}\) is the abstraction of \(\mathtt {y}\), which in the case that \(\mathtt {y}\) is an array element maps higher-level indices to the abstract index \(\mathtt{higher}\)). This means that we can make a case analysis on the possible relationships between \(\mathtt {v}\) and any such \(\mathtt {v}_y\). Thus, for each fragment \(\mathtt {v}_y\in V\) such that \(\widehat{\mathtt {y}} \in \mathtt {v}_y.\mathtt {i}.\mathtt {pvars}\) we let \(V'\) contain the fragments obtained by any of the following transformations on any fragment in \(V\).
 1.First, for the fragment \(\mathtt {v}_y\) itself, we let \(V'\) contain \(\mathtt {v}_y'\), which is the same as \(\mathtt {v}_y\), except that

\(\mathtt {v}_y'.\mathtt {i}.\mathtt {pvars} = \mathtt {v}_y.\mathtt {i}.\mathtt {pvars} \cup \left\{ \widehat{\mathtt {x}}\right\} \) and \(\mathtt {v}_y'.\mathtt {o}.\mathtt {pvars} = \mathtt {v}_y.\mathtt {o}.\mathtt {pvars} \setminus \left\{ \widehat{\mathtt {x}}\right\} \),

and furthermore, if \(\mathtt {x}\) is a global variable, then
\(\mathtt {v}_y'.\mathtt {i}.\mathtt {reachto} = \mathtt {v}_y.\mathtt {i}.\mathtt {reachto} \cup \left\{ \widehat{\mathtt {x}}\right\} \) and \(\mathtt {v}_y'.\mathtt {i}.\mathtt {reachfrom} = \mathtt {v}_y.\mathtt {i}.\mathtt {reachfrom} \cup \left\{ \widehat{\mathtt {x}}\right\} \),

\(\mathtt {v}_y'.\mathtt {o}.\mathtt {reachfrom} = \mathtt {v}_y.\mathtt {o}.\mathtt {reachfrom} \cup \left\{ \widehat{\mathtt {x}}\right\} \) and \(\mathtt {v}_y'.\mathtt {o}.\mathtt {reachto} = \mathtt {v}_y.\mathtt {o}.\mathtt {reachto} \setminus \left\{ \widehat{\mathtt {x}}\right\} \).

 2.For each \(\mathtt {v}\) with \(\mathtt {v}\hookrightarrow _{V} \mathtt {v}_y\), let \(V'\) contain \(\mathtt {v}'\) which is the same as \(\mathtt {v}\) except that

\(\mathtt {v}'.\mathtt {i}.\mathtt {pvars} = \mathtt {v}.\mathtt {i}.\mathtt {pvars} \setminus \left\{ \widehat{\mathtt {x}}\right\} \),

\(\mathtt {v}'.\mathtt {o}.\mathtt {pvars} = \mathtt {v}.\mathtt {o}.\mathtt {pvars} \cup \left\{ \widehat{\mathtt {x}}\right\} \),

\(\mathtt {v}'.\mathtt {i}.\mathtt {reachfrom} = \mathtt {v}.\mathtt {i}.\mathtt {reachfrom} \setminus \left\{ \widehat{\mathtt {x}}\right\} \) if \(\mathtt{x}\) is a global variable,

\(\mathtt {v}'.\mathtt {i}.\mathtt {reachto} = \mathtt {v}.\mathtt {i}.\mathtt {reachto} \cup \left\{ \widehat{\mathtt {x}}\right\} \) if \(\mathtt{x}\) is a global variable,

\(\mathtt {v}'.\mathtt {o}.\mathtt {reachfrom} = \mathtt {v}.\mathtt {o}.\mathtt {reachfrom} \cup \left\{ \widehat{\mathtt {x}}\right\} \) if \(\mathtt{x}\) is a global variable,

\(\mathtt {v}'.\mathtt {o}.\mathtt {reachto} = \mathtt {v}.\mathtt {o}.\mathtt {reachto} \cup \left\{ \widehat{\mathtt {x}}\right\} \) if \(\mathtt{x}\) is a global variable.

 3.
We perform analogous inclusions for fragments \(\mathtt {v}\) with \(\mathtt {v}\overset{+}{\hookrightarrow }_{V} \mathtt {v}_y\), \(\mathtt {v}_y \overset{*}{\hookrightarrow }_{V} \mathtt {v}\), \(\mathtt {v}_y \overset{*+}{\leftrightarrow }_{V} \mathtt {v}\), and \(\mathtt {v}_y \overset{*\circ }{\leftrightarrow }_{V} \mathtt {v}\). Here, we show only the case of \(\mathtt {v}_y \overset{*+}{\leftrightarrow }_{V} \mathtt {v}\), in which case we let \(V'\) contain \(\mathtt {v}'\) which is the same as \(\mathtt {v}\) except that \(\widehat{\mathtt {x}}\) is removed from the sets \(\mathtt {v}'.\mathtt {i}.\mathtt {pvars}\), \(\mathtt {v}'.\mathtt {o}.\mathtt {pvars}\), \(\mathtt {v}'.\mathtt {i}.\mathtt {reachfrom}\), \(\mathtt {v}'.\mathtt {i}.\mathtt {reachto}\), \(\mathtt {v}'.\mathtt {o}.\mathtt {reachfrom}\), and \(\mathtt {v}'.\mathtt {o}.\mathtt {reachto}\).
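As an illustration, the transformation of \(\mathtt {v}_y\) itself (case 1 above) can be sketched with tags encoded as dicts of frozensets; this encoding is an assumption of the sketch, not the paper's implementation.

```python
def update_vy(vy, xhat, x_is_global):
    """Fragment v_y' produced from v_y by the statement x := y:
    the abstraction xhat of x now points to v_y's in-cell, and, if x is
    global, x reaches that cell and everything it reaches. Tags are
    dicts with frozenset values for 'pvars', 'reachfrom', 'reachto'."""
    i, o = dict(vy['i']), dict(vy['o'])
    i['pvars'] = i['pvars'] | {xhat}       # x now points to the in-cell
    o['pvars'] = o['pvars'] - {xhat}       # so it no longer points here
    if x_is_global:
        i['reachto'] = i['reachto'] | {xhat}
        i['reachfrom'] = i['reachfrom'] | {xhat}
        o['reachfrom'] = o['reachfrom'] | {xhat}
        o['reachto'] = o['reachto'] - {xhat}
    return {'i': i, 'o': o, 'phi': vy['phi']}
```

The remaining cases of the analysis transform fragments in the same set-manipulating style, driven by the closure relations between \(\mathtt {v}\) and \(\mathtt {v}_y\).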
Next, consider a statement of the form \(\mathtt{x.next[1] := y}\), which redirects the \(\mathtt{next[1]}\) field of the cell pointed to by \(\mathtt {x}\) to the cell pointed to by \(\mathtt {y}\). Letting \(\mathtt {v}_x\) and \(\mathtt {v}_y\) be fragments in \(V\) with \(\widehat{\mathtt {x}} \in \mathtt {v}_x.\mathtt {i}.\mathtt {pvars}\) and \(\widehat{\mathtt {y}} \in \mathtt {v}_y.\mathtt {i}.\mathtt {pvars}\), we let \(V'\) contain
 1.
the fragment \(\mathtt {v}_{new}\), representing the new pair of neighbours formed by the statement, of form \(\mathtt {v}_{new} = \left\langle \mathtt {i},\mathtt {o},\phi \right\rangle \), such that \(\mathtt {v}_{new}.\mathtt {i}.\mathtt {tag} = \mathtt {v}_x.\mathtt {i}.\mathtt {tag}\) and \(\mathtt {v}_{new}.\mathtt {o}.\mathtt {tag} = \mathtt {v}_y.\mathtt {i}.\mathtt {tag}\) except that \(\mathtt {v}_{new}.\mathtt {o}.\mathtt {reachfrom} = \mathtt {v}_y.\mathtt {i}.\mathtt {reachfrom} \cup \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}\) and \(\mathtt {v}_{new}.\mathtt {i}.\mathtt {reachto} = \mathtt {v}_y.\mathtt {i}.\mathtt {reachto} \cup \mathtt {v}_x.\mathtt {i}.\mathtt {pvars}\); the constraint represented by \(\mathtt {v}_{new}.\phi \) is obtained from the constraints represented by the data abstractions of \(\mathtt {v}_x.\mathtt {i}\) and \(\mathtt {v}_y.\mathtt {i}\), as well as the possible transitive-closure relations between \(\mathtt {v}_x\) and \(\mathtt {v}_y\), some of which imply that the data fields of \(\mathtt {v}_x\) and \(\mathtt {v}_y\) are ordered, and
 2.All possible fragments that can result from a transformation of some fragment \(\mathtt {v}\in V\). This is done by an exhaustive case analysis on the possible relationships between \(\mathtt {v}\), \(\mathtt {v}_x\) and \(\mathtt {v}_y\). Let us consider an interesting case, in which \(\mathtt {v}_x \overset{*}{\hookrightarrow }_{V} \mathtt {v}\) and either \(\mathtt {v}\overset{+}{\hookrightarrow }_{V} \mathtt {v}_y\) or \(\mathtt {v}_y \overset{*+}{\leftrightarrow }_{V} \mathtt {v}\). In this case,

for each subset \(\mathtt {regset}\) of the observer registers in \(\mathtt {v}.\mathtt {i}.\mathtt {reachfrom} \cap \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}\), and for each subset \(\mathtt {regset}'\) of the set of observer registers in \(\mathtt {v}.\mathtt {o}.\mathtt {reachfrom} \cap \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}\), we let \(V'\) contain a fragment \(\mathtt {v}'\) which is the same as \(\mathtt {v}\) except that \(\mathtt {v}'.\mathtt {i}.\mathtt {reachfrom} = (\mathtt {v}.\mathtt {i}.\mathtt {reachfrom} \setminus \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}) \cup \mathtt {regset}\) and \(\mathtt {v}'.\mathtt {o}.\mathtt {reachfrom} = (\mathtt {v}.\mathtt {o}.\mathtt {reachfrom} \setminus \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}) \cup \mathtt {regset}'\). An intuitive explanation for the rule for \(\mathtt {v}'.\mathtt {i}.\mathtt {reachfrom}\) is that the global variables that can reach \(\mathtt {v}_x.\mathtt {i}\) should clearly be removed from \(\mathtt {v}'.\mathtt {i}.\mathtt {reachfrom}\) since \(\mathtt {v}_x \overset{*}{\hookrightarrow }_{V} \mathtt {v}'\) is false after the statement. However, for an observer register \(\mathtt {x}_i\), an \(\mathtt {x}_i\)cell can still reach \(\mathtt {v}'.\mathtt {i}\), if there are two \(\mathtt {x}_i\)cells, one which reaches \(\mathtt {v}_x.\mathtt {i}\) and another which reaches \(\mathtt {v}'.\mathtt {i}\); we cannot precisely determine for which \(\mathtt {x}_i\) this may be the case, except that any such \(\mathtt {x}_i\) must be in \(\mathtt {v}.\mathtt {i}.\mathtt {reachfrom} \cap \mathtt {v}_x.\mathtt {i}.\mathtt {reachfrom}\). The intuition for the rule for \(\mathtt {v}'.\mathtt {o}.\mathtt {reachfrom}\) is analogous.
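The enumeration of register subsets in this rule can be sketched as follows, again with tags encoded as dicts of frozensets (an illustrative encoding); `registers` is the set of observer registers.

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of s, as frozensets."""
    s = sorted(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r)
                                for r in range(len(s) + 1))]

def interference_variants(v, vx, registers):
    """All fragments v' obtained from v in the case above: remove
    vx.i.reachfrom from both reachfrom sets, but re-add any subset of
    the shared observer registers, since several x_i-cells may exist."""
    out = []
    shared_i = (v['i']['reachfrom'] & vx['i']['reachfrom']) & registers
    shared_o = (v['o']['reachfrom'] & vx['i']['reachfrom']) & registers
    for ri in subsets(shared_i):
        for ro in subsets(shared_o):
            i, o = dict(v['i']), dict(v['o'])
            i['reachfrom'] = (v['i']['reachfrom']
                              - vx['i']['reachfrom']) | ri
            o['reachfrom'] = (v['o']['reachfrom']
                              - vx['i']['reachfrom']) | ro
            out.append({'i': i, 'o': o, 'phi': v['phi']})
    return out
```

The subset enumeration is the source of the (controlled) nondeterminism in this rule: one output fragment is produced per choice of \(\mathtt {regset}\) and \(\mathtt {regset}'\).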

Symbolic Postcondition Computation for Interference Steps. Here, the key step is the intersection operation, which takes two sets of fragments \(V_1\) and \(V_2\), and produces a set of joint fragments \(V_{1,2}\), such that \(c_{\mathcal {S}}\models _{\mathtt{th}_1,\mathtt{th}_2}^{ heap } V_{1,2}\) for any configuration such that \(c_{\mathcal {S}}\models _{\mathtt{th}_i}^{ heap } V_i\) for \(i=1,2\) (here \(\models _{\mathtt{th}_1,\mathtt{th}_2}^{ heap }\) is defined in the natural way). This means that for each heap cell accessible to either \(\mathtt{th}_1\) or \(\mathtt{th}_2\), the set \(V_{1,2}\) contains a fragment \(\mathtt {v}\) with \({\mathbbm {c}}\lhd _{\{\mathtt{th}_1,\mathtt{th}_2\},\mathtt{k}}^{c_{\mathcal {S}}} \mathtt {v}\) for each \(\mathtt{k}\) which is at most the height of \({\mathbbm {c}}\) (generalizing the notation \(\lhd _{\mathtt{th},\mathtt{k}}^{c_{\mathcal {S}}}\) to several threads). Note that a joint fragment represents local pointer variables of both \(\mathtt{th}_1\) and \(\mathtt{th}_2\). In order to distinguish between local variables of \(\mathtt{th}_1\) and \(\mathtt{th}_2\), we use \(\mathtt {x[i]}\) to denote a local variable \(\mathtt {x}\) of thread \(\mathtt{th}_i\). Here, we describe the intersection operation for level-1 fragments; it is analogous for higher-level fragments.
For a fragment \(\mathtt {v}\), define \(\mathtt {v}.\mathtt {i}.\mathtt {greachfrom}\) as the set of global variables in \(\mathtt {v}.\mathtt {i}.\mathtt {reachfrom}\). Define \(\mathtt {v}.\mathtt {i}.\mathtt {greachto}\), \(\mathtt {v}.\mathtt {o}.\mathtt {greachfrom}\), \(\mathtt {v}.\mathtt {o}.\mathtt {greachto}\), \(\mathtt {v}.\mathtt {i}.\mathtt {gpvars}\), and \(\mathtt {v}.\mathtt {o}.\mathtt {gpvars}\) analogously. Define \(\mathtt {v}.\mathtt {i}.\mathtt {gtag}\) as the tuple \(\left\langle \mathtt {v}.\mathtt {i}.\mathtt {dabs},\mathtt {v}.\mathtt {i}.\mathtt {gpvars},\mathtt {v}.\mathtt {i}.\mathtt {greachfrom},\mathtt {v}.\mathtt {i}.\mathtt {greachto}\right\rangle \), and define \(\mathtt {v}.\mathtt {o}.\mathtt {gtag}\) analogously. For each cell \({\mathbbm {c}}\) that is accessible to \(\mathtt{th}_1\) or \(\mathtt{th}_2\), we must distinguish the following possibilities.
 If \({\mathbbm {c}}\) is accessible to both \(\mathtt{th}_1\) and \(\mathtt{th}_2\), then there are fragments \(\mathtt {v}_1 \in V_1\) and \(\mathtt {v}_2 \in V_2\) such that \({\mathbbm {c}}\lhd _{\mathtt{th}_1,1}^{c_{\mathcal {S}}} \mathtt {v}_1\) and \({\mathbbm {c}}\lhd _{\mathtt{th}_2,1}^{c_{\mathcal {S}}} \mathtt {v}_2\). This can happen only if \(\mathtt {v}_1.\mathtt {i}.\mathtt {gtag} = \mathtt {v}_2.\mathtt {i}.\mathtt {gtag}\), and \(\mathtt {v}_1.\mathtt {o}.\mathtt {gtag} = \mathtt {v}_2.\mathtt {o}.\mathtt {gtag}\), and \(\mathtt {v}_1.\mathtt {i}.\mathtt {private} = \mathtt {v}_2.\mathtt {i}.\mathtt {private} = \mathtt{false}\). Thus, for any such pair of fragments \(\mathtt {v}_1 \in V_1\) and \(\mathtt {v}_2 \in V_2\), we let \(V_{1,2}\) contain a fragment \(\mathtt {v}_{12}\) which is identical to \(\mathtt {v}_1\) except that

\(\mathtt {v}_{12}.\mathtt {i}.\mathtt {pvars} = \mathtt {v}_1.\mathtt {i}.\mathtt {pvars} \cup \mathtt {v}_2.\mathtt {i}.\mathtt {pvars}\),

\(\mathtt {v}_{12}.\mathtt {o}.\mathtt {pvars} = \mathtt {v}_1.\mathtt {o}.\mathtt {pvars} \cup \mathtt {v}_2.\mathtt {o}.\mathtt {pvars}\),

\(\mathtt {v}_{12}.\mathtt {i}.\mathtt {reachfrom} = \mathtt {v}_{1}.\mathtt {i}.\mathtt {reachfrom} \cup \mathtt {v}_{2}.\mathtt {i}.\mathtt {reachfrom}\), and

\(\mathtt {v}_{12}.\mathtt {o}.\mathtt {reachfrom} = \mathtt {v}_{1}.\mathtt {o}.\mathtt {reachfrom} \cup \mathtt {v}_{2}.\mathtt {o}.\mathtt {reachfrom}\).

 If \({\mathbbm {c}}\) is accessible to \(\mathtt{th}_1\), but not to \(\mathtt{th}_2\), and \({\mathbbm {c}}.\mathtt{next[1]}\) is accessible also to \(\mathtt{th}_2\), then there are fragments \(\mathtt {v}_1 \in V_1\) and \(\mathtt {v}_2 \in V_2\) such that \({\mathbbm {c}}\lhd _{\mathtt{th}_1,1}^{c_{\mathcal {S}}} \mathtt {v}_1\) and \({\mathbbm {c}}.\mathtt{next[1]} \lhd _{\mathtt{th}_2,1}^{c_{\mathcal {S}}} \mathtt {v}_2.\mathtt {o}\). This can happen only if \(\mathtt {v}_1.\mathtt {i}.\mathtt {greachfrom} = \emptyset \), and \(\mathtt {v}_1.\mathtt {o}.\mathtt {gtag} = \mathtt {v}_2.\mathtt {o}.\mathtt {gtag}\), and \(\mathtt {v}_1.\mathtt {o}.\mathtt {private} = \mathtt {v}_2.\mathtt {o}.\mathtt {private} = \mathtt{false}\). Thus, for any such pair of fragments \(\mathtt {v}_1 \in V_1\) and \(\mathtt {v}_2 \in V_2\), we let \(V_{1,2}\) contain a fragment \(\mathtt {v}_{1}'\) which is identical to \(\mathtt {v}_1\) except that

\(\mathtt {v}_1'.\mathtt {o}.\mathtt {pvars} = \mathtt {v}_1.\mathtt {o}.\mathtt {pvars} \cup \mathtt {v}_2.\mathtt {o}.\mathtt {pvars}\), and

\(\mathtt {v}_1'.\mathtt {o}.\mathtt {reachfrom} = \mathtt {v}_{1}.\mathtt {o}.\mathtt {reachfrom} \cup \mathtt {v}_{2}.\mathtt {o}.\mathtt {reachfrom}\).


If neither \({\mathbbm {c}}\) nor \({\mathbbm {c}}.\mathtt{next[1]}\) is accessible to \(\mathtt{th}_2\), then there is a fragment \(\mathtt {v}_1 \in V_1\) such that \({\mathbbm {c}}\lhd _{\mathtt{th}_1,1}^{c_{\mathcal {S}}} \mathtt {v}_1\). This can happen only if \(\mathtt {v}_1.\mathtt {o}.\mathtt {greachfrom} = \emptyset \), in which case we let \(V_{1,2}\) contain the fragment \(\mathtt {v}_1\).

For each of the last two cases, there is also a symmetric case with the roles of \(\mathtt{th}_1\) and \(\mathtt{th}_2\) reversed.
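The case analysis above can be sketched in code. The following Python model is illustrative only: the record layout (`Tag`, `Frag`), the set `GLOBALS`, and the helper names are assumptions, and the data abstraction `dabs` is left opaque; it implements the matching conditions and unions described in the three cases for level-1 fragments.

```python
# Hypothetical sketch of the level-1 fragment intersection described above.
from dataclasses import dataclass, replace

GLOBALS = frozenset({"head", "pools"})  # assumed set of global pointer variables

@dataclass(frozen=True)
class Tag:
    dabs: str                            # data abstraction (left opaque here)
    pvars: frozenset = frozenset()       # pointer variables on the cell
    reachfrom: frozenset = frozenset()   # variables that reach the cell
    reachto: frozenset = frozenset()     # variables reachable from the cell
    private: bool = False

    def gpvars(self): return self.pvars & GLOBALS
    def greachfrom(self): return self.reachfrom & GLOBALS
    def greachto(self): return self.reachto & GLOBALS
    def gtag(self):  # global part of the tag, matched across threads
        return (self.dabs, self.gpvars(), self.greachfrom(), self.greachto())

@dataclass(frozen=True)
class Frag:
    i: Tag  # input tag
    o: Tag  # output tag

def intersect(V1, V2):
    """Join two per-thread fragment sets into a set of joint fragments."""
    V12 = set()
    for v1 in V1:
        for v2 in V2:
            # Case 1: the input cells of v1 and v2 may coincide.
            if (v1.i.gtag() == v2.i.gtag() and v1.o.gtag() == v2.o.gtag()
                    and not v1.i.private and not v2.i.private):
                V12.add(Frag(
                    i=replace(v1.i, pvars=v1.i.pvars | v2.i.pvars,
                              reachfrom=v1.i.reachfrom | v2.i.reachfrom),
                    o=replace(v1.o, pvars=v1.o.pvars | v2.o.pvars,
                              reachfrom=v1.o.reachfrom | v2.o.reachfrom)))
            # Case 2: only v1's output cell may coincide with v2's output cell.
            if (not v1.i.greachfrom() and v1.o.gtag() == v2.o.gtag()
                    and not v1.o.private and not v2.o.private):
                V12.add(replace(v1, o=replace(
                    v1.o, pvars=v1.o.pvars | v2.o.pvars,
                    reachfrom=v1.o.reachfrom | v2.o.reachfrom)))
            # Symmetric variant of case 2, roles of th1 and th2 reversed.
            if (not v2.i.greachfrom() and v2.o.gtag() == v1.o.gtag()
                    and not v1.o.private and not v2.o.private):
                V12.add(replace(v2, o=replace(
                    v2.o, pvars=v1.o.pvars | v2.o.pvars,
                    reachfrom=v1.o.reachfrom | v2.o.reachfrom)))
        # Case 3: v1 describes cells not accessible to th2 at all.
        if not v1.o.greachfrom():
            V12.add(v1)
    for v2 in V2:                        # symmetric variant of case 3
        if not v2.o.greachfrom():
            V12.add(v2)
    return V12
```

Note that the intersection may over-approximate: a pair of fragments is joined whenever the conditions *may* hold, which is sound since the abstraction only needs to cover all cells accessible to either thread.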
5 Arrays of Singly-Linked Lists with Timestamps
Figure 8 shows a simplified version of the Timestamped Stack (TS stack) of [12], where we have omitted the check for emptiness in the \(\mathtt{pop}\) method and the optimization using \(\mathtt{push}\)-\(\mathtt{pop}\) elimination. These features are included in the full version of the algorithm, which we have verified automatically.
The algorithm uses an array of singly-linked lists (SLLs), one for each thread, accessed via the thread-indexed array \(\mathtt{pools[maxThreads]}\) of pointers to the first cell of each list. The \(\mathtt{init}\) method initializes each of these pointers to \(\mathtt{null}\). Each list cell contains a data value, a timestamp value, a \(\mathtt{next}\) pointer, and a boolean flag \(\mathtt{mark}\) which indicates whether the cell is logically removed from the stack. Each thread pushes elements only to “its own” list, but can pop elements from any list.
A \(\mathtt{push}\) method for inserting a data element \(\mathtt{d}\) works as follows: first, a new cell with element \(\mathtt{d}\) and minimal timestamp \(\mathtt{1}\) is inserted at the beginning of the list indexed by the calling thread (lines 1–3). After that, a new timestamp is created and assigned (via the variable \(\mathtt{t}\)) to the \(\mathtt{ts}\) field of the inserted cell (lines 4–5). Finally, the method unlinks (i.e., physically removes) all cells that are reachable (through a sequence of \(\mathtt{next}\) pointers) from the inserted cell and whose \(\mathtt{mark}\) field is \(\mathtt{true}\); these cells are already logically removed. This is done by redirecting the \(\mathtt{next}\) pointer of the inserted cell to the first cell reachable from it whose \(\mathtt{mark}\) field is \(\mathtt{false}\).
A \(\mathtt{pop}\) method first traverses all lists, finding in each list the first cell whose \(\mathtt{mark}\) field is \(\mathtt{false}\) (line 8), and letting the variable \(\mathtt{youngest}\) point to the most recent such cell, i.e., the one with the largest timestamp (lines 1–11). A compare-and-swap (CAS) is used to set the \(\mathtt{mark}\) field of this youngest cell to \(\mathtt{true}\), thereby logically removing it. If the CAS fails, the procedure restarts. After the youngest cell has been removed, the method unlinks all cells whose \(\mathtt{mark}\) field is \(\mathtt{true}\) that appear before (lines 17–19) or after (lines 20–23) the removed cell. Finally, the method returns the \(\mathtt{data}\) value of the removed cell.
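To make the control flow above concrete, here is a minimal sequential Python model of the simplified algorithm. It is a sketch, not the code of Fig. 8: it omits all concurrency (the real algorithm obtains timestamps atomically and performs the logical removal with a CAS), and it simplifies the unlinking in \(\mathtt{pop}\) to dropping the marked prefix of the source list; the class and field names are illustrative.

```python
# Sequential sketch of the simplified TS stack: pools of SLLs, push with
# deferred timestamping, pop of the globally youngest unmarked cell.
class Cell:
    def __init__(self, data, ts):
        self.data, self.ts, self.next, self.mark = data, ts, None, False

class TSStackModel:
    def __init__(self, max_threads):
        self.pools = [None] * max_threads   # pools[maxThreads]: one SLL per thread
        self.clock = 1                      # stand-in for the timestamp source

    def push(self, tid, d):
        c = Cell(d, 1)                      # new cell with minimal timestamp 1
        c.next = self.pools[tid]            # insert at the head of "its own" list
        self.pools[tid] = c
        self.clock += 1
        c.ts = self.clock                   # then create and assign a fresh timestamp
        succ = c.next                       # unlink marked cells reachable from c:
        while succ is not None and succ.mark:
            succ = succ.next
        c.next = succ                       # redirect next to the first unmarked cell

    def pop(self):
        youngest, src = None, None
        for i in range(len(self.pools)):    # traverse all lists
            c = self.pools[i]
            while c is not None and c.mark: # first cell with mark == False
                c = c.next
            if c is not None and (youngest is None or c.ts > youngest.ts):
                youngest, src = c, i        # remember the most recent such cell
        if youngest is None:
            return None                     # emptiness check (omitted in Fig. 8)
        youngest.mark = True                # logical removal (a CAS in the original)
        c = self.pools[src]                 # simplified unlinking: drop the marked
        while c is not None and c.mark:     # prefix of the source list
            c = c.next
        self.pools[src] = c
        return youngest.data
```

In the sequential setting the timestamps are strictly increasing, so popping the cell with the largest timestamp yields exactly LIFO behavior; linearizability of the concurrent version is what the analysis establishes.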
Fragment Abstraction. In our verification, we establish that the TS stack algorithm of Fig. 8 is correct in the sense that it is a linearizable implementation of a stack data structure. For stacks and queues, we specify linearizability by observers that synchronize on call and return actions of methods, following [7]; this requires no user-supplied annotation, hence the verification is fully automated.
The verification is performed analogously to that for skiplists, described in Sect. 4. Here we show how fragment abstraction is used for arrays of singly-linked lists. Figure 9 shows an example heap state of the TS stack in a configuration where it is accessed concurrently by three threads \(\mathtt{th}_1\), \(\mathtt{th}_2\), and \(\mathtt{th}_3\). The heap consists of three SLLs, accessed from the three pointers \(\mathtt{pools[1]}\), \(\mathtt{pools[2]}\), and \(\mathtt{pools[3]}\) in the array \(\mathtt{pools[maxThreads]}\), respectively. Each heap cell is shown with the values of its fields, using the layout shown to the right in Fig. 9. In addition, each cell is labeled by the pointer variables that point to it. We use \(\mathtt{lvar(i)}\) to denote the local variable \(\mathtt{lvar}\) of thread \(\mathtt{th}_\mathtt{i}\).
We verify the algorithm using a symbolic representation that is analogous to the one used for skiplists. There are two main differences.

Since the array \(\mathtt{pools}\) is global, all threads can reach all lists in the heap (the only cells that cannot be reached by all threads are new cells that are not yet inserted).

We therefore represent the view of a thread by a thread-dependent abstraction of the thread indices that index the array \(\mathtt{pools}\). In the view of a thread, the index of the list where it is currently active is abstracted to \(\mathtt{me}\), and all other indices are abstracted to \(\mathtt{ot}\). The currently active index is taken to be the thread's own index for a thread performing a \(\mathtt{push}\), the value of \(\mathtt{i}\) for a thread executing in the for loop of \(\mathtt{pop}\), and the value of \(\mathtt{k}\) after that loop.
Figure 10 shows a set of fragments that is satisfied with respect to \(\mathtt{th}_2\) by the configuration in Fig. 9. There are 7 fragments, named \(\mathtt {v}_1, \ldots , \mathtt {v}_7\). Consider the tag which occurs in fragment \(\mathtt {v}_7\). This tag is an abstraction of the bottom-rightmost heap cell in Fig. 9. The different non-pointer fields are represented as follows.

The \(\mathtt{data}\) field of the tag (to the left) abstracts the data value 2 to the set of observer registers with that value: in this case \(\mathtt {x}_2\).

The \(\mathtt{ts}\) field (at the top) abstracts the timestamp value 15 to its possible relations with the \(\mathtt{ts}\) fields of heap cells having the same data value as each observer register. Recall that observer registers \(\mathtt {x}_1\) and \(\mathtt {x}_2\) have values 4 and 2, respectively. There are three heap cells with \(\mathtt{data}\) field value 4, all with a \(\mathtt{ts}\) value less than 15, and one heap cell with \(\mathtt{data}\) field value 2, having \(\mathtt{ts}\) value 15. Consequently, the abstraction of the \(\mathtt{ts}\) field maps \(\mathtt {x}_1\) to \(\left\{ >\right\} \) and \(\mathtt {x}_2\) to \(\left\{ =\right\} \): this is the mapping \(\lambda _4\) in Fig. 10.

The \(\mathtt{mark}\) field assumes values from a small finite domain and is represented precisely as in concrete heap cells.
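The timestamp abstraction above can be sketched as a small function. The encoding of cells as `(data, ts)` pairs and of observer registers as a name-to-value mapping is an assumption for illustration; the function computes, for each register, the set of orderings between the abstracted cell's \(\mathtt{ts}\) value and the \(\mathtt{ts}\) fields of all heap cells carrying that register's data value.

```python
# Sketch of the ts-field abstraction: map each observer register to the set of
# relations {<, =, >} between cell_ts and the ts fields of same-data cells.
def abstract_ts(cell_ts, heap_cells, observer_regs):
    """heap_cells: list of (data, ts) pairs; observer_regs: {name: data value}."""
    mapping = {}
    for reg, d in observer_regs.items():
        rels = set()
        for data, ts in heap_cells:
            if data == d:  # only cells with the register's data value matter
                rels.add("<" if cell_ts < ts else ("=" if cell_ts == ts else ">"))
        mapping[reg] = rels
    return mapping
```

On the example above (registers \(\mathtt {x}_1 = 4\), \(\mathtt {x}_2 = 2\); three cells with data 4 and \(\mathtt{ts} < 15\); one cell with data 2 and \(\mathtt{ts} = 15\)), abstracting the cell with \(\mathtt{ts} = 15\) yields the mapping \(\lambda _4\): \(\mathtt {x}_1 \mapsto \{>\}\), \(\mathtt {x}_2 \mapsto \{=\}\).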

Whenever a thread performing \(\mathtt{pop}\) moves from one iteration of the for loop to the next, the abstraction must consider swapping the abstractions \(\mathtt{me}\) and \(\mathtt{ot}\).

In interference steps, we must consider that the abstraction \(\mathtt{me}\) for the interfering thread may have to be changed into \(\mathtt{ot}\). Furthermore, the abstractions \(\mathtt{me}\) for two \(\mathtt{push}\) methods cannot coincide, since each thread pushes only to its own list.
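A possible rendering of the \(\mathtt{me}\)/\(\mathtt{ot}\) index abstraction is sketched below. The thread fields (`method`, `tid`, `in_for_loop`, `i`, `k`) are assumed encodings of the thread's control state, not the tool's actual representation; the choice of active index follows the description above.

```python
# Sketch of the me/ot abstraction of pool indices in a thread's view.
def active_index(thread):
    """Pick the currently active pool index for a thread (assumed fields)."""
    if thread.method == "push":
        return thread.tid        # a thread pushes only to its own list
    if thread.in_for_loop:       # pop, inside the for loop over pools
        return thread.i
    return thread.k              # pop, after the for loop

def abstract_indices(num_pools, thread):
    """Abstract the active index to 'me' and all other indices to 'ot'."""
    me = active_index(thread)
    return ["me" if idx == me else "ot" for idx in range(num_pools)]
```

Under this abstraction, an interference step by another thread may require renaming that thread's \(\mathtt{me}\) to \(\mathtt{ot}\), and two concurrent \(\mathtt{push}\) methods can never share the abstraction \(\mathtt{me}\), as noted above.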
6 Experimental Results
Based on our framework, we have implemented a tool in OCaml and used it to verify various concurrent data structure implementations of stacks, priority queues, queues, and sets. All of them are based on heap structures; we consider three types of heap structures in our experiments.
Singly-linked list benchmarks: These benchmarks include stack, queue, and set algorithms that are well known in the literature. The challenge is that in some set implementations the linearization points are not fixed; they depend on the future of each execution. The sets with non-fixed linearization points are the lazy set [20], the lock-free sets of HM [22], Harris [17], and Michael [29], and the unordered set of [48]. Using observers and controllers from our previous work [3], our approach is simple yet powerful enough to verify these singly-linked list benchmarks.
Skiplist benchmarks: We consider four skiplist algorithms: the lock-based skiplist set [31], the lock-free skiplist set described in Sect. 2 [22], and two skiplist-based priority queues [26, 27]. One challenge in verifying these algorithms is dealing with an unbounded number of levels. In addition, in the lock-free skiplist [22] and priority queue [26], the skiplist shape is not well formed, meaning that a higher-level list need not be a sublist of the lower-level lists. These algorithms have not been automatically verified in previous work. To the best of our knowledge, fragment abstraction provides the first framework that can automatically verify these concurrent skiplist algorithms.
Arrays of singly-linked list benchmarks: We consider the two challenging timestamp algorithms of [12]. There are two challenges when verifying these algorithms. The first is how to deal with an unbounded number of SLLs; the second is that the linearization points of the algorithms are not fixed, but depend on the future of each execution. By combining our fragment abstraction with the observers for stacks and queues of [7], we are able to verify these two algorithms automatically. The observers are crucial for achieving automation, since they enforce the weakest possible ordering constraints necessary for proving linearizability, thereby making it possible to use a less precise abstraction.
Running Times. The experiments were performed on a desktop with a 2.8 GHz processor and 8 GB of memory. The results are presented in Fig. 11, where running times are given in seconds. Column a shows the verification times of our tool, whereas column b shows the verification times for the algorithms based on SLLs using the technique of our previous work [3]. In our experiments, we ran the tool together with an observer from [1, 7] and controllers from [3] to verify linearizability of the algorithms. All experiments start from the initial heap, and end either when the analysis reaches a fixed point or when a violation of the safety properties or of linearizability is detected. As can be seen from the table, the verification times vary across the examples, depending on the types of shapes produced during the analysis. For instance, the skiplist algorithms have much longer verification times, due to the number of pointer variables and their complicated shapes. In contrast, the other algorithms produce simple shape patterns and hence have shorter verification times.
Error Detection. In addition to establishing correctness of the original versions of the benchmark algorithms, we tested our tool on versions with intentionally inserted bugs. For example, we omitted the timestamp assignment at line 5 of the \(\mathtt{push}\) method in the TS stack algorithm, or omitted \(\mathtt{CAS}\) statements in lock-free algorithms. The tool, as expected, successfully detected and reported the bugs.
7 Conclusions
We have presented a novel shape abstraction, called fragment abstraction, for automatic verification of concurrent data structure implementations that operate on different forms of dynamically allocated heap structures, including singly-linked lists, skiplists, and arrays of singly-linked lists. Our approach is the first framework that can automatically verify concurrent data structure implementations employing skiplists and arrays of singly-linked lists, while at the same time handling an unbounded number of concurrent threads, an unbounded domain of data values (including timestamps), and an unbounded shared heap. We showed that fragment abstraction combines local and global reachability information in a way that allows verification of the functional behavior of a collection of threads.
As future work, we intend to investigate whether fragment abstraction can be applied also to other heap structures, such as concurrent binary search trees.
References
 1. Abdulla, P.A., Haziza, F., Holík, L., Jonsson, B., Rezine, A.: An integrated specification and verification technique for highly concurrent data structures. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 324–338. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642367427_23
 2. Abdulla, P.A., Holík, L., Jonsson, B., Trinh, C.Q., et al.: Verification of heap manipulating programs with ordered data by extended forest automata. Acta Inf. 53(4), 357–385 (2016)
 3. Abdulla, P.A., Jonsson, B., Trinh, C.Q.: Automated verification of linearization policies. In: Rival, X. (ed.) SAS 2016. LNCS, vol. 9837, pp. 61–83. Springer, Heidelberg (2016). https://doi.org/10.1007/9783662534137_4
 4. Amit, D., Rinetzky, N., Reps, T., Sagiv, M., Yahav, E.: Comparison under abstraction for verifying linearizability. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 477–490. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540733683_49
 5. Berdine, J., Lev-Ami, T., Manevich, R., Ramalingam, G., Sagiv, M.: Thread quantification for concurrent shape analysis. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 399–413. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540705451_37
 6. Bingham, J., Rakamarić, Z.: A logic and decision procedure for predicate abstraction of heap-manipulating programs. In: Emerson, E.A., Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 207–221. Springer, Heidelberg (2005). https://doi.org/10.1007/11609773_14
 7. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: On reducing linearizability to state reachability. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015, Part II. LNCS, vol. 9135, pp. 95–107. Springer, Heidelberg (2015). https://doi.org/10.1007/9783662476666_8
 8. Bouajjani, A., Emmi, M., Enea, C., Mutluergil, S.O.: Proving linearizability using forward simulations. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017, Part II. LNCS, vol. 10427, pp. 542–563. Springer, Cham (2017). https://doi.org/10.1007/9783319633909_28
 9. Chakraborty, S., Henzinger, T.A., Sezgin, A., Vafeiadis, V.: Aspect-oriented linearizability proofs. Log. Methods Comput. Sci. 11(1) (2015)
 10. Chang, B.-Y.E., Rival, X., Necula, G.C.: Shape analysis with structural invariant checkers. In: Nielson, H.R., Filé, G. (eds.) SAS 2007. LNCS, vol. 4634, pp. 384–401. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540740612_24
 11. Colvin, R., Groves, L., Luchangco, V., Moir, M.: Formal verification of a lazy concurrent list-based set algorithm. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 475–488. Springer, Heidelberg (2006). https://doi.org/10.1007/11817963_44
 12. Dodds, M., Haas, A., Kirsch, C.: A scalable, correct time-stamped stack. In: POPL, pp. 233–246. ACM (2015)
 13. Doherty, S., Detlefs, D., Groves, L., Flood, C., et al.: DCAS is not a silver bullet for nonblocking algorithm design. In: SPAA 2004, pp. 216–224. ACM (2004)
 14. Doherty, S., Groves, L., Luchangco, V., Moir, M.: Formal verification of a practical lock-free queue algorithm. In: de Frutos-Escrig, D., Núñez, M. (eds.) FORTE 2004. LNCS, vol. 3235, pp. 97–114. Springer, Heidelberg (2004). https://doi.org/10.1007/9783540302322_7
 15. Dudka, K., Peringer, P., Vojnar, T.: Byte-precise verification of low-level list manipulation. In: Logozzo, F., Fähndrich, M. (eds.) SAS 2013. LNCS, vol. 7935, pp. 215–237. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642388569_13
 16. Fomitchev, M., Ruppert, E.: Lock-free linked lists and skip lists. In: PODC 2004, pp. 50–59. ACM (2004)
 17. Harris, T.L.: A pragmatic implementation of non-blocking linked-lists. In: Welch, J. (ed.) DISC 2001. LNCS, vol. 2180, pp. 300–314. Springer, Heidelberg (2001). https://doi.org/10.1007/3540454144_21
 18. Haziza, F., Holík, L., Meyer, R., Wolff, S.: Pointer race freedom. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016. LNCS, vol. 9583, pp. 393–412. Springer, Heidelberg (2016). https://doi.org/10.1007/9783662491225_19
 19. Heinen, J., Noll, T., Rieger, S.: Juggrnaut: graph grammar abstraction for unbounded heap structures. ENTCS 266, 93–107 (2010)
 20. Heller, S., Herlihy, M., Luchangco, V., Moir, M., Scherer, W.N., Shavit, N.: A lazy concurrent list-based set algorithm. In: Anderson, J.H., Prencipe, G., Wattenhofer, R. (eds.) OPODIS 2005. LNCS, vol. 3974, pp. 3–16. Springer, Heidelberg (2006). https://doi.org/10.1007/11795490_3
 21. Herlihy, M., Lev, Y., Luchangco, V., Shavit, N.: A simple optimistic skiplist algorithm. In: Prencipe, G., Zaks, S. (eds.) SIROCCO 2007. LNCS, vol. 4474, pp. 124–138. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540729518_11
 22. Herlihy, M., Shavit, N.: The Art of Multiprocessor Programming. Morgan Kaufmann, San Francisco (2008)
 23. Holík, L., Lengál, O., Rogalewicz, A., Šimáček, J., Vojnar, T.: Fully automated shape analysis based on forest automata. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 740–755. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642397998_52
 24. Khyzha, A., Dodds, M., Gotsman, A., Parkinson, M.: Proving linearizability using partial orders. In: Yang, H. (ed.) ESOP 2017. LNCS, vol. 10201, pp. 639–667. Springer, Heidelberg (2017). https://doi.org/10.1007/9783662544341_24
 25. Liang, H., Feng, X.: Modular verification of linearizability with non-fixed linearization points. In: PLDI, pp. 459–470. ACM (2013)
 26. Lindén, J., Jonsson, B.: A skiplist-based concurrent priority queue with minimal memory contention. In: Baldoni, R., Nisse, N., van Steen, M. (eds.) OPODIS 2013. LNCS, vol. 8304, pp. 206–220. Springer, Cham (2013). https://doi.org/10.1007/9783319038506_15
 27. Lotan, I., Shavit, N.: Skiplist-based concurrent priority queues. In: IPDPS, pp. 263–268. IEEE (2000)
 28. Manevich, R., Yahav, E., Ramalingam, G., Sagiv, M.: Predicate abstraction and canonical abstraction for singly-linked lists. In: Cousot, R. (ed.) VMCAI 2005. LNCS, vol. 3385, pp. 181–198. Springer, Heidelberg (2005). https://doi.org/10.1007/9783540305798_13
 29. Michael, M.M.: High performance dynamic lock-free hash tables and list-based sets. In: SPAA, pp. 73–82 (2002)
 30. Michael, M., Scott, M.: Correction of a memory management method for lock-free data structures. Technical report TR599, University of Rochester, Rochester, NY, USA (1995)
 31. Michael, M., Scott, M.: Simple, fast, and practical non-blocking and blocking concurrent queue algorithms. In: PODC, pp. 267–275. ACM (1996)
 32. O'Hearn, P.W., Rinetzky, N., Vechev, M.T., Yahav, E., Yorsh, G.: Verifying linearizability with hindsight. In: PODC, pp. 85–94 (2010)
 33. Sagiv, S., Reps, T., Wilhelm, R.: Parametric shape analysis via 3-valued logic. ACM Trans. Program. Lang. Syst. 24(3), 217–298 (2002)
 34. Sánchez, A., Sánchez, C.: Formal verification of skiplists with arbitrary many levels. In: Cassez, F., Raskin, J.-F. (eds.) ATVA 2014. LNCS, vol. 8837, pp. 314–329. Springer, Cham (2014). https://doi.org/10.1007/9783319119366_23
 35. Schellhorn, G., Derrick, J., Wehrheim, H.: A sound and complete proof technique for linearizability of concurrent data structures. ACM Trans. Comput. Log. 15(4), 31:1–37 (2014)
 36. Segalov, M., Lev-Ami, T., Manevich, R., Ganesan, R., Sagiv, M.: Abstract transformers for thread correlation analysis. In: Hu, Z. (ed.) APLAS 2009. LNCS, vol. 5904, pp. 30–46. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642106729_5
 37. Singh, V., Neamtiu, I., Gupta, R.: Proving concurrent data structures linearizable. In: ISSRE, pp. 230–240. IEEE (2016)
 38. Sundell, H., Tsigas, P.: Fast and lock-free concurrent priority queues for multi-thread systems. J. Parallel Distrib. Comput. 65(5), 609–627 (2005)
 39. Treiber, R.: Systems programming: coping with parallelism. Technical report RJ5118, IBM Almaden Res. Ctr. (1986)
 40. Turon, A.J., Thamsborg, J., Ahmed, A., Birkedal, L., Dreyer, D.: Logical relations for fine-grained concurrency. In: POPL 2013, pp. 343–356. ACM (2013)
 41. Vafeiadis, V.: Modular fine-grained concurrency verification. Ph.D. thesis, University of Cambridge (2008)
 42. Vafeiadis, V.: Automatically proving linearizability. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 450–464. Springer, Heidelberg (2010). https://doi.org/10.1007/9783642142956_40
 43. Černý, P., Radhakrishna, A., Zufferey, D., Chaudhuri, S., Alur, R.: Model checking of linearizability of concurrent list implementations. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 465–479. Springer, Heidelberg (2010). https://doi.org/10.1007/9783642142956_41
 44. Vechev, M.T., Yahav, E.: Deriving linearizable fine-grained concurrent objects. In: PLDI, pp. 125–135. ACM (2008)
 45. Vechev, M., Yahav, E., Yorsh, G.: Experience with model checking linearizability. In: Păsăreanu, C.S. (ed.) SPIN 2009. LNCS, vol. 5578, pp. 261–278. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642026522_21
 46. Wachter, B., Westphal, B.: The spotlight principle. In: Cook, B., Podelski, A. (eds.) VMCAI 2007. LNCS, vol. 4349, pp. 182–198. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540697381_13
 47. Yang, H., Lee, O., Berdine, J., Calcagno, C., Cook, B., Distefano, D., O'Hearn, P.: Scalable shape analysis for systems code. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 385–398. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540705451_36
 48. Zhang, K., Zhao, Y., Yang, Y., Liu, Y., Spear, M.: Practical non-blocking unordered lists. In: Afek, Y. (ed.) DISC 2013. LNCS, vol. 8205, pp. 239–253. Springer, Heidelberg (2013). https://doi.org/10.1007/9783642415272_17
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.