Alternating Automata Modulo First Order Theories
Abstract
We introduce firstorder alternating automata, a generalization of boolean alternating automata, in which transition rules are described by multisorted firstorder formulae, with states and internal variables given by uninterpreted predicate terms. The model is closed under union, intersection and complement, and its emptiness problem is undecidable, even for the simplest data theory of equality. To cope with the undecidability problem, we develop an abstraction refinement semialgorithm based on lazy annotation of the symbolic execution paths with interpolants, obtained by applying (i) quantifier elimination with witness term generation and (ii) Lyndon interpolation in the quantifierfree theory of the data domain, with uninterpreted predicate symbols. This provides a method for checking inclusion of timed and finitememory register automata, and emptiness of quantified predicate automata, previously used in the verification of parameterized concurrent programs, composed of replicated threads, with shared memory.
1 Introduction
Many results in automata theory rely on the finite alphabet hypothesis, which guarantees, in some cases, the existence of determinization, complementation and inclusion checking methods. However, this hypothesis prevents the use of automata as models of realtime systems or even simple programs, whose input and output are data values ranging over very large domains, typically viewed as infinite mathematical abstractions.
Traditional attempts to generalize classical RabinScott automata to infinite alphabets, such as timed automata [1] and finitememory automata [16] face the complement closure problem: there exist automata for which the complement language cannot be recognized by an automaton in the same class. This makes it impossible to encode a language inclusion problem \({\mathcal L}({A}) \subseteq {\mathcal L}({B})\) as the emptiness of an automaton recognizing the language \({\mathcal L}({A}) \cap {\mathcal L}^c({B})\), where \({\mathcal L}^c({B})\) denotes the complement of \({\mathcal L}({B})\).
Even for finite alphabets, complementation of finitestate automata faces inherent exponential blowup, due to nondeterminism. However, if we allow universal nondeterminism, in addition to the classical existential nondeterminism, complementation is possible is linear time. Having both existential and universal nondeterminism defines the alternating automata model [4]. A finitealphabet alternating automaton is described by a set of transition rules \(q \xrightarrow {{\scriptscriptstyle a}}_{{\scriptstyle }} \phi \), where q is a state, a is an input symbol and \(\phi \) is a boolean formula, whose propositional variables denote successor states.
Our Contribution. We extend alternating automata to infinite data alphabets, by defining a model of computation in which all boolean operations, including complementation, can be done in linear time. The control states are given by kary predicate symbols \(q(y_1,\ldots ,y_k)\), the input consists of an event a from a finite alphabet and a tuple of data variables \(x_1,\ldots ,x_n\), ranging over an infinite domain, and transitions are of the form \(q(y_1,\ldots ,y_k) \xrightarrow {{\scriptscriptstyle a(x_1,\ldots ,x_n)}}_{{\scriptstyle }} \phi (x_1, \ldots , x_n, y_1, \ldots , y_k)\), where \(\phi \) is a formula in the firstorder theory of the data domain. In this model, the arguments of a predicate atom \(q(y_1,\ldots ,y_k)\) represent the values of the internal variables associated with the state. Together with the input values \(x_1,\ldots ,x_n\), these values define the next configurations, but remain invisible in the input sequence.
The tight coupling of internal values and control states, by means of uninterpreted predicate symbols, allows for lineartime complementation just as in the case of classical propositional alternating automata. Complementation is, moreover, possible when the transition formulae contain firstorder quantifiers, generating infinitelybranching execution trees. The price to be paid for this expressivity is that emptiness of firstorder alternating automata is undecidable, even for the simplest data theory of equality [6].
 1.
Quantified transition rules make it hard, if not impossible, in general, to decide if a path is infeasible. This is mainly because adding uninterpreted predicate symbols to decidable firstorder theories, such as Presburger arithmetic, results in undecidability [10]. To deal with this problem, we assume that the firstorder data theory, without uninterpreted predicate symbols, has a quantifier elimination procedure, that instantiates quantifiers with effectively computable witness terms.
 2.
The interpolants that prove the infeasibility of a path are not local, as they may refer to input values encountered in the past. However, the future executions are oblivious to when these values have been seen in the past and depend only on the relation between the past and current values. We use this fact to define a labeling of nodes, visited by the lazy annotation procedure, with conjunctions of existentially quantified interpolants combining predicate atoms with data constraints.
We use firstorder alternating automata to develop practical semialgorithms for a number of known undecidable problems, such as: inclusion of regular timed languages [1], inclusion of quasiregular languages recognized by finitememory automata [16] and emptiness of predicate automata, a subclass of firstorder alternating automata used to verify parameterized concurrent programs [6, 7].
Related Work. Recognizers for languages over infinite alphabets have found various applications, ranging from Unicode text recognition [5] to runtime program monitoring [2]. Extending finite automata to infinite alphabets has been considered in the context of symbolic alternating finite automata (sAFA), whose transitions are labeled with guards taken from a decidable theory of the data domain [5]. As in our model, sAFA are closed under union, intersection and complement and emptiness is decidable, due to the lack of registers. However, sAFA are strictly less expressive than our model, because comparing data at different positions in the input word is not possible.
Constrained Horn clauses (CHC) are a branching computation model widespread in program verification [9]. The main difference between alternating and bottomup branching computations is that, in an alternating model, all branches of the computation must synchronize on the same input word. With this in mind, it is possible to express emptiness of firstorder alternating automata as the existence of solutions of a CHC over a higherorder theory of data, extended with algebraic data types (lists). The effectiveness of such an encoding depends on the effectiveness of interpolation and witness term generation for theories of algebraic data types [11].
The alternating automata model presented in this paper extends the alternating automata with variables ranging over infinite data considered in [14]. There all variables were required to be observable in the input. We overcome this restriction by allowing internal (invisible) variables. Another closely related work [13] considers an inclusion between an asynchronous product of automata \(A_1 \times \ldots \times A_n\), extended with data variables, and a monitor automaton B. The semialgorithm defined there was based on the assumption that all variables of the observer B must be declared in the automata \(A_1, \ldots , A_n\) under check. This limitation can now be bypassed, since the inclusion problem can be encoded as emptiness of a firstorder alternating automaton and, moreover, the emptiness checking semialgorithm can handle invisible variables.
The work probably closest to ours concerns the model of predicate automata (PA) [6, 7, 17], used in the verification of parameterized concurrent programs with shared memory. In this model, the alphabet consists of pairs of program statements and thread identifiers and is considered infinite because the number of threads is unbounded. Since thread identifiers can only be compared for equality, the data theory in PA is the theory of equality. Even with this simplification, the emptiness problem is undecidable when either the predicates have arity greater than one [6] or use quantified transition rules [17]. Checking emptiness of quantifierfree PA is possible semialgorithmically, by explicitly enumerating reachable configurations and checking coverage by looking for permutations of argument values. However, no semialgorithm has been given for quantified PA. Dealing with quantified transition rules is one of our contributions.
1.1 Preliminaries
For two integers \(0 \le i \le j\), we define Open image in new window and Open image in new window . We consider two disjoint sorts \(\mathbb {D}\) and \(\mathbb {B}\), where \(\mathbb {D}\) is an infinite domain and \(\mathbb {B}= \{ \top ,\bot \}\) is the set of boolean values true (\(\top \)) and false (\(\bot \)), respectively. The \(\mathbb {D}\) sort is equipped with countably many function symbols \(f : \mathbb {D}^{\#(f)} \rightarrow \mathbb {D}\cup \mathbb {B}\), where \(\#(f)\ge 0\) denotes the number of arguments (arity) of f. A predicate is a function symbol \(p : \mathbb {D}^{\#(p)} \rightarrow \mathbb {B}\) that is, a \(\#(p)\)ary relation.
We consider the interpretation of all function symbols \(f : \mathbb {D}^{\#(f)} \rightarrow \mathbb {D}\) to be fixed by the interpretation of the \(\mathbb {D}\) sort, for instance if \(\mathbb {D}\) is the set of integers \(\mathbf{\mathbb {Z}}\), these are zero, the successor function and the arithmetic operations of addition and multiplication. We extend this convention to several predicates over \(\mathbb {D}\), such as the inequality relation over \(\mathbf{\mathbb {Z}}\), and write \(\mathsf {Pred}\) for the set of remaining uninterpreted predicates.
An interpretation \(\mathcal {I}\) maps each predicate symbol p into a set \(p^\mathcal {I}\subseteq \mathbb {D}^{\#(p)}\), if \(\#(p)>0\), or into an element of \(\mathbb {B}\) if \(\#(p)=0\). A valuation \(\nu \) maps each variable x into an element of \(\mathbb {D}\). Given a term t, we denote by \(t^\nu \) the value obtained by replacing each variable x by the value \(\nu (x)\) and evaluating each function application. For a formula \(\phi \), we define the forcing relation Open image in new window recursively on the structure of \(\phi \), as usual. For a formula \(\phi \) and a valuation \(\nu \), we define Open image in new window and drop the \(\nu \) subscript for sentences. A sentence \(\phi \) is satisfiable if \(\mathbf{[\![}\phi \mathbf{]\!]}\ne \emptyset \). An element of \(\mathbf{[\![}\phi \mathbf{]\!]}\) is called a model of \(\phi \). A formula \(\phi \) is valid if Open image in new window for every interpretation \(\mathcal {I}\) and every valuation \(\nu \). We say that \(\phi \) entails \(\psi \), written Open image in new window if and only if \(\mathbf{[\![}\phi \mathbf{]\!]} \subseteq \mathbf{[\![}\psi \mathbf{]\!]}\).
Interpretations are partially ordered by the pointwise subset order, defined as \(\mathcal {I}_1 \subseteq \mathcal {I}_2\) if and only if \(p^{\mathcal {I}_1} \subseteq p^{\mathcal {I}_2}\) for each predicate symbol \(p \in \mathsf {Pred}\). Given a formula \(\phi \) and a valuation \(\nu \), we define Open image in new window the set of minimal interpretations that, together with \(\nu \), form models of \(\phi \).
2 First Order Alternating Automata
Let \(\varSigma \) be a finite alphabet \(\varSigma \) of input events. Given a finite set of variables \(X \subseteq \mathsf {Var}\), we denote by \(X \mapsto \mathbb {D}\) the set of valuations of the variables X and \(\varSigma [X] = \varSigma \times (X \mapsto \mathbb {D})\) be the possibly infinite set of data symbols \((a,\nu )\), where a is an input symbol and \(\nu \) is a valuation. A data word (simply called word in the following) is a finite sequence \(w=(a_1,\nu _1)(a_2,\nu _2) \ldots (a_n,\nu _n)\) of data symbols. Given a word w, we denote by Open image in new window its sequence of input events and by \({w}_\mathbb {D}\) the valuation associating each timestamped variable \({x}^{\scriptscriptstyle {({i})}}\), where \(x \in \mathsf {Var}\), the value \(\nu _i(x)\), for all \(i\in [1,n]\). We denote by \(\varepsilon \) the empty sequence, by \(\varSigma ^*\) the set of finite input sequences and by \(\varSigma [X]^*\) the set of finite data words over the variables X.
A firstorder alternating automaton is a tuple \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \), where \(\varSigma \) is a finite set of input events, X is a finite set of input variables, Q is a finite set of predicates denoting control states, \(\iota \in \mathsf {Form}^+(Q,\emptyset )\) is a sentence defining initial configurations, \(F \subseteq Q\) is the set of predicates denoting final states and \(\varDelta \) is a set of transition rules. A transition rule is of the form \(q(y_1,\ldots ,y_{\#(q)}) \xrightarrow {{\scriptscriptstyle a(X)}}_{{\scriptstyle }} \psi \), where \(q \in Q\) is a predicate, \(a \in \varSigma \) is an input event and \(\psi \in \mathsf {Form}^+(Q,X \cup \{ y_1,\ldots ,y_{\#(q)} \})\) is a positive formula, where \(X\,\cap \,\{ y_1,\ldots ,y_{\#(q)} \} = \emptyset \). Without loss of generality, we consider, for each predicate \(q \in Q\) and each input event \(a \in \varSigma \), at most one such rule, as two or more rules can be joined using disjunction. The quantifiers occurring in the righthand side formula of a transition rule are called transition quantifiers. The size of \(\mathcal {A}\) is \({{\mathcal {A}}}\) Open image in new window .
The semantics of firstorder alternating automata is analogous to the semantics of propositional alternating automata, with rules of the form \(q \xrightarrow {{\scriptscriptstyle a}}_{{\scriptstyle }} \phi \), where q is a propositional variable and \(\phi \) a positive boolean combination of propositional variables. For instance, \(q_0 \xrightarrow {{\scriptscriptstyle a}}_{{\scriptstyle }} (q_1 \wedge q_2) \vee q_3\) means that the automaton can choose to transition in either both \(q_1\) and \(q_2\) or in \(q_3\) alone. This leads to defining transitions as the minimal models of the right hand side of a rule^{1}. The original definition of alternating automata [4] works around this problem and considers boolean valuations instead of formulae. In contrast, a finite description of a firstorder alternating automaton cannot be given in terms of interpretations, as a firstorder formula may have infinitely many models, corresponding to infinitely many initial or successor states occurring within an execution step.
Given an uninterpreted predicate symbol \(q \in Q\) and data values \(d_1,\ldots ,d_{\#(q)} \in \mathbb {D}\), the tuple \((q, d_1,\ldots ,d_{\#(q)})\) is called a configuration, sometimes written \(q(d_1,\ldots ,d_{\#(q)})\), when no confusion arises. A configuration is final if \(q \in F\). An interpretation \(\mathcal {I}\) corresponds to a set of configurations Open image in new window , called a cube. This notation is lifted to sets of configurations in the usual way.
Definition 1
 1.
\(c = \{ T(\epsilon ) \mid T \in \mathcal {T} \}\) is the set of configurations labeling the roots of \(T_1,T_2,\ldots \) and
 2.
if \((q,d_1,\ldots ,d_{\#(q)})\) labels a node on the level \(j \in [n1]\) in \(T_i\), then the labels of its children form a cube from \(\mathsf {c}({\mathbf{[\![}\psi \mathbf{]\!]}^\mu _{\eta }})\), where \(\eta = \nu _{j+1}[y_1 \leftarrow d_1,\ldots ,y_{\#(q)} \leftarrow d_{\#(q)}]\) and \(q(y_1,\ldots ,y_{\#(q)}) \xrightarrow {{\scriptscriptstyle a_{j+1}(X)}}_{{\scriptstyle }} \psi \in \varDelta \) is a transition rule of \(\mathcal {A}\).
An execution \(\mathcal {T}\) over w, starting with c, is accepting if and only if all paths in \(\mathcal {T}\) have the same length and the frontier of each tree \(T \in \mathcal {T}\) is labeled with final configurations. If \(\mathcal {A}\) has an accepting execution over w starting with a cube \(c \in \mathsf {c}({\mathbf{[\![}\iota \mathbf{]\!]}^\mu })\), then \(\mathcal {A}\) accepts w and let \({\mathcal L}({\mathcal {A}})\) be the set of words accepted by \(\mathcal {A}\). For example, consider the automaton \(\mathcal {A}= \langle \{ a \}, \{ x \}, \{ q_0,q_1,q_2,q_f \}, q_0(0), \{ q_f \}, \varDelta \rangle \), where \(\varDelta \) is the set: \(q_0(y) \xrightarrow {{\scriptscriptstyle a(x)}}_{{\scriptstyle }} q_1(y+x) \wedge q_2(yx)\), \(q_1(y) \xrightarrow {{\scriptscriptstyle a(x)}}_{{\scriptstyle }} q_1(y+x) \vee (y > 0 \wedge q_f)\) and \(q_2(y) \xrightarrow {{\scriptscriptstyle a(x)}}_{{\scriptstyle }} q_2(yx) \vee (y > 0 \wedge q_f)\). A possible execution tree of this automaton is the following:
The execution tree is not accepting, since its frontier is not labeled with final configurations everywhere. Incidentally, here we have \({\mathcal L}({\mathcal {A}}) = \emptyset \), which is proved by our tool in \(\sim \!0.5\) s on an average machine.
 1.
boolean closure: given automata \(\mathcal {A}_i = \langle \varSigma ,X,Q_i,\iota _i,F_i,\varDelta _i \rangle \), for \(i=1,2\), do there exist automata \(\mathcal {A}_\cap \), \(\mathcal {A}_\cup \) and \(\overline{\mathcal {A}}_1\) such that \(L(\mathcal {A}_\cap ) = L(\mathcal {A}_1) \cap L(\mathcal {A}_2)\), \(L(\mathcal {A}_\cup ) = L(\mathcal {A}_1) \cup L(\mathcal {A}_2)\) and \(L(\overline{\mathcal {A}}_1) = \varSigma [X]^* \setminus L(\mathcal {A}_1)\)?
 2.
emptiness: given an automaton \(\mathcal {A}\), is \(L(\mathcal {A}) = \emptyset \)?
For technical reasons, we address the following problem next: given an automaton \(\mathcal {A}\) and an input sequence \(\alpha \in \varSigma ^*\), does there exists a word \(w \in {\mathcal L}({\mathcal {A}})\) such that \({w}_\varSigma = \alpha \) ? By solving this problem first, we develop the machinery required to prove that firstorder alternating automata are closed under complement and, further, set up the ground for developping a practical semialgorithm for the emptiness problem.
2.1 Path Formulae
In the upcoming developments it is sometimes more convenient to work with logical formulae defining executions of automata, than with lowlevel execution forests. For this reason, we first introduce path formulae \(\varTheta ({\alpha })\), which are formulae defining the executions of an automaton, over words that share a given sequence \(\alpha \) of input events. Second, we restrict a path formula \(\varTheta ({\alpha })\) to an acceptance formula \(\varUpsilon ({\alpha })\), which defines only those executions that are accepting among \(\varTheta ({\alpha })\). Consequently, the automaton accepts a word w such that \({w}_\varSigma = \alpha \) if and only if \(\varUpsilon ({\alpha })\) is satisfiable.
Let \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \) be an automaton for the rest of this section. For any \(i \in \mathbf{\mathbb {N}}\), we denote by \({Q}^{\scriptscriptstyle {({i})}} = \{ {q}^{\scriptscriptstyle {({i})}} \mid q \in Q \}\) and \({X}^{\scriptscriptstyle {({i})}} = \{ {x}^{\scriptscriptstyle {({i})}} \mid x \in X \}\) the sets of timestamped predicate symbols and variables, respectively. We also define Open image in new window and Open image in new window . For a formula \(\psi \) and \(i \in \mathbf{\mathbb {N}}\), we define Open image in new window the formula in which all input variables and state predicates (and only those symbols) are replaced by their timestamped counterparts. Moreover, we write \(q(\mathbf y)\) for \(q(y_1,\ldots ,y_{\#(q)})\), when no confusion arises.
Lemma 1
Given an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \), for any word \(w=(a_1,\) \(\nu _1) \ldots (a_n,\nu _n)\), we have \(\mathbf{[\![}\varTheta ({{w}_\varSigma }) \mathbf{]\!]}^\mu _{{w}_\mathbb {D}} = \{ \mathcal {I}_{\mathcal {T}} \mid \mathcal {T}\text { is an execution of }\mathcal {A}\text { over } w \}\).
Lemma 2
Given an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \), for every word \(w \in \varSigma [X]^*\), the following are equivalent: (1) there exists an interpretation \(\mathcal {I}\) such that Open image in new window and (2) \(w \in {\mathcal L}({\mathcal {A}})\).
As an immediate consequence, one can decide whether \(\mathcal {A}\) accepts some word w with a given input sequence \({w}_\varSigma =\alpha \), by checking whether \(\varUpsilon ({\alpha })\) is satisfiable. However, unlike nonalternating infinitestate models of computation, such as counter automata (nondeterministic programs with integer variables), the satisfiability query for an acceptance (path) formula falls outside of known decidable theories, supported by standard SMT solvers. There are basically two reasons for this, namely (i) the presence of predicate symbols, and (ii) the nontrivial alternation of quantifiers. To understand this point, consider for example, the decidable theory of Presburger arithmetic [24]. Adding even only one monadic predicate symbol to it yields undecidability in the presence of nontrivial quantifier alternation [10]. On the other hand, the quantifierfree fragment of Presburger arithmetic extended with uninterpreted function symbols is decidable, by a NelsonOppen style congruence closure argument [22].
To tackle the problem of deciding satisfiability of \(\varUpsilon ({\alpha })\) formulae, we start from the observation that their form is rather particular, which allows the elimination of path quantifiers and uninterpreted predicate symbols, by a couple of satisfiabilitypreserving transformations. The result of applying these transformations is a formula with no predicate symbols, whose only quantifiers are those introduced by the transition rules of the automaton. Next, in Sect. 3 we shall assume moreover that the firstorder theory of the data sort \(\mathbb {D}\) (without uninterpreted predicate symbols) has quantifier elimination, providing thus an effective decision procedure.
For the time being, let us formally define the elimination of transition quantifiers and predicate symbols. Let \(\alpha = a_1 \ldots a_n\) be a given sequence of input events and let \(\alpha _i\) be the prefix \(a_1 \ldots a_i\) of \(\alpha \), for \(i \in [n]\), where \(\alpha _0=\epsilon \). We consider the sequence of formulae \(\widehat{\varTheta }({\alpha _0}), \ldots , \widehat{\varTheta }({\alpha _n})\) defined as Open image in new window and, for all \(i \in [1,n]\), let \(\widehat{\varTheta }({\alpha _i})\) be the conjunction of \(\widehat{\varTheta }({\alpha _{i1}})\) with all formulae \({q}^{\scriptscriptstyle {({i1})}}(t_1,\ldots ,t_{\#(q)}) \rightarrow {\psi }^{\scriptscriptstyle {({i})}}[t_1/y_1, \ldots , t_{\#(q)}/y_{\#(q)}]\), such that \({q}^{\scriptscriptstyle {({i1})}}(t_1,\ldots ,t_{\#(q)})\) occurs in \(\widehat{\varTheta }({\alpha _{i1}})\), for some terms \(t_1,\ldots ,t_{\#(q)}\). Next, we write \(\widehat{\varUpsilon }({\alpha })\) for the conjunction of \(\widehat{\varTheta }({\alpha _n})\) with all \({q}^{\scriptscriptstyle {({n})}}(t_1,\ldots ,t_{\#(q)}) \rightarrow \bot \), such that \({q}^{\scriptscriptstyle {({n})}}(t_1,\ldots ,t_{\#(q)})\) occurs in \(\widehat{\varTheta }({\alpha _n})\), for some \(q \in Q \setminus F\). Note that \(\widehat{\varUpsilon }({\alpha })\) contains no path quantifiers, as required. On the other hand, the scope of the transition quantifiers in \(\widehat{\varUpsilon }({\alpha })\) exceeds the righthand side formulae from the transition rules, as shown by the following example.
Example 1
Lemma 3
For any input event sequence \(\alpha =a_1\ldots a_n\) and each valuation \(\nu : {X}^{\scriptscriptstyle {({\le n})}} \rightarrow \mathbb {D}\), the following hold, for every interpretation \(\mathcal {I}\): (1) if Open image in new window then Open image in new window , and (2) if Open image in new window there exists an interpretation \(\mathcal {J}\subseteq \mathcal {I}\) such that Open image in new window .
Further, we eliminate the predicate atoms from \(\widehat{\varUpsilon }({\alpha })\), by considering the sequence of formulae Open image in new window and \(\overline{\varTheta }({\alpha _i})\) is obtained by substituting each predicate atom \({q}^{\scriptscriptstyle {({i1})}}(t_1,\ldots ,t_{\#(q)})\) in \(\overline{\varTheta }({\alpha _{i1}})\) by \({\psi }^{\scriptscriptstyle {({i})}}[t_1/y_1,\ldots ,t_{\#(q)}/y_{\#(q)}]\), where \(q(\mathbf y) \xrightarrow {{\scriptscriptstyle a_{i}(X)}}_{{\scriptstyle }} \psi \in \varDelta \), for all \(i \in [1,n]\). We write \(\overline{\varUpsilon }({\alpha })\) for the formula obtained by replacing, in \(\overline{\varTheta }({\alpha })\), each occurrence of a predicate \({q}^{\scriptscriptstyle {({n})}}\), such that \(q \in Q \setminus F\) (resp. \(q \in F\)), by \(\bot \) (resp. \(\top \)).
Example 2
At this point, we prove the formal relation between the satisfiability of the formulae \(\widehat{\varUpsilon }({\alpha })\) and \(\overline{\varUpsilon }({\alpha })\). Since there are no occurrences of predicates in \(\overline{\varUpsilon }({\alpha })\), for each valuation \(\nu : {X}^{\scriptscriptstyle {({\le n})}} \rightarrow \mathbb {D}\), there exists an interpretation \(\mathcal {I}\) such that Open image in new window if and only if Open image in new window , for every interpretation \(\mathcal {J}\). In this case we omit \(\mathcal {I}\) and simply write Open image in new window .
Lemma 4
For any input event sequence \(\alpha =a_1\ldots a_n\) and each valuation \(\nu : {X}^{\scriptscriptstyle {({\le n})}} \rightarrow \mathbb {D}\), there exists a valuation \(\mathcal {I}\) such that Open image in new window if and only if Open image in new window .
Finally, we define the acceptance of a word with a given input event sequence by means of a quantifierfree formula in which no predicate atom occurs.
Lemma 5
Given an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \), for every word \(w \in \varSigma [X]^*\), we have Open image in new window if and only if \(w \in {\mathcal L}({\mathcal {A}})\).
2.2 Boolean Closure of First Order Alternating Automata
Theorem 1
 1.
\({\mathcal L}({\mathcal {A}_\cap }) = {\mathcal L}({\mathcal {A}_1})\,\cap \,{\mathcal L}({\mathcal {A}_2})\), where \(\mathcal {A}_\cap = \langle \varSigma ,X,Q_1\,\cup \,Q_2, \iota _1 \wedge \iota _2, F_1\,\cup \,F_2,\) \(\varDelta _1\,\cup \,\varDelta _2 \rangle \),
 2.
\({\mathcal L}({\overline{\mathcal {A}_i}}) = \varSigma [X]^* \setminus {\mathcal L}({\mathcal {A}_i})\), where \(\overline{\mathcal {A}_i} = \langle \varSigma ,X,Q_i,{\iota }^{\sim },Q_i\setminus F_i,{\varDelta }^{\sim }_i \rangle \) and \({\varDelta }^{\sim }_i = \{ q(\mathbf y) \xrightarrow {{\scriptscriptstyle a(X)}}_{{\scriptstyle }} {\psi }^{\sim } \mid q(\mathbf y) \xrightarrow {{\scriptscriptstyle a(X)}}_{{\scriptstyle }} \psi \in \varDelta _i \}\), for \(i=1,2\).
Moreover, \({{\mathcal {A}_\cap }} = \mathcal {O}({{{\mathcal {A}_1}}+{{\mathcal {A}_2}}})\) and \({{\overline{\mathcal {A}_i}}} = \mathcal {O}({{{\mathcal {A}_i}}})\), for \(i=1,2\).
3 The Emptiness Problem
The emptiness problem is undecidable even for automata with predicates of arity two, whose transition rules use only equalities and disequalities, having no transition quantifiers [6]. Since even such simple classes of alternating automata have no general decision procedure for emptiness, we use an abstractionrefinement semialgorithm based on lazy annotation [20, 21]. In a nutshell, a lazy annotation procedure systematically explores the set of finite input event sequences searching for an accepting execution. For an input sequence, if the path formula is satisfiable, we compute a word in the language of the automaton, from the model of the path formula. Otherwise, i.e. the sequence is spurious, the search backtracks and each position in the sequence is annotated with an interpolant, thus marking the sequence as infeasible. The semialgorithm uses moreover a coverage relation between sequences, ensuring that the continuations of already covered sequences are never explored. Sometimes this coverage relation provides a sound termination argument, in case when the automaton is empty.
For two input event sequences \(\alpha , \beta \in \varSigma ^*\), we say that \(\alpha \) is a prefix of \(\beta \), written \(\alpha \preceq \beta \), if \(\alpha =\beta \gamma \) for some sequence \(\gamma \in \varSigma ^*\). A set S of sequences is prefixclosed if for each \(\alpha \in S\), if \(\beta \preceq \alpha \) then \(\beta \in S\), and complete if for each \(\alpha \in S\), there exists \(a \in \varSigma \) such that \(\alpha a \in S\) if and only if \(\alpha b \in S\) for all \(b \in \varSigma \). A prefixclosed set is the backbone of a tree whose edges are labeled with input events. If the set is, moreover, complete, then every node of the tree has either zero successors, in which case it is called a leaf, or it has a successor edge labeled with a for each input event \(a \in \varSigma \).
Definition 2
Lazy annotation semialgorithms [20, 21] build unfoldings of automata trying to discover counterexamples for emptiness. If the automaton \(\mathcal {A}\) in question is nonempty, a systematic enumeration of the input event sequences^{2} from \(\varSigma ^*\) will suffice to discover a word \(w \in {\mathcal L}({\mathcal {A}})\), provided that the firstorder theory of the data domain \(\mathbb {D}\) is decidable (Lemma 2). However, if \({\mathcal L}({\mathcal {A}}) = \emptyset \), the enumeration of input event sequences may, in principle, run forever. The typical way of fighting this divergence problem is to define a coverage relation between the nodes of the unfolding tree.
Definition 3
Given an unfolding U of an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \) a node \(\alpha \in \mathrm {dom}(U)\) is covered by another node \(\beta \in \mathrm {dom}(U)\), denoted \(\alpha \sqsubseteq \beta \), if and only if there exists a node \(\alpha ' \preceq \alpha \) such that Open image in new window . Moreover, U is closed if and only if every leaf from \(\mathrm {dom}(U)\) is covered by an uncovered node.
A lazy annotation semialgorithm will stop and report emptiness provided that it succeeds in building a closed and safe unfolding of the automaton. Notice that, by Definition 3, for any three nodes of an unfolding U, say \(\alpha ,\beta ,\gamma \in \mathrm {dom}(U)\), if \(\alpha \prec \beta \) and \(\alpha \sqsubseteq \gamma \), then \(\beta \sqsubseteq \gamma \) as well. As we show next (Theorem 2), there is no need to expand covered nodes, because, intuitively, there exists a word \(w \in {\mathcal L}({\mathcal {A}})\) such that \(\alpha \preceq {w}_\varSigma \) and \(\alpha \sqsubseteq \gamma \) only if there exists another word \(u \in {\mathcal L}({\mathcal {A}})\) such that \(\gamma \preceq {u}_\varSigma \). Hence, exploring only those input event sequences that are continuations of \(\gamma \) (and ignoring those of \(\alpha \)) suffices in order to find a counterexample for emptiness, if one exists.
An unfolding node \(\alpha \in \mathrm {dom}(U)\) is said to be spurious if and only if \(\varUpsilon ({\alpha })\) is unsatisfiable. In this case, we change (refine) the labels of (some of the) prefixes of \(\alpha \) (and that of \(\alpha \)), such that \(U(\alpha )\) becomes \(\bot \), thus indicating that there is no real execution of the automaton along that input event sequence. As a result of the change of labels, if a node \(\gamma \preceq \alpha \) used to cover another node from \(\mathrm {dom}(U)\), it might not cover it with the new label. Therefore, the coverage relation has to be recomputed after each refinement of the labeling. The semialgorithm stops when (and if) a safe complete unfolding has been found.
Theorem 2
If an automaton \(\mathcal {A}\) has a nonempty safe closed unfolding then \({\mathcal L}({A}) = \emptyset \).
We describe the semialgorithm used to check emptiness of firstorder alternating automata. The execution of Algorithm 1 consists of three phases, corresponding to the Close, Refine and Expand of the original IMPACT procedure [20]. Let n be a node removed from the worklist at line 2 and let \(\alpha (n)\) be the input sequence labeling the path from the root node to n. If \(\overline{\varUpsilon }({\alpha (n)})\) is satisfiable, the sequence \(\alpha (n)\) is feasible, in which case a model of \(\overline{\varUpsilon }({\alpha (n)})\) is obtained and a word \(w \in L(\mathcal {A})\) is returned. Otherwise, \(\alpha (n)\) is an infeasible input sequence and the procedure enters the refinement phase (lines 9–19). The GLI for \(\alpha (n)\) is used to strenghten the labels of all the ancestors of n, by conjoining the formulae of the interpolant, changed according to Lemma 7, to the existing labels.
In this process, the nodes on the path between \(\mathtt {r}\) and n, including n, might become eligible for coverage, therefore we attempt to close each ancestor of n that is impacted by the refinement (line 19). Observe that, in this case the call to Open image in new window must uncover each node which is covered by a successor of n (line 30 of the Open image in new window function). This is required because, due to the overapproximation of the sets of reachable configurations, the covering relation is not transitive, as explained in [20]. If Open image in new window adds a covering edge \((n_i,m)\) to \(\lhd \), it does not have to be called for the successors of \(n_i\) on this path, which is handled via the boolean flag b. Finally, if n is still uncovered (it has not been previously covered during the refinement phase) we expand n (lines 21–25) by creating a new node for each successor s via the input event \(a \in \varSigma \) and inserting it into the worklist.
4 Interpolant Generation
Typically, when checking the unreachability of a set of program configurations, the interpolants used to annotate the unfolded control structure are assertions about the values of the program variables in a given control state, at a certain step of an execution [20]. Because we consider alternating computation trees (forests), we must distinguish between (i) locality of interpolants w.r.t. a given control state (control locality) and (ii) locality w.r.t. a given time stamp (time locality). In logical terms, controllocal interpolants are formulae involving a single predicate symbol, whereas timelocal interpolants involve only predicates \({q}^{\scriptscriptstyle {({i})}}\) and variables \({x}^{\scriptscriptstyle {({i})}}\), for a single \(i \ge 0\). When considering alternating executions, controllocal interpolants are not always enough to prove emptiness, because of the synchronization of several branches of the computation on the same input word. For this reason, the interpolants considered in this paper will never be controllocal and we shall use the term local to denote timelocal interpolants, with no free variables.
First, let us give the formal definition of the class of interpolants we shall work with. Given a formula \(\phi \), the vocabulary of \(\phi \), denoted \(\mathrm {V}({\phi })\) is the set of predicate symbols \(q \in {Q}^{\scriptscriptstyle {({i})}}\) and variables \(x \in {X}^{\scriptscriptstyle {({i})}}\), occurring in \(\phi \), for some \(i\ge 0\). For a term t, its vocabulary \(\mathrm {V}({t})\) is the set of variables that occur in t. Observe that quantified variables and the interpreted function symbols of the data theory^{3} do not belong to the vocabulary of a formula. By \(\mathrm {P}^{+}(\phi )\) [\(\mathrm {P}^{}(\phi )\)] we denote the set of predicate symbols that occur in \(\phi \) under an even [odd] number of negations.
Definition 4
([19]). Given formulae \(\phi \) and \(\psi \) such that \(\phi \wedge \psi \) is unsatisfiable, a Lyndon interpolant is a formula I such that Open image in new window , the formula \(I \wedge \psi \) is unsatisfiable, \(\mathrm {V}({I}) \subseteq \mathrm {V}({\phi }) \cap \mathrm {V}({\psi })\), \(\mathrm {P}^{+}(I) \subseteq \mathrm {P}^{+}(\phi ) \cap \mathrm {P}^{+}(\psi )\) and \(\mathrm {P}^{}(I) \subseteq \mathrm {P}^{}(\phi ) \cap \mathrm {P}^{}(\psi )\).
In the rest of this section, fix an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \). The following definition generalizes interpolants from unsatisfiable conjunctions to input sequences:
Definition 5
Given a sequence of input events \(\alpha = a_1 \ldots a_n \in \varSigma ^*\), a generalized Lyndon interpolant (GLI) is a sequence \((I_0,\ldots ,I_n)\) of formulae such that, for all \(k \in [n1]\), the following hold: (1) \(\mathrm {P}^{}(I_k) = \emptyset \), (2) Open image in new window , Open image in new window and (3) \(I_n \wedge \bigwedge _{q \in Q \setminus F} \forall y_1 \ldots \forall y_{\#(q)}~.~q(\mathbf y) \rightarrow \bot \) is unsatisfiable. Moreover, the GLI is local if and only if \(\mathrm {V}({I_k}) \subseteq {Q}^{\scriptscriptstyle {({k})}}\), for all \(k \in [n]\).
The following proposition states the existence of local GLI for the theories in which Lyndon’s Interpolation Theorem holds.
Proposition 1
If there exists a Lyndon interpolant for any two formulae \(\phi \) and \(\psi \), in the firstorder theory of data with uninterpreted predicate symbols, such that \(\phi \wedge \psi \) is unsatisfiable, then any sequence of input events \(\alpha = a_1 \ldots a_n \in \varSigma ^*\), such that \(\varUpsilon ({\alpha })\) is unsatisfiable, has a local GLI \((I_0,\ldots ,I_n)\).
A problematic point of the above proposition is that the existence of Lyndon interpolants (Definition 4) is proved in principle, but the proof is nonconstructive. In other words, the proof of Proposition 1 does not yield an algorithm for computing GLIs, for the following reason. Building an interpolant for an unsatisfiable conjunction of formulae \(\phi \wedge \psi \) is typically the job of the decision procedure that proves the unsatisfiability and, in general, there is no such procedure, when \(\phi \) and \(\psi \) contain predicates and have nontrivial quantifier alternation. In this case, some provers use instantiation heuristics for the universal quantifiers that are sufficient for proving unsatisfiability, however these heuristics are not always suitable for interpolant generation. Consequently, from now on, we assume the existence of an effective Lyndon interpolation procedure only for decidable theories, such as the quantifierfree linear (integer) arithmetic with uninterpreted functions (UFLIA, UFLRA, etc.) [26].
This is where the predicatefree path formulae (defined in Sect. 2.1) come into play. Recall that, for a given event sequence \(\alpha \), the automaton \(\mathcal {A}\) accepts a word w such that \({w}_\varSigma = \alpha \) if and only if \(\overline{\varUpsilon }({\alpha })\) is satisfiable (Lemma 5). Assuming further that the equality and interpreted predicates (e.g. inequalities for integers) atoms from the transition rules of \(\mathcal {A}\) belong to a decidable firstorder theory, such as Presburger arithmetic, Lemma 5 gives us an effective way of checking emptiness of \(\mathcal {A}\), relative to a given event sequence. However, this method does not cope well with lazy annotation, because there is no way to extract, from the unsatisfiability proof of \(\overline{\varUpsilon }({\alpha })\), the interpolants needed to annotate \(\alpha \). This is because (I) the formula \(\overline{\varUpsilon }({\alpha })\), obtained by repeated substitutions loses track of the steps of the execution, and (II) quantifiers that occur nested in \(\overline{\varUpsilon }({\alpha })\) make it difficult to write \(\overline{\varUpsilon }({\alpha })\) as an unsatisfiable quantifierfree conjunction of formulae from which interpolants are extracted (Definition 4).
The solution we adopt for the first issue (I) consists in partially recovering the timestamped structure of the acceptance formula \(\varUpsilon ({\alpha })\) using the formula \(\widehat{\varUpsilon }({\alpha })\), in which only transition quantifiers occur. The second issue (II) is solved under the additional assuption that the theory of the data domain \(\mathbb {D}\) has witnessproducing quantifier elimination. More precisely, we assume that, for each formula \(\exists x~.~\phi (x)\), there exists an effectively computable term \(\tau \), in which x does not occur, such that \(\exists x~.~\phi \) and \(\phi [\tau /x]\) are equisatisfiable. These terms, called witness terms in the following, are actual definitions of the Skolem function symbols from the following folklore theorem:
Theorem 3
([3]). Given \(Q_1 x_1 \ldots Q_n x_n~.~\phi \) a firstorder sentence, where \(Q_1, \ldots , Q_n \in \{ \exists ,\forall \}\) and \(\phi \) is quantifierfree, let Open image in new window if \(Q_i = \forall \) and Open image in new window if \(Q_i = \exists \), where \(f_i\) is a fresh function symbol and \(\{ y_1, \ldots , y_{k_i} \} = \{ x_j \mid j < i,~Q_j = \exists \}\). Then the entailment Open image in new window holds.
Examples of witnessproducing quantifier elimination procedures can be found in the literature for e.g. linear integer (real) arithmetic (LIA,LRA), Presburger arithmetic and boolean algebra of sets and Presburger cardinality constraints (BAPA) [18].
Under the assumption that witness terms can be effectively built, we describe the generation of a nonlocal GLI for a given input event sequence \(\alpha = a_1 \ldots a_n\). First, we generate successively the acceptance formula \(\varUpsilon ({\alpha })\) and its equisatisfiable forms \(\widehat{\varUpsilon }({\alpha }) = Q_1x_1 \ldots Q_mx_m~.~\widehat{\varPhi }\) and \(\overline{\varUpsilon }({\alpha }) = Q_1x_1 \ldots Q_mx_m~.~\overline{\varPhi }\), both written in prenex form, with matrices \(\widehat{\varPhi }\) and \(\overline{\varPhi }\), respectively. Because we assumed that the first order theory of \(\mathbb {D}\) has quantifier elimination, the satisfiability problem for \(\overline{\varUpsilon }({\alpha })\) is decidable. If \(\overline{\varUpsilon }({\alpha })\) is satisfiable, we build a counterexample for emptiness w such that \({w}_\varSigma =\alpha \) and \({w}_\mathbb {D}\) is a satisfying assignment for \(\overline{\varUpsilon }({\alpha })\). Otherwise, \(\overline{\varUpsilon }({\alpha })\) is unsatisfiable and there exist witness terms \(\tau _{i_1} \ldots \tau _{i_\ell }\), where \(\{ i_1, \ldots , i_\ell \} = \{ j \in [1,m] \mid Q_j = \forall \}\), such that \(\overline{\varPhi }[\tau _{i_1}/x_{i_1}, \ldots , \tau _{i_\ell }/x_{i_\ell }]\) is unsatisfiable (Theorem 3). Then it turns out that the formula \(\widehat{\varPhi }[\tau _{i_1}/x_{i_1}, \ldots , \tau _{i_\ell }/x_{i_\ell }]\), obtained analogously from the matrix of \(\widehat{\varUpsilon }({\alpha })\), is unsatisfiable as well (Lemma 6 below). Because this latter formula is structured as a conjunction of formulae \({\iota }^{\scriptscriptstyle {({0})}} \wedge \phi _1 \ldots \wedge \phi _n \wedge \psi \), where \(\mathrm {V}({\phi _k}) \cap {Q}^{\scriptscriptstyle {({\le n})}} \subseteq {Q}^{\scriptscriptstyle {({k1})}} \cup {Q}^{\scriptscriptstyle {({k})}}\) and \(\mathrm {V}({\psi }) \cap {Q}^{\scriptscriptstyle {({\le n})}} \subseteq {Q}^{\scriptscriptstyle {({n})}}\), it is now possible to use an existing interpolation procedure for the quantifierfree theory of \(\mathbb {D}\), extended with uninterpreted function symbols, to compute a (not necessarily local) GLI \((I_0, \ldots , I_n)\) such that \(\mathrm {V}({I_k}) \cap {Q}^{\scriptscriptstyle {({\le n})}} \subseteq {Q}^{\scriptscriptstyle {({k})}}\), for all \(k \in [n]\).
Example 3
We formalize and prove the correctness for the above construction of nonlocal GLI. A function \(\xi : \mathbf{\mathbb {N}}\rightarrow \mathbf{\mathbb {N}}\) is monotonic iff for each \(n < m\) we have \(\xi (n) \le \xi (m)\) and finiterange iff for each \(n \in \mathbf{\mathbb {N}}\) the set \(\{ m \mid \xi (m)=n \}\) is finite. If \(\xi \) is finiterange, we denote by \(\xi _{\max }^{1}(n) \in \mathbf{\mathbb {N}}\) the maximal value m such that \(\xi (m)=n\).
Lemma 6
 1.
witness terms \(\tau _{i_1}, \ldots , \tau _{i_\ell }\), where \(\{ i_1, \ldots ,i_\ell \} = \{ j \in [1,m] \mid Q_j = \forall \}\) and \(\mathrm {V}({\tau _{i_j}}) \subseteq {X}^{\scriptscriptstyle {({\le \xi (i_j)})}} \cup \{ x_k \mid k < i_j, Q_k = \exists \},~\forall j \in [1,\ell ]\) such that \(\widehat{\varPhi }[\tau _{i_1}/x_{i_1}, \ldots , \tau _{i_\ell }/x_{i_\ell }]\) is unsatisfiable, and
 2.
a GLI \((I_0, \ldots , I_n)\) for \(\alpha \), such that \(\mathrm {V}({I_k}) \subseteq {Q}^{\scriptscriptstyle {({k})}} \cup {X}^{\scriptscriptstyle {({\le k})}} \cup \{ x_j \mid j < \xi _{\max }^{1}(k),~Q_j = \exists \}\), for all \(k \in [n]\).
Consequently, under two assumptions about the firstorder theory of the data domain, namely (i) witnessproducing quantifier elimination, and (ii) Lyndon interpolation for the quantifierfree fragment with uninterpreted functions, we developed a generic method that produces GLIs for unfeasible input event sequences. Moreover, each formula in the interpolant refers only to the current predicate symbols, the current and past input variables and the existentially quantified transition variables introduced at the previous steps. The remaining questions are how to use these GLIs to label the sequences in the unfolding of an automaton (Definition 2) and compute coverage (Definition 3) between nodes of the unfolding.
4.1 Unfolding with Nonlocal Interpolants
As required by Definition 2, the unfolding U of an automaton \(\mathcal {A}= \langle \varSigma ,X,Q,\iota ,F,\varDelta \rangle \) is labeled by formulae \(U(\alpha ) \in \mathsf {Form}^+(Q,\emptyset )\), with no free symbols, other than predicate symbols, such that the labeling is compatible with the transition relation of the automaton. Each newly expanded input sequence of \(\mathcal {A}\) is initially labeled with \(\top \) and the labels are refined using GLIs computed from proofs of spuriousness. The following lemma describes the refinement of the labeling of an input sequence by a nonlocal GLI:
Lemma 7

\(U'(\alpha _k) = U(\alpha _k) \wedge J_k\), for all \(k \in [n]\), where \(J_k\) is the formula obtained from \(I_k\) by removing the time stamp of each predicate symbol \({q}^{\scriptscriptstyle {({k})}}\) and existentially quantifying each free variable, and

\(U'(\beta ) = U(\beta )\) if \(\beta \in \mathrm {dom}(U)\) and \(\beta \not \preceq \alpha \),
Moreover, \(\alpha \) is safe in \(U'\).
Observe that, by Lemma 6(2), the set of free variables of a GLI formula \(I_k\) consists of (i) variables \({X}^{\scriptscriptstyle {({\le k})}}\) keeping track of data values seen in the input at some earlier moment in time, and (ii) variables that track past choices made within the transition rules. Basically, it is not important when exactly in the past a certain input has been read or when a choice has been made, because only the relation between the values of these and the current variables determines the future behavior of the automaton. Quantifying these variables existentially does the job of ignoring when exactly in the past these values have been seen. Moreover, the last point of Lemma 7 ensures that the refined path is safe in the new unfolding and will stay safe in all future refinements of this unfolding.
The last ingredient of the lazy annotation semialgorithm based on unfoldings consist in the implementation of the coverage check, when the unfolding of an automaton is labeled with conjunctions of existentially quantified formulae with predicate symbols, obtained from interpolation. By Definition 3, checking whether a given node \(\alpha \in \mathrm {dom}(U)\) is covered amounts to finding a prefix \(\alpha ' \preceq \alpha \) and a node \(\beta \in \mathrm {dom}(U)\) such that Open image in new window , or equivalently, the formula \(U(\alpha ') \wedge \lnot U(\beta )\) is unsatisfiable. However, the latter formula, in prenex form, has quantifier prefix in the language \(\exists ^*\forall ^*\) and, as previously mentioned, the satisfiability problem for such formulae becomes undecidable when the data theory subsumes Presburger arithmetic [10].
Experiments with First Order Alternating Automata
Example  \({{\mathcal {A}}}\) (bytes)  Predicates  Variables  Transitions  \(L(\mathcal {A})=\emptyset \)?  Nodes expanded  Nodes visited  Time (msec) 

incdec.pa  499  3  1  12  No  21  17  779 
localdec.pa  678  4  1  16  No  49  35  1814 
ticket.pa  4250  13  1  73  No  229  91  9543 
count_thread0.pa  9767  14  1  126  No  154  128  8553 
count_thread1.pa  10925  15  1  135  No  766  692  76771 
local0.pa  10595  13  1  117  No  73  27  1431 
local1.pa  11385  14  1  126  No  1135  858  101042 
array_rotation.ada  1834  8  7  7  Yes  9  8  1543 
array_simple.ada  3440  9  5  8  Yes  11  10  6787 
array_shift.ada  874  6  5  5  Yes  6  5  413 
abp.ada  6909  16  14  28  No  52  47  4788 
train.ada  1823  10  4  26  Yes  68  67  7319 
hw1.ada  322  3  2  5  Solver error  /  /  / 
hw2.ada  674  7  2  8  Yes  20  22  4974 
rrcrossing.foada  1780  10  1  16  Yes  67  67  7574 
trainsimple1.foada  5421  13  1  61  Yes  43  44  2893 
trainsimple2.foada  10177  16  1  118  Yes  111  113  8386 
trainsimple3.foada  15961  19  1  193  Yes  196  200  15041 
fischermutex2.foada  3000  11  2  23  Yes  23  23  808 
fischermutex3.foada  4452  16  2  34  Yes  33  33  1154 
5 Experimental Results
We have implemented a version of the IMPACT semialgorithm [20] in a prototype tool, avaliable online [8]. The tool is written in Java and uses the Z3 SMT solver [27], via the JavaSMT interface [15], for spuriousness and coverage queries and also for interpolant generation. Table 1 reports the size of the input automaton in bytes, the numbers of Predicates, Variables and Transitions, the result of emptiness check, the number of Expanded and Visited Nodes during the unfolding and the Time in miliseconds. The experiments were carried out on a MacOS x64  1.3 GHz Intel Core i5  8 GB 1867 MHz LPDDR3 machine.
The test cases shown in Table 1, come from several sources, namely predicate automata models (*.pa) [6, 7] available online [23], timed automata inclusion problems (abp.ada, train.ada, rrcrossing.foada), array logic entailments (array_rotation.ada, array_simple.ada, array_shift.ada) and hardware circuit verification (hw1.ada, hw2.ada), initially considered in [13], with the restriction that local variables are made visible in the input. The trainsimpleN. foada and fischermutexN. foada examples are parametric verification problems in which one checks inclusions of the form \(\bigcap _{i=1}^N{\mathcal L}({A_i}) \subseteq {\mathcal L}({B})\), where \(A_i\) is the ith copy of the template automaton.
The advantage of using FOADA over the INCLUDER [12] tool from [13] is the possibility of having automata over infinite alphabets with local variables, whose values are not visible in the input. In particular, this is essential for checking inclusion of timed automata that use internal clocks to control the computation.
6 Conclusions
We present firstorder alternating automata, a model of computation that generalizes classical boolean alternating automata to firstorder theories. Due to their expressivity, firstorder alternating automata are closed under union, intersection and complement. However the emptiness problem is undecidable even in the most simple case, of the quantifierfree theory of equality with uninterpreted predicate symbols. We deal with the emptiness problem by developping a practical semialgorithm that always terminates, when the automaton is not empty. In case of emptiness, termination of the semialgorithm occurs in most practical test cases, as shown by a number of experiments.
Footnotes
 1.
Both \(\{ q_1 \leftarrow \top , q_2 \leftarrow \top , q_3 \leftarrow \bot \}\) and \(\{ q_1 \leftarrow \bot , q_2 \leftarrow \bot , q_3 \leftarrow \top \}\) are minimal models, however \(\{ q_1 \leftarrow \top , q_2 \leftarrow \top , q_3 \leftarrow \top \}\) is a model but is not minimal.
 2.
For instance, using breadthfirst search.
 3.
E.g., the arithmetic operators of addition and multiplication, when \(\mathbb {D}\) is the set of integers.
References
 1.Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994)MathSciNetCrossRefGoogle Scholar
 2.Barringer, H., Rydeheard, D., Havelund, K.: Rule systems for runtime monitoring: from Eagle to RuleR. In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS, vol. 4839, pp. 111–125. Springer, Heidelberg (2007). https://doi.org/10.1007/9783540773955_10CrossRefGoogle Scholar
 3.Börger, E., Grädel, E., Gurevich, Y.: The Classical Decision Problem: Perspectives in Mathematical Logic. Springer, Heidelberg (1997)CrossRefGoogle Scholar
 4.Chandra, A.K., Kozen, D.C., Stockmeyer, L.J.: Alternation. J. ACM 28(1), 114–133 (1981)MathSciNetCrossRefGoogle Scholar
 5.D’Antoni, L., Kincaid, Z., Wang, F.: A symbolic decision procedure for symbolic alternating finite automata. Electron. Notes Theor. Comput. Sci. 336, 79–99 (2018). The Thirtythird Conference on the Mathematical Foundations of Programming Semantics (MFPS XXXIII)MathSciNetCrossRefGoogle Scholar
 6.Farzan, A., Kincaid, Z., Podelski, A.: Proof spaces for unbounded parallelism. SIGPLAN Not. 50(1), 407–420 (2015)CrossRefGoogle Scholar
 7.Farzan, A., Kincaid, Z., Podelski, A.: Proving liveness of parameterized programs. In: Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2016, pp. 185–196. ACM (2016)Google Scholar
 8.First Order Alternating Data Automata (FOADA). https://github.com/cathiec/FOADA
 9.Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules. SIGPLAN Not. 47(6), 405–416 (2012)CrossRefGoogle Scholar
 10.Halpern, J.Y.: Presburger arithmetic with unary predicates is \(\pi ^1_1\) complete. J. Symb. Log. 56(2), 637–642 (1991)MathSciNetCrossRefGoogle Scholar
 11.Hojjat, H., Rümmer, P.: Deciding and interpolating algebraic data types by reduction. Technical report. CoRR abs/1801.02367 (2018). http://arxiv.org/abs/1801.02367
 12.
 13.Iosif, R., Rogalewicz, A., Vojnar, T.: Abstraction refinement and antichains for trace inclusion of infinite state systems. In: Chechik, M., Raskin, J.F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 71–89. Springer, Heidelberg (2016). https://doi.org/10.1007/9783662496749_5CrossRefGoogle Scholar
 14.Iosif, R., Xu, X.: Abstraction refinement for emptiness checking of alternating data automata. In: Beyer, D., Huisman, M. (eds.) TACAS 2018. LNCS, vol. 10806, pp. 93–111. Springer, Cham (2018). https://doi.org/10.1007/9783319899633_6CrossRefGoogle Scholar
 15.JavaSMT. https://github.com/sosylab/javasmt
 16.Kaminski, M., Francez, N.: Finitememory automata. Theor. Comput. Sci. 134(2), 329–363 (1994)MathSciNetCrossRefGoogle Scholar
 17.Kincaid, Z.: Parallel proofs for parallel programs. Ph.D. thesis, University of Toronto (2016)Google Scholar
 18.Kuncak, V., Mayer, M., Piskac, R., Suter, P.: Software synthesis procedures. Commun. ACM 55(2), 103–111 (2012)CrossRefGoogle Scholar
 19.Lyndon, R.C.: An interpolation theorem in the predicate calculus. Pacific J. Math. 9(1), 129–142 (1959)MathSciNetCrossRefGoogle Scholar
 20.McMillan, K.L.: Lazy abstraction with interpolants. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006). https://doi.org/10.1007/11817963_14CrossRefGoogle Scholar
 21.McMillan, K.L.: Lazy annotation revisited. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 243–259. Springer, Cham (2014). https://doi.org/10.1007/9783319088679_16CrossRefGoogle Scholar
 22.Nelson, G., Oppen, D.C.: Fast decision procedures based on congruence closure. J. ACM 27(2), 356–364 (1980)MathSciNetCrossRefGoogle Scholar
 23.Predicate Automata. https://github.com/zkincaid/duet/tree/ark2/regression/predicateAutomata
 24.Presburger, M.: Über die Vollstandigkeit eines gewissen Systems der Arithmetik. Comptes rendus du I Congrés des Pays Slaves, Warsaw (1929)Google Scholar
 25.Reynolds, A., King, T., Kuncak, V.: Solving quantified linear arithmetic by counterexampleguided instantiation. Form. Methods Syst. Des. 51(3), 500–532 (2017)CrossRefGoogle Scholar
 26.Rybalchenko, A., SofronieStokkermans, V.: Constraint solving for interpolation. J. Symb. Comput. 45(11), 1212–1233 (2010)MathSciNetCrossRefGoogle Scholar
 27.Z3 SMT Solver. https://rise4fun.com/z3
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.