1 Introduction

The goal of program synthesis is to find a program in some search space that meets a specification—e.g., satisfies a set of examples or a logical formula. Recently, a large family of synthesis problems has been unified into a framework called syntax-guided synthesis (SyGuS). A SyGuS problem is specified by a regular-tree grammar that describes the search space of programs, and a logical formula that constitutes the behavioral specification. Many synthesizers now support a specific format for SyGuS problems [1], and compete in annual synthesis competitions [2]. Thanks to these competitions, these solvers are now quite mature and are finding a wealth of applications [9].

Consider the SyGuS problem to synthesize a function f that computes the maximum of two variables x and y, denoted by \((\psi _{\texttt {max2}}(f,x,y), G_1)\). The goal is to create \(e_f\)—an expression-tree for f—where \(e_f\) is in the language of the following regular-tree grammar \(G_1\):

$$ \begin{array}{r@{~}r@{~~}ll} \text {Start} &{} \,{::=}\, &{} \text {Plus}(\text {Start}, \text {Start}) \mid \text {IfThenElse}(\text {BExpr}, \text {Start}, \text {Start}) \mid x \mid y \mid 0 \mid 1\\ \text {BExpr} &{} \,{::=}\, &{} \text {GreaterThan}(\text {Start}, \text {Start}) \mid \text {Not}(\text {BExpr}) \mid \text {And}(\text {BExpr}, \text {BExpr}) \end{array} $$

and \(\forall x,y. \psi _{\texttt {max2}}([\![e_f ]\!],x,y)\) is valid, where \([\![e_f ]\!] \) denotes the meaning of \(e_f\), and \(\psi _{\texttt {max2}}\) requires the result to be the maximum of the two inputs:

$$ \psi _{\texttt {max2}}(f,x,y) \,{:=}\, f(x,y)\ge x \;\wedge \; f(x,y)\ge y \;\wedge \; (f(x,y)=x \,\vee \, f(x,y)=y). $$
SyGuS solvers can easily find a solution, such as

$$ e := \text {IfThenElse}(\text {GreaterThan}(x,y),x,y). $$

Although many solvers can now find solutions efficiently to many SyGuS problems, there has been effectively no work on the much harder task of proving that a given SyGuS problem is unrealizable—i.e., it does not admit a solution. For example, consider the SyGuS problem \((\psi _{\texttt {max2}}(f,x,y), G_2)\), where \(G_2\) is the more restricted grammar with if-then-else operators and conditions stripped out:

$$ \begin{array}{r@{~}r@{~~}ll} \text {Start}&\,{::=}\,&\text {Plus}(\text {Start}, \text {Start}) \quad \mid x \mid y \mid 0 \mid 1 \end{array} $$

This SyGuS problem does not have a solution, because no expression generated by \(G_2\) meets the specification. However, to the best of our knowledge, current SyGuS solvers cannot prove that such a SyGuS problem is unrealizable.

A key property of the previous example is that the grammar is infinite. When such a SyGuS problem is realizable, any search technique that systematically explores the infinite search space of possible programs will eventually identify a solution to the synthesis problem. In contrast, proving that a problem is unrealizable requires showing that every program in the infinite search space fails to satisfy the specification. This problem is in general undecidable [6]. Although we cannot hope to have an algorithm for establishing unrealizability, the challenge is to find a technique that succeeds for the kinds of problems encountered in practice. Existing synthesizers can detect the absence of a solution in certain cases (e.g., because the grammar is finite, or is infinite but only generates a finite number of functionally distinct programs). However, in practice, as our experiments show, this ability is limited—no existing solver was able to show unrealizability for any of the examples considered in this paper.

In this paper, we present a technique for proving that a possibly infinite SyGuS problem is unrealizable. Our technique builds on two ideas.

  1.

    We observe that unrealizability can often be proven using finitely many input examples. In Sect. 2, we show how the example discussed above can be proven to be unrealizable using four input examples—(0, 0), (0, 1), (1, 0), and (1, 1).

  2.

    We devise a way to encode a SyGuS problem \((\psi (f,\bar{x}), G)\) over a finite set of examples E as a reachability problem in a recursive program \(P[{G},{E}]\). In particular, the program that we construct has an assertion that holds if and only if the given SyGuS problem is unrealizable. Consequently, unrealizability can be proven by establishing that the assertion always holds. This property can often be established by a conventional program-analysis tool.

The encoding mentioned in item 2 is non-trivial for three reasons. The following list explains each issue and sketches how it is addressed.

(1) Infinitely many terms. We need to model the infinitely many terms generated by the grammar of a given synthesis problem \((\psi (f,\bar{x}), G)\).

To address this issue, we use non-determinism and recursion, and give an encoding \(P[{G},{E}]\) such that (i) each non-deterministic path p in the program \(P[{G},{E}]\) corresponds to a possible expression \(e_p\) that G can generate, and (ii) for each expression e that G can generate, there is a path \(p_e\) in \(P[{G},{E}]\). (There is an isomorphism between the paths of \(P[{G},{E}]\) and the expression-trees that G generates.)

(2) Nondeterminism. We need the computation performed along each path p in \(P[{G},{E}]\) to mimic the execution of expression \(e_p\). Because the program uses non-determinism, we need to make sure that, for a given path p in the program \(P[{G},{E}]\), computational steps are carried out that mimic the evaluation of \(e_p\) for each of the finitely many example inputs in E.

We address this issue by threading the expression-evaluation computations associated with each example in E through the same non-deterministic choices.

(3) Complex Specifications. We need to handle specifications that allow for nested calls of the programs being synthesized.

For instance, consider the specification \(f(f(x)) = x\). To handle this specification, we introduce a new variable y and rewrite the specification as \(f(x) = y \wedge f(y) = x\). Because y is now also used as an input to f, we thread the computations for both x and y through the non-deterministic recursive program.

Our work makes the following contributions:

  • We reduce the SyGuS unrealizability problem to a reachability problem to which standard program-analysis tools can be applied (Sects. 2 and 4).

  • We observe that, for many SyGuS problems, unrealizability can be proven using finitely many input examples, and use this idea to apply the Counter-Example-Guided Inductive Synthesis (CEGIS) algorithm to the problem of proving unrealizability (Sect. 3).

  • We give an encoding of a SyGuS problem \((\psi (f,\bar{x}), G)\) over a finite set of examples E as a reachability problem in a nondeterministic recursive program \(P[{G},{E}]\), which has the following property: if a certain assertion in \(P[{G},{E}]\) always holds, then the synthesis problem is unrealizable (Sect. 4).

  • We implement our technique in a tool nope using the ESolver synthesizer [2] as the SyGuS solver and the SeaHorn tool [8] for checking reachability. nope is able to establish unrealizability for 59 out of 132 variants of benchmarks taken from the SyGuS competition. In particular, nope solves all benchmarks with no more than 15 productions in the grammar and requiring no more than 9 input examples for proving unrealizability. Existing SyGuS solvers lack the ability to prove that these benchmarks are unrealizable, and time out on them.

Section 6 discusses related work. Some additional technical material, proofs, and full experimental results are given in [13].

2 Illustrative Example

In this section, we illustrate the main components of our framework for establishing the unrealizability of a SyGuS problem.

As an example, we use the problem \((\psi _{\texttt {max2}}(f,x,y), G_2)\) discussed in Sect. 1, and show how unrealizability can be proven using four input examples: (0, 0), (0, 1), (1, 0), and (1, 1).

Fig. 1. Program \(P[{G_2},{E_1}]\) created during the course of proving the unrealizability of \((\psi _{\texttt {max2}}(f,x,y), G_2)\) using the set of input examples \(E_1=\{(0,1)\}\).

Our method can be seen as a variant of Counter-Example-Guided Inductive Synthesis (CEGIS), in which the goal is to create a program P in which a certain assertion always holds. Until such a program is created, each round of the algorithm returns a counter-example, from which we extract an additional input example for the original SyGuS problem. On the \(i^{\text {th}}\) round, the current set of input examples \(E_i\) is used, together with the grammar—in this case \(G_2\)—and the specification of the desired behavior—\(\psi _{\texttt {max2}}(f,x,y)\), to create a candidate program \(P[{G_2},{E_i}]\). The program \(P[{G_2},{E_i}]\) contains an assertion, and a standard program analyzer is used to check whether the assertion always holds.

Suppose that for the SyGuS problem \((\psi _{\texttt {max2}}(f,x,y), G_2)\) we start with just the one example input (0, 1)—i.e., \(E_1=\{(0,1)\}\). Figure 1 shows the initial program \(P[{G_2},{E_1}]\) that our method creates. The function spec implements the predicate \(\psi _{\texttt {max2}}(f,x,y)\). (All of the programs \(\{ P[{G_2},{E_i}] \}\) use the same function spec.) The initialization statements at line (21) in procedure main correspond to the input example (0, 1). The recursive procedure Start encodes the productions of grammar \(G_2\). Start is non-deterministic; it contains four calls to an external function nd(), which returns a non-deterministically chosen Boolean value. The calls to nd() can be understood as controlling whether or not a production is selected from \(G_2\) during a top-down, left-to-right generation of an expression-tree: lines (3)–(8) correspond to “Start ::= Plus(Start, Start),” and lines (10), (11), (12), and (13) correspond to “Start ::= x,” “Start ::= y,” “Start ::= 1,” and “Start ::= 0,” respectively. The code in the five cases in the body of Start encodes the semantics of the respective production of \(G_2\); in particular, the statements that are executed along any execution path of \(P[{G_2},{E_1}]\) implement the bottom-up evaluation of some expression-tree that can be generated by \(G_2\). For instance, consider the path that visits statements in the following order (for brevity, some statement numbers have been elided):

$$\begin{aligned} 21~~\, 22~~\,(_{\texttt {Start}}~~\, 3~~\, 4~~\,(_{\texttt {Start}}~~\, 10~~\, )_{\texttt {Start}}~~\, 6~~\,(_{\texttt {Start}}~~\, 12~~\, )_{\texttt {Start}}~~\, 8~~\, )_{\texttt {Start}}~~\, 23, \end{aligned}$$
(1)

where \((_{\texttt {Start}}\) and \()_{\texttt {Start}}\) indicate entry to, and return from, procedure Start, respectively. Path (1) corresponds to the top-down, left-to-right generation of the expression-tree Plus(x,1), interleaved with the tree’s bottom-up evaluation.
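The following C sketch conveys the structure just described. The names spec, Start, and I_0 come from the description above, whereas x_0, y_0, and the exact form of the assertion are illustrative assumptions, and the layout (and hence the line numbering) differs from the actual figure.

    #include <assert.h>

    extern int nd(void);           /* returns a non-deterministically chosen Boolean value */

    int x_0, y_0;                  /* the input example (0,1)                               */
    int I_0;                       /* value of the generated expression on (x_0, y_0)       */

    int spec(int x, int y, int f)  /* psi_max2: f is the maximum of x and y                 */
    { return f >= x && f >= y && (f == x || f == y); }

    void Start(void) {
      if (nd()) {                  /* Start ::= Plus(Start, Start)            */
        Start();                   /* generate and evaluate the left operand  */
        int left_0 = I_0;
        Start();                   /* generate and evaluate the right operand */
        I_0 = left_0 + I_0;
      }
      else if (nd()) I_0 = x_0;    /* Start ::= x */
      else if (nd()) I_0 = y_0;    /* Start ::= y */
      else if (nd()) I_0 = 1;      /* Start ::= 1 */
      else           I_0 = 0;      /* Start ::= 0 */
    }

    int main(void) {
      x_0 = 0; y_0 = 1;            /* the input example (0,1) */
      Start();
      assert(!spec(x_0, y_0, I_0));  /* holds on every path iff no expression-tree
                                        generated by G_2 satisfies psi_max2 on E_1 */
      return 0;
    }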

Fig. 2. Program \(P[{G_2},{E_4}]\) created during the course of proving the unrealizability of \((\psi _{\texttt {max2}}(f,x,y), G_2)\) using the set of input examples \(E_4=\{(0,0), (0,1), (1,0), (1,1)\}\).

Note that with path (1), when control returns to main, variable I_0 has the value 1, and thus the assertion at line (23) fails.

A sound program analyzer will discover that some such path exists in the program, and will return the sequence of non-deterministic choices required to follow one such path. Suppose that the analyzer chooses to report path (1); the sequence of choices would be tftffft, which can be decoded to create the expression-tree Plus(x,1). At this point, we have a candidate definition for f: \(f = x + 1\). This formula can be checked using an SMT solver to see whether it satisfies the behavioral specification \(\psi _{\texttt {max2}}(f,x,y)\). In this case, the SMT solver returns “false.” One counter-example that it could return is (0, 0).

At this point, program \(P[{G_2},{E_2}]\) would be constructed using both of the example inputs (0, 1) and (0, 0). Rather than describe \(P[{G_2},{E_2}]\), we will describe the final program constructed, \(P[{G_2},{E_4}]\) (see Fig. 2).

As can be seen from the comments in the two programs, program \(P[{G_2},{E_4}]\) has the same basic structure as \(P[{G_2},{E_1}]\).

  • main begins with initialization statements for the four example inputs.

  • Start has five cases that correspond to the five productions of \(G_2\).

The main difference is that because the encoding of \(G_2\) in Start uses non-determinism, we need to make sure that along each path p in \(P[{G_2},{E_4}]\), each of the example inputs is used to evaluate the same expression-tree. We address this issue by threading the expression-evaluation computations associated with each of the example inputs through the same non-deterministic choices. That is, each of the five “production cases” in Start has four encodings of the production’s semantics—one for each of the four expression evaluations. By this means, the statements that are executed along path p perform four simultaneous bottom-up evaluations of the expression-tree from \(G_2\) that corresponds to p.
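Concretely, using the same illustrative naming as in the sketch of Sect. 2 (with I_0, ..., I_3 now holding the evaluation results for the four input examples), the Plus case of Start in \(P[{G_2},{E_4}]\) would look roughly as follows:

    if (nd()) {                    /* Start ::= Plus(Start, Start)                         */
      Start();                     /* left operand, evaluated on all four examples         */
      int l_0 = I_0, l_1 = I_1, l_2 = I_2, l_3 = I_3;
      Start();                     /* right operand, driven by the same non-deterministic choices */
      I_0 = l_0 + I_0;  I_1 = l_1 + I_1;
      I_2 = l_2 + I_2;  I_3 = l_3 + I_3;
    }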

Programs \(P[{G_2},{E_2}]\) and \(P[{G_2},{E_3}]\) are similar to \(P[{G_2},{E_4}]\), but their paths carry out two and three simultaneous bottom-up evaluations, respectively. The actions taken during rounds 2 and 3 to generate a new counter-example—and hence a new example input—are similar to what was described for round 1. On round 4, however, the program analyzer will determine that the assertion on lines (34)–(35) always holds, which means that there is no path through \(P[{G_2},{E_4}]\) for which the behavioral specification holds for all of the input examples. This property means that there is no expression-tree that satisfies the specification—i.e., the SyGuS problem \((\psi _{\texttt {max2}}(f,x,y), G_2)\) is unrealizable.

Our implementation uses the program-analysis tool SeaHorn [8] as the assertion checker. In the case of \(P[{G_2},{E_4}]\), SeaHorn takes only 0.5 s to establish that the assertion in \(P[{G_2},{E_4}]\) always holds.

3 SyGuS, Realizability, and CEGIS

3.1 Background

Trees and Tree Grammars. A ranked alphabet is a tuple \((\varSigma ,rk_\varSigma )\) where \(\varSigma \) is a finite set of symbols and \(rk_\varSigma :\varSigma \rightarrow \mathbb {N}\) associates a rank to each symbol. For every \(m\ge 0\), the set of all symbols in \(\varSigma \) with rank m is denoted by \(\varSigma ^{(m)}\). In our examples, a ranked alphabet is specified by showing the set \(\varSigma \) and attaching the respective rank to every symbol as a superscript—e.g., \(\varSigma =\{+^{(2)},c^{(0)}\}\). (For brevity, the superscript is sometimes omitted.) We use \(T_\varSigma \) to denote the set of all (ranked) trees over \(\varSigma \)—i.e., \(T_\varSigma \) is the smallest set such that (i) \(\varSigma ^{(0)} \subseteq T_\varSigma \), and (ii) if \(\sigma ^{(k)} \in \varSigma ^{(k)}\) and \(t_1,\ldots ,t_k\in T_\varSigma \), then \(\sigma ^{(k)}(t_1,\ldots ,t_k)\in T_\varSigma \). In what follows, we assume a fixed ranked alphabet \((\varSigma ,rk_\varSigma )\).

In this paper, we focus on typed regular tree grammars, in which each nonterminal and each symbol is associated with a type. There is a finite set of types \(\{ \tau _1, \ldots , \tau _k \}\). Associated with each symbol \(\sigma ^{(i)} \in \varSigma ^{(i)}\), there is a type assignment \(a_{\sigma ^{(i)}} = (\tau _0, \tau _1, \ldots , \tau _i)\), where \(\tau _0\) is called the left-hand-side type and \(\tau _1, \ldots , \tau _i\) are called the right-hand-side types. Tree grammars are similar to word grammars, but generate trees over a ranked alphabet instead of words.

Definition 1

(Regular Tree Grammar). A typed regular tree grammar (RTG) is a tuple \(G=(N,\varSigma ,S,a,\delta )\), where N is a finite set of non-terminal symbols of arity 0; \(\varSigma \) is a ranked alphabet; \(S\in N\) is an initial non-terminal; a is a type assignment that gives types for members of \(\varSigma \cup N\); and \(\delta \) is a finite set of productions of the form \(A_0 \rightarrow \sigma ^{(i)}(A_1, \ldots , A_i)\), where for \(1 \le j \le i\), each \(A_j \in N\) is a non-terminal such that if \(a(\sigma ^{(i)}) = (\tau _0, \tau _1, \ldots , \tau _i)\) then \(a(A_j) = \tau _j\).

In a SyGuS problem, each variable, such as x and y in the example RTGs in Sect. 1, is treated as an arity-0 symbol—i.e., \(x^{(0)}\) and \(y^{(0)}\).

Given a tree \(t\in T_{\varSigma \cup N}\), applying a production \(r = A\rightarrow \beta \) to t produces the tree \(t'\) resulting from replacing the left-most occurrence of A in t with the right-hand side \(\beta \). A tree \(t\in T_\varSigma \) is generated by the grammar G—denoted by \(t\in L(G)\)—iff it can be obtained by applying a sequence of productions \(r_1\cdots r_n\) to the tree whose root is the initial non-terminal S.

Syntax-Guided Synthesis. A SyGuS problem is specified with respect to a background theory T—e.g., linear arithmetic—and the goal is to synthesize a function f that satisfies two constraints provided by the user. The first constraint, \(\psi (f,\bar{x})\), describes a semantic property that f should satisfy. The second constraint limits the search space S of f, and is given as a set of expressions specified by an RTG G that defines a subset of all terms in T.

Definition 2

A SyGuS problem over a background theory T is a pair \({sy}=(\psi (f,\bar{x}), G)\) where G is a regular tree grammar that only contains terms in T—i.e., \(L(G)\subseteq T\)—and \(\psi (f,\bar{x})\) is a Boolean formula constraining the semantic behavior of the synthesized program f.

A SyGuS problem is realizable if there exists an expression \(e\in L(G)\) such that \(\forall \bar{x}{.} \psi ([\![e ]\!],\bar{x})\) is true. Otherwise we say that the problem is unrealizable.

Theorem 1

(Undecidability [6]). Given a SyGuS problem \({sy}\), it is undecidable to check whether \({sy}\) is realizable.

Counterexample-Guided Inductive Synthesis. The Counterexample-Guided Inductive Synthesis (CEGIS) algorithm is a popular approach to solving synthesis problems. Instead of directly looking for an expression that satisfies the specification \(\psi \) on all possible inputs, the CEGIS algorithm uses a synthesizer S that can find expressions that are correct on a finite set of examples E. If S finds a solution that is correct on all elements of E, CEGIS uses a verifier V to check whether the discovered solution is also correct for all possible inputs to the problem. If not, a counterexample obtained from V is added to the set of examples, and the process repeats. More formally, CEGIS starts with an empty set of examples \({E} \) and repeats the following two steps (a code sketch is given after the list):

  1.

    Call the synthesizer S to find an expression e such that \(\forall \bar{x}\in {E} {.}\psi ([\![e ]\!],\bar{x})\) holds and go to step 2; return unrealizable if no such expression exists.

  2.

    Call the verifier V to find a model c for the formula \(\lnot \psi ([\![e ]\!],\bar{x})\), and add c to the counterexample set \({E} \); return e as a valid solution if no model is found.
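The loop can be summarized by the following C sketch; Expr, Example, synth, and verify are illustrative placeholders for the synthesizer S, the verifier V, and their data, and are not part of any existing tool.

    #include <stddef.h>

    #define MAX_EXAMPLES 1024              /* illustrative bound on |E|              */

    typedef struct Expr Expr;              /* a candidate expression-tree            */
    typedef struct { int x, y; } Example;  /* a concrete input, here for psi_max2    */

    /* synth solves sy^E and returns NULL if sy^E is unrealizable;
       verify returns 1 if e is correct on all inputs, and otherwise
       writes a counterexample input into *cex.                        */
    Expr *synth(const Example E[], int n);
    int   verify(const Expr *e, Example *cex);

    const Expr *cegis(void) {
      Example E[MAX_EXAMPLES];
      int n = 0;
      for (;;) {
        Expr *e = synth(E, n);             /* step 1: find e correct on E              */
        if (e == NULL) return NULL;        /* sy^E unrealizable, hence sy unrealizable */
        Example cex;
        if (verify(e, &cex)) return e;     /* step 2: e is valid on all inputs         */
        E[n++] = cex;                      /* otherwise add the counterexample to E    */
      }
    }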

Because SyGuS problems are only defined over first-order decidable theories, any SMT solver can be used as the verifier V to check whether the formula \(\lnot \psi ([\![e ]\!],\bar{x})\) is satisfiable. On the other hand, providing a synthesizer S to find solutions such that \(\forall \bar{x}\in {E} {.}\psi ([\![e ]\!],\bar{x})\) holds is a much harder problem because e is a second-order term drawn from an infinite search space. In fact, checking whether such an e exists is an undecidable problem [6].

The main contribution of our paper is a reduction of the unrealizability problem—i.e., the problem of proving that there is no expression \(e\in L(G)\) such that \(\forall \bar{x}\in {E}.\psi ([\![e ]\!],\bar{x})\) holds—to an unreachability problem (Sect. 4). This reduction allows us to use existing (un)reachability verifiers to check whether a SyGuS instance is unrealizable.

3.2 CEGIS and Unrealizability

The CEGIS algorithm is sound but incomplete for proving unrealizability. Given a SyGuS problem \({sy}=(\psi (f,\bar{x}),G)\) and a finite set of inputs \({E} \), we denote with \({sy}^{E}:=(\psi ^{E} (f,\bar{x}),G)\) the corresponding SyGuS problem that only requires the function f to be correct on the examples in E.

Lemma 1

(Soundness). If \({sy}^{E} \) is unrealizable then \({sy}\) is unrealizable.

Even when given a perfect synthesizer S—i.e., one that can solve a problem \(sy^{E} \) for every possible set \({E} \)—there are SyGuS problems for which the CEGIS algorithm is not powerful enough to prove unrealizability.

Lemma 2

(Incompleteness). There exists an unrealizable SyGuS problem \({sy}\) such that for every finite set of examples \({E} \) the problem \({sy}^{E} \) is realizable.

Despite this negative result, we will show that a CEGIS algorithm can prove unrealizability for many SyGuS instances (Sect. 5).

4 From Unrealizability to Unreachability

In this section, we show how a SyGuS problem for finitely many examples can be reduced to a reachability problem in a non-deterministic, recursive program in an imperative programming language.

4.1 Reachability Problems

A program P takes an initial state I as input and outputs a final state O, i.e., \([\![P ]\!] (I)=O\) where \([\![\cdot ]\!] \) denotes the semantic function of the programming language. As illustrated in Sect. 2, we allow a program to contain calls to an external function nd(), which returns a non-deterministically chosen Boolean value. When program P contains calls to nd(), we use \(\hat{P}\) to denote the program that is the same as P except that \(\hat{P}\) takes an additional integer input \(\texttt {n}\), and each call \(\texttt {nd()}\) is replaced by a call to a local function nextbit() defined as follows:
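A minimal sketch of nextbit(), assuming that \(\hat{P}\) keeps the remaining portion of \(\texttt {n}\) in a variable visible to nextbit() and consumes its bits from the low-order end (the convention used in Example 1 below):

    int n;                  /* the additional integer input of P-hat                     */

    int nextbit(void) {     /* replaces each call to nd()                                */
      int b = n % 2;        /* next (low-order) bit of the encoded sequence of choices   */
      n = n / 2;            /* consume that bit                                           */
      return b;
    }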

In other words, the integer parameter \(\texttt {n}\) of \(\hat{P}[\texttt {n}]\) formalizes all of the non-deterministic choices made by P in calls to \(\texttt {nd()}\).

For the programs \(P[{G},{E}]\) used in our unrealizability algorithm, the only calls to \(\texttt {nd()}\) are ones that control whether or not a production is selected from grammar G during a top-down, left-to-right generation of an expression-tree. Given \(\texttt {n}\), we can decode it to identify which expression-tree \(\texttt {n}\) represents.

Example 1

Consider again the SyGuS problem \((\psi _{\texttt {max2}}(f,x,y), G_2)\) discussed in Sect. 2. In the discussion of the initial program \(P[{G_2},{E_1}]\) (Fig. 1), we hypothesized that the program analyzer chose to report path (1) in P, for which the sequence of non-deterministic choices is tftffft. That sequence means that for \({\hat{P}}[\texttt {n}]\), the value of \(\texttt {n}\) is \(1000101~(\text {base}~2)\) (or \(69~(\text {base}~10)\)). The 1s, from low-order to high-order position, represent choices of production instances in a top-down, left-to-right generation of an expression-tree. (The 0s represent rejected possible choices.) The rightmost 1 in \(\texttt {n}\) corresponds to the choice at line (3) of the production “Start ::= Plus(Start, Start)”; the 1 in the third-from-rightmost position corresponds to the choice at line (10) of x as the left child of the Plus node; and the 1 in the leftmost position corresponds to the choice at line (12) of 1 as the right child. By this means, we learn that the behavioral specification \(\psi _{\texttt {max2}}(f,x,y)\) holds for the example set \(E_1 = \{ (0,1) \}\) for \(f \mapsto \texttt {Plus(x,1)}\).    \(\square \)

Definition 3

(Reachability Problem). Given a program \(\hat{P}[\texttt {n}]\), containing assertion statements and a non-deterministic integer input \(\texttt {n}\), we use \(re_P\) to denote the corresponding reachability problem. The reachability problem \(re_P\) is satisfiable if there exists a value n that, when bound to \(\texttt {n}\), falsifies any of the assertions in \(\hat{P}[\texttt {n}]\). The problem is unsatisfiable otherwise.

4.2 Reduction to Reachability

The main component of our framework is an encoding \(\textit{enc} \) that given a SyGuS problem \({sy}^{E} =(\psi ^{E} (f,{x}), G)\) over a set of examples \({E} = \{c_1,\ldots ,c_k\}\), outputs a program \(P[{G},{{E}}]\) such that \({sy}^{E} \) is realizable if and only if \(re_{\textit{enc} ({sy}, E)}\) is satisfiable. In this section, we define all the components of \(P[{G},{{E}}]\), and state the correctness properties of our reduction.

Remark: In this section, we assume that in the specification \(\psi (f,x)\) every occurrence of f has x as input parameter. We show how to overcome this restriction in App. A [13]. In the following, we assume that the input x has type \(\tau _I\), where \(\tau _I\) could be a complex type—e.g., a tuple type.

Program Construction. Recall that the grammar G is a tuple \((N,\varSigma ,S,a,\delta )\). First, for each non-terminal \(A\in N\), the program \(P[{G},{{E}}]\) contains k global variables \(\{\texttt {g\_1\_A},\dots ,\texttt {g\_k\_A}\}\) of type a(A) that are used to express the values resulting from evaluating expressions generated from non-terminal A on the k examples. Second, for each non-terminal \(A\in N\), the program \(P[{G},{{E}}]\) contains a function \(\texttt {funcA}\) whose body encodes the productions whose left-hand-side non-terminal is A.

We denote by \(\delta (A)=\{r_1,\dots ,r_m\}\) the set of production rules of the form \(A\rightarrow \beta \) in \(\delta \). The body bodyA of \(\texttt {funcA}\) has the following structure:
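A sketch of that structure, written in the same style as the program of Fig. 1, where \(\texttt {En}_\delta (r_i)\) stands for the code block defined next (the last production occupies the final else branch, so it is selected after \(m-1\) negative choices):

    void funcA() {
      if (nd())      { En_delta(r_1) }
      else if (nd()) { En_delta(r_2) }
      ...
      else           { En_delta(r_m) }
    }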

The encoding \(\texttt {En}_\delta (r)\) of a production \(r=A_0\rightarrow b^{(j)}(A_1,\cdots ,A_j)\) is defined as follows (\(\tau _i\) denotes the type of the term \(A_i\)):
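A sketch of \(\texttt {En}_\delta (r)\); the local variables \(\texttt {child\_m\_i}\), which save the value computed for \(A_i\) on the \(m^{\text {th}}\) example before a later recursive call overwrites the globals, are illustrative names:

    /* evaluate the sub-expressions for A_1, ..., A_j on all k examples */
    funcA1();
    tau_1 child_1_1 = g_1_A1;  ...  tau_1 child_k_1 = g_k_A1;
    ...
    funcAj();
    tau_j child_1_j = g_1_Aj;  ...  tau_j child_k_j = g_k_Aj;

    /* combine the child values according to the semantics of b, once per example */
    g_1_A0 = enc_b^1(child_1_1, ..., child_1_j);
    ...
    g_k_A0 = enc_b^k(child_k_1, ..., child_k_j);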

Note that if \(b^{(j)}\) is of arity 0—i.e., if \(j=0\)—the construction yields k assignments of the form \(\texttt {g\_m\_A0} = \textit{enc} _b^m()\).

The function \(\textit{enc} ^m_b\) interprets the semantics of b on the \(m^{\text {th}}\) input example. We take Linear Integer Arithmetic as an example to illustrate how \(\textit{enc} ^m_b\) works.
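For instance, for the operators appearing in \(G_1\) and \(G_2\), one possible definition is the following, where \(x^{(m)}\) and \(y^{(m)}\) denote the values of x and y in the \(m^{\text {th}}\) example \(c_m\):

$$ \begin{array}{l@{\qquad }l} \textit{enc} ^m_{\texttt {Plus}}(v_1,v_2) = v_1+v_2 & \textit{enc} ^m_{\texttt {IfThenElse}}(b,v_1,v_2) = \text {if}~b~\text {then}~v_1~\text {else}~v_2\\ \textit{enc} ^m_{\texttt {GreaterThan}}(v_1,v_2) = v_1 > v_2 & \textit{enc} ^m_{\texttt {Not}}(b) = \lnot b \qquad \textit{enc} ^m_{\texttt {And}}(b_1,b_2) = b_1 \wedge b_2\\ \textit{enc} ^m_{\texttt {x}}() = x^{(m)} \qquad \textit{enc} ^m_{\texttt {y}}() = y^{(m)} & \textit{enc} ^m_{\texttt {0}}() = 0 \qquad \textit{enc} ^m_{\texttt {1}}() = 1 \end{array} $$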

We now turn to the correctness of the construction. First, we formalize the relationship between expression-trees in L(G), the semantics of \(P[{G},{{E}}]\), and the number \(\texttt {n}\). Given an expression-tree e, we assume that each node \(q\) in e is annotated with the production that has produced that node. Recall that \(\delta (A)=\{r_1,\dots ,r_m\}\) is the set of productions with head A (where the subscripts are indexes in some arbitrary, but fixed order). Concretely, for every node \(q\), we assume there is a function \(pr(q)=(A,i)\), which associates \(q\) with a pair indicating that non-terminal A produced \(q\) using the production \(r_i\) (i.e., \(r_i\) is the \(i^{\text {th}}\) production whose left-hand-side non-terminal is A).

We now define how we can extract a number \({\#}(e)\) for which the program \(\hat{P}[{\#}(e)]\) will exhibit the same semantics as that of the expression-tree e. First, for every node \(q\) in e such that \(pr(q)=(A,i)\), we define the following number:

$$ {\#_{nd}}(q)= {\left\{ \begin{array}{ll} 1\underbrace{0\cdots 0}_{i-1} &{} \quad \text {if}~i < |\delta (A)| \\ \underbrace{0\cdots 0}_{i-1} &{} \quad \text {if}~i = |\delta (A)|. \end{array}\right. } $$

The number \({\#_{nd}}(q)\) indicates what suffix of the value of \(\texttt {n}\) will cause \(\texttt {funcA}\) to trigger the code corresponding to production \(r_i\). Let \(q_1\cdots q_m\) be the sequence of nodes visited during a pre-order traversal of expression-tree e. The number corresponding to e, denoted by \({\#}(e)\), is defined as the bit-vector \({\#_{nd}}(q_m)\cdots {\#_{nd}}(q_1)\).
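For example, take \(e = \texttt {Plus(x,1)}\) and grammar \(G_2\), with the productions of Start indexed in the order used by the program of Fig. 1 (Plus, x, y, 1, 0), so that \(|\delta (\text {Start})| = 5\). The pre-order traversal of e visits the Plus node (\(i=1\)), the x node (\(i=2\)), and the 1 node (\(i=4\)), whose \({\#_{nd}}\) values are 1, 10, and 1000, respectively; concatenating them in reverse visit order gives \({\#}(e) = 1000101~(\text {base}~2) = 69\), which is exactly the value of \(\texttt {n}\) decoded in Example 1.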

Finally, we add the entry-point of the program, which calls the function \(\texttt {funcS}\) corresponding to the initial non-terminal S, and contains the assertion that encodes our reachability problem on all the input examples \({E} = \{c_1,\ldots ,c_k\}\).
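A sketch of the entry point, assuming that each example \(c_m\) supplies the input values read by \(\textit{enc} ^m_b\) and that \(\texttt {spec\_m}\) encodes \(\psi \) instantiated on example \(c_m\):

    void main() {
      /* bind the k input examples c_1, ..., c_k to the corresponding input variables */
      ...
      funcS();      /* evaluate one non-deterministically chosen expression-tree of L(G) on every example */
      assert(!(spec_1(g_1_S) && ... && spec_k(g_k_S)));
      /* the assertion holds on all paths iff no e in L(G) satisfies psi on every example in E */
    }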

Correctness. We first need to show that the function \({\#}(\cdot )\) captures the correct language of expression-trees. Given a non-terminal A, a value n, and input values \(i_1,\ldots ,i_k\), we use \([\![\texttt {funcA}[n] ]\!] (i_1,\dots ,i_k)=(o_1,\ldots ,o_k)\) to denote the values of the variables \(\{\texttt {g\_1\_A},\dots ,\texttt {g\_k\_A}\}\) at the end of the execution of \(\texttt {funcA}[\texttt {n}]\) with the initial value of \(\texttt {n} = n\) and input values \(i_1,\ldots ,i_k\). Given a non-terminal A, we write L(G, A) to denote the set of terms that can be derived starting with A.

Lemma 3

Let A be a non-terminal, \(e\in L(G,A)\) an expression, and \(\{i_1,\dots ,i_k\}\) an input set. Then, \(([\![e ]\!] (i_1),\dots ,[\![e ]\!] (i_k))=[\![\texttt {funcA}[{\#}(e)] ]\!] (i_1,\dots ,i_k)\).

Each procedure \(\texttt {funcA}[\texttt {n}](i_1,\dots ,i_k)\) that we construct has an explicit dependence on variable \(\texttt {n}\), where \(\texttt {n}\) controls the non-deterministic choices made by \(\texttt {funcA}\) and by the procedures that \(\texttt {funcA}\) calls. As a consequence, when relating numbers and expression-trees, there are two additional issues to contend with:

  • Non-termination. Some numbers can cause \(\texttt {funcA}[n]\) to fail to terminate. For instance, if the case for “Start ::= Plus(Start, Start)” in program \(P[{G_2},{E_1}]\) from Fig. 1 were moved from the first branch (lines (3)–(8)) to the final else case (line (13)), the number \(\texttt {n} = 0 = \ldots 0000000~(\text {base}~2)\) would cause Start to never terminate, due to repeated selections of Plus nodes. However, note that the only assert statement in the program is placed at the end of the main procedure. Now, consider a value of n such that \(re_{\textit{enc} ({sy}, E)}\) is satisfiable. Definition 3 implies that the flow of control will reach and falsify the assertion, which implies that \(\texttt {funcA}[{n}]\) terminates.

  • Shared suffixes of sufficient length. In Example 1, we showed how for program \(P[{G_2},{E_1}]\) (Fig. 1) the number \(\texttt {n} = 1000101~(\text {base}~2)\) corresponds to the top-down, left-to-right generation of Plus(x,1). That derivation consumed exactly seven bits; thus, any number that, written in \(\text {base}~2\), shares the suffix 1000101—e.g., 11010101000101—will also generate Plus(x,1).

The issue of shared suffixes is addressed in the following lemma:

Lemma 4

For every non-terminal A and number \(\texttt {n}\) such that \([\![\texttt {funcA}[{n}] ]\!] (i_1,\dots ,i_k) \ne \bot \) (i.e., \(\texttt {funcA}\) terminates when the non-deterministic choices are controlled by \(\texttt {n}\)), there exists a minimal \(\texttt {n}'\) that is a (\(\text {base}~2\)) suffix of \(\texttt {n}\) for which (i) there is an \(e \in L(G)\) such that \({\#}(e) = {n}'\), and (ii) for every input \(\{i_1,\dots ,i_k\}\), we have \([\![\texttt {funcA}[{n}] ]\!] (i_1,\dots ,i_k)=[\![\texttt {funcA}[{n}'] ]\!] (i_1,\dots ,i_k)\).

We are now ready to state the correctness property of our construction.

Theorem 2

Given a SyGuS problem \({sy}^{E} =(\psi ^{E} (f,x), G)\) over a finite set of examples \({E} \), the problem \({sy}^{E} \) is realizable iff \( re_{\textit{enc} ({sy}, E)}\) is satisfiable.

5 Implementation and Evaluation

nope is a tool that can return two-sided answers to unrealizability problems of the form \({sy}= (\psi , G)\). When it returns unrealizable, no expression-tree in L(G) satisfies \(\psi \); when it returns realizable, some \(e \in L(G)\) satisfies \(\psi \); nope can also time out. nope incorporates several existing pieces of software.

  1. 1.

    The (un)reachability verifier SeaHorn is applied to the reachability problems of the form \(re_{\textit{enc} ({sy},{E})}\) created during successive CEGIS rounds.

  2. 2.

    The SMT solver Z3 is used to check whether a generated expression-tree e satisfies \(\psi \). If it does, nope returns realizable (along with e); if it does not, nope creates a new input example to add to \({E} \).

It is important to observe that SeaHorn, like most reachability verifiers, is only sound for unsatisfiability—i.e., if SeaHorn returns unsatisfiable, the reachability problem is indeed unsatisfiable. Fortunately, SeaHorn's one-sided answers are in the correct direction for our application: to prove unrealizability, nope only requires the reachability verifier to be sound for unsatisfiability.

There is one aspect of nope that differs from the technique that has been presented earlier in the paper. While SeaHorn is sound for unreachability, it is not sound for reachability—i.e., it cannot soundly prove whether a synthesis problem is realizable. To address this problem, nope also calls the SyGuS solver ESolver [2] to check whether a given SyGuS problem \({sy}^{E} \) is realizable on the finite set of examples \({E} \), by synthesizing an expression-tree e that satisfies \({sy}^{E} \).

In practice, for every intermediate problem \({sy}^{E} \) generated by the CEGIS algorithm, nope runs the ESolver on \(sy^{E} \) and SeaHorn on \(re_{\textit{enc} ({sy},{E})}\) in parallel. If ESolver returns a solution e, SeaHorn is interrupted, and Z3 is used to check whether e satisfies \(\psi \). Depending on the outcome, nope either returns realizable or obtains an additional input example to add to \({E} \). If SeaHorn returns unsatisfiable, nope returns unrealizable.

Modulo bugs in its constituent components, nope is sound for both realizability and unrealizability, but because of Lemma 2 and the incompleteness of SeaHorn, nope is not complete for unrealizability.

Benchmarks. We perform our evaluation on 132 variants of the 60 LIA benchmarks from the LIA SyGuS competition track [2]. We do not consider the other SyGuS benchmark track, Bit-Vectors, because the SeaHorn verifier is unsound for most bit-vector operations—e.g., bit-shifting. We used three suites of benchmarks. LimitedIf (resp. LimitedPlus) contains 57 (resp. 30) benchmarks in which the grammar bounds the number of times an IfThenElse (resp. Plus) operator can appear in an expression-tree to be 1 less than the number required to solve the original synthesis problem. We used the tool Quasi to automatically generate the restricted grammars. LimitedConst contains 45 benchmarks in which the grammar allows the program to contain only constants that are coprime to any constants that may appear in a valid solution—e.g., the solution requires using odd numbers, but the grammar only contains the constant 2. The numbers of benchmarks in the three suites differ because for certain benchmarks it did not make sense to create a limited variant—e.g., if the smallest program consistent with the specification contains no IfThenElse operators, no variant is created for the LimitedIf benchmark. In all our benchmarks, the grammars describing the search space contain infinitely many terms.

Our experiments were performed on an Intel Core i7 4.00 GHz CPU, with 32 GB of RAM, running Lubuntu 18.10 via VirtualBox. We used version 4.8 of Z3, commit 97f2334 of SeaHorn, and commit d37c50e of ESolver. The timeout for each individual SeaHorn/ESolver call is set at 10 min.

Experimental Questions. Our experiments were designed to answer the questions posed below.

EQ 1: Can nope prove unrealizability of variants of real SyGuS benchmarks, and how long does it take to do so?

Finding: nope can prove unrealizability for \(59/132 \) benchmarks. For the \(59 \) benchmarks solved by nope, the average time taken is \(15.59 \) s. The time taken to perform the last iteration of the algorithm—i.e., the time taken by SeaHorn to return unsatisfiable—accounts for \(87\%\) of the total running time.

nope can solve all of the LimitedIf benchmarks for which the grammar allows at most one IfThenElse operator. Allowing more IfThenElse operators in the grammar leads to larger programs and larger sets of examples, and consequently the resulting reachability problems are harder to solve for SeaHorn.

For a similar reason, nope can solve only one of the LimitedPlus benchmarks. All other LimitedPlus benchmarks allow 5 or more Plus statements, resulting in grammars that have at least 130 productions.

nope can solve all LimitedConst benchmarks because these require few examples and result in small encoded programs.

EQ 2: How many input examples does nope require to prove unrealizability, and how does the number of examples affect the running time?

Note that Z3 can produce different models for the same query, and thus different runs of nope can produce different sequences of examples. Hence, there is no guarantee that nope finds a good sequence of examples that proves unrealizability. One measure of success is whether nope is generally able to find a small number of examples when it succeeds in proving unrealizability.

Finding: nope used 1 to 9 examples to prove unrealizability for the benchmarks on which it terminated. Problems requiring large numbers of examples could not be solved because either ESolver or SeaHorn times out—e.g., on the problem max4, nope gets to the point where the CEGIS loop has generated 17 examples, at which point ESolver exceeds the timeout threshold.

Fig. 3. Time vs. examples.

Finding: The number of examples required to prove unrealizability depends mainly on the arity of the synthesized function and the complexity of the grammar. The number of examples seems to grow quadratically with the number of bounded operators allowed in the grammar. In particular, problems in which the grammar allows zero IfThenElse operators require 2–4 examples, while problems in which the grammar allows one IfThenElse operator require 7–9 examples.

Figure 3 plots the running time of nope against the number of examples generated by the CEGIS algorithm. Finding: The solving time appears to grow exponentially with the number of examples required to prove unrealizability.

6 Related Work

The SyGuS formalism was introduced as a unifying framework to express several synthesis problems [1]. Caulfield et al. [6] proved that it is undecidable to determine whether a given SyGuS problem is realizable. Despite this negative result, there are several SyGuS solvers that compete in yearly SyGuS competitions [2] and can efficiently produce solutions to SyGuS problems when a solution exists. Existing SyGuS synthesizers fall into three categories: (i) Enumeration solvers enumerate programs with respect to a given total order [7]. If the given problem is unrealizable, these solvers typically only terminate if the language of the grammar is finite or contains finitely many functionally distinct programs. While in principle certain enumeration solvers can prune infinite portions of the search space, none of these solvers could prove unrealizability for any of the benchmarks considered in this paper. (ii) Symbolic solvers reduce the synthesis problem to a constraint-solving problem [3]. These solvers cannot reason about grammars that restrict allowed terms, and resort to enumeration whenever the candidate solution produced by the constraint solver is not in the restricted search space. Hence, they also cannot prove unrealizability. (iii) Probabilistic synthesizers randomly search the search space, and are typically unpredictable [14], providing no guarantees in terms of unrealizability.

Synthesis as Reachability. CETI [12] introduces a technique for encoding template-based synthesis problems as reachability problems. The CETI encoding only applies to the specific setting in which (i) the search space is described by an imperative program with a finite number of holes—i.e., the values that the synthesizer has to discover—and (ii) the specification is given as a finite number of input-output test cases with which the target program should agree. Because the number of holes is finite, and all holes correspond to values (and not terms), the reduction to a reachability problem only involves making the holes global variables in the program (and no more elaborate transformations).

In contrast, our reduction technique handles search spaces that are described by a grammar, which in general consist of an infinite set of terms (not just values). Due to this added complexity, our encoding has to account for (i) the semantics of the productions in the grammar, and (ii) the use of non-determinism to encode the choice of grammar productions. Our encoding creates one expression-evaluation computation for each of the example inputs, and threads these computations through the program so that each expression-evaluation computation makes use of the same set of non-deterministic choices.

Using input-threading, our technique can handle specifications that contain nested calls of the synthesized program—e.g., \(f(f(x)) = x\) (App. A [13]).

The input-threading technique builds a product program that performs multiple executions of the same function as done in relational program verification [4]. Alternatively, a different encoding could use multiple function invocations on individual inputs and require the verifier to thread the same bit-stream for all input evaluations. In general, verifiers perform much better on product programs [4], which motivates our choice of encoding.

Unrealizability in Program Synthesis. For certain synthesis problems—e.g., reactive synthesis [5]—the realizability problem is decidable. The framework tackled in this paper, SyGuS, is orthogonal to such problems, and it is undecidable to check whether a given SyGuS problem is realizable [6].

Mechtaev et al. [11] propose to use a variant of SyGuS to efficiently prune irrelevant paths in a symbolic-execution engine. In their approach, for each path \(\pi \) in the program, a synthesis problem \(p_\pi \) is generated so that if \(p_\pi \) is unrealizable, the path \(\pi \) is infeasible. The synthesis problems generated by Mechtaev et al. (which are not directly expressible in SyGuS) are decidable because the search space is defined by a finite set of templates, and the synthesis problem can be encoded by an SMT formula. To the best of our knowledge, our technique is the first one that can check unrealizability of general SyGuS problems in which the search space is an infinite set of functionally distinct terms.