1 Introduction

In reactive synthesis, we automatically translate a formal specification, typically given in a temporal logic, into a controller that is guaranteed to satisfy the specification. Over the past two decades there has been much progress on reactive synthesis, both in terms of algorithms, notably with techniques like GR(1) synthesis [7] and bounded synthesis [20], and in terms of tools, as showcased, for example, in the annual SYNTCOMP competition [25].

In practice, however, reactive synthesis has seen limited success. One of the largest published success stories [6] is the synthesis of the AMBA bus protocol. To push synthesis further, the automatic synthesis of a controller for an autonomous system has been recognized to be of critical importance [52]. Despite many years of experience with synthesis tools, our own attempts to synthesize such controllers with existing tools have been unsuccessful. The reason is that the tools are unable to handle the data complexity of the controllers. The controller only needs to switch between a small number of behaviors, such as steering through a bend or shifting gears at high rpm. The number of control states in a typical controller (cf. [18]) is thus not much different from that of the arbiter in the AMBA case study. However, in order to correctly initiate transitions between control states, the driving controller must continuously process data from more than 20 sensors.

If this data is included (even as a rough discretization) in the state space of the controller, then the synthesis problem becomes much too large for any available tool. It seems clear, then, that a scalable synthesis approach must separate control and data. If we assume that the data processing is handled by some other approach (such as deductive synthesis [38] or manual programming), is it then possible to solve the remaining reactive synthesis problem?

In this paper, we show that scalable reactive synthesis is indeed possible. Separating data and control has allowed us to synthesize reactive systems, including an autonomous driving controller and a music player app, that had been impossible to synthesize with previously available tools. However, the separation of data and control implies some fundamental changes to reactive synthesis, which we describe in the rest of the paper. The changes also imply that the reactive synthesis problem is, in general, no longer decidable. We thus trade theoretical decidability for practical scalability, which is, at least with regard to the goal of synthesizing realistic systems, an attractive trade-off.

We introduce Temporal Stream Logic (TSL), a new temporal logic that includes updates, such as \(\llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {f}}}~{{\texttt {x}}} \rrbracket \), and predicates over arbitrary function terms. The update \(\llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {f}}}~{{\texttt {x}}} \rrbracket \) indicates that the result of applying function \( {{\texttt {f}}} \) to variable \( {{\texttt {x}}} \) is assigned to \({{\texttt {y}}}\). The implementation of predicates and functions is not part of the synthesis problem. Instead, we look for a system that satisfies the TSL specification for all possible interpretations of the functions and predicates.

This implicit quantification over all possible interpretations provides a useful abstraction: it allows us to independently implement the data processing part. On the other hand, this quantification is also the reason for the undecidability of the synthesis problem. If a predicate is applied to the same term twice, it must (independently of the interpretation) return the same truth value. The synthesis must then implicitly maintain a (potentially infinite) set of terms to which the predicate has previously been applied. As we show later, this set of terms can be used to encode the Post Correspondence Problem (PCP) [45], which yields a proof of undecidability.

We present a practical synthesis approach for TSL specifications, which is based on bounded synthesis [20] and counterexample-guided abstraction refinement (CEGAR) [9]. We use bounded synthesis to search for an implementation up to an (iteratively growing) bound on the number of states. This approach underapproximates the actual TSL synthesis problem by leaving the interpretation of the predicates to the environment. The underapproximation allows for inconsistent behaviors: the environment might assign different truth values to the same predicate when evaluated at different points in time, even if the predicate is applied to the same term. However, if we find an implementation in this underapproximation, then the CEGAR loop terminates and we have a correct implementation for the original TSL specification. If we do not find an implementation in the underapproximation, we compute a counter-strategy for the environment. Because bounded synthesis reduces the synthesis problem to a safety game, the counter-strategy is a reachability strategy that can be represented as a finite tree. We check whether the counter-strategy is spurious by searching for a pair of positions in the strategy where some predicate results in different truth values when applied to the same term. If the counter-strategy is not spurious, then no implementation exists for the considered bound, and we increase the bound. If the counter-strategy is spurious, then we introduce a constraint into the specification that eliminates the incorrect interpretation of the predicate, and continue with the refined specification.
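The overall refinement loop can be sketched schematically. In the following hedged Python sketch, `synthesize_bounded`, `find_inconsistency`, and `refine` are hypothetical stand-ins for the bounded-synthesis engine, the spuriousness check, and the specification strengthening; none of their interfaces are fixed here.

```python
def cegar_synthesis(spec, synthesize_bounded, find_inconsistency, refine,
                    max_bound=10):
    """Schematic CEGAR loop: grow the state bound, and refine the
    underapproximated specification whenever the counter-strategy is
    spurious (i.e., evaluates the same predicate term inconsistently)."""
    bound = 1
    while bound <= max_bound:
        realizable, artifact = synthesize_bounded(spec, bound)
        if realizable:
            # An implementation of the underapproximation is also a
            # correct implementation of the original TSL specification.
            return artifact
        inconsistency = find_inconsistency(artifact)
        if inconsistency is None:
            bound += 1              # genuine counter-strategy: increase the bound
        else:
            spec = refine(spec, inconsistency)   # eliminate spurious behavior
    return None                     # gave up within max_bound
```

The two branches of the loop mirror the two cases above: a genuine counter-strategy refutes the current bound only, while a spurious one triggers a refinement of the specification.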

Fig. 1.

The TSL synthesis procedure uses a modular design. Each step takes input from the previous step as well as interchangeable modules (dashed boxes).

A general overview of this procedure is shown in Fig. 1. The top half of the figure depicts the bounded search for an implementation that realizes a TSL specification using the CEGAR loop to refine the specification. If the specification is realizable, we proceed in the bottom half of the process, where a synthesized implementation is converted to a control flow model (CFM) determining the control of the system. We then specialize the CFM to Functional Reactive Programming (FRP), which is a popular and expressive programming paradigm for building reactive programs using functional programming languages [14]. Our framework supports any FRP library using the Arrow or Applicative design patterns, which covers most of the existing FRP libraries (e.g. [2, 3, 10, 41]). Finally, the synthesized control flow is embedded into a project context, where it is equipped with function and predicate implementations and then compiled to an executable program.

Our experience with synthesizing systems based on TSL specifications has been extremely positive. The synthesis works for a broad range of benchmarks, ranging from classic reactive synthesis problems (like escalator control), through programming exercises from functional reactive programming, to novel case studies like our music player app and the autonomous driving controller for a vehicle in The Open Racing Car Simulator (TORCS).

Fig. 2.

Sample code and specification for the music player app.

2 Motivating Example

To demonstrate the utility of our method, we synthesized a music player Android app from a TSL specification. A major challenge in developing Android apps is the temporal behavior of an app through the Android lifecycle [46]. The Android lifecycle describes how an app should handle being paused, when moved to the background, coming back into focus, or being terminated. In particular, resume and restart errors are commonplace and difficult to detect and correct [46]. Our music player app demonstrates a situation in which a resume and restart error could be unwittingly introduced when programming by hand, but is avoided by providing a specification. We only highlight the key parts of the example here to give an intuition of TSL. The complete specification is presented in [19].

Our music player app utilizes the Android music player library (MP), as well as its control interface (Ctrl). It pauses any playing music when moved to the background (for instance if a call is received), and continues playing the currently selected track (Tr) at the last track position when the app is resumed. In the Android system (Sys), the leaveApp method is called whenever the app moves to the background, while the resumeApp method is called when the app is brought back to the foreground. To avoid confusion between pausing music and pausing the app, we use leaveApp and resumeApp in place of the Android methods onPause and onResume. A programmer might manually write code for this as shown on the left in Fig. 2.

This behavior can be described directly in TSL, as shown on the right in Fig. 2. Even without a formal introduction of the notation, the specification closely matches the textual description. First, when the user leaves the app while the music is playing, the music pauses. Likewise, for the second part, when the user resumes the app, the music starts playing again.

However, assume we want to change the behavior so that the music only plays on resume when the music had been playing before leaving the app in the first place. In the manually written program, this new functionality requires an additional variable wasPlaying to keep track of the music state. Managing the state requires multiple changes in the code as shown on the left in Fig. 3. The required code changes include: a conditional in the resumeApp method, setting wasPlaying appropriately in two places in leaveApp, and providing an initial value. Although a small example, it demonstrates how a minor change in functionality may require wide-reaching code changes. In addition, this change introduces a globally scoped variable, which then might accidentally be set or read elsewhere. In contrast, it is a simple matter to change the TSL specification to reflect this new functionality. Here, we only update one part of the specification to say that if the user leaves the app and the music is playing, the music has to play again as soon as the app resumes.
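The hand-written change described above can be illustrated with a small sketch. This is a hedged Python rendering with hypothetical names (`StubPlayer`, `is_playing`, `pause`, `play`); the actual app is written against the Android MP and Ctrl interfaces shown in Fig. 3.

```python
class StubPlayer:
    """Minimal stand-in for the music player library (MP/Ctrl)."""
    def __init__(self, playing=False):
        self.playing = playing
    def is_playing(self):
        return self.playing
    def pause(self):
        self.playing = False
    def play(self):
        self.playing = True

class MusicPlayerApp:
    def __init__(self, player):
        self.player = player
        self.was_playing = False              # new state variable with initial value

    def leave_app(self):
        self.was_playing = self.player.is_playing()   # set in leaveApp
        if self.was_playing:
            self.player.pause()

    def resume_app(self):
        if self.was_playing:                  # new conditional in resumeApp
            self.player.play()
```

Only if the music was playing when `leave_app` was called does `resume_app` start it again; forgetting to initialize or update `was_playing` in any of these places reintroduces exactly the resume and restart errors discussed above.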

Fig. 3.

The effect of a minor change in functionality on code versus a specification.

Synthesis allows us to specify a temporal behavior without worrying about implementation details. In this example, writing the specification in TSL has eliminated the need for an additional state variable, similar to how a higher-order map eliminates the need for an iteration variable. In more complex examples the benefits compound, as TSL provides a modular interface for specifying behaviors, offloading the management of multiple interconnected temporal behaviors from the user to the synthesis engine.

3 Preliminaries

We assume time to be discrete and denote it by the set \( \mathbb {N}\) of non-negative integers. A value is an arbitrary object of arbitrary type. \( \mathcal {V}\) denotes the set of all values. The Boolean values are denoted by \( \mathcal {B}\subseteq \mathcal {V}\). A stream \( s:\mathbb {N}\rightarrow \mathcal {V}\) is a function fixing a value at each point in time. An n-ary function \( f :\mathcal {V}^{n} \rightarrow \mathcal {V}\) determines new values from n given values, where the set of all functions (of arbitrary arity) is given by \( \mathcal {F} \). Constants are functions of arity 0. Every constant is a value, i.e., is an element of \( \mathcal {F} \cap \mathcal {V}\). An n-ary predicate \( p :\mathcal {V}^{n} \rightarrow \mathcal {B}\) checks a property over n values. The set of all predicates (of arbitrary arity) is given by \( \mathcal {P} \), where \( \mathcal {P} \subseteq \mathcal {F} \). We use \( B^{[A]} \) to denote the set of all total functions with domain A and image B.

In the classical synthesis setting, inputs and outputs are vectors of Booleans, where the standard abstraction treats inputs and outputs as atomic propositions \( \mathcal {I} \cup \mathcal {O} \), while their Boolean combinations form an alphabet \( \varSigma = 2^{\mathcal {I} \cup \mathcal {O}} \). Behavior then is described through infinite sequences \( \alpha = \alpha (0)\alpha (1)\alpha (2) \ldots \in \varSigma ^{\omega } \). A specification describes a relation between input sequences \( \alpha \in (2^{\mathcal {I}})^{\omega } \) and output sequences \( \beta \in (2^{\mathcal {O}})^{\omega } \). Usually, this relation is not given by explicit sequences, but by a formula in a temporal logic. The most popular such logic is Linear Temporal Logic (LTL) [43], which uses Boolean connectives to specify behavior at specific points in time, and temporal connectives to relate sub-specifications over time. The realizability and synthesis problems for LTL are 2ExpTime-complete [44].

An implementation describes a realizing strategy, formalized via infinite trees. A \( \varPhi \)-labeled and \( \varUpsilon \)-branching tree is a function \( \sigma :\varUpsilon ^{*} \rightarrow \varPhi \), where \( \varUpsilon \) denotes the set of branching directions along a tree. Every node of the tree is given by a finite prefix \( v \in \varUpsilon ^{*} \), which fixes the path to reach a node from the root. Every node is labeled by an element of \( \varPhi \). For infinite paths \( \nu \in \varUpsilon ^{\omega } \), the branch \( \sigma \! \wr \nu \) denotes the sequence of labels that appear on \( \nu \), i.e., \( \forall t \in \mathbb {N}. \ (\sigma \! \wr \nu )(t) = \sigma (\nu (0) \ldots \nu (t-1)) \).
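Encoding a labeled tree as a function from finite direction sequences to labels, the branch operator \( \sigma \! \wr \nu \) can be transcribed directly. A hedged Python sketch, where `sigma` and `nu` are arbitrary callables standing in for the tree and the path:

```python
def branch(sigma, nu, t):
    """(sigma ≀ nu)(t) = sigma(nu(0) … nu(t-1)): the label at the t-th
    node on the infinite path nu through the tree sigma."""
    return sigma(tuple(nu(k) for k in range(t)))

# Example: a {0,1}-branching tree labeled with the parity of the node
# depth, and the leftmost path nu = 000…
depth_parity = lambda node: len(node) % 2
leftmost = lambda k: 0
```

Here `branch(depth_parity, leftmost, t)` simply alternates between 0 and 1, since the label depends only on the depth of the node reached after t steps.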

4 Temporal Stream Logic

We present a new logic: Temporal Stream Logic (TSL), which is especially designed for synthesis and allows for the manipulation of infinite streams of arbitrary (even non-enumerative, or higher order) type. It provides a straightforward notation to specify how outputs are computed from inputs, while using an intuitive interface to access time. The main focus of TSL is to describe temporal control flow, while abstracting away concrete implementation details. This not only keeps the logic intuitive and simple, but also allows a user to identify problems in the control flow even without a concrete implementation at hand. In this way, the use of TSL scales up to any required abstraction, such as API calls or complex algorithmic transformations.

Architecture. A TSL formula \( \varphi \) specifies a reactive system that in every time step processes a finite number of inputs \( \mathbb {I} \) and produces a finite number of outputs \( \mathbb {O} \). Furthermore, it uses cells \( \mathbb {C} \) to store a value computed at time t, which can then be reused in the next time step \( t + 1 \). An overview of the architecture of such a system is given in Fig. 4a. In terms of behavior, the environment produces infinite streams of input data, while the system uses pure (side-effect free) functions to transform the values of these input streams in every time step. After their transformation, the data values are either passed to an output stream or are passed to a cell, which pipes the output value from one time step back to the corresponding input value of the next. The behavior of the system is captured by its infinite execution over time.
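This architecture can be mimicked operationally: in every time step, a system maps the momentary input and the current cell contents to a momentary output and new cell contents. A hedged Python sketch (the names `execute` and `summing_system` are illustrative, not part of TSL):

```python
def execute(system, inputs, cell_init):
    """Run a reactive system step by step: cells pipe a value computed
    at time t back to the corresponding input of time t + 1."""
    cells = dict(cell_init)
    outputs = []
    for momentary_input in inputs:
        momentary_output, cells = system(momentary_input, cells)
        outputs.append(momentary_output)
    return outputs

def summing_system(inp, cells):
    """Example: a cell `acc` accumulates the input stream with the pure
    function (+); the output `y` mirrors the cell."""
    acc = cells["acc"] + inp["x"]
    return {"y": acc}, {"acc": acc}
```

Running `execute(summing_system, …)` on a finite input prefix produces the corresponding prefix of the output stream.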

Fig. 4.

General architecture of reactive systems that are specified in TSL (left), and the structure of function terms, predicate terms, and updates (right).

Function Terms, Predicate Terms, and Updates. In TSL we differentiate between two elements: we use purely functional transformations, reflected by functions \( f \in \mathcal {F} \) and their compositions, and predicates \( p \in \mathcal {P} \), used to control how data flows inside the system. To argue about both elements we use a term based notation, where we distinguish between function terms \( \tau _{F} \) and predicate terms \( \tau _{P} \), respectively. Function terms are either constructed from inputs or cells (\( {{\texttt {s}}}_{{{\texttt {i}}}} \in \mathbb {I} \cup \mathbb {C} \)), or from functions, recursively applied to a set of function terms. Predicate terms are constructed similarly, by applying a predicate to a set of function terms. Finally, an update takes the result of a function computation and passes it either to an output or a cell (\( {{\texttt {s}}}_{{{\texttt {o}}}} \in \mathbb {O} \cup \mathbb {C} \)). An overview of the syntax of the different term notations is given in Fig. 4b. Note that we use curried argument notation similar to functional programming languages.

We denote sets of function and predicate terms, and updates by \( \mathcal {T} _{\!F} \), \( \mathcal {T} _{\!P} \) and \( \mathcal {T} _{U \!} \), respectively, where \( \mathcal {T} _{\!P} \subseteq \mathcal {T} _{\!F} \). We use \( \mathbb {F} \) to denote the set of function literals and \( \mathbb {P} \subseteq \mathbb {F} \) to denote the set of predicate literals, where the literals \( {{\texttt {s}}}_{{{\texttt {i}}}} \), \( {{\texttt {s}}}_{{{\texttt {o}}}} \), \( {{\texttt {f}}} \) and \( {{\texttt {p}}} \) are symbolic representations of inputs and cells, outputs and cells, functions, and predicates, respectively. Literals are used to construct terms as shown in Fig. 4b. Since we use a symbolic representation, functions and predicates are not tied to a specific implementation. However, we still classify them according to their arity, i.e., the number of function terms they are applied to, as well as by their type: input, output, cell, function or predicate. Furthermore, terms can be compared syntactically using the equivalence relation \( \equiv \). To assign a semantic interpretation to functions, we use an assignment function \( \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F} \).

Inputs, Outputs, and Computations. We consider momentary inputs \( i \in \mathcal {V}^{[\mathbb {I} ]} \), which are assignments of inputs \( {{\texttt {i}}} \in \mathbb {I} \) to values \( v \in \mathcal {V}\). For the sake of readability let \( \mathcal {I}= \mathcal {V}^{[\mathbb {I} ]} \). Input streams are infinite sequences \( \iota \in \mathcal {I}^{\omega } \) consisting of infinitely many momentary inputs.

Similarly, a momentary output \( o \in \mathcal {V}^{[\mathbb {O} ]} \) is an assignment of outputs \( {{\texttt {o}}} \in \mathbb {O} \) to values \( v \in \mathcal {V}\), where we also use \( \mathcal {O}= \mathcal {V}^{[\mathbb {O} ]} \). Output streams are infinite sequences \( \varrho \in \mathcal {O}^{\omega } \). To capture the behavior of a cell, we introduce the notion of a computation \( \varsigma \). A computation fixes the function terms that are used to compute outputs and cell updates, without fixing semantics of function literals. Intuitively, a computation only determines which function terms are used to compute an output, but abstracts from actually computing it.

The basic element of a computation is a computation step \( c\in \mathcal {T} _{\!F} ^{[\mathbb {O} \cup \mathbb {C} ]} \), which is an assignment of outputs and cells \( {{\texttt {s}}}_{{{\texttt {o}}}} \in \mathbb {O} \cup \mathbb {C} \) to function terms \( \tau _{F} \in \mathcal {T} _{\!F} \). For the sake of readability let \( \mathcal {C}= \mathcal {T} _{\!F} ^{[\mathbb {O} \cup \mathbb {C} ]} \). A computation step fixes the control flow behavior at a single point in time. A computation \( \varsigma \in \mathcal {C}^{\omega } \) is an infinite sequence of computation steps.

As soon as input streams, and function and predicate implementations are known, computations can be turned into output streams. To this end, let \( \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F} \) be some function assignment. Furthermore, assume that there are predefined constants \( \textit{init}_{{{\texttt {c}}}} \in \mathcal {F} \cap \mathcal {V}\) for every cell \( {{\texttt {c}}} \in \mathbb {C} \), which provide an initial value for each stream at the initial point in time. To receive an output stream from a computation \( \varsigma \in \mathcal {C}^{\omega } \) under the input stream \( \iota \), we use an evaluation function \( \eta _{_{\!\langle \cdot \rangle \!}}\! :\mathcal {C}^{\omega } \times \mathcal {I}^{\omega } \times \mathbb {N}\times \mathcal {T} _{\!F} \rightarrow \mathcal {V}\):

$$\begin{aligned} \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \iota , t, {{\texttt {s}}}_{{{\texttt {i}}}})= & {} {\left\{ \begin{array}{ll} \iota (t)({{\texttt {s}}}_{{{\texttt {i}}}}) &{} \text {if } {{\texttt {s}}}_{{{\texttt {i}}}} \in \mathbb {I} \\ \textit{init}_{{{\texttt {s}}}_{{{\texttt {i}}}}} &{} \text {if } {{\texttt {s}}}_{{{\texttt {i}}}} \in \mathbb {C}\ \wedge \ t = 0 \\ \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \iota , t-1, \varsigma (t-1)({{\texttt {s}}}_{{{\texttt {i}}}})) &{} \text {if } {{\texttt {s}}}_{{{\texttt {i}}}} \in \mathbb {C}\ \wedge \ t > 0 \end{array}\right. } \\ \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \iota , t, {{\texttt {f}}} \ \tau _{0} \ \cdots \ \tau _{m-1})= & {} \langle {{\texttt {f}}} \rangle \ \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma ,\iota , t,\tau _{0}) \ \cdots \ \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma ,\iota , t,\tau _{m-1}) \end{aligned}$$

Then \( \varrho _{\langle \cdot \rangle , \varsigma , \iota } \in \mathcal {O}^{\omega } \) is defined via \( \varrho _{\langle \cdot \rangle , \varsigma , \iota }(t)({{\texttt {o}}}) = \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \iota , t, {{\texttt {o}}}) \) for all \( t \in \mathbb {N}\), \( {{\texttt {o}}} \in \mathbb {O} \).
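The evaluation function \( \eta \) can be transcribed almost literally. In this hedged Python sketch, a function term is either a symbol (a string naming an input or cell) or a pair of a function literal and a tuple of subterms; `assign` plays the role of \( \langle \cdot \rangle \) and `init` provides the constants \( \textit{init}_{{{\texttt {c}}}} \):

```python
def evaluate(steps, inputs, t, term, assign, cells, init):
    """η(ς, ι, t, τ): evaluate the function term `term` at time t under
    the computation `steps` (ς) and the input stream `inputs` (ι)."""
    if isinstance(term, str):
        if term not in cells:                  # s ∈ I: momentary input value
            return inputs[t][term]
        if t == 0:                             # s ∈ C at t = 0: initial value
            return init[term]
        # s ∈ C at t > 0: evaluate the term piped into the cell one step ago
        return evaluate(steps, inputs, t - 1, steps[t - 1][term],
                        assign, cells, init)
    f, args = term                             # application: f τ0 … τm-1
    return assign[f](*(evaluate(steps, inputs, t, a, assign, cells, init)
                       for a in args))
```

For example, with a single cell `y` updated by \( \llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {add}}}~{{\texttt {x}}}~{{\texttt {y}}} \rrbracket \) in every step, `evaluate` reproduces the running sum of the inputs.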

Syntax. Every TSL formula \( \varphi \) is built according to the following grammar:

$$\begin{aligned} \varphi \; {:}{:}{=} \; \tau \in \mathcal {T} _{\!P} \cup \mathcal {T} _{U \!} \; \mid \; \lnot \varphi \; \mid \; \varphi \wedge \varphi \; \mid \; {\bigcirc } \varphi \; \mid \; \varphi \,\mathcal {U}\, \varphi \end{aligned}$$

An atomic proposition \(\tau \) consists either of a predicate term, serving as a Boolean interface to the inputs, or of an update, enforcing the respective flow at the current point in time. Next, we have the Boolean operations, negation and conjunction, which allow us to express arbitrary Boolean combinations of predicate evaluations and updates. Finally, we have the temporal operator next, \( {\bigcirc } \psi \), to specify the behavior at the next point in time, and the temporal operator until, \( \vartheta \,\mathcal {U}\, \psi \), which enforces a property \( \vartheta \) to hold until the property \( \psi \) holds, where \( \psi \) must hold at some point in the future.

Semantics. Formally, this leads to the following semantics. Let \( \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F} \), \( \iota \in \mathcal {I}^{\omega } \), and \( \varsigma \in \mathcal {C}^{\omega } \) be given. The validity of a \( \text {TSL} \) formula \( \varphi \) with respect to \( \varsigma \) and \( \iota \) is defined inductively over \( t \in \mathbb {N}\) via:

$$\begin{aligned} \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \tau _{P} \;\; &:\Leftrightarrow \;\; \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \iota , t, \tau _{P}) = \textit{true} \\ \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \llbracket {{\texttt {s}}}_{{{\texttt {o}}}} \leftarrowtail \tau _{F} \rrbracket \;\; &:\Leftrightarrow \;\; \varsigma (t)({{\texttt {s}}}_{{{\texttt {o}}}}) \equiv \tau _{F} \\ \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \lnot \psi \;\; &:\Leftrightarrow \;\; \varsigma , \iota , t \;\nvDash _{\!\langle \cdot \rangle } \psi \\ \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \vartheta \wedge \psi \;\; &:\Leftrightarrow \;\; \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \vartheta \;\text { and }\; \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \psi \\ \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } {\bigcirc } \psi \;\; &:\Leftrightarrow \;\; \varsigma , \iota , t+1 \;\vDash _{\!\langle \cdot \rangle } \psi \\ \varsigma , \iota , t \;\vDash _{\!\langle \cdot \rangle } \vartheta \,\mathcal {U}\, \psi \;\; &:\Leftrightarrow \;\; \exists t'' \ge t. \; \forall t \le t' < t''. \; \varsigma , \iota , t' \;\vDash _{\!\langle \cdot \rangle } \vartheta \;\text { and }\; \varsigma , \iota , t'' \;\vDash _{\!\langle \cdot \rangle } \psi \end{aligned}$$

Note that the satisfaction of a predicate depends on the current computation step and the steps of the past, while for updates it depends only on the current computation step. Furthermore, updates are checked only syntactically, while the satisfaction of predicates depends on the given assignment \( \langle \cdot \rangle \) and the input stream \( \iota \). We say that \( \varsigma \) and \( \iota \) satisfy \( \varphi \), denoted by \( \varsigma , \iota \;\vDash _{\!\langle \cdot \rangle }\varphi \), if \( \varsigma , \iota , 0 \;\vDash _{\!\langle \cdot \rangle }\varphi \).

Besides the basic operators, we have the standard derived Boolean operators, as well as the derived temporal operators: release \( \mathcal {R} \), finally \( \Diamond \), always \( \Box \), the weak version of until \( \mathcal {W} \), and as soon as.

Realizability. We are interested in the following realizability problem: given a \( \text {TSL} \) formula \( \varphi \), is there a strategy \( \sigma \in \mathcal {C}^{[\mathcal {I}^{+}]} \) such that for every input \( \iota \in \mathcal {I}^{\omega } \) and function implementation \( \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F} \), the branch \( \sigma \! \wr \iota \) satisfies \( \varphi \), i.e.,

$$\begin{aligned} \exists \sigma \in \mathcal {C}^{[\mathcal {I}^{+}]}. \ \, \forall \iota \in \mathcal {I}^{\omega }. \ \, \forall \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F}. \ \, \sigma \! \wr \iota , \iota \;\vDash _{\!\langle \cdot \rangle }\varphi \end{aligned}$$

If such a strategy \( \sigma \) exists, we say \( \sigma \) realizes \( \varphi \). If we additionally ask for a concrete instantiation of \( \sigma \), we consider the synthesis problem of TSL.

5 TSL Properties

In order to synthesize programs from TSL specifications, we give an overview of the first part of our synthesis process, as shown in Fig. 1. First, we show how to approximate the semantics of TSL through a reduction to LTL. Due to this approximation, the first attempt to find a realizing strategy may fail. Our solution is a CEGAR loop that iteratively improves the approximation. This loop is necessary because the realizability problem of TSL is undecidable in general.

Approximating TSL with LTL. We approximate TSL formulas with weaker LTL formulas. The approximation reinterprets the syntactic elements \(\mathcal {T} _{\!P} \) and \(\mathcal {T} _{U \!} \) as atomic propositions for LTL. This strips away the semantic meaning of function application and assignment in TSL, which we later reconstruct by lazily adding assumptions to the LTL formula.

Formally, let \( \mathcal {T} _{\!P} \) and \( \mathcal {T} _{U \!} \) be the finite sets of predicate terms and updates, which appear in \( \varphi _{\textit{TSL}} \), respectively. For every assigned signal, we partition \( \mathcal {T} _{U \!} \) into \( \biguplus _{{{\texttt {s}}}_{{{\texttt {o}}}} \in \mathbb {O} \cup \mathbb {C}} \mathcal {T} _{U \!} ^{{{\texttt {s}}}_{{{\texttt {o}}}}} \). For every \( {{\texttt {c}}} \in \mathbb {C} \) let \(\mathcal {T} _{U / {{\texttt {id}}}}^{{{\texttt {c}}}} = \mathcal {T} _{U \!} ^{{{\texttt {c}}}} \cup \{ \llbracket {{\texttt {c}}} \leftarrowtail \, {{\texttt {c}}} \rrbracket \} \), for \( {{\texttt {o}}} \in \mathbb {O} \) let \( \mathcal {T} _{U / {{\texttt {id}}}}^{{{\texttt {o}}}} = \mathcal {T} _{U \!} ^{{{\texttt {o}}}} \), and let \( \mathcal {T} _{U / {{\texttt {id}}}}= \bigcup _{{{\texttt {s}}}_{{{\texttt {o}}}} \in \mathbb {O} \cup \mathbb {C}} \mathcal {T} _{U / {{\texttt {id}}}}^{{{\texttt {s}}}_{{{\texttt {o}}}}} \). We construct the LTL formula \( \varphi _{\textit{LTL}} \) over the input propositions \( \mathcal {T} _{\!P} \) and output propositions \( \mathcal {T} _{U / {{\texttt {id}}}}\) as follows:

$$\begin{aligned} \varphi _{\textit{LTL}} \, = \, \Box \! \bigwedge _{{{\texttt {s}}}_{{{\texttt {o}}}} \in \mathbb {O} \cup \mathbb {C}} \, \bigvee _{\tau \in \mathcal {T} _{U / {{\texttt {id}}}}^{{{\texttt {s}}}_{{{\texttt {o}}}}}} \Big ( \tau \, \wedge \!\! \bigwedge _{\tau ' \in \mathcal {T} _{U / {{\texttt {id}}}}^{{{\texttt {s}}}_{{{\texttt {o}}}}} \setminus \{ \tau \}} \!\! \lnot \tau ' \Big ) \; \wedge \; \textit{SyntacticConversion}\big ( \varphi _{\textit{TSL}} \big ) \end{aligned}$$

Intuitively, the first part of the equation partially reconstructs the semantic meaning of updates by ensuring that a signal is not updated with multiple values at a time. The second part extracts the reactive constraints of the TSL formula without the semantic meaning of functions and updates.
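The first conjunct, the "exactly one update per signal" constraint, is easy to generate mechanically. A hedged Python sketch that emits it as an LTL string over update propositions (the proposition naming scheme is hypothetical):

```python
def exactly_one_update(updates_by_signal):
    """Build G (⋀_signal ⋁_τ (τ ∧ ⋀_{τ'≠τ} ¬τ')) as an LTL string, where
    `updates_by_signal` partitions T_U/id by the assigned signal."""
    clauses = []
    for _, updates in sorted(updates_by_signal.items()):
        options = []
        for u in updates:
            # choose update u and exclude every other update of this signal
            negations = [f"!{v}" for v in updates if v != u]
            options.append(" && ".join([u] + negations))
        clauses.append("(" + " || ".join(f"({o})" for o in options) + ")")
    return "G (" + " && ".join(clauses) + ")"
```

Thanks to this constraint, a set of update propositions that holds at a time step encodes exactly one computation step, which is what allows counter-strategies over \( (2^{\mathcal {T} _{U / {{\texttt {id}}}}})^{*} \) to be read as elements of \( \mathcal {C}^{*} \) later on.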

Theorem 1

([19]). If \( \varphi _{\textit{LTL}} \) is realizable, then \( \varphi _{\textit{TSL}} \) is realizable.

Note that unrealizability of \(\varphi _{\textit{LTL}} \) does not imply that \( \varphi _{\textit{TSL}}\) is unrealizable. It may be that we have not added sufficiently many environment assumptions to the approximation in order for the system to produce a realizing strategy.

Fig. 5.

A TSL specification (a) with input x and cell y that is realizable. A winning strategy is to save x to y as soon as \( {{\texttt {p}}}({{\texttt {x}}}) \) is satisfied. However, the initial approximation (b), that is passed to an LTL synthesis solver, is unrealizable, as proven through the counter-strategy (c) returned by the LTL solver.

Example. As an example, we present a simple TSL specification in Fig. 5a. The specification asserts that the environment provides an input x for which the predicate \( {{\texttt {p}}}~{{\texttt {x}}} \) will eventually be satisfied. The system must guarantee that eventually \( {{\texttt {p}}}~{{\texttt {y}}} \) holds. According to the semantics of TSL, the formula is realizable: the system can take the value of \( {{\texttt {x}}} \) when \( {{\texttt {p}}}~{{\texttt {x}}} \) is true and save it to \( {{\texttt {y}}} \), thus guaranteeing that \( {{\texttt {p}}}~{{\texttt {y}}} \) is eventually satisfied. This is in contrast to LTL, which has no semantics for pure functions: it treats the evaluation of \( {{\texttt {p}}}~{{\texttt {y}}} \) as an environmentally controlled value that need not obey the consistency of a pure function.
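The winning strategy for Fig. 5a can be made concrete. A hedged Python sketch with a hypothetical predicate implementation: in every step the system performs \( \llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {x}}} \rrbracket \) as soon as \( {{\texttt {p}}}~{{\texttt {x}}} \) holds, and \( \llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {y}}} \rrbracket \) otherwise.

```python
def winning_step(p, x, y):
    """One step of the winning strategy: save x into the cell y
    when p(x) holds, otherwise keep y unchanged."""
    return x if p(x) else y

def run_strategy(p, xs, y0):
    """Execute the strategy on a finite prefix of the input stream,
    returning the final cell content."""
    y = y0
    for x in xs:
        y = winning_step(p, x, y)
    return y
```

Once some input has satisfied p, the purity of p guarantees that \( {{\texttt {p}}}~{{\texttt {y}}} \) holds from the following step onwards, regardless of how p is implemented.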

Refining the LTL Approximation. It is possible that the LTL solver returns a counter-strategy for the environment although the original TSL specification is realizable. We call such a counter-strategy spurious as it exploits the additional freedom of LTL to violate the purity of predicates as made possible by the underapproximation. Formally, a counter-strategy is an infinite tree \( \pi :\mathcal {C}^{*} \rightarrow 2^{\mathcal {T} _{\!P}} \), which provides predicate evaluations in response to possible update assignments of function terms \( \tau _{F} \in \mathcal {T} _{\!F} \) to outputs \( {{\texttt {o}}} \in \mathbb {O} \). W.l.o.g. we can assume that \( \mathbb {O} \), \( \mathcal {T} _{\!F} \) and \( \mathcal {T} _{\!P} \) are finite, as they can always be restricted to the outputs and terms that appear in the formula. A counter-strategy is spurious, iff there is a branch \( \pi \! \wr \varsigma \) for some computation \( \varsigma \in \mathcal {C}^{\omega } \), for which the strategy chooses an inconsistent evaluation of two equal predicate terms at different points in time, i.e.,

$$\begin{aligned} \begin{array}{l} \exists \varsigma \in \mathcal {C}^{\omega }. \ \exists t, t' \in \mathbb {N}. \ \exists \tau _{P} \in \mathcal {T} _{\!P}. \\ \qquad \tau _{P} \in \pi (\varsigma (0)\varsigma (1)\ldots \varsigma (t-1)) \, \wedge \, \tau _{P} \notin \pi (\varsigma (0)\varsigma (1)\ldots \varsigma (t'-1)) \wedge \\ \qquad \forall \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F}. \ \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \pi \! \wr \varsigma , t, \tau _{P}) \, = \, \eta _{_{\!\langle \cdot \rangle \!}}\! (\varsigma , \pi \! \wr \varsigma , t', \tau _{P}). \end{array} \end{aligned}$$

Note that a non-spurious counter-strategy can still be inconsistent across different branches: by the definition of realizability, the environment can choose the function and predicate assignment differently against every system strategy.

By the purity of predicates in TSL, the environment is forced to always return the same truth value for a predicate evaluated on equal values. However, this semantic property cannot be enforced implicitly in LTL. To resolve this issue, we use the returned counter-strategy to identify spurious behavior and strengthen the LTL underapproximation with additional environment assumptions. After adding the derived assumptions, we re-execute the LTL synthesizer to check whether the added assumptions are sufficient to obtain a winning strategy for the system. If the solver still returns a spurious strategy, we continue the loop in a CEGAR fashion until the set of added assumptions is sufficiently complete. However, if a non-spurious counter-strategy is returned, we have found a proof that the given TSL specification is indeed unrealizable, and we terminate.

Algorithm 1.

Algorithm 1 shows how a returned counter-strategy \( \pi \) is checked for spuriousness. It suffices to check \( \pi \) against system strategies bounded by the given bound b, since we use bounded synthesis [20]. Furthermore, we can assume w.l.o.g. that \( \pi \) is given by a finite-state representation, which is always possible due to the finite-model guarantees of LTL. Also note that \( \pi \), as returned by the LTL synthesizer, responds to sequences of sets of updates \( (2^{\mathcal {T} _{U / {{\texttt {id}}}}})^{*} \). However, in our case \( (2^{\mathcal {T} _{U / {{\texttt {id}}}}})^{*} \) is an alternative representation of \( \mathcal {C}^{*} \), due to the additional “single update” constraints added during the construction of \( \varphi _{\textit{LTL}} \).

The algorithm iterates over all possible responses \( v \in \mathcal {C}^{m \cdot b} \) of the system up to depth \( m \cdot b \). This is sufficient, since any deeper exploration would result in a state repetition of the cross-product of the finite-state representation of \( \pi \) and any system strategy bounded by b; hence, the same behavior could also be generated by a sequence shorter than \( m \cdot b \). At the same time, the algorithm iterates over the predicates \( \tau _{P} \in \mathcal {T} _{\!P} \) appearing in \( \varphi _{\textit{TSL}} \) and over times t and \( t' \) smaller than \( m \cdot b \). For each of these elements, spuriousness is checked by comparing the output of \( \pi \) for the evaluation of \( \tau _{P} \) at times t and \( t' \), which should only differ if the inputs to the predicate differ as well. This can only happen if the passed input terms have been constructed differently over the past. We check this using the evaluation function \( \eta \) equipped with the identity assignment \( \langle \cdot \rangle _{\texttt {id}} :\mathbb {F} \rightarrow \mathbb {F} \), with \( \langle {{\texttt {f}}} \rangle _{\texttt {id}} = {{\texttt {f}}} \) for all \( {{\texttt {f}}} \in \mathbb {F} \), and the input sequence \( \iota _{\texttt {id}} \), with \( \iota _{\texttt {id}}(t)({{\texttt {i}}}) = (t,{{\texttt {i}}}) \) for all \( t \in \mathbb {N}\) and \( {{\texttt {i}}} \in \mathbb {I} \), which always generates a fresh input. Syntactic inequality of \( \eta _{_{\!\langle \cdot \rangle _{\texttt {id}}\!}}\! (v,\iota _{{{\texttt {id}}}},t,\tau _{P}) \) and \( \eta _{_{\!\langle \cdot \rangle _{\texttt {id}}\!}}\! (v,\iota _{{{\texttt {id}}}},t',\tau _{P}) \) is then a sufficient condition for the existence of an assignment \( \langle \cdot \rangle :\mathbb {F} \rightarrow \mathcal {F} \) for which \( \tau _{P} \) evaluates differently at times t and \( t' \).
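The core of this check can be sketched in Python under simplifying assumptions: the counter-strategy is a callable from finite response prefixes to the set of predicate terms it evaluates to true, terms use a simple encoding (strings for inputs and cells, pairs for applications), and the exploration depth \( m \cdot b \) is passed in explicitly.

```python
from itertools import product

def eval_id(steps, t, term, cells):
    """η under the identity assignment and the fresh-input stream ι_id:
    returns a syntax tree recording how `term` was constructed."""
    if isinstance(term, str):
        if term not in cells:
            return (t, term)                  # ι_id(t)(i) = (t, i): always fresh
        if t == 0:
            return ("init", term)
        return eval_id(steps, t - 1, steps[t - 1][term], cells)
    f, args = term
    return (f, tuple(eval_id(steps, t, a, cells) for a in args))

def find_spurious(pi, possible_steps, predicates, cells, depth):
    """Search all bounded system responses for a predicate term that pi
    evaluates inconsistently although its argument terms are equal."""
    for v in product(possible_steps, repeat=depth):
        for tau in predicates:
            for t in range(depth):
                for t2 in range(depth):
                    if (tau in pi(v[:t])) != (tau in pi(v[:t2])) and \
                       eval_id(v, t, tau, cells) == eval_id(v, t2, tau, cells):
                        return v, tau, t, t2  # witness of spurious behavior
    return None
```

Equality of the two `eval_id` trees corresponds to the syntactic-equality test of Algorithm 1: if the trees coincide, every assignment must evaluate the predicate identically at both times, so differing answers from `pi` are inconsistent.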

If spurious behavior of \( \pi \) is found, then the revealing response \( v \in \mathcal {C}^{*} \) is first simplified using \( \texttt {reduce} \), which turns v back into a sequence of sets of updates \( w \in (2^{\mathcal {T} _{U / {{\texttt {id}}}}})^{*} \) and removes all updates that do not affect the behavior of \( \tau _{P} \) at the times t and \( t' \), in order to accelerate the termination of the CEGAR loop. Afterwards, the sequence w is turned into a new assumption that prohibits the spurious behavior, generalized so that it is prevented at arbitrary points in time.
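The effect of \( \texttt {reduce} \) can be approximated by a cone-of-influence computation. The following self-contained sketch (our own Python approximation, not the tool's code) encodes terms as ('input', i), ('cell', c), or ('app', f, args), with response[k] mapping each cell to the term assigned to it at step k, and keeps only those updates that the symbolic evaluation of the predicate at the relevant times actually depends on.

```python
def reduce_response(response, pred, times):
    """Drop all updates in `response` that the evaluation of `pred` at the
    given times does not depend on (cone-of-influence sketch)."""
    needed = [set() for _ in response]   # cells needed at each step

    def mark(term, t):
        kind = term[0]
        if kind == 'cell' and t > 0:
            c = term[1]
            if c in response[t - 1] and c not in needed[t - 1]:
                needed[t - 1].add(c)
                mark(response[t - 1][c], t - 1)   # follow the assigned term
        elif kind == 'app':
            for a in term[2]:
                mark(a, t)

    for t in times:
        mark(pred, t)
    # keep only the updates marked as needed
    return [{c: u[c] for c in u if c in needed[k]}
            for k, u in enumerate(response)]
```

For instance, if the first response updates both \( {{\texttt {y}}} \) and an unrelated cell, only the update to \( {{\texttt {y}}} \) survives when the predicate under consideration reads \( {{\texttt {y}}} \) alone.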

As an example of this process, reconsider the spurious counter-strategy of Fig. 5c. Already after the first system response \( \llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {x}}} \rrbracket \), the environment produces an inconsistency by evaluating \( {{\texttt {p}}} \ {{\texttt {x}}} \) and \( {{\texttt {p}}} \ {{\texttt {y}}} \) differently. This is inconsistent, as the cell \( {{\texttt {y}}} \) holds the same value at time \( t = 1 \) as the input \( {{\texttt {x}}} \) at time \( t = 0 \). Using Algorithm 1 we generate the new assumption \( \square \, ( \llbracket {{\texttt {y}}} \leftarrowtail \, {{\texttt {x}}} \rrbracket \rightarrow ({{\texttt {p}}} \ {{\texttt {x}}} \leftrightarrow \bigcirc \, {{\texttt {p}}} \ {{\texttt {y}}}) ) \). After adding this strengthening, the LTL synthesizer returns a realizability result.

Undecidability. Although we can approximate the semantics of TSL with LTL, there are TSL formulas that cannot be expressed as LTL formulas of finite size.

Theorem 2

([19]).  The realizability problem of \( \text {TSL} \) is undecidable.

6 TSL Synthesis

Our synthesis framework provides a modular refinement process to synthesize executables from \( \text {TSL} \) specifications, as depicted in Fig. 1. The user initially provides a \( \text {TSL} \) specification over predicate and function terms. At the end of the procedure, the user receives an executable to control a reactive system.

The first step of our method answers the synthesis question of TSL: if the specification is realizable, a control flow model is returned. To this end, we use an intermediate translation to LTL, utilizing an LTL synthesis solver that produces circuits in the AIGER format. The resulting control flow model is then turned into Haskell code, implemented as an independent Haskell module. The user has the choice between two different targets: a module built on Arrows, which is compatible with any Arrowized FRP library, or a module built on Applicative, which supports Applicative FRP libraries. Our procedure generates a single Haskell module per TSL specification, which naturally allows a project to be decomposed according to individual tasks. Each module provides a single component, parameterized by its initial state and the pure function and predicate transformations. As soon as these are provided as part of the surrounding project context, the final executable can be generated by compiling the Haskell code.

An important feature of our synthesis approach is that implementations for the terms used in the specification are only required after synthesis. This allows the user to explore several possible specifications before deciding on any term implementations.

Control Flow Model. The first step of our approach is the synthesis of a Control Flow Model \( \mathcal {M} \) (CFM) from the given \( \text {TSL} \) specification \( \varphi \), which provides us with a uniform representation of the control flow structure of our final program.

Formally, a CFM \( \mathcal {M} \) is a tuple \( \mathcal {M} = (\mathbb {I}, \mathbb {O}, \mathbb {C}, V, \ell , \delta ), \) where \( \mathbb {I} \) is a finite set of inputs, \( \mathbb {O} \) is a finite set of outputs, \( \mathbb {C} \) is a finite set of cells, \( V\) is a finite set of vertices, \( \ell :V\rightarrow \mathbb {F} \) assigns each vertex a function \( {{\texttt {f}}} \in \mathbb {F} \) or a predicate \( {{\texttt {p}}} \in \mathbb {P} \), and

$$\begin{aligned} \delta :(\mathbb {O} \cup \mathbb {C} \cup V) \times \mathbb {N}\rightarrow (\mathbb {I} \cup \mathbb {C} \cup V\cup \{ \bot \}) \end{aligned}$$

is a dependency relation that relates every output, cell, and vertex of the CFM with its \( n \in \mathbb {N}\) arguments, which are either inputs, cells, or vertices. Outputs and cells \( {{\texttt {s}}} \in \mathbb {O} \cup \mathbb {C} \) always have exactly one argument, i.e., \( \delta ({{\texttt {s}}}, 0) \not \equiv \bot \) and \( \forall m > 0 .\ \delta ({{\texttt {s}}}, m) \equiv \bot \), while for vertices \( x \in V\) the number of arguments \( n \in \mathbb {N}\) aligns with the arity of the assigned function or predicate \( \ell (x) \), i.e., \( \forall m \in \mathbb {N}.\ \delta (x, m) \equiv \bot \leftrightarrow m > n \). A CFM is valid if it does not contain circular dependencies, i.e., every cycle induced by \( \delta \) must contain at least one cell. We only consider valid CFMs.
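To make the validity condition concrete, the following sketch (a hypothetical Python encoding of our own, not part of the framework) stores \( \delta \) as an argument list per node and checks that every dependency cycle passes through at least one cell. Cells may lie on cycles because they refer to the value of the previous time step and therefore break the instantaneous dependency.

```python
def is_valid_cfm(inputs, outputs, cells, vertices, delta):
    """delta maps each output/cell/vertex to its list of argument nodes
    (inputs, cells, or vertices).  A CFM is valid iff the dependency
    graph restricted to non-cell nodes is acyclic: cells break cycles,
    since they hold the value of the previous time step."""
    def args(node):
        # follow only instantaneous dependencies (skip cells and inputs)
        return [a for a in delta.get(node, [])
                if a not in cells and a not in inputs]

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def dfs(node):
        if color.get(node, WHITE) == GRAY:
            return False                  # found a cycle without a cell
        if color.get(node, WHITE) == BLACK:
            return True
        color[node] = GRAY
        ok = all(dfs(a) for a in args(node))
        color[node] = BLACK
        return ok

    return all(dfs(n) for n in list(outputs) + list(cells) + list(vertices))
```

For example, a feedback loop through a cell (output fed back into the vertex computing the cell's update) is valid, whereas two vertices depending instantaneously on each other are rejected.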

Fig. 6. Example CFM of the music player generated from a TSL specification.

An example CFM for our music player of Sect. 2 is depicted in Fig. 6. Inputs \( \mathbb {I} \) enter from the left and outputs \( \mathbb {O} \) leave on the right. The example contains a single cell \( {{\texttt {c}}} \in \mathbb {C} \), which holds the stateful memory Cell, introduced during synthesis for the module. The green, arrow-shaped boxes depict vertices \( V\), labeled with function and predicate names according to \( \ell \). For the Boolean decisions that define \(\delta \), we use circuit symbols for conjunction, disjunction, and negation. The Boolean decisions are piped to a multiplexer gate that selects the respective update streams: each update stream is passed to an output stream if and only if the respective Boolean trigger evaluates positively, while our construction ensures mutual exclusion of the Boolean triggers. For code generation, the logic gates are implemented using the corresponding dedicated Boolean functions.

After building this control structure, we assign semantics to functions and predicates by providing implementations. To this end, we use Functional Reactive Programming (FRP). Prior work has established Causal Commutative Arrows (CCA) as an FRP language pattern equivalent to a CFM [33, 34, 53]. CCAs are an abstraction subsumed by other functional reactive programming abstractions, such as Monad, Applicative, and Arrow [32, 33]. There are many FRP libraries using Monads [11, 14, 42], Applicative [2, 3, 23, 48], or Arrows [10, 39, 41, 51]. Since every Monad is also an Applicative, and Applicative and Arrows are both universal design patterns, we can give uniform translations to all of these libraries via translations to just Applicative and Arrows. Both translations are possible due to the flexible notion of a CFM.

In the last step, the synthesized FRP program is compiled into an executable, using the provided function and predicate implementations. This step is not fixed to a single compiler implementation, but can use any FRP compiler (or library) that supports a language abstraction at least as expressive as CCA. For example, instead of creating an Android music player app, we could target an FRP web interface [48] to create an online music player, or an embedded FRP library [23] to instantiate the player on a computationally more restricted device. Due to the strong core of CCA, we can even implement the player directly in hardware, for example with the C\( \lambda \)aSH compiler [3]. Note that separate implementations of the functions and predicates are needed for each target; the specification and the synthesized CFM, however, always stay the same.

7 Experimental Results

To evaluate our synthesis procedure we implemented a tool that follows the structure of Fig. 1. It first encodes the given TSL specification in LTL and then refines it until the LTL solver either produces a realizability result or returns a non-spurious counter-strategy. For LTL synthesis we use the bounded synthesis tool BoSy [15]. As soon as a realizing strategy is found, it is translated into a corresponding CFM, from which we generate the FRP program structure. Finally, after the function implementations are provided, the result is compiled into an executable.

Table 1. Number of cells \( |\mathbb {C} _{\mathcal {M}}| \) and vertices \( |V_{\mathcal {M}}|\) of the resulting CFM \( \mathcal {M} \) and synthesis times for a collection of TSL specifications \( \varphi \). A * indicates that the benchmark additionally has an initial condition as part of the specification.
Table 2. Set of programs that use purity to keep one or two counters in range. Synthesis needs multiple refinements of the specification to prove realizability.

To demonstrate the effectiveness of synthesizing TSL, we applied our tool to a collection of benchmarks from different application domains, listed in Table 1. Every benchmark class consists of multiple specifications, addressing different features of TSL. We created all specifications from scratch, taking care that they relate either to existing textual specifications or to real-world scenarios. A short description of each benchmark class is given in [19].

For every benchmark, we report the synthesis time and the size of the synthesized CFM, split into the number of cells (\( | \mathbb {C} _{\mathcal {M}} | \)) and vertices (\( | V_{\mathcal {M}} | \)) used. The synthesized CFM may use more cells than the original TSL specification if synthesis requires additional memory to realize a correct control flow. The experiments were executed on a quad-core Intel Xeon E3-1271 v3 processor (3.6 GHz) with 32 GB RAM (PC1600, ECC), running 64-bit Ubuntu 16.04 LTS.

The experiments of Table 1 show that TSL successfully lifts the applicability of synthesis from the Boolean domain to arbitrary data domains, enabling new applications at every required level of abstraction. For all benchmarks, a realizable system was found within a reasonable amount of time, and the results often required synthesized cells to realize the control flow behavior.

We also considered a preliminary set of benchmarks that require multiple refinement steps to be synthesizable; an overview of the results is given in Table 2. The benchmarks are inspired by examples of the Reactive Banana FRP library [2]. Here, the system must exploit the purity of function and predicate applications to ensure that the value of one or two counters never goes out of range. The system needs purity not only to verify this condition, but also to make the correct decisions in the synthesized implementation.
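To give an intuition for these benchmarks, the following sketch (in Python, with hypothetical predicate names; the actual benchmarks are TSL specifications, not code) shows the kind of control flow the synthesizer has to find: the controller may branch only on the results of the pure predicates and must choose between the increment and decrement updates so that the counter never leaves its range.

```python
def counter_controller(at_top, at_bottom, steps):
    """Controller sketch: keeps an abstract counter in range by switching
    between increment and decrement updates, based only on the pure
    predicates at_top/at_bottom (hypothetical names)."""
    direction = 1                       # current update choice: +1 or -1
    value = 0
    trace = []
    for _ in range(steps):
        if at_top(value):               # pure predicate: consistent results
            direction = -1
        elif at_bottom(value):
            direction = 1
        value = value + direction       # apply the chosen update stream
        trace.append(value)
    return trace
```

Because the predicates are pure, the controller can rely on their results staying consistent with the stored counter value, which is exactly the reasoning the refinement loop has to make available to the LTL synthesizer.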

8 Related Work

Our approach builds on the rich body of work on reactive synthesis, see [17] for a survey. The classic reactive synthesis problem is the construction of a finite-state machine that satisfies a specification in a temporal logic like LTL. Our approach differs from the classic problem in its connection to an actual programming paradigm, namely FRP, and its separation of control and data.

The synthesis of reactive programs, rather than finite-state machines, has previously been studied for standard temporal logic [21, 35]. Because there is no separation of control and data, these approaches do not directly scale to realistic applications. With regard to FRP, a Curry-Howard correspondence between LTL and FRP in a dependently typed language was discovered [28, 29] and used to prove properties of FRP programs [8, 30]. However, our paper is the first, to the best of our knowledge, to study the synthesis of FRP programs from temporal specifications.

The idea to separate control and data has appeared, on a smaller scale, in synthesis with identifiers, where identifiers, such as the number of a client in a mutual exclusion protocol, are treated symbolically [13]. Uninterpreted functions have been used to abstract data-related computational details in the synthesis of synchronization primitives for complex programs [5]. Another connection to other synthesis approaches is our CEGAR loop. Similar refinement loops also appear in other synthesis approaches, however with a different purpose, such as the refinement of environment assumptions [1].

So far, there is no immediate connection between our approach and the substantial work on deductive and inductive synthesis, which is specifically concerned with the data-transformation aspects of programs [16, 31, 40, 47, 49, 50]. Typically, these approaches are focused on non-reactive sequential programs. An integration of deductive and inductive techniques into our approach for reactive systems is a very promising direction for future work. Abstraction-based synthesis [4, 12, 24, 37] may potentially provide a link between the approaches.

9 Conclusions

We have introduced Temporal Stream Logic, which allows the user to specify the control flow of a reactive program. The logic cleanly separates control from complex data, forming the foundation for our procedure to synthesize FRP programs. By utilizing the purity of function transformations our logic scales independently of the complexity of the data to be handled. While we have shown that scalability comes at the cost of undecidability, we addressed this issue by using a CEGAR loop, which lazily refines the underapproximation until either a realizing system implementation or an unrealizability proof is found.

Our experiments indicate that TSL synthesis works well in practice on a wide range of programming applications. TSL also provides the foundation for further extensions. For example, a user may want to fix the semantics of a subset of the functions and predicates. Such refinements can be implemented as part of a much richer TSL Modulo Theory framework.