Keywords

1 Introduction

Linear Temporal Logic with Past (\(\textsf{LTL}\)  [17, 23]) is the de-facto standard language for the specification, verification, and synthesis of reactive systems [19]. Concerning these reasoning tasks, two fundamental subsets of \(\textsf{LTL}\)-definable languages come into play, namely, safety and cosafety languages. Safety languages express properties stating that “something bad never happens”; cosafety languages, instead, express the fact that “something good will eventually happen”. The crucial feature of cosafety (resp., safety) languages is that checking a finite prefix of an infinite trace suffices to establish whether the entire trace belongs (resp., does not belong) to the language. Such an ability of reducing reasoning over infinite words to the finite case plays a fundamental role in lowering the complexity of reasoning tasks [16]. Because of this, while \(\textsf{LTL}\) was commonly interpreted over infinite traces, recent work mainly considers its finite trace semantics [8, 18, 22].

In what follows, given a set of temporal operators S, we write \(\textsf{LTL} [S]\) for the set of all \(\textsf{LTL} \) formulae in negation normal form whose temporal operators are restricted to those in S. Similarly, we denote with \(\textsf{F}(\textsf{LTL} [S])\) the set of formulae of the form \(\textsf{F}(\alpha )\), with \(\alpha \in \textsf{LTL} [S]\). Here, \(\textsf{F}\) is the future modality (a.k.a. eventually).

There are two notable syntactic characterizations of the cosafety languages of \(\textsf{LTL}\). The first one is a pure future characterization given by the logic \(\textsf{LTL} [\textsf{X},\textsf{U}]\) featuring modalities next \(\textsf{X}\) and until \(\textsf{U}\). The second one is an eventually pure pastFootnote 1 characterisation given by the logic \(\textsf{F}(\textsf{pLTL})\), where \(\textsf{pLTL}\) is the pure past fragment of \(\textsf{LTL} \), that is, the restriction of \(\textsf{LTL} \) to past modalities. Analogous characterizations have been provided for safety languages.

As for applications, \(\textsf{F}(\textsf{pLTL})\) is considered to be much more convenient than \(\textsf{LTL} [\textsf{X,U}]\), because, starting from an (eventually) pure past formula of size n, it is possible to build an equivalent deterministic finite automaton of singly exponential size in n [7]. In the case of \(\textsf{LTL} [\textsf{X,U}]\), such an automaton may have size doubly exponential in n [16]. This computational advantage of pure past formulae originated a recent line of research that focuses on the pastification problem, i.e., the problem of translating an input pure future formula for a cosafety (or safety) language into an equivalent pure past (equivalently, eventually pure past) formula. While the best known algorithm for \(\textsf{LTL} [\textsf{X,U}]\) is triply exponential [7], a singly exponential pastification algorithm to transform \(\textsf{LTL} [\textsf{X},\textsf{F}]\) formulae into \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O}])\) ones has been recently developed in [4]. Here, modalities yesterday \(\textsf{Y}\) and once \(\textsf{O}\) are the “temporal reverses” of modalities \(\textsf{X}\) and \(\textsf{F}\), respectively, whereas the weak yesterday operator \(\mathsf {\widetilde{\textsf{Y}}}\) is the dual of \(\textsf{Y}\) (we formally define the semantics of all these modalities in Section 2). No super-polynomial lower bounds for these pastification problems are known.

While the above two characterisations of cosafety languages have been thoroughly studied in the last decades in terms of expressiveness [6] and complexity [2], their succinctness is still poorly understood. To the best of our knowledge, the only known result is the one in [3] showing that \(\textsf{F}(\textsf{pLTL})\) can be exponentially more succinct than \(\textsf{LTL} [\textsf{X,U}]\) — note that lower bounds to pastification problems require the opposite direction.Footnote 2

In this paper, we study the succinctness of \(\textsf{LTL} [\textsf{F}] \) against \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O,H}])\), where \(\textsf{H}\) is the dual of \(\textsf{O}\), as well as the succinctness of their reverse logics [3], that is, the succinctness of \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) against \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \). For these fragments of \(\textsf{LTL}\), we establish the following two results.

Theorem 1

\(\textsf{F}(\textsf{LTL} [\textsf{O}])\) can be exponentially more succinct than \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \).

Theorem 2

\(\textsf{LTL} [\textsf{F}]\) can be exponentially more succinct than \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\).

The two theorems prove an incomparability result about the succinctness of the characterizations of cosafety languages in the pure future and eventually pure past fragments of \(\textsf{LTL}\). Theorem 1 and Theorem 2 hold for both the finite and infinite trace semantics of \(\textsf{LTL}\) (however, due to lack of space, we report the proof of Theorem 1 only in the case of finite traces). As a corollary, Theorem 2 implies that the pastification algorithm proposed in [4] is optimal.

Corollary 1

The pastification of \(\textsf{LTL} [\textsf{X},\textsf{F}]\) into \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\) is in \(2^{\varTheta (n)}\).

To prove Theorem 1, we devise and apply a combinatorial proof system.Footnote 3 Given two sets of finite traces A and B, with the proof system one can establish whether there is a formula \(\varphi \) in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\) that separates A from B, that is, \(\varphi \) is satisfied by all traces in A (written \(A \models \varphi \)) and violated by all traces in B (written ). A proof obtained by applying k rules of the proof system corresponds to the existence of one such separating formula \(\varphi \) of size k.

The proposed combinatorial proof system can be seen as a reformulation in terms of proofs of the games introduced by Adler and Immerman to show that \(\mathsf {CTL^+}\) is \(\varTheta (n)!\) more succinct than \(\textsf{CTL}\)  [1]. They are two-player games that extend Ehrenfeucht–Fraïssé games for quantifier depth in a way that captures the notion of formula size instead. However, unlike Ehrenfeucht–Fraïssé ones, in Adler–Immerman games one of the two players (the duplicator) has always a trivial strategy. With our proof system, we show that removing the duplicator from the game yields a natural one-player game based on building proofs.

To prove Theorem 1 by applying the proposed proof system, we provide, for every \(n \ge 1\), a formula \(\varPhi _n\) in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) of size linear in n and two sets of traces \(\textbf{A}_n\) and \(\textbf{B}_n\) such that \(\textbf{A}_n \models \varPhi _n\) and , and then we show that the smallest deduction tree that separates \(\textbf{A}_n\) from \(\textbf{B}_n\) has size at least \(2^n\). This implies that all formulas of \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \) capturing \(\varPhi _n\) are of size at least \(2^n\).

Once Theorem 1 is established, one can prove Theorem 2 by “reversing” the direction of time, building correspondences between formulae of \(\textsf{LTL} [\textsf{F}]\) and \(\textsf{F}{\textsf{LTL} [\textsf{O}]}\), and between formulae of \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O,H}])\) and \(\textsf{LTL} [\mathsf {X,\widetilde{\textsf{X}},F,G}]\).

In the context of \(\textsf{LTL}\), the main technique to prove “future against past” succinctness discrepancies is arguably the automata method introduced by Markey in [20]. At its core, such a method exploits the fact that pure future formulae of \(\textsf{LTL}\) can be translated into nondeterministic Büchi automata of exponential size, and thus no property requiring a doubly exponential size automaton can be represented succinctly. The introduction of our proof system raises the question of whether Markey’s method can be applied to establish our succinctness results. We prove that it cannot be used in our context. In order to obtain such a result, the key observation is that, given a cosafety formula \(\textsf{F}\psi \), a deterministic Büchi automaton (\(\textsf{DBA}\)) for \(\textsf{F}\psi \) of size \(\ell \), and a prefix \(\varPi \) consisting of k temporal operators among \(\textsf{X}\), \(\textsf{F}\), and \(\textsf{G}\), the minimal \(\textsf{DBA}\) for the formula \(\varPi \textsf{F}\psi \) has size polynomial in k and \(\ell \).

Synopsis. Section 2 introduces the necessary background. Section 3 discusses the languages we use to prove Theorem 1. Section 4 introduces the combinatorial proof system. In Section 5 we prove Theorem 1. In Section 6 we prove Theorem 2 and Corollary 1. The limits of the automata-based method to prove succinctness lower bounds are discussed in Section 7. Related and future work are discussed in Section 8. An extended version of the paper, complete of all proofs, can be found in [13].

2 Preliminaries

In this section, we introduce background knowledge on \(\textsf{LTL}\) focusing on finite traces. All definitions admit a natural extension to the setting of infinite traces.

Let \(\varSigma \) be a finite alphabet. We denote by \(\varSigma ^*\) the set of all finite words over \(\varSigma \) and by \(\varSigma ^+\) the subset of finite non-empty words. We use the term trace as a synonym of word. A language \(\mathcal {L}\) over \(\varSigma \) is a subset of \(\varSigma ^*\). Let \(\sigma = \langle {w_0,w_1,\dots ,w_n}\rangle \) be a word in \(\varSigma ^*\). We denote by \(|\sigma |\) the length of \(\sigma \), that is, \(n+1\). A position in \(\sigma \) is an element in the set \({\text {pos}(\sigma ) {:}{=}[0,n]}\). For every \(i \in \text {pos}(\sigma )\), we denote by \(\sigma [i]\in \varSigma \) the letter \(w_i\), and by \(\sigma {[i\rangle } \) the word \(\langle {w_i,\dots ,w_n}\rangle \). We say that position j of \(\sigma \) has type \(\tau \in \varSigma \) whenever \(\sigma [j] = \tau \). Given two traces \(\sigma _1\) and \(\sigma _2\), we write \(\sigma _1 \sqsubseteq \sigma _2\) whenever \(\sigma _1\) is a suffix of \(\sigma _2\), that is, there is \(j \in \text {pos}(\sigma _2)\) such that \(\sigma _1 = \sigma _2{[j\rangle } \). Given a word \(\sigma '\in \varSigma ^*\), we denote the concatenation of \(\sigma '\) to \(\sigma \) as \(\sigma \cdot \sigma '\), or simply \(\sigma \sigma '\). Given two languages \(\mathcal {L}\) and \(\mathcal {L}'\), we define \(\mathcal {L}\cdot \mathcal {L}' {:}{=}\{\sigma \cdot \sigma ' \mathrel {\vert }\sigma \in \mathcal {L}, \sigma '\in \mathcal {L}'\}\). We sometimes apply the concatenation to a word and a language; in these cases the word is implicitly converted into a singleton language, e.g., \(\sigma \cdot \mathcal {L}{:}{=}\{\sigma \} \cdot \mathcal {L}\). With \(A \mathbin {\subseteq _{\textit{fin}}} B\) we denote the fact that A is a finite subset of the set B.

Linear Temporal Logic with Past. In the following, we introduce syntax and semantics of Linear Temporal Logic with Past (\(\textsf{LTL}\)) restricted to those operators that we are going to use throughout the paper. In particular, we omit the future operators until and release, and their past counterparts (since and triggers). Let \(\mathcal{A}\mathcal{P}\) be a finite set of atomic propositions. The syntax of the formulae over \(\mathcal{A}\mathcal{P}\) is generated by the following grammar:

$$\begin{aligned} \varphi {:}{=}p & \mathrel {\vert }\lnot p \mathrel {\vert }\varphi \vee \varphi \mathrel {\vert }\varphi \wedge \varphi & \text {Boolean connectives} \\ & \mathrel {\vert }\mathsf {X\varphi } \mathrel {\vert }\mathsf {\widetilde{\textsf{X}}\varphi } \mathrel {\vert }\mathsf {F \varphi } \mathrel {\vert }\mathsf {G \varphi } & \text {future operators} \\ & \mathrel {\vert }\mathsf {Y\varphi } \mathrel {\vert }\mathsf {\widetilde{\textsf{Y}}\varphi } \mathrel {\vert }\mathsf {O \varphi } \mathrel {\vert }\mathsf {H \varphi } & \text {past operators} \end{aligned}$$

where \(p\in \mathcal{A}\mathcal{P}\). The temporal operators are respectively called: \(\textsf{X}\), next; \(\mathsf {\widetilde{\textsf{X}}}\), weak next; \(\textsf{F}\), future; \(\textsf{G}\), globally; \(\textsf{Y}\), yesterday; \(\mathsf {\widetilde{\textsf{Y}}}\), weak yesterday; \(\textsf{O}\), once; \(\textsf{H}\), historically. For the rest of the paper, we let \(\mathbb{O}\mathbb{P}{:}{=}\{\mathsf {X,\widetilde{\textsf{X}},F,G,Y,\widetilde{\textsf{Y}},O,H}\}\).

For every formula \(\varphi \), we define the size of \(\varphi \), denoted by \(\mathrm{{size}}(\varphi )\), inductively defined as follows: (i) \(\mathrm{{size}}(p) {:}{=}1\) and \(\mathrm{{size}}(\lnot p) {:}{=}1\), (ii) \(\mathrm{{size}}(\otimes \varphi ) {:}{=}\mathrm{{size}}(\varphi ) + 1\), for \(\otimes \in \mathbb{O}\mathbb{P}\), and (iii) \(\mathrm{{size}}(\varphi _1 \oplus \varphi _2) {:}{=}\mathrm{{size}}(\varphi _1) +\) \(\mathrm{{size}}(\varphi _2) + 1\) for \(\oplus \in \{\vee ,\wedge \}\).

We focus on the interpretation of \(\textsf{LTL} \) formulae over finite non-empty traces over the alphabet \(2^{\mathcal{A}\mathcal{P}}\). From now on, we set the alphabet \(\varSigma \) to be \(2^{\mathcal{A}\mathcal{P}}\). Given a word \(\sigma \in \varSigma ^+\), the satisfaction of a formula \(\varphi \) by \(\sigma \) at time point / position \(i \in \text {pos}(\sigma )\), denoted by \(\sigma ,i\models \varphi \), is defined as follows:

figure c

For every formula \(\varphi \), we say that a trace \(\sigma \) satisfies \(\varphi \), written \(\sigma \models \varphi \), if \(\sigma ,0\models \varphi \). The language of \(\varphi \), denoted by \(\mathcal {L}(\varphi )\), is the set of words \(\sigma \in \varSigma ^+\) such that \(\sigma \models \varphi \). Given two formulae \(\varphi \) and \(\psi \), we say that \(\varphi \) is equivalent to \(\psi \), written \(\varphi \equiv \psi \), whenever \(\mathcal {L}(\varphi ) = \mathcal {L}(\psi )\).

Fragments of LTL. Given a set of operators \(S \subseteq \mathbb{O}\mathbb{P}\), we denote by \(\textsf{LTL} [S]\) the set of formulae only using temporal operators from S. When dealing with a concrete S, we omit the curly brackets and write, e.g., \(\textsf{LTL} [\textsf{X},\textsf{F}]\) instead of \(\textsf{LTL} [\{\textsf{X,F}\}]\). Whenever S contains only future operators (resp., past operators), the logic \(\textsf{LTL} [S]\) is called a pure future (resp., pure past) fragment of \(\textsf{LTL}\). Finally, we denote by \(\textsf{F}(\textsf{LTL} [S])\) (resp., \(\textsf{G}(\textsf{LTL} [S])\)) the set of formulae of the form \(\mathsf {F(\alpha )}\) (resp., \(\mathsf {G(\alpha )}\)), where \(\alpha \) is a formula of \(\textsf{LTL} [S]\). A language \(\mathcal {L}\subseteq \varSigma ^*\) is a cosafety language whenever \(\mathcal {L}= K \cdot \varSigma ^*\), for some \(K \subseteq \varSigma ^*\). A language \(\mathcal {L}\) is a safety language whenever its complement \(\overline{\mathcal {L}}\) is a cosafety language. For every formula \(\varphi \) in the fragments \(\textsf{LTL} [\textsf{X,F}]\) and \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O,H}])\), it holds that \(\mathcal {L}(\varphi )\) is a cosafety language. Similarly, for every formula \(\varphi \) in the fragments \(\textsf{LTL} [\mathsf {\widetilde{\textsf{X}},G}]\) and \(\textsf{G}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O,H}])\), it holds that \(\mathcal {L}(\varphi )\) is a safety language.

The pastification problem. Given two sets \(S \subseteq \{\mathsf {X,\widetilde{\textsf{X}},F,G}\}\) and \(S' \subseteq \{\mathsf {Y,\widetilde{\textsf{Y}},O,H}\}\), the pastification problem for \(\textsf{LTL} [S]\) into \(\textsf{F}(\textsf{LTL} [S'])\) asks, given an input formula \(\varphi \in \textsf{LTL} [S]\), to return a formula \(\psi \) from \(\textsf{F}(\textsf{LTL} [S'])\) such that \(\varphi \equiv \psi \). An algorithm for the pastification problem is said to be of k-exponential size (for \(k\in \mathbb {N}\) fixed) whenever the output formula \(\psi \) is such that \(\mathrm{{size}}(\psi ) \in \exp _2^k(\text {poly}(\mathrm{{size}}(\varphi )))\), where \(\exp ^k(.)\) is the k-th iteration of the base-2 tetration function given by \(\exp ^0(n) = n\) and \(\exp ^{i+1}(n) = 2^{\exp ^i(n)}\). In [4], an exponential time, 1-exponential size, pastification algorithm for \(\textsf{LTL} [\textsf{X,F}]\) into \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O}])\) is presented.

Succinctness. Given two sets \(S,S' \subseteq \mathbb{O}\mathbb{P}\), we say that \(\textsf{LTL} [S]\) can be exponentially more succinct than \(\textsf{LTL} [S']\) if there is a family of languages \((\mathcal {L}_n)_{n \ge 1}\) such that, for every \(n \ge 1\), \(\mathcal {L}_n \subseteq \varSigma _n^+\), for some alphabet \(\varSigma _n\), and:

  • there is \(\varphi \in \textsf{LTL} [S]\) such that \(\mathcal {L}(\varphi ) = \mathcal {L}_n\) and \(\mathrm{{size}}(\varphi ) \in \text {poly}(n)\), and

  • for every \(\psi \in \textsf{LTL} [S']\), if \(\mathcal {L}(\psi ) = \mathcal {L}_n\) then \(\mathrm{{size}}(\psi ) \in 2^{\varOmega (n)}\).

It is worth noticing that the above-given syntax for \(\textsf{LTL}\) is already in negation normal form, that is, negation may only appear in front of atomic propositions. Allowing negations to occur freely in the formula neither increase expressiveness nor succinctness, as the grammar above is already closed under dual operators, e.g., \(\textsf{G}\varphi \equiv \lnot \textsf{F}\lnot \varphi \), and the size of a formula does not depend on the number of negations occurring in literals. Because of this, all results given in the paper continue to hold when negation is added to the language.

3 A problematic cosafety language for \(\textsf{LTL} [\textsf{X},\widetilde{\textsf{X}},\textsf{F},\textsf{G}]\)

We now describe the property that we will exploit to prove that \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) can be exponentially more succinct than \(\textsf{LTL} [\mathsf {X,\widetilde{\textsf{X}},F,G}]\) (Theorem 1). More precisely, we define a family of \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) formulae \((\varPhi _n)_{n \ge 1}\) such that, for every \(n \ge 1\), \(\varPhi _n\) has size in \(\mathcal {O}(n)\) and captures a property requiring a formula of size at least \(2^n\) to be expressed in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \) (as we will see in Section 5).

Let \(n \ge 1\). We consider the alphabet of \(2n+2\) distinct atomic propositions \({\mathcal{A}\mathcal{P}{:}{=}\{\widetilde{p},\widetilde{q}\} \cup P \cup Q}\), with \(P {:}{=}\{p_1,\dots ,p_n\}\) and \(Q {:}{=}\{q_1,\dots ,q_n\}\). For all \(n \ge 1\), the formula \(\varPhi _n\) of \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) is defined as follows:

$$ \textstyle \varPhi _n {:}{=}\textsf{F}\left( \widetilde{q} \wedge \bigwedge _{i=1}^n \Big ( \big (q_i \wedge \textsf{O}( \widetilde{p} \wedge p_i)\big ) \vee \big (\lnot q_i \wedge \textsf{O}(\widetilde{p} \wedge \lnot p_i)\big ) \Big ) \right) . $$

Observe that, for every \(n \ge 1\), \(\mathrm{{size}}(\varPhi _n)\) belongs to \(\mathcal {O}(n)\). The formula \(\varPhi _n\) is satisfied by those traces \(\sigma \in \varSigma ^+\) where there is a position \(j \in \text {pos}(\sigma )\) such that (i) \(\widetilde{q} \in \sigma [j]\) and (ii) for every \(i \in [1,n]\) there is a position \(k_i \in [0,j]\) such that \(\widetilde{p} \in \sigma [k_i]\) and \(q_i \in \sigma [j]\) if and only if \(p_i \in \sigma [k_i]\). Notice that each \(k_i \in [0,j]\) depends on an index \(i \in [1,n]\). Therefore, for distinct \(i,j \in [1,n]\) the positions \(k_i\) and \(k_j\) might differ. This feature is crucial to get a language which has a compact definition in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\), but is hard to capture for \(\textsf{LTL} [\mathsf {X,\widetilde{\textsf{X}},F,G}]\).

As a matter of fact, requiring the various \(k_i\) to coincide yields a formula \(\varPsi _n\) characterising the property: “the trace \(\sigma \) has two positions \(j \ge k\) such that \(\widetilde{p} \in \sigma [k]\), \(\widetilde{q} \in \sigma [j]\) and, for every \(i \in [1,n]\), \(q_i \in \sigma [j]\) if and only if \(p_i \in \sigma [k]\)”. This formula is known to require exponential size in \(\textsf{LTL}\)  [20], and therefore in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) as well. In a sense, the asymmetry obtained by relaxing the uniqueness of the position k above is what makes \(\varPhi _n\) being easily expressible in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\), but difficult to characterise in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \). The same trick, applied to position j instead of position k, can be used to obtain a family of formulae that can be represented in an exponentially more succinct way in \(\textsf{LTL} [\textsf{F}]\) than in \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\). This form of “temporal duality” is what we will ultimately exploit in Section 6 to prove Theorem 2.

The following lemma shows that \(\varPhi _n\) can be expressed in \(\textsf{LTL} [\textsf{F}] \) (and thus in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \) as well) with a formula of exponential size.

Lemma 1

For every \(n \ge 1\), there is a formula \(\varPhi _n'\) in \(\textsf{LTL} [\textsf{F}] \) such that \(\varPhi _n' \equiv \varPhi _n\) and \(\mathrm{{size}}(\varPhi _n') < 2^{n+1} (n+2)^2\).

Proof sketch

Given \(\tau \in 2^P\), we write \(\overline{\tau }\) for the element of \(2^Q\) such that \(p_i \in \tau \) if and only if \(q_i \in \overline{\tau }\), for every \(i \in [1,n]\). Then, the formula \(\varPhi _n'\) is defined as follows:

$$ \textstyle \varPhi _n' {:}{=}\bigvee _{\tau \,\in \, 2^{P}} \Big ( \bigwedge _{p \,\in \, \tau } \textsf{F}\big (\widetilde{p} \wedge p \wedge \textsf{F}(\widetilde{q} \wedge \psi _{\overline{\tau }})\big ) \wedge \bigwedge _{p \,\in \, P \setminus \tau } \textsf{F}\big (\widetilde{p} \wedge \lnot p \wedge \textsf{F}(\widetilde{q} \wedge \psi _{\overline{\tau }})\big ) \Big ), $$

where \(\psi _{\overline{\tau }} {:}{=}(\bigwedge _{q \,\in \, \overline{\tau }} q \wedge \bigwedge _{q \,\in \, Q \setminus \overline{\tau }} \lnot q)\).    \(\square \)

4 A combinatorial proof system for \(\textsf{LTL} [\textsf{X},\widetilde{\textsf{X}},\textsf{F},\textsf{G}]\)

In this section, we introduce the proof system that we will later employ to prove Theorem 1, and discuss its connection with Adler–Immerman games [1].

Further notation. Let \(A \subseteq \varSigma ^+\), with \(\varSigma {:}{=}2^{\mathcal{A}\mathcal{P}}\) for some set of propositions \(\mathcal{A}\mathcal{P}\). We define \(A^{\textsf{X}} {:}{=}\{\sigma {[1\rangle } : \sigma \in A \text { s.t. } |\sigma | \ge 2\}\), i.e., the set of non-empty traces obtained from A by stepping each trace one position to the right. We define \(A^{\textsf{G}} {:}{=}{\{ \sigma {[j\rangle } : \sigma \in A \text { and } j \in \text {pos}(\sigma )\}}\), i.e., the set of all suffixes of the traces in A. We say that a map \(f : A \rightarrow \mathbb {N}\) is a future point for A whenever \(f(\sigma ) \in \text {pos}(\sigma )\) for every \(\sigma \in A\). We write \(\mathrm{{F}}_A\) for the set of all maps that are future points for A. Given a future point f for A and \(\sigma \in A\) with \(f(\sigma ) = i\), we define \(\sigma ^f {:}{=}\sigma {[i\rangle } \) and \({A^f {:}{=}\{ \sigma ^f : \sigma \in A \}}\). Note that, by definition, \(A^{\textsf{G}} = \bigcup _{f \in \mathrm{{F}}_A} A^f\).

For a formula \(\varphi \) of \(\textsf{LTL}\), we write \(A \models \varphi \) whenever \((\sigma ,0) \models \varphi \) for every \(\sigma \in A\), and whenever \((\sigma ,0) \not \models \varphi \) for every \(\sigma \in A\). Given two sets of traces \(A,B \subseteq \varSigma ^+\) we say that \(\varphi \) separates A from B whenever \(A \models \varphi \) and . We write \({\boldsymbol{\langle }}\cdot ,\cdot {\boldsymbol{\rangle }}_S \subseteq \varSigma ^+ \times \varSigma ^+\) for the separable relation on \(S \subseteq \mathbb{O}\mathbb{P}\), i.e., the binary relation holding on pairs (AB) whenever there is some formula from \(\textsf{LTL} [S]\) that separates A from B. Note that, when A and B are finite sets and \(\textsf{X} \in S\), deciding whether \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}_{S}\) holds is trivial.

Lemma 2

Let \(A,B \subseteq \varSigma ^+\) and \(S \subseteq \mathbb{O}\mathbb{P}\). Then, \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}_S\) implies \(A \cap B = \varnothing \). Moreover, if A and B are finite sets and \(\textsf{X}\in S\), \(A \cap B = \varnothing \) implies \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}_S\).

Proof sketch

For the first statement, clearly if \(A \cap B \ne \varnothing \) then it is not possible to separate A from B. To prove the second statement, one defines a disjunction \(\varphi \) of formulae, each characterising an element in A. For instance, for \(\mathcal{A}\mathcal{P}= \{p,q\}\), the trace \(\{p\}\{q\}\) can be characterised with the formula \((p \wedge \lnot q) \wedge \textsf{X}(q \wedge \lnot p \wedge \widetilde{\textsf{X}}\bot )\), where \(\bot {:}{=}p \wedge \lnot p\). Then, \(\varphi \) separates A from B.    \(\square \)

We mainly consider the relation \({\boldsymbol{\langle }}\cdot ,\cdot {\boldsymbol{\rangle }}_S\) with S being the set \(\{\textsf{X},\widetilde{\textsf{X}},\textsf{F},\textsf{G}\}\), and thus from now on simply write \({\boldsymbol{\langle }}\cdot ,\cdot {\boldsymbol{\rangle }}\) when considering this concrete choice of S.

4.1 The proof system

The combinatorial proof system that we define is a natural-deduction-style proof system. It is made of several inference rules of the form \(\frac{H_1 \ H_2 \ \dots \ H_n}{C}\), to be read as “if the hypotheses \(H_1,\dots ,H_n\) hold, then the consequence C holds”. As usual, proofs within the proof system have a tree-like presentation. An example of such a deduction tree is given in Figure 2, where \(a {:}{=}\{p\}\) and \(b {:}{=}\varnothing \), with \({p \in \mathcal{A}\mathcal{P}}\). This is a deduction tree for the term \({\boldsymbol{\langle }}\{abaa,aaaa\},\{aaab\}{\boldsymbol{\rangle }}\), which we call the root of the deduction tree. In Figure 2, to the root it is applied the rule Or, with hypotheses \({\boldsymbol{\langle }}\{abaa\},\{aaab\}{\boldsymbol{\rangle }}\) and \({\boldsymbol{\langle }}\{aaaa\},\{aaab\}{\boldsymbol{\rangle }}\). In turn, these two hypotheses are derived in the deduction tree by eventually reaching applications to the rule Atomic. A deduction tree is always closed: all maximal paths from the root ends with an application of the rule Atomic. This means that a rule of the proof system must be applied to each term \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) appearing in the tree. We call a tree a partial deduction tree if this property is not enforced, namely when there might be unproven terms \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\). The size of a deduction tree is the number of rules in it. For instance, the tree in Figure 2 has size 5.

Fig. 1.
figure 1

The combinatorial proof system. Here, \(A,B \subseteq \varSigma ^+\).

Fig. 2.
figure 2

A deduction tree proving \({\boldsymbol{\langle }}\{abaa,aaaa\},\{aaab\}{\boldsymbol{\rangle }}\). Here, \(a {:}{=}\{p\}\) and \({b {:}{=}\varnothing }\).

We define the inference rules of the proof system in Figure 1. Let us briefly describe these rules. The Atomic rule allows deriving \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) if every trace in A satisfies some literal \(\alpha \) and every trace in B violates \(\alpha \). The Or rule corresponds the case of A being separable from B via a formula of the form \(\varphi _1 \vee \varphi _2\). In this and the rule And, \(\uplus \) stands for the union of disjoint sets. Intuitively, Or can be applied by proving that \(\varphi _1\) separates a set \(A_1 \subseteq A\) from B and that \(\varphi _2\) separates the set \(A \setminus A_1\) from B. The Next rule allows separating A from B with a formula of the form \(\mathsf {X\varphi }\), by checking whether the sets obtained by stepping all traces in A and B to next time point are separable by \(\varphi \). The condition \(A \subseteq \varSigma \cdot {\varSigma ^{+}}\) is necessary to ensure that all traces in A have a next time point. The Future rule separates A from B by following this principle: if the set obtained by choosing one suffix for every trace in A is separable from the set of all suffixes of the traces in B, then there is a formula of the form \(\textsf{F}\varphi \) separating A from B. The rules And, WeakNext and Globally are designed to be duals of the rules Or, Next and Future, respectively.

By using the proof system one can derive whether a pair of (finite or infinite) sets of traces (AB) is in the separable relation \({\boldsymbol{\langle }}\cdot ,\cdot {\boldsymbol{\rangle }}\). Because of Lemma 2, this is not, however, a particularly useful application. Instead, the proof system is to be used to derive non-trivial lower (or upper) bounds on the size of the minimal formula that separates A from B. This is done by studying the sizes of the possible deduction trees of \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) in the proof system.

For instance, the deduction tree of Figure 2 shows that there is a formula \(\varphi \) having \(\mathrm{{size}}(\varphi ) = 5\) and separating \(\{abaa,aaaa\}\) from \(\{aaab\}\). This formula is found by simply reading bottom-up, starting from the root, the rules in the deduction tree, associating to each rule the homonymous operator of \(\textsf{LTL}\). In the case of the tree in Figure 2 we have \(\varphi {:}{=}{(\textsf{X}\lnot p) \vee \textsf{G}p}\). Note that the formula \(\varphi \) is not the smallest separating formula, because the formula \(\textsf{X X G p}\) also separates \(\{abaa,aaaa\}\) from \(\{aaab\}\) and corresponds to a tree of size 4.

The correspondence between deduction trees and formulae is formalised in the next theorem (we remark that A and B below do not need to be finite sets).

Theorem 3

Consider \(A,B \,\subseteq \,\varSigma ^+\). Then, the term \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) has a deduction tree of size k if and only if there is a formula \(\varphi \) of \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\) separating A from B and such that \(\mathrm{{size}}(\varphi ) = k\).

Proof sketch

We leave to the reader the proof of the left to right direction of the theorem (shown by induction on k), as it is not required to establish lower bounds on the sizes of formulae, and focus instead on the right to left direction.

Consider a \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\) formula \(\varphi \) that separates A and B. We construct a deduction tree of size \(\mathrm{{size}}(\varphi )\). We proceed by structural induction on \(\varphi \).

base case: \(\varphi \) literal. The deduction tree consists of a single rule Atomic.

induction step, case: \(\varphi = \varphi _1 \vee \varphi _2\). Define \(A_1 {:}{=}\{ a \in A : a \models \varphi _1 \}\) and \(A_2 {:}{=}A \setminus A_1\). From \(A \models \varphi \) and we get \(A_i \models \varphi _i\) and for both \(i \in \{1,2\}\). By induction hypothesis \({\boldsymbol{\langle }}A_i,B{\boldsymbol{\rangle }}\) has a deduction tree of size \(\mathrm{{size}}(\varphi _i)\). By applying the rule And, we obtain a deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) having size \(\mathrm{{size}}(\varphi _1) + \mathrm{{size}}(\varphi _2) + 1 = \mathrm{{size}}(\varphi )\).

induction step, case: \(\varphi = \textsf{X} \psi \). Since \(A \models \textsf{X} \psi \), for every \(\sigma \in A\) we have \(|\sigma | \ge 2\) and \((\sigma ,1) \models \psi \). By definition of \(A^{\textsf{X}}\), \(A \subseteq \varSigma \cdot {\varSigma ^{+}}\) and \(A^{\textsf{X}} \models \psi \). From , for every \(\sigma ' \in B\), if \(|\sigma '| \ge 2\) then \((\sigma ',1) \not \models \psi \). By definition of \(B^{\textsf{X}}\), we have . By induction hypothesis, \({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}\) has a deduction tree of size \(\mathrm{{size}}(\psi )\). We apply the rule Next to obtain a deduction tree of \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) of size \(\mathrm{{size}}(\psi )+1=\mathrm{{size}}(\varphi )\).

induction step: \(\varphi = \textsf{F} \psi \). Since \(A \models \textsf{F} \psi \), for every \(\sigma \in A\) there is \(j_\sigma \in \text {pos}(\sigma )\) such that \((\sigma ,j_\sigma ) \models \psi \). Let \(f \in \mathrm{{F}}_A\) be the map given by \(f(\sigma ) = j_{\sigma }\) for every \(\sigma \in A\). We have \(A^f \models \psi \). We show that . Ad absurdum, suppose there is \(\sigma _1 \in B^{\textsf{G}}\) such that \(\sigma _1 \models \psi \). By definition of \(B^{\textsf{G}}\) there is \(\sigma _2 \in B\) such that \(\sigma _1 \sqsubseteq \sigma _2\). Then, \((\sigma _2,0) \models \textsf{F}\psi \). However, this contradicts the fact that . Therefore, . By induction hypothesis, \({\boldsymbol{\langle }}A^f, B^{\textsf{G}}{\boldsymbol{\rangle }}\) has a deduction tree of size \(\mathrm{{size}}(\psi )\). By applying the rule Future, we obtain a deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) of size \(\mathrm{{size}}(\psi )+1 = \mathrm{{size}}(\varphi )\).

induction step, cases \(\varphi = \varphi _1 \wedge \varphi _2\), \(\varphi = \widetilde{\textsf{X}}\psi \) and \(\varphi = \textsf{G}\psi \). The cases for \(\varphi = \varphi _1 \wedge \varphi _2\), \(\varphi = \widetilde{\textsf{X}}\psi \) and \(\varphi = \textsf{G}\psi \) are analogous to the cases \(\varphi = \varphi _1 \vee \varphi _2\), \(\varphi = \textsf{X}\psi \) and \(\varphi = \textsf{F}\psi \), respectively.    \(\square \)

The right to left direction of Theorem 3 implies the following corollary that highlights how our proof system is used for formulae sizes lower bounds.

Corollary 2

Consider a formula \(\varphi \) in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\). Suppose that (i) there are \(A,B \subseteq \varSigma ^+\) such that \(\varphi \) separates A from B, and (ii) every deduction tree of \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) has size at least k. Then, \(\mathrm{{size}}(\varphi ) \ge k\).

4.2 Connections with the Adler–Immerman games

As outlined in Section 1, our proof system can be seen as an adaptation of the games for \(\textsf{CTL}\) introduced by Adler and Immerman in [1]. We now illustrate this connection. Readers that are mostly interested in seeing our proof system in action may want to skip to Section 5.

The Adler–Immerman games extend the classical Ehrenfeucht–Fraïssé games in order to bound the sizes of the formulae that separate two (sets of) structures, instead of their quantifier depths. As in the Ehrenfeucht–Fraïssé games, the Adler–Immerman games are two-player games between a spoiler and a duplicator. The game arena is a pair of sets of structures (AB), and at each round of the game the spoiler choses a rule r to play (there is one rule for each Boolean connective and operator of the logic) and plays on one set of structures accordingly to what r dictates. The duplicator replies on the other set, again accordingly to r. The goal of the spoiler is to separate A from B (i.e., to show \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) in the context of \(\textsf{CTL}\)) in fewer rounds as possible, whereas the duplicator must prolong the game as much as she can. The length of the minimal game corresponds to the size of the minimal formula separating A from B. The main difference between an Adler–Immerman game and an Ehrenfeucht–Fraïssé game is that, in the former, in each round the duplicator is allowed to make copies of the structures in the set she is playing on, and to play differently in each of these copies. This extra power given to the duplicator is why the games end up capturing the notion of size of a formula.

In the setting of the Adler–Immerman games, the rule for the operator \(\textsf{F}\) in \(\textsf{LTL}\) would be spelled as follows: “For each structure \(\sigma \in A\), the spoiler moves to a future position of \(\sigma \) (i.e., \(\sigma {[j\rangle } \) for some \(j \in \text {pos}(\sigma )\)). The duplicator answers by first making as many copies of elements in B as she wants, and then selects a future position for each of these copies”. Because she can make copies, the duplicator has a trivial optimal strategy: at each round, copy the structures in B as much as possible, choosing a different position in each copy. The rule for \(\textsf{F}\) the simplifies to “For each structure \(\sigma \in A\), the spoiler moves to a future position of \(\sigma \). The duplicator answers with \(B^{\textsf{G}}\), which corresponds to our rule Future.

While Adler and Immerman discuss the fact that the duplicator has a trivial optimal strategy, they do not restate the games with only one player (mainly to not lose the similarity with the Ehrenfeucht–Fraïssé games). Our work shows that removing the duplicator yields a natural one-player game based on building proofs within a proof system. We think that this proof-system view has a few merits over the games. When proving lower bounds, it reduces the clumsiness of discussing the various moves of the spoiler and the replies of the duplicator. The combinatorics is of course still there, but not the players, and this substantially simplifies the exposition. Second, the proof system resembles the way in which one reasons about the algorithmic problem of separating A from B. For instance, the algorithm presented in [21] uses decision trees for solving this problem. These decision trees, when they encode a formula from \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\), can be easily translated into proofs in our proof system. We discuss more this line of work connected with \(\textsf{LTL}\)  formulae learning and explainable planning in Section 8.

5 The exponential lower bound for \(\varPhi _n\)

In this section, we show that, for every \(n \ge 1\), all formulae of \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\) characterising the \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) formula \(\varPhi _n\) defined in Section 3 have size at least \(2^n\). According to the definition of \(\varPhi _n\), we consider a set of \(2n+2\) distinct atomic propositions \(\mathcal{A}\mathcal{P}{:}{=}\{\widetilde{p},\widetilde{q}\} \cup P \cup Q\), with \(P {:}{=}\{p_1,\dots ,p_n\}\) and \(Q {:}{=}\{q_1,\dots ,q_n\}\); and \(\varSigma {:}{=}2^{\mathcal{A}\mathcal{P}}\). Throughout the section, let \(\alpha (n) {:}{=}2^{n+1} (n+2)^2\), i.e., the upper bound given in Lemma 1 for one of these formulae.

Following Corollary 2, to prove the exponential lower bound we

  1. 1.

    define \(\textbf{A},\textbf{B}\subseteq \varSigma ^+\) such that \(\varPhi _n\) separates \(\textbf{A}\) from \(\textbf{B}\) (Section 5.1), and

  2. 2.

    prove that every deduction tree for \({\boldsymbol{\langle }}\textbf{A},\textbf{B}{\boldsymbol{\rangle }}\) has size at least \(2^n\) (Section 5.2).

5.1 Setting up the sets of traces \(\textbf{A}\) and \(\textbf{B}\)

We define the sets of types \(T_P {:}{=}\{ \tau \in \varSigma : \widetilde{p} \in \tau \text { and } \tau \subseteq P \cup \{\widetilde{p}\} \}\) and \(T_Q {:}{=}\{ \tau \in \varSigma : \widetilde{q} \in \tau \text { and } \tau \subseteq Q \cup \{\widetilde{q}\} \}\). Similarly to what done in the proof of Lemma 1, we write \(\overline{(\cdot )}\) for the involution on \(T_P \cup T_Q\) sending a type \(\tau \in T_Q\) into the (only) type \({\overline{\tau } \in T_P}\) with \(q_i \in \tau \) if and only if \(p_i \in \overline{\tau }\), for every \(i \in [1,n]\).

Throughout the section, we fix a (arbitrary) strict total order \(\prec \) on the elements of \(T_Q\). Then, we denote by \(\mathcal {E}\in (\varnothing ^{\alpha (n)} \cdot T_Q)^{2^n} \cdot \varnothing ^{\alpha (n)}\) the (only) finite word enumerating all elements in \(T_Q\), with respect to the order \(\prec \). Note that, in \(\mathcal {E}\), between any two subsequent elements of \(T_Q\) there are \(\alpha (n)\) positions of type \(\varnothing \). This “padding” added to the enumeration is required to handle the rules Next and WeakNext. Given \(\tau \in T_Q\), we write \(\mathcal {E}|_{-\tau }\) for the trace obtained from \(\mathcal {E}\) by eliminating the only position of type \(\tau \), together with the \(\alpha (n)\) positions of type \(\varnothing \) preceding it. So, \(\mathcal {E}_{-\tau }\) belongs to \({(\varnothing ^{\alpha (n)} \cdot T_Q)^{2^n-1} \cdot \varnothing ^{\alpha (n)}}\).

For instance, consider the case of \(n = 2\), so \(Q = \{q_1,q_2\}\) and \(\alpha (n) = 128\). Suppose \(\{\widetilde{q}\} \prec \{\widetilde{q},q_1\} \prec \{\widetilde{q},q_2\} \prec \{\widetilde{q},q_1,q_2\}\) to be the strict order on \(T_Q\). Then,

$$\begin{aligned} \mathcal {E}&= \varnothing ^{128} \cdot \{\widetilde{q}\} \cdot \varnothing ^{128} \cdot \{\widetilde{q},q_1\} \cdot \varnothing ^{128} \cdot \{\widetilde{q},q_2\} \cdot \varnothing ^{128} \cdot \{\widetilde{q},q_1,q_2\} \cdot \varnothing ^{128},\\ \mathcal {E}|_{-\{\widetilde{q},q_2\}} &= \varnothing ^{128} \cdot \{\widetilde{q}\} \cdot \varnothing ^{128} \cdot \{\widetilde{q},q_1\} \cdot \varnothing ^{128} \cdot \{\widetilde{q},q_1,q_2\} \cdot \varnothing ^{128}. \end{aligned}$$

For the rest of the paper, we denote with \(\textbf{A}\) and \(\textbf{B}\) the sets:

$$\begin{aligned} \textbf{A}{:}{=}\{ \varnothing ^j \cdot \overline{\tau } \cdot \mathcal {E}: j \in \mathbb {N}, \tau \in T_Q\}, \qquad \textbf{B}{:}{=}\{ \varnothing ^j \cdot \overline{\tau } \cdot (\mathcal {E}|_{-\tau }) : j \in \mathbb {N}, \tau \in T_Q \}. \end{aligned}$$

Lemma 3

The formula \(\varPhi _n\) separates \(\textbf{A}\) from \(\textbf{B}\).

Proof

Let \(j \in \mathbb {N}\) and \(\tau \in T_Q\). In a nutshell, the fact that \(\varnothing ^j \cdot \overline{\tau } \cdot \mathcal {E}\models \varPhi _n\) follows from the fact that \(\tau \) occurs in \(\mathcal {E}\), and from the position corresponding to \(\tau \) one can refer back to \(\overline{\tau }\) and find in this way a position satisfying \(p_i\) if and only if \(q_i \in \tau \), for every \(i \in [1,n]\). However, since \(\tau \) is removed from \(\mathcal {E}|_{-\tau }\), we see that \(b {:}{=}\varnothing ^j \cdot \overline{\tau } \cdot (\mathcal {E}|_{-\tau }) \not \models \varPhi _n\): indeed, \(b[j] = \overline{\tau }\) corresponds to the only position in b satisfying \(\widetilde{p}\), but \(\tau \) does not appear in b (since it does not appear in \(\mathcal {E}|_{-\tau }\)). Therefore, \(\textbf{A}\models \varPhi _n\) and .    \(\square \)

5.2 Separating \(\textbf{A}\) from \(\textbf{B}\) requires an exponential proof

We now show that every deduction tree for \({\boldsymbol{\langle }}\textbf{A},\textbf{B}{\boldsymbol{\rangle }}\) has size at least \(2^n\). To do so, we use a relation \(\approx \) that, roughly speaking, states what elements \((a,b) \in \textbf{A}^{\textsf{G}} \times \textbf{B}^{\textsf{G}}\) are similar enough to require a non-trivial proof in order to be separated using the proof system. Formally, for \(a,b \in \varSigma ^+\), we write \(a \approx b\) whenever:

$$\begin{aligned} a\text { and }b\text { are in the language }\varnothing ^u \cdot \rho \cdot \varnothing ^{\alpha (n)} \cdot \varSigma ^*\text {, for some }u \in \mathbb {N}\text { and }\rho \in T_Q \cup T_P. \end{aligned}$$

The central issue in the proof of the lower bound is counting how many of these pairs \(a \approx b\) are preserved when applying the rules of the proof system. This count is done inductively on the size of the deduction tree, and allows us to derive the following lemma.

Lemma 4

Let \(r_1,t_1,\dots ,r_m,t_m \in \mathbb {N}\) and let \({\tau _1,\dots ,\tau _m \in T_Q}\) be pairwise distinct sets. Consider \(A \subseteq \textbf{A}^{\textsf{G}}\), \(B {:}{=}\{(\varnothing ^{t_i} \cdot \overline{\tau _i} \cdot \mathcal {E}|_{-\tau _i}){[r_i\rangle } : i \in [1,m]\}\), and \(C {:}{=}\{ (a,b) \in A \times B : {a \approx b} \}\). Every deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) has size at least \(|C|+1\).

Proof

Below, suppose that \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) has a deduction tree (else the statement is trivially true). In particular, let \(\mathcal {T}\) be a minimal deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\), and assume it has size s. Note that the hypothesis that \(\tau _1,\dots ,\tau _m\) are distinct implies \(|B| \le 2^n\), which in turn implies \(|C| < 2^n\) (by definition of \(\approx \), for every \(b \in B\) there is at most one \(a \in \textbf{A}^{\textsf{G}}\) such that \(a \approx b\)). Then, w.l.o.g. we can assume \(s \le \alpha (n)\); otherwise the lemma follows trivially.

During the proof, we write \(\prec \) for the strict total order on elements of \(T_Q\) used to construct the trace \(\mathcal {E}\) enumerating \(T_Q\). Before continuing the proof of the lemma, we highlight a useful property of the elements of C.

Claim 1

Let \((a,b) \in C\) and \(i \in [1,m]\) with \(b = (\varnothing ^{t_i} \cdot \overline{\tau _i} \cdot \mathcal {E}|_{-\tau _i}){[r_i\rangle } \). Then, \(b = \varnothing ^u \cdot \rho \cdot \varnothing ^{\alpha (n)} \cdot \sigma \), for some \(u \in \mathbb {N}\), \(\rho \in \{ \overline{\tau _i} \} \cup \{\tau \in T_Q : \tau \prec \tau _i \}\) and \(\sigma \in \varSigma ^*\).

In a nutshell, this claim tells us that for every \((a,b) \in C\) we have \(b \not \sqsubseteq \mathcal {E}\).

Let us go back to the proof of Lemma 4. If \(A = \varnothing \) or \(m = 0\) then \(C = \varnothing \) and the lemma follows trivially. Below, let us assume \(A \ne \varnothing \) and \(m \ge 1\). We prove the statement by induction on the size s of \(\mathcal {T}\).

In the base case \(s = 1\), \(\mathcal {T}\) is a simple application of the rule Atomic. This means that for every \(a \in A\) and \(b \in B\) we have \(a[0] \ne b[0]\). By definition of \(\approx \), this implies \(C = \varnothing \), and therefore \(s \ge |C|+1\).

Let us then consider the induction step \(s \ge 2\). Note that if \(|C| \le 1\) then the statement follows trivially. Hence, below, we assume \(|C| \ge 2\). We split the proof depending on the rule applied to the root \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) of \(\mathcal {T}\). Since \(s \ge 2\), this rule cannot be Atomic. We omit the cases for Or and And, as they simply follow the induction hypothesis, and focus on the rules related to temporal operators.

\(\bullet \) case: rules Next and WeakNext. We consider Next and WeakNext together, as both require \({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}\). Perhaps surprisingly, this case is non-trivial. The main difficulty stems from the fact that \(C' {:}{=}\{ (a,b) \in A^{\textsf{X}} \times B^{\textsf{X}} : {a \approx b} \}\) might in principle even be empty, and thus applying the induction hypothesis on \({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}\) is unhelpful for concluding that \(s \ge |C|+1\). We now show how to circumvent this issue. The minimal deduction tree for \({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}\) has size \(s-1\). Within this deduction tree, consider the maximal partial deduction tree \(\mathcal {T}'\) rooted at \({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}\) and made solely of applications of the rules AndOrNext, and WeakNext. Let \({\boldsymbol{\langle }}A_1,B_1{\boldsymbol{\rangle }},\dots ,{\boldsymbol{\langle }}A_q,B_q{\boldsymbol{\rangle }}\) be the leafs of such a tree. Let \({j \in [1,q]}\). In the tree \(\mathcal {T}\), to \({\boldsymbol{\langle }}A_j,B_j{\boldsymbol{\rangle }}\) it is applied a rule among Atomic, Future and Globally. Let \({\xi _j \ge 1}\) be the number of Next and WeakNext rules used in the path of \(\mathcal {T}\) from \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) to \({\boldsymbol{\langle }}A_j,B_j{\boldsymbol{\rangle }}\). Note that, from \(s \le \alpha (n)\), we have \(\xi _j \le \alpha (n)\). We define the following two sets \(C_j\) and \(N_j\), whose role is essentially to “track” the evolution of pairs in C with respect to \(A_j \times B_j\):

$$\begin{aligned} C_j &{:}{=}\{(a{[\xi _j\rangle },b{[\xi _j\rangle }) \in A_j \times B_j : {(a,b) \in {C}},\, a{[\xi _j\rangle } \approx b{[\xi _j\rangle } \},\\ N_j &{:}{=}\{(a{[\xi _j\rangle },b{[\xi _j\rangle }) \in A_j \times B_j : {(a,b) \in C},\, a{[\xi _j\rangle } \not \approx b{[\xi _j\rangle } \}. \end{aligned}$$

The minimal deduction tree for \({\boldsymbol{\langle }}A_j,B_j{\boldsymbol{\rangle }}\) has size \({s_j \ge |C_j|+1}\); by induction hypothesis. Claims 2 to 4 below highlight a series of properties on the sets \(C_j\) and \(N_j\) from which we derive \(s \ge |C|+1\).

Claim 2

For every \(j \in [1,q]\), if \(C_j \cup N_j \ne \varnothing \) then the rule applied to \({\boldsymbol{\langle }}A_j,B_j{\boldsymbol{\rangle }}\) in \(\mathcal {T}\) is either Future or Globally.

As already said, the rule applied to \({\boldsymbol{\langle }}A_j,B_j{\boldsymbol{\rangle }}\) is among the rules Atomic, Future and Globally. Then, showing that \(a[0] = b[0]\) for every \((a,b) \in C_j \cup N_j\) excludes the rule Atomic.

Claim 3

For every \(j \in [1,q]\), \(|N_j| \le 1\).

The proof of this claim is by contradiction, assuming the existence of distinct \((a_1,b_1),(a_2,b_2) \in N_j\). In the proof, we analyse structural properties of the traces \(a_1\), \(a_2\), \(b_1\) and \(b_2\), and consider several cases depending on such properties (for instance, one case split depends on whether \(a_1 \sqsubseteq a_2\)). In all these cases, we reach a contradiction with either \((a_1,b_1) \ne (a_2,b_2)\) or Claim 2.

Claim 4

\(|C| \le \sum _{j=1}^q |C_j \cup N_j|\).

The claim follows as soon as one proves the following two statements:

  1. 1.

    for every \((a,b) \in C\) there is \(j \in [1,q]\) such that \((a{[\xi _j\rangle },b{[\xi _j\rangle }) \in C_j \cup N_j\),

  2. 2.

    for all distinct \((a_1,b_1),(a_2,b_2) \in C\), we have \((a_1{[\ell \rangle },b_1{[\ell \rangle }) \ne (a_2{[\ell \rangle },b_2{[\ell \rangle })\) for every \(\ell \le \alpha (n)\) (recall that \(\xi _j \le \alpha (n)\), for every \(j \in [1,q]\)).

Item 1 is by induction on the size of \(\mathcal {T}'\). Similarly to Claim 3, the proof of Item 2 again requires to consider many cases, and uses properties of \(\approx \), \(\mathcal {E}\) and \(\mathcal {E}|_{-\tau _i}\).

Thanks to Claims 3 and 4, one can then prove \(s \ge |C|+1\), concluding the proof for the rules Next and WeakNext:

$$\begin{aligned} s &\ge 1 + \textstyle \sum _{j=1}^q s_j &\text {by definition}~\text {of }\mathcal {T}\text { and }\mathcal {T}'\\ &\ge 1+\textstyle \sum _{j=1}^q (|C_j|+1) &\text {by }s_j \ge |C_j|+1\text { (induction hypothesis)}\\ &\ge 1+\textstyle \sum _{j=1}^q (|C_j \cup N_j|) &\text {by }|N_j| \le 1\text { (Claim 3)}\\ &\ge |C|+1 &\text {by }\textstyle |C| \le \textstyle \sum _{j=1}^q |C_j \cup N_j|\text { (Claim 4).} \end{aligned}$$

\(\bullet \) case: rule Future. Let \(f \in \mathrm{{F}}_A\) be the future point used when applying this rule. Define \(C' {:}{=}\{(a',b') \in A^f \times B^{\textsf{G}} : a' \approx b' \}\). The minimal deduction tree for \({\boldsymbol{\langle }}A^f,B^{\textsf{G}}{\boldsymbol{\rangle }}\) has size \(s-1\). By induction hypothesis, \(s-1 \ge |C'| + 1\), i.e., \(s \ge |C'| + 2\). We divide the proof into two cases.

Case 1: for every \(a' \in A^f\), \(a' \not \sqsubseteq \mathcal {E}\). By definition of \(\approx \), every \((a,b) \in C\) is such that a and b belong to the language \(\varnothing ^u \cdot \overline{\tau _i} \cdot \varnothing ^{\alpha (n)} \cdot \varSigma ^*\) for some \(u \in \mathbb {N}\), and \(i \in [1,m]\). Since \(a^f \not \sqsubseteq \mathcal {E}\), we must have \(f(a) \le u+1\). Then, \(a^f \approx b{[f(a)\rangle } \). Note that distinct \((a,b) \in C\) concern distinct \(\overline{\tau _i}\) with \(i \in [1,m]\), and therefore, together with \(b{[f(a)\rangle } \in B^G\), one concludes that \(|C'| \ge |C|\); and so \(s \ge |C|+2\).

Case 2: there is \(a' \in A^f\) such that \(a' \sqsubseteq \mathcal {E}\). Let us denote with \(\widetilde{a}\) the element in \(A^f\) such that \(\widetilde{a} \sqsubseteq a\) for every \(a \in A^f\). The existence of such an element follows directly from the fact that \(a' \sqsubseteq \mathcal {E}\) for some \(a' \in A^f\).

Let \(I \subseteq [1,m]\) be the subset of those indices \(i \in [1,m]\) for which there is a pair \((a',b') \in C\) such that \(b' = (\varnothing ^{t_i} \cdot \overline{\tau _i} \cdot \mathcal {E}|_{\tau _i}){[r_i\rangle } \). Without loss of generality, suppose \(I = \{1,\dots ,q\}\) for some \(q \le m\), and that \(\tau _{1} \prec \tau _{2} \prec \dots \prec \tau _{q}\); recall that all these types are pairwise distinct. By definition of \(\approx \), for every \(b' \in B\) there is at most one \(a' \in \textbf{A}^{\textsf{G}}\) such that \(a' \approx b'\), and therefore \(q = |C|\). To conclude the proof it suffices to show \(|C'| \ge q - 1\). We do so by establishing a series of claims. Recall that we are assuming \(|C|\ge 2\), so in particular C and I are non-empty.

Claim 5

There are \(u \in \mathbb {N}\), \(\rho \in T_Q\) and \(\sigma \in (2^{Q})^*\) s.t. \({\widetilde{a} = \varnothing ^{u} \cdot \rho \cdot \varnothing ^{\alpha (n)} \cdot \sigma }\). Moreover, \(\rho \preceq \tau _i\) for every \(i \in I\).

The first statement of this claim is established from the definition of \(\widetilde{a}\). The second statement is proven by contradiction. In particular, assuming that there is \(i \in I\) such that \(\tau _i \prec \rho \) yields a contradiction with Claim 1.

Below, we write \(u,\rho \) and \(\sigma \) for the objects appearing in Claim 5. Note that, from \(\tau _1 \prec \dots \prec \tau _q\), the second statement of Claim 5 implies \(\rho \prec \tau _2 \dots \prec \tau _q\). For \(i \in [2,q]\), let \((a_i',b_i')\) denote the pair in C such that \(b_i' = (\varnothing ^{t_i} \cdot \overline{\tau _i} \cdot \mathcal {E}|_{\rho _i}){[r_i\rangle } \).

Claim 6

For each \(i \in [2,q]\) there is \(\ell \in \mathbb {N}\) such that \(\widetilde{a} \approx b_i''\) with \(b_i'' {:}{=}b_i'{[\ell \rangle } \). Moreover, every type in \(\{\tau _2, \dots , \tau _q\} \setminus \{\tau _i\}\) appears in some position of \(b_i''\).

This claim is proven using Claims 1 and 5 and properties of \(\mathcal {E}|_{-\tau _i}\).

Since all types \(\tau _2, \dots , \tau _q\) are pairwise distinct, from the second statement in Claim 6 we conclude that \(b_i'' \ne b_j''\) for every two distinct \(i,j \in I \setminus \{i_1\}\). Then, the first statement in Claim 6 entails \(|C'| \ge q-1\).

\(\bullet \) case: rule Globally. Let \(f \in \mathrm{{F}}_A\) be the future point used when applying this rule. The minimal deduction tree for \({\boldsymbol{\langle }}A^{\textsf{G}},B^f{\boldsymbol{\rangle }}\) has size \(s-1\). We define \({C' {:}{=}\{(a',b') \in A^{\textsf{G}} \times B^{f} : a' \approx b' \}}\). By induction hypothesis, \(s-1 \ge |C'| + 1\), i.e., \(s \ge |C'| + 2\). To conclude the proof it suffices to show that \(|C'| \ge |C|-1\) (in fact, we prove \(|C'| \ge |C|\)). Let \(\{(a_1,b_1),\dots ,(a_{|C|},b_{|C|})\} = C\).

Claim 7

For every \(j \in [1,|C|]\), \(b_j^f\) is not a suffix of \(\mathcal {E}\). More precisely, given \(i \in [1,m]\) such that \(b_j = (\varnothing ^{t_i} \cdot \overline{\tau _i} \cdot \mathcal {E}|_{-\tau _i}){[r_i\rangle } \), we have \(b_j^f = \varnothing ^u \cdot \rho \cdot \varnothing ^{\alpha (n)} \cdot \sigma \), for some \(u \in \mathbb {N}\), \(\rho \in \{ \overline{\tau _i} \} \cup \{\tau \in T_P : \tau \prec \tau _i \}\) and \(\sigma \in \varSigma ^*\).

See the similarities between this claim and Claim 1. The first statement is proven by contradiction, deriving an absurdum with the existence of \(\mathcal {T}\). The second statement follows from the definition of \(\mathcal {E}|_{-\tau _i}\).

Starting from Claim 7, we conclude that (i) for every \(j \ne k \in [1,|C|]\), \(b_j^f \ne b_k^f\), and (ii) for every \(j \in [1,|C|]\) there is \(\ell \in \mathbb {N}\) such that \(a_j{[\ell \rangle } \approx b_j^f\). This directly implies \(|C'| \ge |C|\). This concludes both the proof of the case Globally and the proof of the lemma.    \(\square \)

Together, Lemmas 3 and 4 yield an exponential lower bound for all formulae of \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}]\) characterising \(\varPhi _n\) (Lemma 5), which in turn implies Theorem 1.

Lemma 5

Let \(\varPsi _n \in \textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \). If \(\varPsi _n \equiv \varPhi _n\) then \(\mathrm{{size}}(\varPsi _n) \ge 2^n\).

Proof

We define the sets \(A = \{ \overline{\tau } \cdot \mathcal {E}: \tau \in T_Q\}\) and \(B = \{ \overline{\tau } \cdot \mathcal {E}_{-\tau } : \tau \in T_Q \}\). Observe that \(A \subseteq \textbf{A}\subseteq \textbf{A}^{\textsf{G}}\) and \(B \subseteq \textbf{B}\). Let \(C = \{(a,b) \in A \times B : a \approx b \}\). From the definition of \(\approx \), \(|C| = 2^n\). We apply Lemma 4, and conclude that the minimal deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) has size at least \(2^n\) (in fact, \(2^n+1\)). Since \(A \subseteq \textbf{A}\) and \(B \subseteq \textbf{B}\), the same holds for the minimal deduction tree for \({\boldsymbol{\langle }}\textbf{A},\textbf{B}{\boldsymbol{\rangle }}\). Then, the theorem follows from Corollary 2 and Lemma 3.    \(\square \)

While we do not prove it formally, we claim that Theorem 1 also holds for infinite traces. It is in fact quite simple to see this: all traces in \(\textbf{A}\) and \(\textbf{B}\), have a suffix of the form \(\varnothing ^{\alpha (n)}\). Roughly speaking, these suitably long suffixes are added to make the far-end of the traces in \(\textbf{A}\) and \(\textbf{B}\) indistinguishable at the level of formulae, so that they cannot be used in deduction trees to separate \(\textbf{A}\) from \(\textbf{B}\). Then, to prove Theorem 1 for infinite traces, it suffices to update the proof system to handle these structures and consider an infinite suffix \(\varnothing ^{\omega }\) instead. The proof of Lemma 4 goes through with no significant change.

A second observation: traces in \(\textbf{A}\) and \(\textbf{B}\) are closed under taking arbitrary long prefixes of the form \(\varnothing ^j\). This feature is not used to prove Lemma 5 (see the definition of A and B in the proof). However, these prefixes play a role in the next section, when studying the succinctness of \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\) on infinite traces.

6 Theorem 2: a \(2^{n}\) lower bound for \(\textsf{LTL} [\textsf{F}]\) pastification

In this section, we rely on Lemma 5 to prove Theorem 2 and Corollary 1.

Theorem 2 is proven by relying on a “future–past duality” between future and past fragments of \(\textsf{LTL}\). Given a trace \(\sigma \in \varSigma ^+\) we define the reverse of \(\sigma \), written \(\sigma ^{-}\), as the trace satisfying \(\sigma ^{-}[i] = \sigma [|\sigma |-i]\) for every \(i \in \text {pos}(\sigma )\). The reverse of a language \(\mathcal {L}\subseteq \varSigma ^+\) is defined as the language \(\mathcal {L}^{-} {:}{=}\{ \sigma ^{-} : \sigma \in \mathcal {L}\}\). Clearly, \((\mathcal {L}^-)^- = \mathcal {L}\). Given a set of temporal operators \(S \subseteq \{\textsf{X},\widetilde{\textsf{X}},\textsf{F},\textsf{G}\}\), we write \(S^-\) for the set of temporal operators among \(\{\textsf{Y}, \widetilde{\textsf{Y}}, \textsf{O}, \textsf{H}\}\) such that \(S^-\) contains \(\textsf{Y}\) (resp. \(\widetilde{\textsf{Y}}\); \(\textsf{O}\); \(\textsf{H}\)) if and only if S contains \(\textsf{X}\) (resp. \(\widetilde{\textsf{X}}\); \(\textsf{F}\); \(\textsf{G}\)). For finite traces, the following lemma, proves that if there is a family of languages \((\mathcal {L}_n)_{n\ge 1}\) that can be compactly defined in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) but explodes in \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \), then the family \((\mathcal {L}_n^{-})_{n\ge 1}\) can be compactly defined in \(\textsf{LTL} [\textsf{F}] \) but explodes in \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\).

Lemma 6

Let \(\mathcal {L}\subseteq \varSigma ^+\), \(S \subseteq \{\textsf{X},\widetilde{\textsf{X}},\textsf{F},\textsf{G}\}\), and \(\varphi \) be a formula in \(\textsf{F}(\textsf{LTL} [S^-])\). There is a formula \(\psi \) in \(\textsf{F}(\textsf{LTL} [S])\) such that \(\mathcal {L}(\psi ) = \mathcal {L}(\varphi )^-\) and \({\mathrm{{size}}(\psi ) = \mathrm{{size}}(\varphi )}\).

Theorem 2 follows by applying Lemma 6 on the family of formulae \((\varPhi _n)_{n \ge 1}\) defined in Section 3, and by relying on the exponential lower bounds of Lemma 5. The sequence of languages showing that \(\textsf{LTL} [\textsf{F}]\) can be exponentially more succinct than \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\) is given by \((\mathcal {L}(\varPhi _n)^-)_{n \ge 1}\).

Next, we extend Theorem 2 to the case of infinite traces. As usual, let \(\varSigma ^\omega \) be the set of all infinite traces over the finite alphabet \(\varSigma \). We denote with \(\mathcal {L}^\omega (\varphi )\) the language of \(\varphi \), when \(\varphi \) is interpreted over infinite traces (we refer the reader to, e.g., [2] for the semantics of \(\textsf{LTL} \) on infinite traces).

Lemma 7

The family of languages of infinite traces \((\mathcal {L}(\varPhi _n)^{-} \cdot \varSigma ^\omega )_{n \ge 1}\) is such that, for every \(n\ge 1\), (i) there is a formula \(\varphi \) of \(\textsf{LTL} [\textsf{F}] \) such that \(\mathrm{{size}}(\varphi ) \in \mathcal {O}(n)\) and \(\mathcal {L}^\omega (\varphi ) = \mathcal {L}(\varPhi _n)^{-}\cdot \varSigma ^\omega \), and (ii) for every formula \(\psi \) in \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\), if \(\mathcal {L}^\omega (\psi ) = \mathcal {L}(\varPhi _n)^{-}\cdot \varSigma ^\omega \) then \(\mathrm{{size}}(\psi ) \ge 2^{n}\).

Item (i) in the lemma above follows by applying Lemma 6 and exploiting the fact that formulae \(\varphi \) in \(\textsf{LTL} [\textsf{F}] \) satisfy \(\mathcal {L}^{\omega }(\varphi ) = \mathcal {L}(\varphi ) \cdot \varSigma ^\omega \) and \({\mathcal {L}(\varphi ) = \mathcal {L}(\varphi ) \cdot \varSigma ^*}\) (cf. [2, Definition 5 and Lemma 5]). The proof of Item (ii) is instead quite subtle. One would like to use the hypothesis \(\mathcal {L}^\omega (\psi ) = \mathcal {L}(\varPhi _n)^{-}\cdot \varSigma ^\omega \) and that \(\mathcal {L}(\psi )\) is a cosafety language to derive \(\mathcal {L}(\psi ) = \mathcal {L}(\varPhi _n)^-\). However, note that nothing prevents \(\mathcal {L}(\psi )\) to be equal to \(\mathcal {L}(\varPhi _n)^- \cdot \varSigma \), and as it stands we do not have bounds for characterising this language. We apply instead the following strategy. We consider the family of structures \(A' {:}{=}\{ a^- \cdot \varnothing ^\omega : a \in \textbf{A}\}\) and \(B' {:}{=}\{ b^- \cdot \varnothing ^\omega : b \in \textbf{B}\}\). Note that \(A' \subseteq \mathcal {L}^{\omega }(\psi )\) and \(B' \cap \mathcal {L}^{\omega }(\psi ) = \varnothing \). Since \(\psi \) is of the form \(\textsf{F}(\alpha )\) with \(\alpha \in \textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}] \), we can, roughly speaking, study the effects of applying to \(A'\) and \(B'\) a variant of the rule Future for infinite words and that does not “forget the past”, and then reverse all traces in the resulting sets \((A')^f\) and \((B')^{\textsf{G}}\). In this way, we obtain two sets of finite traces \(\widetilde{A} \subseteq \textbf{A}\) and \(\widetilde{B} \subseteq \textbf{B}\) (this is where the prefixes \(\varnothing ^j\) discussed at the end of Section 5 play a role). We show that the hypotheses of Lemma 4 apply to \(\widetilde{A}\) and a set \(\widehat{B} \subseteq \widetilde{B}\) for which the set \(\{(a,b) \in \widetilde{A} \times \widehat{B} : a \approx b \}\) has size at least \(2^n-1\). By Corollary 2, we get that \(\alpha \), and thus \(\psi \), is of size at least \(2^n\).

Lemma 7 shows that Theorem 2 holds over infinite traces as well. Together with the \(2^{\mathcal {O}(n)}\) upper bound for the pastification problem for \(\textsf{LTL} [\textsf{X},\textsf{F}] \) into \(\textsf{F}(\textsf{LTL} [\mathsf {Y,\widetilde{\textsf{Y}},O}])\) establishedFootnote 4 in [4], this entails Corollary 1.

7 The automata method does not work for \(\textsf{F}(\textsf{LTL} [\textsf{O}])\)

In this section we show that the classical method introduced by Markey in [20] to prove “future against past” succinctness discrepancies in fragments of \(\textsf{LTL}\) cannot be used to prove the results in Section 5, namely that \(\textsf{F}(\textsf{LTL} [\textsf{O}])\) can be exponentially more succinct than \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \). Due to space constraints, we assume a basic familiarity with non-deterministic Büchi automata (\(\textsf{NBA}\) s) (and deterministic Büchi automata, \(\textsf{DBA}\) s), which are central tools in [20]. Moreover, we work on \(\textsf{LTL}\) on infinite traces, as done in [20], and write \(\varphi \equiv _{\omega } \psi \) whenever \(\mathcal {L}^{\omega }(\varphi ) = \mathcal {L}^{\omega }(\psi )\). We write \(\mathcal {L}^{\omega }(A)\) for the language of an \(\textsf{NBA}\) A.

Proposition 1 below summarises the method in [20], which is parametric on a fixed prefix \(\varPi \) of operators among \(\textsf{X}\), \(\textsf{F}\) and \(\textsf{G}\). Given two fragments \(F,F'\) of \(\textsf{LTL} \), with \(F'\) pure future, to apply the method one has to provide the two families of formulae \({(\varPhi _n)_{n \ge 1} \!\in \! F}\) and \({(\varPhi _n')_{n \ge 1} \!\in \! F'}\), as well as the family of minimal \(\textsf{NBA}\) s \({(A_n)_{n \ge 1}}\), satisfying the hypotheses of Proposition 1. In [20], this is done for \({F = \textsf{LTL}}\) and \(F'\) set as the pure future fragment of \(\textsf{LTL}\), using the prefix \(\varPi = \textsf{G}\).

Proposition 1

(Markey’s method [20]). Let \(F,F'\) be fragments of \(\textsf{LTL} \), with \(F'\) pure future. Consider two families of formulae \({(\varPhi _n)_{n \ge 1} \!\in \! F}\), \({(\varPhi _n')_{n \ge 1} \!\in \! F'}\), and a family of minimal \(\textsf{NBA}\) s \({(A_n)_{n \ge 1}}\), such that

$$\begin{aligned} \mathrm{{size}}(A_n) \in 2^{2^{\varOmega (n)}}\!\!, \quad \mathrm{{size}}(\varPhi _n) \in \text {poly}(n), \quad \varPhi _n \equiv _{\omega } \varPhi _n', \quad \mathcal {L}^{\omega }(\varPi (\varPhi _n')) = \mathcal {L}^{\omega }(A_n). \, \end{aligned}$$

Then, \(\mathrm{{size}}(\varPhi _n') \in 2^{\mathrm{{size}}(\varPhi _n)^{\varOmega (1)}}\) and F can be exponentially more succinct than \(F'\).

The consequence \(\mathrm{{size}}(\varPhi _n') \in 2^{\mathrm{{size}}(\varPhi _n)^{\varOmega (1)}}\) obtained in Proposition 1 follows directly from the fact that, from every pure future \(\textsf{LTL}\) formula \(\varphi \), one can build an \(\textsf{NBA}\) A of size \(2^{\mathcal {O}(\mathrm{{size}}(\varphi ))}\) such that \(\mathcal {L}^{\omega }(A) = \mathcal {L}^{\omega }(\varphi )\) [26]. Then, the hypotheses \({\mathrm{{size}}(A_n) \in 2^{2^{\varOmega (n)}}}\) and \(\mathcal {L}^{\omega }(\varPi (\varPhi _n')) = \mathcal {L}^{\omega }(A_n)\) imply \(\mathrm{{size}}(\varPhi _n') \in 2^{\varOmega (n)}\).

To show that Proposition 1 cannot be used to derive that \(F {:}{=}\textsf{F}(\textsf{LTL} [\textsf{O}])\) can be exponentially more succinct than \(F' {:}{=}\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \), it suffices to show that no families \({(\varPhi _n)_{n \ge 1} \!\in \! F}\), \({(\varPhi _n')_{n \ge 1} \!\in \! F'}\) and \({(A_n)_{n \ge 1}}\) achieve the hypotheses required by Proposition 1, no matter the temporal prefix \(\varPi \). We do so by establishing that whenever \(\mathrm{{size}}(\varPhi _n) \in \text {poly}(n)\) and \(\varPhi _n \equiv _{\omega } \varPhi _n'\), the minimal deterministic Büchi automaton for \(\mathcal {L}^{\omega }(\varPi (\varPhi _n'))\) has size in \(2^{\mathcal {O}(\text {poly}(n))}\). Therefore, no family of minimal \(\textsf{NBA}\) s \((A_n)_{n \ge 1}\) such that \(\mathrm{{size}}(A_n) \in 2^{2^{\varOmega (n)}}\) can also satisfy the hypothesis \(\mathcal {L}^{\omega }(\varPi (\varPhi _n')) = \mathcal {L}^{\omega }(A_n)\). Here is the formal statement:

Theorem 4

Let \(\varPi \) be a prefix of k temporal operators among \(\textsf{X}\), \(\textsf{F}\) and \(\textsf{G}\). Let \(\varphi \) be a formula of \(\textsf{F}(\textsf{LTL} [\textsf{O}])\), and \(\psi \) be a formula of \(\textsf{LTL} [\textsf{X},\mathsf {\widetilde{\textsf{X}}},\textsf{F},\textsf{G}] \), with \(\varphi \equiv _{\omega } \psi \). The minimal \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\varPi (\psi ))\) is of size in \((k+1) \cdot 2^{\mathcal {O}(\mathrm{{size}}(\varphi ))}\).

The proof of this theorem is divided into three steps.

As a first step, one shows that \(\psi \equiv _{\omega } \textsf{F}\psi \); which essentially follows from the fact that \(\psi \equiv _{\omega } \varphi \) with \(\varphi \in \textsf{F}(\textsf{LTL} [\textsf{O}])\). Together with tautologies of \(\textsf{LTL} \) such as \({\textsf{F}\textsf{G}\textsf{F}\psi ' \equiv _{\omega } \textsf{G}\textsf{F}\psi '}\), \(\textsf{F}\textsf{X}\psi ' \equiv _{\omega } \textsf{X}\textsf{F}\psi '\) and \(\textsf{G}\textsf{X}\psi ' \equiv _{\omega } \textsf{X}\textsf{G}\psi '\), the equivalence \(\psi \equiv _{\omega } \textsf{F}\psi \) let us rearrange \(\varPi \) into a prefix of the form either \(\textsf{X}^{j} \textsf{G}\textsf{F}\) or \(\textsf{X}^j \textsf{F}\), for some \(j \le k\). Let us focus on the former (more challenging) case of \(\varPi = \textsf{X}^{j} \textsf{G}\textsf{F}\).

The second step required for the proof is to bound the size of the minimal \(\textsf{DBA}\)  A recognising \(\mathcal {L}^{\omega }(\textsf{F}\psi )\). Thanks to the equivalences \(\varphi \equiv _{\omega } \psi \equiv _{\omega } \textsf{F}\psi \), such a \(\textsf{DBA}\) has size exponential in \(\mathrm{{size}}(\varphi )\) by the following lemma.

Lemma 8

Let \(\varphi \) in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\). There is a \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\varphi )\) of size \(2^{\mathcal {O}(\mathrm{{size}}(\varphi ))}\).

Starting from A, the third and last step of the proof requires constructing a \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\textsf{X}^j \textsf{G}\textsf{F}\psi )\) of size in \((j+1) \cdot 2^{\mathcal {O}(\mathrm{{size}}(\varphi ))}\). The treatment for the prefix \(\textsf{X}^j\) is simple, so this step is mostly dedicated to constructing a \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\textsf{G}\textsf{F}\psi )\). In the case of \(\textsf{LTL} \), it is known that closing an \(\textsf{NBA}\) under the globally operator \(\textsf{G}\) might lead to a further exponential blow-up (in fact, this is one of the reasons Markey’s method is possible in the first place). However, because \(\varphi \) is in \(\textsf{F}(\textsf{LTL} [\textsf{O}])\), and it is thus a cosafety language (and so \(\psi \) is a cosafety language too), it turns out that the size of the minimal \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\textsf{G}\textsf{F}\psi )\) is in \(\mathcal {O}(\mathrm{{size}}(A))\).

Lemma 9

Let \(\psi \) be in \(\textsf{LTL} \), such that \(\mathcal {L}^{\omega }(\psi )\) is a cosafety language. Let A be a \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\textsf{F}\psi )\). The minimal \(\textsf{DBA}\) for \(\mathcal {L}^{\omega }(\textsf{G}\textsf{F}\psi )\) has size in \(\mathcal {O}(\mathrm{{size}}(A))\).

Thanks to Lemma 9, the proof of Theorem 4 can be easily completed. To prove this lemma, one modifies A by redirecting all transitions exiting a final state so that they mimic the transitions exiting the initial state of the automaton. The reason why the obtained automaton recognises \(\mathcal {L}^{\omega }(\textsf{G}\textsf{F}\psi )\) uses in a crucial way the fact that \(\psi \) and \(\textsf{F}\psi \) are cosafety languages.

8 Related and Future Work

The proof systems we use in this work to establish Theorem 2 and Theorem 1 are strongly related to recent work in two seemingly disconnected areas of computer science: (i) combinatorial games for formulae lower bounds of first-order logics and (ii) learning of \(\textsf{LTL}\) formulae in explainable planning and program synthesis.

Combinatorial games. We have already discussed the connections between our proof system and the \(\mathsf {CTL^+}\) games by Adler and Immerman [1]. Recently, Fagin and coauthors [9, 10] have looked at combinatorial games that allow to count the number of quantifiers required to express a property in first-order logic. While these games simplify Adler–Immerman games, they come with a drawback: by design, they implicitly look at how to express properties with first-order formulae in prenex normal form, and they are not able to give any bound on the quantifier-free part of such formulae. It seems then not possible to use these types of games in the context of \(\textsf{LTL}\). One could in principle consider translations of \(\textsf{LTL}\) formulas into a prenex fragment of \(\textsf{S1S}\). However, since \(\textsf{S1S}\) is both more expressive and more succinct than \(\textsf{LTL}\)  [25], concluding that \(\textsf{LTL} [\textsf{F}]\) can be exponentially more succinct than \(\textsf{F}(\textsf{LTL} [\textsf{Y},\mathsf {\widetilde{\textsf{Y}}},\textsf{O},\textsf{H}])\) will ultimately require analysing structural properties of the quantifier-free part of the \(\textsf{S1S}\) formulae.

Closer to the case of \(\textsf{LTL}\) are the games on linear orders (implicitly) used by Grohe and Schweikartdt in [14]. These are formula-size games for a fixed number of variables of first-order logic. Using our notation, the method used to derive lower bounds in [14] relies on defining a function \(\omega \) from terms of the form \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\) to \(\mathbb {N}\) that acts as a sub-additive measure with respect to the rules of the proof system. For instance, according to the rule Or, \(\omega \) should satisfy \({\omega ({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}) \le \omega ({\boldsymbol{\langle }}A_1,B{\boldsymbol{\rangle }}) + \omega ({\boldsymbol{\langle }}A_2,B{\boldsymbol{\rangle }})}\), whenever \(A = A_1 \uplus A_2\). One can use a sub-additive measure \(\omega \) to conclude that the minimal deduction tree for \({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}\), if it exists, has size at least \(\omega ({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }})\). When restricted to the objects in Lemma 4, one can show that the function \(\omega ({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }}) {:}{=}|\{ (a,b) \in A \times B : {a \approx b} \}|+1\) is a sub-additive measure with respect to the rules Atomic, Or, And, Future and Globally (this is implicit in the proof of Lemma 4). However, it is not a sub-additive measure for the rules Next and WeakNext: as stressed in the proof, we might have \(\omega ({\boldsymbol{\langle }}A^{\textsf{X}},B^{\textsf{X}}{\boldsymbol{\rangle }}) = 1\) even for \(\omega ({\boldsymbol{\langle }}A,B{\boldsymbol{\rangle }})\) arbitrary large. This partially explains why the proof of Lemma 4 turned out to be quite involved.

In view of the optimality of the algorithm in [4] (Corollary 1), the main open problem regarding pastification is the optimality of the triply-exponential time procedure given in [7] for the pastification of \(\textsf{LTL} [\textsf{X},\textsf{U}]\) into \(\textsf{F}(\textsf{pLTL})\). As far as we are aware, no super-polynomial lower bound for this problem is known. Our proof system, properly extended with rules for the until and release operators, might be able to tackle this issue. Obtaining a matching triply-exponential lower bound might however be impossible: when restricted to propositional logic, our proof system is equivalent to the communication games introduced by Karchmer and Wigderson [15]. It is well-known that these games cannot prove super-quadratic lower bounds for formulae sizes, and one should expect similar limitations for temporal logics, albeit with respect to some function that is at least exponential.

\(\textsf{LTL}\) formulae learning. Motivated by the practical issue of understanding a complex system starting from its execution traces, several recent works study the algorithmic problem of finding an \(\textsf{LTL}\) formula separating two finite sets of traces, see e.g. [5, 11, 12, 21, 24]. In light of Theorem 3, this problem is equivalent to finding a proof in (possibly variations of) our combinatorial proof system. We believe that this simple observation will turn out to be quite fruitful for both the “combinatorial games” and the “formula learning” communities. From our experience, tools such as the one developed in [5, 21, 24] are quite useful when studying combinatorial lower bounds, as they can be used to empirically test whether families of structures are difficult to separate, before attempting a formal proof. In our case, we have used the tool in [21] while searching for the structures and formulae we ended up using in Section 5, and discarded several other candidates thanks to the evidences the tool gave us. On the other side of the coin, combinatorial proof systems can be seen as a common foundational framework for all these formulae-learning procedures. With this in mind, we believe that the techniques developed for proving lower bounds in works such as [14] might be of help for improving these procedures, for example using the minimization of a sub-additive measure as a heuristic.