1 Introduction

Algebraic datatypes such as lists and trees are extremely common in many programming languages. Reasoning about them is therefore crucial for modeling and verifying programs. For this reason, various decision procedures for algebraic datatypes have been, and continue to be developed and employed by formal reasoning tools such as theorem provers and Satisfiability Modulo Theories (SMT) solvers. For example, the general algorithm of [4] describes a decision procedure for datatypes suitable for SMT solvers. Consistently with the SMT paradigm, [4] leaves the combination of datatypes with other theories to general combination methods, and focuses on parametric datatypes (or generic datatypes as they are called in the programming languages community).

The traditional combination method of Nelson and Oppen [20] is applicable for the combination of this theory with many other theories, as long as the other theory is stably infinite (a technical condition that intuitively amounts to the ability to extend every model to an infinite one). Some theories of interest, however, are not stably infinite, the most notable one being the theory of fixed-width bit-vectors, which is commonly used for modeling and verifying both hardware and software. To be able to perform combinations with such theories, a more general combination method was designed [21], which relies on polite theories. Roughly speaking, a theory is polite if: (i) every model can be arbitrarily enlarged; and (ii) there is a witness, a function that transforms any quantifier-free formula to an equivalent quantifier-free formula such that if the original formula is satisfiable, the new formula is satisfiable in a “minimal” interpretation. This notion was later strengthened to strongly polite theories [14], which also account for possible arrangements of the variables in the formula. Strongly polite theories can be combined with any other disjoint decidable theory, even if that other theory is not stably infinite. While strong politeness was already proven for several useful theories (such as equality, arrays, sets, multisets [21]), strong politeness of algebraic datatypes remained an unanswered question.

The main contribution of this paper is an affirmative answer to this question. We introduce a witness function that essentially “guesses” the right constructors of variables without an explicit constructor in the formula. We show how to “shrink” any model of a formula that is the output of this function into a minimal model. The witness function, as well as the model-construction, can be used by any SMT solver for the theory of datatypes that implements polite theory combination. We introduce and use the notion of additive witnesses, which allows us to prove politeness and conclude strong politeness. We further study the theory of datatypes beyond politeness and extend a decision procedure for a subset of this theory presented in [9] to support the full theory.

Related Work

The theory investigated in this paper is that of algebraic datatypes, as defined by the SMT-LIB 2 standard [3]. Detailed information on this theory, including a decision procedure and related work, can be found in [4]. Later work extends this procedure to handle shared selectors [23] and co-datatypes [22]. More recent approaches for solving formulas about datatypes use, e.g., theorem provers [15], variant satisfiability [12, 19], and reduction-based decision procedures [1, 6, 13].

In this paper, we focus on polite theory combination. Other combination methods for non stably infinite theories include shiny theories [27], gentle theories [11], and parametric theories [17]. The politeness property was introduced in [21], and extends the stable infiniteness assumption initially used by Nelson and Oppen. Polite theories can be combined à la Nelson-Oppen with any arbitrary decidable theory. Later, a flaw in the original definition of politeness was found [14], and a corrected definition (here called strong politeness) was introduced. Strongly polite theories were further studied in [8], where the authors proved their equivalence with shiny theories.

More recently, it was proved [9] that a general family of datatype theories extended with bridging functions is strongly polite. This includes the theories of lists/trees with length/size functions. The authors also proved that a class of axiomatizations of datatypes is strongly polite. In contrast, in this paper we focus on standard interpretations, as defined by the SMT-LIB 2 standard, without any size function, but including selectors and testers. One can notice that the theory of standard lists without the length function, and more generally the theory of finite trees without the size function, were not mentioned as polite in a recent survey [7]. Actually, it was unclear to the authors of [7] whether these theories are strongly polite. This is now clarified in the current paper.

Outline

The paper is organized as follows. Section 2 provides the necessary notions from first-order logic and polite theories, and it introduces our working definition of the theory of datatypes, which is based on SMT-LIB 2. Section 3 discusses the difference between politeness and strong politeness, and introduces a useful condition for their equivalence. Section 4 contains the main result of this paper, namely that the theory of algebraic datatypes is strongly polite. Section 5 studies various axiomatizations of the theory of datatypes, and relates them to politeness. Section 6 concludes with directions for further research.

2 Preliminaries

2.1 Signatures and Structures

We briefly review usual definitions of many-sorted first-order logic with equality (see [10, 26] for more details). For any set S, an S -sorted set A is a function from S to \({\mathcal{P}({X})}\setminus \left\{ \emptyset \right\} \) for some set X (i.e., A assigns a non-empty set to every element of S), such that \(A(s)\cap A(s')=\emptyset \) whenever \(s\ne s'\). We use \(A_{s}\) to denote A(s) for every \(s\in S\), and call the elements of S sorts. When there is no ambiguity, we sometimes treat sorted sets as sets (e.g., when writing expressions like \(x\in A\)). Given a set S (of sorts), the canonical S -sorted set, denoted \([[{S}]]\), satisfies \([[{S}]]_{s}=\left\{ s\right\} \) for every \(s\in S\). A many-sorted signature \(\varSigma \) consists of a set \(\mathcal{S}_{\varSigma }\) (of sorts), a set \(\mathcal{F}_{\varSigma }\) of function symbols, and a set \(\mathcal{P}_{\varSigma }\) of predicate symbols. Function symbols have arities of the form \(\sigma _{1}\times \ldots \times \sigma _{n}\rightarrow \sigma \), and predicate symbols have arities of the form \(\sigma _{1}\times \ldots \times \sigma _{n}\), with \(\sigma _{1},\dots ,\sigma _{n},\sigma \in \mathcal{S}_{\varSigma }\). For each sort \(\sigma \in \mathcal{S}_{\varSigma }\), \(\mathcal{P}_{\varSigma }\) includes an equality symbol \(=_{\sigma }\) of arity \(\sigma \times \sigma \). We denote it by \(=\) when \(\sigma \) is clear from context. \(\varSigma \) is called finite if \(\mathcal{S}_{\varSigma }\), \(\mathcal{F}_{\varSigma }\), and \(\mathcal{P}_{\varSigma }\) are finite.

We assume an underlying \(\mathcal{S}_{\varSigma }\)-sorted set of variables. Terms, formulas, and literals are defined in the usual way. For a \(\varSigma \)-formula \(\phi \) and a sort \(\sigma \), we denote the set of free variables in \(\phi \) of sort \(\sigma \) by \({ vars}_{\sigma }({\phi })\). This notation naturally extends to \({ vars}_{S}({\phi })\) when S is a set of sorts. A sentence is a formula without free variables. We denote by \({ QF}(\varSigma )\) the set of quantifier-free formulas of \(\varSigma \). A \(\varSigma \)-literal is called flat if it has one of the following forms: \(x=y\), \(x\ne y\), \(x=f(x_{1},\dots ,x_{n})\), \(P(x_{1},\dots ,x_{n})\), or \(\lnot P(x_{1},\dots ,x_{n})\) for some variables \(x,y,x_{1},\dots ,x_{n}\) and function and predicate symbols f and P from \(\varSigma \).

A \(\varSigma \) -structure is a many-sorted structure for \(\varSigma \), without interpretation of variables. It consists of a \(\mathcal{S}_{\varSigma }\)-sorted set A, and interpretations to the function and predicate symbols of \(\varSigma \). We further require that \(=_{\sigma }\) is interpreted as the identity relation over \(A_\sigma \) for every \(\sigma \in \mathcal{S}_{\varSigma }\). A \(\varSigma \) -interpretation \(\mathcal{A}\) is an extension of a \(\varSigma \)-structure with interpretations to some set of variables. For any \(\varSigma \)-term \(\alpha \), \(\alpha ^{\mathcal{A}}\) denotes the interpretation of \(\alpha \) in \(\mathcal{A}\). When \(\alpha \) is a set of \(\varSigma \)-terms, \(\alpha ^{\mathcal{A}}=\left\{ x^{\mathcal{A}}\mid x\in \alpha \right\} \). Similarly, \(\sigma ^{\mathcal{A}}\), \(f^{\mathcal{A}}\) and \(P^{\mathcal{A}}\) denote the interpretation of \(\sigma \), f and P in \(\mathcal{A}\). Satisfaction is defined as usual. denotes that \(\mathcal{A}\) satisfies \(\varphi \).

A \(\varSigma \)-theory T is a class of \(\varSigma \)-structures. A \(\varSigma \)-interpretation whose variable-free part is in T is called a T-interpretation. A \(\varSigma \)-formula \(\phi \) is T-satisfiable if for some T-interpretation \(\mathcal{A}\). Two formulas \(\phi \) and \(\psi \) are T -equivalent if they are satisfied by the same class of T-interpretations. Let \(\varSigma _{1}\) and \(\varSigma _{2}\) be signatures, \(T_{1}\) a \(\varSigma _{1}\)-theory, and \(T_{2}\) a \(\varSigma _{2}\)-theory. The combination of \(T_{1}\) and \(T_{2}\), denoted \(T_{1}\oplus T_{2}\), is the class of \(\varSigma _{1}\cup \varSigma _{2}\)-structures \(\mathcal{A}\) such that \(\mathcal{A}^{\varSigma _{1}}\) is in \(T_{1}\) and \(\mathcal{A}^{\varSigma _{2}}\) is in \(T_{2}\), where \(\mathcal{A}^{\varSigma _{i}}\) is the restriction of \(\mathcal{A}\) to \(\varSigma _{i}\) for \(i\in \left\{ 1,2\right\} \).

2.2 The SMT-LIB 2 Theory of Datatypes

In this section we formally define the SMT-LIB 2 theory of algebraic datatypes. The formalization is based on [3], but is adjusted to suit our investigation of politeness.

Definition 1

Given a signature \(\varSigma \), a set \(S\subseteq \mathcal{S}_{\varSigma }\) and an S-sorted set A, the set of \(\varSigma \) -trees over A of sort \(\sigma \in \mathcal{S}_{\varSigma }\) is denoted by \(T_{\sigma }(\varSigma ,A)\) and is inductively defined as follows:

  • \(T_{\sigma ,0}(\varSigma ,A) = A_\sigma \) if \(\sigma \in S\) and \(\emptyset \) otherwise.

  • \(T_{\sigma ,i+1}(\varSigma ,A) = T_{\sigma ,i}(\varSigma ,A) \cup \{ c(t_1,\dots ,t_n) ~|~ c:\sigma _{1}\times \ldots \times \sigma _{n} \rightarrow \sigma \in \mathcal{F}_{\varSigma }, t_j \in T_{\sigma _j,i}(\varSigma ,A) \text{ for } j=1,\dots ,n \}\) for each \(i \ge 0\).

Then \(T_{\sigma }(\varSigma ,A) = \bigcup _{i \ge 0} T_{\sigma ,i}(\varSigma ,A)\). The depth of a \(\varSigma \)-tree over A is inductively defined by \({ depth}(a)=0\) for every \(a\in A\), \({ depth}(c)=1\) for every 0-ary function symbol \(c\in \mathcal{F}_{\varSigma }\), and \({ depth}(c(t_{1},\dots ,t_{n}))=1+max({ depth}(t_{1}),\dots ,{ depth}(t_{n}))\) for every n-ary function symbol c of \(\varSigma \).

The idea behind Definition 1 is that \(T_{\sigma }(\varSigma ,A)\) contains all ground \(\sigma \)-sorted terms constructed from the elements of A (considered as constant symbols) and the function symbols of \(\varSigma \).

Example 1

Let \(\varSigma \) be a signature with two sorts, \(\mathbf{elem}\) and \(\mathbf{struct}\), and whose function symbols are b of arity \(\mathbf{struct}\), and c of arity \((\mathbf{elem}\,\times \,\mathbf{struct}\,\times \,\mathbf{struct})\rightarrow \mathbf{struct}\). Consider the \(\{ \mathbf{elem}\}\)-sorted set \(A = \{ a \}\). For the \(\mathbf{elem}\) sort, \(T_{\mathbf{elem}}(\varSigma ,A)\) is the singleton \(A = \{ a \}\) and the \(\varSigma \)-tree a is of depth 0. For the \(\mathbf{struct}\) sort, \(T_{\mathbf{struct}}(\varSigma ,A)\) includes infinitely many \(\varSigma \)-trees, such as b of depth 1, c(abb) of depth 2, and c(ac(abb), b) of depth 3.

Definition 2

A finite signature \(\varSigma \) is called a datatypes signature if \(\mathcal{S}_{\varSigma }\) is the disjoint union of two sets of sorts \(\mathcal{S}_{\varSigma }=\mathbf{Elem}_{\varSigma }\uplus \mathbf{{Struct}}_{\varSigma }\) and \(\mathcal{F}_{\varSigma }\) is the disjoint union of two sets of function symbols \(\mathcal{F}_{\varSigma }={\mathcal{C}}{\mathcal{O}}_{\varSigma }\uplus {\mathcal{S}}{\mathcal{E}}_{\varSigma }\), such that \({\mathcal{S}}{\mathcal{E}}_{\varSigma }= \{s_{c,i}:\sigma \rightarrow \sigma _{i} \mid c\in {\mathcal{C}}{\mathcal{O}}_{\varSigma }, c:\sigma _{1},\dots ,\sigma _{n}\rightarrow \sigma , 1\le i\le n \}\) and \({\mathcal{P}}_{\varSigma }= \{is_{c}:\sigma \mid c\in {\mathcal{C}}{\mathcal{O}}_{\varSigma }, c:\sigma _{1},\dots ,\sigma _{n}\rightarrow \sigma \}\). We denote by \({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}}\) the signature with the same sorts as \(\varSigma \), no predicate symbols (except \(=_{\sigma }\) for \(\sigma \in \mathcal{S}_{\varSigma }\)), and whose function symbols are \({\mathcal{C}}{\mathcal{O}}_{\varSigma }\). We further require the following well-foundedness requirement: \(T_{\sigma }({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}_{\varSigma }}]]) \ne \emptyset \) for any \(\sigma \in \mathbf{{Struct}}_{\varSigma }\).

From now on, we omit the subscript \(\varSigma \) from the above notations (e.g., when writing \([[{\mathbf{Elem}}]]\) rather than \([[{\mathbf{Elem}_{\varSigma }}]]\), \({\mathcal{C}}{\mathcal{O}}\) rather than \({\mathcal{C}}{\mathcal{O}}_{\varSigma }\)) whenever \(\varSigma \) is clear from the context. Notice that Definition 2 remains equivalent if we replace \([[{\mathbf{Elem}}]]\) by any (non-empty) \(\mathbf{Elem}\)-sorted set A. The set \([[{\mathbf{Elem}}]]\) has been chosen since this minimal \(\mathbf{Elem}\)-sorted set is sufficient.

In accordance with SMT-LIB 2, we call the elements of \({\mathcal{C}}{\mathcal{O}}\) constructors, the elements of \({\mathcal{S}}{\mathcal{E}}\) selectors, and the elements of \({\mathcal{P}}\) testers. 0-ary constructors are called nullary. In what follows, \(\varSigma \) denotes an arbitrary datatypes signature.

In the next example we review some common datatypes signatures.

Example 2

The signature \({\varSigma _{ list}}\) has two sorts, \(\mathbf{elem}\) and \(\mathbf{list}\). Its function symbols are cons of arity \((\mathbf{elem}\times \mathbf{list})\rightarrow \mathbf{list}\), nil of arity \(\mathbf{list}\), car of arity \(\mathbf{list}\rightarrow \mathbf{elem}\) and cdr of arity \(\mathbf{list}\rightarrow \mathbf{list}\). Its predicate symbols are \(is_{nil}\) and \(is_{cons}\), both of arity \(\mathbf{list}\). It is a datatypes signature, with \(\mathbf{Elem}=\left\{ \mathbf{elem}\right\} \), \(\mathbf{{Struct}}=\left\{ \mathbf{list}\right\} \), \({\mathcal{C}}{\mathcal{O}}=\left\{ nil,cons\right\} \) and \({\mathcal{S}}{\mathcal{E}}=\left\{ car,cdr\right\} \). It is often used to model lisp-style linked lists. car represents the head of the list and cdr represents its tail. nil represents the empty list. \({\varSigma _{ list}}\) is well-founded as \(T_{\mathbf{list}}({{\varSigma _{ list}}}_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}}]])\) includes nil.

The signature \({\varSigma _{ pair}}\) also has two sorts, \(\mathbf{elem}\) and \(\mathbf{pair}\). Its function symbols are pair of arity \((\mathbf{elem}\times \mathbf{elem})\rightarrow \mathbf{pair}\) and first and second of arity \(\mathbf{pair}\rightarrow \mathbf{elem}\). Its predicate symbol is \(is_{pair}\) of arity \(\mathbf{pair}\). It is a datatypes signature, with \(\mathbf{Elem}=\left\{ \mathbf{elem}\right\} \), \(\mathbf{{Struct}}=\left\{ \mathbf{pair}\right\} \), \({\mathcal{C}}{\mathcal{O}}=\left\{ pair\right\} \), and \({\mathcal{S}}{\mathcal{E}}=\left\{ first,second\right\} \). It can be used to model ordered pairs, together with projection functions. It is well-founded as \(T_{\mathbf{pair}}({{\varSigma _{ pair}}}_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}}]])\) is not empty (as \([[{\mathbf{Elem}}]]\) is not empty).

The signature \({\varSigma _{ lp}}\) has three sorts, \(\mathbf{elem}\), \(\mathbf{pair}\) and \(\mathbf{list}\). Its function symbols are cons of arity \((\mathbf{pair}\times \mathbf{list})\rightarrow \mathbf{list}\), car of arity \(\mathbf{list}\rightarrow \mathbf{pair}\), as well as nilcdrfirstsecond with arities as above. Its predicate symbols are \(is_{pair}\), \(is_{cons}\) and nil, with arities as above. It can be used to model lists of ordered pairs. Similarly to the above signatures, it is a datatypes signature.

Next, we distinguish between finite datatypes (e.g., records) and inductive datatypes (e.g., lists).

Definition 3

A sort \(\sigma \in \mathbf{{Struct}}\) is finite if \(T_\sigma ({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}}]])\) is finite, and is called inductive otherwise.

We denote the set of inductive sorts in \(\varSigma \) by \({ Ind}(\varSigma )\) and the set of its finite sorts by \({ Fin}(\varSigma )\). Note that if \(\sigma \) is inductive, then according to Definitions 1 and 3 we have that for any natural number i there exists a natural number \(i' > i\) such that \(T_{\sigma ,i'}({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}}]]) \ne T_{\sigma ,i}({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},[[{\mathbf{Elem}}]])\). Further, for any natural number d and every \(\mathbf{Elem}\)-sorted set D there exists a natural number \(i'\) such that \(T_{\sigma ,i'}({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},D)\) contains an element whose depth is greater than d.

Example 3

\(\mathbf{list}\) is inductive in \({\varSigma _{ list}}\) and \({\varSigma _{ lp}}\). \(\mathbf{pair}\) is finite in \({\varSigma _{ pair}}\) and \({\varSigma _{ lp}}\).

Finally, we define datatypes structures and the theory of algebraic datatypes.

Definition 4

Let \(\varSigma \) be a datatypes signature and D an \(\mathbf{Elem}\)-sorted set. A \(\varSigma \)-structure \(\mathcal{A}\) is said to be a datatypes \(\varSigma \) -structure generated by D if:

  • \(\sigma ^{\mathcal{A}} = T_{\sigma }({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}},D)\) for every sort \(\sigma \in \mathcal{S}_{\varSigma }\),

  • \(c^{\mathcal{A}}(t_1 ,\dots ,t_n) = c(t_1 ,\dots ,t_n)\) for every \(c\in {\mathcal{C}}{\mathcal{O}}\) of arity \((\sigma _{1}\times \ldots \times \sigma _{n})\rightarrow \sigma \) and \(t_{1} \in \sigma _{1}^{\mathcal{A}} ,\dots ,t_{n} \in \sigma _{n}^{\mathcal{A}}\),

  • \(s_{c,i}^{\mathcal{A}}( c(t_{1} ,\dots ,t_{n}) )=t_{i}\) for every \(c\in {\mathcal{C}}{\mathcal{O}}\) of arity \((\sigma _{1}\times \ldots \times \sigma _{n})\rightarrow \sigma \), \(t_{1} \in \sigma _{1}^{\mathcal{A}} ,\dots ,t_{n} \in \sigma _{n}^{\mathcal{A}}\) and \(1\le i\le n\),

  • \(is_{c}^{\mathcal{A}}=\left\{ c(t_{1} ,\dots ,t_{n}) \mid t_{1} \in \sigma _{1}^{\mathcal{A}} ,\dots ,t_{n} \in \sigma _{n}^{\mathcal{A}} \right\} \) for every \(c\in {\mathcal{C}}{\mathcal{O}}\) of arity \((\sigma _{1}\times \ldots \times \sigma _{n})\rightarrow \sigma \).

\(\mathcal{A}\) is said to be a datatypes \(\varSigma \) -structure if it is a datatypes \(\varSigma \)-structure generated by D for some \(\mathbf{Elem}\)-sorted set D. The \(\varSigma \) -theory of datatypes, denoted \(\mathcal{T}_{\varSigma }\) is the class of datatypes \(\varSigma \)-structures.

Notice that the interpretation of selector functions \(s_{c,i}\) when applied to terms that are constructed using a constructor different than c is not fixed and can be set arbitrarily in datatypes structures, consistently with SMT-LIB 2.

Example 4

If \(\mathcal{A}\) is a datatypes \({\varSigma _{ list}}\)-structure then \(\mathbf{list}^{\mathcal{A}}\) is the set of terms constructed from \(\mathbf{elem}^{\mathcal{A}}\) and cons, plus nil. If \(\mathbf{elem}^{\mathcal{A}}\) is the set of natural numbers, then \(\mathbf{list}^{\mathcal{A}}\) contains, e.g., nil, cons(1, nil), and cons(1, cons(1, cons(2, nil))). These correspond to the lists [] (the empty list), [1] and [1, 1, 2], respectively.

If \(\mathcal{A}\) is a datatypes \({\varSigma _{ pair}}\)-structure then \(\mathbf{pair}^{\mathcal{A}}\) is the set of terms of the form pair(ab) with \(a,b\in \mathbf{elem}^{\mathcal{A}}\). If \(\mathbf{elem}^{\mathcal{A}}\) is again interpreted as the set of natural numbers, \(\mathbf{pair}^{\mathcal{A}}\) includes, for example, the terms pair(1, 1) and pair(1, 2), that correspond to (1, 1) and (1, 2), respectively. Notice that in this case, \(\mathbf{pair}^{\mathcal{A}}\) is an infinite set even though \(\mathbf{pair}\) is a finite sort (in terms of Definition 3).

Datatypes \({\varSigma _{ lp}}\)-structures with the same interpretation for \(\mathbf{elem}\) include the terms nil, cons(pair(1, 1), nil), and cons(pair(1, 1), cons(pair(1, 2), nil)) in the interpretation for \(\mathbf{list}\), that correspond to [], [(1, 1)] and [(1, 1), (1, 2)], respectively. If we rename \(\mathbf{elem}\) in the definition of \({\varSigma _{ list}}\) to \(\mathbf{pair}\), we get that \(\mathcal{T}_{{\varSigma _{ lp}}}=\mathcal{T}_{{\varSigma _{ list}}}\oplus \mathcal{T}_{{\varSigma _{ pair}}}\).

2.3 Polite Theories

Given two theories \(T_1\) and \(T_2\), a combination method à la Nelson-Oppen provides a modular way to decide \(T_1 \cup T_2\)-satisfiability problems using the satisfiability procedures known for \(T_1\) and \(T_2\). Assuming that \(T_1\) and \(T_2\) have disjoint signatures is not sufficient to get a complete combination method for the satisfiability problem. The reason is that \(T_1\) and \(T_2\) may share sorts, and the equality symbol on these shared sorts. To be complete, \(T_1\) and \(T_2\) must agree on the cardinality of their respective models, and there must be an agreement between \(T_1\) and \(T_2\) on the interpretation of shared formulas built over the equality symbol. These two requirements can be easily fulfilled, based on the following definitions:

Definition 5

(Stable Infiniteness). Given a signature \(\varSigma \) and a set \(S\subseteq \mathcal{S}_{\varSigma }\), we say that a \(\varSigma \)-theory T is stably infinite with respect to S if every quantifier-free \(\varSigma \)-formula that is T-satisfiable is also T-satisfiable by a T-interpretation \(\mathcal{A}\) in which \(\sigma ^{\mathcal{A}}\) is infinite for every \(\sigma \in S\).

Definition 6

(Arrangement). Let V be a finite set of variables whose sorts are in S and \(\left\{ V_{\sigma }{\ |\ }\sigma \in S\right\} \) a partition of V such that \(V_{\sigma }\) is the set of variables of sort \(\sigma \) in V. We say that a formula \(\delta \) is an arrangement of V if \(\delta =\bigwedge _{\sigma \in S}(\bigwedge _{(x,y)\in E_{\sigma }}(x=y)\wedge \bigwedge _{(x,y)\notin E_{\sigma }}(x\ne y))\), where \(E_{\sigma }\) is some equivalence relation over \(V_{\sigma }\) for each \(\sigma \in S\).

Assume that both \(T_1\) and \(T_2\) are stably infinite with disjoint signatures, and let V be the finite set of variables shared by \(T_1\) and \(T_2\). Under this assumption, \(T_1\) and \(T_2\) can agree on an infinite cardinality, and guessing an arrangement of V suffices to get an agreement on the interpretation of shared formulas.

In this paper we are interested in an asymmetric disjoint combination where \(T_1\) and \(T_2\) are not both stably infinite. In this scenario, one theory can be arbitrary. As a counterpart, the other theory must be more than stably infinite: it must be polite, meaning that it is always possible to increase the cardinality of a model and to have a model whose cardinality is finite.

In the following we decompose the politeness definition from [14, 21] in order to distinguish between politeness and strong politeness (in terms of [8]) in various levels of the definition. In what follows, \(\varSigma \) is an arbitrary (many-sorted) signature, \(S\subseteq \mathcal{S}_{\varSigma }\), and T is a \(\varSigma \)-theory.

Definition 7

(Smooth). The theory T is  smooth w.r.t. S if for every quantifier-free formula \(\phi \), T-interpretation \(\mathcal{A}\) that satisfies \(\phi \), and function \(\kappa \) from S to the class of cardinals such that \(\kappa (\sigma )\ge {\left| \sigma ^{\mathcal{A}}\right| }\) for every \(\sigma \in S\) there exists a \(\varSigma \)-interpretation \(\mathcal{A}'\) that satisfies \(\phi \) with \({\left| \sigma ^{\mathcal{A}'}\right| }=\kappa (\sigma )\) for every \(\sigma \in S\).

In definitions introduced above, as well as below, we often identify singletons with their single elements when there is no ambiguity (e.g., when saying that a theory is smooth w.r.t. a sort \(\sigma \)).

We now introduce some concepts in order to define finite witnessability. Let \(\phi \) be a quantifier-free \(\varSigma \)-formula and \(\mathcal{A}\) a \(\varSigma \)-interpretation. We say that \(\mathcal{A}\) finitely witnesses \(\phi \) for T w.r.t. S (or, is a finite witness of \(\phi \) for T w.r.t. S), if \(\mathcal{A}\) is a T-interpretation, , and \(\sigma ^{\mathcal{A}}={ vars}_{\sigma }({\phi })^{\mathcal{A}}\) for every \(\sigma \in S\). We say that \(\phi \) is finitely witnessed for T w.r.t. S if it is either T-unsatisfiable or it has a finite witness for T w.r.t. S. \(\phi \) is strongly finitely witnessed for T w.r.t. S if \(\phi \wedge \delta _{V}\) is finitely witnessed for T w.r.t. S for every arrangement \(\delta _{V}\) of V, where V is any set of variables whose sorts are in S. We say that a function \({ wtn}: { QF}(\varSigma ) \rightarrow { QF}(\varSigma )\) is a (strong) witness for T w.r.t. S if for every \(\phi \in { QF}(\varSigma )\) we have that: 1. \(\phi \) and \(\exists \,\overrightarrow{w}.\,{ wtn}(\phi )\) are T-equivalent for \(\overrightarrow{w}={ vars}_{}({{ wtn}(\phi )})\setminus { vars}_{}({\phi })\); and 2. \({ wtn}(\phi )\) is (strongly) finitely witnessed for T w.r.t. S.Footnote 1

Definition 8

(Finitely Witnessable). The theory T is (strongly) finitely witnessable w.r.t. S if there exists a (strong) witness for T w.r.t. S which is computable.

Definition 9

(Polite). T is called (strongly) polite w.r.t. S if it is smooth and (strongly) finitely witnessable w.r.t. S.

Finally, we recall the following theorem from [14].

Theorem 1

([14]). Let \(\varSigma _1\) and \(\varSigma _2\) be signatures and let \(S=\mathcal{S}_{\varSigma _1}\cap \mathcal{S}_{\varSigma _2}\). If \(T_1\) is a \(\varSigma _1\)-theory strongly polite w.r.t. \(S_1\subseteq \mathcal{S}_{\varSigma _1}\), \(T_2\) is a \(\varSigma _2\)-theory strongly polite w.r.t. \(S_2\subseteq \mathcal{S}_{\varSigma _2}\), and \(S\subseteq S_2\), then \(T_1\oplus T_2\) is strongly polite w.r.t. \(S_1\cup (S_2\setminus S)\).

3 Additive Witnesses

It was shown in [14] that politeness is not sufficient for the proof of the polite combination method from [21]. Strong politeness was introduced to fix the problem. It is unknown, however, whether there are theories that are polite but not strongly polite. In this section we offer a simple (yet useful) criterion for the equivalence of the two notions. Throughout this section, unless stated otherwise, \(\varSigma \) and S denote an arbitrary signature and a subset of its set of sorts, and \(T,T_{1},T_{2}\) denote arbitrary \(\varSigma \)-theories.

The following example, which is based on [14] using notions of the current paper, shows that the strong and non-strong witnesses are different. Let \(\varSigma _{0}\) be a signature with a single sort \(\sigma \) and no function or predicate symbols (except \(=_{\sigma })\), \(T_{0}\) the \(\varSigma _{0}\)-theory consisting of all \(\varSigma _{0}\)-structures \(\mathcal{A}\) with \({\left| \sigma ^{\mathcal{A}}\right| }\ge 2\), \(\phi \) the formula \(x=x\wedge w=w\), and \(\delta \) the arrangement \((x=w)\) of \(\left\{ x,w\right\} \). Then \(\phi \wedge \delta \) is \(T_{0}\)-satisfiable, but every interpretation \(\mathcal{A}\) with \(\sigma ^{\mathcal{A}}=\left\{ x,w\right\} ^{\mathcal{A}}\) that satisfies it has only one element in \(\sigma ^{\mathcal{A}}\) and so \(\phi \) is not strongly finitely witnessed for \(T_{0}\) w.r.t. \(\sigma \). It is straightforward to show, however, that \(\phi \) is finitely witnessed for \(T_{0}\) w.r.t. \(\sigma \). Moreover, the function \({ wtn}\) defined by \({ wtn}(\phi )=(\phi \ \wedge \ w_{1}=w_{1}\ \wedge \ w_{2}=w_{2})\) for fresh \(w_{1},w_{2}\) is a witness for \(T_{0}\) w.r.t. \(\sigma \), but not a strong one. This does not show, however, that \(T_{0}\) is not strongly polite. In fact, it is indeed strongly polite since the function \({ wtn}'(\phi )=\phi \wedge w_{1}\ne w_{2}\) for fresh \(w_{1},w_{2}\) is a strong witness for \(T_{0}\) w.r.t. \(\sigma \).

We introduce the notion of additivity, which ensures that the witness is able to “absorb” arrangements and thus lift politeness to strong politeness.

Definition 10

(Additivity). Let \(f:{ QF}(\varSigma )\rightarrow { QF}(\varSigma )\). We say that f is S-additive for T if \(f(f(\phi )\wedge \varphi )\) and \(f(\phi )\wedge \varphi \) are T-equivalent and have the same set of S-sorted variables for every \(\phi ,\varphi \in { QF}(\varSigma )\), provided that \(\varphi \) is a conjunction of flat literals such that every term in \(\varphi \) is a variable whose sort is in S. When T is clear from the context, we just say that f is S-additive. We say that T is additively finitely witnessable w.r.t. S if there exists a witness for T w.r.t. S which is both computable and S-additive. T is said to be additively polite w.r.t. S if it is smooth and additively finitely witnessable w.r.t. S.

Proposition 1

Let \({ wtn}\) be a witness for T w.r.t. S. If \({ wtn}\) is S-additive then it is a strong witness for T w.r.t. S.Footnote 2

Corollary 1

Suppose T is additively polite w.r.t. S. Then it is strongly polite w.r.t. S.

The theory \(T_{0}\) from the example above is additively finitely witnessable w.r.t. \(\sigma \), even though \({ wtn}'\) is not \(\sigma \)-additive. Indeed, it is possible to define a new witness for \(T_{0}\) w.r.t. \(\sigma \), say \({ wtn}''\), which is \(\sigma \)-additive. This function \({ wtn}''\) is defined by: \({ wtn}''(\phi )= { wtn}'(\phi )\) if \(\phi \) is a conjunction that includes some disequality \(x\ne y\) for some xy. Otherwise, \({ wtn}''(\phi )=\phi \).

\(T_{0}\) is an existential theory: it consists of all the structures that satisfy an existential sentence (in this case, \(\exists x,y\,.\,x\ne y\)). The construction of \({ wtn}''\) can be generalized to any existential theory. Such theories are also smooth w.r.t. any set of sorts and so existential theories are additively polite.

The notion of additive witnesses is useful for proving that a polite theory is strongly polite. In particular, the witnesses for the theories of equality, arrays, sets and multisets from [21] are all additive, and so strong politeness of these theories follows from their politeness. The same will hold later, when we conclude strong politeness of theories of algebraic datatypes from their politeness.

4 Politeness for the SMT-LIB 2 Theory of Datatypes

Let \(\varSigma \) be a datatypes signature with \(\mathcal{S}_{\varSigma }=\mathbf{Elem}\uplus \mathbf{{Struct}}\) and \(\mathcal{F}_{\varSigma }={\mathcal{C}}{\mathcal{O}}\uplus {\mathcal{S}}{\mathcal{E}}\). In this section, we prove that \(\mathcal{T}_{\varSigma }\) is strongly polite with respect to \(\mathbf{Elem}\). In Sect. 4.1, we consider theories with only inductive sorts, and consider theories with only finite sorts in Sect. 4.2. We combine them in Sect. 4.3, where arbitrary theories of datatypes are considered. This separation is only needed for finite witnessability. For smoothness, however, it is straightforward to show that the \(\mathbf{Elem}\) domain of a given interpretation can always be augmented without changing satisfiability of quantifier-free formulas.

Lemma 1

\(\mathcal{T}_{\varSigma }\) is smooth w.r.t. \(\mathbf{Elem}\).

Lemma 1 holds for any datatypes signature.

4.1 Inductive Datatypes

In this section, we assume that all sorts in \(\mathbf{{Struct}}\) are inductive.

To prove finite witnessability, we now introduce an additive witness function. Following arguments from [21], it suffices to define the witness only for conjunctions of flat literals. A complete witness can then use the restricted one by first transforming the input formula to flat DNF form and then creating a disjunction where each disjunct is the result of applying the witness on the corresponding disjunct. Similarly, it suffices to show that \({ wtn}(\phi )\) is finitely witnessed for \(\phi \) which is a conjunction of flat literals. Essentially, our witness guesses possible constructors for variables whose constructors are not explicit in the input formula.

Definition 11

(A Witness for \(\mathcal{T}_{\varSigma }\)). Let \(\phi \) be a quantifier-free conjunction of flat \(\varSigma \)-literals. \({ wtn_{i}}(\phi )\) is obtained from \(\phi \) by performing the following steps:

  1. 1.

    For any literal of the form \(y = s_{c,i}(x)\) such that \(x=c(\overrightarrow{u_{1}}, y, \overrightarrow{u_{2}})\) does not occur in \(\phi \) and \(x=d(\overrightarrow{u_d})\) does not occur in \(\phi \) for any \(\overrightarrow{u_{1}},\overrightarrow{u_{2}},\overrightarrow{u_d}\), we conjunctively add \(x=c(\overrightarrow{u_{1}}, y, \overrightarrow{u_{2}}) \vee (\bigvee _{d\ne c} x=d(\overrightarrow{u_d}))\) with fresh \(\overrightarrow{u_{1}}, \overrightarrow{u_{2}}, \overrightarrow{u_d}\), where c and d range over \({\mathcal{C}}{\mathcal{O}}\).

  2. 2.

    For any literal of the form \(is_c(x)\) such that \(x=c(\overrightarrow{u})\) does not occur in \(\phi \) for any \(\overrightarrow{u}\), we conjunctively add \(x=c(\overrightarrow{u})\) with fresh \(\overrightarrow{u}\).

  3. 3.

    For any literal of the form \(\lnot is_c(x)\) such that \(x=d(\overrightarrow{u_{d}})\) does not occur in \(\phi \) for any \(d\ne c\) and \(\overrightarrow{u_{d}}\), we conjunctively add \(\bigvee _{d\ne c}x=d(\overrightarrow{u_{d}})\), with fresh \(\overrightarrow{u_{d}}\).

  4. 4.

    For any sort \(\sigma \in \mathbf{Elem}\) such that \(\phi \) does not include a variable of sort \(\sigma \) we conjunctively add a literal \(x=x\) for a fresh variable x of sort \(\sigma \).

Example 5

Let \(\phi \) be the \({\varSigma _{ list}}\)-formula \(y=cdr(x)\wedge y'=cdr(x)\wedge is_{cons}(y)\). \({ wtn_{i}}(\phi )\) is \(\phi \wedge (x=nil\vee x=cons(e,y))\wedge (x=nil\vee x=cons(e',y'))\wedge y=cons(e'',z)\wedge e'''=e'''\) where \(e,e',e'',e''',z\) are fresh.

In Definition 11, Item 1 guesses the constructor of the argument for the selector. Items 2 and 3 correspond to the semantics of testers. Item 4 is meant to ensure that we can construct a finite witness with non-empty domains. The requirement for absence of literals before adding literals or disjunctions to \(\phi \) is used to ensure additivity of \({ wtn_{i}}\). And indeed:

Lemma 2

\({ wtn_{i}}\) is \(\mathbf{Elem}\)-additive.

Further, it can be verified that:

Lemma 3

Let \(\phi \) be a conjunction of flat literals. \(\phi \) and \(\exists \overrightarrow{w}\,.\,\varGamma \) are \(\mathcal{T}_{\varSigma }\)-equivalent, where \(\varGamma ={ wtn_{i}}(\phi )\) and \(\overrightarrow{w}={ vars}_{}({\varGamma })\setminus { vars}_{}({\phi })\).

The remainder of this section is dedicated to the proof of the following lemma:

Lemma 4

(Finite Witnessability). Let \(\phi \) be a conjunction of flat literals. Then, \(\varGamma ={ wtn_{i}}(\phi )\) is finitely witnessed for \(\mathcal{T}_{\varSigma }\) with respect to \(\mathbf{Elem}\).

Suppose that \(\varGamma \) is \(\mathcal{T}_{\varSigma }\)-satisfiable, and let \(\mathcal{A}\) be a satisfying \(\mathcal{T}_{\varSigma }\)-interpretation. We define a \(\mathcal{T}_{\varSigma }\)-interpretation \(\mathcal{B}\) as follows, and then show that \(\mathcal{B}\) is a finite witness of \(\varGamma \) for \(\mathcal{T}_{\varSigma }\) w.r.t. \(\mathbf{Elem}\). First for every \(\sigma \in \mathbf{Elem}\) we set \(\sigma ^{\mathcal{B}}={ vars}_{\sigma }({\varGamma })^\mathcal{A}\), and for every variable \(e\in { vars}_{\sigma }({\varGamma })\), we set \(e^{\mathcal{B}}=e^{\mathcal{A}}\). The interpretations of \(\mathbf{{Struct}}\)-sorts, testers and constructors are uniquely determined by the theory. It is left to define the interpretation of \(\mathbf{{Struct}}\)-variables in \(\mathcal{B}\), as well as the interpretation of the selectors (the interpretation of selectors is fixed by the theory only when applied to the “right” constructor). We do this in several steps:

Step 1 – Simplifying \(\varGamma \): since \(\phi \) is a conjunction of flat literals, \(\varGamma \) is a conjunction whose conjuncts are either flat literals or disjunctions of flat literals (introduced in Items 1 and 3 of Definition 11). Since , \(\mathcal{A}\) satisfies exactly one disjunct of each such disjunction. We can thus obtain a formula \(\varGamma _{1}\) from \(\varGamma \) by replacing every disjunction with the disjunct that is satisfied by \(\mathcal{A}\). Notice that and that it is a conjunction of flat literals. Let \(\varGamma _{2}\) be obtained from \(\varGamma _{1}\) by removing any literal of the form \(is_{c}(x)\) and any literal of the form \(\lnot is_{c}(x)\). Let \(\varGamma _{3}\) be obtained from \(\varGamma _{2}\) by removing any literal of the form \(x=s_{c,i}(y)\). For convenience, we denote \(\varGamma _{3}\) by \(\varGamma '\). Obviously, , and \(\varGamma '\) is a conjunction of flat literals without selectors and testers.

Step 2 – Working with Equivalence Classes: We would like to preserve equalities between \(\mathbf{{Struct}}\)-variables from \(\mathcal{A}\). To this end, we group all variables in \({ vars}_{}({\varGamma })\) to equivalence classes according to their interpretation in \(\mathcal{A}\). Let \(\equiv _{\mathcal{A}}\) denote an equivalence relation over \({ vars}_{}({\varGamma })\) such that \(x\equiv _{\mathcal{A}}y\) iff \(x^{\mathcal{A}}=y^{\mathcal{A}}\). We denote by [x] the equivalence class of x. Let \(\alpha \) be an equivalence class, thus \(\alpha ^{\mathcal{A}}=\left\{ x^{\mathcal{A}}\mid x\in \alpha \right\} \) is a singleton. Identifying this singleton with its only element, we have that \(\alpha ^{\mathcal{A}}\) denotes \(a^{\mathcal{A}}\) for an arbitrary element a of the equivalence class \(\alpha \).

Step 3 – Ordering Equivalence Classes: We would also like to preserve disequalities between \(\mathbf{{Struct}}\)-variables from \(\mathcal{A}\). Thus we introduce a relation \(\prec \) over the equivalence classes, such that \(\alpha \prec \beta \) if \(y = c(w_1,\dots ,w_n)\) occurs as one of the conjuncts in \(\varGamma '\) for some \(w_{1},\dots ,w_n\) and c such that \(w_k \in \alpha \) for some \(y \in \beta \), \(c\in {\mathcal{C}}{\mathcal{O}}\), and k. Call an equivalence class \(\alpha \) nullary if \(\mathcal{A}\models is_c(x)\) for some \(x\in \alpha \) and nullary constructor c. Call an equivalence class \(\alpha \) minimal if \(\beta \not \prec \alpha \) for every \(\beta \). Notice that each nullary equivalence class is minimal. The relation \(\prec \) induces a directed acyclic graph (DAG), denoted G. The vertices are the equivalence classes. Whenever \(\alpha \prec \beta \), we draw an edge from vertex \(\alpha \) to \(\beta \).

Step 4 – Interpretation of Equivalence Classes: We define \(\alpha ^{\mathcal{B}}\) for every equivalence class \(\alpha \). Then, \(x^{\mathcal{B}}\) is simply defined as \([x]^{\mathcal{B}}\), for every \(\mathbf{{Struct}}\)-variable x. The idea goes as follows. Nullary classes are assigned according to \(\mathcal{A}\). Other minimal classes are assigned arbitrarily, but it is important to assign different classes to terms whose depths are far enough from each other to ensure that the disequalities in \(\mathcal{A}\) are preserved. Non-minimal classes are uniquely determined after minimal ones are assigned. Formally, let m be the number of equivalence classes, l the number of minimal equivalence classes, r the number of nullary equivalence classes, and \(\alpha _{1},\dots ,\alpha _{m}\) a topological sort of G, such that all minimal classes occur before all others, and the first r classes are nullary. Let d be the length of the longest path in G. We define \(\alpha _{i}^{\mathcal{B}}\) by induction on i. In the definition, we use \(\mathcal{B}_{\mathbf{Elem}}\) to denote the \(\mathbf{Elem}\)-sorted set assigning \(\sigma ^{\mathcal{B}}\) to every \(\sigma \in \mathbf{Elem}\).

  1. 1.

    If \(0<r\) and \(i\le r\) then \(\alpha _{i}\) is a nullary class and so we set \(\alpha _i^{\mathcal{B}}=\alpha _i^{\mathcal{A}}\).

  2. 2.

    If \(r<i\le l\) then \(\alpha _{i}\) is minimal and not nullary. Let \(\sigma \) be the sort of variables in \(\alpha _{i}\). If \(\sigma \in \mathbf{Elem}\), then all variables in the class have already been defined. Otherwise, \(\sigma \in \mathbf{{Struct}}\). In this case, we define \(\alpha _{i}^{\mathcal{B}}\) to be an arbitrary element of \(T_{\sigma }({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}}, \mathcal{B}_{\mathbf{Elem}})\) that has depth strictly greater than \(\max \left\{ { depth}(\alpha _{j}^{\mathcal{B}})\mid 0<j<i\right\} +d\) (here \(\max \emptyset =0\)).

  3. 3.

    If \(i>l\) then we set \(\alpha _{i}^{\mathcal{B}}=c(\beta _{1}^{\mathcal{B}},\dots ,\beta _{n}^{\mathcal{B}})\) for the unique equivalence classes \(\beta _{1},\dots ,\beta _{n}\subseteq \left\{ \alpha _{1},\dots ,\alpha _{i-1}\right\} \) and c such that \(y=c(x_{1},\dots ,x_{n})\) occurs in \(\varGamma '\) for some \(y\in \alpha _{i}\) and \(x_{1}\in \beta _{1},\dots ,x_{n}\in \beta _{n}\).

Since \(\varSigma \) is a datatypes signature in which all \(\mathbf{{Struct}}\)-sorts are inductive, the second case of the definition is well-defined. Further, the topological sort ensures \(\beta _{1},\dots ,\beta _{n}\) exist, and the partition to equivalence classes ensures that they are unique. Hence:

Lemma 5

\(\alpha _{i}^{\mathcal{B}}\) is well-defined.

Step 5 – Interpretation of Selectors: Let \(s_{c,i}\in {\mathcal{S}}{\mathcal{E}}\) for \(c:\sigma _{1}\times \ldots \times \sigma _{n}\rightarrow \sigma \), \(1\le i\le n\) and \(a\in \sigma ^{\mathcal{B}}\). If \(a\in is_{c}^{\mathcal{B}}\), we must have \(a=c(a_{1},\dots ,a_{n})\) for some \(a_{1}\in \sigma _{1}^{\mathcal{B}},\dots ,a_{n}\in \sigma _{n}^{\mathcal{B}}\). We then set \(s_{c,i}^{\mathcal{B}}(a)=a_{i}\). Otherwise, we consider two cases. If \(x^{\mathcal{B}}=a\) for some \(x\in { vars}_{}({\varGamma })\) such that \(y=s_{c,i}(x)\) occurs in \(\varGamma _{2}\) for some y, we set \(s_{c,i}^{\mathcal{B}}(a)=y^{\mathcal{B}}\). Otherwise, \(s_{c,i}^{\mathcal{B}}(a)\) is set arbitrarily.

Example 6

Let \(\varGamma \) be the following \({\varSigma _{ list}}\)-formula: \(x_1 = cons(e_1, x_2) \wedge x_3 = cons(e_2, x_4) \wedge x_2 \ne x_4\). Then \(\varGamma '=\varGamma \). We have the following satisfying interpretation \(\mathcal{A}\): \({\mathbf{elem}}^{\mathcal{A}} = \{1, 2, 3, 4\}\), \(e_{1}^{\mathcal{A}} = 1, {e_{2}} ^{\mathcal{A}} = 2\), \({x_1}^{\mathcal{A}} = [1, 2, 3], {x_2}^{\mathcal{A}} = [2, 3], {x_3}^{\mathcal{A}} = [2, 2, 4], {x_4}^{\mathcal{A}} = [2, 4]\). The construction above yields the following interpretation \(\mathcal{B}\): \({\mathbf{elem}}^{\mathcal{B}} = \{1, 2\}\), \({e_1}^{\mathcal{B}} = 1, {e_2} ^{\mathcal{B}} = 2\). For \(\mathbf{list}\)-variables, we proceed as follows. The equivalence classes of \(\mathbf{list}\)-variables are \([x_1], [x_2], [x_3], [x_4]\), with \([x_2]\prec [x_1]\) and \([x_4]\prec [x_3]\). The length of the longest path in G is 1. Assuming \([x_2]\) comes before \([x_4]\) in the topological sort, \(x_2^{\mathcal{B}}\) will get an arbitrary list over \(\left\{ 1,2\right\} \) with length greater than 1 (the depth of \(e_2^{\mathcal{B}}\) plus the length of the longest path), say, [1, 1, 1]. \(x_{4}^{\mathcal{B}}\) will then get an arbitrary list of length greater than 4 (the depth of \(x_2^{\mathcal{B}}\) plus the length of the longest path). Thus we could have \(x_{4}^{\mathcal{B}}=[1,1,1,1,1]\). Then, \(x_{1}^{\mathcal{B}}=[1,1,1,1]\) and \(x_{3}^{\mathcal{B}}=[2,1,1,1,1,1]\).

Now that \(\mathcal{B}\) is defined, it is left to show that it is a finite witness of \(\varGamma \) for \(\mathcal{T}_{\varSigma }\) w.r.t. \(\mathbf{Elem}\). By construction, \(\sigma ^{\mathcal{B}}={ vars}_{\sigma }({\varGamma })^{\mathcal{B}}\) for every \(\sigma \in \mathbf{Elem}\). \(\mathcal{B}\) also preserves the equalities and disequalities in \(\mathcal{A}\), and by considering every shape of a literal in \(\varGamma '\) we can prove that . Our interpretation of the selectors then ensures that:

Lemma 6

.

Lemma 6, together with the definition of the domains of \(\mathcal{B}\), gives us that \(\mathcal{B}\) is a finite witness of \(\varGamma \) for \(\mathcal{T}_{\varSigma }\) w.r.t. \(\mathbf{Elem}\), and so Lemma 4 is proven. As a corollary of Lemmas 1, 2 and 4, strong politeness is obtained.

Theorem 2

If \(\varSigma \) is a datatypes signature and all sorts in \(\mathbf{{Struct}}_{\varSigma }\) are inductive, then \(\mathcal{T}_{\varSigma }\) is strongly polite w.r.t. \(\mathbf{Elem}_{\varSigma }\).

4.2 Finite Datatypes

In this section, we assume that all sorts in \(\mathbf{{Struct}}\) are finite.

For finite witnessability, we define the following witness, that guesses the construction of each \(\mathbf{{Struct}}\)-variables until a fixpoint is reached. For every quantifier-free conjunction of flat \(\varSigma \)-literals \(\phi \), define the sequence \(\phi _{0},\phi _{1},\ldots \), such that \(\phi _{0}=\phi \), and for every \(i\ge 0\), \(\phi _{i+1}\) is obtained from \(\phi _{i}\) by conjuncting it with a disjunction \(\bigvee _{c\in {\mathcal{C}}{\mathcal{O}}} x=c(w_1^{c},\dots ,w_{n_{c}}^{c})\) for fresh \(w_{1}^{c},\dots ,w_{n_{c}}^{c}\), where x is some arbitrary \(\mathbf{{Struct}}\)-variable in \(\phi _{i}\) such that there is no literal of the form \(x=c(y_{1},\dots ,y_{n})\) in \(\phi _{i}\) for any constructor c and variables \(y_{1},\dots ,y_{n}\), if such x exists. Since \(\mathbf{{Struct}}\) only has finite sorts, this sequence becomes constant at some \(\phi _{k}\).

Definition 12

(A Witness for \(\mathcal{T}_{\varSigma }\)).\({ wtn_{f}}(\phi )\) is \(\phi _{k}\) for the minimal k such that \(\phi _{k}=\phi _{k+1}\).

Example 7

Let \(\phi \) be the \({\varSigma _{ pair}}\)-formula \(x=first(y)\wedge x'=first(y')\wedge x\ne x'\). \({ wtn_{f}}(\phi )\) is \(\phi \wedge y=pair(e_1,e_2)\wedge y'=pair(e_3,e_4)\).

Similarly to Sect. 4.1, we have:

Lemma 7

\({ wtn_{f}}\) is \(\mathbf{Elem}\)-additive.

Lemma 8

\(\phi \) and \(\exists \overrightarrow{w}\,.\,{ wtn_{f}}(\phi )\) are \(\mathcal{T}_{\varSigma }\)-equivalent, where \(\overrightarrow{w}={ vars}_{}({{ wtn_{f}}(\phi )})\setminus { vars}_{}({\phi })\).

We now prove the following lemma:

Lemma 9

(Finite Witnessability). Let \(\phi \) be a conjunction of flat literals. Then, \({ wtn_{f}}(\phi )\) is finitely witnessed for \(\mathcal{T}_{\varSigma }\) with respect to \(\mathbf{Elem}\).

Suppose \(\varGamma ={ wtn_{f}}(\phi )\) is \(\mathcal{T}_{\varSigma }\)-satisfiable, and let \(\mathcal{A}\) be a satisfying \(\mathcal{T}_{\varSigma }\)-interpretation. We define a \(\mathcal{T}_{\varSigma }\)-interpretation \(\mathcal{B}\) which is a finite witness of \(\varGamma \) for \(\mathcal{T}_{\varSigma }\) w.r.t. \(\mathbf{Elem}\). We set \(\sigma ^{\mathcal{B}}={ vars}_{\sigma }({\varGamma })^\mathcal{A}\) for every \(\sigma \in \mathbf{Elem}\), \(e^{\mathcal{B}}=e^{\mathcal{A}}\), for every variable \(e\in { vars}_{\mathbf{Elem}}({\varGamma })\) and \(x^{\mathcal{B}}=x^{\mathcal{A}}\) for every variable \(x\in { vars}_{\mathbf{{Struct}}}({\varGamma })\). Selectors are also interpreted as they are interpreted in \(\mathcal{A}\). This is well-defined: for any \(\mathbf{{Struct}}\)-variable x, every element in \(\sigma ^{\mathcal{A}}\) for \(\sigma \in \mathbf{Elem}\) that occurs in \(x^{\mathcal{A}}\) has a corresponding variable e in \(\varGamma \) such that \(e^{\mathcal{A}}\) is that element. This holds by the finiteness of the sorts in \(\mathbf{{Struct}}\) and the definition of \({ wtn_{f}}\). Further, for any \(\mathbf{{Struct}}\)-variable x such that \(s_{c,i}(x)\) occurs in \(\varGamma \), we must have that it occurs in some literal of the form \(y=s_{c,i}(x)\) of \(\varGamma \). Similarly to the above, all elements that occur in \(y^{\mathcal{A}}\) and \(x^{\mathcal{A}}\) have corresponding variables in \(\varGamma \). Therefore, is a trivial consequence of . By the definition of its domains, \(\mathcal{B}\) is a finite witness of \(\varGamma \) for \(\mathcal{T}_{\varSigma }\) w.r.t. \(\mathbf{Elem}\), and so Lemma 9 is proven. Then, by Lemmas 1 7 and 9 , strong politeness is obtained.

Theorem 3

If \(\varSigma \) is a datatypes signature and all sorts in \(\mathbf{{Struct}}_{\varSigma }\) are finite, then \(\mathcal{T}_{\varSigma }\) is strongly polite w.r.t. \(\mathbf{Elem}_{\varSigma }\).

4.3 Combining Finite and Inductive Datatypes

Now we consider the general case. Let \(\varSigma \) be a datatypes signature. We prove that \(\mathcal{T}_{\varSigma }\) is strongly polite w.r.t. \(\mathbf{Elem}\). We show that there are datatypes signatures \(\varSigma _{1},\varSigma _{2}\subseteq \varSigma \) such that \(\mathcal{T}_{\varSigma }=\mathcal{T}_{\varSigma _{1}}\oplus \mathcal{T}_{\varSigma _{2}}\), and then use Theorem 1. In \(\varSigma _{1}\), inductive sorts are excluded, while in \(\varSigma _{2}\), finite sorts are considered to be element sorts.

Formally, we set \(\varSigma _{1}\) as follows: where \(\mathbf{Elem}_{\varSigma _{1}}=\mathbf{Elem}_{\varSigma }\) and \(\mathbf{{Struct}}_{\varSigma _{1}}={ Fin}(\varSigma )\). \(\mathcal{F}_{\varSigma _{1}}={\mathcal{C}}{\mathcal{O}}_{\varSigma _{1}}\uplus {\mathcal{S}}{\mathcal{E}}_{\varSigma _{1}}\), where \({\mathcal{C}}{\mathcal{O}}_{\varSigma _{1}}=\{c:\sigma _{1}\times \ldots \times \sigma _{n}\rightarrow \sigma \mid c\in {\mathcal{C}}{\mathcal{O}}_{\varSigma },\sigma \in \mathbf{{Struct}}_{\varSigma _{1}}\}\) and \({\mathcal{S}}{\mathcal{E}}_{\varSigma _{1}}\) and \({\mathcal{P}}_{\varSigma _{1}}\) are the corresponding selectors and testers. Notice that if \(\sigma \) is finite and \(c:\sigma _{1}\times \ldots \times \sigma _{n}\rightarrow \sigma \) is in \({\mathcal{C}}{\mathcal{O}}_{\varSigma }\), then \(\sigma _{i}\) must be finite or in \(\mathbf{Elem}_{\varSigma }\) for every \(1\le i\le n\). Next, we set \(\varSigma _{2}\) as follows: \(\mathcal{S}_{\varSigma _{2}}=\mathbf{Elem}_{\varSigma _{2}}\uplus \mathbf{{Struct}}_{\varSigma _{2}}\), where \(\mathbf{Elem}_{\varSigma _{2}}=\mathbf{Elem}_{\varSigma }\cup { Fin}(\varSigma )\) and \(\mathbf{{Struct}}_{\varSigma _{2}}={ Ind}(\varSigma )\). \(\mathcal{F}_{\varSigma _{2}}={\mathcal{C}}{\mathcal{O}}_{\varSigma _{2}}\uplus {\mathcal{S}}{\mathcal{E}}_{\varSigma _{2}}\), where \({\mathcal{C}}{\mathcal{O}}_{\varSigma _{2}}=\{c:\sigma _{2}\times \ldots \times \sigma _{n}\rightarrow \sigma \mid c\in {\mathcal{C}}{\mathcal{O}}_{\varSigma },\sigma \in \mathbf{{Struct}}_{\varSigma _{2}}\}\) and \({\mathcal{S}}{\mathcal{E}}_{\varSigma _{2}}\) and \({\mathcal{P}}{\varSigma _{2}}\) are the corresponding selectors and testers. Thus, \(\mathcal{T}_{\varSigma }=\mathcal{T}_{\varSigma _{1}}\oplus \mathcal{T}_{\varSigma _{2}}\). Now set \(S=\mathbf{Elem}_{\varSigma }\cup { Fin}(\varSigma )\), \(S_{1}=\mathbf{Elem}_{\varSigma }\), \(S_{2}=\mathbf{Elem}_{\varSigma }\cup { Fin}(\varSigma )\), \(T_{1}=\mathcal{T}_{\varSigma _{1}}\), and \(T_{2}=\mathcal{T}_{\varSigma _{2}}\).

By Theorem 3, \(T_{1}\) is strongly polite w.r.t. \(S_{1}\) and by Theorem 2, \(T_{2}\) is strongly polite w.r.t. \(S_{2}\). By Theorem 1 we have:

Theorem 4

If \(\varSigma \) is a datatypes signature then \(\mathcal{T}_{\varSigma }\) is strongly polite w.r.t. \(\mathbf{Elem}_{\varSigma }\).

Remark 1

A concrete witness for \(\mathcal{T}_{\varSigma }\) in the general case, that we call \({ wtn}_{\varSigma }\), is obtained by first applying the witness from Definition 11 and then applying the witness from Definition 12 on the literals that involve finite sorts. A direct finite witnessability proof can be obtained by using the same arguments from the proofs of Lemmas  4 and 9. This witness is simpler than the one produced in the proof from [14] of Theorem 1, that involves purification and arrangements. In our case, we do not consider arrangements, but instead notice that the resulting function is additive, and hence ensures strong finite witnessability.

5 Axiomatizations

In this section, we discuss the possible connections between the politeness of \(\mathcal{T}_{\varSigma }\) and some axiomatizations of trees. We show how to get a reduction of any \(\mathcal{T}_{\varSigma }\)-satisfiability problem into a satisfiability problem modulo an axiomatized theory of trees. The latter can be decided using syntactic unification.

Let \(\varSigma \) be a datatypes signature. The set \({ TREE}_{\varSigma }^{*}\) of axioms is defined as the union of all the sets of axioms in Fig. 1 (where upper case letters denote implicitly universally quantified variables). Let \({ TREE}_{\varSigma }\) be the set obtained from \({ TREE}_{\varSigma }^{*}\) by dismissing \({ Ext _{1}}\) and \({ Ext _{2}}\). Note that because of \( Acyc \), we have that \({ TREE}_{\varSigma }\) is infinite (that is, consists of infinitely many axioms) unless all sorts in \(\mathbf{{Struct}}\) are finite. \({ TREE}_{\varSigma }\) is a generalization of the theory of Absolutely Free Data Structures (AFDS) from [9] to many-sorted signatures with selectors and testers. In what follows we identify \({ TREE}_{\varSigma }\) (and \({ TREE}_{\varSigma }^{*}\)) with the class of structures that satisfy them when there is no ambiguity.

Proposition 2

Every \({ TREE}_{\varSigma }^{*}\)-unsatisfiable formula is \(\mathcal{T}_{\varSigma }\)-unsatisfiable.

Fig. 1.
figure 1

Axioms for \({ TREE}_{\varSigma }\) and \({ TREE}_{\varSigma }^{*}\)

Remark 2

Along the lines of [1], a superposition calculus can be applied to get a \({ TREE}_{\varSigma }\)-satisfiability procedure. Such a calculus has been used in [6, 9] for a theory of trees with selectors but no testers. To handle testers, one can use a classical encoding of predicates into first-order logic with equality, by representing an atom \(is_c(x)\) as a flat equality \({ Is}_c(x) = \mathbb {T}\) where \({ Is}_c\) is now a unary function symbol and \(\mathbb {T}\) is a constant. Then, a superposition calculus dedicated to \({ TREE}_{\varSigma }\) can be obtained by extending the standard superposition calculus [1] with some expansion rules, one for each axiom of \({ TREE}_{\varSigma }\) [9]. For the axioms \({{ Is}_{1}}\) and \({{ Is}_{2}}\), the corresponding expansion rules are respectively \( x = c(x_1,\dots ,x_n) \vdash { Is}_c(x) = \mathbb {T} \text { if }c\in {\mathcal{C}}{\mathcal{O}}\), and \(x = d(x_1,\dots ,x_n) \vdash { Is}_c(x) \ne \mathbb {T} \text { if }c,d \in {\mathcal{C}}{\mathcal{O}}, c \ne d\). Further, consider the theory of finite trees defined from \({ TREE}_{\varSigma }\) by dismissing \( Proj , {{ Is}_{1}}\) and \({{ Is}_{2}}\). Being defined by Horn clauses, it is convex. Further, it is a Shostak theory [16, 18, 24] admitting a solver and a canonizer [9]. The solver is given by a syntactic unification algorithm [2] and the canonizer is the identity function. The satisfiability procedure built using the solver and the canonizer can be applied to decide \({ TREE}_{\varSigma }\)-satisfiability problems containing \({\varSigma }_{\mid {\mathcal{C}}{\mathcal{O}}}\)-atoms.

The following result shows that any \(\mathcal{T}_{\varSigma }\)-satisfiability problem can be reduced to a \({ TREE}_{\varSigma }\)-satisfiability problem. This leads to a \(\mathcal{T}_{\varSigma }\)-satisfiability procedure.

Proposition 3

Let \(\varSigma \) be a finite datatypes signature and \(\varphi \) any conjunction of flat \(\varSigma \)-literals including an arrangement over the variables in \(\varphi \). Then, there exists a \(\varSigma \)-formula \(\varphi '\) such that:

  1. 1.

    \(\varphi \) and \(\exists \overrightarrow{w}\,.\,\varphi '\) are \(\mathcal{T}_{\varSigma }\)-equivalent, where \(\overrightarrow{w}={ vars}_{}({\varphi '}) \backslash { vars}_{}({\varphi })\).

  2. 2.

    \(\varphi '\) is \(\mathcal{T}_{\varSigma }\)-satisfiable iff \(\varphi '\) is \({ TREE}_{\varSigma }\)-satisfiable.

Proposition 3 can be easily lifted to any conjunction of \(\varSigma \)-literals \(\varphi \) by flattening and then guessing all possible arrangements over the variables. Further, \(\exists \overrightarrow{w}\,.\,\varphi '\) and \(\varphi \) are not only \(\mathcal{T}_{\varSigma }\)-equivalent but also \({ TREE}_{\varSigma }^{*}\)-equivalent. As a consequence, Proposition 3 also holds when stated using \({ TREE}_{\varSigma }^{*}\) instead of \(\mathcal{T}_{\varSigma }\).

We conclude this section with a short discussion on the connection to Sect. 4. Both the current section and Sect. 4 rely on two constructions: (i) A formula transformation (\({ wtn}_{\varSigma }\) in Sect. 4, \(\varphi \mapsto \varphi '\) in the current section); and (ii) A small model construction (finite witnessability in Sect. 4, equisatisfiability between \(\mathcal{T}_{\varSigma }\) and \({ TREE}\) in Proposition 3). While these constructions are similar in both sections, they are not the same. A nice feature of the constructions of Sect. 4 is that they clearly separate between steps (i) and (ii). The witness is very simple, and amounts to adding to the input formula literals and disjunctions that trivially follow from the original formula in \(\mathcal{T}_{\varSigma }\). Then, the resulting formula is post-processed in step (ii), according to a given satisfying interpretation. Having a satisfying interpretation allows us to greatly simplify the formula, and the simplified formula is useful for the model construction. In contrast, the satisfying \({ TREE}_{\varSigma }\)-interpretation that we start with in step (ii) of the current section is not necessarily a \(\mathcal{T}_{\varSigma }\)-interpretation, which makes the approach of Sect. 4 incompatible, compared to the syntactic unification approach that we employ here. For that, some of the post-processing steps of Sect. 4 are employed in step (i) itself, in order to eliminate all testers and as much selectors as possible. In addition, a pre-processing is applied in order to include an arrangement. The constructed interpretation finitely witnesses \(\varphi '\) and so this technique can be used to produce an alternative proof of strong politeness.

6 Conclusion

In this paper we have studied the theory of algebraic datatypes, as it is defined by the SMT-LIB 2 standard. Our investigation included both finite and inductive datatypes. For this theory, we have proved that it is strongly polite, making it amenable for combination with other theories by the polite combination method. Our proofs used the notion of additive witnesses, also introduced in this paper. We concluded by extending existing axiomatizations and a decision procedure of trees to support this theory of datatypes.

There are several directions for further research that we plan to explore. First, we plan to continue to prove that more important theories are strongly polite, with an eye to recent extensions of the datatypes theory, namely datatypes with shared selectors [23] and co-datatypes [22]. Second, we envision to further investigate the possibility to prove politeness using superposition-based satisfiability procedures. Third, we plan to study extensions of the theory of datatypes corresponding to finite trees including function symbols with some equational properties such as associativity and commutativity to model data structures such as multisets [25]. We want to focus on the politeness of such extensions. Initial work in that direction has been done in [5], that we plan to build on.