
1 Introduction

Many cryptographic primitives rest on the assumption that their building blocks behave as perfectly random functions. This is the case for, among many others, encryption modes [4], authenticators [5, 9], or random permutations [28]. Yet, for all their utility, very few pseudorandom functions are actually available to practitioners. Instead, the leading cryptographic building block is the pseudorandom permutation, also known as the block cipher. It is therefore common practice to employ block ciphers as stand-ins for pseudorandom functions.

To a first approximation, this solves the problem. The PRP-PRF switch [6, 8, 13, 21, 24] tells us that a PRF can be safely replaced by a PRP up to approximately \(2^{n/2}\) queries. With large blocks this is often acceptable, but for lightweight block ciphers, whose number has grown tremendously in recent years (e.g., [1, 2, 11, 12, 18, 20, 22, 27, 46, 51]), this \(2^{n/2}\) birthday bound severely limits the application range. For example, Bhargavan and Leurent [10] recently presented practical collision attacks on TLS if a 64-bit cipher is used.
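To make the birthday limitation concrete, the following sketch evaluates the usual switching term \(q(q-1)/2^{n+1}\) for a 64-bit block. The function name `switch_bound` and the choice \(q=2^{n/2}\) are our own illustration and are not taken from the cited works.

```python
from fractions import Fraction

def switch_bound(q: int, n: int) -> Fraction:
    """Standard PRP-PRF switching term q(q-1)/2^(n+1) for q queries on n-bit blocks."""
    return Fraction(q * (q - 1), 2 ** (n + 1))

n = 64
q = 2 ** (n // 2)                  # about 2^32 queries
term = switch_bound(q, n)          # just below 1/2
data_bytes = q * (n // 8)          # amount of data processed at that point
print(float(term), data_bytes)
```

With these parameters the term is already close to 1/2 after \(2^{35}\) bytes (32 GiB) of data, which is well within reach in practice.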

In order to save these ciphers from obsolescence, various PRP-to-PRF constructions have been presented that achieve security beyond the \(2^{n/2}\) security bound. We can categorize these into truncation-based solutions and xor-based solutions (see Footnote 1). Here and throughout, we simply talk about permutations to refer to block ciphers instantiated with a secret key, unless explicitly stated otherwise.

Truncation. Hall et al. [21] suggested simple truncation. Bellare and Impagliazzo [3] and Gilboa and Gueron [19] proved that truncating an n-bit permutation by \(m < n\) bits has security up to approximately \(2^{\frac{m+n}{2}}\) queries. This result was, as a matter of fact, already derived around 20 years earlier by Stam [47], albeit in a non-cryptographic context.

Xor of Permutations. The xor (or more generally, sum) of two permutations,

$$\begin{aligned} \mathrm {XoP}^{p_1,p_2}(x) = p_1(x) \oplus p_2(x)\,, \end{aligned}$$
(1)

where \(p_1,p_2\) are two permutations, was initially mentioned by Bellare et al. [7] as a “natural” PRP-to-PRF method, and was later analyzed by Lucks [29] and Bellare and Impagliazzo [3]. Patarin achieved \(2^n/67\) security [39, 40, 42]. These results carry over directly to the construction that consists of the xor of three or more independent permutations [16, 30].

The xor of permutations evidently requires independence between \(p_1\) and \(p_2\). If only a single permutation is to be used, one can simulate this independence through domain separation, as suggested by Lucks [29] and Bellare and Impagliazzo [3]:

$$\begin{aligned} {\mathrm {XoP}'}^p(x) = p(0\Vert x) \oplus p(1\Vert x)\,. \end{aligned}$$
(2)

Patarin [40] proved that this single permutation construction achieves a similar level of security as \(\mathrm {XoP}\).
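As a toy illustration of (1) and (2), the sketch below models permutations as lookup tables over a small block size. The helper names (`random_perm`, `xop`, `xop_single`) are hypothetical, and we assume that the input to \({\mathrm {XoP}'}^p\) has \(n-1\) bits so that the prefixed value fits the n-bit permutation.

```python
import random

def random_perm(bits, rng):
    """Uniformly random permutation of {0, ..., 2^bits - 1} as a lookup table."""
    table = list(range(2 ** bits))
    rng.shuffle(table)
    return table

def xop(p1, p2, x):
    """XoP of (1): xor of two independent n-bit permutations."""
    return p1[x] ^ p2[x]

def xop_single(p, x, n):
    """XoP' of (2): p(0 || x) xor p(1 || x) for an (n-1)-bit input x (assumed width)."""
    return p[x] ^ p[(1 << (n - 1)) | x]

rng = random.Random(1)
n = 4
p1, p2, p = (random_perm(n, rng) for _ in range(3))
outputs = [xop(p1, p2, x) for x in range(2 ** n)]
outputs_single = [xop_single(p, x, n) for x in range(2 ** (n - 1))]
```

Note that \({\mathrm {XoP}'}^p\) never outputs 0, since \(p(0\Vert x)\ne p(1\Vert x)\) for a permutation p; this is one of the small deviations from a random function that the security bound has to absorb.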

A New Contender. At CRYPTO 2016, Cogliati and Seurin [17] introduced the Encrypted Davies-Meyer (EDM) construction (see Fig. 1a):

$$\begin{aligned} \mathrm {EDM}^{p_1,p_2}(x) = p_2(p_1(x) \oplus x)\,, \end{aligned}$$
(3)

where \(p_1,p_2\) are two permutations. Cogliati and Seurin proved that \(\mathrm {EDM}^{p_1,p_2}\) behaves like a random function up to complexity \(2^{2n/3}\), and actually conjectured that \(2^n\) is possible.

\(\mathrm {EDM}^{p_1,p_2}\) shows structural differences with the xor of permutations, and these differences allowed Cogliati and Seurin to devise the misuse-resistant MAC function Encrypted Wegman-Carter with Davies-Meyer (EWCDM), defined as follows:

$$\begin{aligned} \mathrm {EWCDM}^{h,p_1,p_2}(\nu ,m) = p_2(p_1(\nu )\oplus \nu \oplus h(m))\,, \end{aligned}$$
(4)

where h is an almost xor universal hash function, \(p_1,p_2\) are two permutations, and where \(\nu \) denotes the nonce and m the message, which may be arbitrarily large. Cogliati and Seurin proved that \(\mathrm {EWCDM}^{h,p_1,p_2}\) achieves security up to \(2^{2n/3}\) in the nonce-respecting setting, and \(2^{n/2}\) security in the nonce-misusing setting. They likewise conjectured optimal \(2^n\) security in the nonce-respecting setting.
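The data flow of (4) can be sketched in a few lines. This is a toy model only: the block size `N = 8`, the table-based permutations, and the lazily sampled random function standing in for the AXU hash h are all our assumptions for illustration.

```python
import random

N = 8  # toy block size in bits (illustrative assumption)

def random_perm(rng):
    """Uniformly random permutation of {0, ..., 2^N - 1} as a lookup table."""
    table = list(range(2 ** N))
    rng.shuffle(table)
    return table

def make_hash(rng):
    """Toy stand-in for h: a lazily sampled random function from bytes to N-bit values."""
    cache = {}
    def h(m: bytes) -> int:
        if m not in cache:
            cache[m] = rng.randrange(2 ** N)
        return cache[m]
    return h

def ewcdm(h, p1, p2, nonce, m):
    """EWCDM of (4): p2(p1(nonce) xor nonce xor h(m))."""
    return p2[p1[nonce] ^ nonce ^ h(m)]

rng = random.Random(2016)
p1, p2, h = random_perm(rng), random_perm(rng), make_hash(rng)
tag = ewcdm(h, p1, p2, nonce=5, m=b"hello")
```

A random function is \(2^{-N}\)-AXU, which is why it can serve as a placeholder here; a real instantiation would use a keyed polynomial or similar hash instead.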

1.1 Our Contribution

We improve the security of \(\mathrm {EDM}^{p_1,p_2}\) as well as \(\mathrm {EWCDM}^{h,p_1,p_2}\) from \(2^{2n/3}\), as derived by Cogliati and Seurin [17], to \(2^n/(67n)\). Furthermore, we introduce the dual of \(\mathrm {EDM}\), the Encrypted Davies-Meyer Dual (EDMD) construction:

$$\begin{aligned} \mathrm {EDMD}^{p_1,p_2}(x) = p_2(p_1(x)) \oplus p_1(x)\,. \end{aligned}$$
(5)

The dual is depicted in Fig. 1b, and as can be seen from a simple comparison with \(\mathrm {EDM}^{p_1,p_2}\) of Fig. 1a, the constructions are very much related, and equally expensive. We show that the \(\mathrm {EDMD}\) construction achieves security up to \(2^n/67\) queries.

Fig. 1. Encrypted Davies-Meyer (a) and its dual (b). The dashed line represents the necessary addition to yield EWCDM.
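The two constructions of (3) and (5) differ only in where the feed-forward is taken. A minimal sketch with table-based toy permutations (the helper names and the 4-bit block size are our assumptions) reads:

```python
import random

def random_perm(bits, rng):
    """Uniformly random permutation of {0, ..., 2^bits - 1} as a lookup table."""
    table = list(range(2 ** bits))
    rng.shuffle(table)
    return table

def edm(p1, p2, x):
    """EDM of (3): the input x is fed forward before the second permutation."""
    return p2[p1[x] ^ x]

def edmd(p1, p2, x):
    """EDMD of (5): the intermediate value p1(x) is fed forward after the second permutation."""
    u = p1[x]
    return p2[u] ^ u

rng = random.Random(7)
n = 4
p1, p2 = random_perm(n, rng), random_perm(n, rng)
edm_outputs = [edm(p1, p2, x) for x in range(2 ** n)]
edmd_outputs = [edmd(p1, p2, x) for x in range(2 ** n)]
```

Both functions make exactly one call to each permutation per evaluation, which matches the remark that they are equally expensive.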

Mirror Theory. The backbone of our security analysis is Patarin’s mirror theory [31, 36, 40, 43], a very powerful but rather unknown technique. We refurbish and modernize it in Sect. 3 in order to be able to neatly apply it in our analyses.

At a basic level, the idea of Patarin’s mirror theory is to consider \(q\ge 1\) equations in \(r\ge q\) unknowns, and to determine a lower bound on the number of possible solutions to the unknowns. Some conditions naturally apply: the q equations are of the form \(P_a\oplus P_b = \lambda \) (see Footnote 2), where \(P_a\) and \(P_b\) are two unknowns, and the solution to the unknowns should not contain collisions.

Consider the following example system of equations:

$$\begin{aligned} P_a\oplus P_b = \lambda _1\,, \,P_b\oplus P_c = \lambda _2\,, \,P_d\oplus P_e = \lambda _3\,. \end{aligned}$$
(6)

We have \(2^n\) choices for \(P_a\), after which \(P_b\) is determined by \(\lambda _1\) and \(P_c\) by \(\lambda _2\). Next, we have \(2^n-3\) options for \(P_d\) (as \(P_d\) should not collide with \(P_a\), \(P_b\), and \(P_c\)), after which \(P_e\) is determined by \(\lambda _3\). This naive counting gives \(2^n(2^n-3)\) solutions to the system of equations, but it disregards two potential problems: (i) the choice may result in a collision in the unknowns and (ii) the system of equations may be inconsistent in the first place. Problem (i) may occur in a straightforward way if, for instance, \(\lambda _1=0\), as in this case the first equation states that \(P_a=P_b\). It could also happen in a more delicate setting, for example if \(P_b=P_e\) (even though \(P_d\) does not collide with \(P_a\)). To understand problem (ii), consider the system of equations of (6) appended with equation \(P_a\oplus P_c = \lambda _4\). From the first two equations of (6) and the appended equation we can conclude that the system is inconsistent if \(\lambda _1\oplus \lambda _2\oplus \lambda _4\ne 0\).
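The counting argument can be checked by brute force for a tiny block size. The sketch below enumerates all collision-free assignments to \((P_a,\ldots ,P_e)\) for system (6) with \(n=4\); the concrete values \(\lambda _1=1\), \(\lambda _2=2\), \(\lambda _3=4\) are our own choice. Such a small n lies far outside the range where the mirror theorem formally applies, so the comparison is purely illustrative.

```python
from itertools import permutations
from math import perm

def count_solutions(n, lam1, lam2, lam3):
    """Count assignments of pairwise distinct n-bit values (Pa, ..., Pe) satisfying (6)."""
    count = 0
    for pa, pb, pc, pd, pe in permutations(range(2 ** n), 5):
        if pa ^ pb == lam1 and pb ^ pc == lam2 and pd ^ pe == lam3:
            count += 1
    return count

n = 4
exact = count_solutions(n, 1, 2, 4)
naive = 2 ** n * (2 ** n - 3)              # the naive count from the text
mirror = perm(2 ** n, 5) / 2 ** (3 * n)    # (2^n)_r / 2^(nq) with r = 5, q = 3
print(exact, naive, mirror)
```

For these parameters the exact count is 160, which sits between the value \((2^n)_r/2^{nq}\approx 127.97\) and the naive count 208: the naive count overestimates because it ignores collisions such as \(P_e\in \{P_a,P_b,P_c\}\). Setting \(\lambda _1=0\) or \(\lambda _1=\lambda _2\) forces a collision and the count drops to 0.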

If problem (i) or (ii) occurs, the system of equations naturally has no solution. Disregarding these two problems, the fundamental mirror theorem states that if the number q of equations is “small enough,” then the number of solutions to the r unknowns is at least \(\frac{(2^n)_r}{2^{nq}}\), where \((2^n)_r\) is the falling factorial. What it means for q to be “small enough” depends on the system of equations under investigation. We refer to Theorem 2 for the details. We will in fact use a generalization of this theorem, where the solution to the unknowns may contain some collisions (see Theorem 3).

The bound itself is merely a combinatorial lower bound whose relevance is not that clear at first sight. Its strength lies in the fact that it can be nicely employed within the H-coefficient technique by Patarin [15, 33, 37], and in particular, it forms a crucial part in proving the (almost) optimal security of \(\mathrm {EDM}\), \(\mathrm {EWCDM}\), and EDMD.

Patarin’s mirror theorem (or variants thereof) has been used already to analyze the security of Feistel constructions and the xor of permutations by Patarin [34, 35, 36, 38, 39, 40, 41, 42, 45], Cogliati et al. [16], and Volte et al. [48, 49]. Iwata et al. [26] recently pointed out that a result from Patarin’s mirror theorem implies almost optimal security of CENC [25].

Security of EDM. By looking at \(\mathrm {EDM}^{p_1,p_2}\) from a different angle, we can prove \(2^n/(67n)\) security for the case of independent permutations \(p_1,p_2\) (Sect. 4). In more detail, we regard \(\mathrm {EDM}^{p_1,p_2}\) as a sum of permutations in the middle, where an evaluation \(y=\mathrm {EDM}^{p_1,p_2}(x)\) corresponds to a xor of permutations as \(p_1(x)\oplus p_2^{-1}(y) = x\). After this we only need to overcome a few technicalities in order to apply the mirror theorem.

Security of EWCDM. Our analysis of \(\mathrm {EDM}^{p_1,p_2}\), namely the restructuring of the data flows, generalizes to \(\mathrm {EWCDM}^{h,p_1,p_2}\) almost verbatim. In more detail, we prove in Sect. 5 that, in the nonce-respecting setting, \(\mathrm {EWCDM}^{h,p_1,p_2}\) achieves close to optimal \(2^n/(67n)\) PRF security. The analysis straightforwardly generalizes to MAC security. Security in the nonce-misusing setting cannot exceed the birthday bound as derived in [17].

Security of EDMD. Similar techniques allow us to prove optimal security of \(\mathrm {EDMD}\) based on independent permutations. However, in Sect. 6 we observe that its security reduces quite elegantly to the xor of two independent permutations, \(\mathrm {XoP}^{p_1,p_2}\) of (1). Therefore, \(\mathrm {EDMD}\) based on independent permutations achieves \(2^n/67\) security.

Towards a Single Permutation. Our results on \(\mathrm {EDM}^{p_1,p_2}\) and \(\mathrm {EWCDM}^{h,p_1,p_2}\) satisfactorily resolve the conjecture put forward by Cogliati and Seurin [17] up to a logarithmic factor, and our construction \(\mathrm {EDMD}^{p_1,p_2}\) even achieves better security than \(\mathrm {EDM}^{p_1,p_2}\). Cogliati and Seurin furthermore conjectured that optimal security is already achieved in the identical permutation case, i.e., where \(p_1=p_2\). We support this conjecture, and think that it also holds for the dual, but it appears unlikely that the techniques used in this work can be employed to prove optimal security of \(\mathrm {EDM}^p\) or \(\mathrm {EDMD}^p\), let alone \(\mathrm {EWCDM}^{h,p}\). In Sect. 7 we give informal justification for this claim, and discuss further possibilities to investigate \(\mathrm {EDM}^p\) and \(\mathrm {EDMD}^p\).

A Dual of EWCDM? An earlier version of this article suggested, as a side result, the dual construction

$$\begin{aligned} \mathrm {EWCDMD}^{h,p_1,p_2}(\nu ,m) = p_2(p_1(\nu ) \oplus h(m)) \oplus p_1(\nu ) \oplus h(m)\,, \end{aligned}$$
(7)

with a claimed security of \(2^n/(67n)\). However, Nandi [32] pointed out that \(\mathrm {EWCDMD}^{h,p_1,p_2}\) can be seen as a cascade of two non-injective functions, therewith having twice as many collisions as expected, and can be distinguished from random in about \(2^{n/2}\) queries. Closer inspection of the security proof revealed a very subtle issue in the application of the mirror theory, namely that it cannot readily handle systems of equations with a conditional existence of (in-)equalities, e.g., where two unknowns must be equal if two other unknowns satisfy a certain condition (see Footnote 3). Broadly speaking, the problem is similar to (but more subtle than) issues encountered when analyzing a single permutation variant \(\mathrm {EDM}^p\), \(\mathrm {EDMD}^p\), or \(\mathrm {EWCDM}^{h,p}\) (cf. Sect. 7). As such, we consider it to be a non-trivial exercise to derive a dual of \(\mathrm {EWCDM}^{h,p}\) that provably achieves security beyond the birthday bound. We remark that \(\mathrm {EWCDMD}^{h,p_1,p_2}\) may still achieve MAC security beyond the birthday bound; however, we have not considered MAC security in this work as it is beyond the scope of the article.

2 Preliminaries

For a natural number n, \(\{0,1\}^{n}\) denotes the set of all n-bit strings, and we denote by \(\{0,1\}^{*}\) the set of bit strings of arbitrary length. \(\mathsf {func}(n)\) denotes the set of all functions on \(\{0,1\}^{n}\), and \(\mathsf {perm}(n)\) the set of all permutations. We denote by \(\mathsf {func}(n+*,n)\) the set of all functions with domain \(\{0,1\}^{n}\times \{0,1\}^{*}\) and range \(\{0,1\}^{n}\). For a natural number \(m\ge n\), we write \((m)_n=m(m-1)\cdots (m-n+1)\) as the falling factorial. For a set \(\mathcal {X}\), \(x\xleftarrow {{\scriptscriptstyle \$}}\mathcal {X}\) denotes uniformly random sampling of x from \(\mathcal {X}\).

2.1 Universal Hash Functions

For two non-empty sets \(\mathcal {X},\mathcal {Y}\), a family of hash functions \(H = \{h:\mathcal {X}\rightarrow \mathcal {Y}\}\) is said to be \(\epsilon \)-AXU (almost xor universal) if for any distinct \(x,x'\in \mathcal {X}\) and \(y\in \mathcal {Y}\), we have

$$\begin{aligned} \mathbf {Pr}\left[ h\xleftarrow {{\scriptscriptstyle \$}}H \,:\, h(x) \oplus h(x')=y\right] \le \epsilon \,. \end{aligned}$$
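The definition can be verified exhaustively for a small family. As an illustration we take the multiplication hash \(h_k(x)=k\cdot x\) over \(\mathrm {GF}(2^3)\); this family, the reduction polynomial \(x^3+x+1\), and the function names are our own choices and are not prescribed by the text.

```python
from fractions import Fraction

N = 3            # toy field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply a and b in GF(2^N) using shift-and-add with modular reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:       # degree reached N: reduce modulo POLY
            a ^= POLY
    return r

def axu_epsilon():
    """Largest probability over the key k that h_k(x) xor h_k(x') = y, for x != x'."""
    size = 2 ** N
    worst = 0
    for x in range(size):
        for x2 in range(size):
            if x == x2:
                continue
            for y in range(size):
                hits = sum(1 for k in range(size) if gf_mul(k, x) ^ gf_mul(k, x2) == y)
                worst = max(worst, hits)
    return Fraction(worst, size)

print(axu_epsilon())
```

Since \(k\cdot (x\oplus x')=y\) has exactly one solution k whenever \(x\ne x'\), the search returns \(\epsilon =2^{-3}\), the best possible value for this output size.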

2.2 Distinguishers

A distinguisher \(\mathcal {D}\) is a computationally unbounded adversary that is given adaptive access to an oracle \(\mathcal {O}\) and outputs a bit 0 / 1. For two oracles \(\mathcal {O}\) and \(\mathcal {P}\) with identical interface, we denote the distinguishing advantage of \(\mathcal {D}\) by

$$\begin{aligned} \varDelta _{\mathcal {D}}(\mathcal {O}\,;\, \mathcal {P}) = \mathbf {Pr}\left[ \mathcal {D}^{\mathcal {O}}\Rightarrow 1\right] - \mathbf {Pr}\left[ \mathcal {D}^{\mathcal {P}}\Rightarrow 1\right] \,. \end{aligned}$$
(8)

Throughout this work, we only consider computationally unbounded distinguishers whose complexities are solely measured by the number of queries to the oracle. Without loss of generality, it suffices to only focus on deterministic distinguishers, as for any probabilistic distinguisher there exists a deterministic one with at least the same success probability, and we will assume so henceforth.

2.3 H-Coefficient Technique

Central to our analysis will be the H-coefficient technique by Patarin [33, 37], and as a matter of fact, the mirror theory of Sect. 3 will be a useful tool within this technique. We will follow the renewed description of Chen and Steinberger [15].

Consider two oracles \(\mathcal {O}\) and \(\mathcal {P}\), and an information-theoretic deterministic distinguisher \(\mathcal {D}\) with query complexity q that tries to distinguish both oracles: \(\varDelta _{\mathcal {D}}(\mathcal {O}\,;\, \mathcal {P})\) of (8). The communication that \(\mathcal {D}\) has with its oracle is recorded in a transcript \(\tau \). Denote by \(X_{\mathcal {O}}\) the probability distribution of transcripts when \(\mathcal {D}\) is interacting with \(\mathcal {O}\), and similarly by \(X_{\mathcal {P}}\) the distribution of transcripts for interaction with \(\mathcal {P}\). Say that a transcript is “attainable” if \(\mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] >0\) and denote by \(\mathcal {T}\) the set of all attainable transcripts.

The H-coefficient technique states the following:

Theorem 1

(H-coefficient technique). Let \(\delta ,\varepsilon \in [0,1]\). Consider a partition \(\mathcal {T}= \mathcal {T}_{\mathrm {bad}}\cup \mathcal {T}_{\mathrm {good}}\) of the set of attainable transcripts such that

  1. \(\mathbf {Pr}\left[ X_{\mathcal {P}}\in \mathcal {T}_{\mathrm {bad}}\right] \le \delta \),

  2. for all \(\tau \in \mathcal {T}_{\mathrm {good}}\), \(\displaystyle \frac{\mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] }{\mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] } \ge 1-\varepsilon \).

Then, the distinguishing advantage satisfies \(\varDelta _{\mathcal {D}}(\mathcal {O}\,;\, \mathcal {P}) \le \delta + \varepsilon \).

Proof

A proof of the technique is given among others in [14, 15], and we repeat it briefly. As we consider a deterministic distinguisher \(\mathcal {D}\), its advantage is at most the statistical distance between the distributions of views \(X_{\mathcal {O}}\) and \(X_{\mathcal {P}}\):

$$\begin{aligned} \varDelta _{\mathcal {D}}(\mathcal {O}\,;\, \mathcal {P})&\le \frac{1}{2} \sum _{\tau \in \mathcal {T}} \big | \mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] - \mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] \big |\\&= \sum _{\begin{array}{c} \tau \in \mathcal {T}:\mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] >\mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] \end{array}} \big ( \mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] - \mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] \big )\\&= \sum _{\begin{array}{c} \tau \in \mathcal {T}:\mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] >\mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] \end{array}} \mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] \left( 1 - \frac{\mathbf {Pr}\left[ X_{\mathcal {O}}=\tau \right] }{\mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] } \right) . \end{aligned}$$

Making a distinction between bad and good views, we find:

$$\begin{aligned} \varDelta _{\mathcal {D}}(\mathcal {O}\,;\, \mathcal {P}) \le \sum _{\tau \in \mathcal {T}_{\mathrm {bad}}} \mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] + \sum _{\tau \in \mathcal {T}_{\mathrm {good}}} \mathbf {Pr}\left[ X_{\mathcal {P}}=\tau \right] \varepsilon \le \delta + \varepsilon \,, \end{aligned}$$

which completes the proof.    \(\square \)
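The statement of Theorem 1 can be exercised on a small hand-made example. In the sketch below, transcript distributions are given explicitly as dictionaries; the four-transcript distributions and all function names are hypothetical and serve only to show how \(\delta \), \(\varepsilon \), and the statistical distance relate.

```python
from fractions import Fraction as F

def h_coefficient_terms(real, ideal, bad):
    """Return (delta, epsilon) for the partition of attainable transcripts given by `bad`."""
    delta = sum(ideal[t] for t in bad)
    worst = max((1 - real.get(t, F(0)) / ideal[t] for t in ideal if t not in bad),
                default=F(0))
    return delta, max(F(0), worst)

def statistical_distance(real, ideal):
    """Half the L1 distance between the two transcript distributions."""
    keys = set(real) | set(ideal)
    return sum(abs(real.get(t, F(0)) - ideal.get(t, F(0))) for t in keys) / 2

ideal = {t: F(1, 4) for t in "abcd"}
real = {"a": F(1, 4), "b": F(1, 4), "c": F(3, 8), "d": F(1, 8)}

for bad in (set(), {"d"}):
    delta, eps = h_coefficient_terms(real, ideal, bad)
    assert statistical_distance(real, ideal) <= delta + eps
```

Declaring transcript d bad gives \(\delta =1/4\) and \(\varepsilon =0\), whereas an empty bad set gives \(\delta =0\) and \(\varepsilon =1/2\); in this example the first partition yields the tighter bound, which shows why the choice of partition matters.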

The basic idea of the technique is that a large number of transcripts are almost equally likely in both worlds, and the odd ones appear only with negligible probability \(\delta \). Note that the partitioning of \(\mathcal {T}\) into bad and good transcripts is directly reflected in the terms \(\delta \) and \(\varepsilon \) in the bound: if \(\mathcal {T}_{\mathrm {good}}\) is too large, \(\varepsilon \) will become large, whereas if \(\mathcal {T}_{\mathrm {bad}}\) is too large, \(\delta \) will become large.

For a given transcript \(\tau = \{(x_1,y_1),\ldots ,(x_q,y_q)\}\) consisting of q input/output tuples, we say that an oracle \(\mathcal {O}\) extends \(\tau \), denoted \(\mathcal {O}\vdash \tau \), if

$$\begin{aligned} \mathcal {O}(x_i)=y_i \end{aligned}$$

for \(i=1,\ldots ,q\).

2.4 Pseudorandom Function Security

Let \(F^{p_1,p_2}\in \mathsf {func}(n)\) be a fixed-input-length function that internally uses two permutations \(p_1,p_2\in \mathsf {perm}(n)\). We define the PRF security of F against a distinguisher \(\mathcal {D}\), i.e., its indistinguishability from a random function, by

$$\begin{aligned} \mathbf {Adv}_{F^{p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) = \varDelta _{\mathcal {D}}(F^{p_1,p_2} \,;\, f) \end{aligned}$$
(9)

where the probabilities are taken over the drawing of \(p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\) and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n)\).

The model generalizes to the security of variable-input-length functions as follows. Let \(F^{h,p_1,p_2}\in \mathsf {func}(n+*,n)\) be a variable-input-length function that internally uses two permutations \(p_1,p_2\in \mathsf {perm}(n)\) and a universal hash function h from some hash function family H. We define the PRF security of F against a distinguisher \(\mathcal {D}\) by

$$\begin{aligned} \mathbf {Adv}_{F^{h,p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) = \varDelta _{\mathcal {D}}(F^{h,p_1,p_2} \,;\, f) \end{aligned}$$
(10)

where the probabilities are taken over the drawing of \(h\xleftarrow {{\scriptscriptstyle \$}}H\), \(p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\), and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n+*,n)\). For variable-input-length functions, we will impose that \(\mathcal {D}\) is nonce-respecting, i.e., it never makes two queries to its oracle with the same first component.

Remark 1

We focus on PRF security in the information-theoretic setting, where the underlying primitives are secret permutations uniformly randomly drawn from \(\mathsf {perm}(n)\). Our results straightforwardly generalize to the complexity-theoretic setting, where the permutations are instantiated as \(E_{k_1},E_{k_2}\) for secret keys \(k_1,k_2\). The bounds of this work carry over with an additional loss of \(2\mathbf {Adv}_{E}^{\mathrm {prp}}(q)\), where \(\mathbf {Adv}_{E}^{\mathrm {prp}}(q)\) denotes the maximum advantage of distinguishing \(E_k\) for secret k from a uniformly random permutation in q queries. Note that in our analyses, the distinguisher can only induce forward evaluations of the underlying primitive. Therefore, the block cipher only needs to be prp secure, and not necessarily sprp secure.

3 Mirror Theory

We revisit an important result from Patarin’s mirror theory [36, 40] in our context of pseudorandom function security. For the sake of presentation and interoperability with the results in the remainder of this paper, we use different parametrization and naming of definitions.

3.1 System of Equations

Let \(q\ge 1\) and \(r\ge 1\). Let \(\mathcal {P}= \{P_1,\ldots ,P_r\}\) be r unknowns, and consider a system of q equations

$$\begin{aligned} \mathcal {E}= \{ P_{a_1}\oplus P_{b_1} = \lambda _1, \cdots , P_{a_q}\oplus P_{b_q} = \lambda _q\} \end{aligned}$$
(11)

where \(a_i,b_i\) for \(i=1,\ldots ,q\) are mapped to \(\{1,\ldots ,r\}\) using some surjective index mapping

$$\begin{aligned} \varphi : \{a_1,b_1,\ldots ,a_q,b_q\} \rightarrow \{1,\ldots ,r\}\,. \end{aligned}$$

Note that for a given system of equations, the index mapping is unique up to a reordering of the unknowns. There is a one-to-one correspondence between \(\mathcal {E}\) on the one hand and \((\varphi ,\lambda _1,\ldots ,\lambda _q)\) on the other hand, and the definitions below are mostly formalized based on the latter description (but it is convenient to think about them with respect to \(\mathcal {E}\)). For a subset \(I\subseteq \{1,\ldots ,q\}\) we define by \(\mathcal {M}_I\) the multiset

$$\begin{aligned} \mathcal {M}_I = \bigcup _{i\in I}\ \{\varphi (a_i),\varphi (b_i)\}\,. \end{aligned}$$

We give three definitions with respect to the system of equations \(\mathcal {E}\).

Definition 1

(circle-freeness). The system of equations \(\mathcal {E}\) is circle-free if there is no \(I\subseteq \{1,\ldots ,q\}\) such that the multiset \(\mathcal {M}_I\) has even multiplicity elements only.

Definition 2

(block-maximality). Let \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \cdots \cup \mathcal {R}_s\) be a partition of the r indices into s minimal “blocks” such that for all \(i\in \{1,\ldots ,q\}\) there exists an \(\ell \in \{1,\ldots ,s\}\) such that \(\{\varphi (a_i),\varphi (b_i)\}\subseteq \mathcal {R}_\ell \). The system of equations \(\mathcal {E}\) is \(\xi \)-block-maximal for \(\xi \ge 2\) if there is no \(\ell \in \{1,\ldots ,s\}\) such that \(|\mathcal {R}_\ell |> \xi \).

Definition 3

(non-degeneracy). The system of equations \(\mathcal {E}\) is non-degenerate if there is no \(I\subseteq \{1,\ldots ,q\}\) such that the multiset \(\mathcal {M}_I\) has exactly two odd multiplicity elements and such that \(\bigoplus _{i\in I} \lambda _i = 0\).

Informally, circle-freeness means that there is no linear combination of one or more equations in \(\mathcal {E}\) that is independent of the unknowns, block-maximality means that the unknowns can be partitioned into blocks of a certain maximum size such that there is no linear combination of two or more equations in \(\mathcal {E}\) that relates two unknowns \(P_a,P_b\) from different blocks \(\mathcal {R}_i,\mathcal {R}_j\), and non-degeneracy means that there is no linear combination of one or more equations that implies \(P_a=P_b\) for some \(P_a,P_b\in \mathcal {P}\).
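The three properties can be tested mechanically under a graph reading of the system, which is our own interpretation: unknowns are vertices, each equation is an edge labelled \(\lambda _i\), circle-freeness corresponds to the graph being a forest, blocks are connected components, and non-degeneracy requires that no two unknowns in a component are forced to be equal. The sketch below follows this reading with a simple union-find structure; the function name `analyze` is hypothetical.

```python
def analyze(equations, r):
    """Classify a system of equations P_a xor P_b = lam over unknowns 1..r.

    Returns (circle_free, xi, non_degenerate), where xi is the largest block size.
    The non-degeneracy flag is only meaningful when circle_free is True.
    """
    parent = list(range(r + 1))   # forest over the unknowns, index 0 unused
    offset = [0] * (r + 1)        # offset[v] = P_v xor P_parent[v]

    def find(v):
        acc = 0                   # accumulates P_v xor P_root
        while parent[v] != v:
            acc ^= offset[v]
            v = parent[v]
        return v, acc

    circle_free = True
    for a, b, lam in equations:
        root_a, xa = find(a)
        root_b, xb = find(b)
        if root_a == root_b:      # this equation closes a cycle
            circle_free = False
            continue
        parent[root_a] = root_b
        offset[root_a] = xa ^ xb ^ lam

    blocks = {}
    for v in range(1, r + 1):
        root, acc = find(v)
        blocks.setdefault(root, []).append(acc)
    xi = max(len(accs) for accs in blocks.values())
    # Two unknowns with the same offset to their root are forced to be equal.
    non_degenerate = all(len(set(accs)) == len(accs) for accs in blocks.values())
    return circle_free, xi, non_degenerate

system = [(1, 2, 1), (2, 3, 2), (4, 5, 4)]   # system (6) with lambda = (1, 2, 4)
print(analyze(system, 5))
```

For the example system (6) with these constants the sketch reports a circle-free, 3-block-maximal, non-degenerate system. Appending the equation \(P_1\oplus P_3=\lambda _4\) closes a circle, and choosing \(\lambda _1=\lambda _2\) makes the system degenerate because it forces \(P_1=P_3\).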

3.2 Main Result

The main theorem of Patarin’s mirror theory, simply dubbed “mirror theorem”, is the following. It corresponds to “Theorem \(P_i\oplus P_j\) for any \(\xi _{ max }\)” of Patarin [40, Theorem 6].

Theorem 2

(mirror theorem). Let \(\xi \ge 2\). Let \(\mathcal {E}\) be a system of equations over the unknowns \(\mathcal {P}\) that is (i) circle-free, (ii) \(\xi \)-block-maximal, and (iii) non-degenerate. Then, as long as \((\xi -1)^2\cdot r \le 2^n/67\), the number of solutions for \(\mathcal {P}\) such that \(P_a\ne P_b\) for all distinct \(a,b\in \{1,\ldots ,r\}\) is at least

$$\begin{aligned} \frac{(2^n)_r}{2^{nq}}\,. \end{aligned}$$

The quantity measured in the above theorem (the number of solutions...) is called \(h_r\) in [40]. \(H_r\) is subsequently defined as \(2^{nq}h_r\). The parameter H has slightly different meanings in [39, 41, 42], namely the number of oracles whose outputs could solve the system of equations. In the end, these definitions yielded the naming of the H-coefficient technique of Theorem 1. For the mirror theorem, we have opted to stick to the convention of [40] as its definition is pure in the sense that it is independent of the actual oracles in use.

In Appendix A, we give a proof sketch of Theorem 2, referring to [40] for the details. In the proof sketch, it becomes apparent that the side condition \((\xi -1)^2\cdot r\le 2^n/67\) can be improved (even up to \(2^n/16\)) quite easily. Patarin first derived the side condition symbolically and only then derived the specific constants. Knowing the constants in advance, we reversed the reasoning. However, to remain consistent with the theorem statement of [40], we deliberately opted to leave the 67 in; the improvement is nevertheless only constant. The term \((\xi -1)^2\) is present to cover worst-case systems of equations; it can be improved to \((\xi -1)\) in certain cases [44]. Fortunately, in most cases \(\xi \) is a small number and the loss is relatively insignificant.

3.3 Extension to Relaxed Inequality Conditions

We consider a generalization to the case where the condition that \(P_a\ne P_b\) whenever \(a\ne b\) is relaxed to some degree. In more detail, let \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \cdots \cup \mathcal {R}_t\) be any partition of the r indices. We will require that \(P_a\ne P_b\) whenever \(a,b\in \mathcal {R}_j\) are distinct for some \(j\in \{1,\ldots ,t\}\). Definition 3 generalizes in the obvious way in order to comply with this condition:

Definition 4

(relaxed non-degeneracy). The system of equations \(\mathcal {E}\) is relaxed non-degenerate with respect to partition \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \cdots \cup \mathcal {R}_t\) if there is no \(I\subseteq \{1,\ldots ,q\}\) such that the multiset \(\mathcal {M}_I\) has exactly two odd multiplicity elements from a single set \(\mathcal {R}_j\) (\(j\in \{1,\ldots ,t\}\)) and such that \(\bigoplus _{i\in I} \lambda _i = 0\).

Note that a relaxed non-degenerate system of equations may induce equations of the form \(P_a=P_b\) where \(a,b\) are from distinct index sets; such an equation does not make the system degenerate. The extension of Theorem 2 to relaxed inequality conditions is the following, which corresponds to [40, Theorem 7].

Theorem 3

(relaxed mirror theorem). Let \(\xi \ge 2\). Let \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \cdots \cup \mathcal {R}_t\) be any partition of the r indices. Let \(\mathcal {E}\) be a system of equations over the unknowns \(\mathcal {P}\) that is (i) circle-free, (ii) \(\xi \)-block-maximal, and (iii) relaxed non-degenerate with respect to partition \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \cdots \cup \mathcal {R}_t\). Then, as long as \((\xi -1)^2\cdot \max _j|\mathcal {R}_j| \le 2^n/67\), the number of solutions for \(\mathcal {P}\) such that \(P_a\ne P_b\) for all distinct \(a,b\in \mathcal {R}_j\) (\(j=1,\ldots ,t\)) is at least

$$\begin{aligned} \frac{\mathrm {NonEq}(\mathcal {R}_1,\ldots ,\mathcal {R}_t;\mathcal {E})}{2^{nq}}\,, \end{aligned}$$

where \(\mathrm {NonEq}(\mathcal {R}_1,\ldots ,\mathcal {R}_t;\mathcal {E})\) denotes the number of assignments to \(\mathcal {P}\) that satisfy \(P_a\ne P_b\) for all distinct \(a,b\in \mathcal {R}_j\) (\(j=1,\ldots ,t\)) as well as the inequalities imposed by \(\mathcal {E}\) (with the equalities of \(\mathcal {E}\) themselves dropped).

The quantity \(\mathrm {NonEq}(\mathcal {R}_1,\ldots ,\mathcal {R}_t;\mathcal {E})\) sounds rather technical, but for most systems it is fairly easy to determine. If \(P_a\oplus P_b=\lambda \ne 0\) is an equation in \(\mathcal {E}\), then this equation imposes \(P_a\ne P_b\); if in addition \(a,b\) are in distinct index sets, then this inequality \(P_a\ne P_b\) is an extra inequality on top of the ones suggested by \(\mathcal {R}_1,\ldots ,\mathcal {R}_t\). An obvious lower bound is

$$\begin{aligned} \mathrm {NonEq}(\mathcal {R}_1,\ldots ,\mathcal {R}_t;\mathcal {E}) \ge (2^n)_{|\mathcal {R}_1|}(2^n-(\xi -1))_{|\mathcal {R}_2|}\cdots (2^n-(\xi -1))_{|\mathcal {R}_t|}\,, \end{aligned}$$

as every unknown is in exactly one block, and this block imposes at most \(\xi -1\) additional inequalities on the unknowns. Better lower bounds can be derived for specific systems of equations. The relaxed theorem is equivalent to the original Theorem 2 if \(t=1\) and \(\mathcal {R}_1=\{1,\ldots ,r\}\).

3.4 Example

The strength of the mirror theorem becomes visible by considering the sum of permutations, \(\mathrm {XoP}^{p_1,p_2}\) of (1) and \({\mathrm {XoP}'}^p\) of (2). As a stepping stone to the analyses of \(\mathrm {EDM}\), \(\mathrm {EWCDM}\), and \(\mathrm {EDMD}\) in the remainder of the paper, we prove that \(\mathrm {XoP}^{p_1,p_2}\) is a secure PRF as long as \(q\le 2^n/67\). The proof is almost directly taken from [40] and is an immediate application of Theorem 3. Its single-key variant \({\mathrm {XoP}'}^p\) can be proved similarly from Theorem 2, provided \(2q\le 2^n/67\).

Proposition 1

For any distinguisher \(\mathcal {D}\) with query complexity at most \(q\le 2^n/67\), we have

$$\begin{aligned} \mathbf {Adv}_{\mathrm {XoP}^{p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) \le q/2^n\,. \end{aligned}$$
(12)

Proof

Let \(p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\) and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n)\). Consider any fixed deterministic distinguisher \(\mathcal {D}\) that has access to either \(\mathcal {O}=\mathrm {XoP}^{p_1,p_2}\) (real world) or \(\mathcal {P}=f\) (ideal world). It makes q construction queries recorded in a transcript \(\tau =\{(x_1,y_1),\ldots ,(x_q,y_q)\}\). Without loss of generality, we assume that \(x_i\ne x_j\) whenever \(i\ne j\).

In the real world, each tuple \((x_i,y_i)\in \tau \) corresponds to an evaluation of the function \(\mathrm {XoP}^{p_1,p_2}\) and thus to evaluations \(x_i\mapsto p_1(x_i)\) and \(x_i\mapsto p_2(x_i)\), such that \(p_1(x_i)\oplus p_2(x_i)=y_i\). Writing \(P_{2i-1}:=p_1(x_i)\) and \(P_{2i}:=p_2(x_i)\), the transcript \(\tau \) defines q equations on the unknowns:

$$\begin{aligned} {\begin{matrix} P_{1} \oplus P_{2} &{}= y_1\,,\\ P_{3} \oplus P_{4} &{}= y_2\,,\\ \vdots \!\!\;\qquad &{}\\ P_{2q-1} \oplus P_{2q} &{}= y_q\,. \end{matrix}} \end{aligned}$$
(13)

As \(x_i\ne x_j\) whenever \(i\ne j\), and additionally we use two independent permutations, all unknowns are formally distinct. In line with Sect. 3.1, denote the system of q equations of (13) by \(\mathcal {E}\), and let \(\mathcal {P}= \{P_1,\ldots ,P_{2q}\}\) be the 2q unknowns. We can divide the indices \(\{1,\ldots ,2q\}\) into two index sets: \(\mathcal {R}_1=\{1,3,\ldots ,2q-1\}\) are the indices corresponding to oracle \(p_1\) and \(\mathcal {R}_2=\{2,4,\ldots ,2q\}\) the indices corresponding to oracle \(p_2\).

Patarin’s H-coefficient technique of Theorem 1 states that \(\mathbf {Adv}_{\mathrm {XoP}^{p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) \le \varepsilon \), where \(\varepsilon \) is such that for any transcript \(\tau \) (we do not consider bad transcripts),

$$\begin{aligned} \frac{\mathbf {Pr}\left[ X_{\mathrm {XoP}^{p_1,p_2}}=\tau \right] }{\mathbf {Pr}\left[ X_{f}=\tau \right] } \ge 1-\varepsilon \,. \end{aligned}$$
(14)

For the computation of \(\mathbf {Pr}\left[ X_{\mathrm {XoP}^{p_1,p_2}}=\tau \right] \) and \(\mathbf {Pr}\left[ X_{f}=\tau \right] \), it suffices to compute the probability, over the drawing of the oracles, that a good transcript is obtained. For the real world \(\mathrm {XoP}^{p_1,p_2}\), the transcript \(\tau \) defines a system of equations \(\mathcal {E}\) which is circle-free, has q blocks of size 2 (so it is 2-block-maximal), and it is relaxed non-degenerate with respect to partition \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \mathcal {R}_2\). We can subsequently apply Theorem 3 for \(\xi =2\), and obtain that, provided \(q\le 2^n/67\), the number of solutions for the output values \(\mathcal {P}\) is at least \(\frac{\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})}{2^{nq}}\). To lower bound \(\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})\), note that we have \((2^n)_q\) possible choices for \(P_1,P_3,\ldots ,P_{2q-1}\), at least \(2^n-1\) choices for \(P_2\) (if \(y_1\ne 0\) then \(P_2\) should be unequal to \(P_1\)), at least \(2^n-2\) choices for \(P_4\) (it should be unequal to \(P_2\), and if \(y_2\ne 0\), it should moreover be unequal to \(P_3\)), etc., and we obtain

$$\begin{aligned} \mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E}) \ge (2^n)_q(2^n-1)_q\,. \end{aligned}$$

We have \((2^n-q)!\) possible choices for the remaining output values of \(p_1\), and similarly of \(p_2\). Thus,

$$\begin{aligned} \mathbf {Pr}\left[ X_{\mathrm {XoP}^{p_1,p_2}}=\tau \right]&= \frac{|\{p_1,p_2\in \mathsf {perm}(n) \mid \mathrm {XoP}^{p_1,p_2}\vdash \tau \}|}{|\mathsf {perm}(n)|^2}\nonumber \\&\ge \frac{\frac{(2^n)_q(2^n-1)_q}{2^{nq}}\cdot ((2^n-q)!)^2}{(2^n!)^2} = \frac{1}{2^{nq}}\left( 1 - \frac{q}{2^n}\right) \,. \end{aligned}$$
(15)
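The last equality in (15) rests on the telescoping identity \((2^n-1)_q/(2^n)_q = 1-q/2^n\). The following is a minimal sketch that checks this identity with exact rational arithmetic for a few sample parameters; the helper names `falling` and `xop_ratio` are illustrative and do not appear in the text.

```python
from fractions import Fraction

def falling(a: int, k: int) -> int:
    """Falling factorial (a)_k = a (a-1) ... (a-k+1)."""
    result = 1
    for i in range(k):
        result *= a - i
    return result

def xop_ratio(n: int, q: int) -> Fraction:
    """Exact value of (2^n - 1)_q / (2^n)_q, the factor next to 1/2^{nq} in (15)."""
    size = 2 ** n
    return Fraction(falling(size - 1, q), falling(size, q))

# The quotient telescopes to 1 - q/2^n for the sampled parameters.
for n, q in [(8, 3), (16, 100), (32, 1000)]:
    assert xop_ratio(n, q) == 1 - Fraction(q, 2 ** n)
```

The check is purely arithmetic: it says nothing about the condition \(q\le 2^n/67\), which is needed for Theorem 3 and not for the identity itself.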

For the ideal world f, we similarly obtain

$$\begin{aligned} \mathbf {Pr}\left[ X_{f}=\tau \right] = \frac{|\{f\in \mathsf {func}(n) \mid f\vdash \tau \}|}{|\mathsf {func}(n)|} = \frac{1}{2^{nq}}\,. \end{aligned}$$
(16)

We thus obtain for the ratio of (14):

$$\begin{aligned} \frac{\mathbf {Pr}\left[ X_{\mathrm {XoP}^{p_1,p_2}}=\tau \right] }{\mathbf {Pr}\left[ X_{f}=\tau \right] } \ge 1 - \frac{q}{2^n}\,. \end{aligned}$$

We have obtained \(\varepsilon =\frac{q}{2^n}\), provided \(q\le 2^n/67\).    \(\square \)

4 Security of \(\mathrm {EDM}^{p_1,p_2}\)

Consider \(\mathrm {EDM}\) of (3) for the case of independent permutations \(p_1,p_2\). We will prove that this construction achieves close to optimal security.

Theorem 4

Let \(\xi \ge 1\) be any threshold. For any distinguisher \(\mathcal {D}\) with query complexity at most \(q\le 2^n/(67\xi ^2)\), we have

$$\begin{aligned} \mathbf {Adv}_{\mathrm {EDM}^{p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) \le \frac{q}{2^n} + \frac{{q\atopwithdelims ()\xi +1}}{2^{n\xi }}\,. \end{aligned}$$
(17)

The proof will be given in the remainder of this section. It relies on the mirror theorem, although this application is not straightforward. Most importantly, rather than considering \(\mathrm {EDM}^{p_1,p_2}\), we consider \(\mathrm {EDM}^{p_1,p_2^{-1}}\). As \(p_1,p_2\) are mutually independent, these two constructions are provably equally secure, but it is more convenient to reason about the latter one: we can view an evaluation \(y=\mathrm {EDM}^{p_1,p_2^{-1}}(x)\) as the xor of two permutations in the middle of the function, \(p_1(x)\oplus p_2(y) = x\). Therefore, q evaluations of \(\mathrm {EDM}^{p_1,p_2^{-1}}\) can be translated to a system of q equations on the outputs of \(p_1,p_2\) of the form (11). Some technicalities persist, such as the fact that y may be identical for different evaluations of the construction, and make it impossible to apply the mirror theorem directly.
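The rewriting step can be illustrated on a toy domain. The sketch below assumes 4-bit permutations stored as lookup tables (a size chosen only for illustration, with no security meaning) and checks that every evaluation \(y=\mathrm {EDM}^{p_1,p_2^{-1}}(x)\) satisfies \(p_1(x)\oplus p_2(y)=x\). The helper names and the seed are assumptions of this example.

```python
import random

N_BITS = 4                      # toy block size, far too small for any security
SIZE = 1 << N_BITS

def random_permutation(rng: random.Random) -> list[int]:
    """Return a random permutation of {0, ..., SIZE-1} as a lookup table."""
    perm = list(range(SIZE))
    rng.shuffle(perm)
    return perm

def invert(perm: list[int]) -> list[int]:
    """Return the lookup table of the inverse permutation."""
    inv = [0] * len(perm)
    for x, y in enumerate(perm):
        inv[y] = x
    return inv

def edm(p1: list[int], p2: list[int], x: int) -> int:
    """EDM^{p1,p2}(x) = p2(p1(x) xor x)."""
    return p2[p1[x] ^ x]

rng = random.Random(2024)       # fixed seed so the run is reproducible
p1 = random_permutation(rng)
p2 = random_permutation(rng)
p2_inv = invert(p2)

# Evaluate EDM with p2 replaced by its inverse and check the middle relation.
for x in range(SIZE):
    y = edm(p1, p2_inv, x)
    assert p1[x] ^ p2[y] == x
```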

The parameter \(\xi \) functions as a threshold: as long as the largest block has size at most \(\xi +1\), the result of Patarin applies provided that \(q\le 2^n/(67\xi ^2)\). The probability that there is a block of size \(>\xi +1\) is at most \({q\atopwithdelims ()\xi +1}/2^{n\xi }\). Taking \(\xi =1\) gives the condition \(q\le 2^n/67\), but the bound is then capped by \(q^2/2^n\). The optimal choice of \(\xi \) is such that \(q=2^n/(67\xi ^2)\) still yields a reasonable bound, i.e., such that \((67\xi ^2)^{\xi +1}(\xi +1)!\ge 2^n\). For \(n=128\) this is the case for \(\xi \ge 9\). For \(n=256\) this is the case for \(\xi \ge 15\).
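The two numerical thresholds can be reproduced with exact integer arithmetic. The sketch below searches for the smallest \(\xi \) satisfying \((67\xi ^2)^{\xi +1}(\xi +1)!\ge 2^n\); the function names are illustrative.

```python
from math import factorial

def threshold_reached(n: int, xi: int) -> bool:
    """True if (67 xi^2)^(xi+1) * (xi+1)! >= 2^n, using exact integers."""
    return (67 * xi * xi) ** (xi + 1) * factorial(xi + 1) >= 2 ** n

def smallest_xi(n: int) -> int:
    """Smallest xi >= 1 meeting the condition (the left side grows with xi)."""
    xi = 1
    while not threshold_reached(n, xi):
        xi += 1
    return xi

print(smallest_xi(128), smallest_xi(256))   # prints: 9 15
```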

For general n, we can observe that the above definitely holds if \((67\xi ^2)^\xi = 2^n\) (a better but more complicated bound can be obtained using Stirling’s approximation). Solving this for \(\xi \) results in

$$\begin{aligned} \left( 67\xi ^2\right) ^{\xi }&= 2^n \\ \left( \sqrt{67}\xi \right) ^{\xi }&= 2^{n/2} \\ \left( \sqrt{67}\xi \right) ^{\sqrt{67}\xi }&= 2^{\sqrt{67}n/2} \\ \sqrt{67}\xi&= e^{W\left( \ln \left( 2^{\sqrt{67}n/2}\right) \right) } \\ \frac{\ln \left( 2^{\sqrt{67}n/2}\right) }{\ln \ln \left( 2^{\sqrt{67}n/2}\right) }&\le \sqrt{67}\xi \le \frac{\ln \left( 2^{\sqrt{67}n/2}\right) }{\sqrt{\ln \ln \left( 2^{\sqrt{67}n/2}\right) }}\,, \end{aligned}$$

where the last inequality comes from the approximation \(\ln x - \ln \ln x \le W(x) \le \ln x - \frac{1}{2} \ln \ln x\) on the Lambert W function [23]. Coupled with Theorem 4, this guarantees security as long as \(q\le \frac{2^n}{({67n}/{\sqrt{\ln 67n}})}\).
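As a numerical sanity check of the sandwich above, the following sketch solves \(z\ln z = \ln (2^{\sqrt{67}n/2})\) for \(z=\sqrt{67}\xi \) by bisection and compares the resulting real-valued \(\xi \) with the two bounds. It is only an illustration under floating-point arithmetic; the function names and the sampled values of n are assumptions of this example.

```python
from math import log, sqrt

def solve_z(target: float) -> float:
    """Solve z * ln(z) = target for z >= 1 by bisection (target > e assumed)."""
    lo, hi = 1.0, target
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * log(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def xi_with_bounds(n: int) -> tuple[float, float, float]:
    """Return (lower, xi, upper) for the real solution xi of (67 xi^2)^xi = 2^n."""
    t = sqrt(67) * n / 2 * log(2)      # t = ln(2^{sqrt(67) n / 2})
    z = solve_z(t)                     # z = sqrt(67) * xi
    lower = t / log(t)                 # from W(t) >= ln t - ln ln t
    upper = t / sqrt(log(t))           # from W(t) <= ln t - (1/2) ln ln t
    return lower / sqrt(67), z / sqrt(67), upper / sqrt(67)

for n in (64, 128, 256):
    lower, xi, upper = xi_with_bounds(n)
    assert lower <= xi <= upper
```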

As suggested by Patarin [40, Generalization 2], it may be possible to eschew the condition \(\xi ^2\cdot q\le 2^n/67\) in favor of \(\xi _{\mathrm {average}}^2\cdot q\le 2^n/67\), where \(\xi _{\mathrm {average}}\) denotes the average block size. For \(\mathrm {EDM}^{p_1,p_2}\), the probability of a given block being of size \(\xi + 1\) is significantly lower than of it being of size \(\xi \); thus, the number of blocks with 2 variables is expected to dominate, and to contribute the largest number of solutions to the mirror system.

The proof of Theorem 4 consists of five steps: in Sect. 4.1 we describe how transcripts are generated, in Sect. 4.2 we discuss attainable index mappings, in Sect. 4.3 we give a definition of bad transcripts, in Sect. 4.4 we derive an upper bound on the probability of a bad transcript in the ideal world, and in Sect. 4.5 a lower bound on the ratio for good transcripts. Theorem 4 immediately follows from the H-coefficient technique of Theorem 1.

4.1 General Setting and Transcripts

Let \(p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\) and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n)\). Consider any fixed deterministic distinguisher \(\mathcal {D}\) that has access to either \(\mathcal {O}=\mathrm {EDM}^{p_1,p_2^{-1}}\) (real world) or \(\mathcal {P}=f\) (ideal world). It makes q construction queries recorded in a transcript \(\tau =\{(x_1,y_1),\ldots ,(x_q,y_q)\}\). Without loss of generality, we assume that \(x_i\ne x_j\) whenever \(i\ne j\).

4.2 Attainable Index Mappings

In the real world, each tuple \((x_i,y_i)\in \tau \) corresponds to an evaluation of the function \(\mathrm {EDM}^{p_1,p_2^{-1}}\) and thus to one call to \(p_1\) and one to \(p_2\): \(x_i\mapsto p_1(x_i)\) and \(y_i\mapsto p_2(y_i)\), such that \(p_1(x_i)\oplus p_2(y_i) = x_i\). Indeed, \(p_1\) and \(p_2\) xor to \(x_i\) in the middle of the function \(\mathrm {EDM}^{p_1,p_2^{-1}}\). Writing \(P_{a_i}:=p_1(x_i)\) and \(P_{b_i}:=p_2(y_i)\), the transcript \(\tau \) defines q equations on the unknowns:

$$\begin{aligned} {\begin{matrix} P_{a_1} \oplus P_{b_1} &{}= x_1\,,\\ P_{a_2} \oplus P_{b_2} &{}= x_2\,,\\ \vdots \!\!\;\qquad &{}\\ P_{a_q} \oplus P_{b_q} &{}= x_q\,. \end{matrix}}\end{aligned}$$
(18)

In line with Sect. 3.1, denote the system of q equations of (18) by \(\mathcal {E}\), let \(\mathcal {P}= \{P_1,\ldots ,P_{r}\}\) be the r unknowns, for \(r\in \{q,\ldots ,2q\}\), and let

$$\begin{aligned} \varphi : \{a_1,b_1,\ldots ,a_q,b_q\} \rightarrow \{1,\ldots ,r\} \end{aligned}$$

be the unique index mapping corresponding to the system of Eq. (18). Denote \(\mathcal {R}_1=\{\varphi (a_1),\ldots ,\varphi (a_q)\}\) and \(\mathcal {R}_2=\{\varphi (b_1),\ldots ,\varphi (b_q)\}\).

There is a relation between the index mapping and the permutations \(p_1,p_2\), and different permutations could entail a different index mapping. Nevertheless, as \(x_i\ne x_j\) whenever \(i\ne j\), and additionally we consider independent permutations, any possible index mapping in the real world satisfies the following property.

Claim

\(\varphi (a_i)\ne \varphi (a_j)\) if and only if \(i\ne j\), and \(\varphi (b_i)\ne \varphi (b_j)\) if and only if \(y_i\ne y_j\). Furthermore, \(\varphi (a_i)\ne \varphi (b_j)\) for any \(i,j\).

Stated differently, \(\varphi \) should satisfy the input-output pattern induced by \(\tau \), and for any \(\varphi \) that does not satisfy this constraint, \(\mathbf {Pr}\left[ \varphi \mid \tau \right] =0\). This particularly means that, if \(\tau \) is given, there is a unique index mapping \(\varphi ^\tau \) (up to a reordering of the unknowns) that could have yielded the transcript. This index mapping has a range of size \(q+q'\), where \(q'=|\{y_1,\ldots ,y_q\}|\le q\) denotes the number of distinct range values in \(\tau \).
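The construction of \(\varphi ^\tau \) from a transcript can be sketched as follows. The numbering convention (unknowns are numbered in order of first appearance) is an assumption of this example; the text only fixes \(\varphi ^\tau \) up to a reordering of the unknowns. The function name `index_mapping` is illustrative.

```python
def index_mapping(transcript: list[tuple[int, int]]) -> tuple[list[int], list[int], int]:
    """Return (phi_a, phi_b, r): phi_a[i] and phi_b[i] are the indices of the
    unknowns P_{a_i} = p1(x_i) and P_{b_i} = p2(y_i); r is the number of unknowns."""
    xs = [x for x, _ in transcript]
    assert len(set(xs)) == len(xs), "inputs x_i must be pairwise distinct"

    phi_a: list[int] = []
    phi_b: list[int] = []
    index_of_y: dict[int, int] = {}
    next_index = 1
    for _x, y in transcript:
        # Every x_i is fresh, so p1(x_i) is always a new unknown.
        phi_a.append(next_index)
        next_index += 1
        # p2(y_i) is a new unknown only if y_i has not appeared before.
        if y not in index_of_y:
            index_of_y[y] = next_index
            next_index += 1
        phi_b.append(index_of_y[y])
    return phi_a, phi_b, next_index - 1

tau = [(0, 7), (1, 3), (2, 7), (5, 7)]      # q = 4 queries, q' = 2 distinct outputs
phi_a, phi_b, r = index_mapping(tau)
print(phi_a, phi_b, r)                      # prints: [1, 3, 5, 6] [2, 4, 2, 2] 6
```

In this toy transcript the value 7 occurs three times, so the three corresponding equations share the unknown with index 2, and the range of the mapping has size \(q+q'=6\).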

4.3 Bad Transcripts

In the real world, \(\varphi \) only exposes collisions of the form \(\varphi (b_i)=\varphi (b_j)\), or equivalently \(y_i=y_j\), for some \(i,j\). As a matter of fact, multi-collisions in the range values in \(\tau \) correspond to blocks in the mirror theory. Therefore, we say that a transcript \(\tau \) is bad if there exist \(\xi +1\) distinct equation indices \(i_1,\ldots ,i_{\xi +1}\in \{1,\ldots ,q\}\) such that \(y_{i_1} = \cdots = y_{i_{\xi +1}}\), where \(\xi \) is the threshold given in the theorem statement.

4.4 Probability of Bad Transcripts (\(\delta \))

In accordance with Theorem 1, it suffices to analyze the probability of a bad transcript in the ideal world. We have:

$$\begin{aligned} \mathbf {Pr}\left[ X_{f}\in \mathcal {T}_{\mathrm {bad}}\right]&= \mathbf {Pr}\left[ \exists i_1,\ldots ,i_{\xi +1}\in \{1,\ldots ,q\} \,:\, y_{i_1}=\cdots =y_{i_{\xi +1}}\right] \le \frac{{q\atopwithdelims ()\xi +1}}{2^{n\xi }}\,, \end{aligned}$$

where we recall that in the ideal world the randomness in the transcript \(\tau \) is in the values \(y_1,\ldots ,y_q\xleftarrow {{\scriptscriptstyle \$}}\{0,1\}^{n}\). We have obtained \(\delta =\frac{{q\atopwithdelims ()\xi +1}}{2^{n\xi }}\).
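The bad event and the bound \(\delta \) are both easy to evaluate for concrete parameters. The sketch below, with illustrative function names, tests a transcript for a \((\xi +1)\)-fold collision among its output values and computes \({q\atopwithdelims ()\xi +1}/2^{n\xi }\) exactly.

```python
from collections import Counter
from fractions import Fraction
from math import comb

def is_bad(transcript: list[tuple[int, int]], xi: int) -> bool:
    """True if some output value y occurs at least xi + 1 times."""
    counts = Counter(y for _x, y in transcript)
    return max(counts.values(), default=0) >= xi + 1

def delta(n: int, q: int, xi: int) -> Fraction:
    """Upper bound C(q, xi+1) / 2^{n xi} on the bad-transcript probability."""
    return Fraction(comb(q, xi + 1), 2 ** (n * xi))

tau = [(0, 7), (1, 3), (2, 7), (5, 7)]      # the value 7 occurs three times
print(is_bad(tau, 2), is_bad(tau, 3))       # prints: True False
print(delta(8, 4, 1))                       # prints: 3/128
```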

4.5 Ratio for Good Transcripts (\(\varepsilon \))

Recall from Sect. 4.2 that for a given transcript \(\tau \), there is a unique index mapping \(\varphi ^\tau \) that could have resulted in the transcript. Pivotal to our proof is the following lemma.

Lemma 1

Consider a good transcript \(\tau \), and denote by \(\mathcal {E}\) the system of q equations corresponding to \((\varphi ^\tau ,x_1,\ldots ,x_q)\). This system of equations is (i) circle-free, (ii) \((\xi +1)\)-block-maximal, and (iii) relaxed non-degenerate with respect to partition \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \mathcal {R}_2\).

Proof

The proof relies on the fact that \(\varphi ^\tau (a_i)\ne \varphi ^\tau (a_j)\) whenever \(i\ne j\), and additionally that \(\varphi ^\tau (a_i)\ne \varphi ^\tau (b_j)\) for any \(i,j\). In particular, for any \(I\subseteq \{1,\ldots ,q\}\) the corresponding multiset \(\mathcal {M}_I\) has at least |I| odd multiplicity elements, so that (i) there exists no circle (Definition 1).

(ii) Suppose that \(\mathcal {E}\) is not \((\xi +1)\)-block-maximal (Definition 2). Then, there exists a minimal subset \(\mathcal {R}\subseteq \{1,\ldots ,r\}\) of size \(\ge \xi +2\) such that for any \(i\in \{1,\ldots ,q\}\) we either have \(\{\varphi ^\tau (a_i),\varphi ^\tau (b_i)\}\subseteq \mathcal {R}\) or \(\{\varphi ^\tau (a_i),\varphi ^\tau (b_i)\}\cap \mathcal {R}=\emptyset \). Let \(I\subseteq \{1,\ldots ,q\}\) be the subset such that \(\{\varphi ^\tau (a_i),\varphi ^\tau (b_i)\}\subseteq \mathcal {R}\) for all \(i\in I\). Due to our definition of \(\varphi ^\tau \), there must be an ordering \(I=\{i_1,\ldots ,i_{\xi +1}\}\) such that \(\varphi ^\tau (b_{i_1}) = \cdots = \varphi ^\tau (b_{i_{\xi +1}})\), or equivalently, \(y_{i_1}=\cdots =y_{i_{\xi +1}}\), therewith contradicting that \(\tau \) is good and does not contain a \((\xi +1)\)-fold collision.

(iii) Suppose that the system of equations is relaxed degenerate (Definition 4). Then, there exists a minimal subset \(I\subseteq \{1,\ldots ,q\}\) such that the multiset \(\mathcal {M}_I\) has exactly two odd multiplicity elements corresponding to the same oracle and such that \(\bigoplus _{i\in I} x_i = 0\). If \(|I|=1\), then \(\mathcal {M}_I\) has two elements from different oracles. If \(|I|=2\), then \(\bigoplus _{i\in I} x_i \ne 0\) as the \(x_i\) are all distinct. Finally, if \(|I|\ge 3\) then \(\mathcal {M}_I\) has at least 3 odd multiplicity elements.    \(\square \)

For the computation of \(\mathbf {Pr}\left[ X_{\mathrm {EDM}^{p_1,p_2^{-1}}}=\tau \right] \) and \(\mathbf {Pr}\left[ X_{f}=\tau \right] \), it suffices to compute the probability, over the drawing of the oracles, that a good transcript is obtained. Starting with the real world \(\mathrm {EDM}^{p_1,p_2^{-1}}\), for the transcript \(\tau \), there is a unique index mapping \(\varphi ^\tau \). It concerns q input-output tuples of \(p_1\) and \(q'\) input-output tuples of \(p_2\), where \(|\mathrm {rng}(\varphi ^\tau )|=q+q'\). Due to Lemma 1, we can apply Theorem 3 and obtain that, provided \(\xi ^2\cdot q\le 2^n/67\), the number of solutions to these \(q+q'\) unknowns is at least \(\frac{\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})}{2^{nq}}\). We have \((2^n-q)!\) possible choices for the remaining output values of \(p_1\), and \((2^n-q')!\) for \(p_2\). Thus,

$$\begin{aligned} \mathbf {Pr}\left[ X_{\mathrm {EDM}^{p_1,p_2^{-1}}}=\tau \right]&= \mathbf {Pr}\left[ p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n) \,:\, \mathrm {EDM}^{p_1,p_2^{-1}} \vdash \tau \right] \\&\ge \frac{\frac{\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})}{2^{nq}}\cdot (2^n-q)!(2^n-q')!}{(2^n!)^2} = \frac{\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})}{2^{nq}(2^n)_{q}(2^n)_{q'}}\,. \end{aligned}$$

To lower bound \(\mathrm {NonEq}(\mathcal {R}_1,\mathcal {R}_2;\mathcal {E})\), note that we have \((2^n)_{q'}\) possible choices for \(\{P_j \mid j\in \mathcal {R}_2\}\), and subsequently at least \((2^n-1)_q\) possible choices for \(\{P_j \mid j\in \mathcal {R}_1\}\), as every index in \(\mathcal {R}_1\) is in a block with exactly one unknown from \(\mathcal {R}_2\). Thus,

$$\begin{aligned} \mathbf {Pr}\left[ X_{\mathrm {EDM}^{p_1,p_2^{-1}}}=\tau \right]&\ge \frac{(2^n-1)_q(2^n)_{q'}}{2^{nq}(2^n)_{q}(2^n)_{q'}} = \frac{1}{2^{nq}}\left( 1 - \frac{q}{2^n}\right) \,. \end{aligned}$$
(19)

For the ideal world, we obtain

$$\begin{aligned} \mathbf {Pr}\left[ X_{f}=\tau \right] = \mathbf {Pr}\left[ f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n) \,:\, f\vdash \tau \right] = \frac{1}{2^{nq}}\,. \end{aligned}$$
(20)

We obtain for the ratio:

$$\begin{aligned} \frac{\mathbf {Pr}\left[ X_{\mathrm {EDM}^{p_1,p_2^{-1}}}=\tau \right] }{\mathbf {Pr}\left[ X_{f}=\tau \right] } \ge \frac{\frac{1}{2^{nq}}\left( 1 - \frac{q}{2^n}\right) }{\frac{1}{2^{nq}}} = 1 - \frac{q}{2^n}\,. \end{aligned}$$

We have obtained \(\varepsilon =\frac{q}{2^n}\), provided \(\xi ^2\cdot q\le 2^n/67\).

5 Security of \(\mathrm {EWCDM}^{h,p_1,p_2}\)

We prove that \(\mathrm {EWCDM}\) of (4) for the case of independent permutations \(p_1,p_2\) achieves close to optimal PRF security in the nonce-respecting setting. We remark that Cogliati and Seurin proved PRF security of \(\mathrm {EWCDM}^{h,p_1,p_2}\) up to about \(2^{2n/3}\) queries (cf., [17, Theorem 3] for \(q_v=0\)). In a similar vein as the analysis of Cogliati and Seurin [17] on \(\mathrm {EWCDM}^{h,p_1,p_2}\), our analysis straightforwardly generalizes to the analysis for unforgeability or for the nonce-misusing setting.

Theorem 5

Let \(\xi \ge 1\) be any threshold. For any distinguisher \(\mathcal {D}\) with query complexity at most \(q\le 2^n/(67\xi ^2)\), we have

$$\begin{aligned} \mathbf {Adv}_{\mathrm {EWCDM}^{h,p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) \le \frac{q}{2^n} + \frac{{q\atopwithdelims ()2}\epsilon }{2^n} + \frac{{q\atopwithdelims ()\xi +1}}{2^{n\xi }}\,, \end{aligned}$$
(21)

where h is an \(\epsilon \)-AXU hash function.

The proof follows the same strategy as the one of \(\mathrm {EDM}^{p_1,p_2}\), i.e., replacing \(p_2\) by \(p_2^{-1}\) for readability and noting that \(t=\mathrm {EWCDM}^{h,p_1,p_2^{-1}}(\nu ,m)\) corresponds to the xor of two permutations as \(p_1(\nu ) \oplus p_2(t) = \nu \oplus h(m)\). An additional hurdle has to be overcome, namely cases where \(\nu \oplus h(m)=\nu '\oplus h(m')\): if this happens, and additionally we have \(t=t'\), the system of equations cannot be solved. (In retrospect, one can view the proof of \(\mathrm {EDM}^{p_1,p_2}\) as a special case of the new proof by keeping m constant.) As before, \(\xi \) functions as a threshold and the computations of Sect. 4 likewise apply.

5.1 General Setting and Transcripts

Let \(h\xleftarrow {{\scriptscriptstyle \$}}H\) be an \(\epsilon \)-AXU hash function, \(p_1,p_2\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\), and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n+*,n)\). Consider any fixed deterministic distinguisher \(\mathcal {D}\) that has access to either \(\mathcal {O}=\mathrm {EWCDM}^{h,p_1,p_2^{-1}}\) (real world) or \(\mathcal {P}=f\) (ideal world). It makes q construction queries recorded in a transcript \(\tau _{\text {cq}}=\{(\nu _1,m_1,t_1),\ldots ,(\nu _q,m_q,t_q)\}\), where the q nonces \(\nu _i\) are mutually different.

After \(\mathcal {D}\)’s interaction with its oracle, but before its final decision, we reveal the universal hash function h. In the real world, h is the hash function that is actually used. In the ideal world, h is drawn uniformly at random from the \(\epsilon \)-AXU hash function family H. The extended transcript is denoted

$$\begin{aligned} \tau = (\tau _{\text {cq}},h)\,. \end{aligned}$$

5.2 Attainable Index Mappings

In the real world, each tuple \((\nu _i,m_i,t_i)\in \tau _{\text {cq}}\) corresponds to an evaluation of the function \(\mathrm {EWCDM}^{h,p_1,p_2^{-1}}\) and thus evaluations \(\nu _i\mapsto p_1(\nu _i)\) and \(t_i\mapsto p_2(t_i)\), such that \(p_1(\nu _i)\oplus p_2(t_i) = \nu _i\oplus h(m_i)\) (note the fundamental difference with respect to the analysis of \(\mathrm {EDM}^{p_1,p_2^{-1}}\) of Sect. 4, namely the addition of \(h(m_i)\)). Writing \(P_{a_i}:=p_1(\nu _i)\) and \(P_{b_i}:=p_2(t_i)\), the transcript \(\tau _{\text {cq}}\) defines q equations on the unknowns:

$$\begin{aligned} {\begin{matrix} P_{a_1} \oplus P_{b_1} &{}= \nu _1\oplus h(m_1)\,,\\ P_{a_2} \oplus P_{b_2} &{}= \nu _2\oplus h(m_2)\,,\\ \vdots \!\!\;\qquad &{}\\ P_{a_q} \oplus P_{b_q} &{}= \nu _q\oplus h(m_q)\,. \end{matrix}} \end{aligned}$$
(22)

(The system of equations differs from that of (18) as the unknowns should now sum to \(\nu _i\oplus h(m_i)\).) In line with Sect. 3.1, denote the system of q equations of (22) by \(\mathcal {E}\), let \(\mathcal {P}= \{P_1,\ldots ,P_{r}\}\) be the r unknowns, for \(r\in \{q,\ldots ,2q\}\), and let

$$\begin{aligned} \varphi : \{a_1,b_1,\ldots ,a_q,b_q\} \rightarrow \{1,\ldots ,r\} \end{aligned}$$

be the unique index mapping corresponding to the system of Eq. (22). Denote \(\mathcal {R}_1=\{\varphi (a_1),\ldots ,\varphi (a_q)\}\) and \(\mathcal {R}_2=\{\varphi (b_1),\ldots ,\varphi (b_q)\}\).

From the fact that \(\nu _i\ne \nu _j\) whenever \(i\ne j\), and additionally that we consider two independent permutations, we can derive the exact same property of \(\varphi \) as in Sect. 4.2, with \(\nu \) replacing x and t replacing y.

Claim

\(\varphi (a_i)\ne \varphi (a_j)\) if and only if \(i\ne j\), and \(\varphi (b_i)\ne \varphi (b_j)\) if and only if \(t_i\ne t_j\). Furthermore, \(\varphi (a_i)\ne \varphi (b_j)\) for any \(i,j\).

As before, for a given transcript \(\tau _{\text {cq}}\), there is a unique index mapping \(\varphi ^\tau \) that could have yielded the transcript. It has a range of size \(q+q'\), where \(q'=|\{t_1,\ldots ,t_q\}|\le q\) denotes the number of distinct range values in \(\tau _{\text {cq}}\).

5.3 Bad Transcripts

Unlike for the analysis of \(\mathrm {EDM}^{p_1,p_2^{-1}}\), it is insufficient to just require that there is no \((\xi +1)\)-fold collision: we must also take degeneracy of the system of equations into account. Indeed, if for two queries \((\nu _i,m_i,t_i),(\nu _j,m_j,t_j)\), we have that \(t_i=t_j\) (or, equivalently, \(\varphi (b_i) = \varphi (b_j)\)) and \(\nu _i\oplus h(m_i)=\nu _j\oplus h(m_j)\), the system of equations would imply that we need \(\varphi (a_i)=\varphi (a_j)\), which is impossible by design.

Formally, we say that a transcript \(\tau =(\tau _{\text {cq}},h)\) is bad if

  • there exist \(\xi +1\) distinct equation indices \(i_1,\ldots ,i_{\xi +1}\in \{1,\ldots ,q\}\) such that \(t_{i_1} = \cdots = t_{i_{\xi +1}}\), where \(\xi \) is the threshold given in the theorem statement, or

  • there exist two distinct equation indices \(i,j\in \{1,\ldots ,q\}\) such that \(t_i=t_j\) and \(\nu _i\oplus h(m_i)=\nu _j\oplus h(m_j)\).
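The two conditions above can be checked mechanically once h is revealed. The sketch below is only an illustration: the transcript encoding (nonces and tags as integers, messages as bytes), the function name `is_bad_ewcdm`, and in particular `toy_hash` are assumptions of this example, and `toy_hash` is a placeholder that is not an \(\epsilon \)-AXU hash function.

```python
from collections import Counter
from typing import Callable

Query = tuple[int, bytes, int]          # (nonce, message, tag)

def is_bad_ewcdm(tau_cq: list[Query], h: Callable[[bytes], int], xi: int) -> bool:
    """Check both bad conditions for the extended transcript (tau_cq, h)."""
    # First condition: a (xi+1)-fold collision among the tags.
    tag_counts = Counter(t for _nu, _m, t in tau_cq)
    if max(tag_counts.values(), default=0) >= xi + 1:
        return True
    # Second condition: two queries with equal tag and equal nu xor h(m).
    seen: set[tuple[int, int]] = set()
    for nu, m, t in tau_cq:
        key = (t, nu ^ h(m))
        if key in seen:
            return True
        seen.add(key)
    return False

def toy_hash(m: bytes) -> int:
    """Placeholder 8-bit 'hash' for illustration only; it is NOT AXU."""
    return sum(m) % 256

degenerate = [(1, b"\x01", 9), (2, b"\x02", 9)]   # equal tags, nu xor h(m) = 0 twice
harmless = [(1, b"\x01", 9), (2, b"\x01", 9)]     # equal tags, nu xor h(m) differ
print(is_bad_ewcdm(degenerate, toy_hash, 2))      # prints: True
print(is_bad_ewcdm(harmless, toy_hash, 2))        # prints: False
```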

5.4 Probability of Bad Transcripts (\(\delta \))

As in Sect. 4.4, it suffices to analyze the probability of a bad transcript in the ideal world, and we have:

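$$\begin{aligned} \mathbf {Pr}\left[ X_{f}\in \mathcal {T}_{\mathrm {bad}}\right] \le {}&\mathbf {Pr}\left[ \exists i_1,\ldots ,i_{\xi +1}\in \{1,\ldots ,q\} \,:\, t_{i_1}=\cdots =t_{i_{\xi +1}}\right] \nonumber \\&+ \mathbf {Pr}\left[ \exists i,j\in \{1,\ldots ,q\},\, i\ne j \,:\, t_i=t_j \,\wedge \, \nu _i\oplus h(m_i)=\nu _j\oplus h(m_j)\right] \,, \end{aligned}$$
(23)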

where we recall that in the ideal world the randomness in the transcript \(\tau \) is in the values \(t_1,\ldots ,t_q\xleftarrow {{\scriptscriptstyle \$}}\{0,1\}^{n}\) and in the uniform drawing \(h\xleftarrow {{\scriptscriptstyle \$}}H\). The first probability of (23) is identical to the one analyzed in Sect. 4.4 and upper bounded by \({q\atopwithdelims ()\xi +1}/2^{n\xi }\). For the second probability of (23), there are \({q\atopwithdelims ()2}\) possible indices, the first equation is satisfied with probability \(1/2^n\) (due to the drawing of the \(t_i\)), and the second equation is satisfied with probability \(\epsilon \) (as h is an \(\epsilon \)-AXU hash function). Thus, the second probability is upper bounded by \({q\atopwithdelims ()2}\epsilon /2^n\).

We thus obtain from (23):

$$\begin{aligned} \mathbf {Pr}\left[ X_{f}\in \mathcal {T}_{\mathrm {bad}}\right] \le \frac{{q\atopwithdelims ()2}\epsilon }{2^n} + \frac{{q\atopwithdelims ()\xi +1}}{2^{n\xi }} =: \delta \,. \end{aligned}$$

5.5 Ratio for Good Transcripts (\(\varepsilon \))

Recall from Sect. 5.2 that for a given transcript \(\tau _{\text {cq}}\), there is a unique index mapping \(\varphi ^\tau \) that could have resulted in the transcript. We can derive the following result.

Lemma 2

Consider a good transcript \(\tau =(\tau _{\text {cq}},h)\) and denote by \(\mathcal {E}\) the system of q equations corresponding to \((\varphi ^\tau ,\nu _1\oplus h(m_1),\ldots ,\nu _q\oplus h(m_q))\). This system of equations is (i) circle-free, (ii) \((\xi +1)\)-block-maximal, and (iii) relaxed non-degenerate with respect to partition \(\{1,\ldots ,r\} = \mathcal {R}_1\cup \mathcal {R}_2\).

Proof

The proof is a generalization of the one of Lemma 1. Nothing changes for circle-freeness and \((\xi +1)\)-block-maximality.

Suppose that the system of equations is relaxed degenerate (Definition 4). Then, there exists a minimal subset \(I\subseteq \{1,\ldots ,q\}\) such that the multiset \(\mathcal {M}_I\) has exactly two odd multiplicity elements corresponding to the same oracle and such that \(\bigoplus _{i\in I} \nu _i\oplus h(m_i) = 0\). As in Lemma 1, this implies that \(|I|=2\), say \(I=\{i,j\}\), for which \(\varphi ^\tau (b_i)=\varphi ^\tau (b_j)\) and \(\nu _i\oplus h(m_i)=\nu _j\oplus h(m_j)\), therewith contradicting that \(\tau \) is good.    \(\square \)

The remaining analysis is almost identical to the one for \(\mathrm {EDM}^{p_1,p_2^{-1}}\) in Sect. 4.5, the sole exception being that both probabilities have an additional factor \(1/|H|\), and is therefore omitted.

6 Security of \(\mathrm {EDMD}^{p_1,p_2}\)

Consider \(\mathrm {EDMD}^{p_1,p_2}\) of (5) for the case of independent permutations \(p_1,p_2\). We will prove that this construction achieves optimal PRF security without a logarithmic loss.

Theorem 6

For any distinguisher \(\mathcal {D}\) with query complexity at most \(q\le 2^n/67\), we have

$$\begin{aligned} \mathbf {Adv}_{\mathrm {EDMD}^{p_1,p_2}}^{\mathrm {prf}}(\mathcal {D}) \le q/2^n\,. \end{aligned}$$
(24)

The proof can be performed along the same lines as that of \(\mathrm {EDM}^{p_1,p_2}\), with the difference that for \(\mathrm {EDMD}^{p_1,p_2}\) no collisions among the evaluations of the permutations occur. However, the exact same security bound can be derived fairly elegantly from Proposition 1.

Proof

Let \(p_1,p_2,p_3\xleftarrow {{\scriptscriptstyle \$}}\mathsf {perm}(n)\) and \(f\xleftarrow {{\scriptscriptstyle \$}}\mathsf {func}(n)\). Write \(\mathrm {EDMD}^{p_1,p_2} = p_2\circ p_1 \oplus p_1\). By a simple hybrid argument we obtain:

$$\begin{aligned} \varDelta (p_2\circ p_1 \oplus p_1 \,;\, f)&\le \varDelta (p_2\circ p_1 \oplus p_1 \,;\, p_3 \oplus p_1) + \varDelta (p_3 \oplus p_1 \,;\, f)\,. \end{aligned}$$

The former distance equals 0 (reveal \(p_1\) to the distinguisher prior to the experiment, and it effectively has to distinguish \(p_2\) from \(p_3\)). The latter distance is bounded by \(q/2^n\) provided that \(q\le 2^n/67\), cf., Proposition 1.    \(\square \)

7 Towards a Single Permutation

Given our results on \(\mathrm {EDM}^{p_1,p_2}\) of Theorem 4 and \(\mathrm {EDMD}^{p_1,p_2}\) of Theorem 6, one may expect that similar techniques apply to the case where \(p_1=p_2\). However, it seems unlikely, if not impossible, to apply the mirror theory to these constructions. The reason is that the mirror theory works particularly well if only the input values of the functions are determined, and not the output values.

For example, for \(\mathrm {EDM}^{p_1,p_2}\), an evaluation \(y=\mathrm {EDM}^{p_1,p_2}(x)\) corresponds to evaluations \(p_1(x)\) and \(p_2(p_1(x)\oplus x)\), where \(y = p_2(p_1(x)\oplus x)\). Thus, the query-response tuple \((x,y)\) reveals one input value to \(p_1\) and one output value of \(p_2\). By replacing \(p_2\) by its inverse, which is without loss of generality, we obtained a system where only input values of the permutations are fixed. Now, consider \(\mathrm {EDM}^p\): a single evaluation \(y=\mathrm {EDM}^p(x)\) reveals an input value x to p as well as an output value y of p, and there seems to be no way to properly employ the mirror theorem in this case. The trick of considering \(\mathrm {EDM}^{p,p^{-1}}\) does not work, as this construction is not equally secure as \(\mathrm {EDM}^p=\mathrm {EDM}^{p,p}\). (In fact, \(\mathrm {EDM}^{p,p^{-1}}\) is trivially insecure as it maps 0 to 0.)
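The fixed point of \(\mathrm {EDM}^{p,p^{-1}}\) at 0 follows from \(p^{-1}(p(0)\oplus 0)=0\) and can be observed on a toy domain. The sketch below assumes 4-bit permutations stored as lookup tables and tries 100 seeded random permutations; the helper names and seeds are illustrative.

```python
import random

SIZE = 1 << 4                            # toy 4-bit domain, for illustration only

def edm(p1: list[int], p2: list[int], x: int) -> int:
    """EDM^{p1,p2}(x) = p2(p1(x) xor x)."""
    return p2[p1[x] ^ x]

def random_permutation_with_inverse(seed: int) -> tuple[list[int], list[int]]:
    """Return a seeded random permutation and its inverse as lookup tables."""
    rng = random.Random(seed)
    p = list(range(SIZE))
    rng.shuffle(p)
    p_inv = [0] * SIZE
    for x, y in enumerate(p):
        p_inv[y] = x
    return p, p_inv

# Whatever p is, EDM^{p,p^{-1}}(0) = p^{-1}(p(0) xor 0) = 0.
for seed in range(100):
    p, p_inv = random_permutation_with_inverse(seed)
    assert edm(p, p_inv, 0) == 0
```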

For the single permutation variant of \(\mathrm {EDMD}\), the problem appears at a different surface: the chaining. In more detail, an evaluation \(y=\mathrm {EDMD}^p(x)\) corresponds to two evaluations of p: p(x) and p(p(x)), where \(y = p(x) \oplus p(p(x))\). Suppose we have a different evaluation \(y'=\mathrm {EDMD}^p(x')\) such that, accidentally, \(p(p(x))=p(x')\). This implies that the permutation p necessarily satisfies the following constraints:

$$\begin{aligned} p(x)=x' \;,\; p(p(x))=p(x')=y\oplus x' \;,\; p(p(x'))=y'\oplus y\oplus x'\,. \end{aligned}$$

In other words, a collision between two evaluations of p imposes conditions on the input-output pattern of p, and the mirror theorem does not allow us to handle this case nicely. (Technically, the collision in this example forms a block of size 3 in the terminology of Definition 2, but the amount of freedom we have in fixing the unknowns in the block is not \(2^n\), as for normal systems of equations of Sect. 3, but at most 1.)

We are not aware of any potential attack on \(\mathrm {EDM}^p\) or \(\mathrm {EDMD}^p\) that may exploit these properties. In fact, we believe that the conjecture posed by Cogliati and Seurin [17] holds for \(\mathrm {EDM}^p\), and that also \(\mathrm {EDMD}^p\) achieves optimal security. It is interesting to note that

$$\begin{aligned} \mathrm {EDM}^p\circ p = p\circ \mathrm {EDMD}^p\,, \end{aligned}$$

and any attack on \(\mathrm {EDM}^p\) performed by, for instance, chaining multiple evaluations of \(\mathrm {EDM}^p\) would have its equivalent attack for \(\mathrm {EDMD}^p\).
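The identity \(\mathrm {EDM}^p\circ p = p\circ \mathrm {EDMD}^p\) follows by expanding both sides to \(p(p(p(x))\oplus p(x))\). The sketch below checks it exhaustively for one seeded random permutation on a toy 4-bit domain; the domain size, seed, and function names are assumptions of this example.

```python
import random

SIZE = 1 << 4                            # toy 4-bit domain, for illustration only

def edm(p: list[int], x: int) -> int:
    """EDM^p(x) = p(p(x) xor x)."""
    return p[p[x] ^ x]

def edmd(p: list[int], x: int) -> int:
    """EDMD^p(x) = p(p(x)) xor p(x)."""
    return p[p[x]] ^ p[x]

rng = random.Random(7)                   # fixed seed so the run is reproducible
p = list(range(SIZE))
rng.shuffle(p)

# Both sides equal p(p(p(x)) xor p(x)) for every input x.
for x in range(SIZE):
    assert edm(p, p[x]) == p[edmd(p, x)]
```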