Keywords

1 Introduction

1.1 Overview and Motivation

Block ciphers are traditionally used within modes of operation where they are instantiated under a secret key. Provable security results typically assume them to be good pseudorandom permutations (PRPs). This has motivated a large body of theoretical works on building PRPs from weaker or less structured components, e.g., through the Feistel construction and its variants [24, 28, 33].

Block ciphers are however also frequently used in settings where the key is fixed, or at least known. We refer to this as the known-key setting. For instance, it is common to rely on permutationsFootnote 1 or (equivalently) fixed-key ciphers to build hash functions [12, 35], authenticated encryption [3], PRNGs [13, 23], and even more involved objects, such as garbling schemes [8].

As there is no secret key to rely upon, it is less clear what kind of security properties block ciphers should satisfy in this setting. Hence, security proofs typically assume the cipher to behave like an ideal random permutation on each key. A number of design paradigms for block ciphers (cf. e.g. [1, 17,18,19, 21, 25] to mention a few results) are therefore analyzed in terms of indifferentiability [31], an ideal-model property which implies that for single-stage security games, the cipher inherits all properties of an ideal cipher. Still, the resulting proofs are notoriously involved, and the constructions more complex than seemingly necessary for the applications in which they are used. This is in sharp contrast with hash functions, where indifferentiability has helped shaping real-world designs.

Our contributions. The only two exceptions to the above indifferentiability-based approach we are aware of are the notions of a public-seed pseudorandom permutation (psPRP) [37] and of correlation-intractable block ciphers [27, 29]. Block ciphers satisfying variants of both have been shown to be sufficient to instantiate several schemes that otherwise only enjoyed security proofs assuming the cipher is ideal. Yet, the complexity of actually building these primitives from simpler objects is not understood.

In this work, we present constructions for each of the notions which only make two calls to an underlying non-invertible round function. All of our constructions are instantiations of the Naor-Reingold construction [33], which is the most efficient known approach to build a secure PRP, and we thus show that it retains meaningful properties when the seed is made public under appropriate assumptions on the round functions. The previously known best constructions require Feistel networks with five rounds (for psPRPs) [37] and six rounds (for correlation intractability) [29], and in both cases the security proofs relied indirectly on (weakened) forms of indifferentiability, inheriting seemingly unnecessary complexity. Here, we introduce substantially different techniques to bypass limitations of existing proofs, borrowing from areas such as black-box separations and applications of limited-independence.

We stress that our focus here is on foundations, and more specifically, breaking complexity barriers. While we follow the good practice of giving concrete bounds, we make no claims that these are suitable for practical applications. We hope however to spur quantitative research in this direction.

1.2 Public-Seed Pseudorandomness via the NR Construction

We start with our results on public-seed pseudorandom permutations (or psPRPs, for short), a notion recently introduced in [37], which considers a family of permutations \(\mathsf {E}\) on n-bit strings, indexed by a seed \(s\). (This could be obtained from a block cipher.) Ideally, we would like \(\mathsf {E}_{s}(\cdot )\) and \(\mathsf {E}_{s}^{-1}(\cdot )\) to be indistinguishable from \(\rho \) and \(\rho ^{-1}\) (for a random permutation \(\rho \)), even if the seed \(s\) is known to the distinguisher. This is obviously impossible, yet an approach to get around this borrowed from the UCE framework [6] is to split the distinguisher into two stages. A first stage, called the source S, gets access to either \((\mathsf {E}_{s}, \mathsf {E}_{s}^{-1})\) or \((\rho , \rho ^{-1})\), but does not know s, and then passes on some leakage L to a second-stage PPT D, the distinguisher, which learns \(s\), but has no access to the oracle any more. If \(\mathsf {E}\) is indeed secure, D will not be able to guess which one of the two oracles S had access to. This is very similar to the security definition of a UCE \(\mathsf {H}\), the only difference is that there the source accesses either \(\mathsf {H}_{s}\) or a random function f.

Clearly, nothing is gained if L is unrestricted, and thus restrictions on S are necessary. Two classes of sources were in particular considered in [37], unpredictable and reset-secure sources, inspired by analogous notions for UCEs. The former demands that when the source S accesses \(\rho \) and \(\rho ^{-1}\), an (unbounded)Footnote 2 predictor P given the leakage L cannot then guess any of S’s queries (and their inverses). In contrast, the latter notion demands that a computationally unbounded distinguisher R, given L cannot tell apart whether it is given oracle access to the same permutation \(\rho \), or an independent one, within a polynomial number of queries. Being a psPRP for all unpredictable sources is a potentially weaker assumption than being a psPRP for reset-secure sources, since every unpredictable source is reset-secure, but not vice versa.

Applications. PsPRPs for such restricted source classes are a versatile notion. For example, a psPRP for all reset-secure sources can be used to instantiate the permutation within permutation-based hash functions admitting indifferentiability-based security proofs, such as the sponge construction [12] (which underlies SHA-3), turning them into a UCE-secure hash function sufficient for a number of applications, studied in multiple works [5, 6, 11, 20, 30]. Also, [37] shows that psPRPs for unpredictable sources are sufficient to instantiate garbling schemes obtained from fixed-key blocks ciphers [8].

Constructing psPRPs: Previous work. But do psPRPs exist at all? Soni and Tessaro [37] show that they are implied by sufficiently strong UCEs:

Theorem (Informal) [37]. The five-round Feistel construction, with round functions instantiated from a UCE \(\mathsf {H}\) for reset-secure sources, is a psPRP for reset-secure sources.

This left two obvious questions open, however: (1) Whether the number of rounds can be reduced, and (2) whether the same holds for unpredictable sources, too. The techniques of [37], based on proving a weaker notion of indifferentiability for five-round Feistel, fail to help answering both questions.

Our contributions. We solve both questions, and even more in fact, showing that the Naor-Reingold (NR) construction [33] solves both (1) and (2). In particular, let \(\mathsf {H}\) be a family of functions from \(n + 1\) bits to n, and let \(\mathsf {P}\) be a family of permutations on 2n bit strings. Then, the NR construction on seed \({\varvec{s}} = (s, s^{\mathsf {in}}, s^{\mathsf {out}})\) and input \(u \in \{0,1\}^{2n}\), outputs \(v \in \{0,1\}^{2n}\), where

$$\begin{aligned} x_0 \,\Vert \,x_1 \leftarrow \mathsf {P}_{s^{\mathsf {in}}}(u)\;,&x_2 \leftarrow \mathsf {H}_{s}(0 \,\Vert \,x_1) \oplus x_0,&\\&x_3 \leftarrow \mathsf {H}_{s}(1 \,\Vert \,x_2) \oplus x_1\;,&v \leftarrow \mathsf {P}^{-1}_{s^{\mathsf {out}}}(x_3 \,\Vert \,x_2). \end{aligned}$$

The key point here is that \(\mathsf {P}\) only needs to satisfy a weak non-cryptographic property, namely that for a random \(s\) and for any distinct \(u \ne u'\), the right halves of \(\mathsf {P}_{s}(u)\) and \(\mathsf {P}_{s}(u')\) only collide with negligible probability. Therefore, only two calls to a “cryptographically hard” round function \(\mathsf {H}\) are made. Naor and Reingold [33] showed that \(\mathsf {NR}\) is a (strong) PRP whenever \(\mathsf {H}\) is a pseudorandom function. Here, we show the following public-seed counterparts:

Theorems 1 and 2 (Informal). The NR construction, with round functions instantiated with a UCE \(\mathsf {H}\) for X-sources, is a psPRP for X-sources, where \(X \in \{\)reset-secure, unpredictable\(\}\).

A detailed overview of our techniques is given below in Sect. 1.4. We remark here that such UCEs are of course strong, and the question of basing these on simpler assumptions is wide open. Still, we believe such results to be very important: First off, they show relations among notions, and getting a UCE (without any injectivity structure) is possibly simpler in practice than in theory (i.e., using the compression function of SHA-256). Second, even if we instantiate \(\mathsf {H}\) from a random oracle (which gives a good UCE [6]), the result is useful, as this would give us a simple instantiation of a (seeded) permutation in applications which are not even known to follow from full-fledged indifferentiability, as discussed by Mittelbach [32].

1.3 Correlation Intractability

The notion of correlation intractability (CI) of hash functions was introduced by Canetti et al. [15] as a weakening of a random oracle. CI naturally extends to permutations and block ciphers [27, 29]: Given the seed \(s\), an adversary should not be able to find an input-output pair (uv) such that \(\mathsf {E}_{s}(u) = v\) and such that \((u, v) \in R\), where R is a hard-to-satisfy relation for a truly random permutation, a so-called evasive relation. This, in turn, can be generalized to k-ary relations, where k input-output pairs are to be provided. CI is well-known not to hold in the standard model for arbitrary evasive relations.Footnote 3 Therefore, here, we target constructions in ideal models.

Applications. CI has important applications – for example, let \(\mathsf {E}\) be a permutation family on 2n-bit strings. Then, for \(n< m < 2n\), consider the function family \(\mathsf {H}\) from m bits to n bits such that

$$\begin{aligned} \mathsf {H}_{s}(x) = \mathsf {E}_{s}(x \,\Vert \,0^{2n - m})[1 \ldots n], \end{aligned}$$

i.e., this outputs the first n bits of \(\mathsf {E}_{s}(x \,\Vert \,0^{2n - m})\). Then, it is not too hard to show that if \(\mathsf {E}\) satisfies CI for evasive binary relations, then \(\mathsf {H}\) is collision resistant – indeed, a collision yields two distinct pairs \((u_1, v_1)\), \((u_2, v_2)\) where \(u_1[m + 1 \ldots 2n] = u_2[m + 1 \ldots 2n] = 0^{2n - m}\), whereas \(v_1[1 \ldots n] = v_2[1 \ldots n]\). Along similar lines, one can prove that \(\mathsf {H}\) can be used to instantiate the Fiat-Shamir transform [22] whenever \(\mathsf {E}\) satisfies CI for unary relations. And so on.

Correlation intractability for Feistel networks. Indifferentiability is easily seen to imply CI, and therefore, by [19], the Feistel construction with 8 rounds is correlation intractable. In fact, Mandal et al. [29] observed that a weaker notion of indifferentiability, called sequential indifferentiability, is sufficient for CI, and could show that 6 rounds are enough. It is known that the 5-round Feistel construction is not correlation intractable for evasive 4-ary relations (an attack was given in [29]). Other weaker notions of indifferentiability are known to imply CI, but do no appear to lead to any complexity improvements for constructions from non-invertible primitives [2, 16].

Our results. We study the correlation intractability of the NR construction described above where the two calls to \(\mathsf {H}\) are replaced by two calls to (seedless) public random function f from \(n + 1\) to n bits, and the seed of the construction only consists of the seeds for \(\mathsf {P}\). (This is similar to the model of Ramzan and Reyzin [34], although they consider PRP security, and secret seeds.)

In general, this basic form of the NR construction cannot be correlation intractable – indeed, \(\mathsf {P}\) can be instantiated by one-round Feistel with a pairwise-independent round function, and generic attacks against the correlation intractability of four-round Feistel would still apply. We show however the following result:

Theorem 3 (Informal). For any constant \(k = O(1)\), if \(\mathsf {P}^{-1}\) is an almost \(O(k^2)\)-wise independent permutation, then the NR construction is correlation intractable for every k-ary evasive relation.

For unary relations (i.e., \(k = 1\)), we can in fact show that instantiating \(\mathsf {P}\) with one-round Feistel using a 10-wise independent round function suffices. As this is effectively a four-round Feistel network, this confirms that no generic attacks exist for unary relations. Our result extends to non-constant k, however under some restrictions on the class of evasive relations for which we can prove correlation intractability. We believe an important part of our result is the technique we use, which gives a surprising paradigm to amplify CI unconditionally which we discuss below in Sect. 1.5.

Limitations. In contrast to existing 6-round results, our result is weaker in that it only covers evasive relations fully if \(k = O(1)\). We are not aware of counter examples showing attacks for larger k’s, but our proof inherently fails. We note however that most applications of correlation intractability only require constant arity, and we leave the question of assessing whether six calls to a random function are necessary for arbitrary arity for future work.

1.4 Technical Overview – psPRPs

Let us briefly recall the setting: For some PPT source S, which queries a permutation oracle on 2n-bit strings to produce a leakage L, we need to show that any PPT distinguisher D which learns L and \({\varvec{s}} = (s, s^{\mathsf {in}}, s^{\mathsf {out}})\) cannot tell apart whether S was accessing NR using a UCE \(\mathsf {H}\) (with seed \({\varvec{s}}\)) or a truly random permutation. We assume S is either (statistically) unpredictable (in Theorem 1) or reset-secure (in Theorem 2).

The source \(\overline{S}\). The natural approach we follow is to build another source, \(\overline{S}\) from S, for which \(\mathsf {H}\) should be a secure UCE. This source thus accesses an oracle \(\mathcal {O}\) implementing a function from \(n + 1\) to n bits. It first samples seeds \(s^{\mathsf {in}}, s^{\mathsf {out}}\) for \(\mathsf {P}\), and then simulates an execution of S. The oracle calls by the latter are processed by evaluating the NR construction using \(s^{\mathsf {in}}, s^{\mathsf {out}}\), and the oracle \(\mathcal {O}(\cdot )\) in lieu of \(\mathsf {H}(s, \cdot )\). Finally, when S produces its output L, \(\overline{S}\) outputs \((L, s^{\mathsf {in}}, s^{\mathsf {out}})\). We will show the following two facts:

  • If S is unpredictable (w.r.t. the psPRP notion), then \(\overline{S}\) is unpredictable (w.r.t. the UCE notion).

  • If S is reset-secure (w.r.t. the psPRP notion), then \(\overline{S}\) is reset-secure (w.r.t. the UCE notion).

Theorems 1 and 2 follow from Facts 1 and 2, respectively, by a fairly straightforward application of the (classical) indistinguishability of the NR construction with random round functions.Footnote 4

The unpredictable case. Our approach to establish Fact 1 is inspired by an elegant proof of secure domain extension for UCEs via Wegman-Carter MACs [7]. (The case of reset-secure sources will be more involved and use new techniques.)

Assume, towards a contradiction, that \(\overline{S}\) is not unpredictable; then there exists a strategy (not necessarily efficient) that given L and \(s^{\mathsf {in}}\) and \(s^{\mathsf {out}}\), guesses one of the inner oracle queries of \(\overline{S}\) with non-negligible probability \(\varepsilon \), when \(\overline{S}\)’s oracle is a random function from \(n+1\) to n bits. Imagine now that given \((L, s^{\mathsf {in}}, s^{\mathsf {out}})\) from \(\overline{S}\), we resample an execution of \(\overline{S}\) (which in particular means re-sampling the oracle used by it) consistent with outputting \((L, s^{\mathsf {in}}, s^{\mathsf {out}})\), and look at the inner oracle queries in this virtual, re-sampled execution. Then, one can show that the real and the virtual executions are likely to share an oracle query, with probability roughly at least \(\varepsilon ^2\), for our strategy to guess a query must be equally successful on the virtual execution.

We exploit this idea to build a predictor for the original source S, contradicting our hypothesis it is unpredictable. Note that S runs with a random permutation as its oracle, and produces leakage L. Imagine now we sample fresh seeds \(s^{\mathsf {in}}\), \(s^{\mathsf {out}}\) for \(\mathsf {P}\), and for each permutation query by S defining an input-output pair (uv), we define “fake” inputs \(x_0, x_1\) from \(x_0 \,\Vert \,x_1 = \mathsf {P}_{s^{\mathsf {in}}}(u)\) and \(x_3 \,\Vert \,x_2 = \mathsf {P}_{s^{\mathsf {out}}}(v)\). Then, the indistinguishability of the NR construction from a random permutation, and the construction of \(\overline{S}\), implies that if we resample a virtual execution of S consistent with leakage L, and compute the resulting fake inputs using \(s^{\mathsf {in}}\) and \(s^{\mathsf {out}}\), then the real and the re-sampled execution will share a fake input with probability approx. \(\varepsilon ^2\). The properties imposed on \(\mathsf {P}\) then imply that with probability roughly \(\varepsilon ^2\) the real and the re-sampled execution must share the input (or output) of a permutation query. This leads naturally to a predictor that just re-samples an execution consistent with the leakage, and picks them as its prediction.

Reset-security. The case of reset-security is somewhat harder. Here we start from the premise that \(\overline{S}\) is not reset-secure: Hence, there exists an adversary \(\overline{R}\) which receives \(L, s^{\mathsf {in}}, s^{\mathsf {out}}\) from \(\overline{S}\), and can distinguish (with non-negligible advantage \(\varepsilon \)) being given access to the same random \(f: \{0,1\}^{n + 1} \rightarrow \{0,1\}^n\) used by \(\overline{S}\) from being given access to an independent \(f'\). From this, we would like to build an adversary R which receives L from S, and can distinguish the setting where R and S are given access to the same random permutation \(\rho \), from a setting where they access independent permutations \(\rho , \rho '\).

The challenge here is that we want to simulate \(\overline{R}\) correctly, by using a permutation oracle \(\rho \)/\(\rho '\) rather than f/\(f'\). To better see why this is tricky, say S is the source that queries its permutation oracle on a random 2n-bit string u, obtaining output v, and leaks \(L = (u,v)\). (This defines the corresponding \(\overline{S}\).)Footnote 5 A clever \(\overline{R}\) on input \((L = (u, v), s^{\mathsf {in}}, s^{\mathsf {out}})\) could do the following: It computes \(x_0 \,\Vert \,x_1 \leftarrow \mathsf {P}(s^{\mathsf {in}}, u)\) and \(x_3 \,\Vert \,x_2 \leftarrow \mathsf {P}(s^{\mathsf {out}}, v)\). Then, it queries \(x_1\) to its oracle, and outputs 1 iff the output equals \(x_0 \oplus x_2\). This should always be true when \(\overline{R}\) accesses f, and almost never when it accesses \(f'\).

The natural proof approach would now attempt to build R which runs \(\overline{R}\) accessing a simulated oracle consistent with the NR construction on the permutation queries made by S. However, the problem is that generically R does not know which queries S has made. Previous work [37] handled this by requiring the construction to satisfy a weaker notion of indifferentiability, called CP-sequential indifferentiability, which essentially implies that there exists a simulator that can simulate f consistently by accessing \(\rho \) and \(\rho ^{-1}\) only, and only needs to know the queries \(\overline{R}\) makes to f. This would not work with NR and our \(\overline{R}\), as the query \(x_1\) is actually uniformly random, and the simulator would likely fail to set \(x_0 \oplus x_2\) as the right output. This is why the approach of [37] ends up using the 5-round Feistel construction, as here \(\overline{R}\)’s attempt to evaluate the construction are readily detected, and answered consistently.

Our proof strategy via heavy-query sampling. Our main observation is that indifferentiability is an overkill in this setting. There is no reason \(\overline{R}\) should act adversarially to the simulator. Even more so, we can use everything \(\overline{R}\) knows, namely L, to our advantage! To do this, we use techniques borrowed from impossibility proofs in the random oracle model [4, 26]. Namely, R, on input L from S, first performs a number of permutation queries which are meant to include all of S’s likely queries to its oracle, at least when R and S are run with the same permutation oracle \(\rho \). To do this, R samples executions of S consistent with L, and the partial knowledge of the oracle \(\rho \) acquired so far. Each time such a partial execution is sampled, all queries contained in it are made to \(\rho \), and the process is repeated a number of times polynomial in \(1/\varepsilon \). Then, R samples \(s^{\mathsf {in}}, s^{\mathsf {out}}\), and internally defines an oracle \(f: \{0,1\}^{n+1} \rightarrow \{0,1\}^n\) that will be used to simulate an execution of \(\overline{R}^f(L, s^{\mathsf {in}}, s^{\mathsf {out}})\). To do this, R goes through all input-output pairs (uv) for queries to \(\rho \) it has done while simulating executions of S,Footnote 6 and defines

$$\begin{aligned} f(0 \,\Vert \,x_1) \leftarrow x_0 \oplus x_2 \;, f(1 \,\Vert \,x_2) \leftarrow x_1 \oplus x_3, \end{aligned}$$

where \(x_0 \,\Vert \,x_1 \leftarrow \mathsf {P}_{s^{\mathsf {in}}}(u)\) and \(x_3 \,\Vert \,x_2 \leftarrow \mathsf {P}_{s^{\mathsf {out}}}(v)\). Then, f is defined to be random on every other input (this can be simulated via lazy sampling). The final output of the simulated \(\overline{R}\) is then R’s final output.

The core of our proof will show that when S and R share access to \(\rho \), then the probability that R’s output is one is similar to that of \(\overline{R}\) outputting one when it accesses the same oracle as \(\overline{S}\). This will combine properties of the NR construction (allowing us to switch between f and \(\rho \)), and similar arguments as those used in [26] to prove that R ensures consistency on all queries that matter.Footnote 7

1.5 Technical Overview – Correlation Intractability

Our approach towards achieving CI is based on the following blueprint. Let R be a relation which is evasive for permutations on 2n-bit strings, and let \(\pi , \sigma \) be permutations sampled from some given distribution (this will be meant to be instantiated unconditionally below). Then, we create a new relation \(R_{\pi , \sigma }\) such that \((u, v) \in R\) iff \((\pi (u), \sigma (v)) \in R_{\pi , \sigma }\). The hope is to show that if R is an evasive relation, then \(R_{\pi , \sigma }\) is hard to satisfy for a given construction \(\mathsf {E}\) which is only correlation-intractable for a subset of all evasive relations. Then, a new composed construction \(\mathsf {E}'\) which outputs \(\sigma ^{-1}(E_{s}(\pi (u)))\) on input u would be correlation intractable for all evasive relations, since satisfying R for \(\mathsf {E}'\) implies satisfying \(R_{\pi , \sigma }\) for \(\mathsf {E}\).

Two-round Feistel. In our context, we instantiate \(\mathsf {E}\) from a two-round Feistel network. That is, on input \(x = x_0 \,\Vert \,x_1\), the two-round Feistel construction outputs \(x_2 \,\Vert \,x_3\), where \(x_2 \leftarrow f(0 \,\Vert \,x_1) \oplus x_0\) and \(x_3 \leftarrow f(1 \,\Vert \,x_2) \oplus x_1\). In a model (as the one where we consider) where f is a random function to which the adversary is given access, this construction is not correlation intractable. For instance, take the (unary) relation which is satisfied by all input-output pairs \((x_0 \,\Vert \,x_1, x_2 \,\Vert \,x_3)\) where \(x_1 = x_2\). This is clearly evasive, but trivial to satisfy for two-round Feistel. Worse is possible with k-ary relations.

However, many relations are hard, even for two-round Feistel. Take for instance any relation R with the property that for all \(x_0, x_1, x_2, x_3\), the number of \(x^*\)’s such that \((x^* \,\Vert \,x_1, x_2 \,\Vert \,x_3) \in R\) or \((x_0 \,\Vert \,x_1, x_2 \,\Vert \,x^*) \in R\) is at most \(\delta \cdot 2^n\), for some negligible function \(\delta \). No adversary A making a polynomial number of queries to f will satisfy R, except with negligible probability. Indeed, when A queries (say) \(y_2 \leftarrow f(1 \,\Vert \,x_2)\) for some \(x_2\), the only chance to produce a pair that satisfies R is \(y_1 \leftarrow f(0 \,\Vert \,x_1)\) was previously queried for some \(x_1\), and additionally, \((x_2 \oplus y_1 \,\Vert \,x_1,\ x_2 \,\Vert \,x_1 \oplus y_2) \in R\). But because \(y_2\) is being set randomly, \(x_1 \oplus y_2\) is also random, this can only hold with probability at most \(\delta \) by our assumption on \(\delta \). Thus, the probability that this pair satisfies R is negligible, and the union bound over all pairs of queries shows A is unlikely to ever satisfy R.

Where does amplification come from? Let R be a unary evasive relation. Now, imagine, again for the sake of an oversimplified illustration, that \(\pi \) and \(\sigma \) are random permutations. Then, we want to show that with high probability over the choice of \(\pi \) and \(\sigma \), the relation \(R_{\pi , \sigma }\) is hard for two-round Feistel, even if the adversary learns the entire description of \(\pi \) and \(\sigma \). Indeed, find any \(x_0, x_1, x_2\), fix \(\pi \), and fix \(u = \pi ^{-1}(x_0 \,\Vert \,x_1)\). Because R is evasive, there exists at most \(\delta \cdot 2^{2n}\) v’s (for some negligible \(\delta \)) such that \((u, v) \in R\) – call the set of such v’s \(R_u\). Because \(\sigma \) is random, the probability that \(\sigma ^{-1}(x_2 \,\Vert \,z) \in R_u\) is at most \(\delta \), and thus the expected number of z such that \((x_0 \,\Vert \,x_1, x_2 \,\Vert \,z) \in R_{\pi , \sigma }\) is at most \(\delta \cdot 2^n\), by linearity of expectation. A concentration bound will show that the probability that we are far from this expectation is indeed small, say at most \(2^{-4n}\). Taking a union bound over all \(x_0, x_1, x_2\) shows that the probability this is true for any \(x_0, x_1, x_2\) is at most \(2^{-n}\). (The symmetric argument when fixing \(x_1, x_2, x_3\) can be handled analogously.) Thus, we have just argued that with high probability over the choice of \(\pi \) and \(\sigma \), \(R_{\pi , \sigma }\) is hard for two-round Feistel!

Challenges. But obviously, this is not very useful– random permutations \(\pi , \sigma \) are inefficient to sample and describe. Also, the above result only holds for unary relations, and it is interesting to extend this to k-ary relations.

Our first insight is that the above argument only requires a bounded degree of randomness, and that (almost) t-wise independent permutations for a sufficiently small t are sufficient. We prove this using techniques for bounding sums of random variables with bounded independence [10, 36], though this will require significant adaptation because almost t-wise independent permutations do not quite produce outputs which are t-wise independent, as they are required to be distinct, and also, only approximate a random permutation. We will instantiate these by using Feistel networks with sufficiently many rounds, and t-wise independent round functions, using bounds from [24]. In fact, for the case of unary relations, we will show that we can instantiate these permutations from one single Feistel round with a 10-wise independent round function.

Moving on to k-ary relations presents even more challenges. Our approach is inherently combinatorial, whereas evasiveness is defined indirectly through the inability of an adversary to win a security game. For this reason, our result will only deal with relations R that satisfy a more structured notion of evasiveness, which we refer to as strongly evasive. Most relations of interest that we are aware of are strongly evasive, but evasiveness does not always imply strong evasiveness. However, as a result of independent interest, we show that strong evasiveness and evasiveness are related, and asymptotically equivalent when k is a constant.

2 Preliminaries

Notational preliminaries. Throughout this paper, we denote by \( \mathsf {Funcs}(X, Y)\) the set of functions \(X \rightarrow Y\), and in particular use the shorthand \( \mathsf {Funcs}(m,n)\) whenever \(X = \{0,1\}^m\) and \(Y = \{0,1\}^n\). We also denote by \(\mathsf {Perms}(X)\) the set of permutations on the set X, and analogously, \(\mathsf {Perms}(n)\) denotes the special case where \(X = \{0,1\}^n\). For \(n \in \mathbb {N}\), we let [n] denote the set \(\{1, \ldots , n\}\).

Our security definitions and proofs will often use games, as formalized by Bellare and Rogaway [9]. Typically, our games will have boolean outputs – that is, either \(\mathsf {true}\) or \(\mathsf {false}\) – and we use the shorthand \(\mathsf {Pr}\left[ \mathsf {G} \right] \) to denote the probability that a certain game outputs the value \(\mathsf {true}\), or occasionally 1 (when the output is binary, rather than boolean). Most results in this paper will be concrete, but natural asymptotic statements can be made by allowing all parameters to be functions of the security parameter.

A function family with input set X and output set Y is a pair of algorithms \(\mathsf {F}= (\mathsf {F}.\mathsf {Kg}, \mathsf {F}.\mathsf {Eval})\), where the randomized key (or seed) generation algorithm \(\mathsf {F}.\mathsf {Kg}\) outputs a seed \(s\), and the deterministic evaluation algorithm \(\mathsf {F}.\mathsf {Eval}\) takes as inputs a valid seed \(s\) and an input \(x \in X\), and returns \(\mathsf {F}.\mathsf {Eval}(s, x) \in Y\). If \(X = \{0,1\}^m\) and \(Y = \{0,1\}^n\), we say that \(\mathsf {F}\) is a family of functions from m-bits to n-bits. We usually write \(\mathsf {F}(s, \cdot ) = \mathsf {F}.\mathsf {Eval}(s, \cdot )\). A permutation family \(\mathsf {P}= (\mathsf {P}.\mathsf {Kg}, \mathsf {P}.\mathsf {Eval})\) on n bits is the special case where \(X = \{\texttt {+}, \texttt {-}\} \times \{0,1\}^n\) and \(Y = \{0,1\}^n\), and for every \(s\), there exists a permutation \(\pi _{s}\) such that \(\mathsf {P}.\mathsf {Eval}(s, (\texttt {+}, x)) = \pi _s(x)\) and \(\mathsf {P}.\mathsf {Eval}(s, (\texttt {-}, y)) = \pi ^{-1}_{s}(y)\). We usually write \(\mathsf {P}(s, \cdot ) = \mathsf {P}(s, (\texttt {+}, \cdot ))\) and \(\mathsf {P}^{-1}(s, \cdot ) = \mathsf {P}(s, (\texttt {-}, \cdot ))\).

2.1 UCEs and psPRPs

We review the UCE notion introduced in [6], and the psPRP notion [37]. As explained in the latter work, they can be seen as instantiations of a general paradigm. Yet, we consider separate security games for better readability.

Concretely, let \(\mathsf {H}\) be function family from m-bits to n-bits. Let S be an adversary called the source and D an adversary called the distinguisher. We associate with them the game \(\mathsf {UCE}^{S, D}_{m,n,\mathsf {H}}\) depicted in Fig. 1. For a family \(\mathsf {E}\) of permutations on n-bits, the psPRP-security game \(\mathsf {psPRP}^{S,D}_{n,\mathsf {E}}\) differs in that \(\mathcal {O}\) allows for inverse queries, and the ideal object is a random permutation. The corresponding advantage metrics for an (SD) are defined as

(1)

Note that we adopt the multi-key versions of UCE and psPRP security, as they are the most general, and they are not known to follow from the single-key case. Our treatment scales down to the single-key version by forcing the source to always choose \(r = 1\).

Fig. 1.
figure 1

Games to define UCE and psPRP security. Here, S is the source and D is the distinguisher. Boxed statements are only executed in the corresponding game.

We say that \(\mathsf {H}\) is UCE-secure for a class of sources \(\mathcal {S}\) if \(\mathsf {Adv}^{\mathsf {uce}}_{m,n,\mathsf {H}}(S,D)\) is negligible for all PPT D and all sources \(S \in \mathcal {S}\). Similarly, \(\mathsf {E}\) is psPRP secure for \(\mathcal {S}\) if \(\mathsf {Adv}^{\mathsf {psprp}}_{n, \mathsf {E}}(S,D)\) is negligible for all PPT D and all sources \(S \in \mathcal {S}\) It is known that \(\mathcal {S}\) cannot contain all PPT algorithms for security to be attainable. Next, we discuss two important classes of restrictions – unpredictable and reset-secure sources – considered in the literature [6, 7, 37].

Fig. 2.
figure 2

Games to define unpredictability (left) and reset-security (right) of sources. Here, S is the source, P is the predictor and R is the reset-adversary. Boxed statements are only executed in the corresponding game.

Unpredictable Sources. Let S be a source and P be an adversary called the predictor. We associate with them games \(\mathsf {f}\text {-}\mathsf {Pred}^P_{m,n,S}\) and \(\mathsf {p}\text {-}\mathsf {Pred}^P_{n,S}\) of Fig. 2 which capture the fact that P cannot predict any of the queries of S (or their inverses), when the latter interacts with a random function from m bits to n bits, or respectively a random permutation on n-bit strings. The corresponding advantage metrics are

(2)

We say S is statistically unpredictable if \(\mathsf {Adv}^{\mathsf {f}\text {-}\mathsf {pred}}_{m,n,S}(P)\) (respectively, \(\mathsf {Adv}^{\mathsf {p}\text {-}\mathsf {pred}}_{n,S}(P)\)) is negligible for all predictors P outputting a set \(Q'\) of polynomial size. An analogous notion of computational unpredictability can be defined, but it is unachievable if IO exists [14], and is usually not needed for applications. We also note that what we formalize here is the notion of simple unpredictability – P is not permitted to query the underlying primitive. The notion was proved equivalent (asymptotically) for UCEs [6] to a version where we give P access to the primitive. A similar proof follows for psPRPs. (We omit it due to lack of space.)

Reset-secure Sources. Let S be a source and R be an adversary called the reset-adversary. We associate to them the games \(\mathsf {f}\text {-}\mathsf {Reset}^R_{m,n,S}\) and \(\mathsf {p}\text {-}\mathsf {Reset}^R_{n,S}\) of Fig. 2 which formalize the reset-security of S against a random function and a random permutation, respectively. The idea here is that R should not be able to tell apart whether S is accessing the same set of oracles it accesses, or not. This is captured via the advantage metrics

figure a

We say S is statistically reset-secure if the corresponding advantage is negligible for all reset-adversaries R making a polynomial number of queries to their oracle, but which are otherwise computationally unrestricted. It is known that a (statistically) unpredictable source is (statistically) reset-secure, for both UCEs [6] and psPRPs [37]. The converse is not true – S may query a fixed known input, and let L be the empty string. S is reset-secure in the strongest sense, while being easily predictable.

2.2 Evasive Relations, Correlation Intractability

In the following, a k-ary relation R over \(X \times Y\) is a set of subsets \(S \subseteq X \times Y\), where \(1 \le \left|S \right| \le k\).Footnote 8 We are going to consider relations which are evasive with respect to a random permutation.

Evasive relations. Given a relation R over \(\{0,1\}^{m} \times \{0,1\}^{m}\), we consider the following advantage metric, involving an adversary A:

figure b

where . We say that a relation R is \((q, \delta )\)-evasive for a random permutation if \(\mathsf {Adv}^{\mathsf {evp}}_{R, m}(A) \le \delta \) for all adversaries making q queries.

Correlation intractability. Let \(\mathsf {M}^f\) be a permutation family on m-bits which makes oracle calls to a function f from n bits to \(\ell \) bits, to be modeled as a random function. Let R be a k-ary relation. Let A be any (possibly unbounded) adversary. We associate to A, \(\mathsf {M}\) and R the following \(\mathsf {cri}\)-advantage metric:

where and .

3 Public-Seed Pseudorandomness of Naor-Reingold

This section revisits the Naor-Reingold construction [33] in the public-seed setting. We prove that it transforms a UCE into a psPRP, for both unpredictable (Sect. 3.2) and reset-secure sources (Sect. 3.3). Before turning to these results, however, Sect. 3.1 reviews the construction and proves a strong statement about its indistinguishability.

3.1 The NR Construction and Its Indistinguishability

Let \(\mathsf {P}\) be a permutation family on the 2n-bit strings. We say that \(\mathsf {P}\) is \(\alpha \)-right-universal if for all distinct \(u, u' \in \{0,1\}^{2n}\), where \(\mathsf {P}_1\) denote the second n-bit half of the output of \(\mathsf {P}\). Note that a pairwise-independent permutation is a good candidate of \(\mathsf {P}\), but a simpler approach is to employ one-round of Feistel with a pairwise independent hash function \(\mathsf {H}\) as the round function, i.e., \(\mathsf {P}(s, (u_0, u_1)) = (u_1, \mathsf {H}(s, u_1) \oplus u_0)\).

The Naor-Reingold (NR) Construction. Let \(\mathsf {H}\) be a function family from \(n + 1\) bits to n bits. We define the permutation family \(\mathsf {NR}= \mathsf {NR}[\mathsf {P}, \mathsf {H}]\) on the 2n-bit strings, where \(\mathsf {NR}.\mathsf {Kg}\) outputs \((s, s^{\mathsf {in}}, s^{\mathsf {out}})\) such that and . Further, forward evaluation proceeds as follows (the inverse is obvious):

figure c

Naor and Reingold [33] proved that the \(\mathsf {NR}\) construction with random round functions is indistinguishable from a random permutation under chosen ciphertext attacks. We will need a stronger result, which we prove here, that this is true even when the seed of \(\mathsf {P}\) is given to the adversary after it stops making queries, and when the adversary can make queries to multiple instances of the construction. It will be convenient to re-use the notation already in place for the psPRP framework, and we denote by \(\mathsf {Adv}^{\mathsf {psprp}^+}_{2n, \mathsf {NR}[\mathsf {P}, \mathsf {F}]}(S, D)\) the advantage obtained by (SD) in the \(\mathsf {psPRP}^{S, D}_{2n, \mathsf {NR}[\mathsf {P}, \mathsf {F}]}\) game, with the modification that D is not given the seed for \(\mathsf {F}\), only the seeds used by the permutation \(\mathsf {P}\).

Proposition 1

(Indistinguishability of the NR construction). Let \(\mathsf {F}= \mathsf {F}[n+1, n]\) be the family of all functions from \(n + 1\) to n bits, equipped with the uniform distribution. Further, let \(\mathsf {P}\) be \(\alpha \)-right-universal. For all SD, where S makes q queries, \(\mathsf {Adv}^{\mathsf {psprp}^+}_{2n,\mathsf {NR}[\mathsf {P}, \mathsf {F}] }(S, D) \le q^2 \cdot \left( 2 \alpha + \frac{1}{2^{2n}} \right) \).

The proof of Proposition 1 can be found in the full version [38, App. A.1].

3.2 The Case of Unpredictable Sources

We first prove that the \(\mathsf {NR}\) construction transforms a UCE function family for statistically unpredictable sources into a psPRP for statistically unpredictable sources. Our proof uses a technique inspired from that of Bellare et al. [7], given originally in the setting of UCE domain extension. Concretely, we prove the following.

Theorem 1

(NR security for unpredictable sources). Let \(\mathsf {P}\) be a \(\alpha \)-right universal family of permutations on 2n-bit strings. Let \(\mathsf {H}\) be a family of functions from \(n+1\) bits to n bits. Then, for all distinguishers D and sources S making overall q queries to their oracle, there exists \(\overline{D}\) and \(\overline{S}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {psprp}}_{2n, \mathsf {NR}[\mathsf {P},\mathsf {H}]}(S, D) \le \mathsf {Adv}^{\mathsf {uce}}_{n+1, n, \mathsf {H}}(\overline{S}, \overline{D}) + q^2 \left( 2\alpha + \frac{1}{2^{2n}} \right) . \end{aligned}$$
(3)

Here, \(\overline{D}\) and D are roughly as efficient, and \(\overline{S}\) and S are similarly as efficient. In particular, \(\overline{S}\) makes 2q queries. Moreover, for every predictor \(\overline{P}\), there exists a predictor P such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {f}\text {-}\mathsf {pred}}_{n+1, n, \overline{S}}(\overline{P}) \le q^2 \cdot \left( 2\alpha + \frac{1}{2^{2n}} \right) + p \cdot \sqrt{2q^2 \alpha + \mathsf {Adv}^{\mathsf {p}\text {-}\mathsf {pred}}_{2n, S}(P)}, \end{aligned}$$
(4)

where p is a bound on the size of the set output by \(\overline{P}\).

The asymptotic interpretation is that if \(n = \omega (\log (\lambda ))\) and \(\alpha \) is negligible, if S is (statistically) unpredictable, then so is \(\overline{S}\). Further, if \(\mathsf {H}\) is a UCE for all unpredictable sources, then \(\mathsf {NR}\) is a psPRP for all statistically unpredictable sources.

We stress that the predictor P built in the proof does not preserve the efficiency of \(\overline{P}\), which is not a problem, as we only consider statistical notions. While we do not elaborate in the proof, it turns out that the running time of P is exponential in the length of S’s leakage, thus the statement carries over to computational unpredictability if \(L = O(\log \lambda )\).

Proof

We first consider three games, \(\mathsf {G}_0, \mathsf {G}_1\), and \(\mathsf {G}_2\). Game \(\mathsf {G}_0\) is the game \(\mathsf {psPRP}^{S, D}_{2n, \mathsf {NR}[\mathsf {P}, \mathsf {F}]}\) in the case \(b = 1\), and modified to return \(\mathsf {true}\) if \(b' = 1\). Game \(\mathsf {G}_2\) is the game \(\mathsf {psPRP}^{S, D}_{2n, \mathsf {NR}[\mathsf {P}, \mathsf {F}]}\) in the case \(b = 0\), and modified to return \(\mathsf {true}\) if \(b' = 1\). The intermediate game \(\mathsf {G}_1\) is obtained by modifying \(\mathsf {G}_0\) as follows: Initially, r random functions are sampled, and when evaluating the \(\mathsf {NR}\) construction within \(\mathcal {O}\) queries, the evaluation of \(\mathsf {H}(s_i, b \,\Vert \,x)\) is replaced by an evaluation of the random function \(f_i(b \,\Vert \,x)\). Then,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {psprp}}_{2n, \mathsf {NR}[\mathsf {P},\mathsf {H}]}(S, D) = \left( \mathsf {Pr}\left[ \mathsf {G}_0 \right] - \mathsf {Pr}\left[ \mathsf {G}_1 \right] \right) + \left( \mathsf {Pr}\left[ \mathsf {G}_1 \right] - \mathsf {Pr}\left[ \mathsf {G}_2 \right] \right) . \end{aligned}$$

We can directly get \(\mathsf {Pr}\left[ \mathsf {G}_1 \right] - \mathsf {Pr}\left[ \mathsf {G}_2 \right] \le q^2 \left( 2\alpha + \frac{1}{2^{2n}} \right) \) as a corollary of Proposition 1, since neither of \(\mathsf {G}_1\) and \(\mathsf {G}_2\) uses the seeds generated by \(\mathsf {H}.\mathsf {Kg}\).

Going on, let us consider the new source \(\overline{S}\) which simulates an execution of S, and uses access to an oracle \(\mathcal {O}(i, X)\), implementing for each i a function from \(n + 1\) bits to n bits, to internally simulate the round functions \(\mathsf {NR}\) construction used to answer S’s queries. A formal description is in Fig. 3. Also consider the distinguisher \(\overline{D}\) such that

$$\begin{aligned} \overline{D}(L' = (L, {\varvec{s}}^{\mathsf {in}}, {\varvec{s}}^{\mathsf {out}}), {\varvec{s}}) = D(L, ({\varvec{s}}, {\varvec{s}}^{\mathsf {in}}, {\varvec{s}}^{\mathsf {out}})) , \end{aligned}$$

where \({\varvec{s}} = (s_1, \ldots , s_r)\), \({\varvec{s}}^{\mathsf {in}} = (s_1^{\mathsf {in}}, \ldots , s_r^{\mathsf {in}})\), and \({\varvec{s}}^{\mathsf {out}} = (s_1^{\mathsf {out}}, \ldots , s_r^{\mathsf {out}})\) Therefore, \(\mathsf {G}_0\) and \(\mathsf {G}_1\) behave exactly as \(\mathsf {UCE}^{\overline{S}, \overline{D}}_{n+1, n, \mathsf {H}}\) with challenge bits \(b = 1\) and \(b = 0\), respectively, with the only difference of outputting \(\mathsf {true}\) whenever the distinguisher’s output is \(b' = 1\). Consequently, \(\mathsf {Adv}^{\mathsf {uce}}_{n+1, n, \mathsf {H}}(\overline{S}, \overline{D}) = \mathsf {Pr}\left[ \mathsf {G}_0 \right] - \mathsf {Pr}\left[ \mathsf {G}_1 \right] \).

The remainder of the proof relates the unpredictability of S and that of \(\overline{S}\), establishing (4) in the theorem statement. For lack of space, the argument is deferred to the full version [38, App. A.2].    \(\square \)

Fig. 3.
figure 3

The source \(\overline{S}\) in the proof of Theorems 1 and 2.

3.3 The Case of Reset-Secure Sources

Theorem 1’s importance stems mostly from the fact that it establishes the equivalence of psPRPs and UCEs for the case of (statistically) unpredictable sources. The question was left open in [37]. Many applications (e.g., instantiating the permutation within sponges, or any other indifferentiable hash construction) however require the stronger notion of reset-security. For this, [37] show that the five-round Feistel construction suffices, using a weaker variant of indifferentiability, and left open the question of whether four-rounds suffice.

We do better here: we prove that the NR construction transforms a UCE for statistically reset-secure sources into a psPRP for the same class of sources. The proof starts as the one of Theorem 1, but then shows that the source \(\overline{S}\) built therein is in fact statistically reset-secure whenever S is. This step will resort to a variant of the heavy-query sampling method of Impagliazzo and Rudich [26] to simulate a random oracle from the leakage which captures “relevant correlations” with what is learnt by the source.

Theorem 2

(NR security for reset-secure sources). Let \(\mathsf {P}\) be a \(\alpha \)-right universal family of permutations on 2n-bit strings, and let \(\mathsf {H}\) be a function family from \(n +1\) bits to n bits. Then, for all distinguishers D and sources S making overall q queries to their oracle, there exists \(\overline{D}\) and \(\overline{S}\) such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {psprp}}_{2n, \mathsf {NR}[\mathsf {P}, \mathsf {H}]}(S, D) \le \mathsf {Adv}^{\mathsf {uce}}_{n+1, n, \mathsf {H}}(\overline{S}, \overline{D}) + q^2 \left( 2\alpha + \frac{1}{2^{2n}} \right) . \end{aligned}$$
(5)

Here, \(\overline{D}\) and D are roughly as efficient, and \(\overline{S}\) and S are similarly as efficient. In particular, \(\overline{S}\) makes 2q queries. Moreover, for every reset-adversary \(\overline{R}\) making p queries, there exists a reset-adversary R such that

$$\begin{aligned} \mathsf {Adv}^{\mathsf {f}\text {-}\mathsf {reset}}_{n+1, n, \overline{S}}(\overline{R}) \le 2\mathsf {Adv}^{\mathsf {p} \text {-}\mathsf {reset}}_{2n, S}(R) + 4\left( q + \frac{8qp^2}{\varepsilon } \ln (4p/\varepsilon )\right) ^2 \left( 2 \alpha + \frac{1}{2^{2n}} \right) , \end{aligned}$$
(6)

where \(\varepsilon := \mathsf {Adv}^{\mathsf {f}\text {-}\mathsf {reset}}_{n+1, n, \overline{S}}(\overline{R})\). In particular, R makes \(4qp^2/\varepsilon \cdot \ln (4p/\varepsilon )\) queries to its oracle.

Asymptotically, (6) implies that if \(\overline{R}\) exists making queries, and achieving non-negligible advantage \(\varepsilon \), then R makes also a polynomial number of queries, and achieves non-negligible advantage, as long as \(\alpha \) is negligible, and \(n = \omega (\log \lambda )\). Thus, reset-security of R yields reset-security of \(\overline{R}\).

We also believe that the technique of Barak and Mahmoody [4] can be used to reduce the \(8qp^2/\varepsilon \) term to \(O(qp)/\varepsilon \). We did not explore this avenue here, as the proof approach of [26] is somewhat easier to adapt to our setting.

Proof

The setup of the proof is identical to that in Theorem 1, in particular the construction of \(\overline{S}\) from S (and of \(\overline{D}\) from D.) The difference is in relating the reset-security of S and \(\overline{S}\). In particular, let

$$\begin{aligned} \varepsilon := \mathsf {Adv}^{\mathsf {f}\text {-}\mathsf {reset}}_{n+1, n, \overline{S}}(\overline{R}) = \mathsf {Pr}\left[ \mathsf {f}\text {-}\mathsf {Reset}^{\overline{R}}_{n+1,n,\overline{S}} \,\big |\, b = 0 \right] - \mathsf {Pr}\left[ \lnot \mathsf {f}\text {-}\mathsf {Reset}_{n+1,n,\overline{S}}^{\overline{R}} \,\big |\, b = 1 \right] . \end{aligned}$$

The RHS is the difference of the probabilities of \(\overline{R}\) outputting 0 in the cases \(b = 0\) and \(b = 1\) respectively. We are going to build a new adversary R against S which satisfies (6). We assume without loss of generality that \(\overline{R}\) is deterministic, and makes exactly p distinct queries to its oracle.

We start the proof with some game transitions that will lead naturally to the definition of the adversary R. Formal descriptions are found in our full version [38, Figs. 7 and 8] – our description here is self-contained.

The initial game \(\mathsf {G}_1\) is simply \(\mathsf {f}\text {-}\mathsf {Reset}^{\overline{R}}_{\overline{S}}\) with the bit \(b = 0\), i.e., \(\overline{S}\) and \(\overline{R}\) access the same functions \(f_1, \ldots , f_r\) here. Further, \(\mathsf {G}_1\) returns \(\mathsf {true}\) iff \(\overline{R}\) returns 0. Thus, \(\mathsf {Pr}\left[ \mathsf {G}_1 \right] = \mathsf {Pr}\left[ \mathsf {f}\text {-}\mathsf {Reset}^{\overline{R}}_{\overline{S}} \,\big |\, b = 0 \right] \). Game \(\mathsf {G}_2\) slightly changes \(\mathsf {G}_1\): It keeps track (in a set \(Q_{\mathsf {P}}\)) of the triples (iUV) describing \(\overline{\mathcal {O}}\) queries made by the simulated S within \(\overline{S}\); i.e., either S queried \((i, (\texttt {+}, U))\), and obtained V, or queries \((i, (\texttt {-}, V))\), and obtained U. After \(\overline{S}\) terminates with leakage \((L, {\varvec{s}})\), where \({\varvec{s}} = (s^{\mathsf {in}}_1, s^{\mathsf {out}}_1, \ldots , s^{\mathsf {in}}_r, s^{\mathsf {out}}_r)\), for every \((i, U, V) \in Q_{\mathsf {P}}\) we compute \(x_0 \,\Vert \,x_1 \leftarrow \mathsf {P}(s^{\mathsf {in}}_i, U)\) and \(x_3 \,\Vert \,x_2 \leftarrow \mathsf {P}(s^{\mathsf {out}}_i, V)\), and define table entries

$$T[i, 0 \,\Vert \,x_1] \leftarrow x_0 \oplus x_2 \;, \;\; T[i, 1 \,\Vert \,x_2] \leftarrow x_1 \oplus x_3\;. $$

For later reference, we denote by X the set of pairs (ix) for which we set T[ix] using \(Q_{\mathsf {P}}\) and \({\varvec{s}}\). We then run \(\overline{R}(L, {\varvec{s}})\), and answer its oracle queries (ix) using T[ix]. If the entry is undefined, then we return a random value. (As we assumed all of \(\overline{R}\)’s queries are distinct, we do not need to remember the output.) As before, \(\mathsf {G}_{2}\) outputs \(\mathsf {true}\) iff \(\overline{R}\) outputs 0.

Note that we always have \(T[i, x] = f_i(x)\) for very (ix) such that \(f_i(x)\) was queried by \(\overline{S}\), and re-sampling values un-queried by \(\overline{S}\) upon \(\overline{R}\)’s queries does not change the distribution of \(\overline{R}\)’s output, and hence \(\mathsf {Pr}\left[ \mathsf {G}_{1} \right] = \mathsf {Pr}\left[ \mathsf {G}_2 \right] \).

The intersection sampler. The game \(\mathsf {G}_3\) generates a surrogate for \(Q_{\mathsf {P}}\). This is the output of an algorithm \(\mathsf {Sam}\) which, after \(\overline{S}\) terminates with output \((L, {\varvec{s}})\), takes as input the leakage L (crucially, not \({\varvec{s}}\)!) and an iteration parameter \(\eta = 4p/\varepsilon \ln (4p/\varepsilon )\) (we let also \(\tau = p \cdot \eta \)). \(\mathsf {Sam}\) queries the very same \(\overline{\mathcal {O}}\) implemented by \(\overline{S}\) to answer S’s queries (which internally simulates the NR construction using \(\overline{S}\)’s own oracle), and returns a set \(\widetilde{Q}_{\mathsf {P}}\) of 4-tuples (iUVj) such that \(j \in [p]\), and (iUV) is such that \(\overline{\mathcal {O}}(i, (\texttt {+}, U))\) would return V (or equivalently \(\overline{\mathcal {O}}(i, (\texttt {-}, V))\) would return U). Internally, \(\mathsf {Sam}\) will make calls to another (randomized) sub-procedure \(\mathcal {Q}\) which takes as input the leakage L, as well as a set Q of tuples (iUVj) consistent with \(\overline{\mathcal {O}}\), and returns a set \(\varDelta \) of at most q tuples (iUV), which are not necessarily consistent with \(\overline{\mathcal {O}}\). We will specify in detail later below what \(\mathcal {Q}\) exactly does, as some further game transitions will come handy to set up proper notation. For now, a generic understanding will suffice. In particular, given such \(\mathcal {Q}\), \(\mathsf {Sam}\) operates as in Fig. 4. As we can see, for each \((i, U, V, j) \in \widetilde{Q}_{\mathsf {P}}\), j indicates the outer iteration in which this query was added to \(\widetilde{Q}_{\mathsf {P}}\). Using this information, for every \(j \in [p]\), and every 4-tuple (iUVj) we compute \(x_0 \,\Vert \,x_1 \leftarrow \mathsf {P}(s^{\mathsf {in}}_i, U)\) and \(x_3 \,\Vert \,x_2 \leftarrow \mathsf {P}(s^{\mathsf {out}}_i, V)\), define

$$\begin{aligned} \widetilde{T}[i, 0 \,\Vert \,x_1] \leftarrow x_0 \oplus x_2 \;, \; \; \widetilde{T}[i, 1 \,\Vert \,x_2] \leftarrow x_1 \oplus x_3, \end{aligned}$$

and add \((i, 0 \,\Vert \,x_1), (i, 1 \,\Vert \,x_2)\) to the set \(\widetilde{X}^j\). A for now irrelevant caveat is that if one of the entries in \(\widetilde{T}\) is already set, then we do not overwrite it.Footnote 9

Fig. 4.
figure 4

Description of algorithm \(\mathsf {Sam}\).

Then, after all of this, \(\mathsf {G}_3\) resumes by executing \(\overline{R}(L, {\varvec{s}})\). For \(\overline{R}\)’s j-th query (ix) we do the following:

  1. 1.

    If \((i, x) \in \widetilde{X}^{j'}\) for some \(j' \le j\), then we respond with \(\widetilde{T}[i, x]\).

  2. 2.

    Otherwise, if \((i, x) \in X\), but the first condition was not met, we respond with T[ix].

  3. 3.

    Finally, if neither of the above is true, we respond randomly.

As before, \(\mathsf {G}_3\) outputs \(\mathsf {true}\) iff \(\overline{R}\) outputs 0. For now, all modifications are syntactical. Indeed, up to the point we start \(\overline{R}\), we satisfy the invariant that \(T[i, x] = f_i(x)\) or \(\widetilde{T}[i, x] = f_i(x)\) whenever these are defined, because \(\overline{\mathcal {O}}\) behaves according to the \(\mathsf {NR}\) construction using \({\varvec{s}}\). On the other hand, if during the execution \(\overline{R}(L, {\varvec{s}})\) we respond randomly, we know for sure \(f_i(x)\) was not queried by \(\overline{S}\), and thus we can re-sample it. Thus, \(\mathsf {Pr}\left[ \mathsf {G}_3 \right] = \mathsf {Pr}\left[ \mathsf {G}_2 \right] = \mathsf {Pr}\left[ \mathsf {G}_1 \right] \).

Moving to \(\mathsf {G}_4\), we now answer \(\overline{\mathcal {O}}\) queries by S (within \(\overline{S}\)) and by \(\mathsf {Sam}\) using random permutations \(\pi _1, \ldots \pi _r\), instead of simulating the NR construction using \(f_1, \ldots , f_r\), i.e., \(\overline{\mathcal {O}}(i, (\texttt {+}, U)) = \pi _i(U)\) and \(\overline{\mathcal {O}}(i, (\texttt {-}, V)) = \pi _i^{-1}(V)\). The seeds \({\varvec{s}}\) are now independent of \(\overline{\mathcal {O}}\). We do not change anything else. We note that the indistinguishability of \(\mathsf {G}_3\) and \(\mathsf {G}_4\) directly reduces to a suitable distinguisher for Proposition 1, as only \(\mathsf {Sam}\) and S (within \(\overline{S}\)) make queries to \(\overline{\mathcal {O}}\), but they do not get the keys \({\varvec{s}}\), which are used only after all queries to \(\overline{O}\) have been made to define X and \(\widetilde{X}\). Therefore,

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {G}_1 \right] = \mathsf {Pr}\left[ \mathsf {G}_3 \right] \le \mathsf {Pr}\left[ \mathsf {G}_4 \right] + (q + 2q\tau )^2 \left( 2\alpha + \frac{1}{2^{2n}} \right) , \end{aligned}$$
(7)

where we have used the fact that \(\mathsf {Sam}\) makes \(2q\tau = 2pq\eta \) queries.

The final game is \(\mathsf {G}_5\) is identical to \(\mathsf {G}_4\), except that in the process of answering \(\overline{R}\)’s queries, if case 2 happens, we also set answer randomly. However, should such situation occur, a \(\mathsf {bad}\) flag is set in \(\mathsf {G}_5\), and since up to the point this flag is set, the behavior of \(\mathsf {G}_4\) and \(\mathsf {G}_5\) is identical,

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {G}_4 \right] - \mathsf {Pr}\left[ \mathsf {G}_5 \right] \le \mathsf {Pr}\left[ \mathsf {G}_5 \text { sets } \mathsf {bad} \right] . \end{aligned}$$

To analyze the probability on the RHS, we need to specify \(\mathcal {Q}(L, \widetilde{Q})\) used by \(\mathsf {Sam}\) here. (Note all statements so far were independent of it.) For a given L which appears with positive probability in \(\mathsf {G}_5\), consider the distribution of the input-output pairs \(Q_{\mathsf {P}}\) defined by the interaction of S with \(\overline{\mathcal {O}}\), conditioned on the leakage being L, and \(\pi _1, \ldots , \pi _r\) being consistent with the triples defined by \(\widetilde{Q}\). Then, \(\mathcal {Q}(L, \widetilde{Q})\) outputs a sample of \(Q_{\mathsf {P}}\) according to this distribution. Using this, we prove the following lemma in our full version [38, App. A.4], which uses ideas similar to those from [26], with some modifications due to the setting (and the fact that \(\overline{R}\) makes p queries).

Lemma 1

\(\mathsf {Pr}\left[ \mathsf {G}_5 \text { sets } \mathsf {bad} \right] \le \varepsilon /2\)

Fig. 5.
figure 5

Adversary R in the proof of Theorem 2.

Given this, we are now ready to give our adversary R, which we build from \(\overline{R}\) and \(\mathsf {Sam}\) as described in Fig. 5. By a purely syntactical argument,

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {G}_5 \right] = \mathsf {Pr}\left[ \mathsf {p}\text {-}\mathsf {Reset}_{S}^R \,\big |\, b = 0 \right] , \end{aligned}$$
(8)

recalling that the case \(b = 0\) is the one where both S and R access the same permutations \(\pi _1, \ldots , \pi _r\). Therefore, we have established, combining (8), (7), Lemma 1,

$$\begin{aligned} \mathsf {Pr}\left[ \mathsf {p}\text {-}\mathsf {Reset}_{S}^R \,\big |\, b = 0 \right] \ge \mathsf {Pr}\left[ \mathsf {f}\text {-}\mathsf {Reset}^{\overline{R}}_{\overline{S}} \,\big |\, b = 0 \right] - \frac{\varepsilon }{2} - (q + 2q\tau )^2 \left( 2\alpha + \frac{1}{2^{2n}} \right) . \end{aligned}$$
(9)

In the full version [38, App. A.5] we also prove formally that in the case \(b = 1\), R in the game \(\mathsf {p}\text {-}\mathsf {Reset}_{S}^R\) almost perfectly simulates an execution of \(\mathsf {f}\text {-}\mathsf {Reset}^{\overline{R}}_{\overline{S}}\), or more formally,

$$\begin{aligned} \mathsf {Pr}\left[ \lnot \mathsf {p}\text {-}\mathsf {Reset}_{S}^R \,\big |\, b = 1 \right] \le \mathsf {Pr}\left[ \lnot \mathsf {f}\text {-}\mathsf {Reset}_{\overline{S}}^{\overline{R}} \,\big |\, b = 1 \right] + (q + 2q\tau )^2 \left( 2 \alpha + \frac{1}{2^{2n}} \right) . \end{aligned}$$
(10)

We can combine (10) and (9) to obtain, with \(\varDelta = 2(q + 2q\tau )^2 \left( 2 \alpha + \frac{1}{2^{2n}} \right) \),

$$\begin{aligned} \begin{aligned} \mathsf {Adv}^{\mathsf {p} \text {-}\mathsf {reset}}_{2n, S}(R)&= \mathsf {Pr}\left[ \mathsf {p}\text {-}\mathsf {Reset}_{S}^R \,\big |\, b = 0 \right] - \mathsf {Pr}\left[ \lnot \mathsf {p}\text {-}\mathsf {Reset}_{S}^R \,\big |\, b = 1 \right] \\&\ge \mathsf {Pr}\left[ \mathsf {f}\text {-}\mathsf {Reset}_{\overline{S}}^{\overline{R}} \,\big |\, b = 0 \right] - \mathsf {Pr}\left[ \lnot \mathsf {f}\text {-}\mathsf {Reset}_{\overline{S}}^{\overline{R}}) \,\big |\, b = 1 \right] - \frac{\varepsilon }{2} - \varDelta \\&\ge \varepsilon /2 - \varDelta \;. \end{aligned} \end{aligned}$$

This concludes the proof.    \(\square \)

4 Correlation Intractability of Public-Seed Permutations

In this section, we study the correlation intractability (CI) of the NR construction against k-ary evasive relations. Firstly, in Sect. 4.1, we define a stronger notion of evasiveness – strong evasiveness – and show that evasiveness and strong evasiveness are asymptotically equivalent when \(k = O(1)\). In Sect. 4.2 we study the relations that are hard for two-round Feistel. In Sect. 4.3 we show that for \(k = O(1)\) the \(\mathsf {NR}\) construction where \(\mathsf {P}^{-1}\) is a family of almost \(O(k^2)\)-wise independent permutations is correlation intractable against k-ary evasive relations. In the special case of unary evasive relations (\(k=1\)), we show that (see [38, App. B.6]) \(\mathsf {P}\) instead can be instantiated from one-round Feistel with a 10-wise independent round function.

4.1 Strong Evasiveness

Evasiveness is defined through the hardness of winning a security game. For our results, we need instead a combinatorial understanding of evasive relations. To this end, we will rely on the following notion of evasiveness, which, as we show below, is generally implied by evasiveness if \(k = O(1)\).

Definition 1

(Strongly evasive relations). Let R be a k-ary relation over \(X \times X\) and \(\delta \in [0,1]\). We say that R is \(\delta \)-strongly evasive if the following are true for all \(0 \le j < k' \le k\):

  • For all distinct \(\mathbf {u}_1, \ldots , \mathbf {u}_{k'} \in X\), all \(\mathbf {v}_1,\ldots , \mathbf {v}_j \in X\), we have

    $$\begin{aligned} \left|\left\{ (\mathbf {v}_{j+1}, \ldots , \mathbf {v}_{k'}) \,:\, \{(\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'})\} \in R \right\} \right| \le \delta \cdot \prod _{i = j}^{k'-1} (\left|X \right| - i). \end{aligned}$$
  • For all distinct \(\mathbf {v}_1, \ldots , \mathbf {v}_{k'} \in X\), all \(\mathbf {u}_1,\ldots , \mathbf {u}_j \in X\), we have

    $$\begin{aligned} \left|\left\{ (\mathbf {u}_{j+1}, \ldots , \mathbf {u}_{k'}) \,:\, \{(\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'})\} \in R \right\} \right| \le \delta \cdot \prod _{i = j}^{k'-1} (\left|X \right| - i). \end{aligned}$$

It is not hard to see that if a relation is \(\delta \)-strongly evasive, then it is also evasive, in the sense that it is \((q, q^{k} \delta )\)-evasive. In particular, \(q^k \delta \) is negligible whenever \(\delta \) is negligible, q polynomial, and \(k = O(1)\).

We remark that there are relations R which are evasive, yet not strongly evasive. Consider for example the relation which contains \(\{(0^{2n}, 0^{2n}), (\mathbf {u}, \mathbf {v})\}\) for all \(\mathbf {u}, \mathbf {v}\ne 0^{2n}\). This relation is obviously evasive to start with – satisfying it requires \(\pi (0^{2n}) = 0^{2n}\), which will happen with probability \(2^{-2n}\) only, yet for \(\mathbf {u}_1 = \mathbf {v}_1 = 0^{2n}\), and \(\mathbf {u}_2 \ne 0^{2n}\), all strings \(\mathbf {v}_2\) make \(\{(\mathbf {u}_1, \mathbf {v}_1), (\mathbf {u}_2, \mathbf {v}_2)\}\) valid. Still, somehow, the intuition is that the core of R is the relation \(R^* = \{\{(0^{2n}, 0^{2n})\}\}\), which is strongly evasive, with \(\delta = 2^{-2n}\). Indeed, an attacker that satisfies the original relation R, can directly satisfy \(R^*\), thus the fact that \(R^*\) is evasive (and in particular, strongly evasive) implies that R is evasive.

The following lemma generalizes this, and implies e.g. that for and \(k = O(1)\), evasiveness and strong evasiveness are (qualitatively) equivalent. The proof is found in the full version [38, App. B.1].

Lemma 2

(Normalization of evasive relations). Let \(\delta > 0\), and let R be a k-ary \((k^2, \delta ^k)\)-evasive relation on \(X \times X\) for random permutations. Then, there exists a relation \(R^*\) which is \(\delta \)-strongly evasive for random permutations, and moreover, for every \(S \in R\), there exists \(\emptyset \ne S^* \subseteq S\) such that \(S^* \in R^*\).

Lemma 2 now is all we need. Say \(\mathsf {E}\) is correlation intractable for all k-ary strongly evasive relations (for some negligible \(\delta \)), where \(k = O(1)\). Then, \(\mathsf {E}\) must be correlation intractable for any \((k^2, \delta )\)-evasive relation R, too. Were it not, we could take an adversary A breaking the CI for R with non-negligible advantage, and use it to break CI of \(R^*\). To this end, we simply run A, and when it outputs \(S \in R\), we outputs the corresponding \(S^* \in R^*\) guaranteed by Lemma 2. (As \(k = O(1)\), a random subset of S will do with constant loss in the advantage.) But since \(R^*\) is \(\root k \of {\delta }\)-strongly evasive, this contradicts our assumption on \(\mathsf {E}\).

Clearly, the equivalence is merely asymptotic. If one is interested in concrete security, the best approach to use our results below is to directly assess the \(\delta \) for which a specific relation R is \(\delta \)-strongly evasive.

4.2 Partial Correlation Intractability of Two-Round Feistel

The two-round Feistel construction \(\mathsf {Fei}_2^f\), is a permutation on 2n-bit strings that makes calls to an oracle \(f: \{0,1\}^{n+1} \rightarrow \{0,1\}^{n}\). In particular, on input \(\mathbf {x}= x_0 \,\Vert \,x_1\), where \(x_0, x_1 \in \{0,1\}^{n}\), running \(\mathsf {Fei}_2^f(\mathbf {x})\) outputs \(\mathbf {y}= x_2 \,\Vert \,x_{3}\), where

$$\begin{aligned} x_2 \leftarrow x_0 \oplus f(0 \,\Vert \,x_1) \;; x_3 \leftarrow x_1 \oplus f(1 \,\Vert \,x_2). \end{aligned}$$

Symmetrically, upon an inverse query, \({\mathsf {Fei}_2^f}^{-1}(\mathbf {y}= x_2 \,\Vert \,x_{3})\) simply computes the values backwards, and outputs \(\mathbf {x}= x_0 \,\Vert \,x_1\).

In this section, we discuss relations R on 2n-bit strings that are hard for two-round Feistel when instantiated with a random function. In particular, we will give a combinatorial characterization which is sufficient to achieve this.

Feistel evasiveness. We first note that in a relation R, certain sets \(S \in R\) can never be satisfied by the two-round Feistel construction out of structural constraints. In particular, if we have two input-output pairs \((\mathbf {x}_1[0] \,\Vert \,\mathbf {x}_1[1], \mathbf {x}_1[2] \,\Vert \,\mathbf {x}_1[3])\) and \((\mathbf {x}_2[0] \,\Vert \,\mathbf {x}_2[1], \mathbf {x}_2[2] \,\Vert \,\mathbf {x}_2[3])\) with \(\mathbf {x}_1[2] = \mathbf {x}_2[2]\) in the same set \(S \in R\), then we must have \(\mathbf {x}_1[3] \oplus \mathbf {x}_2[3] = \mathbf {x}_1[1] \oplus \mathbf {x}_2[1]\). Symmetrically, if \(\mathbf {x}_1[1] = \mathbf {x}_2[1]\), then we must have \(\mathbf {x}_1[0] \oplus \mathbf {x}_2[0] = \mathbf {x}_1[2] \oplus \mathbf {x}_2[2]\). It will thus be convenient to define the following.

Definition 2

For every k-ary relation R on 2n-bit strings, we define the relation \(\overline{R} \subseteq R\) that only contains \(S \in R\) if for every \((\mathbf {x}_1[0] \,\Vert \,\mathbf {x}_1[1], \mathbf {x}_1[2] \,\Vert \,\mathbf {x}_1[3])\), \((\mathbf {x}_2[0] \,\Vert \,\mathbf {x}_2[1], \mathbf {x}_2[2] \,\Vert \,\mathbf {x}_2[3]) \in S\), the following is true:

  • If \(\mathbf {x}_1[2] = \mathbf {x}_2[2]\), then \(\mathbf {x}_1[3] \oplus \mathbf {x}_2[3] = \mathbf {x}_1[1] \oplus \mathbf {x}_2[1]\).

  • If \(\mathbf {x}_1[1] = \mathbf {x}_2[1]\), then \(\mathbf {x}_1[0] \oplus \mathbf {x}_2[0] = \mathbf {x}_1[2] \oplus \mathbf {x}_2[2]\).

Clearly, the significance of this is that when assessing whether R is correlation intractable for two-round Feistel, it suffices to prove that \(\overline{R}\) is correlation intractable, as \(S \in R \setminus \overline{R}\) can never be satisfied. We are now ready to state the following combinatorial requirement on relations, which we will prove to be evasive for two-round Feistel below.

Definition 3

(\(\delta \)-2-Feistel evasive relations). Let R be a k-ary relation over \(\{0,1\}^{2n}\), and \(\delta \in [0,1]\). We say that R is \(\delta \)-2-Feistel evasive if the following are true for all \(0 \le j < k' \le k\):

  • For all distinct \(\mathbf {x}_1, \ldots , \mathbf {x}_{k'} \in \{0,1\}^{2n}\), distinct \(\mathbf {y}_1,\ldots , \mathbf {y}_j \in \{0,1\}^{2n}\), and \(y^* \in \{0,1\}^n\) s.t. \(\mathbf {x}_{j+1}[1], \ldots , \mathbf {x}_{k'}[1]\) are distinct and \(y^* \notin \{\mathbf {y}_1[0], \ldots , \mathbf {y}_j[0]\}\),

    $$\begin{aligned}&\Big |\{(y_{j+1}, \ldots , y_{k'})\}\ :\ \{(\mathbf {x}_1, \mathbf {y}_1), \ldots , (\mathbf {x}_{j}, \mathbf {y}_{j}), \qquad \qquad \qquad \qquad \qquad \qquad \nonumber \\&\qquad \qquad \qquad \quad (\mathbf {x}_{j+1}, y^* \,\Vert \,y_{j+1}), \ldots , (\mathbf {x}_{k'}, y^* \,\Vert \,y_{k'}) \} \in \overline{R'}\Big |\le \delta ' \cdot 2^n \;. \end{aligned}$$
    (11)
  • For all distinct \(\mathbf {y}_1, \ldots , \mathbf {y}_{k'} \in \{0,1\}^{2n}\), distinct \(\mathbf {x}_1,\ldots , \mathbf {x}_j \in \{0,1\}^{2n}\), and \(x^* \in \{0,1\}^n\) s.t. \(\mathbf {y}_{j+1}[0], \ldots , \mathbf {y}_{k'}[0]\) are distinct and \(x^* \notin \{\mathbf {x}_1[1], \ldots , \mathbf {x}_j[1]\}\),

    $$\begin{aligned}&\Big |\{(x_{j+1}, \ldots , x_{k'})\}\ :\ \{(\mathbf {x}_1, \mathbf {y}_1), \ldots , (\mathbf {x}_{j}, \mathbf {y}_{j}), \qquad \qquad \qquad \qquad \qquad \qquad \nonumber \\&\qquad \qquad \qquad \quad (x_{j+1} \,\Vert \,x^*, \mathbf {y}_{j+1}), \ldots , (x_{k'} \,\Vert \,x^*, \mathbf {y}_{k'}) \} \in \overline{R'} \Big |\le \delta ' \cdot 2^n \;. \end{aligned}$$
    (12)

Also, we let \(\mathsf {FEv}(k,\delta )\) denote the set of all k-ary \(\delta \)-2-Feistel evasive relations.

Feistel correlation intractability. We now prove that for all relations satisfying Definition 3, two-round Feistel is indeed correlation intractable in the model where both round functions are independent random functions, to which the adversary is given oracle access.

Proposition 2

(CI of Two-round Feistel). For \(\delta \in [0,1]\) and \(k \ge 1\) be an integer, let \(R \in \mathsf {FEv}(k,\delta )\). For any (unbounded) adversary A making at most q queries to f, \(\mathsf {Adv}^{\mathsf {cri}}_{R,\mathsf {Fei}_2^f}(A) \le 2k {\delta } \cdot {q}^{2k+1}\).

The proof of Proposition 2 can be found in the full version [38, App. B.2].

Remark 1

For the special case of \(k=1\), that is, unary relations, one can adapt the above proof and show that \(\mathsf {Adv}^{\mathsf {cri}}_{R,\mathsf {Fei}_2^f}(A) \le \delta \cdot q^2\) where A makes q queries to f and \(R \in \mathsf {FEv}(1,\delta )\).

4.3 Correlation Intractability of the NR Construction

In this section we view the NR construction as a family \(\mathsf {NR}^f[\mathsf {P}]\) that makes oracle calls to \(f : \{0,1\}^{n+1} \rightarrow \{0,1\}^{n}\) and \(\mathsf {P}\) is a family of permutations on 2n-bits. The key generation algorithm \(\mathsf {NR}.\mathsf {Kg}\) now just outputs a tuple \((s^{\mathsf {in}}, s^{\mathsf {out}})\) where and the evaluation algorithm \(\mathsf {NR}.\mathsf {Eval}\) proceeds as before but instead makes calls to f for evaluating the round function.

We show that \(\mathsf {NR}^f[\mathsf {P}]\), where \(\mathsf {P}^{-1}\) is a family of almost \(O(k^2)\)-wise independent permutations, is correlation intractable against strongly evasive k-ary relations when the adversary is given the seed \((s^{\mathsf {in}}, s^{\mathsf {out}})\) of the \(\mathsf {NR}\) construction and only oracle access to f. The proof of CI proceeds by showing that \(\mathsf {P}\) transforms a strongly evasive relation R into a 2-Feistel evasive relation \(R_{\mathsf {\pi }, \mathsf {\sigma }}\) (see Fig. 6) and hence for the adversary to break the CI of \(\mathsf {NR}^f[\mathsf {P}]\) it needs to break the CI of two-round Feistel against \(R_{\mathsf {\pi }, \mathsf {\sigma }}\) which was studied in Sect. 4.2.

p -wise independent permutations. For any \(\varepsilon \in [0,1]\) and \(p \ge 1\), we say that a family of permutations \(\mathsf {P}\) on m-bit strings is \((\varepsilon , p)\)-wise independent if for all distinct \(u_1, \ldots , u_p \in \{0,1\}^m\), the distributions of \(\mathsf {P}(s, u_1), \ldots , \mathsf {P}(s, u_p)\) (for ) and of \(\rho (u_1), \ldots , \rho (u_p)\) (for ) are at most \(\varepsilon \)-apart in statistical distance.

Fig. 6.
figure 6

The NR construction instantiated with a permutation family \(\mathsf {P}\) on 2n-bits such that \(\mathsf {P}^{-1}\) is \((\varepsilon , k \cdot t)\)-wise independent where \(\mathsf {\pi }= \mathsf {P}(s^{\mathsf {in}}, \cdot )\) and \(\mathsf {\sigma }= \theta (\mathsf {P}(s^{\mathsf {out}}, \cdot ))\) for \(s^{\mathsf {in}}, s^{\mathsf {out}} \leftarrow \mathsf {P}.\mathsf {Kg}\). Here, \(\theta \) is a permutation on 2n-bits such that for all \(\mathbf {x}\in \{0,1\}^{2n}\) we have \(\theta ({\mathbf {x}} = x_0||x_1) = x_1||x_0\). For some k-ary strongly-evasive relation R, we construct a 2-Feistel evasive relation \(R_{\mathsf {\pi }, \mathsf {\sigma }}\) by transforming every \((\mathbf {u}, \mathbf {v}) \in S\) where \(S \in R\), by applying \(\mathsf {\pi }\) to \(\mathbf {u}\) and \(\mathsf {\sigma }\) to \(\mathbf {v}\).

From Strongly Evasive Relation to 2-Feistel-Evasive Relation. Let R be a k-ary relation over \(\{0,1\}^{2n} \times \{0,1\}^{2n}\). For , let \(\mathsf {\pi }= \mathsf{P}(s^{\mathsf {in}}, \cdot )\) and \(\mathsf {\sigma }= \theta (\mathsf{P}(s^{\mathsf {out}}, \cdot ))\), where \(\theta \) is a permutation on 2n-bits such that for \(\mathbf {x}\in \{0,1\}^{2n}\) we have \(\theta ({\mathbf {x}}=x_0||x_1) = x_1||x_0\)Footnote 10. We define a relation \(R_{\mathsf {\pi }, \mathsf {\sigma }}\) which is a result of transforming \(\{(\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'}))\} \in R\) via \(\mathsf {\pi }\) and \(\mathsf {\sigma }\) in the following way,

$$\begin{aligned} R_{\mathsf {\pi }, \mathsf {\sigma }}= \{ \{(\mathsf {\pi }(\mathbf {u}_1), \mathsf {\sigma }(\mathbf {v}_1)), \ldots , (\mathsf {\pi }(\mathbf {u}_{k'}), \mathsf {\sigma }(\mathbf {v}_{k'})) \} \ | \ \{(\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'})\} \in R \}. \end{aligned}$$

Then, for every \(\delta \)-strongly evasive k-ary relation R we show that \(R_{\mathsf {\pi }, \mathsf {\sigma }}\in \mathsf {FEv}(k,\delta ')\) for some \(\delta '\) larger than \(\delta \), except with small probability, where the probability is taken over the random choice of \((\mathsf {\pi }, \mathsf {\sigma })\). This is more formally captured in the following:

Proposition 3

(CI Amplification). For \(\delta \in [0,1)\) and an integer \(k \ge 1\), let R be a k-ary \(\delta \)-strongly evasive relation over \(\{0,1\}^{2n}\). For an even integer \(t \ge 2\), let \(\mathsf {P}\) be a family of permutations such that \(\mathsf {P}^{-1}\) is \((\varepsilon ,k\cdot t)\)-wise independent. Then, for \(\delta ' \in [0,1]\) such that \(\delta ' > \delta \),

$$\begin{aligned} \underset{\mathsf {\pi }, \mathsf {\sigma }}{\mathsf {Pr}}[ R_{\mathsf {\pi }, \mathsf {\sigma }}\notin \mathsf {FEv}(k,\delta ') ] \le 12k^2 \left( \frac{1}{\delta '-\delta }\right) ^t 2^{(4k-1)n}\left[ C_t \cdot \left( \frac{4\delta ^* t}{2^n}\right) ^{t/2} + \varepsilon \cdot (1+\delta )^t\right] \!, \end{aligned}$$

where \(C_t = 2e^{1/6t}\ \sqrt{\pi t}\ {\left( \frac{5}{2e} \right) }^{t/2}\), \(\delta ^* = \max \left( \delta , \frac{t \cdot 2^k}{2^n}\right) \), \(\mathsf {\pi }= \mathsf {P}(s^{\mathsf {in}}, \cdot )\) and \(\mathsf {\sigma }= \theta (\mathsf {P}(s^{\mathsf {out}}, \cdot ))\) for .

Now, Proposition 3 can be combined with Proposition 2 to establish the correlation intractability of \(\mathsf {NR}\) against strongly evasive relations (Theorem 3).

Theorem 3

(CI of NR). For \(\delta \in [0,1)\) and an integer \(k \ge 1\), let R be a k-ary \(\delta \)-strongly evasive relation over \(\{0,1\}^{2n}\). Further, let \(\mathsf {P}\) be a family of permutations on 2n-bits such that \(\mathsf {P}^{-1}\) is \((\varepsilon , 10k^2)\)-wise independent where \(\varepsilon \le 1/2^{5kn}\). Then for any (potentially unbounded) adversary A making q queries,

$$\begin{aligned} \mathsf {Adv}^{\mathsf {cri}}_{R,\mathsf {NR}^f[\mathsf {P}]}(A) \le \frac{24k^2 \cdot (40k\delta ^*)^{5k} + 12k^2 \cdot (1+\delta )^{10k}}{2^{4kn/9}} + 2k {\delta '} \cdot {q}^{2k+1}, \end{aligned}$$
(13)

where \(\delta ' = \delta + 2^{-n/18}\) and \(\delta ^* = \max \left( \delta , \frac{10k\cdot 2^k}{2^{n}}\right) \).

The proof of Theorem 3 can be found in the full version [38, App. B.3]. The asymptotic interpretation of Eq. (13) is that when \(n = \omega (\log \lambda )\), \(k = O(1)\), and , \(\mathsf {NR}^f[\mathsf {P}]\) is correlation intractable for k-ary strongly evasive relations. Combining this with Lemma 2, the CI then extends to any k-ary evasive relation. We also remark that Theorem 3 extends to the setting of multi-key correlation intractability introduced in [16], but to avoid notational overhead we limit ourselves to the single-key setting for this version.

On instantiating \(\mathsf {P}^{-1}\) from Theorem 3. We detail the construction of \((\varepsilon , p)\)-wise permutations in the full version [38, App. B.5] and show that an O(k)-round Feistel construction with \(10k^2\)-wise independent round functions can instantiate the permutation family \(\mathsf {P}^{-1}\). We refer the reader to the full version for more details.

4.4 Proof of Proposition 3

We will show that for every \(0 \le j < {k'}\le k\) the following hold,

  1. 1.

    For all distinct \(\mathbf {x}_1, \ldots , \mathbf {x}_{k'}\), distinct \(\mathbf {y}_1, \ldots , \mathbf {y}_j\) and \(y^* \in \{0,1\}^{n}\) such that \(\mathbf {x}_{j+1}[1], \ldots , \mathbf {x}_{k'}[1]\) are distinct and \(y^* \notin \{\mathbf {y}_1[0], \ldots , \mathbf {y}_j[0]\}\),

    $$\begin{aligned} \mathsf {Pr}[\text {Eq.\,(11) does not hold for } R_{\mathsf {\pi }, \mathsf {\sigma }}] \le e_1(k',j) + \varepsilon \cdot e_2(k',j). \end{aligned}$$
    (14)
  2. 2.

    For all distinct \(\mathbf {y}_1, \ldots , \mathbf {y}_{k'}\), all \(\mathbf {x}_1, \ldots , \mathbf {x}_j\) and all \(x^* \in \{0,1\}^{n}\) such that \(\mathbf {y}_{j+1}[0], \ldots , \mathbf {y}_{k'}[0]\) are distinct and \(x^* \notin \{\mathbf {x}_1[1], \ldots , \mathbf {x}_j[1]\}\),

    $$\begin{aligned} \mathsf {Pr}[\text {Eq.\,(12) does not hold for }R_{\mathsf {\pi }, \mathsf {\sigma }}] \le e_1(k',j) + \varepsilon \cdot e_2(k',j), \end{aligned}$$
    (15)

where the probability is taken over the random choice of \((\mathsf {\pi }, \mathsf {\sigma })\) and

$$\begin{aligned} \begin{aligned} e_1({k'},j)&= 3 \cdot C_t \left( \frac{1}{\delta ' - \delta } \right) ^t \left( \frac{2\delta ^* t}{2^n}\right) ^{t/2} 2^{({k'}- j)(t/2 + 1)}, \\ e_2({k'},j)&= 2 \left( \frac{1 + \delta }{\delta ' - \delta }\right) ^t 2^{{k'}- j}. \end{aligned} \end{aligned}$$

Given that the above hold, we then take appropriate union bounds (for Eq. (14)) over all \(\mathbf {y}_1, \ldots , \mathbf {y}_{k'}\), \(\mathbf {x}_1, \ldots , \mathbf {x}_j\) and \(x^*\) and then over all \(j,{k'}\). Symmetrically, we take union bounds (for Eq. (15)). Then the following holds and this concludes the proof of Theorem 3.

$$\begin{aligned} \begin{aligned}&\mathsf {Pr}[R_{\mathsf {\pi }, \mathsf {\sigma }}\notin \mathsf {FEv}(k,\delta ')]\\&\quad \le 2\sum \limits _{{k'}=1}^k\ \sum \limits _{j=0}^{{k'}- 1} 2^{n(2{k'}+2j+1)} \cdot (e_1(j,{k'}) + \varepsilon \cdot e_2(j,{k'})) \\&\quad \le 12k^2 \left( \frac{1}{\delta '-\delta }\right) ^t 2^{(4k-1)n}\left[ \ C_t \cdot \left( \frac{4\delta ^* t}{2^n}\right) ^{t/2} + \varepsilon \cdot (1+\delta )^t\ \right] . \end{aligned} \end{aligned}$$

From now on we focus on showing Eq. (15) and the analysis for Eq. (14) is symmetrical.

Establishing Equation (15). Let us fix some arbitrary \({k'}\) and j such that \(0 \le j < {k'}\le k\). Let us also fix some distinct \(\mathbf {y}_1, \ldots , \mathbf {y}_{k'} \in \{0,1\}^{2n}\), distinct \(\mathbf {x}_1, \ldots , \mathbf {x}_j\) and \(x^* \in \{0,1\}^{n}\) such that \(\mathbf {y}_{j+1}[0], \ldots , \mathbf {y}_{k'}[0]\) are distinct and \(x^* \notin \{\mathbf {x}_1[0], \ldots , \mathbf {x}_j[0]\}\). Then, we are interested in counting the number of tuples \(((\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'}))\) in R that on applying \(\mathsf {\pi }\) and \(\mathsf {\sigma }\) transform to \(((\mathbf {x}_1, \mathbf {y}_1), \ldots , (\mathbf {x}_j, \mathbf {y}_j), (\cdot \,\Vert \,x^*,\mathbf {y}_{j+1}), \ldots , (\cdot \,\Vert \,x^*,\mathbf {y}_{{k'}}))\). Let us fix \(\mathsf {\sigma }\) and this defines \(\mathbf {v}_i = \mathsf {\sigma }^{-1}(\mathbf {y}_i)\) for every \(i \in [{k'}]\) allowing us to focus only on the following set \(\mathcal{U}\) of tuples.

$$\begin{aligned} \mathcal{U} = \{ (\mathbf {u}_1, \ldots , \mathbf {u}_{k'}) \ | \ \{(\mathbf {u}_1, \mathbf {v}_1), \ldots , (\mathbf {u}_{k'}, \mathbf {v}_{k'})\} \in R\}. \end{aligned}$$

Then, we are interested in counting the number of tuples \(\mathbf{U} = (\mathbf {u}_1, \ldots , \mathbf {u}_{k'})\) in \(\mathcal U\) that satisfy,

  1. 1.

    \(\mathsf {\pi }(\mathbf {u}_1) = \mathbf {x}_1,\ \mathsf {\pi }(\mathbf {u}_2) = \mathbf {x}_2,\ \ldots ,\ \mathsf {\pi }(\mathbf {u}_j) = \mathbf {x}_j\).

  2. 2.

    \(\mathsf {\pi }_1(\mathbf {u}_{j+1}) = \mathsf {\pi }_1(\mathbf {u}_{j+2}) \ldots = \mathsf {\pi }_1(\mathbf {u}_{{k'}}) = x^*\).

  3. 3.

    For every \(i \in \{j+1, \ldots , k'\}\), \(\mathsf {\pi }_0(\mathbf {u}_i) \oplus \mathsf {\pi }_0(\mathbf {u}_{j+1}) = \varDelta _i\), where \(\varDelta _i = \mathbf {y}_{j+1}[0] \oplus \mathbf {y}_{i}[0]\)Footnote 11,

where \(\mathsf {\pi }_0(\mathbf {u})\) and \(\mathsf {\pi }_1(\mathbf {u})\) denote the first n-bits and last n-bits of \(\mathsf {\pi }(\mathbf {u})\). Or equivalently, count the number of \(\mathbf U\)’s such that \(\mathsf {\pi }(\mathbf{U})\)Footnote 12 falls in \(\mathcal{X}\) where,

$$\begin{aligned} \mathcal{X} = \{ (\mathbf {x}_1, \ldots , \mathbf {x}_j, x \oplus \varDelta _{j+1} \,\Vert \,x^*, \ldots , x \oplus \varDelta _{k'}\,\Vert \,x^*)\ |\ x \in \{0,1\}^{n} \}. \end{aligned}$$

Note that every element \(\mathbf{X}\) of \(\mathcal X\) is completely described by an n-bit string x. Now for \(\mathbf{U} = (\mathbf {u}_1, \ldots , \mathbf {u}_{k'}) \in \mathcal{U}\), let \({I}_\mathbf{U}\) be an indicator random variable,

$$\begin{aligned} {I}_\mathbf{U} = {\left\{ \begin{array}{ll} 1 &{} \text { if }(\mathsf {\pi }(\mathbf {u}_1), \ldots , \mathsf {\pi }(\mathbf {u}_{k'})) \in \mathcal{X}, \\ 0 &{} \text { otherwise. } \\ \end{array}\right. } \end{aligned}$$

Then it suffices to prove that,

$$\begin{aligned} \underset{\mathsf {\pi }}{\mathsf {Pr}}\left[ \sum \limits _{\mathbf{U} \in \mathcal U }{I}_\mathbf{U} > \delta ' \cdot 2^n \right] \le e_1({k'},j) + \varepsilon \cdot e_2({k'},j). \end{aligned}$$

Instead of looking at the sum \(\sum {I}_\mathbf{U}\), we look at an equivalent sum \(\sum {I}_{x}\) of, albeit, different indicator random variables \({I}_{x}\)’s, which will be convenient to analyse. For \(x \in \{0,1\}^{n}\) we define an indicator random variable \({I}_{x}\) which is 1 if \(\mathsf {\pi }^{-1}\) transforms \(\mathbf{X} \in \mathcal{X}\) (that corresponds to x) into some \(\mathbf{U} \in \mathcal{U}\). More formally,

$$\begin{aligned} {I}_{x} = {\left\{ \begin{array}{ll} 1 &{} \text { if }(\mathsf {\pi }^{-1}(\mathbf {x}_1), \ldots , \mathsf {\pi }^{-1}(x \oplus \varDelta _{j+1} ||x^*), \ldots , \mathsf {\pi }^{-1}(x \oplus \varDelta _{k'}|| x^*)) \in \mathcal{U},\\ 0 &{} \text { otherwise. } \\ \end{array}\right. } \end{aligned}$$
(16)

Then, it is easy to see that counting the number of \(\mathbf{U} \in \mathcal{U}\) such that \({I}_\mathbf{U} = 1\) (or \(\mathsf {\pi }(\mathbf U) \in \mathcal{X}\)) is the same as counting the number of \(x \in \{0,1\}^{n}\) such that \({I}_{x} = 1\) (or \(\mathsf {\pi }^{-1}(\mathbf{X}) \in \mathcal{U}\)). Therefore, \(\sum \limits _{\mathbf{U} \in \mathcal{U}} {I}_\mathbf{U} = \sum \limits _{x \in \{0,1\}^{n}} {I}_{x}\) and we aim to show that,

$$\begin{aligned} \underset{\mathsf {\pi }}{\mathsf {Pr}}\left[ \sum \limits _{x \in \{0,1\}^{n}} {I}_{x} > \delta ' \cdot 2^n \right] \le e_1({k'},j) + \varepsilon \cdot e_2({k'},j). \end{aligned}$$
(17)

Partitioning \(\{0,1\}^{n}\). We would like to use concentration bounds for the sum of random variables \({I}_{x}\)’s. But note that they are not independent as they may depend on the output of \(\mathsf {\pi }^{-1}\) on the same input. Therefore, as a first step towards constructing independent random variables, we partition \(\{0,1\}^{n}\) into subsets which will allow us to break the sum \(\sum {I}_{x}\) into sums over these subsets.

Let us consider the following relation on \(\{0,1\}^{n} \times \{0,1\}^{n}\). For any \(x, x' \in \{0,1\}^{n}\), we say that x is related to \(x'\) (denoted as \(x \sim x'\)) if there exists an index set \(\mathcal{B} \subseteq \{j+1, \ldots , {k'}\}\) where such that,

$$\begin{aligned} x = x' \oplus \ \underset{i \in \mathcal{B}}{\bigoplus } \ {\varDelta _i}. \end{aligned}$$

It is easy to see that the relation \(\sim \) is an equivalence relation. Then, for any \(x \in \{0,1\}^{n}\), let \(\mathsf{EQ}_x\) denote its equivalence class, that is, \(\mathsf{EQ}_x = \{ x' \in \{0,1\}^{n} \ | x \sim x' \}\). Let \(|\mathsf{EQ}_x| = l\) and it is easy to see that \(l \le 2^{{k'}- j}\). Let \(\{\mathsf{EQ}_i\}_{i = 1}^{M}\) be the M equivalence classes of \(\sim \) where \(| \mathsf{EQ}_i | = l\) and \(M \cdot l = 2^n\). Furthermore, let \(\mathsf{EQ}_i = \{x^i_1, x^i_2, \ldots , x^i_{l}\}\) be an enumeration of \(\mathsf{EQ}_i\) where \(x^i_q\) is the qth member of the ith equivalence class \(\mathsf{EQ}_i\). Then, we can break the sum of \({I}_{x}\)’s into,

$$\begin{aligned} \sum \limits _{x \in \{0,1\}^{n}} {I}_{x} = \sum \limits _{i=1}^{M} \sum \limits _{q=1}^{l} {I}_{x^i_q} = \sum \limits _{q=1}^{l} \sum \limits _{i=1}^{M} {I}_{x^i_q}. \end{aligned}$$

For \(q \in [l]\), let \(X_q = \sum \limits _{i=1}^{M} {I}_{x^i_q}\). In other words, \(X_q\) is the sum of qth member of each equivalence class \(\mathsf{EQ}_i\). We are going to show that for every \(q \in [l]\),

$$\begin{aligned} \begin{aligned} \mathsf {Pr}[X_q > \delta '\cdot M]&\le 3 C_t \cdot \frac{1}{(\delta ' - \delta )^t}{\left( \frac{2t\delta ^*}{2^n} \right) }^{t/2} l^{t/2} + 2 \varepsilon \cdot \left( \frac{1 + \delta }{\delta '-\delta }\right) ^t \end{aligned} \end{aligned}$$
(18)

Taking union bound over all \(q \in [l]\) and using \(l \le 2^{{k'}- j}\), we have that Eq. (17) holds.

Bounding the subsum \(X_q\). From now on, we will focus on analysing one of the subsums \(X_q\) and the other subsums can be analogously handled. Fix some q and let \(X = X_q\). Let the corresponding set of x’s be \(\{x^1, \ldots , x^M \}\) where each \(x^i\) comes from a different equivalence class \(\mathsf{EQ}_i\). For every \(i_1 \ne i_2 \in [M]\),

  • Firstly, \(\varDelta _{j+1}, \ldots , \varDelta _{{k'}}\) are distinct as \(\mathbf {y}_{j+1}[0], \ldots , \mathbf {y}_{k'}[0]\) are distinct. Therefore, \(x^{i_1} \oplus \varDelta _{j+1}, \ldots , x^{i_1} \oplus \varDelta _{k'}\) are distinct.

  • Secondly for any index set \(\mathcal{B} \subseteq \{j+1, \ldots , {k'}\}\),

    $$\begin{aligned} x^{i_1} \ne \ x^{i_2} \oplus \ \underset{i \in \mathcal{B}}{\bigoplus }\ {\varDelta _i}. \end{aligned}$$
    (19)

    This implies that for any \({I}_{x^{i_1}}\) and \({I}_{x^{i_2}}\), \(\{x^{i_1}_{j+1} \oplus \varDelta _{j+1}, \ldots , x^{i_1}_{{k'}} \oplus \varDelta _{{k'}}\}\) and \(\{x^{i_2}_{j+1} \oplus \varDelta _{j+1}, \ldots , x^{i_2}_{{k'}} \oplus \varDelta _{{k'}}\}\) are disjoint. Therefore, except the first j (values that correspond to \(\mathsf {\pi }^{-1}(\mathbf {x}_i)\) for \(i \in [j]\)), the remaining set of values in the output of \(\mathsf {\pi }^{-1}\) that each \({I}_{x^{i_1}}\) and \({I}_{x^{i_2}}\) depend on are disjoint.

We will crucially exploit these two properties of \({I}_{x^i}\)’s to show that the following:

Lemma 3

For X (as defined above), there exists a random variable Z with expectation \(\mu = {\mathbb {E}\left[ Z \right] } \le \delta \cdot M\) where Z is a sum of M independent indicator random variables, such that for any integer \(a > 0\),

$$\begin{aligned} \mathsf {Pr}[|X - \mu | > a] \le \frac{3 \cdot {\mathbb {E}\left[ (Z - \mu )^t \right] }}{a^t} + 2\varepsilon \cdot \frac{\left( M + \mu \right) ^t}{a^t}. \end{aligned}$$

For each indicator random variable \({I}_{x^i}\), we first define another indicator random variable \({I}_{x^i}^\rho \) where the only difference is that we replace the \(k \cdot t\)- wise independent permutation \(\mathsf {\pi }^{-1}\) with a random permutation \(\rho \). Note that the resulting \({I}_{x^i}^\rho \) are still not independent as they depend on the output of \(\rho \). So, we then define a sequence of random variable \({I}_{x^i}^*\) that have the same marginal distribution as that of \({I}_{x^i}^\rho \) but are independent. Then, we show a domination argument that relates the t-th moment of \((Y - \mu )\) with the t-th moment of \((Z - \mu )\) where Y and Z are the sum of \({I}_{x^i}^\rho \) and \({I}_{x^i}^*\) respectively. The proof of Lemma 3 can be found in the full version [38, App. B.4]. Next, we apply the following concentration bound due to [10] to the random variable Z.

Lemma 4

(A.4. from [10]). Let \(t \ge 2\) be an even integer. Suppose \(Z_1, \ldots , Z_n\) are independent random variables taking values in [0, 1]. Let \(Z = Z_1 + \ldots + Z_n\) and \(\mu = {\mathbb {E}\left[ Z \right] }\). Then,

$$\begin{aligned} {\mathbb {E}\left[ (Z - \mu )^t \right] } \le C_t \cdot (t\mu + t^2)^{t/2}. \end{aligned}$$

Then as \(\mu \le \delta \cdot M\) we have,

$$\begin{aligned} \mathsf {Pr}[X> \delta \cdot M + a] \le \mathsf {Pr}[X > \mu + a] \le 3 C_t \cdot \left( \frac{t \mu + t^2}{a^2}\right) ^{t/2} + 2\varepsilon \cdot \left( \frac{M + \mu }{a} \right) ^t, \end{aligned}$$

Now let \(a = (\delta ' - \delta ) \cdot M\) and using \(M \cdot l = 2^n\) and \(\delta ^* = \max (\delta , {t} \cdot 2^k/2^n)\), we have

$$\begin{aligned} \begin{aligned} \mathsf {Pr}[X > \delta '\cdot M]&\le 3 C_t \cdot \frac{1}{(\delta ' - \delta )^t}{\left( \frac{2t\delta ^*}{2^n} \right) }^{t/2} l^{t/2} + 2 \varepsilon \cdot \left( \frac{1 + \delta }{\delta '-\delta }\right) ^t \end{aligned} \end{aligned}$$

which establishes that Eq. (18) holds (which establishes that Eq. (15) holds) and thereby concludes the proof of Proposition 3.    \(\square \)