1 Introduction

Computational hardness assumptions are the foundation of modern cryptography. The approach of building cryptographic systems whose security follows from well-defined computational assumptions has enabled us to obtain fantastical primitives and functionality, pushing far beyond the limitations of information theoretic security. But, in turn, the resulting systems are only as secure as the computational assumptions lying beneath them. As cryptographic constructions increasingly evolve toward usable systems, gaining a deeper understanding of the true hardness of these problems—and the relationship between assumptions—is an important task.

To date, a relatively select cluster of structured problems have withstood the test of time (and intense scrutiny), to the point that assuming their hardness is now broadly accepted as “standard.” These problems include flavors of factoring [RSA78, Rab79] and computing discrete logarithms [DH76], as well as certain computational tasks in high-dimensional lattices and learning theory [GKL88, BFKL93, Ajt96, BKW00, Ale03, Reg05]. A central goal in the foundational study of cryptography is constructing cryptographic schemes whose security provably follows from these (or weaker) assumptions.

In some cases, however, it may be beneficial—even necessary—to introduce and study new assumptions (indeed, every assumption that is “standard” today was at some point freshly conceived). There are several important cryptographic primitives (notable examples include indistinguishability obfuscation (IO) [BGI+01, GGH+13] and SNARKs [BCC+17]) that we do not currently know how to construct based on standard assumptions. Past experience has shown that achieving new functionalities from novel assumptions, especially falsifiable assumptions [Nao03, GW11, GK16], can be a stepping stone towards attaining the same functionality from standard assumptions. This was the case for fully homomorphic encryption [RAD78, Gen09, BV11], as well as many recent primitives that were first built from IO and later (following a long line of works) based on more conservative assumptions (notably, non-interactive zero-knowledge protocols for NP based on LWE [KRR17, CCRR18, HL18, CCH+19, PS19], and the cryptographic hardness of finding a Nash equilibrium based on the security of the Fiat-Shamir heuristic [BPR15, HY17, CHK+19]). Finally, cryptographic primitives that can be based on diverse assumptions are less likely to “go extinct” in the event of a devastating new algorithmic discovery.

Of course, new assumptions should be introduced with care. We should strive to extract some intuitive reasoning justifying them, and some evidence for their hardness. A natural approach is to analyze the connection between the new assumption and known (standard) assumptions, with the ultimate goal of showing that the new assumption is, in fact, implied by a standard assumption. However, coming up with such a reduction usually requires deep understanding of the new assumption, which can only be obtained through a systematic study of it.

DE-PIR and Permuted Polynomials. A recent example is the new computational assumption underlying the construction of Doubly Efficient Private Information Retrieval (DE-PIR) [BIPW17, CHR17], related to pseudorandomness of permuted low-degree curves.

Private Information Retrieval (PIR) [CGKS95, KO97] schemes are protocols that enable a client to access entries of a database stored on a remote server (or multiple servers), while hiding from the server(s) which items are retrieved. If no preprocessing of the database takes place, the security guarantee inherently requires the server-side computation to be linear in the size of the database for each incoming query [BIM00]. Database preprocessing was shown to yield computational savings in the multi-server setting [BIM00], but the goal of single-server PIR protocols with sublinear-time computation was a longstanding open question, with no negative results or (even heuristic) candidate solutions. Such a primitive is sometimes referred to as Doubly Efficient (DE) PIR.

Recently, two independent works [BIPW17, CHR17] provided the first candidate constructions of single-server DE-PIR schemes, based on a new conjecture regarding the hardness of distinguishing permuted local-decoding queries (for a Reed-Muller code [Ree54, Mul54] with suitable parameters) from a uniformly random set of points. Specifically, although given the queries \(\{z_1, \ldots , z_k\} \subseteq [N]\) of the local decoder it is possible to guess (with a non-trivial advantage) the index i which is being locally decoded, the conjectures of [BIPW17, CHR17] very roughly assert that adding a secret permutation can computationally hide i. More precisely, if an adversary instead sees (many) samples of sets of permuted queries \(\{\pi (z_1), \ldots , \pi (z_k)\}\), where \(\pi : [N] \rightarrow [N]\) is a secret fixed permutation (the same for all samples), then the adversary cannot distinguish these from independent uniformly random size-k subsets of [N].

This new assumption (which we will refer to as \(\mathsf{{PermRM}}\), see Conjecture 1 in Sect. 6.2) allowed for exciting progress forward in the DE-PIR domain. But what do we really know about its soundness? Although [BIPW17, CHR17] provide some discussion and cryptanalysis of the assumption, our understanding of it is still far from satisfactory.

Permuted Puzzles. The \(\mathsf{{PermRM}}\) assumption can be cast as a special case in a broader family of hardness assumptions: as observed in [BIPW17], it can be thought of as an instance where a secret random permutation seems to make an (easy) “distinguishing problem” hard, namely the permutation is the only source of computational hardness. It should be intuitively clear that such permutations may indeed create hardness. For example, while one can easily distinguish a picture of a cat from that of a dog, this task becomes much more challenging when the pixels are permuted. There are also other instances in which random secret permutations were used to introduce hardness (see Sect. 1.2 below). Therefore, using permutations as a source of cryptographic hardness seems to be a promising direction for research, and raises the following natural question:

Under which circumstances can a secret random permutation be a source of cryptographic hardness?

1.1 Our Results

We initiate a formal investigation of the cryptographic hardness of permuted puzzle problems. More concretely, our contributions can be summarized within the following three directions.

Rigorous Formalization. We formalize a notion of permuted puzzle distinguishing problems, which extends and generalizes the proposed framework of [BIPW17]. Roughly, a permuted puzzle distinguishing problem is associated with a pair of distributions \(\mathcal{D}_0,\mathcal{D}_1\) over strings in \(\varSigma ^n\), together with a random permutation \(\pi \) over [n]. The permuted puzzle consists of the distributions \(\mathcal{D}_{0,\pi },\mathcal{D}_{1,\pi }\) which are defined by sampling a string s according to \(\mathcal{D}_0,\mathcal{D}_1\) (respectively), and permuting the entries of s according to \(\pi \). A permuted puzzle is computationally hard if no efficient adversary can tell whether a challenge sample was drawn from \(\mathcal{D}_{0,\pi }\) or \(\mathcal{D}_{1,\pi }\), even given arbitrarily many samples of its choice from either of the distributions. We also briefly explore related hardness notions, showing that a weaker and simpler variant (which is similar to the one considered in [BIPW17]) is implied by our notion of hardness, and that in some useful cases the weaker hardness notion implies our hardness notion. Our motivation for studying the stronger (and perhaps less natural) hardness notion is that the weaker variant is insufficient for the DE-PIR application.

Identifying Hard Permuted Puzzles. We identify natural examples in which a one-time permutation provably introduces cryptographic hardness, based on standard assumptions. In these examples, the distributions \(\mathcal{D}_0,\mathcal{D}_1\) are efficiently distinguishable, but the permuted puzzle distinguishing problem is computationally hard. We provide such constructions in the random oracle model, and in the plain model under the Decisional Diffie-Hellman (DDH) assumption [DH76]. We additionally observe that the Learning Parity with Noise (LPN) assumption [BKW00, Ale03] itself can be cast as a permuted puzzle. This is described in the following theorem (see Propositions 1, 3, and 2 for the formal statements).

Informal Theorem 1

(Hard Permuted Puzzles). There exists a computationally-hard permuted puzzle distinguishing problem:

  • In the random oracle model.

  • If the DDH assumption holds.

  • If the LPN assumption holds.

Statistical Query Lower Bound for DE-PIR Toy Problem. We make progress towards better understanding the \(\mathsf{{PermRM}}\) assumption underlying the DE-PIR constructions of [BIPW17, CHR17]. Specifically, we show that a toy version of the problem, which was introduced in [BIPW17], provably withstands a rich class of learning algorithms known as Statistical Query (SQ) algorithms.

Roughly, the toy problem is to distinguish randomly permuted graphs of random univariate polynomials of relatively low degree from randomly permuted graphs of random functions. More formally, for a function \(f: X \rightarrow Y\), we define its 2-dimensional graph \(\mathsf {Graph}(f): X \times Y \rightarrow \{0,1\}\) where \(\mathsf {Graph}(f)(x, y) =1 \Leftrightarrow y = f(x)\). For a security parameter \(\lambda \) and a field \({\mathbb F}\), the distributions \(\mathcal{D}_0,\mathcal{D}_1\) in the toy problem are over \(\{0,1\}^n\) for \(n=\left| {{\mathbb F}}\right| ^2\), and output a sample \(\mathsf {Graph}(\gamma )\) where \(\gamma : {\mathbb F}\rightarrow {\mathbb F}\) is a uniformly random degree-\(\lambda \) polynomial in \(\mathcal{D}_0\), and a uniformly random function in \(\mathcal{D}_1\).
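To make the toy problem concrete, the following sketch samples from \(\mathcal{D}_0\) and \(\mathcal{D}_1\) and applies a fixed secret permutation, as in the permuted puzzle. It is illustrative only: the field size P and degree bound DEG are toy choices of ours, not parameters from [BIPW17, CHR17].

```python
import random

P = 101     # small prime standing in for |F| (toy choice)
DEG = 5     # stands in for the degree bound / security parameter (toy choice)

def graph(f):
    # Graph(f) flattened to a {0,1} string over F x F: entry (x,y) is 1 iff y = f(x)
    return [1 if f[x] == y else 0 for x in range(P) for y in range(P)]

def sample(b, pi):
    if b == 0:  # D_0: graph of a uniformly random degree-DEG polynomial
        coeffs = [random.randrange(P) for _ in range(DEG + 1)]
        f = [sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
             for x in range(P)]
    else:       # D_1: graph of a uniformly random function
        f = [random.randrange(P) for _ in range(P)]
    flat = graph(f)
    out = [0] * (P * P)
    for i, bit in enumerate(flat):
        out[pi[i]] = bit            # position i moves to position pi(i)
    return out

pi = list(range(P * P))
random.shuffle(pi)                  # one secret permutation, fixed for all samples
challenge = sample(random.randrange(2), pi)
```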

We analyze the security of the toy problem against SQ learning algorithms. Our motivation for focusing on learning algorithms in general is that permuted puzzles are a special example of a learning task. Indeed, the adversary’s goal is to classify a challenge sample, given many labeled samples. Thus, it is natural to explore approaches from learning theory as potential solvers for (equivalently, attacks on) the permuted puzzle. Roughly speaking, most known learning algorithms fall into two broad categories. The first category leverages linearity, identifying correlations with subspaces via algorithms based on Gaussian elimination. The second category, which is our focus in this work, is SQ algorithms. Informally, an SQ algorithm obtains no labeled samples. Instead, it can make statistical queries, each defined by a boolean-valued function f, and obtains the outcome of applying f to a fresh random sample. A statistical query algorithm is one that makes polynomially many such queries. We show that the toy problem is hard for SQ algorithms (see Theorem 8):

Informal Theorem 2

The BIPW toy problem is hard for statistical query algorithms.

We contrast this statistical-query lower bound with the bounded-query statistical indistinguishability lower bound of [CHR17]. That result showed that there is some fixed polynomial B such that no adversary can distinguish B DE-PIR queries from random, even if computationally unbounded. In contrast, our result proves a lower bound for adversaries (also computationally unbounded) that have no a priori polynomial bound on the number of queries that they can make—in fact, they can make up to \(2^{\epsilon \lambda }\) queries, where \(\lambda \) is the security parameter and \(\epsilon \) is a small positive constant. However, they are restricted in that they cannot see the result of any individual query in its entirety; instead, adversaries can only see the result of applying bounded (up to \(\epsilon \lambda \)-bit) output functions separately to each query.
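For concreteness, the SQ interface described above can be sketched as follows (we simplify the bounded-output query functions to boolean-valued ones; `sampler` is a hypothetical stand-in for the permuted toy-problem distribution):

```python
def make_sq_oracle(sampler, num_queries):
    # sampler() returns one fresh permuted sample; the algorithm never sees
    # samples directly, only the value of its query function f on a fresh sample.
    budget = [num_queries]
    def query(f):
        assert budget[0] > 0, "query budget exhausted"
        budget[0] -= 1
        return f(sampler())
    return query

# Hypothetical usage with the toy-problem sampler sketched earlier:
#   oracle = make_sq_oracle(lambda: sample(0, pi), num_queries=10**6)
#   bit = oracle(lambda s: s[0])   # "what is the first coordinate of a sample?"
```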

1.2 Other Instances of Hardness from Random Permutations

There are other instances in which random secret permutations were used to obtain computational hardness. The Permuted Kernel Problem (PKP) is an example in the context of a search problem. Roughly, the input in PKP consists of a matrix \(A\in {\mathbb {Z}}_p^{m\times n}\) and a vector \(\varvec{v}\in {\mathbb {Z}}_p^n\), where p is a large prime. A solution is a permutation \(\pi \) on [n] such that the vector \(\varvec{v}'\) obtained by applying \(\pi \) to the entries of \(\varvec{v}\) is in the kernel of A. PKP is known to be NP-complete in the worst case [GJ02], and conjectured to be hard on average [Sha89], for sufficiently large \((n-m)\) and p. It is the underlying assumption in Shamir’s identification scheme [Sha89], and has lately seen renewed interest due to its applicability to post-quantum cryptography (e.g., [LP12, FKM+18, KMP19]). Despite being studied for three decades, the best known algorithms to date run in exponential time; see [KMP19] and the references therein.
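As a small illustration of the problem statement, the following sketch checks a candidate PKP solution; the instance sizes are toy choices of ours, whereas real instances use a large prime p.

```python
import random

p, m, n = 251, 3, 8   # toy parameters (assumption; real instances are far larger)
A = [[random.randrange(p) for _ in range(n)] for _ in range(m)]
v = [random.randrange(p) for _ in range(n)]

def is_pkp_solution(A, v, pi):
    # Permute v (entry i moves to position pi[i]) and test A v' = 0 (mod p).
    vp = [0] * n
    for i in range(n):
        vp[pi[i]] = v[i]
    return all(sum(aij * vj for aij, vj in zip(row, vp)) % p == 0 for row in A)
```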

1.3 Techniques

We now proceed to discuss our results and techniques in greater detail.

Defining Permuted Puzzles. We generalize and extend the intuitive puzzle framework proposed in [BIPW17], by formally defining the notions of (permuted) puzzle distinguishing problems.

We formalize a puzzle distinguishing problem as a pair of distributions \(\mathcal{D}_0,\mathcal{D}_1\) over \(\varSigma ^n\), for some alphabet \(\varSigma \) and some input length n. Very roughly, hardness of a puzzle distinguishing problem means one cannot distinguish a single sample from \(\mathcal{D}_0\) or \(\mathcal{D}_1\), even given oracle access to \(\mathcal{D}_0\) and \(\mathcal{D}_1\). We say that a puzzle problem is \((s,\epsilon )\)-hard if every size-s adversary distinguishes \(\mathcal{D}_0\) from \(\mathcal{D}_1\) with advantage at most \(\epsilon \). This concrete hardness notion naturally extends to computational hardness of an ensemble of puzzles, in which case we allow the distributions to be keyed (by both public and secret key information) and require that they be efficiently sampleable given the key.

With this notion of puzzle distinguishing problems, we turn to defining a permuted puzzle which, informally, is obtained by sampling a random permutation \(\pi \) once and for all as part of the secret key, and permuting all samples according to \(\pi \). Hardness of a permuted puzzle is defined identically to hardness of (standard) puzzle distinguishing problems.

We also consider a simpler hardness definition, in which the adversary is given oracle access only to a randomly selected \(\mathcal{D}_b\) (but not to \(\mathcal{D}_{1-b}\)), and attempts to guess b. We say that a puzzle distinguishing problem is weak computationally hard if every adversary of polynomial size obtains a negligible advantage in this modified distinguishing game. Weak computational hardness captures the security notion considered in [BIPW17], but is too weak for certain applications, as it allows for trivial permuted puzzles, e.g., \(\mathcal{D}_0=\left\{ 0^{n/2}1^{n/2}\right\} ,\mathcal{D}_1=\left\{ 1^{n/2}0^{n/2}\right\} \). More generally, and as discussed in Remark 3 (Sect. 3), weak computational hardness is generally weaker than the definition discussed above (which is more in line with the DE-PIR application). Concretely, we show that the definition discussed above implies the weaker definition, and that in certain cases (e.g., when \(\mathcal{D}_1\) is the uniform distribution), the weaker definition implies the stronger one. This last observation will be particularly useful in proving security of our permuted puzzle constructions.

Hard Permuted Puzzle in the Random Oracle (RO) Model. Our first permuted puzzle is in the random oracle model. Recall that a permuted puzzle is defined as the permuted version of a puzzle distinguishing problem. For our RO-based permuted puzzle, the underlying puzzle distinguishing problem is defined as follows. There is no key, but both the sampling algorithm and the adversary have access to the random oracle H. The sampling algorithm samples a uniformly random input \(x_0\) for H, and uniformly random seeds \(s_1,\ldots ,s_n\), where \(n=\lambda \), and computes \(x_1,\ldots ,x_n\) sequentially as follows. For every \(1\le i\le n\), \(x_i{\mathop {=}\limits ^{\mathsf {def}}}H\left( s_i,x_{i-1}\right) \). The sample is then \(\left( x_0,x_n',s_1,\ldots ,s_n\right) \) where \(x_n'{\mathop {=}\limits ^{\mathsf {def}}}x_n\) in \(\mathcal{D}_0\), and \(x_n'\) is uniformly random in \(\mathcal{D}_1\). Notice that in this (unpermuted) puzzle distinguishing problem one can easily distinguish samples from \(\mathcal{D}_0\) and \(\mathcal{D}_1\), by sequentially applying the oracle to \(x_0\) and the seeds, and checking whether the output is \(x_n'\). This will hold with probability 1 for samples from \(\mathcal{D}_0\), and only with negligible probability for samples from \(\mathcal{D}_1\) (assuming H has sufficiently long outputs). The corresponding permuted puzzle is obtained by applying a fixed random permutation \(\pi ^*\) to the seeds \(\left( s_1,\ldots ,s_n\right) \).
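The following sketch illustrates the sampler and the easy attack on the unpermuted puzzle. A lazily sampled table stands in for the random oracle, and the output length and security parameter are toy choices of ours.

```python
import random

LAM = 8            # toy security parameter; n = LAM seeds per sample
_table = {}
def H(seed, x):    # lazily sampled stand-in for the random oracle
    if (seed, x) not in _table:
        _table[(seed, x)] = random.getrandbits(64)
    return _table[(seed, x)]

def sample(b):
    x0 = random.getrandbits(64)
    seeds = [random.getrandbits(64) for _ in range(LAM)]
    x = x0
    for s in seeds:
        x = H(s, x)                  # x_i = H(s_i, x_{i-1})
    xn = x if b == 0 else random.getrandbits(64)
    return x0, xn, seeds             # the permuted puzzle would shuffle seeds

def distinguish_unpermuted(x0, xn, seeds):
    # Recompute the chain in the given order: correct with probability 1 on D_0
    # and only negligibly on D_1. With permuted seeds there are n! candidate orders.
    x = x0
    for s in seeds:
        x = H(s, x)
    return 0 if x == xn else 1
```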

Hardness of the Permuted Puzzle. We focus on a simpler case in which the adversary receives only the challenge sample (and does not request any additional samples from its challenger). This allows us to present the main ideas of the analysis; as we show in Sect. 4, the argument easily extends to the general case.

At a very high level, we show that the hardness of the permuted puzzle stems from the fact that to successfully guess b, the adversary has to guess the underlying random permutation \(\pi ^*\), even though it has oracle access to H.

We first introduce some terminology. For a random oracle H, input \(x_0\) and seeds \(s_1',\ldots ,s_n'\), each permutation \(\pi \) over the seeds uniquely defines a corresponding “output” \(x_n^{\pi }\) through a length-\((n+1)\) “path” \(\mathsf{{P}}_{\pi }\) defined as follows. Let \(x_0^{\pi }{\mathop {=}\limits ^{\mathsf {def}}}x_0\), and for every \(1\le i\le n\), let \(s_i''{\mathop {=}\limits ^{\mathsf {def}}}s_{\pi ^{-1}(i)}'\) and \(x_i^{\pi }{\mathop {=}\limits ^{\mathsf {def}}}H\left( s_i'',x_{i-1}^{\pi }\right) \). Then the label of the i’th node on the path \(\mathsf{{P}}_{\pi }\) is \(x_i^{\pi }\). We say that a node v with label x on some path \(\mathsf{{P}}_{\pi }\) is reachable if x was the oracle answer to one of the adversary’s queries in the distinguishing game. We note that when \(s_i'=s_{\pi ^*(i)}\), i.e., the seeds are permuted with the permutation used in the permuted puzzle, then \(x_i^{\pi ^*}=x_i\) for every \(1\le i\le n\). We call \(\mathsf{{P}}_{\pi ^*}\) the special path.

We will show that, with overwhelming probability, unless the adversary queries H on all the \(x_i\)’s on the special path (i.e., on \(x_0^{\pi ^*},x_1^{\pi ^*},\ldots ,x_n^{\pi ^*}=x_n\)), it obtains only a negligible advantage in guessing b. Hardness of the permuted puzzle then follows because there are n! possible paths, and the adversary has a negligible chance of guessing the special path (because \(\pi ^*\) is a secret random permutation).

We would first like to prove that all node labels, over all paths \(\mathsf{{P}}_{\pi }\), are unique. This, however, is clearly false, because the paths are not disjoint: for example, the label of node 0 in all of them is \(x_0\). More generally, if \(\pi \ne \pi '\) have the same length-k prefix for some \(0\le k<\lambda \), then for every \(0\le i\le k\), the i’th nodes on \(\mathsf{{P}}_{\pi },\mathsf{{P}}_{\pi '}\) have the same label. In this case, we say that the i’th nodes correspond to the same node. Let \(\mathsf{{Unique}}\) denote the event that across all paths there do not exist two nodes that (1) do not correspond to the same node, but (2) have the same label. Our first observation is that \(\mathsf{{Unique}}\) happens with overwhelming probability. Indeed, this holds when H’s output is sufficiently large (e.g., of the order of \(3\lambda \cdot \log \lambda \)), because there are only \(\lambda \cdot \lambda !\) different nodes (so the number of pairs is roughly of the order of \(2^{2\lambda \cdot \log \lambda }\)).

Let \(\mathcal{E}\) denote the event that the adversary queries H on the label of an unreachable node, and let \(\mathsf{{ReachQ}}=\bar{\mathcal{E}}\) denote its complement. Our next observation is that conditioned on \(\mathsf{{Unique}}\), \(\mathsf{{ReachQ}}\) happens with overwhelming probability. Indeed, conditioned on \(\mathsf{{Unique}}\), the label of an unreachable node is uniformly random, even given the entire adversarial view (including previous oracle answers). Thus, querying H on an unreachable node corresponds to guessing the random node label. When H’s output length is sufficiently large (on the order of \(3\lambda \cdot \log \lambda \) as discussed above) this happens only with negligible probability.

Consequently, it suffices to analyze the adversarial advantage in the distinguishing game conditioned on \(\mathsf{{Unique}}\wedge \mathsf{{ReachQ}}\). Notice that in this case, the only potential difference between the adversarial views when \(b=0\) and when \(b=1\) is in the label of the endpoint \(v_{\mathsf{{end}}}\) of the special path \(\mathsf{{P}}_{\pi ^*}\), which is \(x_n\) when \(b=0\), and independent of \(x_n\) when \(b=1\). Indeed, conditioned on \(\mathsf{{Unique}}\), the label of \(v_{\mathsf{{end}}}\) appears nowhere else (i.e., is not the label of any other node on any path). Therefore, conditioned on \(\mathsf{{ReachQ}}\wedge \mathsf{{Unique}}\), the label of \(v_{\mathsf{{end}}}\) appears as one of the oracle answers only if \(v_{\mathsf{{end}}}\) is reachable, i.e., only if the adversary queried H on all the node labels on the special path.

Hard Permuted Puzzles in the Plain Model. Our second permuted puzzle is based on the Decisional Diffie-Hellman (DDH) assumption. The underlying puzzle distinguishing problem is defined over a multiplicative cyclic group G of prime order p with generator g. The public key consists of \(\left( G,g\right) \) and a uniformly random vector \(\varvec{u}\leftarrow \left( {\mathbb {Z}}_p^*\right) ^n\). A sample from \(\mathcal{D}_0,\mathcal{D}_1\) is of the form \(\left( g^{x_1},\ldots ,g^{x_n}\right) \), where in \(\mathcal{D}_0\) \(\left( x_1,\ldots ,x_n\right) \) is chosen as a uniformly random vector that is orthogonal to \(\varvec{u}\), whereas in \(\mathcal{D}_1\) \(\left( x_1,\ldots ,x_n\right) \) is uniformly random. As discussed below, in this (unpermuted) puzzle distinguishing problem one can easily distinguish samples from \(\mathcal{D}_0\) and \(\mathcal{D}_1\). The corresponding permuted puzzle is obtained by applying a fixed random permutation to the samples \(\left( g^{x_1},\ldots ,g^{x_n}\right) \).

Why are Both DDH and a Permutation Needed? The computational hardness of the permuted puzzles stems from the combination of the DDH assumption and the permutation, as we now explain. To see why the DDH assumption is needed, notice that in \(\mathcal{D}_0\), all sampled \(\left( x_1,\ldots ,x_n\right) \) belong to an \((n-1)\)-dimensional subspace of \({\mathbb {Z}}_p^n\), whereas in \(\mathcal{D}_1\) this happens only with negligible probability, because each sample is uniformly and independently sampled. Consider a simpler version in which \(\mathcal{D}_0,\mathcal{D}_1\) simply output the vector \(\left( x_1,\ldots ,x_n\right) \). In this case, one can obtain an overwhelming distinguishing advantage by (efficiently) checking whether all samples \(\left( x_1,\ldots ,x_n\right) \) lie within an \((n-1)\)-dimensional subspace, and if so guess that the underlying distribution is \(\mathcal{D}_0\). This “attack” can be executed even if the samples are permuted (as is the case in a permuted puzzle), because applying a permutation to the \(\left( x_1,\ldots ,x_n\right) \) is a linear operation, and therefore preserves the dimension of the subspace. Therefore, a permutation on its own is insufficient to get computational hardness, and we need to rely on the DDH assumption.

To see why the permutation is needed, notice that even if the DDH assumption holds in G, given \(\left( g^{x_1},\ldots ,g^{x_n}\right) \) one can efficiently test whether the underlying exponents \(\left( x_1,\ldots ,x_n\right) \) are orthogonal to a known vector \(\varvec{u}\), by only computing exponentiations and multiplications in G. Notice that for a sufficiently large p, the exponents of a sample from \(\mathcal{D}_1\) will be orthogonal to \(\varvec{u}\) only with negligible probability, so this “attack” succeeds with overwhelming probability.
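For concreteness, here is a sketch of the sampler together with the orthogonality test just described, over an illustratively small Schnorr group (all parameters are toy choices of ours; g = 4 generates the order-11 subgroup of \({\mathbb {Z}}_{23}^*\)):

```python
import random

q, p, g, n = 23, 11, 4, 6                        # toy group parameters
u = [random.randrange(1, p) for _ in range(n)]   # public constraint vector

def sample(b):
    if b == 0:  # exponents uniform subject to x . u = 0 (mod p)
        x = [random.randrange(p) for _ in range(n - 1)]
        x.append((-sum(xi * ui for xi, ui in zip(x, u))
                  * pow(u[-1], -1, p)) % p)       # solve for the last coordinate
    else:       # exponents uniform in Z_p^n
        x = [random.randrange(p) for _ in range(n)]
    return [pow(g, xi, q) for xi in x]            # (g^{x_1}, ..., g^{x_n})

def orthogonality_test(vec):
    # prod_i (g^{x_i})^{u_i} = g^{x . u}, which equals 1 iff x . u = 0 (mod p)
    acc = 1
    for gxi, ui in zip(vec, u):
        acc = acc * pow(gxi, ui, q) % q
    return 0 if acc == 1 else 1
```

Permuting the sample shuffles the coordinates of \(\left( g^{x_1},\ldots ,g^{x_n}\right) \), so the test above no longer pairs each \(g^{x_i}\) with the correct \(u_i\).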

Hardness of the Permuted Puzzle. We now show that the combination of the DDH assumption and permuted samples gives computational hardness. Notice that it suffices to prove that the permuted puzzle is weak computationally hard, because \(\mathcal{D}_1\) is uniform over \(G^n\) (see Lemma 2 in Sect. 3.3). In this case, the adversarial view \(\mathsf{{V}}_b,b\in \{0,1\}\) consists of the public key \(\left( G,g,\varvec{u}\right) \), and a polynomial number of permuted samples of the form \(\left( g^{x_1},\ldots ,g^{x_n}\right) \) which were all sampled according to \(\mathcal{D}_b\) and permuted using the same random permutation \(\pi \).

Our first observation is that \(\mathsf{{V}}_b\) is computationally indistinguishable from the distribution \(\mathcal {H}_b\) in which the public key is \(\left( G,g,\pi '\left( \varvec{u}\right) \right) \) for \(\pi '{\mathop {=}\limits ^{\mathsf {def}}}\left( \pi \right) ^{-1}\), and the samples from \(\mathcal{D}_b\) are unpermuted.

Our second observation is that the DDH assumption implies that \(\mathcal {H}_b\) is computationally indistinguishable from the distribution \(\mathcal {H}_b'\) in which the \(\left( x_1,\ldots ,x_n\right) \) additionally lie in a random 1-dimensional subspace \(L_{b,\varvec{v}}\). That is, \(\left( x_1,\ldots ,x_n\right) \) are chosen at random from \(L_{b,\varvec{v}}\), where in \(\mathcal {H}_0'\) \(\varvec{v}\) is random subject to \(\varvec{v}\cdot \varvec{u}=0\), and in \(\mathcal {H}_1'\) \(\varvec{v}\) is uniformly random. Specifically, we show that the problem of distinguishing between \(\mathcal {H}_b,\mathcal {H}_b'\) can be efficiently reduced to the task of distinguishing between a polynomial number of length-\((n-1)\) vectors of the form \(\left( g^{y_1},\ldots ,g^{y_{n-1}}\right) \), where the \(\left( y_1,\ldots ,y_{n-1}\right) \) are all sampled from a random 1-dimensional subspace of \({\mathbb {Z}}_p^{n-1}\) or all sampled from the full space \({\mathbb {Z}}_p^{n-1}\). If the DDH assumption holds in G then a polynomial-sized adversary cannot efficiently distinguish between these distributions [BHHO08]. Consequently, it suffices to show that \(\mathcal {H}_0',\mathcal {H}_1'\) are computationally close.

The final step is to show that \(\mathcal {H}_0',\mathcal {H}_1'\) are computationally (in fact, statistically) close. The only difference between the two distributions is in the choice of \(\varvec{v}\) (which is orthogonal to \(\varvec{u}\) in \(\mathcal {H}_0'\), and random in \(\mathcal {H}_1'\)), where all other sampled values are either identical or deterministically determined by the choice of \(\varvec{v}\). Notice that in \(\mathcal {H}_1'\), \(\left( \pi \left( \varvec{u}\right) ,\varvec{v}\right) \) is uniformly random in \({\mathbb {Z}}_p^n\times {\mathbb {Z}}_p^n\). Thus, to show that \(\mathcal {H}_0',\mathcal {H}_1'\) are statistically close and conclude the proof, it suffices to prove that \(\left( \pi \left( \varvec{u}\right) ,\varvec{v}\right) \) in \(\mathcal {H}_0'\) is statistically close to uniform over \({\mathbb {Z}}_p^n\times {\mathbb {Z}}_p^n\). Very roughly, this follows from the leftover hash lemma due to the following observations. First, \(\pi \left( \varvec{u}\right) \) has high min-entropy even conditioned on \(\varvec{u}\) (because \(\pi \) is random). Second, the family of inner product functions with respect to a fixed vector (i.e., \(h_{\varvec{v}}\left( \varvec{v}'\right) =\varvec{v}\cdot \varvec{v}'\)) is a pairwise independent hash function.

Permuted Puzzles and the Learning Parity with Noise (LPN) Assumption. The argument used in the DDH-based permuted puzzle can be generalized to other situations in which it is hard to distinguish between the uniform distribution and a hidden permuted kernel (but easy to distinguish when the kernel is not permuted). This more general view allows us to cast the LPN assumption as a permuted puzzle, see Sect. 5.1.

Statistical-Query Lower Bound. We show that SQ algorithms that make polynomially many queries obtain only a negligible advantage in distinguishing the distributions \(\mathcal{D}_0,\mathcal{D}_1\) in the toy problem presented in Sect. 1.1. Recall that a sample in the toy problem is a permuted \(\mathsf {Graph}(\gamma )\) where \(\gamma \) is either a uniformly random degree-\(\lambda \) polynomial (in \(\mathcal{D}_0\)), or a uniformly random function (in \(\mathcal{D}_1\)), and that the SQ algorithm obtains the outputs of boolean-valued functions f of its choice on random samples. Very roughly, we will show that the outcome of f on (a permutation of) a random sample \(x\leftarrow \mathcal{D}_b\) is independent of the challenge bit b and the permutation \(\pi \).

Notice that every permutation \(\pi \) over \(\mathsf {Graph}(\gamma )\) defines a partition \(\varPhi {\mathop {=}\limits ^{\mathsf {def}}}\left\{ \pi \left( \{i\}\times {\mathbb F}\right) \right\} _{i\in {\mathbb F}}\) of \({\mathbb F}\times {\mathbb F}\), where each set in the partition corresponds to a single x value. We say that \(\pi \) respects the partition \(\varPhi \). Notice also that each set contains a single nonzero entry (which is \(\pi \left( i,\gamma (i)\right) \), where i is the value of x that corresponds to the set). Thus, an SQ algorithm can compute this partition, so we cannot hope to hide it. Instead, we show indistinguishability even when the adversary is given the partition.

Our main observation is that for every partition \(\varPhi \), and any boolean-valued function f, there exists \(p_{f,\varPhi }\in [0,1]\) such that for every \(b\in \{0,1\}\), with overwhelming probability over the choice of random permutation \(\pi \) that respects the partition \(\varPhi \), the expectation \(\mathbb {E}_{x\leftarrow \mathcal{D}_b}\left[ f\left( \pi \left( x\right) \right) \right] \) is very close to \(p_{f,\varPhi }\), where \(\pi \left( x\right) \) denotes that the entries of x are permuted according to \(\pi \). Crucially, \(p_{f,\varPhi }\) is independent of the challenge bit b, any particular sample x, and the permutation (other than the partition).

We prove this observation in two steps. First, we show that in expectation over the choice of the permutation, \(\mathbb {E}_{x\leftarrow \mathcal{D}_0}\left[ f\left( \pi \left( x\right) \right) \right] \) and \(\mathbb {E}_{x\leftarrow \mathcal{D}_1}\left[ f\left( \pi \left( x\right) \right) \right] \) have the same value. To see this, we write the expectations over \(x \leftarrow \mathcal{D}_b\) as a weighted sum \(\sum _{x} P_b(x) f(\pi (x))\), and apply linearity of expectation over the choice of \(\pi \). To show that this is independent of b, we observe that for any fixed x, the distribution of \(\pi (x)\) is the same (i.e. does not depend on x).
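Spelled out, the first step is the calculation

$$\begin{aligned} \mathop {\mathbb {E}}\limits _\pi \Big [\mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_b}\left[ f\left( \pi \left( x\right) \right) \right] \Big ] = \sum _{x} P_b(x) \cdot \mathop {\mathbb {E}}\limits _\pi \left[ f\left( \pi \left( x\right) \right) \right] = \mathop {\mathbb {E}}\limits _\pi \left[ f\left( \pi \left( x^*\right) \right) \right] \cdot \sum _{x} P_b(x) = \mathop {\mathbb {E}}\limits _\pi \left[ f\left( \pi \left( x^*\right) \right) \right] , \end{aligned}$$

where \(x^*\) is an arbitrary fixed string in the support: the second equality uses that \(\pi (x)\) and \(\pi (x^*)\) are identically distributed for every such x, and the third that \(\sum _{x} P_b(x)=1\). The right-hand side does not depend on b.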

Next, we show that for each \(b\in \{0,1\}\), the variance (over the choice of the permutation \(\pi \)) of \(\mathbb {E}_{x \leftarrow \mathcal{D}_b}\left[ f\left( \pi \left( x\right) \right) \right] \) is small. The variance is by definition the difference between

$$\begin{aligned} \mathop {\mathbb {E}}\limits _\pi \big [\mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_b}\left[ f\left( \pi \left( x\right) \right) \right] ^2 \big ] \end{aligned}$$
(1)

and

$$\begin{aligned} \mathop {\mathbb {E}}\limits _\pi \big [\mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_b}\left[ f\left( \pi \left( x\right) \right) \right] \big ]^2. \end{aligned}$$
(2)

We show that both Eqs. (1) and (2) can be expressed as an expectation (over some distribution of \(g, g'\)) of \(\mathbb {E}_\pi \big [f (\pi (\mathsf {Graph}(g))) \cdot f ( \pi (\mathsf {Graph}(g'))) \big ]\). We observe that this depends only on the Hamming distance between g and \(g'\). Finally, we observe that the distribution of \((g,g')\) is uniform in Eq. (2), and consists of two independent samples from \(\mathcal{D}_b\) in Eq. (1). To complete the bound on the variance, we show that when \(g, g'\) are sampled independently from \(\mathcal{D}_b\) (specifically, the interesting case is when they are sampled from \(\mathcal{D}_0\)), then the distribution of the Hamming distance between g and \(g'\) is nearly the same as when g and \(g'\) are independent uniformly random functions.

To prove this, we prove a lemma (Lemma 4) stating that when t-wise independent random variables \((X_1, \ldots , X_n)\) satisfy \(\Pr [X_i \ne \star _i] = p_i\) for some values of \(\star _i\) and \(p_i\) such that \(\sum _{i \in [n]} p_i \le \frac{t}{4}\) and \(t = \omega (\log \lambda )\), then \((X_1, \ldots , X_n)\) are statistically \(\mathrm{negl}(\lambda )\)-close to mutually independent. We apply this with \(X_i\) being the indicator random variable for the event that \(g(i) \ne g'(i)\). This lemma quantitatively strengthens a lemma of [CHR17].

Open Problems and Future Research Directions. The broad goal of basing DE-PIR on standard assumptions was a motivating starting point for this work, in which we put forth the framework of permuted puzzles. In describing hard permuted puzzles, we take a “bottom-up” approach by describing such constructions based on standard cryptographic assumptions. Since these permuted puzzles are still not known to imply DE-PIR, we try to close the gap between the permuted puzzle on which DE-PIR security is based, and provably hard permuted puzzles, by taking a “top-down” approach, and analyzing the security of a toy version of the DE-PIR permuted puzzle, against a wide class of possible attacks.

Our work still leaves open a fascinating array of questions; we discuss some of them below. First, it would be very interesting to construct a hard permuted puzzle based only on the existence of one-way functions, as well as to provide “public-key” hard permuted puzzles, namely ones in which the key generation algorithm needs no secret key, based on standard assumptions. In the context of DE-PIR and its related permuted puzzle, it would be interesting to construct DE-PIR based on other (and more standard) assumptions, as well as to analyze the security of its underlying permuted puzzle (and its toy version) against a wider class of attacks.

2 Preliminaries

For a set X, we write \(x\leftarrow X\) to denote that x is sampled uniformly at random from X. For a distribution \(\mathcal{D}\), we use \(\mathrm {Supp}\left( \mathcal{D}\right) \) to denote its support. The min-entropy of \(\mathcal{D}\) is \({\textsf {H}}_{\infty }\left( \mathcal{D}\right) {\mathop {=}\limits ^{\mathsf {def}}}\min _{x\in \mathrm {Supp}\left( \mathcal{D}\right) }{\log \frac{1}{\Pr _{\mathcal{D}}[x]}}\). For a pair XY of random variables, we denote their statistical distance by \(d_{\mathsf {TV}}\left( X,Y\right) \). We use \(\cdot \) to denote inner product, i.e., for a pair \(\varvec{x}=\left( x_1,\ldots ,x_n\right) ,\varvec{y}=\left( y_1,\ldots ,y_n\right) \) of vectors, \(\varvec{x}\cdot \varvec{y} {\mathop {=}\limits ^{\mathsf {def}}}\sum _{i=1}^{n}{x_iy_i}\). We use [n] to denote the set \(\{1, \ldots , n\}\), and \(S_n\) to denote the group of permutations of [n].

Notation 3

(Permutation of a vector). For a vector \(\varvec{x}=\left( x_1,\ldots ,x_n\right) \), and a permutation \(\pi \in S_n\), we denote:

$$ \pi \left( \varvec{x}\right) {\mathop {=}\limits ^{\mathsf {def}}}\left( x_{\pi ^{-1}(1)},\ldots ,x_{\pi ^{-1}(n)}\right) . $$
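In code, this convention reads as follows (a small illustrative sketch, with \(\pi \) represented 0-indexed as a list):

```python
def permute(x, pi):
    # Notation 3: entry i of x moves to position pi(i), so the output at
    # position j reads x[pi^{-1}(j)].
    out = [None] * len(x)
    for i, xi in enumerate(x):
        out[pi[i]] = xi
    return out

# permute(['a', 'b', 'c'], [2, 0, 1]) == ['b', 'c', 'a']
```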

3 Distinguishing Problems and Permuted Puzzles

In this section, we formally define (permuted) puzzle problems, which are, roughly, a special case of ensembles of keyed “string-distinguishing” problems.

We begin in Sect. 3.1 by developing terminology for general string-distinguishing and puzzle problems. In Sect. 3.2 we present the formal distinguishing challenge and define hardness. Then, in Sect. 3.3, we discuss the case of permuted puzzles, and present an alternative indistinguishability notion that is equivalent in certain cases.

3.1 String-Distinguishing Problems

At the core, we consider string-distinguishing problems, defined by a pair of distributions over n-element strings. We begin by defining a finite instance.

Definition 1

(String-Distinguishing Problems). A string-distinguishing problem is a tuple \(\varPi = (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1)\), where n is a positive integer, \(\varSigma \) is a non-empty finite set, and each \(\mathcal{D}_b\) is a distribution on \(\varSigma ^n\). We call n the string length, and \(\varSigma \) the string alphabet.

More generally, an oracle-dependent string-distinguishing problem is a function \(\varPi ^{(\cdot )}\) that maps an oracle \(O : \{0,1\}^* \rightarrow \{0,1\}\) to a string-distinguishing problem \(\varPi ^O\).

For example, we will consider permuted puzzle string-distinguishing problems relative to a random oracle in Sect. 4. Note that oracle-dependent string-distinguishing problems are strictly more general than string-distinguishing problems, as the distributions can simply ignore the oracle.

Remark 1

(Oracle Outputs). In the above, we modeled the oracle as outputting a single bit for simplicity. However, any (deterministic) oracle with multi-bit output can be emulated given a corresponding single-bit-output oracle, at the cost of making more oracle queries.

We will be interested in distinguishing problems where the distributions \(\mathcal{D}_0\) and \(\mathcal{D}_1\) may depend on common sampled “key” information. Parts of this key may be publicly available, or hidden from a distinguishing adversary (discussed in Definition 4); these parts are denoted \(\mathsf {pk},\mathsf {sk}\), respectively.

Definition 2

(Keyed Families). A keyed family of (oracle-dependent) string-distinguishing problems is a tuple \((\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\), where \(\mathcal{K}\) is a distribution on a non-empty finite set of pairs \((\mathsf {pk}, \mathsf {sk})\) and each \(\varPi _k\) is an (oracle-dependent) string-distinguishing problem. We refer to the support of \(\mathcal{K}\) as the key space, and also denote it by \(\mathcal{K}\).

Note that any string-distinguishing problem can trivially be viewed as a keyed family by letting \(\mathcal{K}\) be a singleton set.

Example 1

(Keyed Family: Dimension-t Subspaces). For a finite field \({\mathbb F}\), and \(n\in {\mathbb {N}}\), consider an example keyed family of string-distinguishing problems \((\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) as follows:

  • \(\mathcal{K}\) samples a random \(t\leftarrow \{1,\ldots ,n-1\}\), and a random subspace \(L\subseteq {\mathbb F}^n\) of dimension t, sets \(\mathsf {pk}=t\) and \(\mathsf {sk}=L\), and outputs \(\left( \mathsf {pk},\mathsf {sk}\right) \).

  • For a key \(k=\left( t,L\right) \), the corresponding string-distinguishing problem is \(\varPi _k= (n, {\mathbb F}, \mathcal{D}_0, \mathcal{D}_1)\) where \(\mathcal{D}_0\) outputs a uniformly random \(\varvec{v}\in L\), and \(\mathcal{D}_1\) outputs a uniformly random \(\varvec{v}\in {\mathbb F}^n\).

Note that in this example, it will be computationally easy to distinguish between the distributions \(\mathcal{D}_0,\mathcal{D}_1\) given sufficiently many samples.
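A sketch of this keyed family over a small prime field (toy sizes of our choosing; the t sampled vectors span a dimension-t subspace with high probability over a large field):

```python
import random

P, N = 101, 8    # toy field size and string length (assumption)

def keygen():
    t = random.randrange(1, N)   # pk = t
    # sk = L, represented by t random vectors (full rank w.h.p. for large P)
    basis = [[random.randrange(P) for _ in range(N)] for _ in range(t)]
    return t, basis

def sample(basis, b):
    if b == 0:   # D_0: uniformly random vector in L = span(basis)
        coeffs = [random.randrange(P) for _ in basis]
        return [sum(c * row[j] for c, row in zip(coeffs, basis)) % P
                for j in range(N)]
    return [random.randrange(P) for _ in range(N)]   # D_1: uniform in F^n
```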

We next define a puzzle problem which, informally, is an efficiently sampleable ensemble of keyed families of string-distinguishing problems.

Definition 3

(Puzzle problem). A puzzle problem is an ensemble \(\{(\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k\}_{k \in \mathcal{K}_\lambda })\}_{\lambda \in {\mathbb Z}^+}\) of keyed families of (oracle-dependent) string-distinguishing problems associated with probabilistic polynomial-time algorithms \(\mathsf {KeyGen}\) and \(\mathsf {Samp}\) such that:

  • For any \(\lambda \in {\mathbb Z}^+\), \(\mathsf {KeyGen}(1^\lambda )\) outputs a sample from \(\mathcal{K}_\lambda \).

  • For any \(k \in \mathcal{K}_\lambda \), any \(b \in \{0,1\}\), and any oracle \(O : \{0,1\}^* \rightarrow \{0,1\}\), \(\mathsf {Samp}^{O}(k, b)\) outputs a sample from \(\mathcal{D}_b\), where \(\varPi ^O_k = (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1)\).

Remark 2

(Abbreviated terminology). Somewhat abusing notation, we will also refer to a single keyed family of string-distinguishing problems as a puzzle problem.

3.2 Distinguishing Games and Hardness

We will focus on puzzle problems where it is computationally hard to distinguish between the pair of distributions. This notion of hardness is formalized through the following distinguishing game. Roughly, the distinguishing adversary is given a challenge sample x from a randomly selected \(\mathcal{D}_b\), and query access to both distributions (denoted by choices \(\beta \) below), and must identify from which \(\mathcal{D}_b\) the sample x was drawn.

Definition 4

(Distinguishing Game). Let \(\mathcal{P}=(\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) be a puzzle problem, and let \(\mathcal{O}\) be a distribution of oracles. The distinguishing game \(\mathcal{G}_{\mathsf {dist}}^{\mathcal{O}}[\mathcal{P}]\) is run between an “adversary” \(\mathcal{A}\) and a fixed “challenger” \(\mathcal{C}\), and is defined as follows:

  1.

    \(\mathcal{C}\) samples a key \(k = (\mathsf {pk}, \mathsf {sk})\) from \(\mathcal{K}\), and \(O \leftarrow \mathcal{O}\), and denote \(\varPi ^O_k = (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1)\). \(\mathcal{C}\) sends \(\mathsf {pk}\) to \(\mathcal{A}\), who is also given oracle access to O throughout the game.

  2.

    \(\mathcal{C}\) samples a random bit \(b \leftarrow \{0,1\}\), samples \(x \leftarrow \mathcal{D}_b\), and sends x to \(\mathcal{A}\).

  3.

    The following is repeated an arbitrary number of times: \(\mathcal{A}\) sends a bit \(\beta \) to \(\mathcal{C}\), who samples \(x' \leftarrow \mathcal{D}_\beta \) and sends \(x'\) to \(\mathcal{A}\).

  4.

    \(\mathcal{A}\) outputs a “guess” bit \(b' \in \{0,1\}\).

\(\mathcal{A}\) is said to win the game if \(b' = b\). \(\mathcal{A}\)’s advantage is \({\textsf {Adv}}_{\mathcal{A}}(\mathcal{G}_{\mathsf {dist}}^{\mathcal{O}}[\mathcal{P}]){\mathop {=}\limits ^{\mathsf {def}}}2 \cdot \big | \Pr [b' = b] - \frac{1}{2} \big |\).
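Schematically, the game can be rendered as follows; keygen, samp, and adversary are assumed interfaces standing in for \(\mathsf {KeyGen}\), \(\mathsf {Samp}\), and \(\mathcal{A}\), and oracles are omitted.

```python
import random

def distinguishing_game(keygen, samp, adversary):
    pk, sk = keygen()                              # step 1: sample the key
    b = random.randrange(2)                        # step 2: challenge bit ...
    challenge = samp((pk, sk), b)                  #         ... and sample
    oracle = lambda beta: samp((pk, sk), beta)     # step 3: repeated queries
    b_guess = adversary(pk, challenge, oracle)     # step 4: the guess b'
    return b_guess == b                            # adversary wins iff b' = b
```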

Informally, a permuted puzzle is computationally hard if every polynomial-time adversary wins the distinguishing game of Definition 4 with only negligible advantage. We first formalize the notion of concrete hardness.

Definition 5

(Concrete Hardness). A puzzle problem \(\mathcal{P}=(\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) is said to be \((s, \epsilon )\)-hard (with respect to oracle distribution \(\mathcal{O}\)) if in the game \(\mathcal{G}_{\mathsf {dist}}^{\mathcal{O}}[\mathcal{P}]\), all adversaries \(\mathcal{A}\) of size at most s have advantage at most \(\epsilon \).

We say a puzzle problem \(\big \{(\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k \}_{k \in \mathcal{K}_\lambda }) \big \}_{\lambda \in {\mathbb Z}^+}\) is \(\big (s(\cdot ), \epsilon (\cdot ) \big )\)-hard (with respect to an ensemble \(\{\mathcal{O}_\lambda \}\) of oracle distributions) if each \((\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k\}_{k \in \mathcal{K}_\lambda })\) is \(\big (s(\lambda ), \epsilon (\lambda ) \big )-\)hard with respect to \(\mathcal{O}_\lambda \).

Definition 6

(Asymptotic Hardness). As usual, we say simply that \(\mathcal{P}\) is (computationally) hard if for every \(s(\lambda ) \le \lambda ^{O(1)}\), there exists \(\epsilon (\lambda ) \le \lambda ^{-\omega (1)}\) such that for every \(\lambda \in {\mathbb Z}^+\), \(\mathcal{P}\) is \((s(\cdot ), \epsilon (\cdot ))\)-hard.

\(\mathcal{P}\) is statistically hard if for some \(\epsilon (\lambda ) \le \lambda ^{-\omega (1)}\), \(\mathcal{P}\) is \((\infty , \epsilon (\cdot ))\)-hard.

Remark 3

(Discussion on Definition).

A slightly simpler and more natural definition would be to give the adversary access to (polynomially-many samples from) only a randomly selected \(\mathcal{D}_b\), where the adversary must identify b.

For keyed puzzles, these definitions are in general not equivalent. Consider, for example, a modified version of Example 1, where both \(\mathcal{D}_0\) and \(\mathcal{D}_1\) are defined by random dimension-t subspaces, \(L_0\) and \(L_1\). Then over the choice of the key (including \(L_0,L_1\)), the distributions \(\mathcal{D}_0\) and \(\mathcal{D}_1\) on their own are identical: that is, even an unbounded adversary with arbitrarily many queries would have 0 advantage in the simplified challenge. However, given t samples from both distributions, as in Definition 4, \(\mathcal{D}_0\) and \(\mathcal{D}_1\) are trivially separated, and a sample x can be correctly labeled with noticeable advantage. On the other hand, hardness with respect to our definition implies hardness with respect to the simplified notion, by a hybrid argument over the number of queries (see Lemma 1).

Since our motivation for studying puzzles comes from applications where correlated samples from the corresponding distributions can be revealed (e.g., correlated PIR queries on different indices i), we maintain the more complex, stronger definition.

The definitional separation in the example above stems from the fact that given access to only one distribution \(\mathcal{D}_b\), one cannot necessarily simulate consistent samples from \(\mathcal{D}_0\) and \(\mathcal{D}_1\). However, in certain instances, this issue does not arise; for example, if one of the two is simply the uniform distribution over strings. We formally address this connection in the following section: presenting the simplified indistinguishability notion in Definition 8, and proving equivalence for certain special cases in Lemma 2.

3.3 Permuted Puzzles and a Related Indistinguishability Notion

In this work we will focus on permuted puzzles. This is a special case of puzzle problems, as we now define. Here, the key includes an additional secret random permutation \(\pi \) on the indices of the n-element strings, and strings output by the distributions \(\mathcal{D}_0,\mathcal{D}_1\) are permuted as dictated by \(\pi \).

Definition 7

(Permuted Puzzle Problems). For a puzzle problem \(\mathcal{P}= \{(\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k\}_{k \in \mathcal{K}_\lambda })\}_{\lambda \in {\mathbb Z}^+}\), we define the associated permuted puzzle problem \(\mathsf {Perm}\left( \mathcal{P}\right) {\mathop {=}\limits ^{\mathsf {def}}}\{(\mathcal{K}_\lambda ', \{\varPi ^{\prime (\cdot )}_{k'}\}_{{k'} \in \mathcal{K}_\lambda '})\}_{\lambda \in {\mathbb Z}^+}\), where:

  • A sample from \(\mathcal{K}_{\lambda }'\) is \(\big (\mathsf {pk}, (\mathsf {sk}, \pi ) \big )\), where:

    • \((\mathsf {pk}, \mathsf {sk})\) is sampled from \(\mathcal{K}_\lambda \), and

    • If \(\varPi _{(\mathsf {pk}, \mathsf {sk})} = (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1)\), then \(\pi \) is sampled uniformly at random from the symmetric group \(S_n\).

  • For any key \(k' = (\mathsf {pk}, (\mathsf {sk}, \pi ))\), if \(\varPi _{(\mathsf {pk}, \mathsf {sk})} = (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1)\) then \(\varPi '_{k'} = (n, \varSigma , \mathcal{D}'_0, \mathcal{D}'_1)\), where a sample from \(\mathcal{D}'_b\) is \(\pi (x)\) for \(x \leftarrow \mathcal{D}_b\).

Recall (Notation 3) that for a vector \(x \in \varSigma ^n\) and \(\pi \in S_n\), \(\pi (x)\) denotes the index-permuted vector.

As discussed in Remark 3, we now present a simplified notion of indistinguishability, and show that in certain special cases, this definition aligns with Definition 6. In such cases, it will be more convenient to work with the simplified version.

Definition 8

(Weak Hardness of Puzzle Problems). Let \(\mathcal{P}=(\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) and \(\mathcal{O}\) be as in Definition 4. The simplified distinguishing game \(\mathcal{G}_{\mathsf {dist},s}^{\mathcal{O}}[\mathcal{P}]\) is defined similarly to \(\mathcal{G}_{\mathsf {dist}}^{\mathcal{O}}[\mathcal{P}]\), except that in Step 3, \(\mathcal{C}\) samples \(x' \leftarrow \mathcal{D}_b\) (instead of \(x'\leftarrow \mathcal{D}_{\beta }\)).

A puzzle problem \(\mathcal{P}=(\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) is weak \((s, \epsilon )\)-hard if \({\textsf {Adv}}_{\mathcal{A}}(\mathcal{G}_{\mathsf {dist},s}^{\mathcal{O}}[\mathcal{P}])\le \epsilon \) for any size-s adversary \(\mathcal{A}\). Weak computational hardness is defined similarly to Definition 6.

Note that weak computational (statistical) hardness (with respect to Definition 8) is implied by hardness with respect to Definition 4:

Lemma 1

(Standard \(\Rightarrow \) Weak). Let \(\mathcal{P}= \{(\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k\}_{k \in \mathcal{K}_\lambda })\}_{\lambda \in {\mathbb Z}^+}\) be a puzzle problem. If \(\mathcal{P}\) is computationally (statistically, respectively) hard in the standard sense (Definition 6) then it is weak computationally (statistically, respectively) hard (Definition 8).

The more interesting direction is that weak hardness implies (standard) hardness in the case that one of the two distributions \(\mathcal{D}_0\) or \(\mathcal{D}_1\) is efficiently sampleable and permutation-invariant, in the following sense.

Definition 9

(Permutation-Invariant Distributions). Let \(n\in {\mathbb {N}}\), let \(\varSigma \) be a non-empty set, and let \(\mathcal{D}\) be a distribution over \(\varSigma ^n\). For a permutation \(\pi \in S_n\), let \(\mathcal{D}_{\pi }\) be the distribution induced by sampling \(x\leftarrow \mathcal{D}\) and outputting \(\pi \left( x\right) \). We say that \(\mathcal{D}\) is permutation-invariant if for a uniformly random \(\pi \in S_n\), the joint distribution \(\mathcal{D}_\pi \times \mathcal{D}_{\pi }\) is identical to \(\mathcal{D}\times \mathcal{D}_\pi \).

Remark 4

One example of a permutation-invariant distribution \(\mathcal{D}\) particularly useful in this work is the uniform distribution over \(\varSigma ^n\).

Lemma 2

(In certain cases Weak \(\Rightarrow \) Standard). Let \(\mathcal{P}= \{(\mathcal{K}_\lambda , \{\varPi ^{(\cdot )}_k\}_{k \in \mathcal{K}_\lambda })\}_{\lambda \in {\mathbb Z}^+}\) be a puzzle problem. If:

  • The corresponding permuted puzzle \(\mathsf {Perm}\left( \mathcal{P}\right) \) is weak computationally hard (Definition 8).

  • For every \(\lambda \), every \(k=\left( \mathsf {pk},\mathsf {sk}\right) \in \mathrm {Supp}\left( \mathcal{K}_\lambda \right) \), and every \(\varPi _k=\left( n,\varSigma ,\mathcal{D}_0,\mathcal{D}_1\right) \):

    • \(\mathcal{D}_1\) is permutation-invariant.

    • One can efficiently sample from \(\mathcal{D}_1\) without \(\mathsf {sk}\).

Then \(\mathsf {Perm}\left( \mathcal{P}\right) \) is computationally hard in the standard sense (Definition 6).

Finally, we show that the existence of hard permuted puzzles for which the original distributions \(\mathcal{D}_0,\mathcal{D}_1\) are statistically far implies the existence of one-way functions. Note that this holds with respect to our standard (strong) definition of computational hardness, but not in general for the weaker notion (where, for example, even trivially distinguishable singleton distributions \(\mathcal{D}_0\) over (0, 1) and \(\mathcal{D}_1\) over (1, 0) become statistically identical when receiving samples only from permuted-\(\mathcal{D}_0\) or permuted-\(\mathcal{D}_1\)).

Lemma 3

If \(\mathcal{P}\) is a puzzle problem that is not statistically hard, but \(\mathsf {Perm}(\mathcal{P})\) is computationally hard, then there exists a one-way function.

The proofs of Lemmas 1, 2 and 3 are deferred to the full version.

4 Hard Permuted Puzzles in the Random Oracle Model

We show that there exist computationally hard permuted puzzles in the random oracle model. We first formally define the notion of a random oracle.

Definition 10

(Random Oracle). We use the term random oracle to refer to the uniform distribution on functions mapping \(\{0,1\}^* \rightarrow \{0,1\}\).

Construction 4

(Permuted puzzles in the ROM). Let H be a random oracle. For a security parameter \(\lambda \), we interpret H as a function \(H_\lambda :\{0,1\}^{m_\lambda +\lambda }\rightarrow \{0,1\}^{m_\lambda }\) for \(m_\lambda =2\left( \lambda +1\right) \log \lambda \) (also see Remark 1). We define a puzzle problem \(\mathcal{P}= \big \{ (\mathcal{K}_\lambda , \{\varPi _k\}_{k \in \mathcal{K}_\lambda }) \big \}\) by the following \(\mathsf {KeyGen}\) and \(\mathsf {Samp}\) algorithms:

  • \(\mathsf {KeyGen}\left( 1^\lambda \right) \) outputs \(1^{\lambda }\) as the public key (the secret key is empty). We note that for any \(\lambda \), the corresponding string distinguishing problem \(\varPi _\lambda =\left( n,\varSigma ,\mathcal{D}_0^{(\cdot )},\mathcal{D}_1^{(\cdot )}\right) \) has \(n=\lambda +2\) and \(\varSigma = \{0,1\}^{m_\lambda } \times \{\texttt {INPUT}, \texttt {OUTPUT}, \texttt {SEED}\}\).

  • \(\mathsf {Samp}\left( k,b\right) \) where \(k=1^{\lambda }\) outputs a sample from \(\mathcal{D}_{\lambda ,b}^{H_\lambda }\) for \(H_\lambda : \{0,1\}^{m_\lambda +\lambda } \rightarrow \{0,1\}^{m_\lambda }\) as defined above, where \(\mathcal{D}_{\lambda ,b}^{H_\lambda }\) is defined as follows.

    • A sample from \(\mathcal{D}_{\lambda ,0}^{H_\lambda }\) is of the form \((\sigma _1, \ldots , \sigma _{\lambda + 2})\), where:

      • For \(i \in [\lambda ]\), \(\sigma _i=(s_i, \texttt {SEED})\) for uniformly random and independent \(s_1, \ldots , s_\lambda \) in \(\{0,1\}^{m_\lambda }\).

      • \(\sigma _{\lambda + 1}=(x_0, \texttt {INPUT})\), where \(x_0\) is uniformly random in \(\{0,1\}^{m_\lambda }\).

      • \(\sigma _{\lambda + 2}=(x_{\lambda }, \texttt {OUTPUT})\), where for each \(i \in [\lambda ]\), \(x_i = H_\lambda (s_i', x_{i-1})\), where \(s_i'\) is the length-\(\lambda \) prefix of \(s_i\). (That is, the random oracle uses length-\(\lambda \) seeds, and the rest of the bits in the seed are ignored.)

    • \(\mathcal{D}_{\lambda , 1}^{H_\lambda }\) is defined identically to \(\mathcal{D}_{\lambda ,0}^{H_\lambda }\), except that \(x_{\lambda }\) is uniformly random in \(\{0,1\}^{m_\lambda }\), independent of \(x_0\), \(H_\lambda \), and \(s_1\), \(\ldots \), \(s_\lambda \).

Proposition 1

The puzzle problem \(\mathcal{P}\) of Construction 4 is computationally easy, and the corresponding permuted puzzle problem \(\mathsf {Perm}\left( \mathcal{P}\right) \) is statistically hard, with respect to a random oracle.

We note that \(\mathcal{P}\) is computationally easy in an extremely strong sense: a polynomial-sized adversary can obtain advantage \(1-\mathrm{negl}\left( \lambda \right) \) in the distinguishing game. The proof is deferred to the full version.

5 Hard Permuted Puzzles in the Plain Model

In this section we discuss permuted puzzle problems based on hidden permuted kernels. At a high level, these puzzles have the following structure. First, the distributions \(\mathcal{D}_0,\mathcal{D}_1\) are associated with a group G with generator g, and a uniformly random public “constraint vector” \(\varvec{c}\). Samples from \(\mathcal{D}_0\) and \(\mathcal{D}_1\) are vectors in \(G^n\), of the form \(g^{\varvec{x}}\). Specifically, \(\mathcal{D}_1\) samples a uniformly random vector in \(G^n\), whereas \(\mathcal{D}_0\) samples a vector \(\varvec{x}\) that is uniformly random subject to being orthogonal to \(\varvec{c}\). Intuitively, since \(\mathcal{D}_1\) is uniformly random, weak computational hardness of the permuted puzzle problem implies computational hardness by Lemma 2.

Remark 5

(An alternative formulation of the problem). In the high-level blueprint of a permuted puzzle problem described above, the constraint vector \(\varvec{c}\) is given “in the clear” (namely, we assume it is public, and indistinguishability does not rely on the secrecy of \(\varvec{c}\)), and the samples \(\varvec{x}\) are permuted according to a random permutation \(\pi \in S_n\); namely, the adversary obtains \(\pi \left( \varvec{x}\right) \) (recall that \(\pi \left( \varvec{x}\right) =\left( x_{\pi ^{-1}(1)},\ldots ,x_{\pi ^{-1}(n)}\right) \)). Let \(\mathcal{C}\) denote the set of “good” constraint vectors \(\varvec{c}\), i.e., vectors that satisfy the construction’s requirement, and let \(G^n\) denote the domain over which \(\mathcal{D}_0,\mathcal{D}_1\) are defined. Let \(\mathcal{D}_b'{\mathop {=}\limits ^{\mathsf {def}}}\left( \varvec{c}, \left( \pi \left( \varvec{x}_i\right) \right) _{i\in [q]}\right) _{\varvec{c}\leftarrow \mathcal{C},\pi \leftarrow S_n,\varvec{x}_i\leftarrow \mathcal{D}_b}\) denote the distribution over the adversary’s view in the simplified distinguishing game of Definition 8, where b is the challenge bit, and q is the number of samples the adversary receives from the challenger. Denote \(\mathcal{D}_b''{\mathop {=}\limits ^{\mathsf {def}}}\left( \pi \left( \varvec{c}\right) , \left( \varvec{x}_i\right) _{i\in [q]}\right) _{\varvec{c}\leftarrow \mathcal{C},\pi \leftarrow S_n,\varvec{x}_i\leftarrow \mathcal{D}_b}\). The permuted puzzle problems described in this section will have the property that \(\mathcal{D}_b'\approx \mathcal{D}_b''\) for \(b\in \{0,1\}\), which will be used in the security proofs.

5.1 Permuted Puzzles and the Learning Parity with Noise (LPN) Assumption

We now describe how to cast the Learning Parity with Noise (LPN) assumption as a permuted puzzle.

Notation. For \(\varvec{a}\in {\mathbb F}_2^n\), we use \(\left| \varvec{a}\right| \) to denote the Hamming weight of \(\varvec{a}\). For \(i\in \left[ n\right] \), we denote \({{\mathbf {\mathsf{{v}}}}}_{n,i}=1^i\cdot 0^{n-i}\) (i.e., a canonical length-n vector of Hamming weight i). For \(n\in {\mathbb {N}}\), let \(\mathcal{R}_{n}\) denote the distribution that outputs a uniformly random \(\varvec{x} \leftarrow {\mathbb F}_2^n\). For a fixed \(\varvec{s}\in {\mathbb F}_2^n\), and \(\gamma \in (0,1)\), let \(\mathcal{D}_{{\textsf {LPN}},\varvec{s},\gamma }\) denote the distribution over \({\mathbb F}_2^n\) that with probability \(\gamma \) outputs a uniformly random \(\varvec{x}\leftarrow {\mathbb F}_2^n\), and otherwise (with probability \(1-\gamma \)) outputs a uniformly random element of the set \(\left\{ \varvec{x} \in {\mathbb F}_2^n\ :\ \varvec{x} \cdot \varvec{s} = 0\right\} \).

Definition 11

(Learning Parity with Noise (LPN)). Let \(\gamma \in (0,1)\). The \(\gamma \)-Learning Parity with Noise (\(\gamma \)-LPN) assumption conjectures that for every polynomial-sized oracle circuit ensemble \(\mathcal{A}=\left\{ \mathcal{A}_{\lambda }\right\} _{\lambda }\) there exists a negligible function \(\epsilon \left( \lambda \right) \) such that for every \(\lambda \),

$$ {\textsf {Adv}}_{\mathcal{A}}^{{\textsf {LPN}}}\left( \lambda \right) {\mathop {=}\limits ^{\mathsf {def}}}\left| \mathop {\Pr }\limits _{\varvec{s}\leftarrow {\mathbb F}_2^{\lambda }}\left[ \mathcal{A}^{\mathcal{D}_{{\textsf {LPN}},\varvec{s},\gamma }}(1^\lambda )=1\right] -\Pr \left[ \mathcal{A}^{\mathcal{R}_{\lambda }}(1^\lambda )=1\right] \right| \le \epsilon \left( \lambda \right) . $$
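For concreteness, here is a minimal Python sketch (names ours) of the two oracle distributions of Definition 11, with \(\mathcal{D}_{{\textsf {LPN}},\varvec{s},\gamma }\) implemented by rejection sampling over the orthogonality constraint:

```python
import random

def sample_R(n):
    """One sample from R_n: a uniformly random vector in F_2^n."""
    return [random.randrange(2) for _ in range(n)]

def sample_D_lpn(s, gamma):
    """One sample from D_{LPN,s,gamma}: with probability gamma a uniformly
    random vector, otherwise a uniformly random x with <x, s> = 0 over F_2."""
    n = len(s)
    if random.random() < gamma:
        return sample_R(n)
    while True:  # rejection sampling; accepts with probability 1/2 if s != 0
        x = sample_R(n)
        if sum(xi & si for xi, si in zip(x, s)) % 2 == 0:
            return x
```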

Remark 6

(Equivalence to standard LPN formulation). Recall that the standard \(\gamma \)-LPN assumption, for \(0< \gamma < \frac{1}{2}\), states that any polynomial-time adversary obtains only a negligible advantage in distinguishing between (polynomially many samples from) the following distributions:

  • \((\varvec{a}_i, \langle \varvec{a}_i, \varvec{s}\rangle + e_i)_{i = 1}^m\), where for every i, \(\varvec{a}_i \leftarrow {\mathbb F}_2^n\) and \(e_i\) is sampled from a Bernoulli distribution with \(\Pr [e_i = 1] = \gamma \); vs.

  • \((\varvec{a}_i, u_i)_{i=1}^m\), where each \((\varvec{a}_i, u_i)\) is sampled uniformly at random from \({\mathbb F}_2^{n+1}\).

We now show that if the standard LPN assumption holds with parameters \((\lambda -1,\gamma /2)\), then Definition 11 holds with parameters \((\lambda ,\gamma )\), where the distinguishing advantage increases by at most \(2^{-\lambda }\).

  • In Definition 11, if \(\varvec{s}=\varvec{0}\) then \(\mathcal{D}_{{\textsf {LPN}}, \varvec{s},\gamma }\) and \(\mathcal{R}_{\lambda }\) are identically distributed, whereas in the standard LPN formulation they might be distinguishable (with some advantage \(\le 1\)). Since the event \(\varvec{s}=\varvec{0}\) occurs with probability \(2^{-\lambda }\), it accounts for the additive \(2^{-\lambda }\) loss in advantage.

  • Conditioned on \(\varvec{s} \ne \varvec{0}\) in Definition 11, there exists at least one coordinate \(i \in [\lambda ]\) such that \(s_i = 1\), in which case the i’th coordinate of a sample from \(\mathcal{D}_{{\textsf {LPN}},\varvec{s}, \gamma }\) is a noisy linear function of the other coordinates. That is, with probability \((1-\gamma ) + \frac{\gamma }{2}\), the vector \(\varvec{x}\) is random subject to \(x_i = \sum _{j \ne i} x_js_j\), and with probability \(\frac{\gamma }{2}\) it is random subject to \(x_i = \sum _{j \ne i} x_js_j + 1\). Moreover, since \(\varvec{s}\) is uniformly random over non-zero vectors, such a coordinate is equally likely to occur at any index \(i \in [\lambda ]\) (in contrast, in the standard LPN formulation the last coordinate necessarily satisfies this “special” structure; i.e., it is equivalent to \(\mathcal{D}_{{\textsf {LPN}},\varvec{s}, \gamma }\) with secret \(\varvec{s} = (\varvec{s}',1)\)).

Thus, conditioned on the (overwhelming-probability) event that \(\varvec{s} \ne \varvec{0}\), we can reduce the problem of distinguishing standard LPN with parameters \((\lambda -1,\gamma /2)\) to distinguishing our version with parameters \((\lambda ,\gamma )\), by selecting a random \(i \leftarrow [\lambda ]\) and transposing the i’th coordinate of all received LPN samples with the final coordinate.
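The following Python sketch (names ours) illustrates the sample-rewriting step of this reduction: a standard LPN sample \((\varvec{a},b)\) with \(\varvec{a}\in {\mathbb F}_2^{\lambda -1}\) is packed into \(\varvec{x}=(\varvec{a},b)\in {\mathbb F}_2^{\lambda }\), which satisfies \(\langle \varvec{x},(\varvec{s}',1)\rangle =e\), and a single random transposition (fixed across all samples) moves the “special” coordinate to a random index:

```python
import random

def rewrite_samples(std_samples):
    """Rewrite standard LPN samples (a, b), where b = <a, s'> + e over F_2,
    into the form of Definition 11 with the special coordinate at a random
    position.  One index i is drawn once and reused for all samples,
    matching the reduction described above."""
    lam = len(std_samples[0][0]) + 1
    i = random.randrange(lam)
    out = []
    for a, b in std_samples:
        x = list(a) + [b]          # <x, (s', 1)> = e over F_2
        x[i], x[-1] = x[-1], x[i]  # transpose coordinates i and lam - 1
        out.append(x)
    return out
```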

We now describe how to cast LPN as a permuted puzzle.

Construction 5

(Permuted puzzle problem from LPN). For a noise parameter \(\gamma \in (0,1/2)\), we define a puzzle problem \(\mathcal{P}_\gamma = \big \{ (\mathcal{K}_\lambda , \{\varPi _k\}_{k \in \mathcal{K}_\lambda } ) \big \}\) by the following \(\mathsf {KeyGen}\) and \(\mathsf {Samp}\) algorithms:

  • \(\mathsf {KeyGen}\left( 1^\lambda \right) \) samples a weight \({\textsf {w}}\) according to the binomial distribution over \(\left[ \lambda \right] \). It outputs \({\textsf {w}}\) as the secret key (there is no public key).

    For a key k generated by \(\mathsf {KeyGen}\left( 1^\lambda \right) \), the corresponding string-distinguishing problem \(\varPi _k = \big (n, \varSigma , \mathcal{D}_0, \mathcal{D}_1 \big )\) has string length \(n = \lambda \) and alphabet \(\varSigma = {\mathbb F}_2\).

  • \(\mathsf {Samp}\left( {\textsf {w}},b\right) \) outputs a sample from \(\mathcal{D}_{\lambda ,b}\), where \(\mathcal{D}_{\lambda ,0}=\mathcal{D}_{{\textsf {LPN}},{{\mathbf {\mathsf{{v}}}}}_{\lambda ,{\textsf {w}}},\gamma }\), and \(\mathcal{D}_{\lambda ,1}=\mathcal{R}_{\lambda }\).
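A minimal Python sketch of Construction 5 (names ours; in the permuted puzzle \(\mathsf {Perm}\left( \mathcal{P}_\gamma \right) \) a secret uniformly random permutation of the \(\lambda \) coordinates is additionally applied to every sample, which turns the effective secret \(\pi \left( {{\mathbf {\mathsf{{v}}}}}_{\lambda ,{\textsf {w}}}\right) \) into an essentially uniform LPN secret):

```python
import random

def keygen(lam):
    """Secret key: a weight w distributed as the Hamming weight of a
    uniformly random vector in F_2^lam (we gloss over the negligible
    w = 0 corner case)."""
    return sum(random.randrange(2) for _ in range(lam))

def samp(lam, w, b, gamma):
    """D_{lam,1} is uniform over F_2^lam; D_{lam,0} is D_{LPN,v,gamma} for
    the canonical weight-w secret v = v_{lam,w} = 1^w 0^(lam-w)."""
    if b == 1:
        return [random.randrange(2) for _ in range(lam)]
    if random.random() < gamma:
        return [random.randrange(2) for _ in range(lam)]
    v = [1] * w + [0] * (lam - w)
    while True:  # rejection-sample a uniform x with <x, v> = 0 over F_2
        x = [random.randrange(2) for _ in range(lam)]
        if sum(xi & vi for xi, vi in zip(x, v)) % 2 == 0:
            return x
```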

Proposition 2

For any constant \(\gamma \in (0, 1/2)\), the \(\gamma \)-LPN assumption is equivalent to the computational hardness of the permuted puzzle problem \(\mathsf {Perm}\left( \mathcal{P}_\gamma \right) \) of Construction 5.

Proof

Notice that the permuted distribution \(\mathcal{D}_{\lambda ,0}'\) of the permuted puzzle is exactly \(\mathcal{D}_{{\textsf {LPN}},\varvec{s},\gamma }\), where \(\varvec{s}=\pi \left( {{\mathbf {\mathsf{{v}}}}}_{\lambda ,{\textsf {w}}}\right) \) for a uniformly random \(\pi \in S_\lambda \) and a weight \({\textsf {w}}\in \left[ \lambda \right] \) sampled according to the binomial distribution, so \(\varvec{s}\) is uniformly random in \({\mathbb F}_2^{\lambda }\) (up to the negligible-probability event \(\varvec{s}=\varvec{0}\)). Since additionally \(\mathcal{D}_{\lambda ,1}'=\mathcal{R}_{\lambda }\), the distinguishing advantage in the distinguishing game of the permuted puzzle corresponds exactly to the advantage in the \(\gamma \)-LPN distinguishing game of Definition 11.    \(\square \)

Remark 7

((Unpermuted) puzzle problem is computationally easy). We note that the (unpermuted) puzzle problem of Construction 5 is computationally easy. Indeed, in the unpermuted puzzle problem there are only \(\lambda \) possible “secret” vectors (i.e., \(\varvec{\mathsf{{v}}}_{\lambda ,1},\ldots ,\varvec{\mathsf{{v}}}_{\lambda ,\lambda }\)). Given a polynomial number of samples from \(\mathcal{D}_{\lambda ,0}\), the adversary can determine, with overwhelming probability, which of these is the secret vector used in \(\mathcal{D}_{\lambda ,0}\), and can then determine (with constant advantage) whether the challenge sample is from \(\mathcal{D}_{\lambda ,0}\) or \(\mathcal{D}_{\lambda ,1}\).
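A sketch of this attack in Python (names and thresholds ours): the adversary estimates, for each candidate \(\varvec{\mathsf{{v}}}_{\lambda ,w}\), the fraction of \(\mathcal{D}_{\lambda ,0}\)-samples orthogonal to it; the true secret attains fraction about \(1-\gamma /2\), while wrong candidates attain about 1/2.

```python
import random

def find_secret_weight(samples0, lam):
    """Given polynomially many samples from D_{lam,0}, return the candidate
    weight w maximizing the fraction of samples orthogonal to v_{lam,w};
    for the true secret this fraction concentrates around 1 - gamma/2,
    while for wrong candidates it concentrates around 1/2."""
    best_w, best_frac = None, -1.0
    for w in range(1, lam + 1):
        v = [1] * w + [0] * (lam - w)
        ortho = sum(1 for x in samples0
                    if sum(xi & vi for xi, vi in zip(x, v)) % 2 == 0)
        frac = ortho / len(samples0)
        if frac > best_frac:
            best_w, best_frac = w, frac
    return best_w

def guess_bit(challenge, w, lam):
    """Guess b = 0 iff the challenge is orthogonal to v_{lam,w}; this
    guess is correct with probability 1/2 + (1 - gamma)/4."""
    v = [1] * w + [0] * (lam - w)
    return 0 if sum(xi & vi for xi, vi in zip(challenge, v)) % 2 == 0 else 1
```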

5.2 Permuted Puzzles Based on DDH

In this section we describe a permuted puzzle problem based on the DDH assumption. We first recall the standard DDH assumption, and describe an equivalent formulation which we use.

Definition 12

(Group Samplers). A group sampler is a probabilistic polynomial-time algorithm \(\mathcal{G}\) that on input \(1^\lambda \) outputs a pair \(\left( G, g \right) \), where G is a multiplicative cyclic group of prime order \(p = \varTheta (2^\lambda )\), and g is a generator of G. We assume that p is included in the group description G, and that there exists an efficient algorithm that, given G and descriptions of group elements \(g_1,g_2\), outputs a description of \(g_1\cdot g_2\).

Definition 13

(DDH assumption). For any cyclic group G of order p with generator g, define the following distributions:

  • \(\mathcal{D}_{{\textsf {DDH}}}(G, g)\) is uniform over the set \(\big \{ (g^x, g^y, g^{xy}) : x, y \in {\mathbb Z}_p \big \}\).

  • \(\mathcal{R}_{{\textsf {DDH}}}(G, g)\) is uniform over \(G^3\).

For a group sampler \(\mathcal{G}\), the DDH assumption over \(\mathcal{G}\) conjectures that for any polynomial-sized circuit family \(\mathcal{A}=\left\{ \mathcal{A}_{\lambda }\right\} _{\lambda }\) there exists a negligible function \(\epsilon \left( \lambda \right) \) such that for every \(\lambda \):

$$ {\textsf {Adv}}_{\mathcal{A}}^{{\textsf {DDH}}(\mathcal{G})}\left( \lambda \right) {\mathop {=}\limits ^{\mathsf {def}}}\left| \mathop {\Pr }\limits _{\begin{array}{c} (G, g )\leftarrow \mathcal{G}(1^\lambda ) \\ v \leftarrow \mathcal{D}_{{\textsf {DDH}}}(G, g) \end{array}} \left[ \mathcal{A}_{\lambda }\left( v\right) =1\right] - \mathop {\Pr }\limits _{\begin{array}{c} (G, g) \leftarrow \mathcal{G}(1^\lambda ) \\ v\leftarrow \mathcal{R}_{{\textsf {DDH}}}(G, g) \end{array}}\left[ \mathcal{A}_{\lambda }\left( v\right) =1\right] \right| \le \epsilon \left( \lambda \right) . $$
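For intuition, here is a toy Python sketch of the two distributions (the group parameters are illustrative only; an actual instantiation uses a cryptographically large group output by \(\mathcal{G}\)):

```python
import random

# Toy group (illustrative): the quadratic residues mod the safe prime 23
# form a cyclic group of prime order 11, generated by 4.
P, Q, G = 23, 11, 4

def sample_ddh():
    """One sample from D_DDH(G, g): a tuple (g^x, g^y, g^{xy})."""
    x, y = random.randrange(Q), random.randrange(Q)
    return pow(G, x, P), pow(G, y, P), pow(G, x * y % Q, P)

def sample_random():
    """One sample from R_DDH(G, g): three independent uniform elements."""
    return tuple(pow(G, random.randrange(Q), P) for _ in range(3))
```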

We will use the matrix version of DDH, defined next. Informally, in matrix DDH the adversary is given many vectors of the form \(\left( g^{x_1},\ldots ,g^{x_n}\right) \), and the conjecture is that no polynomial-time adversary can distinguish between the case that \(\left( x_1,\ldots ,x_n\right) \) is sampled uniformly from \({\mathbb Z}_p^n\), and the case that \(\left( x_1,\ldots ,x_n\right) \) is sampled from a random 1-dimensional subspace of \({\mathbb Z}_p^n\).

Definition 14

(Matrix DDH assumption). For a cyclic group G of order p, and \(n,q\in {\mathbb {N}}\), define

$$ {\textsf {Rk}}_r\left( G^{q\times n}\right) =\left\{ g^A=\left( g^{a_{ij}}\right) _{i\in [q],j\in [n]}\ :\ A\in {\mathbb {Z}}_p^{q\times n},\ {\textsf {rank}}\left( A\right) =r\right\} . $$

Let \(\mathcal{G}\) be a group sampler as in Definition 12, and let \(n=n\left( \lambda \right) , q=q\left( \lambda \right) \) be polynomials such that \(q\left( \lambda \right) \ge n\left( \lambda \right) \) for every \(\lambda \). The matrix DDH assumption over \(\mathcal{G}\) conjectures that for any polynomial-sized circuit family \(\mathcal{A}=\left\{ \mathcal{A}_{\lambda }\right\} _{\lambda }\) there exists a negligible function \(\epsilon \left( \lambda \right) \) such that for every \(\lambda \):

$$\begin{aligned} {\textsf {Adv}}_{\mathcal{A}}^{{\textsf {M-DDH}}(\mathcal{G})}\left( \lambda \right) {\mathop {=}\limits ^{\mathsf {def}}}\left| \mathop {\Pr }\limits _{\begin{array}{c} (G ,g )\leftarrow \mathcal{G}(1^\lambda ) \\ v\leftarrow {\textsf {Rk}}_{n}\left( G^{q\times n}\right) \end{array}}\left[ \mathcal{A}_{\lambda }\left( v\right) =1\right] - \mathop {\Pr }\limits _{\begin{array}{c} (G,g)\leftarrow \mathcal{G}(1^\lambda ) \\ v\leftarrow {\textsf {Rk}}_{1}\left( G^{q\times n}\right) \end{array}}\left[ \mathcal{A}_{\lambda }\left( v\right) =1\right] \right| \le \epsilon \left( \lambda \right) . \end{aligned}$$

Boneh et al. proved [BHHO08, Lemma 1] that the DDH assumption over \(\mathcal{G}\) implies the matrix DDH assumption over \(\mathcal{G}\):

Imported Theorem 6

(DDH implies matrix-DDH [BHHO08]). Let \(\lambda \) be a security parameter, let \(\mathcal{G}\) be as in Definition 12, and let \(n=n\left( \lambda \right) , q=q\left( \lambda \right) \) be polynomials. Then for any polynomial-sized adversary circuit \(\mathcal{A}_{{\textsf {M-DDH}}}\) there exists an adversary \(\mathcal{A}_{{\textsf {DDH}}}\) of size \(\left| {\mathcal{A}_{{\textsf {M-DDH}}}}\right| +\mathrm{poly}\left( q,n\right) \) such that \({\textsf {Adv}}_{\mathcal{A}_{{\textsf {M-DDH}}}}^{{\textsf {M-DDH}}(\mathcal{G})}\left( \lambda \right) \le \left( n-1\right) \cdot {\textsf {Adv}}_{\mathcal{A}_{\textsf {DDH}}}^{{\textsf {DDH}}(\mathcal{G})}\left( \lambda \right) \).

We are now ready to define the permuted puzzle problem based on DDH.

Construction 7

(Permuted puzzle problem from DDH). Let \(\mathcal{G}\) be a group sampler as in Definition 12. We define a puzzle problem \(\mathcal{P}= \big \{ (\mathcal{K}_\lambda , \{\varPi _k\}_{k \in \mathcal{K}_\lambda } ) \big \}\) by the following \(\mathsf {KeyGen}\) and \(\mathsf {Samp}\) algorithms:

  • \(\mathsf {KeyGen}\) on input \(1^\lambda \) samples \((G, g) \leftarrow \mathcal{G}(1^\lambda )\). Let p denote the order of G. Then, \(\mathsf {KeyGen}\) samples a uniformly random vector \(\varvec{u} \in {\mathbb Z}_p^n\) for \(n = \lambda ^2\) and outputs \((G, g, \varvec{u})\) as a public key (there is no secret key).

    We note that for any \(k = (G, g, \varvec{u})\), the corresponding string-distinguishing problem \(\varPi _k=\left( n, \varSigma , \mathcal{D}_0,\mathcal{D}_1\right) \) has string length \(n = \lambda ^2\) and alphabet \(\varSigma =G\).

  • \(\mathsf {Samp}\left( k,b\right) \), for a key \(k=\left( G,g,\varvec{u}\right) \) with associated problem \(\varPi _k=\left( n, \varSigma , \mathcal{D}_0,\mathcal{D}_1\right) \), outputs a sample from \(\mathcal{D}_b\), where:

    • \(\mathcal{D}_{0}\) is uniform over \(\left\{ g^{\varvec{x}} \in G^n : \varvec{x} \cdot \varvec{u} = 0\right\} \).

    • \(\mathcal{D}_{1}\) is uniform over \(G^n\).
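A toy Python sketch of Construction 7 (group parameters illustrative and far too small; in the permuted puzzle \(\mathsf {Perm}(\mathcal{P})\), a secret uniformly random permutation of the n coordinates is additionally applied to every sample):

```python
import random

P, Q, G = 23, 11, 4  # toy group: the order-11 subgroup mod 23, generator 4

def keygen(n):
    """Public key component u: a uniformly random vector in Z_Q^n."""
    return [random.randrange(Q) for _ in range(n)]

def samp(u, b):
    """D_1 is uniform over G^n; D_0 outputs g^x for x uniform in Z_Q^n
    subject to <x, u> = 0 (mod Q)."""
    n = len(u)
    x = [random.randrange(Q) for _ in range(n)]
    if b == 0:
        piv = next((j for j, uj in enumerate(u) if uj != 0), None)
        if piv is not None:  # solve for one pivot coordinate
            x[piv] = 0
            rest = sum(xi * ui for xi, ui in zip(x, u)) % Q
            x[piv] = (-rest) * pow(u[piv], -1, Q) % Q  # Q is prime
    return [pow(G, xi, P) for xi in x]
```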

Proposition 3

The puzzle problem \(\mathcal{P}\) of Construction 7 is computationally easy. Moreover, if \(\mathcal{G}\) is an ensemble of groups in which the matrix DDH assumption of Definition 14 holds, then the corresponding permuted puzzle problem \(\mathsf {Perm}(\mathcal{P})\) is computationally hard.

We note that \(\mathcal{P}\) is computationally easy in an extremely strong sense: a polynomial-sized adversary can obtain advantage \(1-\mathrm{negl}\left( \lambda \right) \) in the distinguishing game. The proof of Proposition 3 is deferred to the full version.

6 Statistical Query Lower Bound

In this section we discuss a specific permuted puzzle toy problem introduced by [BIPW17], and study its hardness against a large class of potential adversarial algorithms called statistical-query algorithms. We first define this class of algorithms in Sect. 6.1, then present the toy problem in Sect. 6.2 and prove it is secure against such algorithms.

6.1 Statistical Query Algorithms

Definition 15

(Statistical Query Algorithms). Let \(\mathcal{P}= (\mathcal{K}, \{\varPi _k\}_{k \in \mathcal{K}})\) be a puzzle problem. A statistical q-query algorithm for \(\mathcal{G}_{\mathsf {dist},s}[\mathcal{P}]\) is a stateful adversary \(\mathcal{A}\) using an “inner adversary” \(\mathcal{A}_{\mathsf {SQ}}\) as follows.

  1.

    Upon receiving the public key \(\mathsf {pk}\), \(\mathcal{A}\) forwards it to \(\mathcal{A}_{\mathsf {SQ}}\).

    Recall that \(\mathsf {pk}\) is part of the key k, and denote \(\varPi _k=\left( n,\varSigma ,\mathcal{D}_0,\mathcal{D}_1\right) \).

  2.

    The following is repeated q times:

    (a)

      \(\mathcal{A}_{\mathsf {SQ}}\) outputs a boolean-valued function f.

    (b)

      \(\mathcal{A}\) requests a sample \(x \leftarrow \mathcal{D}_b\) from the challenger (where \(b \in \{0,1\}\) is the challenger’s secret bit), computes f(x) (this is a single bit), and forwards f(x) to \(\mathcal{A}_{\mathsf {SQ}}\).

  3.

    When \(\mathcal{A}_{\mathsf {SQ}}\) outputs a “guess” bit \(b'\), \(\mathcal{A}\) forwards \(b'\) to the challenger.
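The interaction is summarized by the following Python sketch (the inner-adversary interface is ours, purely illustrative): \(\mathcal{A}_{\mathsf {SQ}}\) never sees a sample directly, only the single bit f(x) computed on a fresh sample \(x \leftarrow \mathcal{D}_b\).

```python
import random

def sq_distinguishing_game(keygen, samp, adv_sq, q):
    """One run of the simplified distinguishing game against a statistical
    q-query algorithm.  `keygen` returns (pk, k); `samp(k, b)` returns one
    sample from D_b; `adv_sq` is assumed to expose receive_pk, next_query
    (returning a boolean-valued function), receive_answer, and guess."""
    b = random.randrange(2)          # the challenger's secret bit
    pk, k = keygen()
    adv_sq.receive_pk(pk)
    for _ in range(q):
        f = adv_sq.next_query()      # a boolean-valued query function
        x = samp(k, b)               # fresh sample; never shown to adv_sq
        adv_sq.receive_answer(f(x))  # adv_sq learns exactly one bit
    return adv_sq.guess() == b       # True iff the adversary wins
```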

Remark 8

We consider only statistical query algorithms for the simplified distinguishing game \(\mathcal{G}_{\mathsf {dist}, s}\) of Definition 8 because our lower bounds (proven in Sect. 6.2) hold for puzzle problems in which weak computational hardness (i.e., hardness of \(\mathcal{G}_{\mathsf {dist},s}\)) is equivalent to computational hardness (i.e., hardness of the more standard distinguishing game \(\mathcal{G}_{\mathsf {dist}}\) of Definition 4) by Lemma 2.

Statistical Query (SQ) algorithms constitute a broad class of distinguishing algorithms that is incomparable in power to polynomial-time algorithms. For example, an SQ algorithm can distinguish between a PRG output and a uniformly random string with a single query, since query functions need not be efficiently computable (the query can, e.g., test membership in the image of the PRG). On the other hand, SQ algorithms cannot distinguish between a distribution that is uniform on \(\{0,1\}^n\) and one that is uniform on a random high-dimensional subspace of \(\{0,1\}^n\): each query is answered using a single fresh sample, and with high probability over the choice of subspace, any fixed query has nearly the same expectation under both distributions. These distributions can nonetheless be distinguished (given many samples) in polynomial time by a simple rank computation.

Still, in the context of distinguishing problems, SQ algorithms seem to be a powerful class of adversarial algorithms. In fact, except for the aforementioned examples of algorithms which exploit algebraic structure, we are not aware of any natural distinguishing algorithms that cannot be simulated by statistical query algorithms. A challenging and important open problem, which we leave for future work, is to formalize a class of algorithms that use algebraic structure (or even only linear algebra), possibly together with statistical queries, and to prove lower bounds against this class.

6.2 The Toy Problem and Lower Bound

The works [CHR17, BIPW17] base the security of their DE-PIR schemes on the \(\mathsf{{PermRM}}\) conjecture, for which they also discuss different variants (e.g., noisy versions). Boyle et al. [BIPW17] also put forth a toy version of the problem, for which we will prove a lower bound against SQ algorithms. We first recall the \(\mathsf{{PermRM}}\) conjecture and its toy version.

Conjecture 1

(\(\mathsf{{PermRM}}\), Conjecture 4.2 in [BIPW17]). Let \(m\in {\mathbb N}\) be a dimension parameter, let \(\lambda \in {\mathbb N}\) be a security parameter, let \(d=d_m\left( n\right) \) be the minimal integer such that \(n\ge \binom{m+d}{d}\), and let \({\mathbb F}\) be a finite field satisfying \(\left| {{\mathbb F}}\right| >d\lambda +1\). Define a probabilistic algorithm \(\mathsf{{Samp}}\left( b,\pi ,v\right) \) that operates as follows:

  • If \(b=0\):

    1.

      Select m random degree-\(\lambda \) polynomials \(p_1,\ldots ,p_m\leftarrow {\mathbb F}[X]\) such that for every \(1\le i\le m\), \(p_i(0)=v_i\). Notice that these polynomials determine a curve \(\gamma \left( t\right) \) in \({\mathbb F}^m\), given by \(\left\{ \left( p_1(t),\ldots ,p_m(t)\right) \ :\ t\in {\mathbb F}\right\} \), which passes through v at \(t=0\).

    2.

      Sample \(d\lambda +1\) distinct points on the curve \(\gamma \left( t\right) \), determined by distinct non-zero parameters \(t_0,\ldots ,t_{d\lambda }\leftarrow {\mathbb F}\).

    3.

      Output the points, in order, where each point is permuted according to \(\pi :{\mathbb F}^m\rightarrow {\mathbb F}^m\), namely output

      $$\left( \pi \left( p_1(t_i),\ldots ,p_m(t_i)\right) \right) _{i=0}^{d\lambda } \in \left( {\mathbb F}^m\right) ^{d\lambda +1}.$$
  • If \(b=1\): sample \(d\lambda +1\) random points in \({\mathbb F}^m\) \(\left( w_0,\ldots ,w_{d\lambda }\right) \leftarrow \left( {\mathbb F}^m\right) ^{d\lambda +1}\), and output \(\left( w_0,\ldots ,w_{d\lambda }\right) \).

The \(\mathsf{{PermRM}}\) conjecture is that for every efficient non-uniform \(\mathcal{A}=\left( \mathcal{A}_1,\mathcal{A}_2\right) \) there exists a negligible function \(\mu \) such that:

$$ \Pr \left[ \begin{array}{c} \left( 1^n,1^{\left| {{\mathbb F}}\right| },\mathsf {aux}\right) \leftarrow \mathcal{A}_1\left( 1^{\lambda }\right) \\ \pi \leftarrow S_{\left( {\mathbb F}^m\right) }; b\leftarrow \{0,1\} \\ b'\leftarrow \mathcal{A}_2^{\mathsf{{Samp}}\left( b,\pi ,\cdot \right) }\left( 1^n,\mathsf {aux}\right) \end{array}\ :\ b'=b\right] \le 1/2+\mu \left( \lambda \right) . $$

Let \({\mathbb F}= \{{\mathbb F}_\lambda \}_{\lambda \in {\mathbb Z}^+}\) denote an ensemble of finite fields with \(|{\mathbb F}_\lambda | = \varTheta (\lambda ^2)\). Let \(q = q_\lambda \) denote \(|{\mathbb F}_\lambda |\).

For a function \(f : X \rightarrow Y\), we define \(\mathsf {Graph}(f) : X \times Y \rightarrow \{0,1\}\) such that

$$ \mathsf {Graph}(f)(x, y) = \begin{cases} 1 &{} \text {if } y = f(x) \\ 0 &{} \text {otherwise.} \end{cases} $$

Define the puzzle problem \(\varPi _\lambda = (n, \{0,1\}, \mathcal{D}_0, \mathcal{D}_1)\), where \(n = q^2\), and \(\mathcal{D}_0\) and \(\mathcal{D}_1\) are defined as follows.

  • A sample from \(\mathcal{D}_0\) is \(\mathsf {Graph}(\gamma )\), where \(\gamma : {\mathbb F}\rightarrow {\mathbb F}\) is a uniformly random degree-\(\lambda \) polynomial.

  • A sample from \(\mathcal{D}_1\) is \(\mathsf {Graph}(U)\), where \(U : {\mathbb F}\rightarrow {\mathbb F}\) is a uniformly random function.
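A minimal Python sketch of the two distributions (illustrative: we take \({\mathbb F}={\mathbb Z}_q\) for a prime \(q=\varTheta (\lambda ^2)\) and represent a sample as a q × q 0/1 matrix; in the permuted puzzle, the \(q^2\) positions of every sample are further shuffled by a fixed secret permutation):

```python
import random

def graph(f_vals, q):
    """Graph(f) as a q x q 0/1 matrix: entry (x, y) is 1 iff y = f(x)."""
    return [[1 if f_vals[x] == y else 0 for y in range(q)] for x in range(q)]

def sample(b, lam, q):
    """D_0: Graph of a uniformly random polynomial of degree <= lam over
    Z_q (q prime); D_1: Graph of a uniformly random function Z_q -> Z_q."""
    if b == 1:
        f_vals = [random.randrange(q) for _ in range(q)]
    else:
        coeffs = [random.randrange(q) for _ in range(lam + 1)]
        f_vals = [sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
                  for x in range(q)]
    return graph(f_vals, q)
```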

Conjecture 2

([BIPW17]). The permuted puzzle problem \(\mathcal{P}{\mathop {=}\limits ^{\mathsf {def}}}\mathsf {Perm}(\{\varPi _\lambda \}_{\lambda \in {\mathbb Z}^+})\) is computationally hard.

Theorem 8

The simplified distinguishing game \(\mathcal{G}_{\mathsf {dist}, s}[\mathcal{P}]\) is hard for statistical-query algorithms. That is, for all polynomially bounded \(q(\cdot )\), the advantage of any statistical \(q(\lambda )\)-query adversary in \(\mathcal{G}_{\mathsf {dist},s}[\mathcal{P}]\) is at most \(e^{-\varOmega (\lambda )}\).

Proof

We will show that even if we give the statistical query adversary additional information about \(\pi \), it cannot distinguish permuted samples from \(\mathcal{D}_0\) from permuted samples from \(\mathcal{D}_1\). Specifically, we will give the adversary (for free) the unordered partition \(\varPhi _1 \cup \cdots \cup \varPhi _q\) of \({\mathbb F}\times {\mathbb F}\), where \(\varPhi _i = \pi (\{i\} \times {\mathbb F})\). (Intuitively, \(\varPhi _i\) is the image under \(\pi \) of all points in which the X coordinate equals i. In particular, \(\pi \left( \mathsf {Graph}\left( f\right) \right) \) takes value “1” at exactly one coordinate in \(\varPhi _i\).) Note that it is indeed possible for a statistical query adversary to learn \(\varPhi {\mathop {=}\limits ^{\mathsf {def}}}\{\varPhi _1, \ldots , \varPhi _q\}\): if \((x, y)\) and \((x', y')\) belong to the same \(\varPhi _i\), then for a random sample \(z \leftarrow \mathcal{D}_b\), it is never the case that \(\pi (z)_{(x,y)} = \pi (z)_{(x',y')} = 1\). However, if \((x, y)\) and \((x', y')\) do not belong to the same \(\varPhi _i\), then \(\pi (z)_{(x,y)} = \pi (z)_{(x',y')} = 1\) with probability at least \(\frac{1}{q^2}\).
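A sketch of this co-occurrence test in Python (names ours); each test of a pair of positions is itself a boolean statistical query, evaluated on many fresh samples:

```python
def cooccurrence_count(samples, pos1, pos2):
    """Count the samples in which both (flattened) positions hold a 1.
    If pos1 and pos2 lie in the same block Phi_i the count is always 0;
    otherwise each sample co-occurs with probability at least 1/q^2, so
    O(q^2 log q) samples separate the two cases with high probability."""
    return sum(1 for z in samples if z[pos1] == 1 and z[pos2] == 1)
```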

We say that a permutation \(\pi \) respects a partition \(\varPhi = \{\varPhi _1, \ldots , \varPhi _q\}\) if \(\{\pi (\{i\} \times {\mathbb F})\}_i = \varPhi \). For any partition \(\varPhi \), we will write \(\Pr _{\varPhi }\) to denote the probability space in which a permutation \(\pi \) is sampled uniformly at random from the set of permutations that respect \(\varPhi \). Similarly, we will write \(\mathbb {E}_{\varPhi }\) to denote expectations in \(\Pr _\varPhi \), and we write \(\mathrm {Var}_{\varPhi }\) to denote variances in \(\Pr _\varPhi \).

We will show that there is some negligible function \(\nu : {\mathbb Z}^+ \rightarrow {\mathbb R}\) such that for any function \(f : \{0,1\}^n \rightarrow \{0,1\}\) and any partition \(\varPhi \), there exists some \(p_{f,\varPhi } \in [0,1]\) such that for every \(b \in \{0,1\}\), it holds that

$$ \mathop {\Pr }\limits _{\varPhi } \left[ \Big | \mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_b} \big [ f(\pi (x)) \big ] - p_{f,\varPhi } \Big | \ge \nu (\lambda ) \right] \le \nu (\lambda ). $$

Crucially, \(p_{f,\varPhi }\) is independent of the challenge bit b, the specific sample x, and the secret permutation \(\pi \) (except for its dependence on \(\varPhi \)). Thus, the answer to a query f can be simulated by computing \(p_{f,\varPhi }\).

The following two observations are at the core of our proof. Recall that \(\varDelta \) denotes the Hamming distance. For a pair of functions \(g,g':X\rightarrow Y\), we denote \(\varDelta \left( g,g'\right) = \left| {\left\{ x\in X\ :\ g\left( x\right) \ne g'\left( x\right) \right\} }\right| \).

Claim 1

For any partition \(\varPhi \), any function \(g : {\mathbb F}\rightarrow {\mathbb F}\), and any fixed permutation \(\pi ^*\) that respects \(\varPhi \), the distribution of \(\pi (\mathsf {Graph}(g))\) under \(\Pr _\varPhi \) is identical to the distribution of \(\pi ^*(\mathsf {Graph}(u))\) when \(u : {\mathbb F}\rightarrow {\mathbb F}\) is a uniformly random function.

Proof

To sample a random permutation \(\pi \) conditioned on \(\big \{\pi (\{i\} \times {\mathbb F})\big \}_i = \varPhi {\mathop {=}\limits ^{\mathsf {def}}}\{\varPhi _1, \ldots , \varPhi _q\}\), one can sample a uniformly random permutation \(\sigma : {\mathbb F}\rightarrow {\mathbb F}\) and q independent bijections \(\pi _i : {\mathbb F}\rightarrow \varPhi _{\sigma (i)}\), and then define \(\pi (j,k) = \pi _j(k)\).

\(\pi (\mathsf {Graph}(g))\) is determined by the set of points \(\{\pi (j, g(j))\}_{j \in {\mathbb F}} = \{\pi _j(g(j))\}_{j\in {\mathbb F}}\). Sampling g uniformly at random corresponds to picking each value g(j) independently and uniformly at random, and this induces the same distribution on \(\pi (\mathsf {Graph}(g))\) as picking the bijections \(\{\pi _j\}\) independently and uniformly at random. Thus, \(\pi ^*(\mathsf {Graph}(u))\) for a fixed \(\pi ^*\) which respects the partition \(\varPhi \) and a random u is distributed identically to \(\pi (\mathsf {Graph}(g))\) for a fixed g and a random \(\pi \) that respects \(\varPhi \).    \(\square \)

Claim 2

For any partition \(\varPhi \), any functions \(g, g' : {\mathbb F}\rightarrow {\mathbb F}\), and any fixed permutation \(\pi ^*\) that respects \(\varPhi \), the distribution of \(\big (\pi (\mathsf {Graph}(g)), \pi (\mathsf {Graph}(g')) \big )\) under \(\Pr _\varPhi \) is identical to the distribution of \(\big (\pi ^*(\mathsf {Graph}(u)), \pi ^*(\mathsf {Graph}(u')) \big )\), where \(u, u' : {\mathbb F}\rightarrow {\mathbb F}\) are jointly uniformly random conditioned on \(\varDelta (u, u') = \varDelta (g, g')\).

Proof

We first consider the distribution under \(\Pr _\varPhi \) of \((x, x') = \big (\pi (\mathsf {Graph}(g)), \pi (\mathsf {Graph}(g')) \big )\), where g and \(g'\) are fixed. Because g and \(g'\) are functions, both x and \(x'\) will consist mostly of zeros, but for each \(j \in {\mathbb F}\), they will contain a 1 in exactly one position in \(\varPhi _j\). Recall from the proof of Claim 1 that \(\pi \) can be sampled by sampling a uniformly random permutation \(\sigma : {\mathbb F}\rightarrow {\mathbb F}\) and q independent bijections \(\pi _i : {\mathbb F}\rightarrow \varPhi _{\sigma (i)}\), and defining \(\pi (j,k) = \pi _j(k)\). Therefore, for any \(j\in {\mathbb F}\), if \(g(j) = g'(j)\) then x and \(x'\) will agree on the position within \(\varPhi _{\sigma (j)}\) at which they contain a 1 entry. Otherwise, they will disagree. Other than that, the positions are uniformly random within \(\varPhi _{\sigma (j)}\) because \(\pi _j\) is a random bijection. Moreover, since \(\sigma \) is a random permutation, the set of \(\varPhi _i\)’s on which \(x,x'\) disagree on the 1-entry is a random subset of size \(\varDelta \left( g,g'\right) \).

Now consider the distribution of \((y,y')=\big (\pi ^*(\mathsf {Graph}(u)), \pi ^*(\mathsf {Graph}(u')) \big )\), where \(\pi ^*\) is fixed and defined by \(\sigma ^*\) and \(\left\{ \pi _i^*\right\} _{i\in {\mathbb F}}\). The same arguments show that for every \(j\in {\mathbb F}\), \(y,y'\) agree on the position within \(\varPhi _{\sigma ^*(j)}\) at which they contain a 1 if and only if \(u(j)=u'(j)\). The positions of the 1-entries of \(y,y'\) in \(\varPhi _{\sigma ^*(j)}\) are otherwise uniformly random, because these positions are \(\pi _j^*\left( u(j)\right) \) and \(\pi _j^*\left( u'(j)\right) \), respectively, and \(u,u'\) are uniformly random subject only to the constraint \(\varDelta (u,u')=\varDelta (g,g')\). Additionally, the set of \(\varPhi _i\)’s on which \(y,y'\) disagree on the position of the 1-entry is \(\left\{ \sigma ^*(j)\ :\ u(j)\ne u'(j)\right\} \), which is a uniformly random subset of size \(\varDelta \left( u,u'\right) =\varDelta \left( g,g'\right) \).    \(\square \)

Claim 3

If \(g_0, g_1 : {\mathbb F}\rightarrow {\mathbb F}\) are two independent uniformly random degree-\(\lambda \) polynomials, then the distribution of \(\varDelta \left( g_0,g_1\right) \) is \(e^{-\varOmega (\lambda )}\)-close (in statistical distance) to that of \(\varDelta \left( g_0',g_1'\right) \) for uniformly random functions \(g_0',g_1' : {\mathbb F}\rightarrow {\mathbb F}\).

Proof

For \(i\in {\mathbb F}\), let \(X_i\) (respectively, \(Y_i\)) be the indicator of the event that \(g_0(i)=g_1(i)\) (respectively, \(g_0'(i)=g_1'(i)\)). Then the \(X_i\) (respectively, \(Y_i\)) are \(\lambda \)-wise independent with \(\mathbb {E}[X_i]= \mathbb {E}[Y_i]=\left| {{\mathbb F}}\right| ^{-1}\). The claim now follows from Lemma 4 below, applied with \(n=\left| {{\mathbb F}}\right| \) and \(t=\lambda \).    \(\square \)

We now state the lemma used in the proof of Claim 3; its proof is deferred to the full version.

Lemma 4

Let \(X = (X_1, \ldots , X_n)\) and \(Y = (Y_1, \ldots , Y_n)\) be t-wise independent \(\{0,1\}\)-valued random variables with \(t \ge 2 e^2\), such that for all \(i \in [n]\), \(\mathbb {E}[Y_i] = \mathbb {E}[X_i] {\mathop {=}\limits ^{\mathsf {def}}}p_i\), let p denote \(\frac{1}{n} \cdot \sum _i p_i\), and suppose that \(p \le \frac{t}{4n}\). Then the total variation distance \(d_{\mathsf {TV}}(X,Y)\) is at most

$$ (n + 3) \cdot \frac{(4pn/t)^{t/2}}{\prod _{i \in [n]}(1 - p_i)}. $$

Now, we will show that \(\mathbb {E}_{x \leftarrow \mathcal{D}_0}[f(\pi (x))]\) and \(\mathbb {E}_{x \leftarrow \mathcal{D}_1}[f(\pi (x))]\), viewed as random variables that depend on \(\pi \), have the same expectation and also have very small (negligible) variance.

Claim 4

For any \(f : \{0,1\}^n \rightarrow \{0,1\}\) and any partition \(\varPhi \),

$$ \mathop {\mathbb {E}}\limits _{\varPhi }\Big [ \mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_0}[ f(\pi (x)) ] \Big ] = \mathop {\mathbb {E}}\limits _{\varPhi }\Big [ \mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_1}[ f(\pi (x)) ] \Big ]. $$

Proof

Consider any \(f : \{0,1\}^n \rightarrow \{0,1\}\) and any partition \(\varPhi \). By Claim 1, there is a distribution \(\mathcal{U}\) that is equal to the distribution (in \(\Pr _\varPhi \)) of \(\pi (\mathsf {Graph}(g))\) for all functions \(g : {\mathbb F}\rightarrow {\mathbb F}\). Let \(\mu \) denote \(\mathbb {E}_{x' \leftarrow \mathcal{U}}[f(x')]\). Let \(P_b\) denote the probability mass function of \(\mathcal{D}_b\). Then for any \(b \in \{0,1\}\),

$$\begin{aligned} \mathop {\mathbb {E}}\limits _{\varPhi } \left[ \mathop {\mathbb {E}}\limits _{x \leftarrow \mathcal{D}_b}[f(\pi (x))] \right]&= \mathop {\mathbb {E}}\limits _{\varPhi } \left[ \sum _{x} P_b(x) \cdot f(\pi (x)) \right] \\&= \sum _{x} P_b(x) \cdot \mathop {\mathbb {E}}\limits _\varPhi [f(\pi (x))] \\&= \sum _x P_b(x) \cdot \mu \\&= \mu , \end{aligned}$$

which does not depend on b.    \(\square \)

Now we analyze the variance. Recall that our goal is to show that \(\mathrm {Var}_\varPhi \big [ \mathbb {E}_{x \leftarrow \mathcal{D}_b}[f(\pi (x))]\big ]\) is negligible for \(b \in \{0,1\}\). By Claim 3, this follows from the following more general claim, applied with \(\epsilon =e^{-\varOmega (\lambda )}\) for \(b=0\) (where \(\mathcal{D}\) is the distribution of a uniformly random degree-\(\lambda \) polynomial), and trivially with \(\epsilon =0\) for \(b=1\).

Claim 5

Let \(\mathcal{D}\) be any distribution on functions mapping \({\mathbb F}\) to \({\mathbb F}\). Suppose that when g and \(g'\) are sampled independently from \(\mathcal{D}\) and \(u, u' : {\mathbb F}\rightarrow {\mathbb F}\) are independent uniformly random functions, the distribution of \(\varDelta (g, g')\) is statistically \(\epsilon \)-close to that of \(\varDelta (u, u')\).

Then, for any \(f : \{0,1\}^n \rightarrow \{0,1\}\) and any partition \(\varPhi \),

$$ \mathop {\mathrm {Var}}\limits _{\varPhi }\Big [ \mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}} \big [ f \big (\pi (\mathsf {Graph}(g)) \big ) \big ] \Big ] \le \epsilon . $$

Proof

Let P denote the probability mass function of \(\mathcal{D}\), and let \(\pi ^*\) be an arbitrary permutation in \(S_n\) such that \(\{\pi ^*(\{i\} \times {\mathbb F})\}_i = \varPhi \). By the definition of variance,

$$\begin{aligned} \mathop {\mathrm {Var}}\limits _{\varPhi } \left[ \mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}}[f(\pi (\mathsf {Graph}(g)))] \right] = \mathop {\mathbb {E}}\limits _{\varPhi } \left[ \mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}}[f(\pi (\mathsf {Graph}(g)))]^2 \right] - \mathop {\mathbb {E}}\limits _{\varPhi } \left[ \mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}}[f(\pi (\mathsf {Graph}(g)))] \right] ^2. \end{aligned}$$

For the first term, we have

$$\begin{aligned} \mathop {\mathbb {E}}\limits _{\varPhi }[\mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}}[f(\pi (\mathsf {Graph}(g)))]^2]&= \mathop {\mathbb {E}}\limits _{\varPhi }\left[ \left( \sum _g P(g) \cdot f(\pi (\mathsf {Graph}(g))) \right) ^2 \right] \\&= \sum _{g,h} P(g) \cdot P(h) \cdot \mathop {\mathbb {E}}\limits _\varPhi [f(\pi (\mathsf {Graph}(g))) \cdot f(\pi (\mathsf {Graph}(h)))]&\text {(Claim 2)}\\&= \mathop {\mathbb {E}}\limits _{g, h \leftarrow \mathcal{D}} \left[ \mathop {\mathbb {E}}\limits _{\begin{array}{c} u, v : {\mathbb F}\rightarrow {\mathbb F}\\ \varDelta (u, v) = \varDelta (g, h) \end{array}} \big [f(\pi ^*(\mathsf {Graph}(u))) \cdot f(\pi ^*(\mathsf {Graph}(v))) \big ] \right] . \end{aligned}$$

For the second term, we have

$$\begin{aligned} \mathop {\mathbb {E}}\limits _{\varPhi } \left[ \mathop {\mathbb {E}}\limits _{g \leftarrow \mathcal{D}}[f(\pi (\mathsf {Graph}(g)))] \right] ^2&= \left( \sum _g P(g) \cdot \mathop {\mathbb {E}}\limits _{\varPhi } \big [ f(\pi (\mathsf {Graph}(g))) \big ] \right) ^2 \\&= \left( \sum _g P(g) \cdot \mathop {\mathbb {E}}\limits _{u : {\mathbb F}\rightarrow {\mathbb F}} \big [ f(\pi ^*(\mathsf {Graph}(u))) \big ] \right) ^2&\text {(Claim 1)} \\&= \mathop {\mathbb {E}}\limits _{u : {\mathbb F}\rightarrow {\mathbb F}} \big [ f(\pi ^*(\mathsf {Graph}(u))) \big ]^2 \\&= \mathop {\mathbb {E}}\limits _{u, v : {\mathbb F}\rightarrow {\mathbb F}} \big [ f (\pi ^*(\mathsf {Graph}(u))) \cdot f(\pi ^*(\mathsf {Graph}(v))) \big ] \\&= \mathop {\mathbb {E}}\limits _{g, h : {\mathbb F}\rightarrow {\mathbb F}} \left[ \mathop {\mathbb {E}}\limits _{\begin{array}{c} u, v : {\mathbb F}\rightarrow {\mathbb F}\\ \varDelta (u, v) = \varDelta (g, h) \end{array}} \big [ f (\pi ^*(\mathsf {Graph}(u))) \cdot f(\pi ^*(\mathsf {Graph}(v))) \big ] \right]&\text {(law of total expectation).} \end{aligned}$$

The difference between these two expressions lies only in the distribution of g and h over which the (outer) expectation is taken: in the first it is \(\mathcal{D}\times \mathcal{D}\), and in the second it is uniform. Furthermore, the value whose (outer) expectation is computed lies in [0, 1] and depends only on the Hamming distance \(\varDelta (g,h)\). Since the distributions of \(\varDelta (g,h)\) in the two cases are statistically \(\epsilon \)-close by assumption, the two expressions differ by at most \(\epsilon \), and the claim follows.    \(\square \)

Theorem 8 follows from Claims 3, 4, and 5, and Chebyshev’s inequality.    \(\square \)