
1 Introduction

Learning Parity with Noise. The computational version of the learning parity with noise (LPN) assumption with parameters \(n\in \mathbb {N}\) (length of the secret), \(q\in \mathbb {N}\) (number of queries) and \(0<\mu <1/2\) (noise rate) postulates that it is computationally infeasible to recover the n-bit secret \(s\in \mathbb {Z}^{n}_2\) given \((a\cdot {s}\oplus {e},~a)\), where a is a random \(q{\times }n\) matrix, e follows \(\mathsf {Ber}_\mu ^q\), \(\mathsf {Ber}_\mu \) denotes the Bernoulli distribution with parameter \(\mu \) (i.e., \(\Pr [\mathsf {Ber}_\mu =1]=\mu \) and \(\Pr [\mathsf {Ber}_\mu =0]=1-\mu \)), ‘\(\cdot \)’ denotes matrix-vector multiplication over GF(2) and ‘\(\oplus \)’ denotes bitwise XOR. The decisional version of LPN simply assumes that \(a\cdot {s}\oplus {e}\) is pseudorandom (i.e., computationally indistinguishable from uniform randomness) given a. While seemingly stronger, the decisional version is known to be polynomially equivalent to its computational counterpart [4, 8, 21].
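To fix ideas, the following is a minimal Python sketch of sampling an LPN instance; the parameter values below are illustrative toy choices, not values mandated by the assumption. The computational version asks to recover s from (a, b), while the decisional version asks to distinguish b from a uniform q-bit string.

```python
import numpy as np

def lpn_sample(n, q, mu, rng):
    """Draw one LPN instance (a, a.s XOR e) over GF(2)."""
    a = rng.integers(0, 2, size=(q, n))        # random q x n matrix over Z_2
    s = rng.integers(0, 2, size=n)             # uniform n-bit secret
    e = (rng.random(q) < mu).astype(np.int64)  # q i.i.d. Bernoulli(mu) noise bits
    b = (a @ s + e) % 2                        # a.s XOR e, reduced mod 2
    return a, b, s

rng = np.random.default_rng(0)
a, b, s = lpn_sample(n=128, q=256, mu=0.05, rng=rng)  # toy parameters
```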

Hardness of LPN. The computational LPN problem corresponds to the well-known NP-complete problem of decoding random linear codes [6], and thus its worst-case hardness is well understood. LPN was also extensively studied in learning theory, and it was shown in [15] that an efficient algorithm for LPN would allow one to learn several important function classes such as 2-DNF formulas, juntas, and any function with a sparse Fourier spectrum. Under a constant noise rate (i.e., \(\mu =\varTheta (1)\)), the best known LPN solvers [9, 25] require time and query complexity both \(2^{O(n/\log {n})}\). The time complexity goes up to \(2^{O(n/\log \log {n})}\) when restricted to \(q=\mathsf {poly}(n)\) queries [26], or even \(2^{O(n)}\) given only \(q=O(n)\) queries [28]. Under a low noise rate \(\mu =n^{-c}\) (\(0<c<1\)), the security of LPN is less well understood: on the one hand, for \(q=n+O(1)\) there is already an efficient distinguishing attack with advantage \(2^{-O(n^{1-c})}\), which matches the statistical indistinguishability (from uniform randomness) of the LPN samples; on the other hand, for (even super-)polynomial q the best known attacks [5, 7, 10, 24, 31] are not asymptotically better, i.e., they still run in time \(2^{\varTheta (n^{1-c})}\). We mention that LPN does not succumb to known quantum algorithms, which makes it a promising candidate for “post-quantum cryptography”. Furthermore, LPN enjoys simplicity and is better suited for weak-power devices (e.g., RFID tags) than other quantum-secure candidates such as LWE [30].

LPN-based Cryptographic Applications. LPN has been used as a basis for building lightweight authentication schemes secure against passive [18] and even active adversaries [20, 21] (see [1] for a more complete account of the literature). Recently, Kiltz et al. [23] and Dodis et al. [13] constructed randomized MACs based on the hardness of LPN, which imply a two-round authentication scheme with man-in-the-middle security. Lyubashevsky and Masny [27] gave an efficient three-round authentication scheme whose security can be based on LPN or weak pseudorandom functions (PRFs). Applebaum et al. [3] showed how to construct a linear-stretch pseudorandom generator (PRG) from LPN. We mention other, less directly relevant, applications such as public-key encryption schemes [2, 14, 22], oblivious transfer [11], commitment schemes and zero-knowledge proofs [19], and refer to a recent survey [29] on the current state of the art for LPN.

The Error in [32] and Our Contributions. In standard LPN, the secret vector is assumed to be generated uniformly at random and kept confidential. However, for the variant where the secret vector is sampled from an arbitrary distribution with a sufficient amount of min-entropy, the hardness remained unclear. The paper [32] claimed a positive answer to this open question. More specifically, the authors show that if the l-bit secret has min-entropy \(k=\varOmega (l)\), then the LPN problem (on such a weak secret) is hard as long as the standard one (on uniform secrets) is. Unfortunately, we find that the claim in [32] is flawed. Loosely speaking, the main idea of [32, Theorem 4] is the following: denote by \(\mathcal {D}\) a distribution over \(\mathbb {Z}^{l}_2\) with min-entropy \(k=\varOmega (l)\) and let \(n=k-2\log (1/\epsilon )\) for some \(\epsilon \) negligible in the security parameter, sample \(B\xleftarrow {\$}\mathbb {Z}^{m \times n}_2\), \(C\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\), \(E\leftarrow \mathsf {Ber}_\alpha ^{m \times n}\), \(F\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\) and \(e\leftarrow \mathsf {Ber}_\beta ^m\), and let \(A=BC\oplus EF\). The authors of [32] argue that \(As\oplus e\) is computationally indistinguishable from uniform even conditioned on A, and that A is statistically close to uniform. Quantitatively, the standard LPN\(_{n,\frac{1}{2}-\frac{(1-\alpha )^{n}}{2}}\) assumption implies LPN\(^{\mathcal {D}}_{\frac{1}{2}-(\frac{1}{2}-\beta )(1-\alpha )^n}\). We stress that the proofs are incorrect, for at least the following reasons:

  1.

    For the assumption to be meaningful, the noise rate should be bounded away from uniform at least polynomially, i.e., \((1-\alpha )^n/2\ge 1/\mathsf {poly}(l)\). Otherwise, the hardness assumption is trivial and useless, as it does not imply any efficient (polynomial-time computable) cryptographic applications.

  2.

    \(A=BC\oplus EF\) is not statistically close to uniform. The rows of BC all lie in a random subspace of dimension at most \(n<k\le l\), so BC is far from uniform over \(\mathbb {Z}^{m\times l}_2\) (recall that \(m\gg l\)). Every entry of the matrix EF is distributed according to \(\mathsf {Ber}_{1/2-(1-\alpha )^n/2}\) with \((1-\alpha )^n/2\ge 1/\mathsf {poly}(l)\) (see item 1 above). Therefore, the XOR of BC and EF does not amplify to statistical uniformity.

  3.

    There are also a few flawed intermediate statements. For example, the authors prove that every entry of EF is distributed according to \(\mathsf {Ber}_{1/2-(1-\alpha )^n/2}\) and then conclude that EF follows \(\mathsf {Ber}_{1/2-(1-\alpha )^n/2}^{m\times l}\). This does not follow, since there is no guarantee that the entries of EF are independent (see the calculation following this list).
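For completeness, the per-entry claim itself is a standard piling-up computation, which we sketch below; it also shows that for constant \(\alpha \) the bias \((1-\alpha )^n/2\) is exponentially small in n. Each entry of EF is the XOR of n i.i.d. \(\mathsf {Ber}_{\alpha /2}\) bits,

$$ (EF)_{ij}=\bigoplus _{t=1}^{n}E_{it}F_{tj},\quad \Pr [E_{it}F_{tj}=1]=\frac{\alpha }{2}, $$

and hence by the piling-up lemma

$$ \Pr [(EF)_{ij}=0]~=~\frac{1}{2}+2^{n-1}\prod _{t=1}^{n}\Big (\frac{1}{2}-\frac{\alpha }{2}\Big )~=~\frac{1}{2}+\frac{(1-\alpha )^{n}}{2}, $$

i.e., \((EF)_{ij}\sim \mathsf {Ber}_{1/2-(1-\alpha )^n/2}\). However, entries in the same row of EF share the same row of E (and entries in the same column share the same column of F), so they are not independent.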

We fix the flaw using the “sampling from random subspace” technique [16, 33].

2 Preliminaries

Notations and Definitions. We use [n] to denote the set {1, ..., n}. We use capital letters (e.g., X, Y) for random variables and distributions, standard letters (e.g., x, y) for values, and calligraphic letters (e.g., \(\mathcal {X}\), \(\mathcal {E}\)) for sets and events. The support of a random variable X, denoted by Supp(X), is the set of values that X takes with non-zero probability, i.e., \(\{x:\Pr [X=x]>0\}\). For a set \(\mathcal {S}\) and a binary string s, \(|\mathcal {S}|\) denotes the cardinality of \(\mathcal {S}\) and |s| refers to the Hamming weight of s. We use \(\mathsf {Ber}_\mu \) to denote the Bernoulli distribution with parameter \(\mu \), i.e., \(\Pr [\mathsf {Ber}_\mu =1] = \mu \) and \(\Pr [\mathsf {Ber}_\mu = 0] = 1 - \mu \), while \(\mathsf {Ber}_\mu ^q\) denotes the concatenation of q independent copies of \(\mathsf {Ber}_\mu \). For \(n\in \mathbb {N}\), \(U_n\) denotes the uniform distribution over \(\mathbb {Z}^{n}_2\), assumed independent of any other random variables in consideration, and \(f(U_n)\) denotes the distribution induced by applying the function f to \(U_n\). \(X{\sim }D\) denotes that random variable X follows distribution D. We use \(s\leftarrow {S}\) to denote sampling an element s according to distribution S, and \(s\xleftarrow {\$}{\mathcal {S}}\) to denote sampling s uniformly from set \(\mathcal {S}\).

Entropy Definitions. For a random variable X and any \(x\in \mathsf {Supp}(X)\), the sample-entropy of x with respect to X is defined as

$$ {{\mathbf {H}}_{X}}(x)\mathop {=}\limits ^\mathsf{def}\log (1/\Pr [X=x]) $$

from which we define the Shannon entropy and min-entropy of X respectively, i.e.,

$$ {{\mathbf {H}}_{1}}(X)\mathop {=}\limits ^\mathsf{def}\mathbb {E}_{x\leftarrow {X}}[~{{\mathbf {H}}_{X}}(x)~],~{{\mathbf {H}}_{\infty }}(X)\mathop {=}\limits ^\mathsf{def}\min _{{x\in \mathsf {Supp}(X)}}{{\mathbf {H}}_{X}}(x). $$

Indistinguishability and Statistical Distance. We say that random variables X and Y are within (t,\(\varepsilon \))-computational distance, denoted by \(X~{\mathop \sim \limits _{(t,\varepsilon )}}~Y\), if for every probabilistic distinguisher \(\mathsf{D}\) of running time t it holds that

$$ |~\Pr [\mathsf{D}(X)=1]-\Pr [\mathsf{D}(Y)=1]~|\le {\varepsilon }. $$

The statistical distance between X and Y, denoted by \(\mathsf {SD}(X,Y)\), is defined by

$$ \mathsf {SD}(X,Y) \mathop {=}\limits ^\mathsf{def}\frac{1}{2}\sum _{x}\left| \Pr [X=x] - \Pr [Y=x]\right| .$$

Computational/statistical indistinguishability is defined with respect to distribution ensembles (indexed by a security parameter). For example, \(X\mathop {=}\limits ^\mathsf{def}\{X_n\}_{n\in \mathbb {N}}\) and \(Y\mathop {=}\limits ^\mathsf{def}\{Y_n\}_{n\in \mathbb {N}}\) are computationally indistinguishable, denoted by \(X~{\mathop \sim \limits ^{c}}~Y\), if for every \(t=\mathsf {poly}(n)\) there exists \(\varepsilon =\mathsf{negl}(n)\) such that \(X~{\mathop \sim \limits _{(t,\varepsilon )}}~Y\), and they are statistically indistinguishable, denoted by \(X~{\mathop \sim \limits ^{s}}~Y\), if \(\mathsf {SD}(X,Y)=\mathsf{negl}(n)\).

Simplifying Notations. To simplify the presentation, we adopt the following notational conventions. Throughout, n is the security parameter and most other parameters are functions of n; we often omit n when it is clear from the context. For example, \(q=q(n)\in \mathbb {N}\), \(t=t(n)>0\), \(\epsilon =\epsilon (n)\in (0,1)\), and \(m=m(n)=\mathsf {poly}(n)\), where \(\mathsf {poly}\) refers to some polynomial.

We will use the decisional version of the LPN assumption, which is known to be polynomially equivalent to its computational counterpart.

Definition 1 (LPN)

The decisional \(\mathsf {LPN}_{\mu ,n}\) problem (with secret length n and noise rate \(0<\mu <1/2\)) is hard if for every \(q=\mathsf {poly}(n)\) we have

$$\begin{aligned} (A,~A{\cdot }{X}{\oplus }E)~{\mathop \sim \limits ^{c}}~(A, U_q) \end{aligned}$$
(1)

where \(q\times {n}\) matrix \(A~{\sim }~U_{q{n}}\), \(X\sim {U_n}\) and \(E\sim \mathsf {Ber}_\mu ^q\). The computational \(\mathsf {LPN}_{\mu ,n}\) problem is hard if for every \(q=\mathsf {poly}(n)\) and every PPT algorithm \(\mathsf{D}\) we have

$$ \Pr [~\mathsf{D}(A,~A{\cdot }{X}{\oplus }E)=X~]~=~\mathsf{negl}(n), $$

where \(A~{\sim }~U_{q{n}}\), \(X\sim {U_n}\) and \(E\sim \mathsf {Ber}_\mu ^q\).

Lemma 1

(Leftover Hash Lemma [17]). Let (X,Z) be any pair of random variables over \(\mathcal {X}\times {\mathcal {Z}}\) with \({{\mathbf {H}}_{\infty }}(X|Z)\mathop {=}\limits ^\mathsf{def}\min _{z\in \mathsf {Supp}(Z)}{{\mathbf {H}}_{\infty }}(X|Z=z)\ge {k}\), and let \(\mathcal {H}=\{h_b:\mathcal {X}\rightarrow \mathbb {Z}^{l}_2,b\in \mathbb {Z}^{s}_2\}\) be a family of universal hash functions, i.e., for any \(x_1\ne {x_2}\in \mathcal {X}\), \(\Pr _{b\xleftarrow {\$}\mathbb {Z}^{s}_2}[h_b(x_1)=h_b(x_2)]\le {2^{-l}}\). Then, it holds that

$$ \mathsf {SD}~\bigg ((Z,B,h_B(X))~,~(Z,B,U_{l})\bigg )~\le ~ 2^{\frac{l-k}{2}}, $$

where \(B\sim {U_{s}}\).
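In Sect. 3.3 we instantiate the lemma with the family of linear functions \(h_C(s)\mathop {=}\limits ^\mathsf{def}C\cdot {s}\) for \(C\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\) (so the output length in the lemma is n rather than l). This family is universal: for any \(s_1\ne {s_2}\in \mathbb {Z}^{l}_2\),

$$ \Pr _{C\xleftarrow {\$}\mathbb {Z}^{n \times l}_2}[C\cdot {s_1}=C\cdot {s_2}]~=~\Pr _{C}[C\cdot (s_1\oplus s_2)=0]~=~2^{-n}, $$

since \(s_1\oplus s_2\ne 0\) and, over the choice of C, each of the n bits of \(C\cdot (s_1\oplus s_2)\) is an independent uniform bit.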

3 Correcting the Errors

3.1 The Main Contribution of [32]

In standard LPN, the secret is assumed to be generated uniformly at random and kept confidential. However, it remained open whether the hardness of LPN still holds when the secret is not uniform but sampled from an arbitrary distribution with min-entropy linear in the secret length. The recent work [32] claims a positive answer to this open question. More specifically, the authors show that the standard LPN\(_{n,\frac{1}{2}-\frac{(1-\alpha )^{n}}{2}}\) assumption implies LPN\(^{\mathcal {D}}_{\frac{1}{2}-(\frac{1}{2}-\beta )(1-\alpha )^n}\) for any \(\mathcal {D}\) of min-entropy \(k=\varOmega (l)\) and \(n=k-2\log (1/\epsilon )\).

3.2 How the Proof Goes Astray

The statement in [32, Theorem 4] does not hold. We recall the setting of [32]: let \(\mathcal {D}\) be any distribution over \(\mathbb {Z}^{l}_2\) with min-entropy \(k=\varOmega (l)\), let \(n=k-2\log (1/\epsilon )\) for some negligible \(\epsilon \), sample \(B\xleftarrow {\$}\mathbb {Z}^{m \times n}_2\), \(C\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\), \(E\leftarrow \mathsf {Ber}_\alpha ^{m \times n}\), \(F\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\) and \(e\leftarrow \mathsf {Ber}_\beta ^m\), and let \(A=BC\oplus EF\). As we pointed out in Sect. 1, the proof contains several flaws. First, the noise rate \(1/2-(1-\alpha )^n/2\) is too close to uniform for the assumption to support any meaningful statement. Second, the matrix A is far from statistically uniform, and there is no evidence that it is even pseudorandom. Third, the claim that EF follows \(\mathsf {Ber}_{1/2-(1-\alpha )^n/2}^{m\times l}\) is not justified: the authors only show that each entry of EF follows \(\mathsf {Ber}_{1/2-(1-\alpha )^n/2}\), and the missing step, that the entries of EF are all independent, is in fact false, since entries in the same row of EF share the same row of E (see the experiment below). Notice that machinery such as two-source extraction does not help here, as the extracted bits are biased.
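The following small Monte Carlo experiment (with arbitrary toy parameters, for illustration only) compares the joint distribution of two same-row entries of EF against the product of their marginals:

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha, trials = 8, 0.1, 100_000  # toy parameters chosen only for illustration

one = two = both = 0
for _ in range(trials):
    e_row = (rng.random(n) < alpha).astype(np.int64)  # one row of E ~ Ber_alpha^n
    F = rng.integers(0, 2, size=(n, 2))               # two columns of a uniform F
    x, y = (e_row @ F) % 2                            # two same-row entries of EF
    one += x
    two += y
    both += x & y

p1, p2, p12 = one / trials, two / trials, both / trials
print(f"Pr[X=1]={p1:.3f}  Pr[Y=1]={p2:.3f}")
print(f"Pr[X=1,Y=1]={p12:.3f}  vs  Pr[X=1]*Pr[Y=1]={p1*p2:.3f}")
# Independence would require Pr[X=1,Y=1] ~ Pr[X=1]*Pr[Y=1]. The gap observed
# here stems from the shared row of E: e.g., whenever that row is all-zero
# (probability (1-alpha)^n), both entries are forced to 0 simultaneously.
```

With these parameters each marginal is \(1/2-(1-\alpha )^n/2\approx 0.285\), while the joint probability concentrates around \((1-(1-\alpha )^n)/4\approx 0.142\), far from the product \(\approx 0.081\).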

3.3 The Remedy

Now we give a simple remedy using the techniques from [16, 33]. Let \(\mathcal {D}\) be any distribution over \(\mathbb {Z}^{l}_2\) with min-entropy \(k=\varOmega (l)\) and let \(n=k-\omega (\log l)\). Sample \(B\xleftarrow {\$}\mathbb {Z}^{m \times n}_2\), \(C\xleftarrow {\$}\mathbb {Z}^{n \times l}_2\), \(s\leftarrow \mathcal {D}\) and \(e\leftarrow \mathsf {Ber}_\alpha ^m\), and let \(A=BC\). By the Leftover Hash Lemma (applied to the universal family \(h_C(s)=C\cdot {s}\)), we have

$$ (C, C\cdot s)~{\mathop \sim \limits ^{s}}~(C, U_{n}), $$

which in turn implies (since statistical distance cannot increase under any randomized function, here left-multiplication by B followed by XORing with e)

$$ (BC, (BC)\cdot s \oplus e)~{\mathop \sim \limits ^{s}}~(BC, B\cdot U_{n}\oplus e). $$

Note that the standard LPN\(_{n,\alpha }\) assumption implies

$$ (B,B\cdot U_n\oplus e)~{\mathop \sim \limits ^{c}}~(B,U_m). $$

It follows that

$$ (BC, (BC)\cdot s \oplus e)~{\mathop \sim \limits ^{c}}~(BC, U_m) $$

which completes the proof. This also simplifies the proof of [32] by eliminating the need for the matrices E and F. Notice that we require A to be sampled from a random subspace of dimension n, rather than uniformly at random.
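A minimal Python sketch of the repaired construction follows; parameter values are illustrative toy choices, and the weak secret is modeled by a hypothetical sampler weak_secret (fixing the last l-k bits) just to have a concrete distribution of min-entropy k:

```python
import numpy as np

rng = np.random.default_rng(2)
l, k, n, m, alpha = 64, 48, 32, 256, 0.05  # toy parameters for illustration

def weak_secret():
    """Hypothetical distribution of min-entropy k: last l-k bits fixed to 0."""
    s = rng.integers(0, 2, size=l)
    s[k:] = 0
    return s

B = rng.integers(0, 2, size=(m, n))  # uniform m x n matrix
C = rng.integers(0, 2, size=(n, l))  # uniform n x l matrix (the universal hash)
A = (B @ C) % 2                      # public matrix: rows confined to a random
                                     #   n-dimensional subspace, not uniform
s = weak_secret()
e = (rng.random(m) < alpha).astype(np.int64)
b = (A @ s + e) % 2                  # = B.(C.s) XOR e
# By the Leftover Hash Lemma, C.s is statistically close to uniform over Z_2^n,
# so (A, b) is computationally indistinguishable from (A, U_m) under LPN_{n,alpha}.
```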

4 Remarks on the Applications

In [32], the authors apply their result to the probabilistic CPA-secure symmetric-key encryption scheme of [12], where the secret key is sampled from an arbitrary distribution with sufficient min-entropy. However, with noise rate \(\frac{1}{2}-\frac{(1-\alpha )^{n}}{2}\) the underlying assumption is either trivially true (the samples are statistically close to uniform, so no efficient application can be based on it) or, in parameter ranges where it is meaningful, it does not yield the desired conclusion due to the flawed proofs.