1 Introduction

The Learning with Errors problem (LWE) is one of the most important problems in lattice-based cryptography. A huge variety of schemes, ranging from basic primitives like signature [18] and encryption schemes [32] to highly advanced schemes like group signatures [30] and fully homomorphic encryption [12], base their security on the LWE assumption. Understanding the concrete hardness of LWE is therefore important for selecting parameters.

Many cryptographic schemes are based on the hardness of special LWE instances like Ring-LWE [34], or LWE with ternary error [22]. Understanding the hardness of subclasses of the LWE problem and identifying those that are easy to solve is therefore an important task. In fact, several recent results [15, 19, 20, 29] show that some subclasses are easier than expected.

We show that the subclass LWE with binary error, which has been considered before in several papers [1, 35], fits into this category. To show that LWE with binary error is considerably easier than expected, we modify the hybrid lattice-reduction and meet-in-the-middle attack by Howgrave-Graham [25] (referred to as the hybrid attack in the following), apply it to this setting, and analyze its complexity. In order to compare our approach to existing ones, we apply known attacks on LWE to the binary error setting and analyze their complexities in this case. Our comparison shows that for several natural parameter sets the hybrid attack is much faster than existing methods such as the enumeration attack [32, 33] or the embedding approach [4]. Figure 1 illustrates our improvement by comparing the runtime of the best previously known attack with that of the hybrid attack, where \(m = 2n\) samples from an LWE distribution with binary error are given and n is the dimension of the secret vector. For example, in the case of \(n=256\) and \(q=256\), the hardness of the problem drops from 117 to 85 bits, which is a significant improvement. A detailed comparison between the hybrid attack and previous approaches is given in Table 1 in Sect. 4.

Fig. 1. Hardness of LWE instances with number of samples \(m=2n\) and modulus \(q=256\) before and after this work

The hybrid attack can also be seen as an improvement of an idea sketched by Bai and Galbraith [9]. However, Bai and Galbraith did not provide an analysis of their suggestion, and the analysis of Howgrave-Graham is partly based on experiments. A theoretical analysis of the hybrid attack that is not based on experimental results has been presented by Hirschhorn et al. in [24]. However, their analysis requires an additional assumption.

In this work we present a complete and improved analysis based on the same assumptions used in [25], without the additional assumption of [24] and without the need for experimental support. To this end, we introduce new analytic techniques. Our new analysis can also be applied to Howgrave-Graham's original attack, as well as to the attack sketched by Bai and Galbraith [9]. In addition, we show how to use our techniques to analyze the decoding attack on LWE with binary error.

Related Work. A number of recent works have highlighted the importance of considering the hardness of variants of LWE. For example, certain choices of rings lead to weak instances of the Ring-LWE problem [15, 19, 20]. Additionally, Laine and Lauter [29] provide a polynomial time attack for LWE instances with an exponentially large modulus q and a sufficiently narrow Gaussian error. The existence of such weak instances shows the necessity of studying the hardness of special instances of the LWE problem separately.

The hardness of LWE with binary error has been considered in some detail. So far, there are known attacks that require access to superlinearly many samples (i.e., \(m > \mathcal {O}\left( n\right) \)), and hardness results when the cryptanalyst is given only a sublinear number of additional samples (i.e., \(m = n + \mathcal {O}\left( n/\log (n)\right) \)), where n is the dimension of the secret vector. More precisely, the problem can be solved in polynomial time using the algorithm of Arora and Ge [6] when the number of samples is \(m=\mathcal {O}\left( n^2\right) \) (see, e.g., [1]). Furthermore, Albrecht et al. [1] showed that LWE with binary error can be solved in subexponential time using an improved version of the Arora-Ge attack if the attacker has access to a quasi-linear number of samples, e.g., \(m = \mathcal {O}\left( n \log \log n \right) \). On the other hand, Micciancio and Peikert [35] proved that LWE with binary error reduces to worst-case lattice problems when the number of samples is restricted to \(n + \mathcal {O}\left( n/\log (n)\right) \). We close the gap between these hardness results on the one side and the weakness results on the other by presenting an attack that requires only n additional samples.

The idea of Bai and Galbraith which we build upon is to guess the first r components of the secret vector and apply a lattice attack to the remaining problem [9]. As noted in [5], this strategy enables the transformation of any algorithm for solving LWE into another one whose complexity is bounded by the cost of exhaustive search. Howgrave-Graham’s algorithm [25], which we apply here, involves a Meet-in-the-Middle component to speed up this guessing: this was not considered in either of [5, 9]. The existence of a Meet-in-the-Middle approach for solving LWE (without combining it with any other algorithm) was mentioned in [9], and such an algorithm was presented in [5]. In Sect. 4 we show that it is much more efficient to combine a Meet-in-the-Middle approach with a decoding attack than to solve LWE with binary error entirely by a Meet-in-the-Middle approach.

Structure. In Sect. 2 we give some notation and required preliminaries. In Sect. 3 we describe how to apply the hybrid attack to LWE with binary error and analyze its complexity. In Sect. 4 we apply other possible attacks on LWE to the binary error case, analyze their complexities, and compare the results to the hybrid attack.

2 Notation and Preliminaries

Notation. In this work vectors are denoted in bold lowercase letters, e.g., \(\mathbf {a} \), and matrices in bold uppercase letters, e.g., \(\mathbf {A} \). For a vector \(\mathbf {v} \in \mathbb {R} ^n\) we write \(\mathbf {v} \mod q\) for its unique representative modulo q in \([-\lfloor \frac{q}{2} \rfloor , \frac{q}{2})^n\). Logarithms are base two unless stated otherwise, and \(\ln (x)\) denotes the natural logarithm of x.

Learning with Errors. The Learning with Errors (LWE) problem, introduced by Regev [41], is a computational problem whose presumed hardness is the basis for several cryptographic constructions, e.g., [39–41]. In this work, we consider the variant LWE with binary error.

Problem Statement 1

(LWE with binary error). Let n, q be positive integers, \(\mathcal {U}\) be the uniform distribution on \(\{ 0,1 \}\), and \(\mathbf {s} \mathop {\leftarrow }\limits ^{\$}\mathcal {U}^n\) be a secret vector in \(\{ 0,1 \} ^n\). We denote by \(L_{\mathbf {s},\mathcal {U}} \) the probability distribution on \( \mathbb {Z}_q ^n \times \mathbb {Z}_q \) obtained by choosing \(\mathbf {a} \in \mathbb {Z}_q ^n\) uniformly at random, choosing \(e \mathop {\leftarrow }\limits ^{\$}\mathcal {U}\), and returning \((\mathbf {a}, \left\langle {\mathbf {a}},{\mathbf {s}}\right\rangle +e) \in \mathbb {Z}_q ^n \times \mathbb {Z}_q \).

LWE with binary error is the problem of recovering \(\mathbf {s} \) from m samples \((\mathbf {a} _i, \left\langle {\mathbf {a} _i},{\mathbf {s}}\right\rangle +e_i) \in \mathbb {Z}_q ^n \times \mathbb {Z}_q \) sampled according to \(L_{\mathbf {s},\mathcal {U}} \), with \(i\in \{1,\dots ,m\}\).

Note that Regev defined LWE with a secret vector \(\mathbf {s} \) chosen uniformly at random from the whole of \( \mathbb {Z}_q ^n\). However, it is well-known that LWE with arbitrarily distributed secret can be transformed to LWE with secret distributed according to the error distribution. Consequently, most cryptographic constructions are based on LWE where secret and error are identically distributed, and we focus on this case in this work.
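
For experimentation, such instances are straightforward to generate. The following is a small numpy sketch (ours, with illustrative parameters), not taken from the paper:

```python
import numpy as np

def binary_lwe_instance(n, m, q, rng=None):
    """Sample m LWE-with-binary-error samples (Problem Statement 1):
    A uniform mod q, secret s in {0,1}^n, error e in {0,1}^m."""
    rng = rng or np.random.default_rng()
    A = rng.integers(0, q, size=(m, n))
    s = rng.integers(0, 2, size=n)
    e = rng.integers(0, 2, size=m)
    b = (A @ s + e) % q
    return A, b, s, e

A, b, s, e = binary_lwe_instance(n=256, m=512, q=256)
```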

Lattices and Bases. A lattice is a discrete additive subgroup of \( \mathbb {R} ^m\). A set of linearly independent vectors \(\mathbf {B} = \{\mathbf {b} _1, \ldots , \mathbf {b} _n\} \subset \mathbb {R} ^m\) is called a basis of a lattice \(\varLambda \), if \(\varLambda = \varLambda (\mathbf {B})\), where

$$\begin{aligned} \varLambda (\mathbf {B}) = \{\mathbf {x} \in \mathbb {R} ^m\mid \mathbf {x} = \sum _{i=1}^n \alpha _i \mathbf {b} _i\text { for } \alpha _i \in \mathbb {Z} \}. \end{aligned}$$

The dimension of a lattice \(\varLambda \) is defined as the cardinality of some (equivalently, any) basis of \(\varLambda \). For the rest of this work we restrict our studies to lattices in \( \mathbb {R} ^m\) whose dimension is maximal, i.e., m, which are called full-rank lattices. The fundamental parallelepiped of a lattice basis \(\mathbf {B} = \{ \mathbf {b} _1, \ldots , \mathbf {b} _m\} \subset \mathbb {R} ^m\) is given by

$$\begin{aligned} \mathcal {P}(\mathbf {B}) = \{\mathbf {x} \in \mathbb {R} ^m\mid \mathbf {x} = \sum _{i=1}^m \alpha _i \mathbf {b} _i\text { for } -1/2 \le \alpha _i < 1/2\}. \end{aligned}$$

The determinant of a lattice \(\varLambda (\mathbf {B})\) with basis \(\mathbf {B} \) is defined as the m-dimensional volume of its fundamental parallelepiped. Note that the determinant of the lattice is independent of the choice of the basis.

Every lattice of dimension \(m\ge 2\) has infinitely many different bases. A measure for the quality of a basis is provided by the Hermite delta. A lattice basis \(\mathbf {B} = \{\mathbf {b} _1, \ldots , \mathbf {b} _m\}\) has Hermite delta \(\delta \) if \(\left\| {\mathbf {b} _1}\right\| = \delta ^m \det (\varLambda )^{1/m}\).

Differing estimates exist in the literature for the number of operations that basis reduction needs to achieve a given Hermite delta \(\delta \) (see for example [5, 16, 32, 33, 37]). Throughout this work we use the estimate given by Lindner and Peikert [32], namely that the number of operations needed to achieve a Hermite delta \(\delta \) is about

$$\begin{aligned} {{\mathrm{ops}}}_\text {BKZ}(\delta ) = 2^{1.8/\log _2(\delta ) - 110}\cdot 2.3\cdot 10^9. \end{aligned}$$
(1)
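
Eq. (1) is easy to evaluate directly; the following Python sketch (function name ours) is reused in the cost estimates later in this work:

```python
import math

def ops_bkz(delta):
    """Estimated number of bit operations for basis reduction to reach
    Hermite delta `delta`, following the Lindner-Peikert estimate, Eq. (1)."""
    return 2 ** (1.8 / math.log2(delta) - 110) * 2.3e9

# Example: reaching delta = 1.007 costs about 2^100 bit operations.
print(math.log2(ops_bkz(1.007)))
```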

A lattice \(\varLambda \) satisfying \(q\cdot \mathbb {Z} ^m \subset \varLambda \subset \mathbb {R} ^m\) is a q-ary lattice. For a matrix \(\mathbf {A} \in \mathbb {Z} _q^{m\times n}\), we define the q-ary lattice

$$\begin{aligned} \varLambda _q(\mathbf {A}) := \{\mathbf {v} \in \mathbb {Z} ^m\mid \exists \mathbf {w} \in \mathbb {Z} ^n: \mathbf {A} \mathbf {w} = \mathbf {v} \mod q\}. \end{aligned}$$

If \(m\ge n\) and all column vectors of \(\mathbf {A} \in \mathbb {Z} _q^{m\times n}\) are linearly independent over \( \mathbb {Z} _q\), we have \(\det (\varLambda _q(\mathbf {A})) = q^{m-n}\).

The closest vector problem is the problem of recovering the lattice vector closest to a given target vector, given also a basis of the lattice. One can consider a relaxation, namely a close vector problem, where the inputs are the same (a basis and a target vector), and the task is to recover a lattice vector which is sufficiently close to the target.

Babai’s Nearest Plane. The hybrid attack uses Babai’s nearest plane algorithm [7] (denoted by \({{\mathrm{NP}}}\) in the following) as a subroutine. It takes a lattice basis \(\mathbf {B} \subset \mathbb {Z} ^m\) and a target vector \(\mathbf {t} \in \mathbb {R} ^m\) as input and outputs a vector \(\mathbf {e} \in \mathbb {R} ^m\) such that \(\mathbf {t} - \mathbf {e} \in \varLambda (\mathbf {B})\), which we denote by \({{\mathrm{NP}}}_{\mathbf {B}}(\mathbf {t})= \mathbf {e} \). If the lattice basis used is clear from the context, we omit it in the notation and simply write \({{\mathrm{NP}}}(\mathbf {t})\). A detailed explanation of nearest plane can be found in Babai’s original work [7] and Lindner and Peikert’s follow-up work [32]. The output of nearest plane plays an important role in the analysis of the hybrid attack and can be understood without knowing the details of the algorithm itself. It depends on the Gram-Schmidt basis of the input basis \(\mathbf {B} \), which is defined as \(\mathbf {\overline{B}} = \{\overline{\mathbf {b} _1}, \dots , \overline{\mathbf {b} _m}\}\) with

$$\begin{aligned} \overline{\mathbf {b} _i} = \mathbf {b} _i - \sum _{j=1}^{i-1} \frac{\langle \overline{\mathbf {b} _j},\mathbf {b} _i\rangle }{\langle \overline{\mathbf {b} _j},\overline{\mathbf {b} _j}\rangle }\overline{\mathbf {b} _j}, \end{aligned}$$

where \(\overline{\mathbf {b} _1}=\mathbf {b} _1\). We will use the following result from [8].

Lemma 1

For a lattice basis \(\mathbf {B} \) with Gram-Schmidt basis \(\overline{\mathbf {B}}\) and a target vector \(\mathbf {t} \) as input, the nearest plane algorithm returns the unique vector \(\mathbf {e} \in \mathcal P(\overline{\mathbf {B}})\) that satisfies \(\mathbf {t} - \mathbf {e} \in \varLambda (\mathbf {B})\).

Lemma 1 shows that analyzing the output of the nearest plane algorithm requires estimating the lengths of the basis vectors of the corresponding Gram-Schmidt basis. The established way to do this is via the following heuristic (see Lindner and Peikert [32] for more details).

Heuristic 1

(Geometric Series Assumption). Let \(\{\mathbf {b} _1, \ldots , \mathbf {b} _m\} \subset \mathbb {Z} ^m\) be a reduced basis with Hermite delta \(\delta \) of an m-dimensional lattice with determinant D. Also let \(\overline{\mathbf {b} _i}\) denote the basis vectors of the corresponding Gram-Schmidt basis. Then the length of \(\overline{\mathbf {b} _i}\) is approximated by

$$\begin{aligned} \left\| {\overline{\mathbf {b}}_i}\right\| \approx \delta ^{-2(i-1)+m}D^{\frac{1}{m}}. \end{aligned}$$
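
The following numpy sketch (ours, not from [7] or [32]) makes Lemma 1 concrete: it computes the Gram-Schmidt basis as defined above and runs nearest plane, returning the unique \(\mathbf {e} \in \mathcal P(\overline{\mathbf {B}})\) with \(\mathbf {t} - \mathbf {e} \in \varLambda (\mathbf {B})\). A real attack would of course use an optimized implementation; this version is reused for illustration below.

```python
import numpy as np

def gram_schmidt(B):
    """Gram-Schmidt orthogonalization (no normalization) of the rows of B."""
    Bs = B.astype(float).copy()
    for i in range(len(B)):
        for j in range(i):
            Bs[i] -= (Bs[j] @ Bs[i]) / (Bs[j] @ Bs[j]) * Bs[j]
    return Bs

def nearest_plane(B, t):
    """Babai's nearest plane: returns e in the fundamental parallelepiped
    of the Gram-Schmidt basis with t - e in the lattice generated by the
    rows of B (cf. Lemma 1)."""
    Bs = gram_schmidt(B)
    e = np.asarray(t, dtype=float).copy()
    for i in reversed(range(len(B))):
        c = np.rint((e @ Bs[i]) / (Bs[i] @ Bs[i]))
        e -= c * B[i]
    return e
```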

3 The Attack

In this section we present and analyze the hybrid attack on LWE with binary error. The attack is described in Algorithm 1 of Sect. 3.1. In Theorem 1 of Sect. 3.2 we analyze the expected runtime of the hybrid attack. Section 3.3 shows how to optimize the attack parameters and perform a trade-off between precomputation and the actual attack in order to minimize the runtime of the attack.

3.1 The Hybrid Attack

In the following we describe the hybrid attack on LWE with binary error. The attack is presented in Algorithm 1.

Let \(m,n,q \in \mathbb {N} \) and let

$$\begin{aligned} (\mathbf {A}, \mathbf {b} =\mathbf {A} \tilde{\mathbf {s}} + {\mathbf {e}} \mod q) \end{aligned}$$
(2)

with \(\mathbf {A} \in \mathbb {Z} _q^{m\times n}, {\mathbf {b}}\in \mathbb {Z} _q^m, \tilde{\mathbf {s}} \in \{0,1\}^n\) and \({\mathbf {e}} \in \{0,1\}^m\) be an LWE instance with binary error \({\mathbf {e}}\) and binary secret \(\tilde{\mathbf {s}}\). In order to obtain a smaller error vector we can subtract the vector \((1/2)\cdot \mathbf {1} \), whose entries are all 1/2, from Eq. (2). This yields a new LWE instance \((\mathbf {A}, \mathbf {b} ^\prime =\mathbf {A} \tilde{\mathbf {s}} + {\mathbf {e} '} \mod q)\), where \(\mathbf {b} ^\prime = {\mathbf {b}}-(1/2)\cdot \mathbf {1} \) and \(\mathbf {e} ^\prime = {\mathbf {e}}-(1/2)\cdot \mathbf {1} \). The new error vector \(\mathbf {e} ^\prime \) has norm exactly \(\sqrt{m/4}\) instead of the expected norm \(\sqrt{m/2}\) of the original error vector \({\mathbf {e}}\). For \(r \in \{1,\ldots , n-1\}\), we can split the secret \(\tilde{\mathbf {s}} = \begin{pmatrix}\mathbf {v} \\ \mathbf {s} \end{pmatrix}\) and the matrix \(\mathbf {A} = (\mathbf {A} _1 | \mathbf {A} _2)\) into two parts and rewrite this LWE instance as

$$\begin{aligned} \mathbf {b} ^\prime = (\mathbf {A} _1 | \mathbf {A} _2) \begin{pmatrix}\mathbf {v} \\ \mathbf {s} \end{pmatrix} + \mathbf {e} ^\prime = \mathbf {A} _1 \mathbf {v} + \mathbf {A} _2 \mathbf {s} + \mathbf {e} ^\prime \mod q, \end{aligned}$$
(3)

where \(\mathbf {v} \in \{0,1\}^r, \mathbf {s} \in \{0,1\}^{n-r}, \mathbf {A} _1\in \mathbb {Z} _q^{m\times r}, \mathbf {A} _2\in \mathbb {Z} _q^{m\times (n-r)}, \mathbf {b} ^\prime = {\mathbf {b}}-(1/2)\cdot \mathbf {1} \in \mathbb {Q}^m\), and \(\mathbf {e} ^\prime = {\mathbf {e}}-(1/2)\cdot \mathbf {1} \in \{-1/2, 1/2\}^m\).

The main idea of the attack is to guess \(\mathbf {v} \) and solve the remaining LWE instance \((\mathbf {A} _2, \tilde{\mathbf {b}}= \mathbf {b} ^\prime -\mathbf {A} _1 \mathbf {v} = \mathbf {A} _2\mathbf {s} + \mathbf {e} ^\prime \mod q)\), which has binary secret \(\mathbf {s} \) and error \(\mathbf {e} ^\prime \in \{-1/2,1/2\}^m\). The new LWE instance obtained in this way turns out to be considerably easier to solve, since the determinant \(\det (\varLambda _q(\mathbf {A} _2)) = q^{m-n+r}\) of the new lattice is significantly bigger than the determinant \(\det (\varLambda _q(\mathbf {A})) = q^{m-n}\) of the original lattice (see Sect. 6.1 of [9]). The newly obtained LWE instance is solved by solving a close vector problem in the lattice \(\varLambda _q(\mathbf {A} _2)\). In more detail, \(\tilde{\mathbf {b}}= \mathbf {A} _2\mathbf {s} + q \mathbf {w} + \mathbf {e} ^\prime \) for some vector \(\mathbf {w} \in \mathbb {Z} ^m\) is close to the lattice vector \(\mathbf {A} _2\mathbf {s} + q \mathbf {w} \in \varLambda _q(\mathbf {A} _2)\) since \(\mathbf {e} ^\prime \) is small. Hence \(\mathbf {e} ^\prime \) can be found by running the nearest plane algorithm in combination with a sufficient basis reduction as a precomputation (see [32]).

The guessing of \(\mathbf {v} \) is sped up by a Meet-in-the-Middle approach, i.e., guessing binary vectors \(\mathbf {v} _1 \in \{0,1\}^r\) and \(\mathbf {v} _2 \in \{0,1\}^r\) such that \(\mathbf {v} = \mathbf {v} _1 + \mathbf {v} _2\). In order to recognize matching guesses \(\mathbf {v} _1\) and \(\mathbf {v} _2\) that sum up to \(\mathbf {v} \), one searches for collisions in (hash) boxes. The addresses of these boxes are determined in the following way.

Definition 1

Let \(m \in \mathbb {N} \). For a vector \(\mathbf {x} \in \mathbb {R} ^m\) the set \(\mathcal {A}_{\mathbf {x}}^{(m)} \subset \{0,1\}^m\) is defined as

$$\begin{aligned} \mathcal {A}_{\mathbf {x}}^{(m)}=\left\{ \mathbf {z} \in \{0,1\}^m \bigg \vert \begin{array}{l} (\mathbf {z})_i = 1 \text { for all } i\in \{1,\ldots , m\} \text { with } (\mathbf {x})_i > -1/2, \text { and}\\ (\mathbf {z})_i = 0 \text { for all } i\in \{1,\ldots , m\} \text { with } (\mathbf {x})_i < -1/2 \end{array} \right\} . \end{aligned}$$

Intuitively, for \(\mathbf {x} _2\) obtained during Algorithm 1, the set \(\mathcal {A}_{\mathbf {x} _2}^{(m)}\) captures all possible sign vectors of \(\mathbf {x} _2\) plus a vector in \(\{-1/2, 1/2\}^m\) (where 1 represents a non-negative and 0 a negative sign). For \(\mathbf {x} _1\) obtained during Algorithm 1, the set \(\mathcal {A}_{\mathbf {x} _1}^{(m)}\) consists only of the sign vector of \(\mathbf {x} _1\). This is due to the fact that \(\mathbf {x} _2 \in \mathbb {Z} ^m + \{1/2\}^m\), whereas \(\mathbf {x} _1 \in \mathbb {Z} ^m\). This leads to the desired collisions, as can be seen in the upcoming Lemma 3.
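
Definition 1 translates directly into code. In the sketch below (ours), a coordinate equal to \(-1/2\) leaves the corresponding address bit free; this is exactly what makes \(\mathcal {A}_{\mathbf {x} _2}^{(m)}\) potentially contain several addresses, while for integer vectors like \(\mathbf {x} _1\) the set is a single sign vector.

```python
from itertools import product

def addresses(x):
    """All addresses in A_x^(m) (Definition 1): bit i is forced to 1 if
    x_i > -1/2, forced to 0 if x_i < -1/2, and free if x_i == -1/2."""
    choices = []
    for xi in x:
        if xi > -0.5:
            choices.append((1,))
        elif xi < -0.5:
            choices.append((0,))
        else:  # xi == -1/2: both address bits are possible
            choices.append((0, 1))
    return list(product(*choices))
```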

Algorithm 1. The hybrid attack on LWE with binary error (pseudocode given as a figure in the original).

3.2 Runtime Analysis

In this section we analyze the runtime and success probability of the attack presented in Algorithm 1. We start by presenting our main result.

Theorem 1

Let \(n,m,q,c \in \mathbb {N} \), and \(1\le \delta \in \mathbb {R} \) be fixed. Consider the following input distribution of \((q,r,\mathbf {A}, \mathbf {b}, \mathbf {B})\) for Algorithm 1. The modulus q and the attack parameter \(r=4c\) are fixed, \(\mathbf {A} = (\mathbf {A} _1 | \mathbf {A} _2)\), where \(\mathbf {A} _1\mathop {\leftarrow }\limits ^{\$}~ \mathbb {Z} _q^{m\times r}\), \(\mathbf {A} _2\mathop {\leftarrow }\limits ^{\$}~ \mathbb {Z} _q^{m\times (n-r)}\), \(\mathbf {b} = \mathbf {A} \begin{pmatrix}\mathbf {v} \\ \mathbf {s} \end{pmatrix} + \mathbf {e} \mod q\), where \(\mathbf {v} \mathop {\leftarrow }\limits ^{\$}~\{0,1\}^r\), \(\mathbf {s} \mathop {\leftarrow }\limits ^{\$}~\{0,1\}^{n-r}\), \(\mathbf {e} \mathop {\leftarrow }\limits ^{\$}~\{0,1\}^m\), and \(\mathbf {B} \) is some lattice basis of \(\varLambda _q(\mathbf {A} _2)\) with Hermite delta \(\delta \). Let all notations be as in the above description of the input distribution. Assume that the approximations given in Heuristics 2 and 4 are in fact equations and that \({{\mathrm{NP}}}_{\mathbf {B}}(\mathbf {b}- (1/2)\cdot \mathbf {1} - \mathbf {A_1 v})=\mathbf {e}- (1/2)\cdot \mathbf {1} \). Then, if Algorithm 1 terminates, it finds a valid binary error vector of the LWE with binary error instance \((\mathbf {A}, \mathbf {b})\). The probability that Algorithm 1 terminates is at least

$$\begin{aligned} p_0=2^{-r}\begin{pmatrix}r\\ 2c\end{pmatrix}. \end{aligned}$$

In case that Algorithm 1 terminates, the expected number of operations is

$$\begin{aligned} 2^{16} \begin{pmatrix} r\\ c \end{pmatrix} \left( p \begin{pmatrix}2c\\ c\end{pmatrix}\right) ^{-1/2}, \end{aligned}$$

with

$$\begin{aligned} p = \prod _{i=1}^m \left( 1-\frac{1}{r_iB(\frac{m-1}{2}, \frac{1}{2})}J(r_i, m)\right) , \end{aligned}$$

where \(B(\cdot , \cdot )\) denotes the Euler beta function (see [38]),

$$\begin{aligned} J(r_ i, m) = {\left\{ \begin{array}{ll} \int _{-r_i-1}^{r_i-1}\int _{-1}^{z+r_i} (1-y^2)^\frac{m-3}{2}dydz\\ \quad + \int _{r_i-1}^{-r_i}\int _{z-r_i}^{z+r_i} (1-y^2)^\frac{m-3}{2}dydz &{} \text { for } r_i < \frac{1}{2}\\ \int _{-r_i-1}^{-r_i}\int _{-1}^{z+r_i} (1-y^2)^\frac{m-3}{2}dydz &{} \text { for } r_i \ge \frac{1}{2}, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} r_i = \frac{\delta ^{-2(i-1)+m}q^{\frac{m-n+r}{m}}}{2\sqrt{m/4}}. \end{aligned}$$

Remark 1

Algorithm 1 gets some basis \(\mathbf {B} \) as input. This basis has a certain quality, given by the Hermite delta \(\delta \). In practice, we can improve the attack by providing a basis with better, i.e., smaller, Hermite delta. We achieve this by running a basis reduction (e.g., BKZ) on \(\mathbf {B} \) in a precomputation step (see Sect. 3.3).

We postpone the proof of Theorem 1 to the end of this subsection, since we first need to develop some necessary tools. We start by giving a definition of a notion which is crucial to our analysis. We then give a useful lemma.

Definition 2

Let \(m\in \mathbb {N}\). A vector \(\mathbf {x} \in \mathbb {Z} ^m\) is called \(\mathbf {y} \)-admissible for some vector \(\mathbf {y} \in \mathbb {Z} ^m\) if \({{\mathrm{NP}}}(\mathbf {x}) = {{\mathrm{NP}}}(\mathbf {x} - \mathbf {y}) + \mathbf {y} \).

Intuitively, \(\mathbf {x} \) being \(\mathbf {y} \)-admissible means that running the nearest plane algorithm on \(\mathbf {x} \) and running it on \(\mathbf {x} - \mathbf {y} \) yields the same lattice vector, since then we have \(\mathbf {x} -{{\mathrm{NP}}}(\mathbf {x})=(\mathbf {x} - \mathbf {y}) - {{\mathrm{NP}}}(\mathbf {x} - \mathbf {y})\).
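
Using the nearest_plane sketch from Sect. 2, Definition 2 can be checked directly (helper ours, for illustration):

```python
import numpy as np

def is_admissible(B, x, y):
    """Definition 2: x is y-admissible iff NP(x) = NP(x - y) + y, i.e.,
    nearest plane identifies the same lattice vector from x and x - y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.allclose(nearest_plane(B, x), nearest_plane(B, x - y) + y)
```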

Lemma 2

Let \(\mathbf {t} _1\in \mathbb {R} ^m, \mathbf {t} _2\in \mathbb {R} ^m\) be two arbitrary target vectors. Then the following are equivalent.

  1. \( {{\mathrm{NP}}}(\mathbf {t} _1) + {{\mathrm{NP}}}(\mathbf {t} _2) = {{\mathrm{NP}}}(\mathbf {t} _1 + \mathbf {t} _2)\).

  2. \(\mathbf {t} _1\) is \({{\mathrm{NP}}}(\mathbf {t} _1 + \mathbf {t} _2)\)-admissible.

  3. \(\mathbf {t} _2\) is \({{\mathrm{NP}}}(\mathbf {t} _1 + \mathbf {t} _2)\)-admissible.

A proof of this lemma can be found in the full version [13].

As we will see in our analysis, the expected runtime heavily depends on the following probability. Let all notations be as in Theorem 1 and \(\mathbf {e} ^\prime =\mathbf {e} - (1/2)\cdot \mathbf {1} \). For

$$\begin{aligned} W=\{\mathbf {w} \in \{0,1\}^r: \text { exactly } c \text { entries of } \mathbf {w} \text { are } 1\} \end{aligned}$$
(4)

we define

$$\begin{aligned} p = \Pr \left[ -\mathbf {A} _1 \mathbf {v} _1 \text { is } \mathbf {e} ^\prime \text {-admissible} \mid \mathbf {v} _1\mathop {\leftarrow }\limits ^{\$}W \text { such that } \mathbf {v}-\mathbf {v} _1 \in W \right] . \end{aligned}$$
(5)

Note that the hybrid attack requires that nearest plane, called on the target vector \(\mathbf {b}- (1/2)\cdot \mathbf {1} - \mathbf {A_1 v} \), returns the correct shifted error vector \(\mathbf {e}- (1/2)\cdot \mathbf {1} \). However, this is not a big restriction in practice, since this probability is bigger than the probability that the same vector is \(\mathbf {e} '\)-admissible. To see why, recall that nearest plane returns the correct error vector if and only if that vector lies in the fundamental parallelepiped \(\mathcal {P}(\overline{\mathbf {B}})\) of the Gram-Schmidt basis (Lemma 1). On the other hand, Heuristic 3 states that the probability that \(\mathbf {b}- (1/2)\cdot \mathbf {1} - \mathbf {A_1 v} \) is \(\mathbf {e} '\)-admissible is approximately the probability that the sum of a random point in \(\mathcal {P}(\overline{\mathbf {B}})\) and the error vector is still in \(\mathcal {P}(\overline{\mathbf {B}})\). Consequently, we expect that \({{\mathrm{NP}}}_{\mathbf {B}}(\mathbf {b}- (1/2)\cdot \mathbf {1} - \mathbf {A_1 v})=\mathbf {e}- (1/2)\cdot \mathbf {1} \) holds with high probability for all realistic attack parameters.

Note that the analysis of the attack on the NTRU encryption scheme proposed by Howgrave-Graham [25] also requires calculating the probability p. In the original work, this is done experimentally. Replacing this experimental probability estimation with the analytic methodology presented in the following removes the dependency on experimental support in the analysis of the hybrid attack. A first mathematical calculation of the probability p has already been presented by Hirschhorn et al. in [24]. However, their analysis requires an additional assumption that we no longer need.

Success Probability. In this subsection we determine the probability that Algorithm 1 terminates. We start by giving a sufficient condition for this event.

Lemma 3

Let all notations be as in Theorem 1 and let \(\mathbf {b} ^\prime =\mathbf {b} - (1/2)\cdot \mathbf {1} \) and \(\mathbf {e} ^\prime =\mathbf {e} - (1/2)\cdot \mathbf {1} \). Assume that \(\mathbf {v} _1\) and \(\mathbf {v} _2\) are guessed in separate loops of Algorithm 1 and satisfy \(\mathbf {v} _1 + \mathbf {v} _2 = \mathbf {v} \). Also let \(\mathbf {t} _1 = - \mathbf {A} _1 \mathbf {v} _1\) and \(\mathbf {t} _2 = \mathbf {b} ^\prime - \mathbf {A} _1 \mathbf {v} _2\) and assume \({{\mathrm{NP}}}(\mathbf {t} _1) + {{\mathrm{NP}}}(\mathbf {t} _2) = {{\mathrm{NP}}}(\mathbf {t} _1 + \mathbf {t} _2) = \mathbf {e} ^\prime \) holds. Then \(\mathbf {v} _1\) and \(\mathbf {v} _2\) collide in at least one box chosen during Algorithm 1 and the algorithm outputs the error vector \({\mathbf {e}}\) of the given LWE instance.

Proof: According to the notation used in Algorithm 1, let \(\mathbf {x} _1 = -{{\mathrm{NP}}}(\mathbf {t} _1)\) correspond to \(\mathbf {v} _1\) and \(\mathbf {x} _2 = {{\mathrm{NP}}}(\mathbf {t} _2)\) correspond to \(\mathbf {v} _2\). By assumption we have \(\mathbf {x} _1 = \mathbf {x} _2 - \mathbf {e} ^\prime \). Using Definition 1, it is easy to verify that \(\mathbf {x} _1\) and \(\mathbf {x} _2\) share at least one common address, since \(\mathbf {e} ^\prime \in \{-1/2, 1/2\}^m\). Therefore \(\mathbf {v} _1\) and \(\mathbf {v} _2\) collide in at least one box. Again by assumption, we obtain \(\mathbf {x} = {{\mathrm{NP}}}(\mathbf {b} ^\prime - \mathbf {A} _1 \mathbf {v}) = {{\mathrm{NP}}}(\mathbf {t} _1 + \mathbf {t} _2) = \mathbf {e} ^\prime \). Hence the algorithm outputs the error vector \({\mathbf {e}}\).    \(\blacksquare \)

In the following lemma we give a lower bound on the probability that Algorithm 1 terminates.

Lemma 4

Let all notations be as in Theorem 1 and let \(\mathbf {b} ^\prime =\mathbf {b} - (1/2)\cdot \mathbf {1} \) and \(\mathbf {e} ^\prime =\mathbf {e} - (1/2)\cdot \mathbf {1} \). Assume that if \(\mathbf {v} \) has exactly 2c one-entries, then \(p >0\), where p is as defined in Eq. (5). If \({{\mathrm{NP}}}(\mathbf {b} ^\prime - \mathbf {A_1 v})=\mathbf {e} ^\prime \), then Algorithm 1 terminates with probability at least

$$\begin{aligned} p_0 = 2^{-r}\begin{pmatrix}r\\ 2c\end{pmatrix}. \end{aligned}$$

Proof: We show that Algorithm 1 terminates if \(\mathbf {v} \) consists of exactly 2c one-entries. The probability of this happening is exactly \(p_0\), since there are \(2^r\) binary vectors of length r, and \(\begin{pmatrix}r\\ 2c\end{pmatrix}\) of them have exactly 2c one-entries. Assume now that \(\mathbf {v} \) consists of exactly 2c one-entries; the claim then follows directly from Lemmas 2 and 3, as we now show. Since \(p>0\), there exist binary vectors \(\mathbf {v} _1, \mathbf {v} _2 \in \{0,1\}^r\), each containing exactly c one-entries, such that \(\mathbf {v} _1 + \mathbf {v} _2 = \mathbf {v} \) and \(-\mathbf {A_1 v_1} \) is \(\mathbf {e} ^\prime \)-admissible. These vectors will eventually be guessed during Algorithm 1 if it does not terminate before. By Lemma 2 they satisfy

$$\begin{aligned} {{\mathrm{NP}}}(- \mathbf {A} _1 \mathbf {v} _1)+{{\mathrm{NP}}}(\mathbf {b} ^\prime - \mathbf {A} _1 \mathbf {v} _2)={{\mathrm{NP}}}(\mathbf {b} ^\prime - \mathbf {A} _1 \mathbf {v})=\mathbf {e} ^\prime . \end{aligned}$$

Lemma 3 now guarantees that Algorithm 1 then outputs the error vector \({\mathbf {e}}\).    \(\blacksquare \)

Estimating the Number of Loops. The next step is to estimate the number of loops until the attack terminates.

Heuristic 2

Let all notations be as in Theorem 1 and let \(\mathbf {b} ^\prime =\mathbf {b} - (1/2)\cdot \mathbf {1} \) and \(\mathbf {e} ^\prime =\mathbf {e} - (1/2)\cdot \mathbf {1} \). Assume that \({{\mathrm{NP}}}(\mathbf {b} ^\prime - \mathbf {A_1 v})=\mathbf {e} ^\prime \), and that \(\mathbf {v} \) consists of exactly 2c one-entries. Then the expected number of loops of Algorithm 1 is

$$\begin{aligned} L \approx \begin{pmatrix} r\\ c \end{pmatrix} \left( p \begin{pmatrix}2c\\ c\end{pmatrix}\right) ^{-1/2}, \end{aligned}$$

and the probability p, as given in Eq. (5), is

$$\begin{aligned} p \approx \prod _{i=1}^m \left( 1-\frac{1}{r_iB(\frac{m-1}{2}, \frac{1}{2})}J(r_i, m)\right) , \end{aligned}$$

with \(B(\cdot , \cdot )\), \(J(\cdot , \cdot )\), and \(r_i\) defined as in Theorem 1.

In the following, we justify the heuristic. Assume that \(\mathbf {v} \) consists of exactly 2c one-entries. In addition to W (see Eq. (4)), define the set

$$\begin{aligned} V = \{\mathbf {v} _1\in W: \mathbf {v}-\mathbf {v} _1\in W \text { and }-\mathbf {A} _1 \mathbf {v} _1 \text { is } \mathbf {e} ^\prime \text {-admissible} \}. \end{aligned}$$

Note that W is the set from which Algorithm 1 samples the vectors \(\mathbf {v} _1\). Lemma 3 shows that the attack succeeds if two vectors \(\mathbf {v} _1, \mathbf {v} _2\in V\) satisfying \(\mathbf {v} _1 + \mathbf {v} _2 = \mathbf {v} \) are sampled in different loops of Algorithm 1. Since otherwise the probability of success is close to zero, for simplicity we assume that the attack is only successful in this case. Therefore we need to estimate the necessary number of loops in Algorithm 1 until some \(\mathbf {v} _1, \mathbf {v} _2\in V\) with \(\mathbf {v} _1 + \mathbf {v} _2 = \mathbf {v} \) are found. Note that by Lemma 2 if \(\mathbf {v} _1 \in V\), then also \(\mathbf {v} _2 = \mathbf {v}- \mathbf {v} _1 \in V\).

We start by calculating the probability that a vector sampled during Algorithm 1 lies in V. By definition of p, this probability is given by

$$\begin{aligned} \Pr \left[ \mathbf {v} _1 \in V \mid \mathbf {v} _1\mathop {\leftarrow }\limits ^{\$}W \right] = \Pr \left[ \mathbf {v}-\mathbf {v} _1 \in W \mid \mathbf {v} _1\mathop {\leftarrow }\limits ^{\$}W \right] \cdot p = p_1 p, \end{aligned}$$

where \(p_1 = \Pr \left[ \mathbf {v}-\mathbf {v} _1 \in W \mid \mathbf {v} _1\mathop {\leftarrow }\limits ^{\$}W \right] \). Therefore we expect to sample a vector \(\mathbf {v} _1 \in V\) every \(\frac{1}{p_1 p}\) loops in Algorithm 1. The above equation also implies \( p_1 p = \frac{|V|}{|W|}, \) which gives us

$$\begin{aligned} |V| = p_1 p |W|= p_1 p \begin{pmatrix} r \\ c \end{pmatrix}. \end{aligned}$$

The probability \(p_1\) is given by \(p_1 = \begin{pmatrix}2c\\ c\end{pmatrix} / \begin{pmatrix}r\\ c\end{pmatrix}\), see the full version [13]. Therefore by the birthday paradox, the expected number of loops in Algorithm 1 until some \(\mathbf {v} _1, \mathbf {v} _2\in V\) with \(\mathbf {v} _1 + \mathbf {v} _2 = \mathbf {v} \) are found can be estimated by

$$\begin{aligned} L \approx \frac{1}{p_1 p} \sqrt{|V|} = \frac{\sqrt{\begin{pmatrix} r\\ c \end{pmatrix} }}{\sqrt{p_1 p}} = \begin{pmatrix} r\\ c \end{pmatrix} \left( p \begin{pmatrix}2c\\ c\end{pmatrix}\right) ^{-1/2}. \end{aligned}$$
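
As a sanity check, this loop estimate is a one-liner once p is known (p is computed numerically below); the transcription is ours:

```python
from math import comb, sqrt

def expected_loops(r, c, p):
    """Heuristic 2: L = C(r, c) * (p * C(2c, c))**(-1/2)."""
    return comb(r, c) / sqrt(p * comb(2 * c, c))
```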

It remains to approximate the probability p which we do in the following. Let \(\mathbf {v} _1 \in \{0, 1\}^r\) and \(\mathbf {B} \) be some basis of \(\varLambda _q(\mathbf {A} _2)\). By Lemma 1 there exist unique \(\mathbf {u} _1, \mathbf {u} _2\in \varLambda _q(\mathbf {A} _2)\) such that \({{\mathrm{NP}}}_{\mathbf {B}}(-\mathbf {A} _1 \mathbf {v} _1) =-\mathbf {A} _1 \mathbf {v} _1-\mathbf {u} _1\in \mathcal {P}(\overline{\mathbf {B}})\) and \({{\mathrm{NP}}}_{\mathbf {B}}(-\mathbf {A} _1 \mathbf {v} _1 - \mathbf {e} ^\prime ) + \mathbf {e} ^\prime =-\mathbf {A} _1 \mathbf {v} _1-\mathbf {u} _2\in \mathbf {e} ^\prime + \mathcal {P}(\overline{\mathbf {B}})\). Without loss of generality, in the following we assume \(\mathbf {u} _1 = \mathbf {0} \), or equivalently \(-\mathbf {A} _1 \mathbf {v} _1\in \mathcal {P}(\overline{\mathbf {B}})\). Now \(-\mathbf {A} _1 \mathbf {v} _1\) is \(\mathbf {e} ^\prime \)-admissible if and only if \(\mathbf {u} _2 = \mathbf {u} _1 = \mathbf {0} \), which is equivalent to \(\mathbf {e} ^\prime + \mathbf {A} _1 \mathbf {v} _1\in \mathcal {P}(\overline{\mathbf {B}})\). Therefore p is equal to the probability that \(\mathbf {e} ^\prime + \mathbf {A} _1 \mathbf {v} _1\in \mathcal {P}(\overline{\mathbf {B}})\), which we determine in the following.

There exists some orthonormal transformation that aligns \(\mathcal {P}(\overline{\mathbf {B}})\) along the standard axes of \( \mathbb {R} ^{m}\). By applying this transformation, we may therefore assume that \(\mathcal {P}(\overline{\mathbf {B}})\) is aligned along the standard axes of \( \mathbb {R} ^{m}\) and that in consequence \(\mathbf {e} ^\prime \) is a uniformly random vector of length \(\sqrt{m/4}\). Because \(\mathbf {A} _1\) is uniformly random in \( \mathbb {Z} _q^{m\times r}\) we may further assume that \(\mathbf {A} _1 \mathbf {v} _1\) is uniformly random in \(\mathcal {P}(\overline{\mathbf {B}})\), since without loss of generality we assume \(\mathbf {A} _1 \mathbf {v} _1 \in \mathcal {P}(\overline{\mathbf {B}})\). This gives rise to the following heuristic.

Heuristic 3

The probability p as defined in Eq. (5) (with respect to a reduced basis with Hermite delta \(\delta \)) is

$$\begin{aligned} p \approx \Pr \left[ \mathbf {t} + \mathbf {e} ^\prime \in R \mid \mathbf {t} \mathop {\leftarrow }\limits ^{\$}R,\ \mathbf {e} ^\prime \mathop {\leftarrow }\limits ^{\$}S_m(\sqrt{m/4}) \right] , \end{aligned}$$

where

$$\begin{aligned} S_m(\sqrt{m/4}) = \{\mathbf {x} \in \mathbb {R} ^m \mid \left\| {\mathbf {x}}\right\| = \sqrt{m/4}\} \end{aligned}$$

is the surface of a sphere with radius \(\sqrt{m/4}\) centered around the origin and

$$\begin{aligned} R = \{\mathbf {x} \in \mathbb {R} ^m \mid \forall i\in \{1,\dots ,m\}: -R_i/2 \le x_i < R_i/2 \} \end{aligned}$$

is the search rectangle with edge lengths

$$\begin{aligned} R_i = \delta ^{-2(i-1)+m}q^{\frac{m-n+r}{m}}. \end{aligned}$$

In the heuristic, the edge lengths are implied by the Geometric Series Assumption.

We continue calculating the approximation of p given in Heuristic 3. Let R and \(R_i\) be as defined in Heuristic 3. Writing \(\mathbf {t} \mathop {\leftarrow }\limits ^{\$}R\) and \(\mathbf {e} ^\prime \mathop {\leftarrow }\limits ^{\$}S_m(\sqrt{m/4})\), we can rewrite the approximation given in Heuristic 3 as

$$\begin{aligned} p \approx \Pr \left[ \forall i\in \{1,\dots ,m\}: -R_i/2 \le t_i + e_i^\prime < R_i/2 \right] . \end{aligned}$$

Rescaling everything by a factor of \(1/\sqrt{m/4}\) leads to

$$\begin{aligned} p \approx \Pr \left[ \forall i\in \{1,\dots ,m\}: -r_i \le t_i + e_i^\prime < r_i \right] , \end{aligned}$$

where now \(\mathbf {t} \) is uniformly distributed in the rescaled search rectangle with edge lengths \(2r_i\), \(\mathbf {e} ^\prime \) is uniformly distributed on the unit sphere, and

$$\begin{aligned} r_i = \frac{R_i}{2\sqrt{m/4}} = \frac{\delta ^{-2(i-1)+m}q^{\frac{m-n+r}{m}}}{2\sqrt{m/4}}. \end{aligned}$$
(6)

Unfortunately, the distributions of the coordinates of \(\mathbf {e} ^\prime \) are not independent, which makes calculating p extremely complicated. In practice, however, the probability that \(e_i^\prime \in [-R_i/2, R_i/2]\) is large for all but the last few indices i. This is due to the fact that by the Geometric Series Assumption typically only the last values \(R_i\) are small. Consequently, we expect the dependence among the remaining entries not to be strong. This assumption was already made by Howgrave-Graham [25] and appears to hold for all values of \(R_i\) appearing in practice.

It is therefore reasonable to assume that

$$\begin{aligned} p \approx \prod _{i=1}^m \Pr \left[ -r_i \le t_i + e_i^\prime < r_i \mid t_i\mathop {\leftarrow }\limits ^{\$}[-r_i, r_i),\ e_i^\prime \mathop {\leftarrow }\limits ^{\$}D_m \right] , \end{aligned}$$

where \(D_m\) denotes the distribution on the interval \([-1,1]\) obtained by the following experiment: sample a vector \(\mathbf {w} \) uniformly at random on the unit sphere and then output the first (equivalently, any arbitrary but fixed) coordinate of \(\mathbf {w} \).

Next we explore the density function of \(D_m\). The probability that \(e_i^\prime \le x\) for some \(-1<x < 0\), where \(e_i^\prime \mathop {\leftarrow }\limits ^{\$}D_m\), is given by the ratio of the surface area of a hyperspherical cap of the unit sphere in \( \mathbb {R} ^m\) with height \(h = 1+x\) and the surface area of the unit sphere. This is illustrated in the full version [13] for \(m=2\). The surface area of a hyperspherical cap of the unit sphere in \( \mathbb {R} ^m\) with height \(h < 1\) is given by (see [31])

$$\begin{aligned} A_m(h) = \frac{1}{2} A_m I_{2h-h^2}\left( \frac{m-1}{2}, \frac{1}{2}\right) , \end{aligned}$$

where \(A_m = 2 \pi ^{m/2} / \Gamma (m/2)\) is the surface area of the unit sphere and

$$\begin{aligned} I_x(a,b) = \frac{\int _0^xt^{a-1}(1-t)^{b-1} dt}{B(a,b)} \end{aligned}$$

is the regularized incomplete beta function (see [38]) and B(a,b) is the Euler beta function.

Consequently, for \(-1<x < 0\), we have

$$\begin{aligned} \Pr \left[ e_i^\prime \le x \right] = \frac{A_m(1+x)}{A_m} = \frac{1}{2} I_{1-x^2}\left( \frac{m-1}{2}, \frac{1}{2}\right) . \end{aligned}$$
(7)

Together with the resulting density function of \(D_m\),

$$\begin{aligned} f_{D_m}(x) = \frac{(1-x^2)^{\frac{m-3}{2}}}{B(\frac{m-1}{2}, \frac{1}{2})} \quad \text {for } -1< x < 1, \end{aligned}$$

we can use a convolution to obtain

$$\begin{aligned} \Pr \left[ t_i + e_i^\prime \notin [-r_i, r_i) \mid t_i\mathop {\leftarrow }\limits ^{\$}[-r_i, r_i),\ e_i^\prime \mathop {\leftarrow }\limits ^{\$}D_m \right] = \frac{1}{r_iB(\frac{m-1}{2}, \frac{1}{2})}J(r_i, m). \end{aligned}$$

Since

$$\begin{aligned} p \approx \prod _{i=1}^m \left( 1-\frac{1}{r_iB(\frac{m-1}{2}, \frac{1}{2})}J(r_i, m)\right) , \end{aligned}$$

it suffices to calculate the integral

$$\begin{aligned} J(r_i, m) = \int _{-r_i-1}^{-r_i}\int _{\max (-1, z-r_i)}^{z+r_i} (1-y^2)^\frac{m-3}{2}dydz \end{aligned}$$
(8)

in order to calculate p. We calculated the integral symbolically using sage [42], which allows an efficient calculation of p.
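
Instead of symbolic integration, p can also be evaluated numerically. A scipy-based sketch (ours; Eq. (8) for J, Eq. (6) for the \(r_i\)):

```python
import math
from scipy.integrate import dblquad
from scipy.special import beta

def J(ri, m):
    """Numerical evaluation of the double integral in Eq. (8)."""
    f = lambda y, z: max(0.0, 1.0 - y * y) ** ((m - 3) / 2.0)
    val, _ = dblquad(f, -ri - 1.0, -ri,           # outer variable z
                     lambda z: max(-1.0, z - ri),  # inner bounds for y
                     lambda z: z + ri)
    return val

def admissibility_probability(delta, q, n, m, r):
    """p from Heuristic 2, with the r_i of Eq. (6) implied by the GSA."""
    Bc = beta((m - 1) / 2.0, 0.5)
    p = 1.0
    for i in range(1, m + 1):
        ri = delta ** (m - 2.0 * (i - 1)) * q ** ((m - n + r) / m) \
             / (2.0 * math.sqrt(m / 4.0))
        p *= 1.0 - J(ri, m) / (ri * Bc)
    return p
```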

Time Spent per Loop Cycle. With the estimate of the number of loops in place, the remaining task is to estimate the time spent per loop cycle. Each cycle consists of four steps:

  1. Guessing a binary vector.

  2. Running the nearest plane algorithm (twice).

  3. Calculating \(\mathcal {A}_{\mathbf {x} _1}^{(m)}\cup \mathcal {A}_{\mathbf {x} _2}^{(m)}\).

  4. Dealing with collisions in the boxes.

We assume that the runtime of one inner loop of Algorithm 1 is dominated by the runtime of the nearest plane algorithm, as argued in the following. It is well known that sampling a binary vector is extremely fast. Furthermore, note that only very few of the \(2^m\) addresses contain a vector, since filling a significant proportion of them would take exponential time. Consequently, collisions are extremely rare, and lines 8–11 of Algorithm 1 do not contribute much to the overall runtime.

An estimation by Howgrave-Graham [25] shows that for typical instances, the runtime of the nearest plane algorithm exceeds the time spent for storing the collision. We therefore omit the latter from our considerations.

Lindner and Peikert [32] estimated the time necessary to run the nearest plane algorithm to be about \(2^{-16}\) seconds, which amounts to about \(2^{15}\) bit operations on their machine. This leads to the following heuristic for the runtime of the attack.

Heuristic 4

The average number of operations per inner loop in Algorithm 1 is \(N \approx 2^{16}\).

Total Runtime. We are now able to prove our main theorem.

Proof (Theorem 1): By definition, every output of Algorithm 1 is a valid binary error vector of the given LWE with binary error instance. The rest follows directly from Lemma 4 and Heuristics 2 and 4.    \(\blacksquare \)

3.3 Minimizing the Expected Runtime

As mentioned in Remark 1, we can perform a basis reduction to obtain a lattice basis with smaller Hermite delta \(\delta \) before running the actual attack, in order to speed up the attack. We perform a binary search for the \(\delta \) at which the estimated runtimes of the basis reduction and of the actual attack are about equal. We also need to optimize r, the Meet-in-the-Middle dimension, which we do numerically, as there are only finitely many r to check. We refer the reader to the full version [13] for further details on the choice of \(\delta \) and r.
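
A rough numeric optimization can be sketched as follows (ours; it reuses ops_bkz and admissibility_probability from above and scans a grid of candidate values rather than performing a true binary search):

```python
import math

def attack_cost(n, q, m, r, delta):
    """Total cost estimate: basis reduction (Eq. (1)) plus 2^16 * L
    (Heuristics 2 and 4), with c = r/4 as in Theorem 1."""
    c = r // 4
    p = admissibility_probability(delta, q, n, m, r)
    if p <= 0.0:
        return math.inf
    L = math.comb(r, c) / math.sqrt(p * math.comb(2 * c, c))
    return ops_bkz(delta) + 2 ** 16 * L

def optimize_attack(n, q, m, deltas, rs):
    """Pick the (delta, r) pair with minimal estimated total cost."""
    return min((attack_cost(n, q, m, r, d), d, r)
               for d in deltas for r in rs if r % 4 == 0)
```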

4 Comparison

In this section we consider other approaches to solve LWE with binary error and compare these algorithms to Algorithm 1. In particular we give upper bounds for the runtimes of the algorithms. A comparison of the most practical attacks, including the hybrid attack, is given in Table 1.

Table 1. Comparison of attacks on LWE with binary error using at most \(m=2n\) samples. \(\log _2(T_\text {attack})\) denotes the number of bit operations required to perform the algorithm described in ‘attack’. For algorithms requiring lattice reduction, we choose the smaller of \(m=2n\) and the ‘optimal subdimension’ \(m=\sqrt{n\log (q)/\log (\delta )}\) [36].

Much of the analysis below is in a similar spirit to that given in the survey [5] for methods of solving standard LWE. However, we are often able to specifically adapt the analysis to the binary error case. Note that to solve LWE with binary error, in addition to algorithms for standard LWE, one may also be able to apply algorithms for the related Inhomogeneous Short Integer Solution problem. A discussion of these algorithms is given in [10].

4.1 Number of Samples

Recall that for reducing LWE with binary error to worst-case problems on lattices, one must restrict the number of samples to be \(m = n \left( 1 + \Omega (1/ \log {n}) \right) \) [35, Theorem 1.2]. On the other hand, with slightly more than linear samples, such as \(m = \mathcal {O}(n \log \log n)\), the algorithm given in [1] is subexponential. Therefore if a scheme bases its security on the hardness of LWE with binary error, it is reasonable to expect that one has only access to at most linearly many samples. We assume this is the case in our analysis below. For concreteness, we fix \(m=2n\).

4.2 Algorithms for Solving LWE

There are several approaches one could use to solve LWE or its variants (see the survey [5]). One may employ combinatorial algorithms such as the BKW [2, 11] algorithm and its variants [3, 17, 23, 27]. However, all these algorithms require far more samples than are available in the binary error case, and are therefore ruled out. We also omit a Meet-in-the-Middle attack [5] or attacks based on the algorithm of Arora and Ge [1, 6], as they will be slower than other methods. We consider them in the full version [13] for completeness.

Distinguishing Attack. One can solve LWE via a distinguishing attack as described in [32, 36]. The idea is to find a short vector \(\mathbf {v} \) in the scaled dual lattice of \(\mathbf {A} \), i.e., the lattice \(\varLambda = \{ \mathbf {w} \in \mathbb {Z} ^m \ | \ \mathbf {w} \mathbf {A} \equiv 0 \mod q \}\). Then, if the problem is to distinguish \((\mathbf {A},\mathbf {b})\) where \(\mathbf {b} \) is either formed as an LWE instance \(\mathbf {b} = \mathbf {A} \mathbf {s} + \mathbf {e} \) or is uniformly random, one can use this short vector \(\mathbf {v} \) as follows. Consider \(\left\langle {\mathbf {v}},{\mathbf {b}}\right\rangle = \left\langle {\mathbf {v}},{\mathbf {e}}\right\rangle \) if \(\mathbf {b} \) is from an LWE instance, which, as the inner product of two short vectors, is small mod q. On the other hand, if \(\mathbf {b} \) is uniform then \(\left\langle {\mathbf {v}},{\mathbf {b}}\right\rangle \) is uniform on \( \mathbb {Z}_q \), so these cases can be distinguished if \(\mathbf {v} \) is suitably small.

We determine how small a \(\mathbf {v} \) must be found as follows. Recall that our errors are chosen uniformly at random from \(\{ 0,1 \}\). So they follow a Bernoulli distribution with parameter 1/2, and have expectation 1/2 and variance 1/4. Consider the distribution of \(\left\langle {\mathbf {v}},{\mathbf {e}}\right\rangle \). Since the errors \(e_i\) are chosen independently, its expectation is \(\frac{1}{2}\sum _{i=1}^{m}v_i\) and its variance is \(\frac{1}{4}\sum _{i=1}^{m}v_i^2\). Since \(\left\langle {\mathbf {v}},{\mathbf {e}}\right\rangle \) is the sum of many independent random variables, asymptotically it follows a normal distribution with these parameters. Since the success of the distinguishing attack is determined by the variance and not the mean, and we can account for the mean, we assume it is zero. Then we can use the result of [32] to say that we can distinguish a Gaussian from uniform with advantage close to \(\exp (- \pi (\left\| {\mathbf {v}}\right\| \cdot s / q)^2)\), where s is the width parameter of the Gaussian. In our case \(s^2 = 2\pi \cdot \frac{1}{4}\), so we can distinguish with advantage close to \(\epsilon = \exp (- \pi ^2 \left\| {\mathbf {v}}\right\| ^2 / 2q^2)\). Therefore to distinguish with advantage \(\epsilon \) we require a vector \(\mathbf {v} \) of length \(\left\| {\mathbf {v}}\right\| = q \cdot \frac{\sqrt{2 \ln {(1/\epsilon )}}}{\pi }\).

We calculate a basis of the scaled dual lattice \(\varLambda \) and find a short vector \(\mathbf {v} \in \varLambda \) by lattice basis reduction. With high probability the lattice \(\varLambda \) has rank m and volume \(q^n\) [5, 36]. By definition of the Hermite delta we therefore have \(\left\| {\mathbf {v}}\right\| = \delta ^m q^{n/m}\). So the Hermite delta we require for the attack to succeed with advantage \(\epsilon \) is given by \( \delta ^m q^{n/m} = q \cdot \frac{\sqrt{2 \ln {(1/\epsilon )}}}{\pi }\). Assuming that the number of samples m is large enough to use the ‘optimal subdimension’ \(m=\sqrt{n\log (q)/\log (\delta )}\) [36], we rearrange to obtain

$$\begin{aligned} \log {\delta } = \frac{\left( \log {(q) + \log {\left( \frac{\sqrt{2 \ln {(1/\epsilon )}}}{\pi } \right) }} \right) ^2}{4n \log {(q)}}. \end{aligned}$$

To establish the estimates for the runtime of this attack given in Table 1, we assume one has to run the algorithm about \(1/\epsilon \) times to succeed, and consider \(\delta \) as a function of \(\epsilon \). The overall running time is then given by \(1/\epsilon \) multiplied by the estimated time, according to Eq. (1), to achieve \(\delta (\epsilon )\). We pick the optimal \(\epsilon \) such that this overall running time is minimized.
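
This optimization over \(\epsilon \) is mechanical; a sketch (ours, reusing ops_bkz from Sect. 2 and restricting to advantages \(\epsilon = 2^{-k}\)):

```python
import math

def delta_distinguishing(n, q, eps):
    """Required Hermite delta for advantage eps ('optimal subdimension')."""
    lq = math.log2(q)
    num = (lq + math.log2(math.sqrt(2 * math.log(1 / eps)) / math.pi)) ** 2
    return 2 ** (num / (4 * n * lq))

def distinguishing_cost(n, q, max_k=80):
    """Minimize (1/eps) * ops_bkz(delta(eps)) over eps = 2^-1, ..., 2^-max_k."""
    return min(2.0 ** k * ops_bkz(delta_distinguishing(n, q, 2.0 ** -k))
               for k in range(1, max_k + 1))
```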

It is possible that we do not have enough samples to use the ‘optimal subdimension’, in which case we use \(m = 2n\). For details, see the full version [13].

Reducing to uSVP. One may solve LWE via Kannan’s embedding technique [26], thus viewing an LWE instance as a unique shortest vector problem (uSVP) instance. This technique is used in [4, 9]. We follow the analysis in [4, 5] analogously for the LWE with binary error case and obtain that we require a Hermite delta of \( \log (\delta ) = \frac{[\log (q)-\log (2 \tau \sqrt{\pi e})]^2}{4 n \log (q)}\) for this attack to succeed. The number of operations necessary to achieve this Hermite delta is estimated using Eq. (1). A comprehensive analysis can be found in the full version [13].
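
For completeness, the corresponding delta is again a one-line computation (sketch ours; the constant \(\tau \) comes from the uSVP success condition, and the default value 0.3 is our assumption, with values around 0.3 to 0.4 appearing in the literature):

```python
import math

def delta_usvp(n, q, tau=0.3):
    """Required Hermite delta for the embedding (uSVP) attack; tau is an
    empirically determined constant (assumed here to be about 0.3)."""
    num = (math.log2(q) - math.log2(2 * tau * math.sqrt(math.pi * math.e))) ** 2
    return 2 ** (num / (4 * n * math.log2(q)))
```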

Decoding. The decoding approach for solving LWE was first described in [32] and is based on Babai’s nearest plane algorithm [7]. The aim is to recover the error vector (so seeing LWE as a Bounded Distance Decoding instance). Recall (Lemma 1) that the error vector can be recovered using Babai’s algorithm if it lies within the fundamental parallelepiped of the Gram-Schmidt basis. The idea of Lindner and Peikert in [32] is to widen the search parallelepiped to

$$\begin{aligned} \mathcal {P}_{\text {decoding}} = \{\mathbf {x} \in \mathbb {R} ^m\mid \mathbf {x} = \sum _{i=1}^m \alpha _i d_i\overline{\mathbf {b}}_i\text { for } -1/2 \le \alpha _i < 1/2\}, \end{aligned}$$

where \(d_1,\ldots ,d_m\) are integers chosen by the attacker.

Following the analysis of Lindner and Peikert, we estimate that an attack on a reduced basis with Hermite delta \(\delta \) requires about \(2^{15}\cdot \prod _{i=1}^md_i\) operations. However, the analysis of the success probability is more complicated. By definition of the search parallelepiped, the attack succeeds if (and only if) the error \(\mathbf {e} \) lies in the search rectangle \(\mathcal {P}_{\text {decoding}}\). Under the same assumption as in Sect. 3.2 (and using the same error transformation), this probability can be estimated via

$$\begin{aligned} p_\text {decoding} \approx \prod _{i=1}^m \Pr \left[ -r_i \le e_i^\prime < r_i \mid e_i^\prime \mathop {\leftarrow }\limits ^{\$}D_m \right] , \end{aligned}$$

where

$$\begin{aligned} r_i = d_i\frac{\delta ^{-2(i-1)+m}q^{\frac{m-n}{m}}}{2\sqrt{m/4}}. \end{aligned}$$

Together with Eq. (7), this leads to

$$\begin{aligned} p_\text {decoding} \approx \prod _{i=1}^m\left( 1-\frac{2}{B(\frac{m-1}{2}, \frac{1}{2})}\int _{-1}^{-r_i}(1-t^2)^{\frac{m-3}{2}}dt\right) . \end{aligned}$$

A standard way to improve the performance of the attack is to use basis reduction (like BKZ 2.0) as precomputation. Predicting the runtime of BKZ 2.0 according to Eq. (1) leads to the runtime estimation

$$\begin{aligned} T_\text {decoding} \approx \frac{2^{1.8/\log _2(\delta ) - 110}\cdot 2.3\cdot 10^9 + 2^{15} \prod _{i=1}^m d_i}{p_\text {decoding}}. \end{aligned}$$

Using the same numeric optimization techniques as presented above to minimize the expected runtime leads to the complexity estimates given in Table 1.
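
The decoding estimates in Table 1 can be reproduced along the following lines (sketch ours, reusing ops_bkz from Sect. 2; d is the list of widening factors \(d_1,\dots ,d_m\)):

```python
import math
from scipy.integrate import quad
from scipy.special import beta

def p_decoding(n, q, m, delta, d):
    """Success probability: per coordinate, the rescaled error must land
    in the widened search rectangle (cf. the displayed formula above)."""
    Bc = beta((m - 1) / 2.0, 0.5)
    p = 1.0
    for i in range(1, m + 1):
        ri = d[i - 1] * delta ** (m - 2.0 * (i - 1)) * q ** ((m - n) / m) \
             / (2.0 * math.sqrt(m / 4.0))
        if ri >= 1.0:
            continue  # this coordinate always lands inside the rectangle
        tail, _ = quad(lambda t: (1.0 - t * t) ** ((m - 3) / 2.0), -1.0, -ri)
        p *= 1.0 - 2.0 * tail / Bc
    return p

def t_decoding(n, q, m, delta, d):
    """Runtime estimate from the last displayed equation."""
    return (ops_bkz(delta) + 2 ** 15 * math.prod(d)) / p_decoding(n, q, m, delta, d)
```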