1 Introduction

Let \(D_1\) and \(D_2\) be two non-vanishing probability distributions over a common measurable support X. Let \(a \in (1,+\infty )\). The Rényi divergence [Rén61, EH12] (RD for short) \(R_a(D_1 \Vert D_2)\) of order a between \(D_1\) and \(D_2\) is defined as the (\((a-1)\)th root of the) expected value of \((D_1(x)/D_2(x))^{a-1}\) over the randomness of x sampled from \(D_1\). For notational convenience, our definition of the RD is the exponential of the classical definition [EH12]. The RD is an alternative to the statistical distance (SD for short) \(\varDelta (D_1,D_2) = \frac{1}{2}\sum _{x \in X} |D_1(x) - D_2(x)|\) as a measure of distribution closeness, where the difference in the SD is replaced by a ratio in the RD. The RD enjoys several properties that are analogous to those enjoyed by the SD, where addition in a property of the SD is replaced by multiplication in the analogous property of the RD (see Subsect. 2.3).
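For concreteness, both quantities can be written out for finite distributions; the following is a minimal sketch (ours, not from the paper), using the exponential convention above:

```python
def statistical_distance(P, Q):
    """SD: half the l1 distance between two discrete distributions (dicts)."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(x, 0.0) - Q.get(x, 0.0)) for x in support)

def renyi_divergence(P, Q, a):
    """RD of order a in (1, +oo), exponential convention:
    (sum over Supp(P) of P(x)^a / Q(x)^(a-1))^(1/(a-1)).
    Requires Supp(P) to be included in Supp(Q)."""
    s = sum(P[x] ** a / Q[x] ** (a - 1) for x in P if P[x] > 0)
    return s ** (1.0 / (a - 1))

P = {0: 0.5, 1: 0.5}
Q = {0: 0.6, 1: 0.4}
print(statistical_distance(P, Q))  # ~ 0.1
print(renyi_divergence(P, Q, 2))   # 0.25/0.6 + 0.25/0.4 ~ 1.0417
```

Note that \(R_a(P \Vert P) = 1\) in this convention, matching the log-positivity property recalled in Subsect. 2.3.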

SD is ubiquitous in cryptographic security proofs. One of its most useful properties is the so-called probability preservation property: For any measurable event \(E \subseteq X\), we have \(D_2(E) \ge D_1(E) - \varDelta (D_1,D_2)\). RD enjoys the analogous property \(D_2(E) \ge D_1(E)^{\frac{a}{a-1}}/R_{a}(D_1 \Vert D_2)\). If the event E occurs with significant probability under \(D_1\), and if the SD (resp. RD) is small, then the event E also occurs with significant probability under \(D_2\). These properties are particularly handy when the success of an attacker against a given scheme can be described as an event whose probability should be negligible, e.g., the attacker outputs a new valid message-signature pair for a signature scheme. If the attacker succeeds with good probability in the real scheme based on distribution \(D_1\), then it also succeeds with good probability in the simulated scheme (of the security proof) based on distribution \(D_2\).

To make the SD probability preservation property useful, it must be ensured that the SD \(\varDelta (D_1,D_2)\) is smaller than any \(D_1(E)\) that the security proof must handle. Typically, the quantity \(D_1(E)\) is assumed to be greater than some success probability lower bound \(\varepsilon \), which is of the order of \(1/\mathrm {poly}(\lambda )\) where \(\lambda \) refers to the security parameter, or even \(2^{-o(\lambda )}\) if the proof handles attackers whose success probabilities can be sub-exponentially small (which we believe better reflects practical objectives). As a result, the SD \(\varDelta (D_1,D_2)\) must be \(<\varepsilon \) for the SD probability preservation property to be relevant. Similarly, the RD probability preservation property is non-vacuous when the RD \(R_{a}(D_1 \Vert D_2)\) is \(\le \mathrm {poly}(1/\varepsilon )\). In many cases, the latter seems less demanding than the former: in all our applications of RD, the RD between \(D_1\) and \(D_2\) is small while their SD is too large for the SD probability preservation to be applicable. In fact, as we will see in Subsect. 2.3, the RD becomes sufficiently small to be useful before the SD when \(\mathrm {sup}_x D_1(x)/D_2(x)\) tends to 1. This explains the superiority of the RD in several of our applications.

Although RD seems more amenable than SD for search problems, it seems less so for distinguishing problems. A typical cryptographic example is semantic security of an encryption scheme. Semantic security requires an adversary \(\mathcal {A}\) to distinguish between the encryption distributions of two plaintext messages of its choosing: the distinguishing advantage \(\text{ Adv }_{\mathcal {A}}(D_1,D_2)\), defined as the difference of probabilities that \(\mathcal {A}\) outputs 1 using \(D_1\) or \(D_2\), should be large. In security proofs, algorithm \(\mathcal {A}\) is often called on distributions \(D_1'\) and \(D_2'\) that are close to \(D_1\) and \(D_2\) (respectively). If the SDs between \(D_1\) and \(D_1'\) and \(D_2\) and \(D_2'\) are both bounded from above by \(\varepsilon \), then, by the SD probability preservation property (used twice), we have \(\text{ Adv }_{\mathcal {A}}(D_1',D_2') \ge \text{ Adv }_{\mathcal {A}}(D_1,D_2) - 2\varepsilon \). As a result, SD can be applied to distinguishing problems in a similar fashion to search problems. The multiplicativity of the RD probability preservation property seems to prevent RD from being applicable to distinguishing problems.

We replace the statistical distance by the Rényi divergence in several security proofs for lattice-based cryptographic primitives. Lattice-based cryptography is a relatively recent cryptographic paradigm in which cryptographic primitives are proven at least as secure as standard problems over lattices are hard to solve (see the survey [MR09]). Security proofs in lattice-based cryptography involve different types of distributions, often over infinite sets, such as continuous Gaussian distributions and Gaussian distributions with lattice supports. The RD seems particularly well suited to quantifying the closeness of Gaussian distributions. Consider for example two continuous distributions over the reals, both with standard deviation 1, but one with center 0 and the other with center c. Their SD is linear in c, so that c must remain extremely small for the SD probability preservation property to be useful. On the other hand, their RD of order \(a=2\) is bounded as \(\exp (O(c^2))\), so that the RD preservation property remains useful even for slightly growing c.
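This gap can be checked numerically. The sketch below (ours; the grid quadrature is crude but sufficient) works with the densities \(x \mapsto \exp (-\pi (x-c)^2)\):

```python
import math

def _grid(lo=-8.0, hi=8.0, n=100001):
    # uniform quadrature grid; tails beyond |x| = 8 are negligible here
    h = (hi - lo) / (n - 1)
    return [lo + i * h for i in range(n)], h

def sd_gauss(c):
    # SD between the densities exp(-pi x^2) and exp(-pi (x - c)^2)  (s = 1)
    xs, h = _grid()
    return 0.5 * h * sum(abs(math.exp(-math.pi * x * x)
                             - math.exp(-math.pi * (x - c) ** 2)) for x in xs)

def rd2_gauss(c):
    # RD of order 2: integral of D1(x)^2 / D2(x); closed form is exp(2 pi c^2)
    xs, h = _grid()
    return h * sum(math.exp(-2.0 * math.pi * x * x
                            + math.pi * (x - c) ** 2) for x in xs)
```

For instance, for \(c = 0.1\), `sd_gauss` returns approximately 0.0997 (close to c itself), while `rd2_gauss` returns approximately \(\exp (0.02\pi ) \approx 1.065\).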

RD was first used in lattice-based cryptography by [LPR13], in the decision to search reduction for the Ring Learning With Errors problem (which serves as a security foundation for many asymptotically fast primitives). It was then exploited in [LSS14] to decrease the parameters of the Garg et al. (approximation to) cryptographic multilinear maps [GGH13]. In the present work, we present a more extensive study of the power of RD in lattice-based cryptography, by showing several independent applications of RD. In some cases, it leads to security proofs that allow smaller parameters to be taken in the cryptographic schemes, hence yielding efficiency improvements. In other cases, it leads to alternative security proofs that are conceptually simpler.

Our applications of RD also include distinguishing problems. To circumvent the aforementioned a priori limitation of the RD probability preservation property for distinguishing problems, we propose an alternative approach that handles a class of distinguishing problems enjoying a special property that we call public sampleability. Public sampleability allows success probabilities to be estimated via Hoeffding’s bound.

The applications we show in lattice-based cryptography are as follows:

  • Smaller signatures for the Hash-and-Sign GPV signature scheme [GPV08].

  • Smaller storage requirement for the Fiat-Shamir BLISS signature scheme [DDLL13, PDG14, Duc14].

  • Alternative proof that the Learning With Errors (LWE) problem with noise chosen uniformly in an interval is no easier than the Learning With Errors problem with Gaussian noise [DMQ13]. Our reduction does not require the latter problem to be hard, and it is hence marginally more general as it also applies to distributions with smaller noises. Further, our reduction preserves the LWE dimension n, and is hence tighter than the one from [DMQ13] (the latter degrades the LWE dimension by a constant factor).

  • Smaller parameters in the dual-Regev encryption scheme from [GPV08].

We think RD is likely to have further applications in lattice-based cryptography, for search and for distinguishing problems.

Related Works. The framework for using RD in distinguishing problems was used in [LPSS14], in the context of the k-LWE problem (a variant of LWE in which the attacker is given extra information). In [PDG14], Pöppelmann, Ducas and Güneysu used the Kullback-Leibler divergence (which is the RD of order \(a=1\)) to lower the storage requirement of [DDLL13]. Asymptotically, using the Kullback-Leibler divergence rather than SD only leads to a constant factor improvement. Our approach allows bigger savings in the case where the number of signature queries is limited, as explained in Sect. 3.

Very recently, Bogdanov et al. [BGM+15] adapted parts of our RD-based hardness proof for \(\mathrm {LWE}\) with noise uniform in a small interval, to the Learning With Rounding problem. In particular, they obtained a substantial improvement over the hardness results of [BPR12, AKPW13].

Road-Map. In Sect. 2, we provide necessary background on lattice-based cryptography, and on the Rényi divergence. In Sect. 3, we use RD to improve some lattice-based signature scheme parameters. Section 4 contains the description of the framework in which we can use RD for distinguishing problems, which we apply to the dual-Regev encryption scheme. In Sect. 5, we describe an alternative hardness proof for LWE with noise uniformly chosen in an interval.

Notations. If x is a real number, we let \(\lfloor x \rceil \) denote a closest integer to x. The notation \(\ln \) refers to the natural logarithm and the notation \(\log \) refers to the base 2 logarithm. We define \(\mathbb {T}=([0,1],+)\). For an integer q, we let \(\mathbb {Z}_q\) denote the ring of integers modulo q. We let \(\mathbb {T}_q\) denote the group \(\mathbb {T}_q = \{i/q \mod 1: i \in \mathbb {Z}\} \subseteq \mathbb {T}\). Vectors are denoted in bold. If \({\varvec{b}}\) is a vector in \(\mathbb {R}^d\), we let \(\Vert {\varvec{b}}\Vert \) denote its Euclidean norm. By default, all our vectors are column vectors.

If D is a probability distribution, we let \(\mathrm {Supp}(D) = \{x: D(x) \ne 0\}\) denote its support. For a set X of finite weight, we let U(X) denote the uniform distribution on X. The statistical distance between two distributions \(D_1\) and \(D_2\) over a countable support X is \(\varDelta (D_1, D_2)= \frac{1}{2} \sum _{x \in X} |D_1(x)-D_2(x)|\). This definition is extended in the natural way to continuous distributions. If \(f:X \rightarrow \mathbb {R}\) takes non-negative values, then for all countable \(Y \subseteq X\), we define \(f(Y) = \sum _{y\in Y} f(y) \in [0,+\infty ]\). For any vector \({\varvec{c}} \in \mathbb {R}^n\) and any real \(s>0\), the (spherical) Gaussian function with standard deviation parameter s and center \({\varvec{c}}\) is defined as follows: \(\forall {\varvec{x}} \in \mathbb {R}^n, \rho _{s,{\varvec{c}}}({\varvec{x}})= \exp ( -\pi \Vert {\varvec{x}}-{\varvec{c}} \Vert ^2/s^2)\). The Gaussian distribution is \(D_{s,{\varvec{c}}} = \rho _{s,{\varvec{c}}} / s^n\). When \({\varvec{c}}=\mathbf{0}\), we may omit the subscript \({\varvec{c}}\).

We use the usual Landau notations. A function \(f(\lambda )\) is said to be negligible if it is \(\lambda ^{-\omega (1)}\). A probability \(p(\lambda )\) is said to be overwhelming if it is \(1-\lambda ^{-\omega (1)}\).

The distinguishing advantage of an algorithm \(\mathcal {A}\) between two distributions \(D_0\) and \(D_1\) is defined as \(\mathrm {Adv}_{\mathcal {A}}(D_0,D_1) = |\Pr _{x\hookleftarrow D_0}[\mathcal {A}(x) = 1] - \Pr _{x\hookleftarrow D_1}[\mathcal {A}(x) = 1]|\), where the probabilities are taken over the randomness of the input x and the internal randomness of \(\mathcal {A}\). Algorithm \(\mathcal {A}\) is said to be an \((\varepsilon ,T)\)-distinguisher if it runs in time \(\le T\) and if \(\mathrm {Adv}_{\mathcal {A}}(D_0,D_1) \ge \varepsilon \).

2 Preliminaries

We assume the reader is familiar with standard cryptographic notions, as well as with lattices and lattice-based cryptography. We refer to [Reg09a, MR09] for introductions on the latter topic.

2.1 Lattices

A (full-rank) n-dimensional Euclidean lattice \(\varLambda \subseteq \mathbb {R}^n\) is the set of all integer linear combinations \(\sum _{i=1}^{n} x_{i} {\varvec{b}}_{i}\) of some \(\mathbb {R}\)-basis \(({\varvec{b}}_{i})_{1 \le i \le n}\) of \(\mathbb {R}^n\). In this setup, the tuple \(({\varvec{b}}_i)_i\) is said to form a \(\mathbb {Z}\)-basis of \(\varLambda \). For a lattice \(\varLambda \) and any \(i\le n\), the ith successive minimum \(\lambda _i(\varLambda )\) is the smallest radius r such that \(\varLambda \) contains i linearly independent vectors of norm at most r. The dual \(\varLambda ^*\) of a lattice \(\varLambda \) is defined as \(\varLambda ^* = \{{\varvec{y}} \in \mathbb {R}^n: {\varvec{y}}^t \varLambda \subseteq \mathbb {Z}\}\).

The (spherical) discrete Gaussian distribution over a lattice \(\varLambda \subseteq \mathbb {R}^n\), with standard deviation parameter \(s>0\) and center \({\varvec{c}}\) is defined as:

$$\begin{aligned} \forall {\varvec{x}} \in \varLambda , \ D_{\varLambda , s, {\varvec{c}}}({\varvec{x}}) = \frac{\rho _{s,{\varvec{c}}}({\varvec{x}})}{\rho _{s,{\varvec{c}}}(\varLambda )}. \end{aligned}$$

When the center is \(\mathbf{0}\), we omit the subscript \({\varvec{c}}\).

The smoothing parameter [MR07] of an n-dimensional lattice \(\varLambda \) with respect to \(\varepsilon > 0\), denoted by \(\eta _{\varepsilon }(\varLambda )\), is the smallest \(s>0\) such that \(\rho _{1/s}(\varLambda ^{*}\setminus \{0\}) \le \varepsilon \). We use the following properties.

Lemma 2.1

([MR07, Lemma 3.3]). Let \(\varLambda \) be an n-dimensional lattice and \(\varepsilon > 0\). Then

$$\begin{aligned} \eta _{\varepsilon }(\varLambda ) \le \sqrt{\frac{\ln (2n(1+1/\varepsilon ))}{\pi }} \cdot \lambda _n(\varLambda ). \end{aligned}$$
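As a quick sanity check of Lemma 2.1 in dimension \(n = 1\) (where \(\varLambda = \mathbb {Z}\) is self-dual and \(\lambda _1(\mathbb {Z}) = 1\)), one can compare the bound with a numerically computed smoothing parameter. The sketch below is ours:

```python
import math

def rho_dual_tail(s, kmax=200):
    # rho_{1/s}(Z \ {0}) = 2 * sum_{k >= 1} exp(-pi (s k)^2); Z is self-dual
    return 2.0 * sum(math.exp(-math.pi * (s * k) ** 2) for k in range(1, kmax + 1))

def smoothing_Z(eps, lo=0.01, hi=20.0, iters=60):
    # binary search for the smallest s > 0 with rho_{1/s}(Z \ {0}) <= eps
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho_dual_tail(mid) <= eps:
            hi = mid
        else:
            lo = mid
    return hi

eps = 2.0 ** -10
lemma_bound = math.sqrt(math.log(2 * 1 * (1 + 1 / eps)) / math.pi)  # Lemma 2.1, n = 1
assert smoothing_Z(eps) <= lemma_bound
```

For this choice of \(\varepsilon \), the lemma is in fact nearly tight: both quantities are close to 1.56.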

Lemma 2.2

([GPV08, Lemma 5.3]). Let \(m, n \ge 1\) and q a prime integer, with \(m \ge 2n \ln q\). For \(A \in \mathbb {Z}_q^{n \times m}\) we define \(A^{\perp }\) as the lattice \(\{{\varvec{x}} \in \mathbb {Z}^m: A {\varvec{x}} = \mathbf{0} \mod q\}\). Then,

$$\begin{aligned} \forall \varepsilon <1/2: \ \mathop {\Pr }\limits _{A \hookleftarrow U(\mathbb {Z}_q^{n \times m})} \left[ \eta _{\varepsilon }(A^\perp ) \ge 4\sqrt{\frac{\ln (4m / \varepsilon )}{\pi }} \right] \le q^{-n}. \end{aligned}$$

Lemma 2.3

([GPV08, Corollary 2.8]). Let \(\varLambda ,\varLambda '\) be n-dimensional lattices with \(\varLambda ' \subseteq \varLambda \) and \(\varepsilon \in (0,1/2)\). Then for any \({\varvec{c}} \in \mathbb {R}^n\) and \(s \ge \eta _{\varepsilon }(\varLambda ')\) and any \(x \in \varLambda /\varLambda '\) we have

$$\begin{aligned} (D_{\varLambda ,s,{\varvec{c}}} \bmod \varLambda ')(x) \in \left[ \frac{1-\varepsilon }{1+\varepsilon }, \frac{1+\varepsilon }{1-\varepsilon }\right] \cdot \frac{\det (\varLambda )}{\det (\varLambda ')}. \end{aligned}$$

2.2 The SIS and LWE Problems

The Small Integer Solution (SIS) problem was introduced by Ajtai in [Ajt96]. It serves as a security foundation for numerous cryptographic primitives, including, among many others, hash functions [Ajt96] and signatures [GPV08, DDLL13].

Definition 2.4

Let \(m \ge n \ge 1\) and \(q \ge 2\) be integers, and \(\beta \) a positive real. The \(\mathrm {SIS}_{n,m,q,\beta }\) problem is as follows: given \(A \hookleftarrow U(\mathbb {Z}_q^{n \times m})\), the goal is to find \({\varvec{x}} \in \mathbb {Z}^m\) such that \(A {\varvec{x}} = \mathbf{0} \mod q\) and \(0 < \Vert {\varvec{x}}\Vert \le \beta \).

The SIS problem was proven by Ajtai [Ajt96] to be at least as hard as some standard worst-case problems over Euclidean lattices, under specific parameter constraints. We refer to [GPV08] for an improved (and simplified) reduction.

The Learning With Errors (LWE) problem was introduced in 2005 by Regev [Reg05, Reg09b]. LWE is also extensively used as a security foundation, for encryption schemes [Reg09b, GPV08], fully homomorphic encryption schemes [BV11], and pseudo-random functions [BPR12, AKPW13], among many others. Its definition involves the following distribution. Let \(\chi \) be a distribution over \(\mathbb {T}\), \(q \ge 2\), \(n\ge 1\) and \({\varvec{s}} \in \mathbb {Z}_q^n\). A sample from \(A_{{\varvec{s}},\chi }\) is of the form \(({\varvec{a}}, b) \in \mathbb {Z}_q^n \times \mathbb {T}\), with \({\varvec{a}} \hookleftarrow U(\mathbb {Z}_q^n)\), \(b = \frac{1}{q} \langle {\varvec{a}} , {\varvec{s}} \rangle +e\) and \(e \hookleftarrow \chi \).

Definition 2.5

Let \(\chi \) be a distribution over \(\mathbb {T}\), \(q \ge 2\), and \(m \ge n\ge 1\). The search variant \(\mathrm {sLWE}_{n,q,\chi ,m}\) of the \(\mathrm {LWE}\) problem is as follows: given m samples from \(A_{{\varvec{s}},\chi }\) for some \({\varvec{s}} \in \mathbb {Z}_q^n\), the goal is to find \({\varvec{s}}\). The decision variant \(\mathrm {LWE}_{n,q,\chi ,m}\) consists in distinguishing between the distributions \((A_{{\varvec{s}},\chi })^m\) and \(U(\mathbb {Z}_q^n \times \mathbb {T})^m\), where \({\varvec{s}} \hookleftarrow U(\mathbb {Z}_q^n)\).

In some cases, it is convenient to use an error distribution \(\chi \) whose support is \(\mathbb {T}_q\). In these cases, the definition of LWE is adapted in that \(U(\mathbb {Z}_q^n \times \mathbb {T})\) is replaced by \(U(\mathbb {Z}_q^n \times \mathbb {T}_q)\). Note also that for a fixed number of samples m, we can represent the LWE samples using matrices. The \({\varvec{a}}_i\)’s form the rows of a matrix A uniform in \(\mathbb {Z}_q^{m \times n}\), and the scalar products are represented by the matrix-vector product \(A{\varvec{s}}\).
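The distribution \(A_{{\varvec{s}},\chi }\) can be sketched in a few lines; the following toy generator is ours, with an illustrative continuous error distribution:

```python
import random

def lwe_samples(n, q, m, secret, chi):
    """m samples (a, b) from A_{s,chi}: a uniform in Z_q^n,
    b = <a, s>/q + e mod 1, with e drawn from chi."""
    out = []
    for _ in range(m):
        a = [random.randrange(q) for _ in range(n)]
        b = (sum(ai * si for ai, si in zip(a, secret)) / q + chi()) % 1.0
        out.append((a, b))
    return out

random.seed(0)
n, q, m = 4, 97, 8
s = [random.randrange(q) for _ in range(n)]
chi = lambda: random.gauss(0.0, 0.005)  # small continuous error over T (illustrative)
samples = lwe_samples(n, q, m, s, chi)
```

Each second component then lies in \(\mathbb {T}\) and deviates from \(\frac{1}{q}\langle {\varvec{a}}, {\varvec{s}}\rangle \bmod 1\) only by the small error term.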

Regev [Reg09b] gave a quantum reduction from standard worst-case problems over Euclidean lattices to \(\mathrm {sLWE}\) and \(\mathrm {LWE}\), under specific parameter constraints. Classical (but weaker) reductions have later been obtained (see [Pei09, BLP+13]). We will use the following sample-preserving search to decision reduction for LWE.

Theorem 2.6

([MM11, Proposition 4.10]). If \(q \le \mathrm {poly}(m,n)\) is prime and the error distribution \(\chi \) has support in \(\mathbb {T}_q\), then there exists a reduction from \(\mathrm {sLWE}_{n,q,\chi ,m}\) to \(\mathrm {LWE}_{n,q,\chi ,m}\) that is polynomial in n and m.

2.3 The Rényi Divergence

For any two discrete probability distributions P and Q such that \(\mathrm {Supp}(P) \subseteq \mathrm {Supp}(Q)\) and \(a \in (1,+\infty )\), we define the Rényi divergence of order a by

$$\begin{aligned} R_{a}(P \Vert Q) = \left( \sum _{x \in \mathrm {Supp}(P)} \frac{P(x)^{a}}{Q(x)^{a-1}}\right) ^{\frac{1}{a-1}}\!. \end{aligned}$$

We omit the a subscript when \(a=2\). We define the Rényi divergences of orders 1 and \(+\infty \) by

$$\begin{aligned} R_1(P \Vert Q) = \exp \left( \sum _{x \in \mathrm {Supp}(P)} P(x) \ln \frac{P(x)}{Q(x)}\right) \ \text{ and } \ R_{\infty }(P \Vert Q) = \max _{x \in \mathrm {Supp}(P)} \frac{P(x)}{Q(x)}. \end{aligned}$$

The definitions are extended in the natural way to continuous distributions. The divergence \(R_1\) is the exponential of the Kullback-Leibler divergence.

For any fixed PQ, the function \(a \mapsto R_{a}(P\Vert Q) \in (0, +\infty ]\) is non-decreasing, continuous over \((1, +\infty )\), tends to \(R_{\infty }(P \Vert Q)\) when a grows to infinity, and if \(R_{a}(P\Vert Q)\) is finite for some a, then \(R_{a}(P \Vert Q)\) tends to \(R_1(P \Vert Q)\) when a tends to 1 (we refer to [EH12] for proofs). A direct consequence is that if \(P(x)/Q(x) \le c\) for all \(x \in \mathrm {Supp}(P)\) and for some constant c, then \(R_{a}(P \Vert Q) \le R_{\infty }(P \Vert Q) \le c\). In the same setup, we have \(\varDelta (P,Q) \le c/2\).

The following properties can be considered the multiplicative analogues of those of the SD. We refer to [EH12, LSS14] for proofs.

Lemma 2.7

Let \(a \in [1, +\infty ]\). Let P and Q denote distributions with \(\mathrm {Supp}(P) \subseteq \mathrm {Supp}(Q)\). Then the following properties hold:

  • Log. Positivity: \(R_{a}(P \Vert Q) \ge R_{a}(P \Vert P) = 1\).

  • Data Processing Inequality: \(R_{a}(P^f \Vert Q^f) \le R_{a}(P \Vert Q)\) for any function f, where \(P^f\) (resp. \(Q^f\)) denotes the distribution of f(y) induced by sampling \(y \hookleftarrow P\) (resp. \(y \hookleftarrow Q\)).

  • Multiplicativity: Assume P and Q are two distributions of a pair of random variables \((Y_1,Y_2)\). For \(i \in \{1,2\}\), let \(P_i\) (resp. \(Q_i\)) denote the marginal distribution of \(Y_i\) under P (resp. Q), and let \(P_{2|1}(\cdot |y_1)\) (resp. \(Q_{2|1}(\cdot |y_1)\)) denote the conditional distribution of \(Y_2\) given that \(Y_1=y_1\). Then we have:

    • \(R_{a}(P \Vert Q) = R_{a}(P_1 \Vert Q_1) \cdot R_{a}(P_2 \Vert Q_2)\) if \(Y_1\) and \(Y_2\) are independent.

    • \(R_{a}(P \Vert Q) \le R_{\infty }(P_1 \Vert Q_1) \cdot \max _{y_1 \in X} R_{a}(P_{2|1}(\cdot |y_1) \Vert Q_{2|1}(\cdot |y_1))\).

  • Probability Preservation: Let \(A \subseteq \mathrm {Supp}(Q)\) be an arbitrary event. If \(a \in (1, +\infty )\), then \(Q(A) \ge P(A)^{\frac{a}{a-1}}/R_{a}(P \Vert Q)\). Further, we have

    $$\begin{aligned} Q(A) \ge P(A)/R_{\infty }(P \Vert Q). \end{aligned}$$

Let \(P_1,P_2,P_3\) be three distributions with \(\mathrm {Supp}(P_1) \subseteq \mathrm {Supp}(P_2) \subseteq \mathrm {Supp}(P_3)\). Then we have:

  • Weak Triangle Inequality:

    $$\begin{aligned} R_{a}(P_1 \Vert P_3) \le \left\{ \begin{array}{rl} R_{a}(P_1 \Vert P_2) \cdot R_{\infty }(P_2 \Vert P_3), \\ R_{\infty }(P_1 \Vert P_2)^{\frac{a}{a-1}} \cdot R_{a}(P_2 \Vert P_3) &{} \ \text{ if } a \in (1,+\infty ). \end{array} \right. \end{aligned}$$

Getting back to the setup in which \(P(x)/Q(x) \le c\) for all \(x \in \mathrm {Supp}(P)\) and for some constant c, the RD probability preservation property above is relevant even for large c, whereas the analogous SD probability preservation property starts making sense only when \(c < 2\).

Pinsker’s inequality is the analogue of the probability preservation property for \(a=1\): for an arbitrary event \(A \subseteq \mathrm {Supp}(Q)\), we have \(Q(A) \ge P(A) - \sqrt{\ln R_1(P \Vert Q)/2}\) (see [PDG14, Lemma 1] for a proof). Analogously to the statistical distance, this probability preservation property is useful for unlikely events A only if \(\ln R_1 (P \Vert Q)\) is very small. We refer to Subsect. 3.1 for additional comments on this property.
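Both preservation properties are easy to verify on a toy example. The following sketch is ours, with arbitrary small distributions; it checks the RD property for \(a = 2\) and Pinsker's inequality:

```python
import math

def renyi(P, Q, a):
    # RD of order a > 1, exponential convention
    return sum(P[x] ** a / Q[x] ** (a - 1) for x in P) ** (1.0 / (a - 1))

def r1(P, Q):
    # R_1 = exponential of the Kullback-Leibler divergence
    return math.exp(sum(p * math.log(p / Q[x]) for x, p in P.items() if p > 0))

P = {0: 0.70, 1: 0.25, 2: 0.05}
Q = {0: 0.60, 1: 0.30, 2: 0.10}
event = {2}  # an unlikely event A
PA = sum(P[x] for x in event)
QA = sum(Q[x] for x in event)
a = 2.0
assert QA >= PA ** (a / (a - 1)) / renyi(P, Q, a)      # RD probability preservation
assert QA >= PA - math.sqrt(math.log(r1(P, Q)) / 2.0)  # Pinsker's inequality
```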

2.4 RD Bounds

We will use the following result, adapted from [LSS14].

Lemma 2.8

For any n-dimensional lattice \(\varLambda \subseteq \mathbb {R}^n\) and \(s>0\), let P be the distribution \(D_{\varLambda ,s,{\varvec{c}}}\) and Q be the distribution \(D_{\varLambda ,s,{\varvec{c}}'}\) for some fixed \({\varvec{c}},{\varvec{c}}' \in \mathbb {R}^n\). If \({\varvec{c}}, {\varvec{c}}' \in \varLambda \), let \(\varepsilon =0\). Otherwise, fix \(\varepsilon \in (0,1)\) and assume that \(s \ge \eta _{\varepsilon }(\varLambda )\). Then, for any \(a \in (1,+\infty )\):

$$\begin{aligned} R_{a}(P \Vert Q) \in \left[ \left( \frac{1-\varepsilon }{1+\varepsilon }\right) ^{\frac{2}{a-1}},\left( \frac{1+\varepsilon }{1-\varepsilon }\right) ^{\frac{2}{a-1}} \right] \cdot \exp \left( a \pi \frac{\Vert {\varvec{c}}-{\varvec{c}}'\Vert ^2}{s^2} \right) . \end{aligned}$$

It may be checked that \(R_1 (P\Vert Q)\) is also of the order of \(\exp (\Vert {\varvec{c}}'-{\varvec{c}}\Vert ^2/s^2)\), that \(R_{\infty } (P \Vert Q) = +\infty \), and that \(\varDelta (P,Q)\) is of the order of \(\Vert {\varvec{c}}'-{\varvec{c}}\Vert /s\). In that setup, the RD of order \(a = \infty \) is useless, and the probability preservation properties of the SD and of the RD of order \(a = 1\) lead to interesting bounds only for events occurring when \(\Vert {\varvec{c}}'-{\varvec{c}}\Vert /s = o(\varepsilon )\). In contrast, for any \(a \in (1, + \infty )\), the probability preservation property of the RD may be used with \(\Vert {\varvec{c}}'-{\varvec{c}}\Vert /s = O(\sqrt{\log (1/\varepsilon )})\) while still leading to probabilistic lower bounds of the order of \(\varepsilon ^{O(1)}\).
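As a sanity check of Lemma 2.8 for \(\varLambda = \mathbb {Z}\), one can compare the numerically computed RD with the central term \(\exp (a\pi \Vert {\varvec{c}}-{\varvec{c}}'\Vert ^2/s^2)\). The sketch below is ours, with the support truncated to a window where the tail is negligible:

```python
import math

def disc_gauss_Z(s, c, kmax=40):
    """D_{Z,s,c} truncated to [-kmax, kmax] (the tail is negligible here)."""
    w = {x: math.exp(-math.pi * (x - c) ** 2 / s ** 2)
         for x in range(-kmax, kmax + 1)}
    norm = sum(w.values())
    return {x: v / norm for x, v in w.items()}

def renyi(P, Q, a):
    return sum(P[x] ** a / Q[x] ** (a - 1) for x in P) ** (1.0 / (a - 1))

s, c, a = 4.0, 1.5, 2.0
P, Q = disc_gauss_Z(s, 0.0), disc_gauss_Z(s, c)
rd = renyi(P, Q, a)
central = math.exp(a * math.pi * c * c / (s * s))
# for s = 4, well above the smoothing parameter of Z, rd matches central closely
```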

As we have already seen, if two distributions are close in a uniform sense, then their RD is small. We observe the following immediate consequence of Lemma 2.3, which allows replacing the SD with the RD in the context of smoothing arguments, in order to save on the required parameter s. In applications of Lemma 2.3, it is customary to use \(s \ge \eta _{\varepsilon }(\varLambda ')\) with \(\varepsilon \le 2^{-\lambda }\), in order to make the distribution \(D_{\varLambda /\varLambda ',s,{\varvec{c}}} = D_{\varLambda ,s,{\varvec{c}}} \mod \varLambda '\) within SD \(2^{-\lambda }\) of the uniform distribution \(U(\varLambda /\varLambda ')\). This translates via Lemma 2.1 to using \(s =\varOmega (\sqrt{\lambda + \log n} \cdot \lambda _n(\varLambda '))\). If instead an RD bound \(R_{\infty }(D_{\varLambda /\varLambda ',s,{\varvec{c}}} \Vert U_{\varLambda /\varLambda '}) = O(1)\) suffices for the application, one can take \(\varepsilon = O(1)\) in the lemma below, which translates to just \(s =\varOmega (\sqrt{\log n} \cdot \lambda _n(\varLambda '))\), saving a factor of \(\varTheta (\sqrt{\lambda })\).

Lemma 2.9

Let \(\varLambda ,\varLambda '\) be n-dimensional lattices with \(\varLambda ' \subseteq \varLambda \) and \(\varepsilon \in (0,1/2)\). Let \(D_{\varLambda /\varLambda ',s,c}\) denote the distribution on \(\varLambda /\varLambda '\) induced by sampling from \(D_{\varLambda ,s,c}\) and reducing modulo \(\varLambda '\), and let \(U_{\varLambda /\varLambda '}\) denote the uniform distribution on \(\varLambda /\varLambda '\). Then for any \(c \in \mathbb {R}^n\) and \(s \ge \eta _{\varepsilon }(\varLambda ')\) and any \(x \in \varLambda /\varLambda '\) we have

$$\begin{aligned} R_{\infty }(D_{\varLambda /\varLambda ',s,c} \Vert U_{\varLambda /\varLambda '}) \le \frac{1+\varepsilon }{1-\varepsilon }. \end{aligned}$$
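Lemma 2.9 can be checked numerically for \(\varLambda = \mathbb {Z}\) and \(\varLambda ' = q\mathbb {Z}\). The following sketch is ours, with illustrative parameters:

```python
import math

def gauss_mod_q(s, c, q, kmax=400):
    """Distribution of (x mod q) for x ~ D_{Z,s,c}, support truncated."""
    w = [0.0] * q
    for x in range(-kmax, kmax + 1):
        w[x % q] += math.exp(-math.pi * (x - c) ** 2 / s ** 2)
    norm = sum(w)
    return [v / norm for v in w]

q, s, c = 8, 16.0, 0.3
P = gauss_mod_q(s, c, q)
r_inf = q * max(P)  # R_inf against U(Z/qZ), whose point mass is 1/q
# an eps for which s >= eta_eps(qZ): rho_{1/(s/q)}... tail on the dual side
eps = 2.0 * sum(math.exp(-math.pi * (s / q) ** 2 * k * k) for k in range(1, 50))
assert r_inf <= (1.0 + eps) / (1.0 - eps)
```

Here \(s/q = 2\), so \(\varepsilon \approx 2\exp (-4\pi )\) is tiny and the folded distribution is extremely close to uniform.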

3 Application to Lattice-Based Signature Schemes

In this section, we use the RD to improve the security proofs of the GPV and BLISS signature schemes [GPV08, DDLL13], allowing smaller parameters to be taken for any fixed security level.

3.1 Sampling Discrete Gaussians and the BLISS Signature Scheme

We show that the use of RD in place of SD leads to significant savings in the required precision of integers sampled according to a discrete Gaussian distribution in the security analysis of lattice-based signature schemes. These savings consequently lower the precomputed table storage for sampling discrete Gaussians with the method described in [DDLL13, PDG14]. In Table 1, we provide a numerical comparison of RD and SD based on an instantiation of BLISS-I.

Discrete Gaussian Sampling. In the BLISS signature scheme [DDLL13] (and similarly in earlier variants [Lyu12]), each signature requires the signing algorithm to sample O(n) independent integers from the 1-dimensional discrete Gaussian distribution \(D_{\mathbb {Z},s}\), where \(s = O(m)\) is the standard deviation parameter (here the variable m denotes a parameter related to the underlying lattice dimension, and is typically of the order of several hundred).

In [DDLL13], a particularly efficient sampling algorithm for \(D_{\mathbb {Z}, s}\) is presented. To produce a sample from \(D_{\mathbb {Z}, s}\), this algorithm samples about \(\ell = \lceil \log (0.22s^2(1+2\tau s))\rceil \) Bernoulli random variables of the form \(B_{\exp (-\pi 2^i/s^2)}\), where \(i=0,\ldots ,\ell -1\) and \(\tau = O(\sqrt{\lambda })\) is the tail-cut factor for the Gaussian. To sample those Bernoulli random variables, the authors of [DDLL13] use a precomputed table of the probabilities \(c_i = \exp (-\pi 2^i/s^2)\), for \(i=0,\ldots ,\ell -1\). Since these probabilities are real numbers, they must be truncated to some bit precision p in the precomputed table, so that truncated values \(\tilde{c}_i = c_i + \varepsilon _i\) are stored, where \(|\varepsilon _i| \le 2^{-p} c_i\) are the truncation errors.
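For concreteness, \(\ell \) and the table probabilities can be computed directly from these formulas. The sketch below is ours; the values \(s = 215\) and \(\tau = 13\) are illustrative rather than a prescribed BLISS parameter set:

```python
import math

def bernoulli_table(s, tau):
    """ell = ceil(log2(0.22 s^2 (1 + 2 tau s))) Bernoulli parameters,
    with exact probabilities c_i = exp(-pi 2^i / s^2), i = 0..ell-1."""
    ell = math.ceil(math.log2(0.22 * s * s * (1.0 + 2.0 * tau * s)))
    return ell, [math.exp(-math.pi * 2 ** i / s ** 2) for i in range(ell)]

ell, cs = bernoulli_table(215.0, 13.0)
# a p-bit truncation of each c_i then gives a table of about p * ell bits
```

For these values, \(\ell = 26\), so the table size is driven almost entirely by the choice of the precision p analyzed below.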

In previous works, the precision was determined by an analysis either based on the statistical distance (SD) [DDLL13] or the Kullback-Leibler divergence (KLD) [PDG14]. In this section, we review and complete these methods, and we propose an RD-based analysis that leads to bigger savings, asymptotically and in practice (see Table 1). More precisely, we give sufficient lower bounds on the precision p to ensure the security of the scheme implemented with truncated values against adversaries succeeding with probability \(\ge \varepsilon \) and making \(\le q_s\) signing queries. For any adversary, \(\varPhi '\) and \(\varPhi \) denote the distributions of signatures in the adversary’s view in the untruncated (resp. truncated) case.

SD-based analysis [DDLL13]. Any forging adversary \(\mathcal {A}\) with success probability \(\ge \varepsilon \) against the scheme implemented with truncated Gaussian sampling has success probability \(\varepsilon '\ge \varepsilon -\varDelta (\varPhi ,\varPhi ')\) against the scheme implemented with perfect Gaussian sampling. We select parameters to handle adversaries with success probabilities \(\ge \varepsilon /2\) against the untruncated scheme; to do so, we set the required precision p so that \(\varDelta (\varPhi ,\varPhi ')\le \varepsilon /2\). Each signature requires \(\ell \cdot m\) samples of the Bernoulli random variables \((B_{\tilde{c}_i})_i\). To ensure security against \(q_s\) signing queries, each of the truncated Bernoulli random variables \(B_{\tilde{c}_i}\) should be within SD \(\varDelta (\varPhi ,\varPhi ')/(\ell \cdot m \cdot q_s)\) of the desired \(B_{c_i}\) (by the union bound). Using \(\varDelta (B_{\tilde{c}_i},B_{c_i})=|\varepsilon _i| \le 2^{-p}c_i \le 2^{-p-1}\) leads to a precision requirement

$$\begin{aligned} p \ge \log (\ell \cdot m \cdot q_s/\varDelta (\varPhi ,\varPhi ')) \ge \log (\ell \cdot m \cdot q_s/\varepsilon ). \end{aligned}$$
(1)

The overall precomputed table is hence of bit-size \(L_{\text{ SD }} = p \cdot \ell \ge \log (\ell \cdot m \cdot q_s/\varepsilon ) \cdot \ell \).

Note that in [DDLL13], the authors omitted the term \(\ell \cdot m \cdot q_s\) in their analysis: they only ensured that \(\varDelta (B_{\tilde{c}_i},B_{c_i})\le \varepsilon \), leading to the requirement that \(p \ge \log (1/\varepsilon )\).

One may also set the precision \(p_i\) depending on i for \(0 \le i \le \ell -1\). It is sufficient to set

$$\begin{aligned} 2^{-p_i} c_i = 2^{-p_i} \exp (-\pi 2^i / s^2) \le (\varepsilon /2) / ( \ell \cdot m \cdot q_s). \end{aligned}$$

Hence the precision \(p_i\) is

$$\begin{aligned} p_i \ge \log \left( \frac{\ell \cdot m \cdot q_s}{\varepsilon } \cdot \exp (-\pi 2^i / s^2)\right) + 1. \end{aligned}$$
(2)

The bit-size of the overall precomputed table can then be computed as the sum of the above \(p_i\)’s. Using the symmetry of the Bernoulli variables, we can further reduce the bit-size of the precomputed table.

KLD-based analysis [PDG14]. In [PDG14], Pöppelmann, Ducas and Güneysu replace the SD-based analysis by a KLD-based analysis (i.e., using the RD of order \(a=1\)) to reduce the precision p needed in the precomputed table. They show that any forging adversary \(\mathcal {A}\) with success probability \(\varepsilon \) against the scheme implemented with truncated Gaussian sampling has success probability \(\varepsilon ' \ge \varepsilon - \sqrt{\ln R_1(\varPhi \Vert \varPhi ')/2}\) against the scheme implemented with perfect Gaussian sampling (see the remark at the end of Subsect. 2.3). By the multiplicative property of the RD over the \(\ell \cdot m \cdot q_s\) independent samples needed for signing \(q_s\) times, we get that \(R_1(\varPhi \Vert \varPhi ')\le (\max _{i}R_1(B_{\tilde{c}_i}\Vert B_{c_i}))^{\ell \cdot m\cdot q_s}\). Now, we have:

$$\begin{aligned} \ln R_1(B_{\tilde{c}_i}\Vert B_{c_i}) &= (1-c_i-\varepsilon _i) \ln \frac{1-c_i-\varepsilon _i}{1-c_i} + (c_i + \varepsilon _i) \ln \frac{c_i+\varepsilon _i}{c_i} \\ &\le - (1-c_i-\varepsilon _i)\frac{\varepsilon _i}{1-c_i} + (c_i + \varepsilon _i) \frac{\varepsilon _i}{c_i} \ = \ \frac{\varepsilon _i^2}{(1-c_i)c_i}. \end{aligned}$$

Using \(|\varepsilon _i| \le 2^{-p} c_i\) and \(1-c_i \ge 1/2\) (i.e., \(c_i \le 1/2\)), we obtain \(\ln R_1(B_{\tilde{c}_i}\Vert B_{c_i}) \le 2^{-2p+1} c_i \le 2^{-2p}\). Therefore, we obtain \(\varepsilon ' \ge \varepsilon - \sqrt{\ell \cdot m\cdot q_s\cdot 2^{-2p-1}}\). We can select parameters such that \(\sqrt{\ell \cdot m\cdot q_s\cdot 2^{-2p-1}}\le \varepsilon /2\). This leads to a precision requirement

$$\begin{aligned} p \ge \frac{1}{2} \log \left( \frac{\ell \cdot m\cdot q_s}{\varepsilon ^2}\right) + \frac{1}{2}\,. \end{aligned}$$
(3)

The overall precomputed table is hence of bit-size \(L_{\text{ KLD }} \ge (\log (\ell \cdot m \cdot q_s/\varepsilon ^2)/2 + 1/2) \cdot \ell \). This KLD-based analysis may save some storage if \(\varepsilon \) is not too small.

Note that in [PDG14], the authors selected \(\varepsilon =1/2\) and \(\ell \cdot m \cdot q_s = 2^{\lambda }\) where \(\lambda \) is the desired bit-security, and hence obtained \(p \ge \lambda /2 + 1\).

One may also set the precision \(p_i\) depending on i. It is sufficient to set

$$\begin{aligned} p_i \ge \frac{1}{2} \log \left( \frac{\ell \cdot m \cdot q_s}{\varepsilon ^2}\cdot \frac{c_i}{1-c_i}\right) + 1. \end{aligned}$$
(4)

Using symmetry, we may assume \(c_i \le 1/2\).

R \(_\infty \) -based analysis. The probability preservation property of the Rényi divergence from Lemma 2.7 is multiplicative for \(a>1\) (rather than additive, as for \(a=1\)). Here we use the order \(a = \infty \). This property gives that any forging adversary \(\mathcal {A}\) having success probability \(\varepsilon \) on the scheme implemented with truncated Gaussian sampling has a success probability \(\varepsilon '\ge \varepsilon / R_\infty (\varPhi \Vert \varPhi ')\) on the scheme implemented with the perfect Gaussian. If \(R=R_{\infty }(\varPhi \Vert \varPhi ') \le O(1)\), then \(\varepsilon '=\varOmega (\varepsilon )\). By the multiplicative property of the RD (over the \(\ell \cdot m \cdot q_s\) samples needed for signing \(q_s\) times), we have \(R_\infty (\varPhi \Vert \varPhi ')\le (\max _{i=1, \ldots , \ell } R_{\infty }(B_{\tilde{c}_i}\Vert B_{c_i}))^{\ell \cdot m \cdot q_s}\). By our assumption that \(c_i \le 1/2\), we have \(R_\infty (B_{\tilde{c}_i}\Vert B_{c_i}) \le 1 + |\varepsilon _i| / c_i \le 1+2^{-p}\). Therefore, we get \(\varepsilon '\ge \varepsilon / (1 +2^{-p})^{\ell \cdot m\cdot q_s}\). We select parameters to get adversaries with success probabilities \(\ge \varepsilon /2\) against the untruncated scheme and set the precision so that \((1 +2^{-p})^{\ell \cdot m\cdot q_s}\le 2\). This yields an approximate precision requirement

$$\begin{aligned} p \ge \log (\ell \cdot m\cdot q_s). \end{aligned}$$
(5)

Note that the above estimate is accurate only when \(\ell \cdot m\cdot q_s\) is much smaller than \(2^p\). Hence we may also require that \(p \ge \log (\ell \cdot m\cdot q_s) + C\) for some small constant C. This condition essentially eliminates the term \(\varepsilon \) from the precision needed by the SD-based and KLD-based analyses. Overall, we get a precomputed table of bit-size \(L_{\text{ RD }} = \log (\ell \cdot m \cdot q_s) \cdot \ell \).
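The need for the constant C can be checked numerically. With the value \(N = \ell \cdot m \cdot q_s\) of the BLISS-I example of Table 1, \(p = \lceil \log N \rceil \) alone gives \((1+2^{-p})^N > 2\), while a single extra bit suffices; a small sketch:

```python
import math

N = 29 * 1024 * 2**64  # ell * m * qs for the BLISS-I example

def blowup(p):
    # evaluates (1 + 2^-p)^N in log space to avoid overflow
    return math.exp(N * math.log1p(2.0**-p))

p0 = math.ceil(math.log2(N))  # the estimate of Eq. (5)
```

Here \(\mathrm{blowup}(p_0) \approx e^{N 2^{-p_0}}\) exceeds 2, and \(C = 1\) already restores the requirement \((1+2^{-p})^{N} \le 2\).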

\(R_a\) -based analysis. We may also consider an \(R_a\)-based analysis for general \(a > 1\). It should be noted that the reductions here are not tight: for the \(R_a\)-based analysis with \(a > 1\), the probability preservation property shows \(\varepsilon ' \ge \varepsilon ^{a/(a-1)} / R_a(\varPhi \Vert \varPhi ').\) The per-sample Rényi divergence can be computed as

$$\begin{aligned} (R_a(B_{\tilde{c}_i}\Vert B_{c_i}))^{a -1}&= \frac{(1-c_i - \varepsilon _i)^a}{(1-c_i)^{a-1}} + \frac{(c_i + \varepsilon _i )^a}{c_i^{a-1}}\\&= (1-c_i -\varepsilon _i) \left( 1 - \frac{\varepsilon _i}{1-c_i} \right) ^{a-1} + (c_i + \varepsilon _i) \left( 1 + \frac{\varepsilon _i}{c_i}\right) ^{a-1}. \end{aligned}$$

If a is much smaller than \(2^p\), we get

$$\begin{aligned} (R_a(B_{\tilde{c}_i}\Vert B_{c_i}))^{a -1}&\approx (1-c_i -\varepsilon _i) \left( 1 - \frac{(a-1)\varepsilon _i}{1-c_i} \right) + (c_i + \varepsilon _i) \left( 1 + \frac{(a-1)\varepsilon _i}{c_i}\right) \\&= 1 + \frac{\varepsilon _i^2 (a - 1) }{c_i(1-c_i)} \le 1 + 2^{-2p} (a - 1) \frac{c_i}{1-c_i} \le 1 + 2^{-2p} (a - 1). \end{aligned}$$

For instance, if we take \(a = 2\), we have \(R_2(B_{\tilde{c}_i}\Vert B_{c_i}) \le 1 + 2^{-2p}\), hence \(R_2(\varPhi \Vert \varPhi ') \le (1+2^{-2p})^{\ell \cdot m \cdot q_s}\) and \( \varepsilon ' \ge \varepsilon ^{2} / R_2(\varPhi \Vert \varPhi ')\). To get a success probability lower bound of \(\varepsilon ^2/2\), it is sufficient to set

$$\begin{aligned} p \ge \frac{1}{2}\log (\ell \cdot m \cdot q_s). \end{aligned}$$
(6)
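As a sanity check, the per-sample divergence of two Bernoulli distributions can be evaluated exactly; for \(a=2\) the expression \(1+\varepsilon _i^2/(c_i(1-c_i))\) is in fact an identity rather than an approximation. In the sketch below, the values of c and p are arbitrary illustrative choices.

```python
def renyi_bernoulli(a, c, e):
    # exact R_a(B_{c+e} || B_c) via
    # (R_a)^(a-1) = (1-c-e)^a / (1-c)^(a-1) + (c+e)^a / c^(a-1)
    r_pow = (1 - c - e)**a / (1 - c)**(a - 1) + (c + e)**a / c**(a - 1)
    return r_pow ** (1.0 / (a - 1))

p, c = 10, 0.3
e = c * 2.0**-p  # worst-case deviation |e| = 2^-p * c
r2 = renyi_bernoulli(2, c, e)
```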

On the other hand, if a is much larger than \(2^p\) and \(\varepsilon _i > 0\), the second term dominates and we have

$$\begin{aligned} (R_a(B_{\tilde{c}_i}\Vert B_{c_i}))^{a -1}&= (1-c_i -\varepsilon _i) \left( 1 - \frac{\varepsilon _i}{1-c_i} \right) ^{a-1} + (c_i + \varepsilon _i) \left( 1 + \frac{\varepsilon _i}{c_i}\right) ^{a-1}\\&\approx (c_i + \varepsilon _i) \exp \left( \frac{(a-1)\varepsilon _i}{c_i}\right) . \end{aligned}$$

Hence the Rényi divergence satisfies

$$\begin{aligned} R_a(B_{\tilde{c}_i}\Vert B_{c_i}) \approx (c_i + \varepsilon _i)^{1/(a - 1)} \exp \left( \frac{\varepsilon _i}{c_i}\right) \approx 1 + \frac{\varepsilon _i}{c_i}. \end{aligned}$$

As \(a \rightarrow \infty \), this tends to \(R_\infty (B_{\tilde{c}_i}\Vert B_{c_i}) = 1 + \varepsilon _i / c_i \le 1 + 2^{-p}\).

Thus, if the tightness of the reduction is not a concern, using \(R_a\) with small a reduces the precision requirement. Furthermore, one can amplify the success probability of the resulting forger from \(\varepsilon '\) to some \(\varepsilon '' > \varepsilon '\) by standard repetition.

Numerical Examples. In Table 1, we consider a numerical example which gives the lower bound on the precision for the scheme BLISS-I (with \(\varepsilon = 2^{-128}\)) when allowing up to \(q_s = 2^{64}\) sign queries to the adversary. For the BLISS-I parameters, we use \(m=1024\), \(\ell =29\), \(s= \lceil \sqrt{2\pi }\cdot 254\cdot \sqrt{1/(2\ln 2)} \rceil = 541\) and \(\tau =13.4/\sqrt{2\pi }\approx 5.4\). The reductions in the table are tight, except for \(R_2\) (as \(\varepsilon '\) in the reduction does not depend directly on \(\varepsilon \) but on \(\varepsilon ^2\)), and slightly loose in the \(R_\infty \) case.

Table 1. Comparison of the precision to handle adversaries with success probability \(\ge \varepsilon \) making \(\le q_s\) sign queries to BLISS-I. Our Rényi-based parameters are on the last two lines.

When instantiating BLISS-I with the parameters of Table 1, the table bit-size can be reduced from about 6000 bits to about 1200 bits by using \(R_2\) in place of SD. If the tightness of the reduction is a concern, we may use \(R_\infty \) instead, which leads to a table of about 2300 bits.
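The precision lower bounds discussed above can be reproduced with a few lines of code. The sketch below instantiates the formulas with the BLISS-I parameters of Table 1; reading the SD-based analysis with a single uniform precision p is our own simplification.

```python
import math

# BLISS-I example: ell = 29, m = 1024, qs = 2^64, eps = 2^-128
logN = math.log2(29 * 1024) + 64  # log2(ell * m * qs)
log_inv_eps = 128

p_sd = math.ceil(logN + log_inv_eps + 1)               # SD, cf. Eq. (2)
p_kld = math.ceil((logN + 2 * log_inv_eps) / 2 + 0.5)  # KLD, Eq. (3)
p_rinf = math.ceil(logN)                               # R_infinity, Eq. (5)
p_r2 = math.ceil(logN / 2)                             # R_2, Eq. (6)

table_r2, table_rinf = 29 * p_r2, 29 * p_rinf  # about 1200 and 2300 bits
```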

3.2 GPV Signature Scheme

The RD can also be used to reduce the parameters obtained via the SD-based analysis of the GPV signature scheme in [GPV08].

In summary, the signature and the security proof from [GPV08] work as follows. The signature public key is a matrix \(A \in \mathbb {Z}_q^{n \times m}\) with n linear in the security parameter \(\lambda \), \(q = \mathrm {poly}(n)\), and \(m=O(n \log q)\). The private signing key is a short basis matrix T for the lattice \(A^{\perp } = \{{\varvec{x}} \in \mathbb {Z}^m: A\cdot {\varvec{x}} = \mathbf{0} \mod q\}\), whose last successive minimum satisfies \(\lambda _m(A^{\perp }) \le O(1)\) when \(m = \varOmega (n\log q)\) (see [GPV08]). A signature \((\varvec{\sigma }, s)\) on a message M is a short vector \(\varvec{\sigma } \in \mathbb {Z}^m\) and a random salt \(s \in \{0,1\}^{\lambda }\), such that \(A \cdot \varvec{\sigma } = H(M,s) \mod q\), where H is a random oracle hashing into \(\mathbb {Z}_q^n\). The short vector \(\varvec{\sigma }\) is sampled by computing an arbitrary vector \({\varvec{t}}\) satisfying \(A \cdot {\varvec{t}} = H(M,s) \mod q\) and using T along with a Gaussian sampling algorithm (see [GPV08, BLP+13]) to produce a sample from \({\varvec{t}}+D_{A^{\perp },r, -{\varvec{t}}}\).

The main idea in the security proof from the SIS problem [GPV08] is based on simulating signatures without T, by sampling \(\varvec{\sigma }\) from \(D_{\mathbb {Z}^m,r}\) and then programming the random oracle H at (M, s) according to \(H(M,s) = A \cdot \varvec{\sigma } \mod q\). As shown in [GPV08, Lemma 5.2], the conditional distribution of \(\varvec{\sigma }\) given \(A \cdot \varvec{\sigma } \mod q\) is exactly the same in the simulation and in the real scheme. Therefore, the SD between the simulated signatures and the real signatures is bounded by the SD between the marginal distribution \(D_1\) of \(A \cdot \varvec{\sigma } \mod q\) for \(\varvec{\sigma } \hookleftarrow D_{\mathbb {Z}^m,r}\) and \(U(\mathbb {Z}_q^n)\). This SD for one signature is bounded by \(\varepsilon \) if \(r \ge \eta _{\varepsilon }(A^{\perp })\). Over the \(q_s\) sign queries of the attacker, the SD-based analysis of [GPV08] therefore leads to taking \(\varepsilon = O(2^{-\lambda }q^{-1}_s)\) and thus \(r = \varOmega (\sqrt{\lambda + \log q_s})\) (using Lemma 2.2), in order to handle attackers with success probability \(2^{-o(\lambda )}\).

Now, by Lemma 2.9, we have that the RD \(R_{\infty }(D_1\Vert U)\) is bounded by \(1+c \cdot \varepsilon \) for one signature, for some constant c. By the multiplicativity property of Lemma 2.7, over \(q_s\) queries, it is bounded by \((1+c\varepsilon )^{q_s}\). By taking \(\varepsilon = O(q^{-1}_s)\), we obtain overall an RD bounded as O(1) between the view of the attacker in the real attack and simulation, leading to a security proof with respect to SIS but with a smaller \(r = \varOmega (\sqrt{\log \lambda + \log (nq_s)})\). When the number of sign queries \(q_s\) allowed to the adversary is much smaller than \(2^{\lambda }\), this leads to significant parameter savings, because SIS’s \(\beta \) is reduced and hence n, m and q may be set smaller for the same security parameter \(\lambda \).
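Ignoring the constants hidden in the \(\varOmega (\cdot )\) notation, the saving can be illustrated numerically; the concrete values of \(\lambda \), \(q_s\) and n below are our own illustrative choices.

```python
import math

lam, qs, n = 128, 2**64, 512  # illustrative values

# SD-based analysis: r = Omega(sqrt(lambda + log qs))
r_sd = math.sqrt(lam + math.log2(qs))
# RD-based analysis: r = Omega(sqrt(log lambda + log(n * qs)))
r_rd = math.sqrt(math.log2(lam) + math.log2(n * qs))
saving = r_sd / r_rd  # roughly sqrt(lambda / log lambda) up to constants
```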

4 Rényi Divergence and Distinguishing Problems

In this section, we prove Theorem 4.1, which allows us to use the RD for distinguishing problems, and we show how to apply it to the dual-Regev encryption scheme.

4.1 Problems with Public Sampleability

A general setting one comes across in analyzing the security of cryptographic schemes has the following form. Let P denote a decision problem that asks to distinguish whether a given x was sampled from distribution \(X_0\) or \(X_1\), defined as follows:

$$\begin{aligned} X_0 = \{x : r \hookleftarrow \varPhi , x \hookleftarrow D_0(r) \}, \quad X_1 = \{x : r \hookleftarrow \varPhi , x \hookleftarrow D_1(r) \}. \end{aligned}$$

Here r is some parameter that is sampled from the same distribution \(\varPhi \) in both \(X_0\) and \(X_1\). The parameter r then determines the conditional distributions \(D_0(r)\) and \(D_1(r)\) from which x is sampled in \(X_0\) and \(X_1\), respectively, given r. Now, let \(P'\) denote another decision problem that is defined similarly to P, except that in \(P'\) the parameter r is sampled from a different distribution \(\varPhi '\) (rather than \(\varPhi \)). Given r, the conditional distributions \(D_0(r)\) and \(D_1(r)\) are the same in \(P'\) as in P. Let \(X'_0\) (resp. \(X'_1\)) denote the resulting marginal distributions of x in problem \(P'\). Now, in the applications we have in mind, the distributions \(\varPhi '\) and \(\varPhi \) are “close” in some sense, and we wish to show that this implies an efficient reduction between problems \(P'\) and P, in the usual sense that every distinguisher with efficient run-time T and non-negligible advantage \(\varepsilon \) against P implies a distinguisher for \(P'\) with efficient run-time \(T'\) and non-negligible advantage \(\varepsilon '\). In the classical situation, if the SD \(\varDelta (\varPhi ,\varPhi ')\) between \(\varPhi '\) and \(\varPhi \) is negligible, then the reduction is immediate. Indeed, for \(b \in \{0,1\}\), if \(p_b^{}\) (resp. \(p'_b\)) denotes the probability that a distinguisher algorithm \(\mathcal {A}\) outputs 1 on input distribution \(X_b^{}\) (resp. \(X'_b\)), then we have, from the SD probability preservation property, that \(|p'_b - p_b| \le \varDelta (\varPhi ,\varPhi ')\). As a result, the advantage \(\varepsilon '=|p'_1-p'_0|\) of \(\mathcal {A}\) against \(P'\) is bounded from below by \(\varepsilon -2\varDelta (\varPhi ,\varPhi ')\) which is non-negligible (here \(\varepsilon = |p_1-p_0|\) is the assumed non-negligible advantage of \(\mathcal {A}\) against P).

Unfortunately, for general decision problems \(P,P'\) of the above form, it seems difficult to obtain an RD-based analogue of the above SD-based argument, in the weaker setting when the SD \(\varDelta (\varPhi ,\varPhi ')\) is non-negligible, but the RD \(R=R(\varPhi \Vert \varPhi ')\) is small. Indeed, the probability preservation property of the RD in Lemma 2.7 does not seem immediately useful in the case of general decision problems \(P,P'\). With the above notations, it can be used to conclude that \(p'_b \ge p^2_b/R\) but this does not allow us to usefully relate the advantages \(|p'_1 - p'_0|\) and \(|p_1 - p_0|\).

Nevertheless, we now make explicit a special class of “publicly sampleable” problems \(P,P'\) for which such a reduction can be made. In such problems, it is possible to efficiently sample from both distributions \(D_0(r)\) and \(D_1(r)\), given a single sample x from the unknown \(D_b(r)\). This technique is implicit in the application of RD in the reductions of [LPR13]; we abstract it and make it explicit in the following.

Theorem 4.1

Let \(\varPhi ,\varPhi '\) denote two distributions with \(\mathrm {Supp}(\varPhi ) \subseteq \mathrm {Supp}(\varPhi ')\), and \(D_0(r)\) and \(D_1(r)\) denote two distributions determined by some parameter \(r \in \mathrm {Supp}(\varPhi ')\). Let \(P, P'\) be two decision problems defined as follows:

  • Problem P: Distinguish whether input x is sampled from distribution \(X_0\) or \(X_1\), where

    $$\begin{aligned} X_0 = \{x : r \hookleftarrow \varPhi , x \hookleftarrow D_0(r) \}, \quad X_1 = \{x : r \hookleftarrow \varPhi , x \hookleftarrow D_1(r) \}. \end{aligned}$$
  • Problem \(P'\): Distinguish whether input x is sampled from distribution \(X'_0\) or \(X'_1\), where

    $$\begin{aligned} X'_0 = \{x : r \hookleftarrow \varPhi ', x \hookleftarrow D_0(r) \}, \quad X'_1 = \{x : r \hookleftarrow \varPhi ', x \hookleftarrow D_1(r) \}. \end{aligned}$$

Assume that \(D_0(\cdot )\) and \(D_1(\cdot )\) satisfy the following public sampleability property: there exists a sampling algorithm \(\mathsf {S}\) with run-time \(T_S\) such that for all (rb), given any sample x from \(D_b(r)\):

  • \(\mathsf {S}(0,x)\) outputs a fresh sample distributed as \(D_0(r)\) over the randomness of \(\mathsf {S}\),

  • \(\mathsf {S}(1,x)\) outputs a fresh sample distributed as \(D_1(r)\) over the randomness of \(\mathsf {S}\).

Then, given a T-time distinguisher \(\mathcal {A}\) for problem P with advantage \(\varepsilon \), we can construct a distinguisher \(\mathcal {A}'\) for problem \(P'\) with run-time and distinguishing advantage respectively bounded from above and below by (for any \(a \in (1,+\infty ]\)):

$$\begin{aligned} O\left( \frac{1}{\varepsilon ^2} \log \left( \frac{R_{a}(\varPhi \Vert \varPhi ')}{\varepsilon ^{a/(a-1)}}\right) \cdot (T_S + T)\right) \text{ and } \frac{\varepsilon }{4 \cdot R_{a}(\varPhi \Vert \varPhi ')} \cdot \left( \frac{\varepsilon }{2}\right) ^{\frac{a}{a-1}}. \end{aligned}$$

Proof

Distinguisher \(\mathcal {A}'\) is given an input x sampled from \(D_b(r)\), for some r sampled from \(\varPhi '\) and some unknown \(b \in \{0,1\}\). For an \(\varepsilon '\) to be determined later, it runs distinguisher \(\mathcal {A}\) on \(N=O(\varepsilon ^{-2}\log (1/\varepsilon '))\) independent inputs sampled from \(D_0(r)\) and \(D_1(r)\), obtained by calling algorithm \(\mathsf {S}\) on (0, x) and (1, x), to derive estimates \(\hat{p}_0\) and \(\hat{p}_1\) for the acceptance probabilities \(p_0(r)\) and \(p_1(r)\) of \(\mathcal {A}\) given as inputs samples from \(D_0(r)\) and \(D_1(r)\) (with r fixed to the value used to sample the input x of \(\mathcal {A}'\)). By the choice of N and the Hoeffding bound, the estimation errors \(|\hat{p}_0 - p_0|\) and \(|\hat{p}_1 - p_1|\) are \(<\varepsilon /8\), except with probability \(< \varepsilon '\) over the randomness of \(\mathsf {S}\). Then, if \(\hat{p}_1 - \hat{p}_0 > \varepsilon /4\), distinguisher \(\mathcal {A}'\) runs \(\mathcal {A}\) on input x and returns whatever \(\mathcal {A}\) returns; otherwise, distinguisher \(\mathcal {A}'\) returns a uniformly random bit. This completes the description of distinguisher \(\mathcal {A}'\).

Let \(\mathcal {S}_1\) denote the set of r’s such that \(p_1(r)- p_0(r) \ge \varepsilon /2\), \(\mathcal {S}_2\) denote the set of r’s that are not in \(\mathcal {S}_1\) and such that \(p_1(r)- p_0(r) \ge 0\), and \(\mathcal {S}_3\) denote all the remaining r’s. Then:

  • If \(r \in \mathcal {S}_1\), then except with probability \(< \varepsilon '\) over the randomness of \(\mathsf {S}\), we will have \(\hat{p}_1 - \hat{p}_0 > \varepsilon /4\) and thus \(\mathcal {A}'\) will output \(\mathcal {A}(x)\). Thus, in the case \(b=1\), we have \(\Pr [\mathcal {A}'(x)=1 | r \in \mathcal {S}_1] \ge p_1(r) - \varepsilon '\) and in the case \(b=0\), we have \(\Pr [\mathcal {A}'(x)=1 | r \in \mathcal {S}_1] \le p_0(r) + \varepsilon '\).

  • Assume that \(r \in \mathcal {S}_2\). Let u(r) be the probability over the randomness of \(\mathsf {S}\) that \(\hat{p}_1 - \hat{p}_0 > \varepsilon /4\). Then \(\mathcal {A}'\) will output \(\mathcal {A}(x)\) with probability u(r) and a uniform bit with probability \(1-u(r)\). Thus, in the case \(b=1\), we have \(\Pr [\mathcal {A}'(x)=1| r \in \mathcal {S}_2] = u(r) \cdot p_1(r) + (1-u(r)) /2\), and in the case \(b=0\), we have \(\Pr [\mathcal {A}'(x)=1| r \in \mathcal {S}_2] = u(r) \cdot p_0(r) + (1-u(r))/2\).

  • If \(r \in \mathcal {S}_3\), except with probability \(<\varepsilon '\) over the randomness of \(\mathsf {S}\), we have \(\hat{p}_1 - \hat{p}_0 < \varepsilon /4\) and \(\mathcal {A}'\) will output a uniform bit. Thus, in the case \(b=1\), we have \(\Pr [\mathcal {A}'(x)=1| r \in \mathcal {S}_3] \ge 1/2 - \varepsilon '\), and in the case \(b=0\), we have \(\Pr [\mathcal {A}'(x)=1| r \in \mathcal {S}_3] \le 1/2 +\varepsilon '\).

Overall, the advantage of \(\mathcal {A}'\) is bounded from below by:

$$\begin{aligned} \sum _{r \in \mathcal {S}_1} \varPhi '(r) \left( p_1(r) - p_0(r) - 2 \varepsilon ' \right) + \sum _{r \in \mathcal {S}_2} \varPhi '(r) u(r) \left( p_1(r) - p_0(r)\right) - \sum _{r \in \mathcal {S}_3} \varPhi '(r) 2 \varepsilon ' \\ \ \ge \ \varPhi '(\mathcal {S}_1) \cdot \frac{\varepsilon }{2} - 2 \varepsilon '. \end{aligned}$$

Without loss of generality, we may assume that the advantage of \(\mathcal {A}\) is positive. By an averaging argument, the set \(\mathcal {S}_1\) has probability \(\varPhi (\mathcal {S}_1) \ge \varepsilon /2\) under distribution \(\varPhi \). Hence, by the RD probability preservation property (see Lemma 2.7), we have \(\varPhi '(\mathcal {S}_1) \ge (\varepsilon /2)^{\frac{a}{a-1}}/R_{a}(\varPhi \Vert \varPhi ')\). The proof may be completed by setting \(\varepsilon ' = (\varepsilon /8) \cdot (\varepsilon /2)^{\frac{a}{a-1}}/R_{a}(\varPhi \Vert \varPhi ')\).    \(\square \)
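The construction of \(\mathcal {A}'\) can be exercised on a toy instance. Everything below (the conditional distributions, the distinguisher \(\mathcal {A}\), and all parameter choices) is our own illustrative setup; the parameter r is included in the sample x, which makes public sampleability trivial (much as the matrix A is public in LWE).

```python
import random

def sample_D(b, r, rng):
    # toy conditional distributions: D_0(r) = Bernoulli(r), D_1(r) = uniform bit
    if b == 0:
        return (r, 1 if rng.random() < r else 0)
    return (r, rng.randrange(2))

def S(b, x, rng):
    # public sampleability: r is part of the sample x, so fresh
    # samples from D_b(r) are easy to produce
    return sample_D(b, x[0], rng)

def A(x):
    # assumed distinguisher for problem P: output the observed bit
    return x[1]

def A_prime(x, rng, eps=0.3, N=400):
    # the wrapper of Theorem 4.1: estimate p_0(r) and p_1(r) via S, run A
    # only if the estimated gap exceeds eps/4, else answer randomly
    p0_hat = sum(A(S(0, x, rng)) for _ in range(N)) / N
    p1_hat = sum(A(S(1, x, rng)) for _ in range(N)) / N
    return A(x) if p1_hat - p0_hat > eps / 4 else rng.randrange(2)

def advantage(phi, trials, rng):
    # empirical advantage of A' when r is drawn from distribution phi
    means = []
    for b in (0, 1):
        hits = sum(A_prime(sample_D(b, rng.choice(phi), rng), rng)
                   for _ in range(trials))
        means.append(hits / trials)
    return means[1] - means[0]

rng = random.Random(0)
adv = advantage([0.1, 0.2, 0.3], 600, rng)  # phi' = U({0.1, 0.2, 0.3})
```

Here \(p_1(r)-p_0(r) = 1/2 - r \ge 0.2\) for every r in the support, so the estimated gap essentially always exceeds the threshold and \(\mathcal {A}'\) inherits the advantage of \(\mathcal {A}\).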

4.2 Application to Dual-Regev Encryption

Let \(m,n,q,\chi \) be as in Definition 2.5 and \(\varPhi \) denote a distribution over \(\mathbb {Z}_q^{m \times n}\). We define the LWE variant \(\mathrm {LWE}_{n,q,\chi ,m}(\varPhi )\) as follows: Sample \(A \hookleftarrow \varPhi \), \({\varvec{s}} \hookleftarrow U(\mathbb {Z}_q^n)\), \({\varvec{e}} \hookleftarrow \chi ^m\) and \({\varvec{u}} \hookleftarrow U(\mathbb {T}^m)\); The goal is to distinguish between the distributions \((A,\frac{1}{q} A{\varvec{s}}+{\varvec{e}})\) and \((A,{\varvec{u}})\) over \(\mathbb {Z}_q^{m \times n} \times \mathbb {T}^m\). Note that standard LWE is obtained by taking \(\varPhi = U(\mathbb {Z}_q^{m \times n})\).

As an application of Theorem 4.1, we show that LWE with non-uniform and possibly statistically correlated \({\varvec{a}}_i\)’s of the samples \(({\varvec{a}}_i,b_i)\)’s (with \(b_i\) either independently sampled from \(U(\mathbb {T})\) or close to \(\langle {\varvec{a}}_i , {\varvec{s}} \rangle \) for a secret vector \({\varvec{s}}\)) remains at least as hard as standard LWE, as long as the RD \(R(\varPhi \Vert U)\) remains small, where \(\varPhi \) is the joint distribution of the given \({\varvec{a}}_i\)’s and U denotes the uniform distribution.

To show this result, we first prove in Corollary 4.2 that there is a reduction from \(\mathrm {LWE}_{n,q,\chi ,m}(\varPhi ')\) to \(\mathrm {LWE}_{n,q,\chi ,m}(\varPhi )\) using Theorem 4.1 if \(R_{a}(\varPhi \Vert \varPhi ')\) is small enough. We then describe in Corollary 4.3 how to use this first reduction to obtain smaller parameters for the dual-Regev encryption. This allows us to save an \(\varOmega (\sqrt{\lambda /\log \lambda })\) factor in the Gaussian deviation parameter r used for secret key generation in the dual-Regev encryption scheme [GPV08], where \(\lambda \) refers to the security parameter.

Corollary 4.2

Let \(\varPhi \) and \(\varPhi '\) be two distributions over \(\mathbb {Z}_q^{m \times n}\) with \(\mathrm {Supp}(\varPhi ) \subseteq \mathrm {Supp}(\varPhi ')\). If there exists a distinguisher \(\mathcal {A}\) against \(\mathrm {LWE}_{n,q,\chi ,m}(\varPhi )\) with run-time T and advantage \(\varepsilon = o(1)\), then there exists a distinguisher \(\mathcal {A}'\) against \(\mathrm {LWE}_{n,q,\chi ,m}(\varPhi ')\) with run-time \(T' = O(\varepsilon ^{-2} \log \frac{R_{a}(\varPhi \Vert \varPhi ')}{\varepsilon ^{a/(a-1)}} \cdot (T + \mathrm {poly}(m,\log q)))\) and advantage \(\varOmega \left( \frac{\varepsilon ^{1+a/(a-1)}}{R_{a} (\varPhi \Vert \varPhi ')}\right) \), for any \(a \in (1,+\infty ]\).

Proof

Apply Theorem 4.1 with \(r = A\in \mathbb {Z}_q^{m \times n}\), \(x = (A,{\varvec{b}}) \in \mathbb {Z}_q^{m \times n} \times \mathbb {T}^m\), \(D_0(r) = (A,\frac{1}{q} A \cdot {\varvec{s}}+{\varvec{e}})\) with \({\varvec{s}} \hookleftarrow U(\mathbb {Z}_q^n)\) and \({\varvec{e}} \hookleftarrow \chi ^m\), and \(D_1(r) = (A,{\varvec{u}})\) with \({\varvec{u}} \hookleftarrow U(\mathbb {T}^m)\). The sampling algorithm \(\mathsf {S}\) is such that \(\mathsf {S}(0,x)\) outputs \((A, \frac{1}{q} A \cdot {\varvec{s}}' + {\varvec{e}}')\) for \({\varvec{s}}' \hookleftarrow U(\mathbb {Z}_q^n)\) and \({\varvec{e}}' \hookleftarrow \chi ^m\), while \(\mathsf {S}(1,x)\) outputs \((A,{\varvec{u}}')\) with \({\varvec{u}}' \hookleftarrow U(\mathbb {T}^m)\).    \(\square \)
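A sketch of the sampling algorithm \(\mathsf {S}\) from the proof, with toy parameters and a continuous Gaussian as a stand-in for \(\chi \) (all concrete values here are illustrative and far too small for security):

```python
import random

n, m, q = 4, 8, 97  # toy parameters (illustrative only)

def S(b, x, rng):
    # x = (A, b_vec); A is part of the sample, so it can be reused to
    # produce a fresh second component from D_0(A) or D_1(A)
    A, _ = x
    if b == 0:
        s = [rng.randrange(q) for _ in range(n)]
        # rng.gauss is a stand-in for the LWE noise distribution chi
        e = [rng.gauss(0.0, 0.01) for _ in range(m)]
        return (A, [((sum(A[i][j] * s[j] for j in range(n)) % q) / q
                     + e[i]) % 1.0 for i in range(m)])
    return (A, [rng.random() for _ in range(m)])

rng = random.Random(7)
A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
x = S(1, (A, None), rng)          # a sample with uniform second component
y0, y1 = S(0, x, rng), S(1, x, rng)  # fresh samples from D_0(A) and D_1(A)
```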

We recall that the dual-Regev encryption scheme has a general public parameter \(A \in \mathbb {Z}_q^{m \times n}\), a secret key of the form \(sk = {\varvec{x}}\) with \({\varvec{x}} \hookleftarrow D_{\mathbb {Z}^m,r}\) and a public key of the form \({\varvec{u}}= A^t {\varvec{x}} \mod q\). A ciphertext for a message \(M \in \{0,1\}\) is obtained as follows: Sample \({\varvec{s}} \hookleftarrow U(\mathbb {Z}_q^n)\), \({\varvec{e}}_1 \hookleftarrow \chi ^m\) and \(e_2 \hookleftarrow \chi \); return ciphertext \((c_1,c_2) = (\frac{1}{q} A{\varvec{s}} + {\varvec{e}}_1, \frac{1}{q} \langle {\varvec{u}} , {\varvec{s}} \rangle + e_2 + \frac{M}{2}) \in \mathbb {T}^m \times \mathbb {T}\).

Corollary 4.3

Suppose that q is prime, \(m \ge 2n \log q\) and \(r \ge 4 \sqrt{\log (12m)/\pi }\). If there exists an adversary against the IND-CPA security of the dual-Regev encryption scheme with run-time T and advantage \(\varepsilon \), then there exists a distinguishing algorithm for \(\mathrm {LWE}_{n,q,\chi , m+1}\) with run-time \(O((\varepsilon ')^{-2} \log (\varepsilon ')^{-1} \cdot (T + \mathrm {poly}(m)))\) and advantage \(\varOmega ((\varepsilon ')^2)\), where \(\varepsilon ' = \varepsilon - 2q^{-n}\).

Proof

The IND-CPA security of the dual-Regev encryption scheme as described above is at least as hard as \(\mathrm {LWE}_{n,q,\chi ,m+1}(\varPhi )\) where \(\varPhi \) is obtained by sampling \(A \hookleftarrow U(\mathbb {Z}_q^{m \times n})\), \({\varvec{u}} \hookleftarrow A^t \cdot D_{\mathbb {Z}^m,r} \mod q\) and returning the \((m+1) \times n\) matrix obtained by appending \({\varvec{u}}^t\) at the bottom of A. We apply Corollary 4.2 with \(\varPhi ' = U(\mathbb {Z}_q^{(m+1) \times n})\).

Since q is prime, if A is full rank, then the multiplication by \(A^t\) induces an isomorphism between the quotient group \(\mathbb {Z}^m/A^{\perp }\) and \(\mathbb {Z}_q^n\), where \(A^{\perp } = \{ {\varvec{x}} \in \mathbb {Z}^m: A^t \cdot {\varvec{x}} = \mathbf{0} \mod q\}\). By Lemma 2.2, we have \(\eta _{1/3}(A^{\perp }) \le 4 \sqrt{\log (12m)/\pi } \le r\), except for a fraction \(\le q^{-n}\) of the A’s. Let BAD denote the union of such bad A’s and the A’s that are not full rank. We have \(\Pr [BAD] \le 2q^{-n}\).

By the multiplicativity property of Lemma 2.7, we have:

$$\begin{aligned} R_{\infty }(\varPhi \Vert \varPhi ') \le \max _{A \notin BAD} R_{\infty }(D_{\mathbb {Z}^m,r} \!\!\!\! \mod A^{\perp } \Vert U_{\mathbb {Z}^m/A^{\perp }}). \end{aligned}$$

Thanks to Lemma 2.9, we know that the latter is \(\le 2\). The result now follows from Corollary 4.2.    \(\square \)

In all applications we are aware of, the parameters satisfy \(m \le \mathrm {poly}(\lambda )\) and \(q^{-n} \le 2^{-\lambda }\), where \(\lambda \) refers to the security parameter. The \(r = \varOmega (\sqrt{\log \lambda })\) bound of our Corollary 4.3, which results from using \(\delta =1/3\) in the condition \(r \ge \eta _{\delta }(A^{\perp })\) in the RD-based smoothing argument of the proof above, improves on the corresponding bound \(r = \varOmega (\sqrt{\lambda })\), which results from the requirement to use \(\delta =O(2^{-\lambda })\) in the condition \(r \ge \eta _{\delta }(A^{\perp })\) in the SD-based smoothing argument of the proof of [GPV08, Theorem 7.1], in order to handle adversaries with advantage \(\varepsilon = 2^{-o(\lambda )}\) in both cases. Thus our RD-based analysis saves a factor \(\varOmega (\sqrt{\lambda /\log \lambda })\) in the choice of r, and consequently of \(\alpha ^{-1}\) and q. (The authors of [GPV08] specify a choice of \(r = \omega (\sqrt{\log \lambda })\) for their scheme because they use in their analysis the classical “no polynomial attacks” security requirement, corresponding to assuming attacks with advantage \(\varepsilon =\lambda ^{-O(1)}\), rather than the stronger \(\varepsilon = 2^{-o(\lambda )}\) but more realistic setting we take.)

5 Application to LWE with Uniform Noise

The \(\mathrm {LWE}\) problem with noise uniform in a small interval was introduced in [DMQ13]. In that article, the authors exhibit a reduction from \(\mathrm {LWE}\) with Gaussian noise, which relies on a new tool called lossy codes. The main proof ingredients are the construction of lossy codes for \(\mathrm {LWE}\) (which are lossy for the uniform distribution in a small interval), and the fact that lossy codes are pseudorandom.

We note that the reduction from [DMQ13] needs the number of \(\mathrm {LWE}\) samples to be bounded by \(\mathrm {poly}(n)\) and that it degrades the \(\mathrm {LWE}\) dimension by a constant factor. The parameter \(\beta \) (when the interval of the noise is \([- \beta , \beta ]\)) should be at least \(m n^{\sigma } \alpha \) where \(\alpha \) is the \(\mathrm {LWE}\) Gaussian noise parameter and \(\sigma \in (0,1)\) is an arbitrarily small constant.

We now provide an alternative reduction from the \(\mathrm {LWE}_{n,q,D_{\alpha },m}\) distinguishing problem to the \(\mathrm {LWE}_{n,q,U([- \beta , \beta ]),m}\) distinguishing problem, and analyze it using RD. Our reduction preserves the \(\mathrm {LWE}\) dimension n, and is hence tighter than the one from [DMQ13]. We also require that \(\beta = \varOmega (m \alpha )\).

Theorem 5.1

Let \(m \ge n \ge 1\) and let \(q \le \mathrm {poly}(m,n)\) be prime. Let \(\alpha , \beta > 0\) be real numbers with \(\beta = \varOmega (m \alpha )\). Then there is a polynomial-time reduction from \(\mathrm {LWE}_{n,q,D_{\alpha },m}\) to \(\mathrm {LWE}_{n,q,\phi ,m}\), with \(\phi = \frac{1}{q} \lfloor q U([-\beta ,\beta ]) \rceil \).

Proof

In the proof, we let \(U_{\beta }\) denote the distribution \(U([-\beta ,\beta ])\), to ease notations. Our reduction relies on four steps:

  • A reduction from \(\mathrm {LWE}_{n,q,D_{\alpha },m}\) to \(\mathrm {LWE}_{n,q,\psi ,m}\) with \(\psi = D_{\alpha } + U_{\beta }\),

  • A reduction from \(\mathrm {LWE}_{n,q,\psi ,m}\) to \(\mathrm {sLWE}_{n,q,\psi ,m}\),

  • A reduction from \(\mathrm {sLWE}_{n,q,\psi ,m}\) to \(\mathrm {sLWE}_{n,q,U_{\beta },m}\),

  • A reduction from \(\mathrm {sLWE}_{n,q,U_{\beta },m}\) to \(\mathrm {LWE}_{n,q,\phi ,m}\).

First step. The reduction is given m elements \(({\varvec{a}}_i,b_i) \in \mathbb {Z}_q^n \times \mathbb {T}\), all drawn from \(A_{{\varvec{s}}, D_{\alpha }}\) (for some \({\varvec{s}}\)), or all drawn from \(U(\mathbb {Z}_q^n \times \mathbb {T})\). The reduction consists in adding independent samples from \(U_{\beta }\) to each \(b_i\). The reduction maps the uniform distribution to itself, and \(A_{{\varvec{s}}, D_{\alpha }}\) to \(A_{{\varvec{s}}, \psi }\).

Second step. Reducing the distinguishing variant of LWE to its search variant is direct.

Third step.

The reduction from \(\mathrm {sLWE}_{n,q,\psi ,m}\) to \(\mathrm {sLWE}_{n,q,U_{\beta },m}\) is the identity transformation: using the RD (in particular the probability preservation property of Lemma 2.7), we show that an oracle solving \(\mathrm {sLWE}_{n,q,U_{\beta },m}\) also solves \(\mathrm {sLWE}_{n,q,\psi ,m}\).

Lemma 5.2

Let \(\alpha ,\beta \) be real numbers with \(\alpha \in (0, 1/e)\) and \(\beta \ge \alpha \). Let \(\psi = D_{\alpha }+U_{\beta }\). Then

$$\begin{aligned} R_2( U_{\beta } \Vert \psi ) = \frac{\alpha }{\beta } \int _{0}^{\beta } \frac{1}{ \int _{-\beta }^{\beta } e^{ \frac{- \pi (x - y)^2}{\alpha ^2} } dy} dx \ \le \ 1 + \frac{1}{1 - e^{-\pi \beta ^2/\alpha ^2}} \cdot \frac{\alpha }{\beta }. \end{aligned}$$

Proof

The density function of \(\psi \) is the convolution of the density functions of \(D_{\alpha }\) and \(U_{\beta }\):

$$\begin{aligned} f_{\psi }(x) = \frac{1}{2 \alpha \beta } \int _{-\beta }^{\beta } e^{ \frac{- \pi (x - y)^2}{\alpha ^2} } dy. \end{aligned}$$

Using the Rényi divergence of order 2, we have:

$$\begin{aligned} R_2( U_{\beta } \Vert \psi ) = \int _{- \beta }^{\beta } \frac{\frac{1}{(2 \beta )^2}}{\frac{1}{2 \alpha \beta } \int _{- \beta }^{\beta } e^{ \frac{- \pi (x - y)^2}{\alpha ^2} } dy} dx = \frac{\alpha }{\beta } \int _{0}^{\beta } \frac{1}{\int _{- \beta }^{\beta } e^{ \frac{- \pi (x - y)^2}{\alpha ^2} } dy} dx. \end{aligned}$$

The denominator in the integrand, viewed as a function \(\phi \) of \(x \in [0, \beta ]\), can be rewritten as

$$\begin{aligned} \phi (x) = \alpha - \int _{\beta +x}^{\infty } \exp (\frac{-\pi y^2}{\alpha ^2}) \; \mathrm {d} y - \int _{\beta -x}^{\infty } \exp (\frac{-\pi y^2}{\alpha ^2}) \; \mathrm {d} y . \end{aligned}$$

For the standard Gaussian, we use the following tail bound [CDS03]:

$$\begin{aligned} \frac{1}{\sqrt{2 \pi }} \int _{z}^\infty e^{-x^2/2} dx \le \frac{1}{2} e^{-z^2/2}. \end{aligned}$$

Then we have

$$\begin{aligned} \phi (x) \ge \alpha \left( 1 - \frac{1}{2} \exp \left( \frac{- \pi (\beta +x)^2}{\alpha ^2}\right) - \frac{1}{2} \exp \left( \frac{- \pi (\beta -x)^2}{\alpha ^2}\right) \right) . \end{aligned}$$

We next bound the reciprocal \(1/\phi (x)\), using a first-order expansion. Define

$$\begin{aligned} t(x) = \frac{1}{2} \exp \left( \frac{- \pi (\beta +x)^2}{\alpha ^2}\right) + \frac{1}{2} \exp \left( \frac{- \pi (\beta -x)^2}{\alpha ^2}\right) . \end{aligned}$$

We want to bound the function t(x) by a constant in (0, 1). On \([0, \beta ]\), t(x) is not monotonic, so we bound each of its two terms by its maximum over the interval. Using \(\beta \ge \alpha \), we obtain the upper bound:

$$\begin{aligned} t(x) \le \frac{1}{2}e^{- \pi \beta ^2/\alpha ^2} + \frac{1}{2} =: \sigma _{\alpha , \beta } + \frac{1}{2} < 1. \end{aligned}$$

We then use the fact that \(\frac{1}{1-t(x)}= 1 + \frac{t(x)}{1-t(x)} \le 1 + \frac{2}{1-2\sigma _{\alpha , \beta }}\, t(x)\) (since \(1 - t(x) \ge (1-2\sigma _{\alpha , \beta })/2\)) to bound the Rényi divergence of order 2.

$$\begin{aligned} R_2( U_{\beta } \Vert \psi )= & {} \frac{\alpha }{\beta } \int _{0}^\beta \frac{1}{ \phi (x)} \mathrm {d} x \\\le & {} \frac{1}{\beta } \int _{0}^\beta \frac{1}{ 1 - \frac{1}{2} \exp \left( \frac{- \pi (\beta +x)^2}{\alpha ^2}\right) - \frac{1}{2} \exp \left( \frac{- \pi (\beta -x)^2}{\alpha ^2}\right) } \mathrm {d} x \\\le & {} \frac{1}{\beta } \int _{0}^\beta \left( 1 + \frac{1}{1-2\sigma _{\alpha , \beta }} \exp \left( \frac{- \pi (\beta +x)^2}{\alpha ^2}\right) \right. \\&+ \left. \frac{1}{1-2\sigma _{\alpha , \beta }} \exp \left( \frac{- \pi (\beta -x)^2}{\alpha ^2}\right) \right) \mathrm {d} x \\= & {} 1 + \frac{1}{(1-2\sigma _{\alpha , \beta })\beta } \int _{0}^{2\beta } \exp \left( \frac{- \pi x^2}{\alpha ^2}\right) \mathrm {d} x \\= & {} 1 + \frac{1}{2(1-2\sigma _{\alpha , \beta })\beta } \int _{-2\beta }^{2\beta } \exp \left( \frac{- \pi x^2}{\alpha ^2}\right) \mathrm {d} x \\= & {} 1 + \frac{\alpha }{(1-2\sigma _{\alpha , \beta })\beta } (1 - 2 D_\alpha (2\beta )) \le 1 + \frac{1}{1-2\sigma _{\alpha , \beta }}\frac{\alpha }{\beta }. \end{aligned}$$

Hence we have the bound,

$$\begin{aligned} R_2( U_{\beta } \Vert \psi ) \le 1 + \frac{1}{1 - e^{- \pi \beta ^2/\alpha ^2}} \frac{\alpha }{\beta }. \end{aligned}$$

   \(\square \)
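Lemma 5.2 can be checked numerically by evaluating the double integral with a simple midpoint rule; the concrete values of \(\alpha \) and \(\beta \) below are illustrative.

```python
import math

alpha, beta = 0.2, 1.0  # illustrative values with beta >= alpha > 0

def inner(x, ny=400):
    # int_{-beta}^{beta} exp(-pi (x-y)^2 / alpha^2) dy by the midpoint rule
    h = 2.0 * beta / ny
    return h * sum(math.exp(-math.pi * (x - (-beta + (j + 0.5) * h)) ** 2
                            / alpha ** 2) for j in range(ny))

def r2_numeric(nx=200):
    # R_2(U_beta || psi) = (alpha / beta) * int_0^beta dx / inner(x)
    h = beta / nx
    return (alpha / beta) * h * sum(1.0 / inner((i + 0.5) * h)
                                    for i in range(nx))

bound = 1.0 + alpha / ((1.0 - math.exp(-math.pi * (beta / alpha) ** 2)) * beta)
r2 = r2_numeric()
```

The numerical value sits strictly between 1 and the analytic bound, as the lemma predicts.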

We use Lemma 5.2 with m samples and \(\beta = \varOmega ( m \alpha )\) to ensure that the mth power of the RD is \(\le 2\). The RD multiplicativity and probability preservation properties (see Lemma 2.7) imply that \(\varepsilon ' \ge \varepsilon ^2 / R^m_2 (U_\beta \Vert \psi )\); hence if an oracle solves \(\mathrm {sLWE}_{n,q,U_{\beta },m}\) with probability \(\varepsilon \), then it also solves \(\mathrm {sLWE}_{n,q,\psi ,m}\) with probability \(\ge \varepsilon ^2/2\).

Fourth step. We reduce \(\mathrm {sLWE}_{n,q,U_{\beta },m}\) with continuous noise \(U_{\beta }\) to \(\mathrm {sLWE}_{n,q,\phi ,m}\) with discrete noise \(\phi = \frac{1}{q} \lfloor q U_{\beta } \rceil \) with support contained in \(\mathbb {T}_q\), by rounding to the nearest multiple of \(\frac{1}{q}\) any provided \(b_i\) (for \(i \le m\)). We reduce \(\mathrm {sLWE}_{n,q,\phi ,m}\) to \(\mathrm {LWE}_{n,q,\phi ,m}\) by invoking Theorem 2.6.    \(\square \)

6 Open Problems

Our results show the utility of the Rényi divergence in several areas of lattice-based cryptography. However, they also suggest some natural open problems, whose resolution could open up further applications. In particular, can we extend the applicability of RD to more general distinguishing problems than those satisfying our ‘public sampleability’ requirement? This may extend our results further. For instance, can we use RD-based arguments to prove the hardness of LWE with uniform noise without using the search to decision reduction of [MM11]? This may allow the proof to apply also to Ring-LWE with uniform noise.