1 Introduction

Lattice reduction is a powerful algorithmic tool for solving a wide range of problems, from integer optimization to problems in algebra and number theory. Lattice reduction has played a role in the cryptanalysis of cryptosystems not directly related to lattices, and is now even more relevant to quantifying the security of lattice-based cryptosystems [1, 6, 14].

The goal of lattice reduction is to find a basis with short and nearly orthogonal vectors. In 1982, the first polynomial-time lattice reduction algorithm, \(\text {LLL}\) [15], was invented by Lenstra, Lenstra and Lovász. The idea of block-wise reduction then appeared, and several block-wise lattice reduction algorithms [7, 8, 19, 24] were proposed successively. Currently, \(\text {BKZ}\) is the most practical lattice reduction algorithm. Schnorr and Euchner first put forward the original \(\text {BKZ}\) algorithm in [24]. It has since received many heuristic optimizations, such as early-abort [12], pruned enumeration [10] and progressive reduction [2, 4]. All such improvements were combined in the so-called BKZ 2.0 algorithm of Chen and Nguyen [5] (the progressive strategy was improved further in later work [2]). In addition, many analyses [2, 9, 19, 23, 31] of \(\text {BKZ}\) algorithms have been conducted to explore and predict their performance, providing rough security estimates for lattice-based cryptography.

Despite their popularity, the behavior of lattice reduction algorithms is still not completely understood. While there are reasonable models (e.g. the Geometric Series Assumption [25] and simulators [5]), there are few studies of the experimental statistical behavior of those algorithms, and the existing ones [3, 20, 23] considered rather outdated versions of those algorithms. The accuracy of the current models remains unclear.

This state of affairs makes it quite problematic to accurately evaluate the concrete security level of lattice-based cryptosystem proposals. With the recent call for post-quantum schemes by NIST, this matter seems pressing.

Our Contribution. In this work, we partially address this matter by proposing a second-order statistical analysis (for random input bases) of the behavior of reduction algorithms in practice, both qualitatively and quantitatively. We identify one more low-order term in the predicted average value of several quantities, such as the root Hermite factor. We also investigate the variation around the average behavior, a legitimate concern raised by Micciancio and Walter [19].

In more detail, we experimentally study the logarithms of the ratios between adjacent Gram-Schmidt norms in \(\text {LLL}\)- and \(\text {BKZ}\)-reduced bases (denoted \(r_i\)’s below). We highlight three ranges for the statistical behavior of the \(r_i\)’s: the head (\(i \le h\)), the body (\(h< i < n-t\)) and the tail (\(i \ge n-t\)). The lengths of the head and tail are essentially determined by the blocksize \(\beta \). In the body range, the statistical behaviors of the \(r_i\)’s are similar: this not only provides new support for the so-called Geometric Series Assumption [25] when \(\beta \ll n\), but also a refinement of it applicable even when \(\beta \not \ll n\). We note in particular that the impact of the head on the root Hermite factor is much stronger than that of the tail.

We also study the variances of and the covariances between the \(r_i\)’s. We observe a local correlation between the \(r_i\)’s; more precisely, \(r_i\) and \(r_{i+1}\) are negatively correlated. This induces a self-stabilizing behavior of those algorithms: the overall variance is less than the sum of the local variances.

Then, we measure the half-volume, i.e. \(\prod _{i=1}^{\lfloor \frac{n}{2}\rfloor }\Vert \mathbf {b}_i^*\Vert \), a quantity determining the cost of enumeration on a reduced basis. By expressing the half-volume in terms of the statistics of the \(r_i\)’s, we determine that the complexity of enumeration on a \(\text {BKZ}\)-reduced basis should be of the form \(2^{an^2 \pm bn^{1.5}}\): the variation around the average (denoted by ±) can impact the speed of enumeration by a super-exponential factor.

Finally, we compare all those experimental results to the simulator of [5], and conclude that the simulator predicts the body of the profile and the tail phenomenon qualitatively and quantitatively, but does not capture the head phenomenon. It is therefore necessary to revise the security estimates and refine the simulator.

Impact. Our work points at several inaccuracies of the current models for the behavior of LLL and BKZ, and quantifies them experimentally. It should be noted that our measured statistics are barely enough to address the question of precise prediction. Many tweaks are typically applied to those algorithms (more aggressive pruning, more subtle progressive reductions, ...) to accelerate them, and these would impact the statistics. On the other hand, the optimal parametrization of such heuristic tweaks is very painful to reproduce, and not even clearly determined in the literature. We therefore find it preferable to first study stable versions of those algorithms, and to minimize the space of parameters.

We would also not dare to simply guess extrapolation models for those statistics at larger blocksizes: this should be the topic of a more theoretical study.

Yet, by pointing out precisely the problematic phenomena, we set the ground for revised models and simulators: our reported statistics can be used to sanity check such future models and simulators.

Source code. Our experiments heavily rely on the latest improvements of the open-source library fplll [27], catching up with the state-of-the-art algorithm BKZ 2.0. For convenience, we used the Python wrapper fpylll [28] for fplll, making our scripts reasonably concise and readable. All our scripts are open-source and available online, for reviewing, reproduction or extension purposes.

2 Preliminaries

We refer to [21] for a detailed introduction to lattice reduction and to [12, 16] for an introduction to the behavior of LLL and BKZ.

2.1 Notations and Basic Definitions

All vectors are denoted by bold lower-case letters and are to be read as row vectors. Matrices are denoted by bold capital letters. We write a matrix \(\mathbf {B}\) as \(\mathbf {B}= (\mathbf {b}_1,\cdots ,\mathbf {b}_n)\), where \(\mathbf {b}_i\) is the i-th row vector of \(\mathbf {B}\). If \(\mathbf {B}\in \mathbb {R}^{n \times m}\) has full rank n, the lattice \(\mathcal {L}\) generated by the basis \(\mathbf {B}\) is denoted by \(\mathcal {L}(\mathbf {B}) = \{\mathbf {x}\mathbf {B}\ |\ \mathbf {x}\in \mathbb {Z}^n\}\). We denote by \((\mathbf {b}_1^*,\cdots ,\mathbf {b}_n^*)\) the Gram-Schmidt orthogonalization of the matrix \((\mathbf {b}_1,\cdots ,\mathbf {b}_n)\). For \(i \in \{1,\cdots ,n\}\), we denote by \(\pi _i\) the orthogonal projection onto \(\mathrm {span}(\mathbf {b}_1,\cdots ,\mathbf {b}_{i-1})^{\perp }\). For \(1\le i < j \le n\), we denote by \(\mathbf {B}_{[i,j]}\) the local block \((\pi _i(\mathbf {b}_i),\cdots ,\pi _i(\mathbf {b}_j))\), and by \(\mathcal {L}_{[i,j]}\) the lattice generated by \(\mathbf {B}_{[i,j]}\).
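For concreteness, the Gram-Schmidt orthogonalization and its coefficients can be computed as in the following minimal numpy sketch (for illustration only; our experiments rely on fplll’s floating-point GSO rather than on this code):

```python
import numpy as np

def gram_schmidt(B):
    """Gram-Schmidt orthogonalization of the rows of B.

    Returns B* (rows b*_1, ..., b*_n) and the coefficients
    mu[i, j] = <b_i, b*_j> / <b*_j, b*_j> for j < i.
    """
    n = B.shape[0]
    Bs = np.array(B, dtype=float)   # will hold the b*_i's
    mu = np.eye(n)
    for i in range(n):
        for j in range(i):
            mu[i, j] = np.dot(B[i], Bs[j]) / np.dot(Bs[j], Bs[j])
            Bs[i] -= mu[i, j] * Bs[j]
    return Bs, mu
```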

The Euclidean norm of a vector \(\mathbf {v}\) is denoted by \(\Vert \mathbf {v}\Vert \). The volume of a lattice \(\mathcal {L}(\mathbf {B})\) is \(\text {vol}(\mathcal {L}(\mathbf {B})) = \prod _i \Vert \mathbf {b}_i^*\Vert \), that is an invariant of the lattice. The first minimum of a lattice \(\mathcal {L}\) is the length of a shortest non-zero vector, denoted by \(\lambda _1(\mathcal {L})\). We use the shorthands \(\text {vol}(\mathbf {B}) = \text {vol}(\mathcal {L}(\mathbf {B}))\) and \(\lambda _1(\mathbf {B}) = \lambda _1(\mathcal {L}(\mathbf {B}))\).

Given a random variable X, we denote by \(\mathbf {E}(X)\) its expectation and by \(\mathbf {Var}(X)\) its variance. Also we denote by \(\mathbf {Cov}(X,Y)\) the covariance between two random variables X and Y. Let \(\mathbf {X} = (X_1,\cdots ,X_n)\) be a vector formed by random variables, its covariance matrix is defined by \(\mathbf {Cov}(\mathbf {X})= (\mathbf {Cov}(X_i,X_j))_{i,j}\).

2.2 Lattice Reduction: In Theory and in Practice

We now recall the definitions of \(\text {LLL}\) and \(\text {BKZ}\) reduction. A basis \(\mathbf {B}\) is \(\text {LLL}\)-reduced with parameter \(\delta \in (\frac{1}{2},1]\), if:

1. \(|\mu _{i,j}| \le \frac{1}{2}\), for \(1\le j < i \le n\), where \(\mu _{i,j} = \langle \mathbf {b}_i, \mathbf {b}_j^*\rangle / \langle \mathbf {b}_j^*, \mathbf {b}_j^*\rangle \) are the Gram-Schmidt orthogonalization coefficients;

2. \(\delta \Vert \mathbf {b}_i^*\Vert \le \Vert \mathbf {b}_{i+1}^* + \mu _{i+1,i}\mathbf {b}_i^*\Vert \), for \(1 \le i < n\).

A basis \(\mathbf {B}\) is \(\text {BKZ}\)-reduced with parameter \(\beta \ge 2\) and \(\delta \in (\frac{1}{2},1]\), if:

1. \(|\mu _{i,j}| \le \frac{1}{2}\), for \(1\le j < i \le n\);

2. \(\delta \Vert \mathbf {b}_i^*\Vert \le \lambda _1(\mathcal {L}_{[i,\min (i+\beta -1,n)]})\), for \(1\le i < n\).

Note that we follow the definition of \(\text {BKZ}\) reduction from [24] which is a little different from the first notion proposed by Schnorr [26]. We also recall that, as proven in [24], \(\text {LLL}\) is equivalent to \(\text {BKZ}_2\). Typically, \(\text {LLL}\) and \(\text {BKZ}\) are used with Lovász parameter \(\delta = \sqrt{0.99}\) and so will we.

For high-dimensional lattices, running \(\text {BKZ}\) with a large blocksize is very expensive. Heuristic improvements were developed, and combined by Chen and Nguyen [5] into what is advertised as \(\text {BKZ}\) 2.0. In this paper, we report on pure BKZ behavior to avoid perturbations due to heuristics whenever possible. Yet we switch to \(\text {BKZ}\) 2.0 to reach larger blocksizes when deemed relevant.

The two main improvements in \(\text {BKZ}\) 2.0 are called early-abort and pruned enumeration. As proven in [12], the output basis of the \(\text {BKZ}\) algorithm with blocksize \(\beta \) is of good enough quality after \(C\cdot \frac{n^2}{\beta ^2}\left( \log n + \log \log \max \frac{\Vert \mathbf {b}_i^*\Vert }{\text {vol}(\mathcal {L})^{1/n}}\right) \) tours, where C is a small constant. In our \(\text {BKZ}\) 2.0 experiments, we chose different values of C and observed their effect on the final basis. We also applied the pruning heuristic (see [4, 10, 27] for details) to speed up enumeration, but chose a conservative success probability (\(95\% \)) without re-randomization to avoid altering the quality of the output. The preprocessing-pruning strategies were optimized using the strategizer [29] of fplll/fpylll.
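For illustration, this tour bound can be evaluated as follows (a minimal Python sketch; the function name and interface are ours, not fplll’s):

```python
from math import ceil, log

def tour_bound(C, n, beta, gs_norms):
    """Tour bound of [12]: ceil(C * n^2/beta^2 * (ln n + ln ln m)),
    where m = max_i ||b*_i|| / vol(L)^{1/n}. Assumes m > 1, e.g. the
    input basis is far from reduced (such as a Hermite normal form)."""
    logs = [log(x) for x in gs_norms]         # ln ||b*_i||
    log_vol_root = sum(logs) / n              # ln vol(L)^{1/n}
    log_m = max(l - log_vol_root for l in logs)
    return ceil(C * n**2 / beta**2 * (log(n) + log(log_m)))
```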

Given a basis \(\mathbf {B}\) of an n-dimensional lattice \(\mathcal {L}\), we denote by \(\mathbf {rhf}(\mathbf {B})\) the root Hermite factor of \(\mathbf {B}\), defined by \(\mathbf {rhf}(\mathbf {B}) = \left( \frac{\Vert \mathbf {b}_1\Vert }{\text {vol}(\mathcal {L})^{1/n}}\right) ^{1/n}.\) The root Hermite factor is a common measurement of the reducedness of a basis, see e.g. [9].

Let us define the sequence \(\{r_i(\mathbf {B})\}_{1\le i\le n-1}\) of an n-dimensional lattice basis \(\mathbf {B}= (\mathbf {b}_1,\cdots ,\mathbf {b}_n)\) such that \(r_i(\mathbf {B}) = \ln \left( \Vert \mathbf {b}_i^*\Vert /\Vert \mathbf {b}_{i+1}^*\Vert \right) \). The root Hermite factor \(\mathbf {rhf}(\mathbf {B})\) can be expressed in terms of the \(r_i(\mathbf {B})\)’s:

$$\begin{aligned} \mathbf {rhf}(\mathbf {B}) = \exp \left( \frac{1}{n^2}\sum _{1\le i\le n-1}(n-i)r_{i}(\mathbf {B})\right) . \end{aligned}$$
(1)

Intuitively, the sequence \(\{r_i(\mathbf {B})\}_{1\le i\le n-1}\) characterizes how fast the sequence \(\{\Vert \mathbf {b}_i^*\Vert \}\) decreases. Thus Eq. (1) formalizes the following implication: if the \(\Vert \mathbf {b}_i^*\Vert \)’s do not decrease too fast, then the root Hermite factor is small. For reduced bases, the \(r_i(\mathbf {B})\)’s satisfy certain theoretical upper bounds. However, it is well known that in practice the \(r_i(\mathbf {B})\)’s tend to be much smaller than these bounds.
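The following sketch (assuming numpy, with naming of our own choosing) computes the \(r_i\)’s from a list of Gram-Schmidt norms and evaluates Eq. (1):

```python
import numpy as np

def r_sequence(gs_norms):
    """r_i(B) = ln(||b*_i|| / ||b*_{i+1}||), for i = 1, ..., n-1."""
    logs = np.log(np.asarray(gs_norms, dtype=float))
    return logs[:-1] - logs[1:]

def rhf(gs_norms):
    """Root Hermite factor of B, evaluated via Eq. (1)."""
    r = r_sequence(gs_norms)
    n = len(gs_norms)
    weights = n - np.arange(1, n)      # n - i, for i = 1, ..., n-1
    return np.exp(np.dot(weights, r) / n**2)
```

One can sanity-check that this agrees with the direct definition \((\Vert \mathbf {b}_1\Vert /\text {vol}(\mathcal {L})^{1/n})^{1/n}\), since \(\text {vol}(\mathcal {L}) = \prod _i \Vert \mathbf {b}_i^*\Vert \).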

From a practical perspective, we are more interested in the behavior of the \(r_i(\mathbf {B})\)’s for random lattices. The standard notion of random real lattices of given volume is based on the Haar measure of classical groups. As shown in [11], the uniform distribution over integer lattices of volume V converges to the distribution of random lattices of unit volume, as V grows to infinity. In our experiments, we followed the sampling procedure of the lattice challenges [22]: the volume is a random prime of bit-length 10n, and the Hermite normal form (see [18] for details) is sampled uniformly once the volume is determined. Also, we define a random \(\text {LLL}\) (resp. \(\text {BKZ}_\beta \))-reduced basis as the basis output by \(\text {LLL}\) (resp. \(\text {BKZ}_\beta \)) applied to a random lattice given by its Hermite normal form, as described above. To speed up convergence, following a simplified progressive strategy [2, 4], we run \(\text {BKZ}\) (resp. \(\text {BKZ}\) 2.0) with blocksizes \(\beta =2,4,6,\ldots \) (resp. \(\beta =2,6,10,\ldots \)) progressively, starting from the Hermite normal form of the lattice.
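The following fpylll sketch illustrates this progressive reduction; the q-ary sampler below is merely a stand-in for the lattice-challenge generator described above, and the schedule matches the plain \(\text {BKZ}\) case:

```python
from fpylll import IntegerMatrix, LLL, BKZ, GSO

n = 100
# Stand-in input: a random q-ary lattice basis (our experiments instead
# use the Goldstein-Mayer-style Hermite normal forms described above).
A = IntegerMatrix.random(n, "qary", k=n // 2, bits=30)
LLL.reduction(A)

for beta in range(2, 31, 2):    # progressive blocksizes 2, 4, 6, ...
    # fplll's delta applies to squared norms: 0.99 = (sqrt(0.99))^2.
    A = BKZ.reduction(A, BKZ.Param(block_size=beta, delta=0.99))

M = GSO.Mat(A)
M.update_gso()
gs_norms = [M.get_r(i, i) ** 0.5 for i in range(n)]   # the ||b*_i||'s
```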

We treat the \(r_i(\mathbf {B})\)’s as random variables (under the randomness of the lattice basis before reduction). For any \(i \in \{1,\cdots ,n-1\}\), we denote by \(r_i(\beta ,n)\) the random variable \(r_i(\beta ,n) = r_i(\mathbf {B})\), where \(\mathbf {B}\) is a random \(\text {BKZ}_\beta \)-reduced basis, and by \(\mathbb {D}_i(\beta ,n)\) the distribution of \(r_i(\beta ,n)\). When \(\beta \) and n are clear from context, we simply write \(r_i\) for \(r_i(\beta , n)\).

2.3 Heuristics on Lattice Reduction

Gaussian Heuristic. The Gaussian heuristic, denoted by \(\text {GAUSS}\), says that, for “any reasonable” subset K of the span of the lattice \(\mathcal {L}\), the number of lattice points inside K is approximately \(\text {vol}(K)/\text {vol}(\mathcal {L})\). Let \(V_n(1) = \frac{\pi ^{n/2}}{\Gamma (n/2+1)}\) be the volume of the n-dimensional unit ball. A prediction derived from \(\text {GAUSS}\) is that \(\lambda _1(\mathcal {L}) \approx \text {vol}(\mathcal {L})^{1/n}\cdot \text {GH}(n)\), where \(\text {GH}(n) = V_n(1)^{-1/n}\); this is accurate for random lattices. As suggested in [10, 13], \(\text {GAUSS}\) is a valuable heuristic for estimating the cost and quality of various lattice algorithms.
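\(\text {GH}(n)\) can be evaluated directly, using the log-Gamma function to avoid overflow (a short Python sketch):

```python
from math import exp, lgamma, log, pi

def gh(n):
    """GH(n) = V_n(1)^{-1/n}, so that lambda_1(L) ~ GH(n) * vol(L)^{1/n}
    for a random n-dimensional lattice."""
    log_ball_vol = (n / 2) * log(pi) - lgamma(n / 2 + 1)   # ln V_n(1)
    return exp(-log_ball_vol / n)

# e.g. gh(100) is roughly 2.49
```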

Random Local Block. In [5], Chen and Nguyen suggested the following modeling assumption, seemingly accurate for large enough blocksizes:

Assumption 1

\([\text {RAND}_{n,\beta }]\) Let \(n, \beta \ge 2\) be integers. For a random \(\text {BKZ}_{\beta }\)-reduced basis of a random n-dimensional lattice, most local block lattices \(\mathcal {L}_{[i,i+\beta -1]}\) behave like a random \(\beta \)-dimensional lattice where \(i \in \{1,\cdots ,n+1-\beta \}\).

By \(\text {RAND}_{n,\beta }\) and \(\text {GAUSS}\), one can predict the root Hermite factor of local blocks: \(\mathbf {rhf}(\mathbf {B}_{[i,i+\beta -1]}) \approx \text {GH}(\beta )^{\frac{1}{\beta }}.\)

Geometric Series Assumption. In [25], Schnorr first proposed the Geometric Series Assumption, denoted by \(\text {GSA}\), which says that, in a typical reduced basis \(\mathbf {B}\), the sequence \(\{\Vert \mathbf {b}_i^*\Vert \}_{1\le i\le n}\) looks like a geometric series (while \(\text {GAUSS}\) provides the exact value of this geometric ratio). \(\text {GSA}\) provides a simple description of the Gram-Schmidt norms and thus leads to estimates of the Hermite factor and of the enumeration complexity [9, 10]. In terms of \(\{r_i(\mathbf {B})\}_{1\le i \le n-1}\), \(\text {GSA}\) states that the \(r_i(\mathbf {B})\)’s should be almost equal to each other. However, \(\text {GSA}\) is not perfect: the first and last \(\mathbf {b}_i^*\)’s usually violate it [3]. The behavior in the tail is well explained, and can be predicted and simulated [5].

3 Head and Tail

In [3, 5], it was already claimed that, for a \(\text {BKZ}_\beta \)-reduced basis \(\mathbf {B}\), \(\text {GSA}\) does not hold at the first and last indices. We call this phenomenon “Head and Tail”, and provide detailed experiments. Our experiments confirm that \(\text {GSA}\) holds in a strong sense in the body of the basis (i.e. outside of the head and tail regions): precisely, the distributions of the \(r_i\)’s are similar in that region, not only their averages. We also confirm the violations of \(\text {GSA}\) in the head and the tail, quantify them, and show that they are independent of the dimension n.

As a conclusion, we shall see that the head and tail have only small impacts on the root Hermite factor when \(n \gg \beta \), but also that they can be quantitatively handled when \(n \not \gg \beta \). We notice that the head has in fact a stronger impact than the tail, which emphasizes the importance of finding models or simulators that capture this phenomenon, unlike the current ones that only capture the tail [5].

3.1 Experiments

We ran \(\text {BKZ}\) on many random input lattices and report on the distribution of each \(r_i\). We first plot the average and the variance of \(r_i\) for various blocksizes \(\beta \) and dimensions n in Fig. 1. By superposing, with proper alignment, the curves for the same \(\beta \) but various n, we notice that the head and tail behaviors do not depend on the dimension n, but only on the relative index i (resp. \(n-i\)) in the head (resp. the tail). A more formal statement is provided in Claim 1.

We also note that inside the body (i.e. outside both the head and the tail), the mean and the variance of \(r_i\) do not seem to depend on i, and we are tempted to conclude that the distribution itself does not depend on i. To give further evidence for this stronger claim, we ran the Kolmogorov-Smirnov test [17] on the samples of \(r_i\) and \(r_j\) for varying i, j. The results are depicted in Fig. 2, and confirm the stronger claim.
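This pairwise testing can be reproduced along the following lines (a sketch assuming scipy, with R[s, i] holding the sample of \(r_{i+1}\) from the s-th reduced basis):

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_grid(R, alpha=0.05):
    """Pairwise two-sample Kolmogorov-Smirnov tests between columns of R.
    Entry (i, j) is True when the samples of r_{i+1} and r_{j+1} are not
    distinguished at significance level alpha (black pixels in Fig. 2)."""
    m = R.shape[1]
    grid = np.empty((m, m), dtype=bool)
    for i in range(m):
        for j in range(m):
            grid[i, j] = ks_2samp(R[:, i], R[:, j]).pvalue > alpha
    return grid
```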

Fig. 1. Average value and standard deviation of \(r_i\) as a function of i. Experimental values measured over 5000 samples of random n-dimensional \(\text {BKZ}\) bases for \(n = 100,140\). First halves \(\{r_i\}_{i\le (n-1)/2}\) are left-aligned while last halves \(\{r_i\}_{i>(n-1)/2}\) are right-aligned, so as to highlight heads and tails. Dashed lines mark indices \(\beta \) and \(n-\beta \). Plots for blocksizes \(\beta = 6, 10, 20, 30\) and dimensions \(n=80, 100, 120, 140\) look similar; they are provided in the full version.

Fig. 2. Kolmogorov-Smirnov test with significance level 0.05 on all \(\mathbb {D}_i(\beta ,100)\)’s, calculated from 5000 samples of random 100-dimensional \(\text {BKZ}\) bases with blocksizes \(\beta = 2, 20\) respectively. A black pixel at position (i, j) marks the fact that the pair of distributions \(\mathbb {D}_i(\beta ,100)\) and \(\mathbb {D}_j(\beta ,100)\) passed the Kolmogorov-Smirnov test, i.e. the two distributions are close. Plots for \(\beta = 10, 30\) look similar to that for \(\beta = 20\); they are provided in the full version.

3.2 Conclusion

From the experiments above, we allow ourselves to draw the following conclusion.

Experimental Claim 1

There exist two functions \(h,t :\mathbb {N}\rightarrow \mathbb {N}\), such that, for all \(n,\beta \in \mathbb {N}\), and when \(n \ge h(\beta ) + t(\beta ) + 2\):

1. When \(i \le h(\beta )\), \(\mathbb {D}_i(\beta ,n)\) depends on i and \(\beta \) only: \(\mathbb {D}_i(\beta ,n) = \mathbb {D}^h_i(\beta )\);

2. When \(h(\beta )< i < n - t(\beta )\), \(\mathbb {D}_i(\beta ,n)\) depends on \(\beta \) only: \(\mathbb {D}_i(\beta ,n) = \mathbb {D}^b(\beta )\);

3. When \(i \ge n - t(\beta )\), \(\mathbb {D}_i(\beta ,n)\) depends on \(n-i\) and \(\beta \) only: \(\mathbb {D}_i(\beta ,n) = \mathbb {D}^t_{n-i}(\beta )\).

Remark 1

We only make this claim for bases that have been fully BKZ-reduced. Indeed, as we shall see later, we obtained experimental clues that this claim does not hold when the early-abort strategy is applied. More precisely, the head and tail phenomenon gets stronger as more tours are applied (see Fig. 4).

From now on, we may omit the index i when speaking of the distribution of \(r_i\), implicitly restricting to indices such that \(h(\beta )< i < n-t(\beta )\). The random variable r then depends on the blocksize \(\beta \) only, hence we introduce two functions of \(\beta \), \(e(\beta )\) and \(v(\beta )\), denoting the expectation and variance of r respectively. Also, we denote by \(r_i^{(h)}\) (resp. \(r_{n-i}^{(t)}\)) the \(r_i\)’s inside the head (resp. tail), and by \(e_i^{(h)}(\beta )\) and \(v_i^{(h)}(\beta )\) (resp. \(e_{n-i}^{(t)}(\beta )\) and \(v_{n-i}^{(t)}(\beta )\)) the expectation and variance of \(r_i^{(h)}\) (resp. \(r_{n-i}^{(t)}\)).

We conclude with a statement on the impacts of the head and tail on the average logarithm of the root Hermite factor:

Corollary 1

For a fixed blocksize \(\beta \), and as the dimension n grows, it holds that

$$\begin{aligned} \mathbf {E}(\ln (\mathbf {rhf}(\mathbf {B}))) = \frac{1}{2}e(\beta ) + \frac{d(\beta )}{n} + O\left( \frac{1}{n^2}\right) , \end{aligned}$$
(2)

where \(d(\beta )=\sum _{i \le h}e_i^{(h)}(\beta ) - \left( h+\frac{1}{2}\right) e(\beta )\).

Corollary 1 indicates that the impacts of the head and tail on the average root Hermite factor decrease with the dimension. In particular, the tail has only a very small effect, of order \(O\left( \frac{1}{n^2}\right) \), on the average root Hermite factor. The impact of the head, \(d(\beta ) / n\), which had not been quantified in earlier work, is, perhaps surprisingly, asymptotically larger. We include the proof of Corollary 1 in Appendix A.

Below, Figs. 3 and 4 provide experimental measures of \(e(\beta )\) and \(d(\beta )\) from 5000 random 100-dimensional \(\text {BKZ}_\beta \)-reduced bases. We note that the lengths of the head and tail seem to be about \(\max (15,\beta )\). We therefore simply set \(h(\beta )=t(\beta )=\max (15,\beta )\), which barely affects the measures of \(e(\beta )\) and \(d(\beta )\). From the average \(e(2) \approx 0.043\) we recover the experimental root Hermite factor of \(\text {LLL}\), \(\mathbf {rhf}(\mathbf {B}) = \exp (0.043/2) \approx 1.022\), compatible with many other experiments [9].
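This measurement is a few lines of code (a numpy sketch, with R[s, i] holding \(r_{i+1}\) of the s-th \(\text {BKZ}_\beta \)-reduced basis, as before):

```python
import numpy as np

def e_and_d(R, beta):
    """Estimate e(beta) and d(beta), with h = t = max(15, beta)."""
    n = R.shape[1] + 1                   # columns are r_1, ..., r_{n-1}
    h = t = max(15, beta)
    means = R.mean(axis=0)               # E(r_i), i = 1, ..., n-1
    e = means[h:n - 1 - t].mean()        # body average, h < i < n - t
    d = means[:h].sum() - (h + 0.5) * e  # cf. Corollary 1
    return e, d
```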

To extend the curves, we also plot the experimental measures of \(e(\beta )\) and \(d(\beta )\) from 20 random 180-dimensional \(\text {BKZ}_\beta \) 2.0 bases with the tour number bounded by \(\left\lceil C\cdot \frac{n^2}{\beta ^2}\left( \log n + \log \log \max \frac{\Vert \mathbf {b}_i^*\Vert }{\text {vol}(\mathcal {L})^{1/n}}\right) \right\rceil \). It shows that the behavior of BKZ 2.0 differs from that of full BKZ not only quantitatively but also qualitatively: there is a bump in the curve of \(e(\beta )\) when \(\beta \in [22,30]\). Considering that the success probability for the SVP enumeration was set to 95%, the only viable explanation for this phenomenon in our BKZ 2.0 experiments is the early-abort strategy: the shape of the basis is not so close to the fixed point.

Fig. 3. Experimental measure of \(e(\beta )\).

Fig. 4. Experimental measure of \(d(\beta )\).

4 Local Correlations and Global Variance

In the previous section, we classified the \(r_i\)’s and established a connection between the average of the root Hermite factor and the function \(e(\beta )\). We now report on the (co-)variances of the \(r_i\)’s. Figure 5 shows the experimental measure of the local variances, i.e. the variances of the \(r_i\)’s inside the body, but this is not enough to deduce the global variance, i.e. the variance of the root Hermite factor. We also need to understand the covariances among the \(r_i\)’s. Our experiments indicate that local correlations, i.e. correlations between \(r_i\) and \(r_{i+1}\), are negative, and that the other correlations seem to be zero. Moreover, we confirm the tempting hypothesis that the local correlations inside the body are all equal and independent of the dimension n.

Based on these observations, we then express asymptotically the variance of the logarithm of the root Hermite factor, for fixed \(\beta \) and increasing n, and quantify the self-stability of the \(\text {LLL}\) and \(\text {BKZ}\) algorithms.

Fig. 5. Experimental measure of \(v(\beta )\).

4.1 Experiments

Let \(\mathbf {r}= (r_1,\cdots ,r_{n-1})\) be the random vector formed by the random variables \(r_i\). We profile the covariance matrices \(\mathbf {Cov}(\mathbf {r})\) for 100-dimensional lattices under \(\text {BKZ}\) reduction with different blocksizes in Fig. 6. The diagonal elements of the covariance matrix correspond to the variances of the \(r_i\)’s, which we have studied before; we thus set all diagonal elements to 0 to enhance contrast. We discover that the elements on the second diagonals, i.e. the \(\mathbf {Cov}(r_i,r_{i+1})\)’s, are significantly negative, while the other elements seem very close to 0. We call the \(\mathbf {Cov}(r_i,r_{i+1})\)’s local covariances.
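These matrices and their off-diagonals can be extracted as follows (a numpy sketch, with the same sample layout R[s, i] as before):

```python
import numpy as np

def covariance_profile(R):
    """Covariance matrix of r = (r_1, ..., r_{n-1}), together with its
    first two off-diagonals: the local covariances Cov(r_i, r_{i+1})
    (significantly negative) and the Cov(r_i, r_{i+2})'s (close to 0)."""
    C = np.cov(R, rowvar=False)      # columns of R are the variables
    return C, np.diag(C, k=1), np.diag(C, k=2)
```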

Fig. 6. Covariance matrices of \(\mathbf {r}\). Experimental values measured over 5000 samples of random 100-dimensional \(\text {BKZ}\) bases with blocksizes \(\beta = 2, 20\). The pixel at coordinates (i, j) corresponds to the covariance between \(r_i\) and \(r_j\). Plots for \(\beta = 10, 30\) look similar to that for \(\beta = 20\); they are provided in the full version.

We then plot the measured local covariances in Fig. 7. Comparing these curves for various dimensions n, we notice that the head and tail parts almost coincide, and that the local covariances inside the body seem to depend on \(\beta \) only; we denote this common value by \(c(\beta )\). We also plot the curves of the \(\mathbf {Cov}(r_i,r_{i+2})\)’s in Fig. 7 and note that these curves are horizontal, with a value of about 0. For the other \(\mathbf {Cov}(r_i,r_{i+d})\)’s with larger d, the curves virtually overlap those of the \(\mathbf {Cov}(r_i,r_{i+2})\)’s; for readability, larger values of d are not plotted. One thing to note is that blocksize \(\beta =2\) is an exception. On one hand, the head and tail of the local covariances of \(\text {BKZ}_2\) bases bend in the opposite directions, unlike for larger \(\beta \). On the other hand, the \(\mathbf {Cov}(r_i,r_{i+2})\)’s of \(\text {BKZ}_2\) bases are not so close to 0, though they remain significantly smaller in absolute value than the local covariances \(\mathbf {Cov}(r_i,r_{i+1})\). This indicates some differences between \(\text {LLL}\) and \(\text {BKZ}\).

Fig. 7. \(\mathbf {Cov}(r_i,r_{i+1})\) and \(\mathbf {Cov}(r_i,r_{i+2})\) as functions of i. Experimental values measured over 5000 samples of random n-dimensional \(\text {BKZ}\) bases for \(n = 100, 140\). The blue curves denote the \(\mathbf {Cov}(r_i,r_{i+1})\)’s and the red curves the \(\mathbf {Cov}(r_i,r_{i+2})\)’s. For the same dimension n, the markers on the two curves are identical. First halves are left-aligned while last halves \(\{\mathbf {Cov}(r_i,r_{i+1})\}_{i>(n-2)/2}\) and \(\{\mathbf {Cov}(r_i,r_{i+2})\}_{i>(n-3)/2}\) are right-aligned, so as to highlight heads and tails. Dashed lines mark indices \(\beta \) and \(n-\beta -2\). Plots for blocksizes \(\beta = 6, 10, 20, 30\) and dimensions \(n=80, 100, 120, 140\) look similar; they are provided in the full version.

Also, we calculate the average of the \((n-2\max (15,\beta ))\) middle local covariances as an approximation of \(c(\beta )\) for different n, and plot the evolution of \(c(\beta )\) in Fig. 8. The curves for different dimensions seem to coincide, which provides further evidence that the local covariances inside the body indeed do not depend on n. To determine the minimum of \(c(\beta )\), we ran a batch of \(\text {BKZ}\) with \(\beta =2,3,4,5,6\) separately. We note that \(c(\beta )\) increases with \(\beta \), except that \(c(3) < c(2)\), which is another difference between \(\text {LLL}\) and \(\text {BKZ}\).
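Concretely, this estimate of \(c(\beta )\) is obtained as follows (a sketch continuing the previous one):

```python
import numpy as np

def c_estimate(local_covs, beta):
    """Approximate c(beta): average of the middle local covariances
    Cov(r_i, r_{i+1}), discarding about max(15, beta) entries per side."""
    m = max(15, beta)
    return float(np.mean(local_covs[m:len(local_covs) - m]))
```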

Remark 2

To obtain a precise measure of the covariances, many samples are needed, and thus an extended experimental measure of \(c(\beta )\) is not given. Nevertheless, it seems that, after a certain number of tours, the local covariances of \(\text {BKZ}\) 2.0 bases still tend to be negative while the other covariances tend to zero.

Fig. 8. Experimental measure of the evolution of \(c(\beta )\), calculated from 5000 samples of random \(\text {BKZ}\) bases in each dimension n.

Fig. 9. Experimental measure of \(\frac{v(\beta )+2c(\beta )}{3}\). The data point for \(\beta =2\), \(\frac{v(2)+2c(2)}{3} \approx 0.00045\), was clipped out, being 10 times larger than all other values.

4.2 Conclusion

From the above experimental observations, we arrive at the following conclusion.

Experimental Claim 2

Let h and t be the two functions defined in Claim 1. For all \(n \in \mathbb {N}\) and \(\beta >2\) such that \(n \ge h(\beta ) + t(\beta ) + 2\):

1. When \(|i - j|>1\), \(r_i\) and \(r_j\) are not correlated: \(\mathbf {Cov}(r_i,r_j) = 0\).

2. When \(|i - j|=1\), \(r_i\) and \(r_j\) are negatively correlated: \(\mathbf {Cov}(r_i,r_j) < 0\). More specifically:

   • When \(i \le h(\beta )\), \(\mathbf {Cov}(r_i,r_{i+1})\) depends on i and \(\beta \) only: \(\mathbf {Cov}(r_i,r_{i+1}) = c^h_i(\beta )\);

   • When \(h(\beta )< i < n - t(\beta )\), \(\mathbf {Cov}(r_i,r_{i+1})\) depends on \(\beta \) only: \(\mathbf {Cov}(r_i,r_{i+1}) = c(\beta )\);

   • When \(i \ge n - t(\beta )\), \(\mathbf {Cov}(r_i,r_{i+1})\) depends on \(n-i\) and \(\beta \) only: \(\mathbf {Cov}(r_i,r_{i+1}) = c^t_{n-i}(\beta )\).

One direct consequence of the above experimental claim is that the global variance, i.e. the variance of the logarithm of the root Hermite factor, converges to 0 as \(\varTheta (1/n)\), where the hidden constant is determined by \(\beta \):

Corollary 2

For a fixed blocksize \(\beta \), and as the dimension n grows, it holds that

$$\begin{aligned} \mathbf {Var}(\ln (\mathbf {rhf}(\mathbf {B}))) = \frac{1}{3n}v(\beta ) + \frac{2}{3n}c(\beta )+ O\left( \frac{1}{n^2}\right) . \end{aligned}$$
(3)

The proof of Corollary 2 is given in Appendix B. Note that the assumption that all \(\mathbf {Cov}(r_i,r_{i+d})\)’s with \(d>1\) equal 0 may not be exactly true. However, the \(\mathbf {Cov}(r_i,r_{i+d})\)’s converge to 0 quickly as d increases, hence we may assert that the sum \(\sum _{d=1}^{n-1-i}\mathbf {Cov}(r_i,r_{i+d})\) converges with n for fixed \(\beta \) and i inside the body. Then we can still conclude that \(\mathbf {Var}(\ln (\mathbf {rhf}(\mathbf {B})))=O(\frac{1}{n})\). The faster the \(\mathbf {Cov}(r_i,r_{i+d})\)’s converge to 0 as d grows, the more accurate the above approximation is. The experimental measure of \(\frac{v(\beta )+2c(\beta )}{3}\) is shown in Fig. 9; it seems to converge to a finite value \(\approx 5 \times 10^{-5}\) as \(\beta \) grows.
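Numerically, Eq. (3) translates into the following prediction (a sketch; the value \(5 \times 10^{-5}\) for \((v(\beta )+2c(\beta ))/3\) is read off Fig. 9):

```python
from math import sqrt

def rhf_log_std(v_beta, c_beta, n):
    """Standard deviation of ln(rhf(B)) predicted by Eq. (3), up to O(1/n^2)."""
    return sqrt((v_beta + 2 * c_beta) / (3 * n))

# With (v + 2c)/3 ~ 5e-5 and n = 140, the predicted standard deviation
# of ln(rhf(B)) is about 6e-4.
```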

5 Half Volume

We shall now study statistics of the half-volume, \(\text {H}(\mathbf {B})=\prod _{i=1}^{\lfloor \frac{n}{2}\rfloor }\Vert \mathbf {b}_i^*\Vert \), of a random \(\text {BKZ}\)-reduced basis \(\mathbf {B}\). As claimed in [10], the nodes in the enumeration tree at depths around \(\frac{n}{2}\) contribute the most to the total node count, for both full and regular pruned enumerations. The enumeration radius R is typically set to \(c\sqrt{n}\cdot \text {vol}(\mathbf {B})^{\frac{1}{n}}\) for some constant \(c>0\), e.g. \(R = 1.05\cdot \text {GH}(n)\cdot \text {vol}(\mathbf {B})^{\frac{1}{n}}\). With such a radius, the number of nodes at level \(\lfloor \frac{n}{2}\rfloor \) is approximately proportional to \(\frac{\text {H}(\mathbf {B})}{\text {vol}(\mathbf {B})^{\lceil \frac{n}{2}\rceil /n}}\), making the half-volume a good estimator for the cost of enumeration. Those formulas have to be amended when pruning is used (see [10]), but the half-volume remains a good indicator of the cost of enumeration.
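The centered logarithmic half-volume (the random variable \(\mathbf {hv}\) defined in the next paragraph) is straightforward to compute from the Gram-Schmidt norms (a numpy sketch):

```python
import numpy as np

def log_half_volume(gs_norms):
    """ln(H(B)) - (floor(n/2)/n) * ln(vol(B)), from the ||b*_i||'s."""
    logs = np.log(np.asarray(gs_norms, dtype=float))
    n = len(logs)
    half = n // 2
    return logs[:half].sum() - (half / n) * logs.sum()
```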

Let \(\mathbf {hv}(\beta , n)\) be the random variable \(\ln (\text {H}(\mathbf {B}))-\frac{\lfloor \frac{n}{2}\rfloor }{n}\ln (\text {vol}(\mathbf {B}))\), where \(\mathbf {B}\) is a random \(\text {BKZ}_\beta \)-reduced basis. From the above experimental claims, we conclude the following result; the proof is given in Appendix C.

Corollary 3

(Under previous experimental claims). For a fixed blocksize \(\beta \), let n be an integer such that \(n > 2\max (h(\beta ),t(\beta ))\). Then, as the dimension n grows, it holds that

$$\begin{aligned} \mathbf {E}(\mathbf {hv}(\beta , n)) = \frac{n^2}{8}e(\beta )+d'(\beta )+O\left( \frac{1}{n}\right) , \end{aligned}$$
(4)

where \(d'(\beta ) = \sum _{i \le h}\frac{i}{2}\left( e_i^{(h)}(\beta )-e(\beta )\right) + \sum _{i \le t}\frac{i}{2}\left( e_i^{(t)}-e(\beta )\right) -\frac{1}{4}\{\frac{n}{2}\}e(\beta )\), and

$$\begin{aligned} \mathbf {Var}(\mathbf {hv}(\beta , n)) = \frac{n^3}{48}(v(\beta )+2c(\beta )) + O(n) . \end{aligned}$$
(5)

Assuming heuristically that the variation around the average of \(\mathbf {hv}\) follows a Normal law, Corollary 3 implies that the complexity of enumeration on a random n-dimensional \(\text {BKZ}_\beta \)-reduced basis should be of the shape

$$\begin{aligned} \exp \left( n^2x(\beta )+y(\beta ) \pm n^{1.5}l\cdot z(\beta )\right) \end{aligned}$$
(6)

except for a fraction of at most \(\exp (-l^2/2)\) of random bases, where

$$\begin{aligned} x(\beta ) = \frac{e(\beta )}{8}, \quad y(\beta ) = d'(\beta ), \quad z(\beta ) = \sqrt{\frac{v(\beta )+2c(\beta )}{48}} \end{aligned}$$
(7)

and where the term \(\pm n^{1.5}l\cdot z(\beta )\) accounts for the variation around the average behavior. While the contribution of this variation remains asymptotically negligible compared to the main \(\exp (\varTheta (n^2))\) factor, it still introduces a super-exponential factor that can make one particular attempt much cheaper or much more expensive in practice. This means that it could be beneficial in practice to rely partially on luck, restarting BKZ without trying enumeration when the basis is unsatisfactory.
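To illustrate the spread, Eq. (6) can be evaluated in log-scale (a small sketch; x, y, z as in Eq. (7)):

```python
def enum_log_cost(n, x, y, z, l=2.0):
    """Natural logarithm of Eq. (6): the mean log-cost n^2*x + y, and the
    deviation n^1.5 * l * z exceeded by at most a fraction ~exp(-l^2/2)
    of random bases."""
    mean, spread = n**2 * x + y, n**1.5 * l * z
    return mean - spread, mean, mean + spread
```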

The experimental measures of \(8x(\beta )\) and \(16z(\beta )^2\) were shown in Figs. 3 and 9 respectively. We now exhibit the experimental measure of \(y(\beta )\) in Fig. 10. Although the curves for \(\text {BKZ}\) 2.0 are not smooth, it seems that \(y(\beta )({=}d'(\beta ))\) increases with \(\beta \) when \(\beta \) is large. However, compared to \(n^2x(\beta )\), the impact of \(y(\beta )\) on the half-volume is still much weaker.

Fig. 10. Experimental measure of \(y(\beta )({=}d'(\beta ))\).

6 Performance of Simulator

In [5], Chen and Nguyen proposed a simulator to predict the behavior of \(\text {BKZ}\). For large \(\beta \), the simulator provides a reasonable prediction of the average profile, i.e. \(\left\{ \log \left( \frac{\Vert \mathbf {b}_i^*\Vert }{\text {vol}(\mathcal {L})^{1/n}}\right) \right\} _{i=1}^n\). In this section, we further report on the performance of the simulator, qualitatively and quantitatively. Our experiments confirm that the tail still exists in the simulated result and fits the actual measure, but the head phenomenon is not captured by the simulator, affecting its accuracy for cryptanalytic predictions.

To make the simulator coincide with the actual algorithm, we set the parameter \(\delta = \sqrt{0.99}\) and applied a similar progressive strategy. The maximum tour number corresponds to the case \(C=0.25\) of [12], but the simulator always terminates after a much smaller number of tours.
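fpylll ships an implementation of the Chen-Nguyen simulator, which can be driven progressively in the same spirit (a sketch; the initial profile below is a synthetic GSA-shaped stand-in, whereas our comparisons start from actual reduced bases, and the stock simulator does not expose our \(\delta \) parameter):

```python
from fpylll import BKZ
from fpylll.tools.bkz_simulator import simulate

n = 180
# Synthetic, GSA-shaped profile of squared Gram-Schmidt norms (stand-in).
rr = [2.0 ** (0.12 * (n - 2 * i)) for i in range(n)]

for beta in range(4, 61, 4):    # a simplified progressive strategy
    rr, tours = simulate(rr, BKZ.Param(block_size=beta, max_loops=2000))
```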

6.1 Experiments

We ran the simulator on several sequences of different dimensions and plot the average values of the \(r_i\)’s in Fig. 11. An apparent tail remains in the simulated result, and the length of its most significant part is about \(\beta \), despite a slight stretch. However, there is no distinct head, which does not coincide with the actual behavior: the head shape appears after a few tours of \(\text {BKZ}\) or \(\text {BKZ}\) 2.0. Still, the \(r_i\)’s inside the body share similar values, in accordance with \(\text {GSA}\) and our experiments.

Fig. 11. Average value of \(r_i\) calculated by the simulator. First halves are left-aligned while last halves \(\{r_i\}_{i>(n-1)/2}\) are right-aligned, so as to highlight heads and tails. The vertical dashed line marks the index \(n-\beta \) and the horizontal dashed line is provided for contrast.

We now compare the average experimental behavior with the simulated result. Note that the simulator is not fed with any randomness, so it does not make sense to consider variance in this comparison.

Figure 12 illustrates the comparison on \(e(\beta )\). For small blocksizes \(\beta \), the simulator does not work well but, as \(\beta \) increases, the simulated measure of \(e(\beta )\) gets close to the experimental measure, and both measures converge to the prediction \(\ln \left( \text {GH}(\beta )^{\frac{2}{\beta -1}}\right) \).

Fig. 12. Comparison on \(e(\beta )\).

Fig. 13. Comparison on \(s^{(h)}(\beta )\).

Finally, we consider the two functions \(d(\beta )\) and \(d'(\beta )\) defined in Corollaries 1 and 3 respectively, which are relevant to the averages of the logarithm of the root Hermite factor and of the complexity of enumeration. To better understand the difference, we compared the terms \(s^{(h)}(\beta )=\sum _{i \le h}e_i^{(h)}(\beta )\), \(w^{(h)}(\beta )=\sum _{i \le h}\frac{i}{2}e_i^{(h)}(\beta )\) and \(w^{(t)}(\beta )=\sum _{i \le t}\frac{i}{2}e_i^{(t)}\), where we set \(h(\beta ) = t(\beta ) =\max (15,\beta )\) as before. Indeed, combined with \(e(\beta )\), these three terms determine \(d(\beta )\) and \(d'(\beta )\).

From Fig. 13, we observe that the simulated measure of \(s^{(h)}(\beta )\) is greater than the experimental one, which is caused by the lack of the head. A similar inaccuracy appears for \(w^{(h)}(\beta )\), as shown in Fig. 14. The experimental measure of \(e(\beta )\) is slightly greater than the simulated one, and thus the \(e_i^{(h)}(\beta )\)’s of greater weight may somewhat compensate for the lack of the head. After enough tours, the head phenomenon becomes pronounced while the body shape remains almost the same, so the simulator still cannot predict \(w^{(h)}(\beta )\) precisely. Figure 15 indicates that the simulator predicts \(w^{(t)}(\beta )\) precisely for both large and small blocksizes, and therefore the HKZ-shaped tail model is reasonable.

Fig. 14. Comparison on \(w^{(h)}(\beta )\).

Fig. 15. Comparison on \(w^{(t)}(\beta )\).

6.2 Conclusion

Chen and Nguyen’s simulator gives an elementary profile for random \(\text {BKZ}_\beta \)-reduced bases with large \(\beta \): both the body and tail shapes are reflected well in the simulated result, qualitatively and quantitatively. However, the head phenomenon is not captured by this simulator, and thus the first \(\Vert \mathbf {b}_i^*\Vert \)’s are not predicted accurately. In particular, the prediction of \(\Vert \mathbf {b}_1^*\Vert \), which determines the Hermite factor, is usually larger than the actual value, which leads to an underestimation of the quality of \(\text {BKZ}\) bases. Consequently, related security estimates need to be refined.

Understanding the main cause of the head phenomenon, modeling it and refining the simulator to include it seems an interesting and important problem, which we leave to future work. It would also be interesting to introduce some randomness in the simulator, so as to properly predict the variance around the mean behavior.