Abstract
For any pair (X, Z) of correlated random variables we can think of Z as a randomized function of X. If the domain of Z is small, one can make this function computationally efficient by allowing it to be only approximately correct. In folklore this problem is known as simulating auxiliary inputs. This idea of simulating auxiliary information turns out to be a very useful tool, finding applications in complexity theory, cryptography, pseudorandomness and zero-knowledge. In this paper we revisit this problem, achieving the following results:
-
(a)
We present a novel boosting algorithm for constructing the simulator. This boosting proof is of independent interest, as it shows how to handle “negative mass” issues when constructing probability measures by shifting distinguishers in descent algorithms. Our technique essentially fixes the flaw in the TCC’14 paper “How to Fake Auxiliary Inputs”.
-
(b)
The complexity of our simulator is better than in previous works, including results derived from the uniform min-max theorem due to Vadhan and Zheng. To achieve \((s,\epsilon )\)-indistinguishability we need complexity \(O\left( s\cdot 2^{5\ell }\epsilon ^{-2}\right) \) in time/circuit size, which improves previous bounds by a factor of \(\epsilon ^{-2}\). In particular, we get meaningful provable security for the EUROCRYPT’09 leakage-resilient stream cipher instantiated with a standard 256-bit block cipher, like \(\mathsf {AES256}\).
Our boosting technique utilizes a two-step approach. In the first step we shift the current result (as in gradient or sub-gradient descent algorithms) and in the separate step we fix the biggest non-negative mass constraint violation (if applicable).
The full (and updated) version of this paper is available at the Cryptology ePrint archive and the arXiv archive (http://arxiv.org/abs/1503.00484).
M. Skorski—Supported by the National Science Center, Poland (2015/17/N/ST6/03564).
Keywords
- Simulating auxiliary inputs
- Boosting
- Leakage-resilient cryptography
- Stream ciphers
- Computational indistinguishability
1 Introduction
1.1 Simulating Correlated Information
Informal Problem Statement. Let \((X,Z)\in \mathcal {X}\times \mathcal {Z}\) be a pair of correlated random variables. We can think of Z as a randomized function of X. More precisely, consider the randomized function \(h:\mathcal {X}\rightarrow \mathcal {Z}\), which for every x outputs z with probability \(\Pr [Z=z|X=x]\). By definition it satisfies
$$\begin{aligned} (X,{h(X)}) \overset{d}{=} (X,Z) \end{aligned}$$(1)
however the function h is inefficient as we need to hardcode the conditional probability table of Z|X. It is natural to ask whether this limitation can be overcome
Q1: Can we represent Z as an efficient function of X?
Not surprisingly, it turns out that a positive answer may be given only in computational settings. Note that replacing the equality in Eq. (1) by closeness in the total variation distance (allowing the function h to make some mistakes with small probability) is not enoughFootnote 1. This discussion leads to the following reformulated question
Q1’: Can we efficiently simulate Z as a function of X?
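Q1's "exact but inefficient" simulator is easy to sketch concretely. Below is a minimal Python illustration (the joint distribution and all names are our own toy choices): h hardcodes the conditional probability table of Z|X, which is precisely what makes it inefficient over large domains.

```python
import random
from collections import defaultdict

# Toy joint distribution Pr[X=x, Z=z]; the numbers are illustrative only.
joint = {
    (0, 0): 0.3, (0, 1): 0.2,
    (1, 0): 0.1, (1, 1): 0.4,
}

# Marginal Pr[X=x], needed to condition on x.
marginal_x = defaultdict(float)
for (x, z), p in joint.items():
    marginal_x[x] += p

# The conditional table Pr[Z=z | X=x]: this IS the (inefficient) simulator h,
# since we hardcode one row of probabilities per value of x.
cond = {(x, z): p / marginal_x[x] for (x, z), p in joint.items()}

def h(x):
    """Randomized function: output z with probability Pr[Z=z | X=x]."""
    zs = [z for (xx, z) in cond if xx == x]
    return random.choices(zs, weights=[cond[(x, z)] for z in zs], k=1)[0]
```

For a real pair (X, Z) with \(X\in \{0,1\}^n\) the table has \(2^n\) rows, which is why the computational relaxation in Q1' is needed.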
Why Does It Matter? Aside from being very foundational, this question is relevant to many areas of computer science. We will not discuss these applications in detail, as they are well explained in [JP14]. Below we only mention where such a generic simulator can be applied, to show that this problem is indeed well-motivated.
-
(a)
Complexity Theory. From the simulator one can derive the Dense Model Theorem [RTTV08], Impagliazzo’s hardcore lemma [Imp95] and a version of Szemerédi’s Regularity Lemma [FK99].
-
(b)
Cryptography. The simulator can be applied for settings where Z models short leakage from a secret state X. It provides tools for improving and simplifying proofs in leakage-resilient cryptography, in particular for leakage-resilient stream ciphers [JP14].
-
(c)
Pseudorandomness. Using the simulator one can conclude results called chain rules [GW11], which quantify pseudorandomness in conditioned distributions. They can be also applied to leakage-resilient cryptography.
-
(d)
Zero-knowledge. The simulator can be applied to simulate the transcript Z exchanged in verifier-prover interactions from the common input X [CLP15].
Thus, the simulator may be used as a tool to unify, simplify and improve many results. Having briefly explained the motivation we now turn to answer the posed question, leaving a more detailed discussion of some applications to Sect. 1.6.
1.2 Problem Statement
The problem of simulating auxiliary inputs in the computational setting can be defined precisely as follows
Given random variables \(X\in \{0,1\}^n\) and a correlated \(Z\in \{0,1\}^{\ell }\), what is the minimal complexity \(s_h\) of a (randomized) function h such that the distributions of h(X) and Z are \((\epsilon ,s)\)-indistinguishable given X, that is
$$\begin{aligned} |{{\mathrm{\mathbb { E }}}}\textsc {D}(X,{h(X)})-{{\mathrm{\mathbb { E }}}}\textsc {D}(X,Z) | < \epsilon \end{aligned}$$holds for all (deterministic) circuits \(\textsc {D}\) of size s?
The indistinguishability above is understood with respect to deterministic circuits. However, this does not matter when distinguishing two distributions, where randomized and deterministic distinguishers are equally powerfulFootnote 2.
It turns out that it is relatively easyFootnote 3 to construct a simulator h with a polynomial blowup in complexity, that is when
$$\begin{aligned} s_h = \mathrm {poly}\left( s,\epsilon ^{-1},2^{\ell }\right) \end{aligned}$$
However, it is more challenging to minimize the dependency on \(\epsilon ^{-1}\). This problem is especially important for cryptography, where security definitions require the advantage \(\epsilon \) to be as small as possible. Indeed, for meaningful security \(\epsilon =2^{-80}\) or at least \(\epsilon = 2^{-40}\) it makes a difference whether we lose \(\epsilon ^{-2}\) or \(\epsilon ^{-4}\). We will see later how much inefficient bounds here can affect the provable security of stream ciphers.
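As a sanity check, the advantage in the definition above can be estimated by sampling. The toy distribution, candidate simulator, and distinguisher below are entirely our own illustration of why a bad simulator is caught by some distinguisher:

```python
import random

random.seed(0)

# Toy correlated pair: X uniform over {0,...,15}; Z is the parity of X,
# flipped with probability 0.25 (illustrative only).
def sample_xz():
    x = random.randrange(16)
    z = bin(x).count("1") % 2
    if random.random() < 0.25:
        z ^= 1
    return x, z

# A poor candidate simulator h: it ignores X and outputs a uniform bit.
def h(x):
    return random.randrange(2)

# A fixed distinguisher D(x, z): guesses that z is the parity of x.
def D(x, z):
    return 1 if z == bin(x).count("1") % 2 else 0

N = 200_000
est_sim = sum(D(x, h(x)) for x, _ in (sample_xz() for _ in range(N))) / N
est_real = sum(D(x, z) for x, z in (sample_xz() for _ in range(N))) / N
adv = abs(est_sim - est_real)
# E D(X, Z) = 0.75 while E D(X, h(X)) = 0.5, so adv concentrates near 0.25,
# certifying that this h is a poor simulator against this D.
```

A good simulator must drive this advantage below \(\epsilon \) simultaneously for all circuits D of size s.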
1.3 Related Works
Original Work of Jetchev and Pietrzak (TCC’14).
The authors showed that Z can be “approximately” computed from X by an “efficient” function \(\mathsf {h}\).
Theorem 1
([JP14], corrected). For every distribution (X, Z) on \(\{0,1\}^n\times \{0,1\}^{\ell }\) and every \(\epsilon \), s, there exists a “simulator” \(h:\{0,1\}^n\rightarrow \{0,1\}^\ell \) such that
-
(a)
(X, h(X)) and (X, Z) are \((\epsilon ,s)\)-indistinguishable
-
(b)
h is of complexity \(s_{\mathsf {h}}=O\left( s\cdot 2^{4\ell }\epsilon ^{-4} \right) \)
The proof uses the standard min-max theorem. In the statement above we correct two flaws. One is a missing factor of \(2^{\ell }\). The second (and more serious) one is the (corrected) factor \(\epsilon ^{-4}\), claimed incorrectly to be \(\epsilon ^{-2}\). The flaws are discussed in Appendix A.
Vadhan and Zheng (CRYPTO’13).
The authors derived a version of Theorem 1 but with incomparable bounds
Theorem 2
([VZ13]). For every distribution X, Z on \(\{0,1\}^n\times \{0,1\}^{\ell }\) and every \(\epsilon \), s, there exists a “simulator” \(h:\{0,1\}^n\rightarrow \{0,1\}^\ell \) such that
-
(a)
(X, h(X)) and (X, Z) are \((s,\epsilon )\)-indistinguishable
-
(b)
h is of complexity \(s_{\mathsf {h}}=O\left( s\cdot 2^{\ell }\epsilon ^{-2} + 2^{\ell }\epsilon ^{-4}\right) \)
The proof follows from a general regularity theorem which is based on their uniform min-max theorem. The additive loss of \(O\left( 2^\ell \epsilon ^{-4}\right) \) appears as a consequence of a sophisticated weight-updating procedure. This error is quite large and may dominate the main term for many settings (whenever \(s \ll \epsilon ^{-2}\)).
As we show later, Theorems 1 and 2 give, in fact, comparable security bounds when applied to leakage-resilient stream ciphers (see Sect. 1.6).
1.4 Our Results
We reduce the dependency of the simulator complexity \(s_h\) on the advantage \(\epsilon \) to only a factor of \(\epsilon ^{-2}\), from the factor of \(\epsilon ^{-4}\).
Theorem 3
(Our Simulator). For every distribution X, Z on \(\{0,1\}^n\times \{0,1\}^{\ell }\) and every \(\epsilon \), s, there exists a “simulator” \(h:\{0,1\}^n\rightarrow \{0,1\}^\ell \) such that
-
(a)
(X, h(X)) and (X, Z) are \((s,\epsilon )\)-indistinguishable
-
(b)
h is of complexity \(s_{\mathsf {h}}=O\left( s\cdot 2^{5\ell }\log (1/\epsilon )\epsilon ^{-2}\right) \)
Below in Table 1 we compare our result to previous works.
Our result is slightly worse in terms of the dependency on \(\ell \), but outperforms previous results in terms of the dependency on \(\epsilon ^{-1}\). The second dependency is more crucial for cryptographic applications. Note that the typical choice is sub-logarithmic leakage, that is \(\ell = o\left( \log \epsilon ^{-1}\right) \) in asymptotic settingsFootnote 4 (see for example [CLP15]). Stated in non-asymptotic settings this assumption translates to \(\ell < c\log \epsilon ^{-1}\) where c is a small constant (for example \(c= \frac{1}{12}\), see [Pie09]). In these settings, we outperform previous results.
To illustrate this, suppose we want to achieve security \(\epsilon = 2^{-60}\) simulating just one bit from a 256-bit input. As it follows from Table 1, previous bounds are useless as they give a complexity bigger than \(2^{256}\), which is the worst-case complexity of all boolean functions over the chosen domain. In settings like this, only our bound can be applied to conclude meaningful results. For more concrete examples of settings where only our bounds are meaningful, we refer to Table 2 in Sect. 1.6.
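Ignoring the constants hidden in the \(O(\cdot )\) notation, the three bounds from Table 1 can be compared numerically. The parameter choices below (\(s = 2^{10}\), \(\ell = 1\), \(\epsilon = 2^{-60}\)) are our own illustration:

```python
import math

# Log2 of the simulator complexity bounds (O(.)-constants ignored).
def jp14(s, ell, eps):   # Theorem 1: s * 2^{4*ell} / eps^4
    return math.log2(s) + 4 * ell + 4 * math.log2(1 / eps)

def vz13(s, ell, eps):   # Theorem 2: s * 2^ell / eps^2 + 2^ell / eps^4
    t1 = math.log2(s) + ell + 2 * math.log2(1 / eps)
    t2 = ell + 4 * math.log2(1 / eps)
    return max(t1, t2)   # log2 of a sum, up to one bit

def ours(s, ell, eps):   # Theorem 3: s * 2^{5*ell} * log(1/eps) / eps^2
    return (math.log2(s) + 5 * ell
            + math.log2(math.log2(1 / eps)) + 2 * math.log2(1 / eps))

s, ell, eps = 2**10, 1, 2**-60   # one leaked bit, eps = 2^-60
bounds = (jp14(s, ell, eps), vz13(s, ell, eps), ours(s, ell, eps))
```

In this regime the \(\epsilon ^{-4}\) factor dominates the older bounds, which is exactly the effect discussed above.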
1.5 Our Techniques
Our approach utilizes a simple boosting technique: as long as the condition (a) in Theorem 3 fails, we can use the distinguisher to improve the simulator. This makes our algorithm constructive with respect to distinguishers obtained from an oracleFootnote 5, similarly to other boosting proofs [JP14, VZ13]. In short, if for a “candidate” solution h there exists \(\textsc {D}\) such that
$$\begin{aligned} {{\mathrm{\mathbb { E }}}}\textsc {D}(X,Z)-{{\mathrm{\mathbb { E }}}}\textsc {D}(X,{h(X)}) > \epsilon \end{aligned}$$
then we construct a new solution \(h'\) using \(\textsc {D}\) and h, according to the equationFootnote 6
$$\begin{aligned} h'(x,z) = h(x,z) + \gamma \cdot \mathsf {Shift}\left( \textsc {D}(x,z)\right) + \mathsf {Corr}(x,z) \end{aligned}$$
where
-
(a)
The parameter \(\gamma \) is a fixed step chosen in advance (its optimal value depends on \(\epsilon \) and \(\ell \) and is calculated in the proof).
-
(b)
\( \mathsf {Shift}\left( \textsc {D}(x,z)\right) \) is a shifted version of \(\textsc {D}\), so that \(\sum _{z} \mathsf {Shift}\left( \textsc {D}(x,z)\right) = 0\). This restriction corresponds to the fact that we want to preserve the constraint \(\sum _{z}h(x,z)=1\). More precisely, \(\mathsf {Shift}\left( \textsc {D}(x,z)\right) =\textsc {D}(x,z)-{{\mathrm{\mathbb { E }}}}_{z'\leftarrow U_{\ell }}\textsc {D}(x,z')\).
-
(c)
\(\mathsf {Corr}(x,z)\) is a correction term used to fix (some of) possibly negative weights.
The procedure is repeated in a loop, over and over again. The main technical difficulty is to show that it eventually stops after not too many iterations.
Note that in every such step the complexity cost of the shifting term is \(O\left( 2^{\ell }\cdot \mathrm {size}(\textsc {D}) \right) \)Footnote 7. The correction term, in our approach, searches over z for the biggest negative mass and redistributes it over the remaining points. Intuitively, this works because the total negative mass gets smaller with every step. See Algorithm 1 for a pseudo-code description of the algorithm and the rest of Sect. 3 for a proof.
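The two-step update can be sketched in Python. This is our own simplified model (dictionaries over a small \(\mathcal {Z}\) instead of circuits, and a correction that zeroes the single most negative mass and spreads it over the remaining points); it illustrates the shift-then-correct structure, not the exact Algorithm 1:

```python
def boosting_step(h, D, gamma, Z):
    """One update h -> h': take a step along the mean-zero "shift" of the
    distinguisher D, then fix the biggest negative-mass violation."""
    h_new = {}
    for x in h:
        # Shift(D)(x, z) = D(x, z) - E_{z'} D(x, z'), so each row keeps sum 1.
        mean = sum(D(x, z) for z in Z) / len(Z)
        tilde = {z: h[x][z] + gamma * (D(x, z) - mean) for z in Z}
        # Correction: locate the most negative mass and redistribute it.
        z_min = min(Z, key=lambda z: tilde[z])
        if tilde[z_min] < 0:
            neg = tilde[z_min]
            tilde[z_min] = 0.0
            for z in Z:
                if z != z_min:
                    tilde[z] += neg / (len(Z) - 1)
        h_new[x] = tilde
    return h_new

# One step with a toy distinguisher that prefers z = 1.
Z = [0, 1]
h_new = boosting_step({0: {0: 0.02, 1: 0.98}}, lambda x, z: z, 0.1, Z)
```

Note that both the shift and the correction preserve \(\sum _{z}h(x,z)=1\), which is the invariant stressed in point (b) above.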
1.6 Applications
Better Security for the EUROCRYPT’09 Stream Cipher. The first construction of a leakage-resilient stream cipher was proposed by Dziembowski and Pietrzak in [DP08]. In Fig. 1 below we present a simplified version of this cipher [Pie09], based on a weak pseudorandom function (wPRF).
Jetchev and Pietrzak in [JP14] showed how to use the simulator theorem to simplify the security analysis of the EUROCRYPT’09 cipher. The cipher security depends on the complexity of the simulator as explained in Theorem 1 and Remark 2. We consider the following setting:
-
number of rounds \(q=16\),
-
F instantiated with \(\mathsf {AES}256\) (as in [JP14]),
-
the cipher security we aim for is \(\epsilon '=2^{-40}\),
-
\(\lambda = 3\) bits of leakage per round
The concrete bounds for \((q,\epsilon ',s')\)-security of the cipher (which roughly speaking means that q consecutive outputs are \((s',\epsilon ')\)-pseudorandom, see Sect. 2 for a formal definition) are given in Table 2 below. We omit the calculations as they merely put the parameters from Theorems 1, 2 and 3 into Remark 2 and assume that AES as a weak PRF is \((\epsilon ,s)\)-secure for any pair \((s,\epsilon )\) with \(s/\epsilon \approx 2^k\) (following the similar example in [JP14]).
More generally, we can give the following comparison of security bounds for different wPRF-based stream ciphers, in terms of the time-success ratio. The bounds in Table 3 follow from the simple lemma in Sect. 4, which shows how the time-success ratio changes under explicit reduction formulas.
1.7 Organization
In Sect. 2 we discuss basic notions and definitions. The proof of Theorem 3 appears in Sect. 3.
2 Preliminaries
2.1 Notation
By \(\mathbb {E}_{y\leftarrow Y} f(y)\) we denote an expectation of f under y sampled according to the distribution Y.
2.2 Basic Notions
Indistinguishability. Let \(\mathcal {V}\) be a finite set, and \(\mathcal {D}\) be a class of deterministic [0, 1]-valued functions on \(\mathcal {V}\). For any two real functions \(f_1,f_2\) on \(\mathcal {V}\), we say that \(f_1,f_2\) are \((\mathcal {D},\epsilon )\)-indistinguishable if
$$\begin{aligned} \left| \sum _{v\in \mathcal {V}} \textsc {D}(v)\cdot f_1(v)-\sum _{v\in \mathcal {V}} \textsc {D}(v)\cdot f_2(v) \right| \leqslant \epsilon \end{aligned}$$
for all \(\textsc {D}\in \mathcal {D}\). Note that the domain \(\mathcal {V}\) depends on the context. If \(X_1,X_2\) are two probability distributions, we say that they are \((\mathcal {D},\epsilon )\)-indistinguishable if their probability mass functions are indistinguishable. If \(\mathcal {D}\) consists of all circuits of size s we say that \(f_1,f_2\) are \((s,\epsilon )\)-indistinguishable.
Remark 1
This is an extended notion of indistinguishability, borrowed from [TTV09], which captures not only probability measures but also real-valued functions. A good intuition is provided by the following observation [TTV09]: think of functions over \(\mathcal {V}\) as \(|\mathcal {V}|\)-dimensional vectors; then \(\epsilon \geqslant | \sum _{x\in V} \textsc {D}(x)\cdot f_1(x)-\sum _{x\in V} \textsc {D}(x)\cdot f_2(x)| =| \langle f_1-f_2, \textsc {D}\rangle |\) means that \(f_1\) and \(f_2\) are nearly orthogonal to all test functions in \(\mathcal {D}\).
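The vector picture of Remark 1 is easy to check numerically; the toy vectors below are our own:

```python
# Treat f1, f2, and a test function D as |V|-dimensional vectors, |V| = 4.
f1 = [0.25, 0.25, 0.25, 0.25]   # the uniform pmf on V
f2 = [0.30, 0.20, 0.30, 0.20]   # a slightly skewed pmf
D  = [1.0, 0.0, 1.0, 0.0]       # a boolean test function

# |<f1 - f2, D>| = |(0.25 - 0.30) + (0.25 - 0.30)| = 0.10
advantage = abs(sum(d * (a - b) for d, a, b in zip(D, f1, f2)))
```

Here \(f_1\) and \(f_2\) are \((\{\textsc {D}\},\epsilon )\)-indistinguishable for every \(\epsilon \geqslant 0.1\), but not below that threshold.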
Distinguishers. In the definition above we consider deterministic distinguishers, as this is required by our algorithm. However, being randomized doesn’t help in distinguishing, as any randomized distinguisher achieving advantage \(\epsilon \) when run on two fixed distributions can be converted into a deterministic distinguisher of the same size and advantage (by fixing one choice of coins). Moreover, any real-valued distinguisher can be converted, by a boolean threshold, into a boolean one with at least the same advantage [FR12].
Relative Complexity. We say that a function h has complexity at most T relative to the set of functions \(\mathcal {D}\) if there are functions \(\textsc {D}_1,\ldots ,\textsc {D}_{T}\) such that h can be computed by combining them using at most T of the following operations: (a) multiplication by a constant, (b) application of a boolean threshold function, (c) sum, (d) product.
2.3 Stream Ciphers Definitions
We start with the definition of weak pseudorandom functions, which are computationally indistinguishable from random functions when queried on random inputs and keyed with a uniform secret key.
Definition 1
(Weak pseudorandom functions). A function \(\textsc {F}: \{0, 1\}^{k} \times \{0, 1\}^{n} \rightarrow \{0, 1\}^{m}\) is an \((\epsilon , s, q)\)-secure weak PRF if its outputs on q random inputs are indistinguishable from random by any distinguisher of size s, that is
$$\begin{aligned} \left| \Pr \left[ \textsc {D}\left( X_1,\ldots ,X_q, \textsc {F}(K,X_1),\ldots ,\textsc {F}(K,X_q)\right) = 1\right] - \Pr \left[ \textsc {D}\left( X_1,\ldots ,X_q, R_1,\ldots ,R_q\right) = 1\right] \right| \leqslant \epsilon \end{aligned}$$
where the probability is over the choice of the random \(X_i \leftarrow \{0,1\}^n\), the choice of a random key \(K \leftarrow \{0,1\}^k\) and \(R_i \leftarrow \{ 0,1\}^m\) conditioned on \(R_i = R_j\) if \(X_i = X_j\) for some \(j < i\).
Stream ciphers generate a keystream in a recursive manner. Security requires that the output stream be indistinguishable from uniformFootnote 8.
Definition 2
(Stream ciphers). A stream-cipher \(\mathsf {SC} : \{0, 1\}^k \rightarrow \{0, 1\}^k \times \{0, 1\}^n\) is a function that, when initialized with a secret state \(S_0 \in \{0, 1\}^k\), produces a sequence of output blocks \(X_1, X_2, \ldots \) computed as
$$\begin{aligned} (S_i, X_i) := \mathsf {SC}(S_{i-1}) \end{aligned}$$
A stream cipher \(\mathsf {SC}\) is \((\epsilon ,s,q)\)-secure if for all \(1 \leqslant i \leqslant q\), the random variable \(X_i\) is \((s,\epsilon )\)-pseudorandom given \(X_1, . . . , X_{i-1}\) (the probability is also over the choice of the initial random key \(S_0\)).
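The recursive keystream generation of Definition 2 can be illustrated with a toy instantiation (a SHA-256-based \(\mathsf {SC}\) of our own choosing, purely to show the recursion; it is NOT one of the provably secure constructions discussed in this paper):

```python
import hashlib

def sc(state: bytes) -> tuple[bytes, bytes]:
    """Toy stream-cipher round: derive the next state S_i and an output
    block X_i from the current state (illustrative, not secure)."""
    digest = hashlib.sha256(state).digest()
    return digest[:16], digest[16:]          # (S_i, X_i)

def keystream(s0: bytes, q: int) -> list[bytes]:
    """Run q rounds of (S_i, X_i) := SC(S_{i-1}) and collect the X_i."""
    state, blocks = s0, []
    for _ in range(q):
        state, x = sc(state)
        blocks.append(x)
    return blocks

blocks = keystream(b"\x00" * 16, 4)
```

Security then demands that each \(X_i\) look uniform to bounded distinguishers given the previous blocks, which the toy construction above makes no attempt to guarantee.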
Now we define leakage resilient stream ciphers, following the “only computation leaks” assumption.
Definition 3
(Leakage-resilient stream ciphers). A leakage-resilient stream-cipher is \((\epsilon ,s,q,\lambda )\)-secure if it is \((\epsilon ,s,q)\)-secure as defined above, but where the distinguisher in the j-th round gets \(\lambda \) bits of arbitrary, adaptively chosen leakage about the secret state accessed during this round. More precisely, before \((S_j,X_j) := \mathsf {SC}(S_{j-1})\) is computed, the distinguisher can choose any leakage function \(f_j\) with range \(\{0,1\}^{\lambda }\), and then not only get \(X_j\), but also \(\Lambda _j := f_j(\hat{S}_{j-1})\), where \(\hat{S}_{j-1}\) denotes the part of the secret state that was modified (i.e., read and/or overwritten) in the computation \(\mathsf {SC}(S_{j-1})\).
2.4 Security of Leakage-Resilient Stream Ciphers
The best provably secure constructions of leakage-resilient stream ciphers are based on so-called weak PRFs, primitives which look random when queried on random inputs [Pie09, FPS12, JP14, DP10, YS13]. The most recent (TCC’14) analysis is based on a version of Theorem 1.
Theorem 4
(Proving Security of Stream Ciphers [JP14]). If F is an \((\epsilon _F, s_F, 2)\)-secure weak PRF then \(\mathsf {SC}^F\) is an \((\epsilon ', s', q, \lambda )\)-secure leakage-resilient stream cipher where
Remark 2
(The exact complexity loss). An inspection of the proof in [JP14] shows that \(s_{F}\) equals the complexity of the simulator h in Theorem 1, with circuits of size \(s'\) as distinguishers and \(\epsilon \) replaced by \(\epsilon '\).
2.5 Time-Success Ratio
The running time (circuit size) s and success probability \(\epsilon \) of attacks (practical and theoretical) against a particular primitive or protocol may vary. For this reason Luby [LM94] introduced the time-success ratio \(\frac{t}{\epsilon }\) as a universal measure of security. This model is widely used to analyze provable security, cf. [BL13] and related works.
Definition 4
(Security by Time-Success Ratio [LM94]). A primitive P is said to be \(2^{k}\)-secure if for every adversary with time resources (circuit size in the nonuniform model) s, the success probability in breaking P (advantage) is at most \(\epsilon < s\cdot 2^{-k}\). We also say that the time-success ratio of P is \(2^{k}\), or that it has k bits of security.
For example, \(\mathsf {AES}\) with a 256-bit random key is believed to have 256 bits of security as a weak PRFFootnote 9.
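Definition 4 reduces to a one-line computation; the attack parameters below are our own illustration:

```python
import math

def bits_of_security(s: float, eps: float) -> float:
    """Time-success ratio s/eps, expressed in bits (Definition 4)."""
    return math.log2(s / eps)

# An attack of size 2^40 succeeding with probability 2^-20 witnesses a
# ratio of 2^60, i.e. it would refute any claim of more than 60 bits.
witnessed = bits_of_security(2**40, 2**-20)
```

A primitive is \(2^{k}\)-secure exactly when no adversary witnesses a ratio above \(2^{k}\) in this sense.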
3 Proof of Theorem 3
For technical convenience, we attempt to efficiently approximate the conditional probability function \(g(x,z) = \Pr [Z=z|X=x]\) rather than building the sampler directly. Once we have built an efficient approximation h(x, z), we transform it into a sampler \(h_{\mathsf {sim}}\) which outputs z with probability h(x, z) (this transformation incurs only a loss of \(2^{\ell }\log (1/\epsilon )\)). We are going to prove the following fact
For every function g on \(\mathcal {X}\times \mathcal {Z}\) which is a \(\mathcal {X}\)-conditional probability mass function over Z (that is \(g(x,z)\geqslant 0\) for all x, z and \(\sum _{z}g(x,z)=1\) for every x), and for every class \(\mathcal {D}\) closed under complementsFootnote 10 there exists h such that
- (a)
h is a \(\mathcal {X}\)-conditional probability mass function over Z
- (b)
h is of complexity \(s_h = O(2^{4\ell }\epsilon ^{-2})\) with respect to \(\mathcal {D}\)
- (c)
(X, Z) and \((X, h_{\mathsf {sim}}(X))\) are indistinguishable, which in terms of g and h means
$$\begin{aligned} \left| \sum _{z} {{\mathrm{\mathbb { E }}}}_{x\sim X} \left[ \textsc {D}(x,z)\cdot ( g(x,z)-h(x,z) ) \right] \right| \leqslant \epsilon \end{aligned}$$(2)
The sketch of the construction is shown in Algorithm 1. Here we would like to point out two things. First, we stress that we do not produce a strictly positive function; what our algorithm guarantees is that the total negative mass is small. We will see later that this is enough. Second, our algorithm performs essentially the same operations for every x, which is why its complexity depends only on \(\mathcal {Z}\).
We denote for shortness \(\overline{\textsc {D}}(x,z)=\textsc {D}(x,z)-{{\mathrm{\mathbb { E }}}}_{z'\leftarrow U_{\mathcal {Z}}}\textsc {D}(x,z')\) for any \(\textsc {D}\) (the “shift” transformation).
Proof
Consider the functions \({h}^{t}\). Define \(\tilde{h}^{t+1}(x,z) \overset{def}{=} h^{t}(x,z)+\gamma \cdot \overline{\textsc {D}}^{t+1}(x,z)\). According to Algorithm 1, we have
with the correction term \(\theta ^{t}(x,z)\) that can be computed recursively as (see Line 13 in Algorithm 1)
where \(z_{\text {min}}^{t}(x)\) is one of the points z minimizing \( h^{t}(x,z)+\gamma \cdot \overline{\textsc {D}}^{t+1}(x,z) \) (chosen and fixed for every t). In particular
Notation: for notational convenience we identify the functions \(\textsc {D}^{t}(x,z)\), \(\overline{\textsc {D}}^{t}(x,z)\), \(\theta ^{t}(x,z)\), \(\tilde{h}^{t}(x,z)\) and \(h^{t}(x,z)\) with matrices whose columns are indexed by x and rows by z. That is, \(h^{t}_x\) denotes the \(|\mathcal {Z}|\)-dimensional vector with entries \(h^{t}(x,z)\) for \(z\in \mathcal {Z}\), and similarly for the other functions \(\textsc {D}^{t}(x,z)\), \(\overline{\textsc {D}}^{t}(x,z)\), \(\theta ^{t}(x,z)\), \(\tilde{h}^{t}(x,z)\).
Claim 1
(Complexity of Algorithm 1). T executions of the “while loop” can be realized in time \(O\left( T\cdot |\mathcal {Z}| \cdot \mathrm {size}(\mathcal {D})\right) \) and memory \(O(|\mathcal {Z}|)\).Footnote 11
This claim describes precisely resources required to compute the function \(h^{T}\) for every T. In order to bound T, we define the energy function as follows:
Claim 2
(Energy function). Define the auxiliary function
Then we have \( \varDelta ^{t} = E_1 + E_2 \) where
Note that all the symbols represent vectors and multiplications, including squares, should be understood as scalar products. The proof is based on simple algebraic manipulations and appears in Appendix B.
Remark 3
(Technical issues and intuitions). To upper-bound the formulas in Eq. (7), we need the following important properties
-
(a)
Boundedness of correction terms, that is ideally \(|\theta ^{i}(x,z)| = O(\mathrm {poly}(|\mathcal {Z}|)\cdot \gamma )\).
-
(b)
Acute angle between the correction and the error, that is \(\theta ^{i}_x\cdot (g_x-h^{i}_x) \geqslant 0\).
Below we present an outline of the proof, discussing more technical parts in the appendix.
Proof Outline. Indeed, with these assumptions we prove an upper bound on the energy function, namely
which follows from the properties (a) and (b) above (they are proved in Claims 4 and 3 below, and the inequality on \(E_1+E_2\) is derived in Claim 5). Note that, except a factor \(\mathrm {poly}(|\mathcal {Z}|)\), our formula (not the proof, though) is identical to the bound used in [TTV09] (see Claim 3.4 in the eprint version). Indeed, our theorem is, to some extent, an extension to the main result in [TTV09] to cover the conditional case, where \(|\mathcal {X}|>1\). The main difference is that we show how to simulate a short leakage |Z| given X, whereas [TTV09] shows how to simulate Z alone, under the assumption that the distribution of Z is dense in the uniform distribution (the min-entropy gap being small)Footnote 12.
Since the bound above is valid for any step t, and since on the other hand we have \(t\epsilon \leqslant \varDelta ^{t} \) after t steps of the algorithm, we obtain a contradiction (bounding the number of steps) by setting \(\gamma = \epsilon /\mathrm {poly}(|\mathcal {Z}|)\). Indeed, suppose that \(t\epsilon \leqslant A |\mathcal {Z}|^B (\gamma ^{-1} + t\gamma )\) for some positive constants A, B. Since the step size \(\gamma \) can be chosen arbitrarily, we can set \(\gamma = \frac{\epsilon }{2A|\mathcal {Z}|^B}\) which yields \( \frac{t\epsilon }{2} \leqslant \frac{2A^2|\mathcal {Z}|^{2B}}{\epsilon }\) or \(t \leqslant 4A^2 |\mathcal {Z}|^{2B}\epsilon ^{-2}\), which means that the algorithm terminates after at most \(T = \mathrm {poly}(|\mathcal {Z}|)\epsilon ^{-2}\) steps. Our proof goes exactly this way, except for some extra optimization to obtain a better exponent.
We stress that it outputs only a signed measure, not a probability distribution yet. However, because of property (a) the negative mass is only of order \(\mathrm {poly}(|\mathcal {Z}|)\epsilon \), and the function we end with can simply be rescaled (we replace negative masses by 0 and normalize the function, dividing by a factor \(1+m\) where m is the total negative mass). With this transformation, we keep the expected advantage \(O(\epsilon )\) and lose an extra factor \(O(|\mathcal {Z}|)\) in the complexity. Finally, we need to remember that we construct only a probability distribution function, not a sampler. Transforming it into a sampler yields an overhead of \(O(|\mathcal {Z}|\log (1/\epsilon ))\). This discussion shows that it is possible to build a sampler of complexity \( \mathrm {poly}(|\mathcal {Z}|)\epsilon ^{-2}\) with respect to \(\mathcal {D}\). A more careful inspection of the proof shows that we can actually achieve the claimed bound \(|\mathcal {Z}|^5\epsilon ^{-2}\) (see Remark 4 at the end of the proof).
Technical Discussion. We note that condition (b) essentially means that the mass cuts go in the right direction: it is much simpler to prove that Algorithm 1 terminates when there are no correction terms \(\theta ^{t}\), so we do not want to go in a wrong direction and ruin the energy gain. Concrete bounds for properties (a) and (b) are given in Claims 3 and 4.
In Algorithm 1, in every round we shift only one negative point mass (see Line 13). However, since this point mass is chosen to be as big as possible, and since \(h^{t+1}\) and \(h^{t}\) differ only by the small term \(\gamma \cdot \overline{\textsc {D}}^{t+1}\) apart from the mass shift \(\theta ^{t+1}\), one can expect that the negative mass stays under control. Indeed, this is stated precisely in Claim 3 below.
Claim 3
(The total negative mass is small). Let
be the total negative mass in \(h^{t}(x,z)\) as the function of z. Then we have
for every x and every t. In fact, for all x, z and t we have the following stronger bound
The proof is based on a recurrence relation that links \(\textsf {NegativeMass}(h^{t+1}(x,\cdot ))\) with \(\textsf {NegativeMass}(h^{t}(x,\cdot ))\), and appears in Appendix C.
Claim 4
(The angle formed by the correction and the difference vector is acute). For every x, t we have \(\textsf {Angle}\left( \theta ^{t+1}_x,g_x-{h}^{t+1}_x\right) \in \left[ -\frac{\pi }{2},\frac{\pi }{2}\right] \).
The proof appears in Appendix D.
Having established Claims 3 and 4 we are now in position to prove a concrete bound in Eq. (8). To this end, we give upper bounds on \(E_1\) and \(E_2\), defined in Eq. (7), separately.
Claim 5
(Algorithm 1 terminates after a small number of steps). The energy function in Claim 2 can be bounded as follows
In particular, we conclude that with \(\gamma = \frac{\epsilon }{8|\mathcal {Z}|^4}\) the algorithm terminates after at most \(t = O( |\mathcal {Z}|^3) \epsilon ^{-2}\) steps.
First, note that by Claim 4 we have \( -\sum _{i=0}^{t-1} \theta ^{i+1}_x\cdot \left( g_x-h^{i+1}_x\right) \leqslant 0\). Second, by definition of the sequence \((h^{i})_i\) we have \(- \sum _{i=0}^{t-1} \theta ^{i+1}_x\cdot \left( h^{i+1}_x-h^{i}_x\right) = -\sum _{i=0}^{t-1} \theta ^{i+1}_x\cdot \theta ^{i+1}_x - \sum _{i=0}^{t-1}\gamma \theta ^{i+1}_x\cdot \overline{\textsc {D}}^{i+1}_x\) which is at most \(2|\mathcal {Z}|^3 t \gamma ^2\), because of Eq. (9) (the sum of absolute correction terms \(\sum _{z}|\theta ^{i+1}(x,z)|\) is, by definition, twice the total negative mass, and \(|\overline{\textsc {D}}^{i+1}(x,z)| \leqslant 1\)). This proves that
To bound \(E_1\), note that we have to bound two non-negative terms, namely \(\frac{1}{2}\sum _{i}\left( h^{i+1}_x-h^{i}_x\right) ^2\) and \(\left( h^{t}_x-h^{0}_x\right) \cdot g_x\). As for the first one, we have
where the inequality follows by the Cauchy-Schwarz inequalityFootnote 13. We trivially have \(\left( \overline{\textsc {D}}^{i+1}_x\right) ^2 \leqslant |\mathcal {Z}|\) (because of \(|\overline{\textsc {D}}(x,z) | \leqslant 1\)). By the definition of correction terms in Eq. (4) we have \(\left( \theta ^{i+1}_x\right) ^2 = \sum _{z}( \theta ^{i+1}(x,z))^2 < 2(\theta ^{i+1}(x,z_0))^2\), where \(\theta ^{i+1}(x,z_0)\) is the smallest negative mass, which is at most \((2|\mathcal {Z}|^3\gamma )^2\) by Eq. (9). Thus, we have \(\left( h^{i+1}_x-h^{i}_x\right) ^2 \leqslant 2 |\mathcal {Z}|\gamma ^2 + 8|\mathcal {Z}|^6\gamma ^2\). To bound \(\left( h^{t}_x-h^{0}_x\right) \cdot g_x\) note that \(-h^{0}_x\cdot g_x \leqslant 0\) and that \(h^{t}_x\cdot g_x \leqslant \max _{z} |h^{t}(x,z)|\) (because \(g(x,z) \geqslant 0\) and \(\sum _{z}g(x,z) = 1\)), which means \(h^{t}_x\cdot g_x \leqslant 1+2\textsf {NegativeMass}(h^{t}_x)\) (as \(\sum _{z}\max ( h^{t}(x,z),0) = 1-\sum _{z}\min ( h^{t}(x,z), 0) = 1+\textsf {NegativeMass}(h^{t}_x)\) and \(-\sum _{z}\min ( h^{t}(x,z),0) = \textsf {NegativeMass}(h^{t}_x)\), by \(\sum _{z} h^{t}(x,z) = 1\) and the definition of the total negative mass). This allows us to estimate \(E_1\) as follows
After t steps, the energy is at least \(t\epsilon \). On the other hand, it is at most \(E_1+E_2\). Since \(|\mathcal {Z}|, |\mathcal {Z}|^3 \leqslant |\mathcal {Z}|^6\), we obtain
Since this is true for any positive \(\gamma \), we choose \(\gamma = \frac{\epsilon }{14|\mathcal {Z}|^6}\), which gives us (slightly weaker than claimed)
Remark 4
(Optimized bounds). By the second part of Claim 3 we have \(|\theta ^{i}(x,z)| < |\mathcal {Z}|\gamma \) for every x, z and i. An inspection of the discussion above shows that this allows us to improve the bounds on \(E_1,E_2\)
Setting \(\gamma = \frac{\epsilon }{8|\mathcal {Z}|^2} \) we get \(E_1 + E_2 \leqslant 20 |\mathcal {Z}|^2 \epsilon ^{-1}\) and \(t \leqslant 20 |\mathcal {Z}|^2\epsilon ^{-2}\).
This finishes the proof of the claim.
From Claim 5 we conclude that after \(t = O\left( |\mathcal {Z}|^2\epsilon ^{-2} \right) \) steps we end up with a function \(h = h^{t}\) that is \((s,\epsilon )\)-indistinguishable from g, because the algorithm terminated (and, clearly, h has complexity at most \( O\left( |\mathcal {Z}|^3\epsilon ^{-2} \right) \) relative to circuits of size s, including an overhead of \(O(|\mathcal {Z}|)\) to compute \(\overline{\textsc {D}}\) from \(\textsc {D}\)). To finish the proof, we need to resolve two issues
Claim 6
(From the signed measure to the probability measure). Let \( h^{t}\) be the output of the algorithm. Define the probability distribution
for every x, z. Then \(h^{t}(x,\cdot )\) and \(h(x,\cdot )\) are \(O(\epsilon )\)-statistically close for every x.
To prove the claim, we note that \( \sum _{z'}{ \max (h^{t}(x,z'),0)}\) equals \(1+\beta \) where \(\beta = \textsf {NegativeMass}(h^{t}(x,\cdot ))\). Thus we have \(|h(x,z) - h^{t}(x,z)| \leqslant | h^{t}(x,z)|\cdot \frac{\beta }{1+\beta }\). Since \(\sum _{z'}| h^{t}(x,z')| = \sum _{z'}{ \max (h^{t}(x,z'),0)} - \sum _{z'}{ \min (h^{t}(x,z'),0)} = 1 + 2\beta \), we get \(\sum _{z} |h(x,z) - h^{t}(x,z)| = O(\beta )\), which is \(O(\epsilon )\) by Claim 3 for \(\gamma \) defined as in Claim 5.
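The rescaling in Claim 6 amounts to "clip the negative masses and renormalize"; a minimal sketch with toy numbers of our own:

```python
def to_distribution(signed):
    """Replace negative masses by 0 and renormalize (the map of Claim 6)."""
    clipped = {z: max(p, 0.0) for z, p in signed.items()}
    total = sum(clipped.values())     # equals 1 + NegativeMass(signed)
    return {z: p / total for z, p in clipped.items()}

# A signed measure with total mass 1 and total negative mass beta = 0.1.
signed = {0: 0.6, 1: 0.5, 2: -0.1}
dist = to_distribution(signed)
```

When the total negative mass \(\beta \) is \(O(\epsilon )\), the output distribution is \(O(\epsilon )\)-statistically close to the clipped input, exactly as the claim asserts.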
Recall that we have constructed a probability measure h approximating the probability mass function g; however, h is not yet a sampler. We can fix this by rejection sampling, as shown below.
Claim 7
(From the pmf to the sampler). There exists a (probabilistic) function \(h_{\mathsf {sim}}:\mathcal {X}\rightarrow \mathcal {Z}\) which calls h(x, z) (defined as above) at most \(O(|\mathcal {Z}|\log (1/\epsilon ))\) times and whose output distribution is \(\epsilon \)-close to \(h(x,\cdot )\) for every x.
The proof goes by a simple rejection sampling argument: we sample a point \(z\leftarrow \mathcal {Z}\) uniformly at random and accept it with probability h(x, z). The acceptance probability in a single round is \(\frac{1}{|\mathcal {Z}|}\). If we repeat the experiment \(|\mathcal {Z}|\log (1/\epsilon )\) times, then the probability that every round rejects is only \(\epsilon \). On the other hand, conditioned on acceptance, we get a distribution identical to \(h(x,\cdot )\). So the distance is at most \(\epsilon \), as claimed.
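The argument translates directly into code; a minimal sketch, where the fallback output after total failure is our own arbitrary choice and only costs the claimed \(\epsilon \) in statistical distance:

```python
import math
import random

def rejection_sample(h_row, eps, rng=random):
    """Sampler from Claim 7 (sketch): draw z uniformly and accept it with
    probability h_row[z]. Per-round acceptance probability is 1/|Z|, so
    after |Z|*log(1/eps) rounds the failure probability is at most eps."""
    z_size = len(h_row)
    rounds = math.ceil(z_size * math.log(1.0 / eps))
    for _ in range(rounds):
        z = rng.randrange(z_size)
        if rng.random() < h_row[z]:   # accept with probability h_row[z]
            return z
    return 0  # every round rejected (probability <= eps); default output
```

Conditioned on acceptance, the output is distributed exactly as `h_row`, which is all the claim needs.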
The last two claims prove that the distribution of \(h_{\mathsf {sim}}(x)\) is \((s,O(\epsilon ))\)-close to \(h^{t}_x = h^{t}(x,\cdot )\) for every x. Since \(h^{t}\), as a function of (x, z), is \((s,\epsilon )\)-close to g, and g is the conditional distribution of Z|X, we obtain
and the complexity of the final sampler \(h_{\mathsf {sim}}(X)\) is \(O(|\mathcal {Z}|^5\epsilon ^{-2})\).
4 Time-Success Ratio Under Algebraic Transformations
In Lemma 1 below we provide a quantitative analysis of how the time-success ratio changes under concrete formulas in security reductions.
Lemma 1
(Time-success ratio for algebraic transformations). Let a, b, c and A, B, C be positive constants. Suppose that \(P'\) is secure against adversaries \((s',\epsilon ')\), whenever P is secure against adversaries \((s,\epsilon )\), where
In addition, suppose that the following condition is satisfied
Then the following is true: if P is \(2^{k}\)-secure, then \(P'\) is \(2^{k'}\)-secure (in the sense of Definition 4) where
The proof is elementary though not immediate. It can be found in [Skó15].
Remark 5
(On the technical condition (11)). This condition is satisfied in almost all applications, as in the reduction proof \(\epsilon '\) typically cannot be better (meaning a higher exponent) than \(\epsilon \). Thus, quite often we have \(A\leqslant 1\).
Notes
- 1.
Indeed, consider the simplest case \(\mathcal {Z} = \{0,1\}\), define X to be uniform over \(\mathcal {X}=\{0,1\}^n\), and take \(Z=f(X)\) where f is a function that is 0.5-hard to predict by circuits of size exponential in n. Then (X, h(X)) and (X, Z) are at least \(\frac{1}{4}\)-apart in total variation.
- 2.
If two distributions can be distinguished by a randomized circuit, we can fix a specific choice of coins to achieve at least the same advantage.
- 3.
We briefly sketch the idea of the proof: note first that it is easy to construct a simulator for every single distinguisher. Having realized that, we can use the min-max theorem to switch the quantifiers and get one simulator for all distinguishers.
- 4.
This is a direct consequence of the fact that we want \(\ell \) to fit poly-preserving reductions.
- 5.
The oracle evaluates the distance between the given candidate solution and the simulated distribution, answering with a distinguisher if the distance is larger than required.
- 6.
As we already mentioned, we can assume that \(\textsc {D}\) is deterministic without loss of generality. Then all the terms in the equation are well-defined.
- 7.
By definition, it requires computing the average of \(\textsc {D}(x,\cdot )\) over \(2^{\ell }\) elements.
- 8.
We note that in a more standard notion the entire stream \(X_1,\ldots ,X_{q}\) is indistinguishable from random. This is implied by the notion above by a standard hybrid argument, with a loss of a multiplicative factor of q in the distinguishing advantage.
- 9.
We consider the security of \(\mathsf {AES256}\) as a weak PRF, and not a standard PRF, because of non-uniform attacks which show that no PRF with a k-bit key can have \(s/\epsilon \approx 2 ^k\) security [DTT09], at least unless we additionally require \(\epsilon \gg 2^{-k/2}\).
- 10.
This is a standard assumption in indistinguishability proofs. We can always extend the class by adding \(-\textsc {D}\) for every \(\textsc {D}\in \mathcal {D}\), which increases the complexity only by 1.
- 11.
The RAM model.
- 12.
- 13.
Or can be concluded from the parallelogram identity \((x+y)^2 + (x-y)^2 = 2x^2+2y^2\).
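For completeness, the expansion behind the parallelogram identity:

```latex
(x+y)^2 + (x-y)^2
  = \left(x^2 + 2xy + y^2\right) + \left(x^2 - 2xy + y^2\right)
  = 2x^2 + 2y^2 .
```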
References
Buldas, A., Laanoja, R.: Security proofs for hash tree time-stamping using hash functions with small output size. In: Boyd, C., Simpson, L. (eds.) ACISP. LNCS, vol. 7959, pp. 235–250. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39059-3_16
Chung, K.-M., Lui, E., Pass, R.: From weak to strong zero-knowledge and applications. In: Dodis, Y., Nielsen, J.B. (eds.) TCC 2015. LNCS, vol. 9014, pp. 66–92. Springer, Heidelberg (2015). doi:10.1007/978-3-662-46494-6_4
Dziembowski, S., Pietrzak, K.: Leakage-resilient cryptography. In: Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC, USA, FOCS 2008, pp. 293–302. IEEE Computer Society (2008)
Dodis, Y., Pietrzak, K.: Leakage-resilient pseudorandom functions and side-channel attacks on Feistel networks. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 21–40. Springer, Heidelberg (2010). doi:10.1007/978-3-642-14623-7_2
De, A., Trevisan, L., Tulsiani, M.: Non-uniform attacks against one-way functions and PRGs. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 16, p. 113 (2009)
Frieze, A.M., Kannan, R.: Quick approximation to matrices and applications. Combinatorica 19(2), 175–220 (1999)
Faust, S., Pietrzak, K., Schipper, J.: Practical leakage-resilient symmetric cryptography. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 213–232. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33027-8_13
Fuller, B., Reyzin, L.: Computational entropy and information leakage. Cryptology ePrint Archive, report 2012/466 (2012). http://eprint.iacr.org/
Gentry, C., Wichs, D.: Separating succinct non-interactive arguments from all falsifiable assumptions. In: Fortnow, L., Vadhan, S.P. (eds.) STOC, pp. 99–108. ACM (2011)
Impagliazzo, R.: Hard-core distributions for somewhat hard problems. In: 36th Annual Symposium on Foundations of Computer Science, pp. 538–545. IEEE (1995)
Jetchev, D., Pietrzak, K.: How to fake auxiliary input. In: Lindell, Y. (ed.) TCC 2014. LNCS, vol. 8349, pp. 566–590. Springer, Heidelberg (2014)
Luby, M.: Pseudorandomness and Cryptographic Applications. Princeton University Press, Princeton (1994)
Pietrzak, K.: A leakage-resilient mode of operation. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 462–482. Springer, Heidelberg (2009). doi:10.1007/978-3-642-01001-9_27
Pietrzak, K.: Private communication, May 2015
Reingold, O., Trevisan, L., Tulsiani, M., Vadhan, S.: Dense subsets of pseudorandom sets. In: Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC, USA, FOCS 2008, pp. 76–85. IEEE Computer Society (2008)
Skórski, M.: Time-advantage ratios under simple transformations: applications in cryptography. Cryptography and Information Security in the Balkans - Second International Conference, BalkanCryptSec: Koper, Slovenia, 3–4 September 2015. Revised Selected Papers, pp. 79–91 (2015)
Trevisan, L., Tulsiani, M., Vadhan, S.: Regularity, boosting, and efficiently simulating every high-entropy distribution. In: Proceedings of the 24th Annual IEEE Conference on Computational Complexity, Washington, DC, USA, CCC 2009, pp. 126–136. IEEE Computer Society (2009)
Vadhan, S., Zheng, C.J.: A uniform min-max theorem with applications in cryptography. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 93–110. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40041-4_6
Yu, Y., Standaert, F.-X.: Practical leakage-resilient pseudorandom objects with minimum public randomness. In: Dawson, E. (ed.) CT-RSA 2013. LNCS, vol. 7779, pp. 223–238. Springer, Heidelberg (2013). doi:10.1007/978-3-642-36095-4_15
Appendices
A More on the Flaw in [JP14]
In the original setting we have \(\mathcal {Z} = \{0,1\}^{\lambda }\). In the proof of the claimed better bound \(O\left( s\cdot 2^{3\lambda }\epsilon ^{-2}\right) \) there is a mistake on page 18 (eprint version), where the authors enforce a signed measure to be a probability measure by a mass-shifting argument. The number M defined there is in fact a function of x and is hard to compute, whereas the original proof assumes that it is a constant independent of x. During the iterations of the boosting loop, this number is used to modify the distinguisher class step by step, which drastically blows up the complexity (exponentially in the number of steps, which is already polynomial in \(1/\epsilon \)). In the min-max based proof giving the bound \(O\left( s\cdot 2^{3\lambda }\epsilon ^{-4}\right) \), a fixable flaw is a missing factor of \(2^{\lambda }\) in the complexity (page 16 in the eprint version), which arises because what is constructed in the proof is only a probability mass function, not yet a sampler [Pie15].
B Proof of Claim 2
We can rewrite Eq. (6) as
First, note that
As to the second term in Eq. (13), we observe that
C Proof of Claim 3
Proof
(Proof of Claim 3). We start by comparing the total negative mass in the functions \(h^{t+1}=h^{t}+\overline{\textsc {D}}^{t+1}+\theta ^{t+1}\) and \({h}^{t} \). Suppose first that \(\tilde{h}^{t+1}(x,z_0) < 0\) where \(z_0 = z_{\text {min}}^{t}(x)\). Since \(\sum _{z\not =z_0} \tilde{h}^{t+1}(x,z) = 1-\tilde{h}^{t+1}(x,z_0)\), there exists \(z_1\) such that \( \tilde{h}^{t+1}(x,z_1) \geqslant \frac{1-\tilde{h}^{t+1}(x,z_0)}{|\mathcal {Z}| - 1} > 0 \). Combining this with Eq. (4) we obtain
These observations together with Eq. (3) give us
where the inequality line follows from \(\tilde{h}^{t+1}(x,z_0) < 0\) and Eq. (16). But by the definition of \(z_0= z_{\text {min}}^{t}(x)\) we have \(\tilde{h}^{t+1}(x,z_0) = \min _{z} \tilde{h}^{t+1}(x,z)\). Since this value is negative, we get
Combining Eqs. (17) and (18) we obtain
Since \(| h^{t+1}(x,z)-\tilde{h}^{t}(x,z)| \leqslant \gamma \) by Eq. (3), we get the following recursion
which can be rewritten as
which is in addition trivially true if \(\tilde{h}^{t+1}(x,z) \geqslant 0\) for all z. Since we have \(\textsf {NegativeMass}\left( {h}^{0}(x,\cdot ) \right) = 0\), expanding this recursion down to \(t=0\) gives the upper bound \(|\mathcal {Z}|\gamma \cdot \sum _{j\leqslant t+1} \left( 1-|\mathcal {Z}|^{-2}\right) ^{j}\), which is smaller than \(|\mathcal {Z}|^{3}\gamma \) by the convergence of the geometric series. This finishes the proof of the first part.
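The geometric-series step, written out:

```latex
|\mathcal{Z}|\gamma \sum_{j \leqslant t+1} \left(1-|\mathcal{Z}|^{-2}\right)^{j}
  \;<\; |\mathcal{Z}|\gamma \sum_{j=0}^{\infty} \left(1-|\mathcal{Z}|^{-2}\right)^{j}
  \;=\; |\mathcal{Z}|\gamma \cdot \frac{1}{|\mathcal{Z}|^{-2}}
  \;=\; |\mathcal{Z}|^{3}\gamma .
```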
To prove the second part, recall that by the definition of \(z_0\) we have \(\tilde{h}^{t+1}(x,z_0) = \min _{z} \tilde{h}^{t+1}(x,z)\). Suppose that \(\tilde{h}^{t+1}(x,z_0) < 0\) (that is, there is a negative mass in \(\widetilde{h}^{t+1}(x,\cdot )\)). Now, by the definition of \(h^{t+1}\), we get
Suppose that \(\widetilde{h}^{t+1}(x,z) + \frac{ |\widetilde{h}^{t+1}(x,z_0)|}{|\mathcal {Z}|-1} \leqslant 0\) for some z. Then, by the definition of \(z_0\), we also have
From this we conclude that for any z we have
and thus
which means that (still assuming that \(\widetilde{h}^{t+1}(x,z_0) < 0\))
Note that \(0\geqslant \min \left( \widetilde{h}^{t+1}(x,z),0\right) \geqslant \min \left( {h}^{t}(x,z),0\right) -\gamma \) by the definition of \(h^{t+1}\) and \(\widetilde{h}^{t+1}\). Then
Note that this inequality is true even if \(\widetilde{h}^{t+1}(x,z_0) = 0\), that is \(\widetilde{h}^{t+1}(x,z) \geqslant 0\) for all z as then \({h}^{t+1}(x,z)\geqslant 0\) for all z. By expanding this recursion, and noticing that \(\min (h^{0}(x,z),0) = 0\) for all x, z by definition, we get
D Proof of Claim 4
Proof
If \(\theta ^{t+1}(x,z) = 0\) then there is nothing to prove. Suppose that \(\theta ^{t+1}(x,z) < 0\). Let \(z_0=z_{\text {min}}^{t}(x)\). According to Eq. (4) we have \(\theta ^{t+1}(x,z_0) = -\tilde{h}^{t+1}(x,z_0)\) and \(\theta ^{t+1}(x,z) = \frac{\tilde{h}^{t+1}(x,z_0)}{|\mathcal {Z}|-1}\) for \(z\not =z_0\). Therefore
and
Putting Eqs. (22) and (23) together we obtain
which is positive because \(\tilde{h}^{t+1}(x,z_0)<0\) and \(g(x,z_0) \geqslant 0\). This proves Claim 4.
© 2016 International Association for Cryptologic Research
Skórski, M. (2016). Simulating Auxiliary Inputs, Revisited. In: Hirt, M., Smith, A. (eds.) Theory of Cryptography. TCC 2016. Lecture Notes in Computer Science, vol. 9985. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53641-4_7