1 Introduction

Sources of reproducible secret random bits are necessary for many cryptographic applications. In many situations these bits are not explicitly stored for future use, but are obtained by repeating the same process (such as reading a biometric or a physically unclonable function) that generated them the first time. However, bits obtained this way present a problem: noise [4, 8, 12, 14, 19, 30, 31, 33, 37, 39, 43]. That is, when a secret is read multiple times, readings are close (according to some metric) but not identical. To utilize such sources, it is often necessary to remove noise, in order to derive the same value in subsequent readings.

The same problem occurs in the interactive setting, in which the secret channel used for transmitting the bits between two users is noisy and/or leaky [42]. Bennett, Brassard, and Robert [4] identify two fundamental tasks. The first, called information reconciliation, removes the noise without leaking significant information. The second, known as privacy amplification, converts the high entropy secret to a uniform random value. In this work, we consider the noninteractive version of these problems, in which these tasks are performed together with a single message.

The noninteractive setting is modeled by a primitive called a fuzzy extractor [13], which consists of two algorithms. The generate algorithm (\({\mathsf {Gen}} \)) takes an initial reading w and produces an output \({\mathsf {key}} \) along with a nonsecret helper value p. The reproduce algorithm (\({\mathsf {Rep}} \)) takes the subsequent reading \(w'\) along with the helper value p to reproduce \({\mathsf {key}} \). The correctness guarantee is that \({\mathsf {key}} \) is reproduced whenever the distance between w and \(w'\) is at most t.

The security requirement for fuzzy extractors is that \({\mathsf {key}} \) is uniform even to a (computationally unbounded) adversary who has observed p. This requirement is harder to satisfy as the allowed error tolerance t increases, because it becomes easier for the adversary to guess \({\mathsf {key}} \) by guessing a \(w'\) within distance t of w and running \({\mathsf {Rep}} (w',p)\).

Fuzzy Min-Entropy. We introduce a new entropy notion that precisely measures how hard it is for the adversary to guess a value within distance t of the original reading w. Suppose w is sampled from a distribution W. To have the maximum chance that \(w'\) is within distance t of w, the adversary would want to maximize the total probability mass of W within the ball \(B_t(w')\) of radius t around \(w'\). We therefore define fuzzy min-entropy

$$\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) \mathop {=}\limits ^\mathrm{def}-\log \max _{w'} \Pr [W\in B_t(w')].$$
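To make the definition concrete, the following minimal snippet (an illustrative sketch of ours, not part of the formal development) computes \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\) by brute force over the binary Hamming metric; it enumerates all \(2^n\) candidate centers \(w'\), so it is feasible only for very small n.

```python
# Brute-force t-fuzzy min-entropy over {0,1}^n with the Hamming metric,
# following the definition above directly (exponential in n).
from itertools import product
from math import log2

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def fuzzy_min_entropy(dist, n, t):
    """dist: dict mapping n-bit tuples to probabilities summing to 1."""
    best_mass = 0.0
    for center in product((0, 1), repeat=n):   # candidate adversarial guesses w'
        mass = sum(p for w, p in dist.items() if hamming(w, center) <= t)
        best_mass = max(best_mass, mass)
    return -log2(best_mass)

# Flat distribution on four points of {0,1}^4 with pairwise distance >= 2:
W = {(0,0,0,0): 0.25, (0,0,1,1): 0.25, (1,1,0,0): 0.25, (1,1,1,1): 0.25}
print(fuzzy_min_entropy(W, n=4, t=1))   # 1.0: a radius-1 ball around 0001 captures
                                        # two points, so the fuzzy min-entropy is
                                        # 1 even though the min-entropy is 2
```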

The security of the resulting key cannot exceed the fuzzy min-entropy (Proposition 1).

However, existing constructions do not measure their security in terms of fuzzy min-entropy; instead, their security is shown to be the min-entropy of W, denoted \(\mathrm {H}_\infty (W)\), minus a loss for error tolerance that is at least \(\log |B_t|\), the logarithm of the volume of a ball of radius t. Since (trivially) \(\mathrm {H}_\infty (W)-\log |B_t| \le \mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\), it is natural to ask whether this loss is necessary. This question is particularly relevant when the gap between the two sides of the inequality is large. As an example, iris scans appear to have significant \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) (because iris scans for different people appear to be well-spread in the metric space [11]) but negative \(\mathrm {H}_\infty (W) -\log |B_t|\) [6, Sect. 5]. We therefore ask: is fuzzy min-entropy sufficient for fuzzy extraction? There is evidence that it may be sufficient when the security requirement is computational rather than information-theoretic (see Sect. 1.2). We provide an answer for the case of information-theoretic security in two settings.

Contribution 1: Sufficiency of \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) for a Precisely Known W. It should be easier to construct a fuzzy extractor when the designer has precise knowledge of the probability distribution function of W. In this setting, we show that it is possible to construct a fuzzy extractor that extracts a \({\mathsf {key}} \) almost as long as \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) (Theorem 1). Our construction crucially utilizes the probability distribution function of W and, in particular, cannot necessarily be realized in polynomial time (this is similar, for example, to the interactive information-reconciliation feasibility result of [34]). This result shows that \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) is a necessary and sufficient condition for building a fuzzy extractor for a given distribution W.

A number of previous works in the precise knowledge setting have provided efficient algorithms and tight bounds for specific distributions—generally the uniform distribution or i.i.d. sequences (for example, [20, 26–28, 38, 41]). Our characterization unifies previous work, and justifies using \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) as the measure of the quality of a noisy distribution, rather than cruder measures such as \(\mathrm {H}_\infty (W)-\log |B_t|\). Our construction can be viewed as a reference to evaluate the quality of efficient constructions in the precise knowledge setting by seeing how close they get to extracting all of \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\).

Contribution 2: The Cost of Distributional Uncertainty. Assuming precise knowledge of a distribution W is often unrealistic for high-entropy distributions; they can never be fully observed directly and must therefore be modeled. It is imprudent to assume that the designer’s model of a distribution is completely accurate—the adversary, with greater resources, would likely be able to build a better model. (In particular, the adversary has more time to build the model after a particular construction is deployed.) Because of this, existing designs work for a family of sources (for example, all sources of min-entropy at least m with at most t errors). The fuzzy extractor is designed given only knowledge of the family. The attacker may know more about the distribution than the designer. We call this the distributional uncertainty setting.

Our second contribution is a set of negative results for this more realistic setting. We provide two impossibility results for fuzzy extractors. Both demonstrate families \(\mathcal {W}\) of distributions over \(\{0,1\}^n\) such that each distribution in the family has \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\) linear in n, but no fuzzy extractor can be secure for most distributions in \(\mathcal {W}\). Thus, a fuzzy extractor designer who knows only that the distribution comes from \(\mathcal {W}\) is faced with an impossible task, even though our positive result, Theorem 1, shows that fuzzy extractors can be designed for each distribution in the family individually.

The first impossibility result (Theorem 2) assumes that \({\mathsf {Rep}} \) is perfectly correct and rules out fuzzy extractors for distributions with fuzzy min-entropy as high as \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\approx 0.18n\). The second impossibility result (Theorem 3), relying on the work of Holenstein and Renner [25], also rules out fuzzy extractors in which \({\mathsf {Rep}} \) is allowed to make a mistake, but applies only to distributions with fuzzy min-entropy up to \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\approx 0.07n\).

We also provide a third impossibility result (Theorem 4), this time for an important building block called a “secure sketch,” which is used in most fuzzy extractor constructions (in order to allow \({\mathsf {Rep}} \) to recover the original w from the input \(w'\)). The result rules out secure sketches for a family of distributions with fuzzy min-entropy as high as 0.5n, even if the secure sketches are allowed to make mistakes. Because secure sketches are used in most fuzzy extractor constructions, the result suggests that building a fuzzy extractor for this family will be very difficult. We define secure sketches formally in Sect. 7.

These impossibility results motivate further research into computationally, rather than information-theoretically, secure fuzzy extractors (Sect. 1.2).

1.1 Our Techniques

Techniques for Positive Results for a Precisely Known Distribution. We now explain how to construct a fuzzy extractor for a precisely known distribution W with fuzzy min-entropy. We begin with distributions in which all points in the support have the same probability (so-called “flat” distributions). \({\mathsf {Gen}} \) simply extracts a key from the input w using a randomness extractor. Consider some subsequent reading \(w'\). To achieve correctness, the string p must permit \({\mathsf {Rep}} \) to disambiguate which point \(w\in W\) within distance t of \(w'\) was given to \({\mathsf {Gen}} \). Disambiguating multiple points can be accomplished by universal hashing, as long as the size of the hash output space is slightly greater than the number of possible points. Thus, \({\mathsf {Gen}} \) includes in the public value p a “sketch” of w computed via a universal hash of w. To determine the length of that sketch, consider the heaviest (according to W) ball \(B^*\) of radius t. Because the distribution is flat, \(B^*\) is also the ball with the most points of nonzero probability. Thus, the length of the sketch needs to be slightly greater than the logarithm of the number of non-zero probability points in \(B^*\). Since \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) is determined by the weight of \(B^*\), the number of points cannot be too high and there will be entropy left after the sketch is published. This remaining entropy suffices to extract a key.

For an arbitrary distribution, we cannot afford to disambiguate points in the ball with the greatest number of points, because there could be too many low-probability points in a single ball despite a high \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\). We solve this problem by splitting the arbitrary distribution into a number of nearly flat distributions we call “levels.” We then write down, as part of the sketch, the level of the original reading w and apply the above construction considering only points in that level. We call this construction leveled hashing (Construction 1).
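As a small illustration (our own sketch, not the paper's Construction 1, which additionally bounds the number of levels that matter), the layering step can be written as follows; within a level, point probabilities differ by at most a factor of two.

```python
# Split a precisely known distribution into nearly flat "levels":
# level j contains the points w with 2^-(j+1) < Pr[W = w] <= 2^-j.
from math import floor, log2

def level_of(prob):
    return floor(-log2(prob))

def split_into_levels(dist):
    """dist: dict {point: probability}; returns dict {level: list of points}."""
    levels = {}
    for w, p in dist.items():
        levels.setdefault(level_of(p), []).append(w)
    return levels
```

\({\mathsf {Gen}} \) would then record level_of(\(\Pr [W=w]\)) as part of p and hash only within that level.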

Techniques for Negative Results for Distributional Uncertainty. We construct a family of distributions \(\mathcal {W}\) and prove impossibility for a uniformly random \(W \leftarrow \mathcal {W}\). We start by observing the following asymmetry: \({\mathsf {Gen}} \) sees only the sample w (obtained via \(W\leftarrow \mathcal {W}\) and \(w\leftarrow W\)), while the adversary knows W.

To exploit the asymmetry, in our first impossibility result (Theorem 2), we construct \(\mathcal {W}\) so that conditioning on the knowledge of W reduces the distribution to a small subspace (namely, all points on which a given hash function produces a given output), but conditioning on only w leaves the rest of the distribution uniform on a large fraction of the entire space. An adversary can exploit the knowledge of the hash value to reduce the uncertainty about \({\mathsf {key}} \), as follows.

The nonsecret value p partitions the metric space into regions that produce a consistent value under \({\mathsf {Rep}} \) (preimages of each \({\mathsf {key}} \) under \({\mathsf {Rep}} (\cdot , p)\)). For each of these regions, the adversary knows that possible w lie at distance at least t from the boundary of the region (else, the fuzzy extractor would have a nonzero probability of error). However, in the Hamming space, the vast majority of points lie near the boundary (this result follows by combining the isoperimetric inequality [21], which shows that the ball has the smallest boundary, with bounds on the volume of the interior of a ball, which show that this boundary is large). This allows the adversary to rule out so many possible w that, combined with the adversarial knowledge of the hash value, many regions become empty, leaving \({\mathsf {key}} \) far from uniform.
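As a small numerical illustration of the “most points lie near the boundary” step (our own check, with arbitrarily chosen parameters): even when the region is a Hamming ball, the points whose entire radius-t neighborhood stays inside the region form an exponentially small fraction of it.

```python
# Fraction of a Hamming ball of radius r in {0,1}^n occupied by its t-interior,
# i.e., by the ball of radius r - t (points at distance > t from the complement).
from math import comb

def ball_volume(n, r):
    return sum(comb(n, i) for i in range(r + 1))

n, r, t = 200, 100, 40
print(ball_volume(n, r - t) / ball_volume(n, r))   # on the order of 10**-8
```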

For the second impossibility result (Theorem 3, which rules out even fuzzy extractors that are allowed a possibility of error), we let the adversary know some fraction of the bits of w. Holenstein and Renner [25] showed that if the adversary knows each bit of w with sufficient probability, and bits of \(w'\) differ from bits of w with sufficient probability, then so-called “information-theoretic key agreement” is impossible. Converting the impossibility of information-theoretic key agreement to impossibility of fuzzy extractors takes a bit of technical work.

1.2 Related Settings

Other Settings with Close Readings: \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\) is Sufficient. The security definition of fuzzy extractors can be weakened to protect only against computationally bounded adversaries [17]. In this computational setting, for most distance metrics a single fuzzy extractor can simultaneously secure all possible distributions by using virtual grey-box obfuscation for all circuits in \(\mathtt {NC}^1\) [5]. This construction is secure whenever an adversary with oracle access to the program functionality can learn \({\mathsf {key}}\) only rarely. The distributions with fuzzy min-entropy are exactly those for which an adversary with oracle access to the functionality learns \({\mathsf {key}}\) with only negligible probability. Thus, extending our negative result to the computational setting would have negative implications for the existence of such obfuscation.

Furthermore, the functional definition of fuzzy extractors can be weakened to permit interaction between the party having w and the party having \(w'\). Such a weakening is useful for secure remote authentication [7]. When both interaction and computational assumptions are allowed, secure two-party computation can produce a key that will be secure whenever the distribution W has fuzzy min-entropy. The two-party computation protocol needs to be secure without assuming authenticated channels; it can be built under the assumptions that collision-resistant hash functions and enhanced trapdoor permutations exist [3].

Correlated Rather than Close Readings. A different model for the problem of key derivation from noisy sources does not explicitly consider the distance between w and \(w'\), but rather views w and \(w'\) as samples drawn from a correlated pair of random variables. This model is considered in multiple works, including [1, 10, 29, 42]; recent characterizations of when key derivation is possible in this model include [35, 40]. In particular, Hayashi et al. [22] independently developed an interactive technique similar to our noninteractive leveled hashing, which they called “spectrum slicing.” To the best of our knowledge, prior results on correlated random variables are in the precise knowledge setting; we are unaware of works that consider the cost of distributional uncertainty.

2 Preliminaries

Random Variables. We generally use uppercase letters for random variables and corresponding lowercase letters for their samples. A repeated occurrence of the same random variable in a given expression signifies the same value of the random variable: for example \((W, {\mathsf {SS}} (W))\) is a pair of random variables obtained by sampling w according to W and applying the algorithm \({\mathsf {SS}} \) to w.

The statistical distance between random variables A and B with the same domain is \(\mathbf {SD}(A,B) = \frac{1}{2} \sum _a |\Pr [A=a] - \Pr [B=a]| = \max _S \left( \Pr [A\in S]-\Pr [B\in S]\right) \).
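For concreteness, statistical distance can be computed directly for small, explicitly given distributions (an illustrative helper of ours):

```python
# Statistical distance between two distributions on the same finite domain,
# each given as a dict {outcome: probability}.
def statistical_distance(A, B):
    domain = set(A) | set(B)
    return 0.5 * sum(abs(A.get(x, 0.0) - B.get(x, 0.0)) for x in domain)
```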

Entropy. Unless otherwise noted, logarithms are base 2. Let (X, Y) be a pair of random variables. Define the min-entropy of X as \(\mathrm {H}_\infty (X) = -\log (\max _x \Pr [X=x])\), and the average (conditional) min-entropy of X given Y as \(\tilde{\mathrm {H}}_\infty (X|Y) = -\log ({{\mathrm{\mathbb {E}}}}_{y\in Y} \max _{x} \Pr [X=x|Y=y])\) [13, Sect. 2.4]. Define Hartley entropy \(H_0(X)\) to be the logarithm of the size of the support of X, that is, \(H_0(X) = \log |\{x | \Pr [X=x]>0\}|\). Define average-case Hartley entropy by averaging the support size: \(\tilde{H}_0(X |Y) = \log ({{\mathrm{\mathbb {E}}}}_{y\in Y} |\{x | \Pr [X=x |Y=y]>0\}|)\). For \(0<p<1\), define the binary entropy \(h_2(p) = -p \log p - (1-p)\log (1-p)\); it is the Shannon entropy of any random variable that is 0 with probability p and 1 with probability \(1-p\).
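The following illustrative helpers (ours) compute these quantities for small, explicitly given distributions; note that average conditional min-entropy averages the maximum probability, not the entropy, over y.

```python
# Min-entropy and average conditional min-entropy for small distributions.
from math import log2

def min_entropy(px):
    """px: dict {x: Pr[X=x]}."""
    return -log2(max(px.values()))

def avg_cond_min_entropy(joint):
    """joint: dict {(x, y): Pr[X=x, Y=y]}.
    Uses E_y[max_x Pr[X=x|Y=y]] = sum_y max_x Pr[X=x, Y=y]."""
    best = {}
    for (x, y), p in joint.items():
        best[y] = max(best.get(y, 0.0), p)
    return -log2(sum(best.values()))
```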

Randomness Extractors. We use randomness extractors [32], as defined for the average case in [13, Sect. 2.5].

Definition 1

Let \(\mathcal {M}\), \(\chi \) be finite sets. A function \(\mathtt {ext}: \mathcal {M}\times \{0,1\}^d \rightarrow \{0,1\}^\kappa \) is a \((\tilde{m}, \epsilon )\)-average-case extractor if for all pairs of random variables X, Y over \(\mathcal {M}, \chi \) such that \(\tilde{\mathrm {H}}_\infty (X|Y) \ge \tilde{m}\), we have

$$\mathbf {SD}((\mathtt {ext}(X, U_d), U_d, Y), U_\kappa \times U_d \times Y) \le \epsilon .$$

Metric Spaces and Balls. For a metric space \((\mathcal {M}, \mathsf {dis})\), the (closed) ball of radius t around w is the set of all points within radius t, that is, \(B_t(w) = \{w'| \mathsf {dis}(w, w')\le t\}\). If the size of a ball in a metric space does not depend on w, we denote by \(|B_t|\) the size of a ball of radius t. We consider the Hamming metric over vectors in \(\mathcal {Z}^n\) for some finite alphabet \(\mathcal {Z}\), defined via \(\mathsf {dis}(w,w') = |\{i | w_i \ne w'_i\}|\). \(U_\kappa \) denotes the uniformly distributed random variable on \(\{0,1\}^\kappa \).

We will use the following bounds on \(|B_t|\) in \(\{0,1\}^n\); see [2, Lemma 4.7.2, Eq. 4.7.5, p. 115] for proofs.

Lemma 1

Let \(\tau =t/n\). The volume \(|B_t|\) of the ball of radius t in the Hamming space \(\{0,1\}^n\) satisfies

$$ \frac{1}{\sqrt{8n\tau (1-\tau )}} \cdot 2^{nh_2(\tau )}\le |B_t|\le 2^{nh_2(\tau )}. $$
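A quick numerical sanity check of Lemma 1 at one (arbitrarily chosen) parameter setting:

```python
# Exact Hamming-ball volume versus the bounds of Lemma 1, at n = 100, t = 20.
from math import comb, log2, sqrt

def h2(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

n, t = 100, 20
tau = t / n
volume = sum(comb(n, i) for i in range(t + 1))
lower = 2 ** (n * h2(tau)) / sqrt(8 * n * tau * (1 - tau))
upper = 2 ** (n * h2(tau))
assert lower <= volume <= upper
```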

2.1 Fuzzy Extractors

In this section, we define fuzzy extractors, slightly modified from the work of Dodis et al. [13, Sect. 3.2]. First, we allow for error as discussed in [13, Sect. 8]. Second, in the distributional uncertainty setting we consider a general family \(\mathcal {W}\) of distributions instead of families containing all distributions of a given min-entropy. Let \(\mathcal {M}\) be a metric space with distance function \(\mathsf {dis}\).

Definition 2

An \((\mathcal {M}, \mathcal {W}, \kappa , t, \epsilon )\)-fuzzy extractor with error \(\delta \) is a pair of randomized procedures, “generate” \(({\mathsf {Gen}})\) and “reproduce” \(({\mathsf {Rep}})\). \({\mathsf {Gen}}\) on input \(w\in \mathcal {M}\) outputs an extracted string \({\mathsf {key}} \in \{0,1\}^\kappa \) and a helper string \(p\in \{0,1\}^*\). \({\mathsf {Rep}}\) takes \(w'\in \mathcal {M}\) and \(p\in \{0,1\}^*\) as inputs. \(({\mathsf {Gen}}, {\mathsf {Rep}})\) have the following properties:

  1. Correctness: if \(\mathsf {dis}(w, w')\le t\) and \(({\mathsf {key}}, p)\leftarrow {\mathsf {Gen}} (w)\), then \(\Pr [{\mathsf {Rep}} (w', p) = {\mathsf {key}} ] \ge 1-\delta .\)

  2. Security: for any distribution \(W\in \mathcal {W}\), if \(({\mathsf {Key}},P)\leftarrow {\mathsf {Gen}} (W)\), then \(\mathbf {SD}(({\mathsf {Key}},P),(U_\kappa ,P))\le \epsilon .\)

In the above definition, the errors (that is, the second reading \(w'\)) must be chosen before p is known in order for the correctness guarantee to hold.

The Case of a Precisely Known Distribution. If in the above definition we take \(\mathcal {W}\) to be a one-element set containing a single distribution W, then the fuzzy extractor is said to be for a precisely known distribution. In this case, we need to require correctness only for w that have nonzero probability. Note that we have no requirement that the algorithms are compact or efficient, and so the distribution can be fully known to them.

3 New Notion: Fuzzy Min-Entropy

The fuzzy extractor helper string p allows everyone, including the adversary, to find the output of \({\mathsf {Rep}} (\cdot , p)\) on any input \(w'\). Ideally, p should not provide any useful information beyond this ability, and the outputs of \({\mathsf {Rep}} \) on inputs that are too distant from w should provide no useful information, either. In this ideal scenario, the adversary is limited to trying to guess a \(w'\) that is t-close to w. The optimal strategy is to let \(w'\) be the center of the ball of radius t with maximum weight under W; we therefore measure the quality of a source by (the negative logarithm of) this weight.

Definition 3

The t-fuzzy min-entropy of a distribution W in a metric space \((\mathcal {M}, \mathsf {dis})\) is:

$$ \mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) = -\log \left( \max _{w'} \sum _{w\in \mathcal {M}| \mathsf {dis}(w, w')\le t} \Pr [W=w] \right) $$

Fuzzy min-entropy measures the functionality provided to the adversary by \({\mathsf {Rep}} \) (since p is public), and thus is a necessary condition for security. We formalize this statement in the following proposition.

Proposition 1

Let W be a distribution over \((\mathcal {M}, \mathsf {dis})\) with \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) =m\). Let \(({\mathsf {Gen}}, {\mathsf {Rep}})\) be a \((\mathcal {M}, \{W\}, \kappa , t, \epsilon )\)-fuzzy extractor with error \(\delta \). Then

$$ 2^{-\kappa }\ge 2^{-m}-\delta -\epsilon . $$

If \(\delta =\epsilon =2^{-\kappa }\), then \(3\cdot 2^{-\kappa }\ge 2^{-m}\), and so \(\kappa \) cannot exceed \(m+2\). Additionally, if the fuzzy min-entropy of the source is only logarithmic in a security parameter while the \(\delta \) and \(\epsilon \) parameters are negligible, then the extracted key must be of at most logarithmic length.

Proof

Let W be a distribution where \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) = m\). This means that there exists a point \(w' \in \mathcal {M}\) such that \(\Pr _{w\leftarrow W}[\mathsf {dis}(w, w')\le t] = 2^{-m}\). Consider the following distinguisher D: on input \(({\mathsf {key}}, p)\), if \({\mathsf {Rep}} (w', p) = {\mathsf {key}} \), then output 1, else output 0.

\(\Pr [D({\mathsf {Key}}, P) = 1]\ge 2^{-m} - \delta \), while \(\Pr [D(U_\kappa , P)=1 ]= 2^{-\kappa }\). Thus,

$$ \mathbf {SD}(({\mathsf {Key}}, P), (U_\kappa , P)) \ge \delta ^D(({\mathsf {Key}}, P), (U_\kappa , P))\ge 2^{-m} -\delta -2^{-\kappa }. $$

   \(\square \)

Proposition 1 extends to the settings of computational security and interactive protocols. Fuzzy min-entropy thus represents an upper bound on the security obtainable from a noisy source. However, there are many distributions with fuzzy min-entropy for which no information-theoretically secure fuzzy extractor (or corresponding impossibility result) is known.

We explore other properties of fuzzy min-entropy, not necessary for the proofs presented here, in the full version [18, Appendix E].

4 \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) is Sufficient in the Precise Knowledge Setting

In this section, we build fuzzy extractors that extract almost all of \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) for any distribution W. We reiterate that these constructions assume precise knowledge of W and are not necessarily polynomial-time. They should thus be viewed as feasibility results. We begin with flat distributions and then turn to arbitrary distributions.

4.1 Warm-Up for Intuition: Fuzzy Extractor for Flat Distributions

Let \({\text {supp}}(W)=\{w|\Pr [W=w]>0\}\) denote the support of a distribution W. A distribution W is flat if all elements of \({\text {supp}}(W)\) have the same probability. Our construction for this case is quite simple: to produce p, \({\mathsf {Gen}} \) outputs a hash of its input point w and an extractor seed; to produce \({\mathsf {key}} \), \({\mathsf {Gen}} \) applies the extractor to w. Given \(w'\), \({\mathsf {Rep}} \) looks for \(w\in {\text {supp}}(W)\) that is near \(w'\) and has the correct hash value, and applies the extractor to this w to get \({\mathsf {key}} \).

The specific hash function we use is universal. (We note that universal hashing has a long history of use for information reconciliation, for example [4, 34, 36]. This construction is not novel; rather, we present it as a stepping stone for the case of general distributions).

Definition 4

([9]). Let \(F : \mathcal {K} \times \mathcal {M}\rightarrow R\) be a function. We say that F is universal if for all distinct \(x_1, x_2 \in \mathcal {M}\):

$$ \mathop {\Pr }\limits _{K \leftarrow \mathcal {K}}[F(K, x_1) = F(K, x_2)] = \frac{1}{|R|} \;. $$

In our case, the hash output length needs to be sufficient to disambiguate elements of \({\text {supp}}(W) \cap B_t(w')\) with high probability. Observe that there are at most \(2^{\mathrm {H}_\infty (W)-\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)}\) such elements when W is flat, so an output length slightly greater (by \(\log 1/\delta \)) than \(\mathrm {H}_\infty (W)-\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) will suffice. Thus, the output \({\mathsf {key}} \) length will be \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)-\log 1/\delta - 2\log 1/\epsilon +2\) (by the average-case leftover hash lemma, per [13, Lemmas 2.2b and 2.4]). As this construction is only a warm-up, we do not state it formally and proceed to general distributions.
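Nevertheless, a minimal sketch may help fix ideas. The code below is our own illustration (the paper gives no pseudocode): bit strings are represented as integers, and both the sketch and the extractor are random binary linear maps, which form universal families; the search in rep scans the whole support, so the construction is not polynomial-time.

```python
# Warm-up Gen/Rep for a flat, precisely known distribution over {0,1}^n.
import secrets

def rand_linear_map(rows, cols):
    return [secrets.randbits(cols) for _ in range(rows)]   # random 0/1 matrix, one int per row

def apply_map(matrix, x):
    # matrix-vector product over GF(2): output bit i is the parity of row_i AND x
    return sum(((bin(row & x).count("1") & 1) << i) for i, row in enumerate(matrix))

def gen(w, n, sketch_len, key_len):
    H = rand_linear_map(sketch_len, n)   # universal hash: the "sketch" of w
    E = rand_linear_map(key_len, n)      # universal hash used as a seeded extractor
    p = (H, E, apply_map(H, w))          # public helper value
    return apply_map(E, w), p            # (key, p)

def rep(w_prime, p, support, t):
    H, E, h = p
    for w in support:                    # Rep knows supp(W), so it can search it
        if bin(w ^ w_prime).count("1") <= t and apply_map(H, w) == h:
            return apply_map(E, w)
    return None                          # failure, probability at most delta
```

Here sketch_len would be set to roughly \(\mathrm {H}_\infty (W)-\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)+\log 1/\delta \) and key_len to roughly the entropy remaining after the sketch is published, as discussed above.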

4.2 Fuzzy Extractor for Arbitrary Distributions

The hashing approach used in the previous subsection does not work for arbitrary sources. Consider a distribution W consisting of the following balls: \(B^1_t\) is a ball with \(2^{\mathrm {H}_\infty (W)}\) points and total probability \(\Pr [W\in B^1_t] =2^{-\mathrm {H}_\infty (W)}\), while \(B^2_t,..., B^{2^{\mathrm {H}_\infty (W)}}_t\) are balls with one point each, each with probability \(\Pr [W\in B^i_t] = 2^{-\mathrm {H}_\infty (W)}\). The above hashing algorithm writes down \(\mathrm {H}_\infty (W)\) bits to achieve correctness on \(B^1_t\). However, with probability \(1-2^{-\mathrm {H}_\infty (W)}\) the initial reading is outside of \(B^1_t\), and the hash completely reveals the point.

Instead, we use a layered approach: we separate the input distribution W into nearly-flat layers, write down the layer from which the input w came (i.e., the approximate probability of w) as part of p, and rely on the construction from the previous part for each layer. In other words, the hash function output is now variable-length, longer if the probability of w is lower. Thus, p now reveals a bit more about w. To limit this information and the resulting security loss, we limit the number of layers. As a result, we lose only \(1+\log H_0(W)\) more bits of security compared to the previous section. We emphasize that this additional loss is quite small: if W is over \(\{0,1\}^n\), it is only \(1+\log n\) bits (so, for example, only 11 bits if W is 1000 bits long, and no more than 50 bits for any remotely realistic W). We thus obtain the following theorem.

Theorem 1

For any metric space \(\mathcal {M}\), distribution W over \(\mathcal {M}\), distance t, error \(\delta >0\), and security \(\epsilon >0\), there exists a \((\mathcal {M}, \{W\}, \kappa , t, \epsilon )\)-known-distribution fuzzy extractor with error \(\delta \) for \(\kappa = \mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) - \log H_0(W) - \log 1/\delta - 2\log 1/\epsilon + 1\). (Note that the value \(\log H_0(W)\) is doubly logarithmic in the size of the support of W and is smaller than \(\log 1/\delta \) and \(\log 1/\epsilon \) for typical settings of parameters.)

We provide the construction and the proof in Appendix A. The main idea is that providing the level information makes the distribution look nearly flat (the probability of points differs by at most a factor of two, which increases the entropy loss as compared to the flat case by only one bit). And the level information itself increases the entropy loss by \(\log H_0(W)\) bits, because there are only \(H_0(W)\) levels that contain enough weight to matter.
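As an illustrative instantiation (our own numbers, not taken from the paper): if W is a distribution over \(\{0,1\}^{1000}\) with \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W) = 200\) and \(\delta =\epsilon =2^{-40}\), then \(\log H_0(W)\le \log 1000 < 10\), so Theorem 1 yields a key of length \(\kappa \ge 200 - 10 - 40 - 80 + 1 = 71\) bits.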

5 Impossibility of Fuzzy Extractors for a Family with \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\)

In the previous section, we showed the sufficiency of \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) for building fuzzy extractors when the distribution W is precisely known. However, it may be infeasible to completely characterize a high-entropy distribution W. Traditionally, algorithms deal with this distributional uncertainty by providing security for a family of distributions \(\mathcal {W}\). In this section, we show that distributional uncertainty comes at a real cost.

We demonstrate an example over the binary Hamming metric in which every \(W\in \mathcal {W}\) has linear \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)\) (which is in fact equal to \(\mathrm {H}_\infty (W)\)), and yet for some \(W\in \mathcal {W}\) any fuzzy extractor fails to be secure, even for 3-bit output keys and a high constant \(\epsilon =\frac{1}{4}\). In fact, we show that the adversary need not work hard: even a uniformly random choice of distribution W from \(\mathcal {W}\) will thwart the security of any \(({\mathsf {Gen}}, {\mathsf {Rep}})\). The one caveat is that, for this result, we require \({\mathsf {Rep}} \) to be always correct (i.e., \(\delta =0\)). As mentioned in the introduction, this perfect correctness requirement is removed in Sects. 6 and 7 at the cost of a lower entropy rate and a stronger primitive, respectively.

As basic intuition, the result is based on the following reasoning: \({\mathsf {Gen}} \) sees only a random sample w from a random \(W\in \mathcal {W}\), but not W. The adversary sees W but not w. Because \({\mathsf {Gen}} \) does not know which W the input w came from, \({\mathsf {Gen}} \) must produce p that works for many distributions W that contain w in their support. Such p must necessarily reveal a lot of information. The adversary can combine information gleaned from p with information about W to narrow down the possible choices for w and thus distinguish \({\mathsf {key}} \) from uniform.

Theorem 2

Let \(\mathcal {M}\) denote the Hamming space \(\{0, 1\}^n\). There exists a family of distributions \(\mathcal {W}\) over \(\mathcal {M}\) such that for each element \(W \in \mathcal {W}\), \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)=\mathrm {H}_\infty (W) \ge m\), and yet any \((\mathcal {M}, \mathcal {W}, \kappa , t, \epsilon )\)-fuzzy extractor with error \(\delta = 0\) has \(\epsilon > 1/4\).

This holds as long as \(\kappa \ge 3\) and under the following conditions on the entropy rate \(\mu =m/n\), noise rate \(\tau =t/n\), and n:

  • any \(0\le \tau < \frac{1}{2}\) and \(\mu >0\) such that \( \mu< 1-h_2(\tau ) \text{ and } \mu < 1- h_2\left( \frac{1}{2}-\tau \right) \)

  • any \(n\ge \max \left( \frac{2}{1-h_2(\tau )-\mu },\frac{5}{1- h_2\left( \frac{1}{2}-\tau \right) -\mu }\right) \).

Fig. 1. The region of \(\tau \) (x-axis) and \(\mu \) (y-axis) pairs for which Theorem 2 applies is the region below both curves.

Note that the conditions on \(\mu \) and \(\tau \) imply the result applies to any entropy rate \(\mu \le .18\) as long as \(\tau \) is set appropriately and n is sufficiently large (for example, the result applies to \(n\ge 1275\) and \(\tau = .6\sqrt{\mu }\) when \(0.08\le \mu \le .18\); similarly, it applies to \(n\ge 263\) and \(\tau = \sqrt{\mu }\) when \(0.01\le \mu \le 0.08\)). The \(\tau \) vs. \(\mu \) tradeoff is depicted in Fig. 1.
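The following small check (ours) evaluates the conditions of Theorem 2 at the example point \(\mu =0.18\), \(\tau =0.6\sqrt{\mu }\) quoted above.

```python
# Conditions of Theorem 2 at mu = 0.18, tau = 0.6*sqrt(mu).
from math import ceil, log2, sqrt

def h2(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

mu = 0.18
tau = 0.6 * sqrt(mu)
assert mu < 1 - h2(tau) and mu < 1 - h2(0.5 - tau)   # both entropy-rate conditions hold
n_min = max(2 / (1 - h2(tau) - mu), 5 / (1 - h2(0.5 - tau) - mu))
print(ceil(n_min))   # lower bound on n at this point; the example n >= 1275 satisfies it
```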

Proof (Sketch)

Here we describe the family \(\mathcal {W}\) and provide a brief overview of the main proof ideas. We provide a full proof in Appendix B. We will show the theorem holds for an average member of \(\mathcal {W}\). Let Z denote a uniform choice of W from \(\mathcal {W}\) and denote by \(W_z\) the choice specified by a particular value of z.

Let \(\{\mathsf {Hash}_\mathsf {k}\}_{\mathsf {k}\in {\mathcal {K}}}\) be a family of hash functions with domain \(\mathcal {M}\) and the following properties:

  • \(2^{-{a}}\)-universality: for all \(v_1\ne v_2 \in \mathcal {M}\), \(\Pr _{\mathsf {k}\leftarrow {\mathcal {K}}} [\mathsf {Hash}_\mathsf {k}(v_1)=\mathsf {Hash}_\mathsf {k}(v_2)]\le 2^{-{a}}\), where \({a}= n\cdot h_2\left( \frac{1}{2}-\tau \right) +3\).

  • \(2^m\)-regularity: for each \(\mathsf {k}\in {\mathcal {K}}\) and \(\mathsf {h}\) in the range of \(\mathsf {Hash}_\mathsf {k}\), \(|\mathsf {Hash}_\mathsf {k}^{-1}(\mathsf {h})|=2^m\), where \(m\ge \mu n\).

  • preimage sets have minimum distance \(t+1\): for all \(\mathsf {k}\in {\mathcal {K}}\), if \(v_1\ne v_2\) but \(\mathsf {Hash}_\mathsf {k}(v_1)=\mathsf {Hash}_\mathsf {k}(v_2)\), then \(\mathsf {dis}(v_1, v_2)> t\).

We show such a hash family exists in Appendix B. Let Z be the random variable consisting of pairs \((\mathsf {k}, \mathsf {h})\), where \(\mathsf {k}\) is uniform in \({\mathcal {K}}\) and \(\mathsf {h}\) is uniform in the range of \(\mathsf {Hash}_\mathsf {k}\). Let \(W_z\) for \(z=(\mathsf {k}, \mathsf {h})\) be the uniform distribution on \(\mathsf {Hash}_\mathsf {k}^{-1}(\mathsf {h})\). By the \(2^m\)-regularity and minimum distance properties of \(\mathsf {Hash}\), \(\mathrm {H}_\infty (W_z)=\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W_z)=m\). Let \(\mathcal {W}=\{W_z\}\).

The intuition is as follows. We now want to show that for a random \(z\leftarrow Z\), if \(({\mathsf {key}}, p)\) is the output of \({\mathsf {Gen}} (W_z)\), then \({\mathsf {key}} \) can be easily distinguished from uniform in the presence of p and z.

In the absence of information about z, the value w is uniform on \(\mathcal {M}\) (by regularity of \(\mathsf {Hash}\)). Knowledge of p reduces the set of possible w from \(2^n\) to \(2^{n\cdot h_2\left( \frac{1}{2}-\tau \right) }\), because, by correctness of \({\mathsf {Rep}} \), every candidate input w to \({\mathsf {Gen}} \) must be such that all of its neighbors \(w'\) of distance at most t produce the same output of \({\mathsf {Rep}} (w', p)\). And knowledge of z reduces the set of possible w by another factor of \(2^{{a}}\), because a hash value with a random hash function key likely gives fresh information about w.

6 Impossibility in the Case of Imperfect Correctness

The impossibility result in the previous section applies only to fuzzy extractors with perfect correctness. In this section, we build on the work of Holenstein and Renner [25] to show the impossibility of fuzzy extractors even when they are allowed to make mistakes a constant fraction \(\delta \) (as much as \(4\,\%\)) of the time. However, the drawback of this result, as compared to the previous section, is that we can show impossibility only for a relatively low entropy rate of at most \(7\,\%\). In Sect. 7, we rule out stronger primitives called secure sketches with nonzero error (which are used in most fuzzy extractor constructions), even for entropy rate as high as \(50\,\%\).

Theorem 3

Let \(\mathcal {M}\) denote the Hamming space \(\{0, 1\}^n\). There exists a family of distributions \(\mathcal {W}\) over \(\mathcal {M}\) such that for each element \(W \in \mathcal {W}\), \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)=\mathrm {H}_\infty (W) \ge m\), and yet any \((\mathcal {M}, \mathcal {W}, \kappa , t, \epsilon )\)-fuzzy extractor with error \(\delta \le \frac{1}{25}\) has \(\epsilon > \frac{1}{25}\).

This holds for any \(\kappa >0\) under the following conditions on the entropy rate \(\mu =m/n\), noise rate \(\tau =t/n\), and n:

  • any \(0\le \tau \le \frac{1}{2}\) and \(\mu \) such that \(\mu < 4\tau (1-\tau ) \left( 1-h_2\left( \frac{1}{4-4\tau }\right) \right) \)

  • any sufficiently large n (as a function of \(\tau \) and \(\mu \))

Note that the conditions on \(\mu \) and \(\tau \) imply that the result applies to any entropy rate \(\mu \le \frac{1}{15}\) as long as \(\tau \) is set appropriately and n is sufficiently large. The \(\tau \) vs. \(\mu \) tradeoff is depicted in Fig. 2.

Fig. 2. The region of \(\tau \) (x-axis) and \(\mu \) (y-axis) pairs for which Theorem 3 applies is the region below this curve.

Proof (Sketch)

We now describe the family \(\mathcal {W}\) and provide an overview of the main ideas. The full proof is in Appendix C.

Similarly to the proof of Theorem 2, we will prove that any fuzzy extractor fails for an element \(W_z\) of \(\mathcal {W}\) chosen according to the distribution Z. In this case, Z will not be uniform but rather binomial (with tails cut off). Essentially, Z will contain each bit of w with (appropriately chosen) probability \(\beta \); given \(Z=z\), the remaining bits of w will be uniform and independent.

For a string \(z\in \{0,1,\perp \}^n\), denote by \( info (z)\) the number of entries in z that are not \(\perp \): \( info (z)=|\{i \text{ s.t } z_i\ne \perp \}|\). Let \(W_z\) be the uniform distribution over all strings in \(\{0,1\}^n\) that agree with z in positions that are not \(\perp \) in z (i.e., all strings \(w\in \{0,1\}^n\) such that for \(1\le i\le n\), either \(z_i=\perp \) or \(w_i=z_i\)).
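A small sampling sketch (ours) of this family follows; it ignores the truncation of the tails of Z mentioned above, and \(\beta \) stands for the reveal probability chosen in the full proof.

```python
# Jointly sample (z, w): each coordinate of w is revealed in z with probability
# beta (as a uniform bit); the remaining coordinates of w are uniform.
import random

def sample_z(n, beta):
    return [random.choice((0, 1)) if random.random() < beta else None for _ in range(n)]

def sample_w_given_z(z):
    # W_z: uniform over strings agreeing with z wherever z is not "bottom" (None here)
    return [zi if zi is not None else random.choice((0, 1)) for zi in z]
```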

We will use \(\mathcal {W}\) to prove the theorem statement. First, we show that every distribution \(W_z\in \mathcal {W}\) has sufficient \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\). Indeed, z constrains \( info (z)\) coordinates out of n and leaves the rest uniform. Thus, \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W_z)\) is the same as \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }\) of the uniform distribution on the space \(\{0,1\}^{n- info (z)}\). Second, we now want to show that \(\mathbf {SD}(({\mathsf {Key}},P, Z), (U_\kappa , P, Z))>\frac{1}{25}\). To show this, we use a result of Holenstein and Renner [25, Theorem 4]. Their result shows impossibility of interactive key agreement for a noisy channel where the adversary observes each bit with some probability. Several technical results are necessary to apply the result in our setting (presented in Appendix C).

7 Stronger Impossibility Result for Secure Sketches

Most fuzzy extractor constructions share the following feature with our construction in Sect. 4: p includes information that is needed to recover w from \(w'\); both \({\mathsf {Gen}} \) and \({\mathsf {Rep}} \) then simply apply an extractor to w. The recovery of w from \(w'\), known as information reconciliation, forms the core of many fuzzy extractor constructions. The primitive that performs this information reconciliation is called a secure sketch. In this section we show stronger impossibility results for secure sketches. First, we recall their definition from [13, Sect. 3.1] (modified slightly, in the same way as Definition 2).

Definition 5

An \((\mathcal {M},\mathcal {W}, \tilde{m}, t)\)-secure sketch with error \(\delta \) is a pair of randomized procedures, “sketch” \(({\mathsf {SS}})\) and “recover” \(({\mathsf {Rec}})\). \({\mathsf {SS}}\) on input \(w\in \mathcal {M}\) returns a bit string \(ss\in \{0,1\}^*\). \({\mathsf {Rec}}\) takes an element \(w'\in \mathcal {M}\) and \(ss\in \{0,1\}^*\). \(({\mathsf {SS}}, {\mathsf {Rec}})\) have the following properties:

  1. Correctness: \( \forall w, w'\in \mathcal {M}\) if \(\mathsf {dis}(w,w')\le t\) then \(\Pr [{\mathsf {Rec}} (w',{\mathsf {SS}} (w))=w]\ge 1-\delta .\)

  2. Security: for any distribution \(W\in \mathcal {W}\), \(\tilde{\mathrm {H}}_\infty (W|{\mathsf {SS}} (W))\ge \tilde{m}\).

Secure sketches are more demanding than fuzzy extractors: a secure sketch can be converted to a fuzzy extractor by applying a randomness extractor, as in our Construction 1 [13, Lemma 4.1]. We prove a stronger impossibility result for them. Specifically, in the case of secure sketches, we can extend the results of Theorems 2 and 3 to cover imperfect correctness (that is, \(\delta >0\)) and entropy rate \(\mu \) up to \(\frac{1}{2}\). Since most fuzzy extractor constructions rely on secure sketches, this result gives evidence that fuzzy extractors, even with imperfect correctness and for high entropy rates, are difficult to construct in the case of distributional uncertainty.

Fig. 3. The region of \(\tau \) (x-axis) and \(\mu \) (y-axis) pairs for which Theorem 4 applies is the region below both curves.

Theorem 4

Let \(\mathcal {M}\) denote the Hamming space \(\{0, 1\}^n\). There exists a family of distributions \(\mathcal {W}\) over \(\mathcal {M}\) such that for each element \(W \in \mathcal {W}\), \(\mathrm {H}^{\mathtt {fuzz}}_{t,\infty }(W)=\mathrm {H}_\infty (W) \ge m\), and yet any \((\mathcal {M}, \mathcal {W}, \tilde{m}, t)\)-secure sketch with error \(\delta \) has \(\tilde{m} \le 2\).

This holds under the following conditions on \(\delta \), the entropy rate \(\mu =m/n\), noise rate \(\tau =t/n\), and n:

  • any \(0\le \tau < \frac{1}{2}\) and \(\mu >0\) such that \(\mu< h_2(\tau ) \text{ and } \mu < 1-h_2(\tau )\)

  • any \(n\ge \max \left( \frac{.5\log n +4\delta n + 4}{h_2(\tau )-\mu }, \frac{2}{1-h_2(\tau )-\mu }\right) \)

Note that the result holds for any \(\mu <0.5\) as long as \(\delta < (h_2(\tau )-\mu )/4\) and n is sufficiently large. The \(\tau \) vs. \(\mu \) tradeoff is depicted in Fig. 3.

We provide the proof, which uses similar ideas to the proof of Theorem 2, in Appendix D.