1 Introduction

We start by addressing the two main themes of this work—non-uniformity and random oracles—in isolation, before connecting them to explain the main motivation for this work.

Non-uniformity. Modern cryptography (in the “standard model”) usually models the attacker \(\mathcal A\) as non-uniform, meaning that it is allowed to obtain some arbitrary (but bounded) “advice” before attacking the system. The main rationale for this modeling comes from the realization that a determined attacker will know the security parameter n of the system in advance and might be able to invest a significant amount of preprocessing to do something “special” for this fixed value of n, especially if n is not too large (for reasons of efficiency), or if the attacker needs to break many instances online (thereby amortizing the one-time offline cost). Perhaps the best known example of such attacks comes from rainbow tables ([31, 46]; see also [38, Sect. 5.4.3]) for inverting arbitrary functions; the idea is to use one-time preprocessing to initialize a clever data structure in order to dramatically speed up brute-force inversion attacks. Thus, restricting to uniform attackers might not accurately model realistic preprocessing attacks one would like to protect against. However, there are other, more technical, reasons why this choice is convenient:

  • Adleman [2] showed that non-uniform polynomial-time attackers can be assumed to be deterministic (formally, \(BPP/poly = P/poly\)), which is handy for some proofs.

  • While many natural reductions in cryptography are uniform, there are several important cases where the only known (or even possible!) reduction is non-uniform. Perhaps the best known examples are zero-knowledge proofs [27, 28], which are not closed under sequential composition unless one allows non-uniform attackers (and simulators; intuitively, in order to use the simulator for the second zero-knowledge proof, one must use the output of the first proof’s simulator as an auxiliary input to the verifier). [Footnote 1] Of course, being a special case of general protocol composition, this means that any work—either using zero-knowledge as a subroutine or generally dealing with protocol composition—must use security against non-uniform attackers in order for the composition to work.

  • The non-uniform model of computation has many applications in complexity theory, such as the famous “hardness-vs-randomness” connection (see [33,34,35,36, 45]), which roughly states that non-uniform hardness implies non-trivial de-randomization. Thus, by defining cryptographic attackers as non-uniform machines, any lower bounds for such cryptographic applications might yield exciting de-randomization results.

Of course, despite the pragmatic, definitional, and conceptual advantages of non-uniformity, one must ensure that one does not make the attacker “too powerful,” so that it can (unrealistically) solve problems which one might use in cryptographic applications. Fortunately, although non-uniform attackers can solve undecidable problems (by encoding the input in unary and outputting solutions in the non-uniform advice), the common belief is that non-uniformity cannot solve interesting “hard problems” in polynomial time. As one indirect piece of evidence, the Karp-Lipton theorem [37] shows that if NP has polynomial-size circuits, then the polynomial hierarchy collapses. And, of course, the entire field of cryptography is successfully based on the assumption that many hard problems cannot be solved even on average by polynomially sized circuits, and this belief has not been seriously challenged so far.

Hence, by and large it is believed by the theoretical community that non-uniformity is the right cryptographic modeling of attackers, despite being overly conservative and including potentially unrealistic attackers.

The Random-Oracle Model. Hash functions are ubiquitous in cryptography. They are widely used to build one-way functions (OWFs), collision-resistant hash functions (CRHFs), pseudorandom functions/generators (PRFs/PRGs), message authentication codes (MACs), etc. Moreover, they are often used together with other computational assumptions to show security of higher-level applications. Popular examples include Fiat-Shamir heuristics [1, 23] for signature schemes (e.g., Schnorr signatures [49]), full-domain-hash signatures [8], or trapdoor functions (TDFs) [8] and OAEP [9] encryption, among many others.

For each such application Q, one can wonder how to assess its security \(\varepsilon \) when instantiated with a concrete hash function H, such as SHA-3. Given our inability to prove unconditional lower bounds, the traditional approach is the following: instead of proving an upper bound on \(\varepsilon \) for some specific H, one analyzes the security of Q assuming H is a truly random (aka “ideal”) function \(\mathcal O\). Since most Q are only secure against computationally bounded attackers, one gives the attacker \(\mathcal A\) oracle access to \(\mathcal O\) and limits the number of oracle queries that \(\mathcal A\) can make by some parameter \(T\). The result is the traditional random-oracle model (ROM), popularized by the seminal paper of Bellare and Rogaway [8].

The appeal of the ROM stems from two aspects. First, it leads to very clean and intuitive security proofs for many primitives that resisted standard-model analysis under natural security assumptions (see some concrete examples below). Second, the resulting ROM analysis is independent of the tedious specifics of H, is done only once for a given hash-based application, and also provides (for non-pathological Q’s) the best possible security one might hope to achieve with any concrete function H. In particular, we hope that a specific hash function H we use is sufficiently “well-designed” that it (essentially) matches this idealized bound. If it does, then our bound on \(\varepsilon \) was accurate anyway; and, if it does not, this usually serves as strong evidence that we should not use this particular H, rather than an indication that the idealized analysis was the wrong way to estimate the exact security of Q. Ironically, in theory we know that the optimistic methodology above is false [5, 11, 12, 29, 44], and some applications secure in the ROM will be insecure for any instantiation of H, let alone maintain the idealized bound on \(\varepsilon \). Fortunately, all counterexamples of this kind are rather artificial, and do not shed much light on the security of concrete schemes used in practice, such as the use of hash functions as OWFs, CRHFs, PRFs, PRGs, MACs, and also as parts of natural signature and encryption schemes used in practice [8, 9, 23, 49]. In other words, despite purely theoretical concerns, the following random-oracle methodology appears to be a good way for practitioners to assess the best possible security level of a given (natural) application Q.

Random-oracle methodology. For “natural” applications of hash functions, the concrete security proven in the random-oracle model is the right bound even in the standard model, assuming the “best possible” concrete hash function H is chosen.

Random Oracles and Non-uniformity. The main motivation for this work is to examine the soundness of the above methodology, while also being consistent with the fact that attackers should be modeled as non-uniform. We stress that we are not addressing the conceptual question of whether non-uniform security is the “right” way to model attackers in cryptography, as this is the subject of a rather heated ongoing debate between theoreticians and practitioners; see [10, 48] for some discussion on the subject. Instead, assuming we want to model attackers as non-uniform (for the reasons stated above and to be consistent with the theoretical literature), and assuming we want to have a way of correctly assessing the concrete, non-asymptotic security for important uses of hash functions in applications, we ask: is the random-oracle methodology a sound way to achieve this goal? Unfortunately, with the traditional modeling of the random oracle, the answer is a resounding “NO,” even for the most basic usages of hash functions, as can be seen from the following examples.

  (i) In the standard model, no single function H can be collision-resistant, as a non-uniform attacker can trivially hardwire a collision. In contrast, a single (non-salted) random oracle \(\mathcal O\) is trivially collision-resistant in the ROM, with excellent exact security \(O(T^2/M)\), where M is the size of the range of \(\mathcal O\). This is why in the standard model one considers a family of collision-resistant hash functions whose public key, which we call salt, is chosen after \(\mathcal A\) gets its non-uniform advice. Interestingly, one of the results in this paper will show that the large gap (finding collisions in time \(M^{1/2}\) vs. \(M^{1/3}\)) between uniform and non-uniform security exists for the popular Merkle-Damgård construction even if salting is allowed.

  (ii) In the standard model, no PRG candidate H(x) can have security better than \(2^{-n/2}\) even against linear-time (in n) attackers [3, 10, 20], where n is the seed-length of x. In contrast, an expanding random oracle \(\mathcal O(x)\) can trivially be shown to be a \((T/2^n)\)-secure PRG in the traditional ROM, easily surpassing the \(2^{-n/2}\) barrier in the standard model (even for huge T up to \(2^{n/2}\), let alone polynomial T).

  (iii) The seminal paper of Hellman [31], translated to the language of non-uniform attackers, shows that a random function \(H:[N]\rightarrow [N]\) can be inverted with constant probability by a non-uniform attacker of size \(O\left( N^{2/3}\right) \), while Fiat and Naor [22] extended this attack to show that every (even non-random) function H can be inverted with constant probability by circuits of size at most \(N^{3/4}\). In contrast, if one models H as a random oracle \(\mathcal O\), one can trivially show that \(\mathcal O\) is a OWF with security \(O\left( T/N\right) \) in the traditional ROM. For example, setting \(T=N^{2/3}\) (or even \(T=N^{3/4}\)), one would still get negligible security \(N^{-1/3}\) (or \(N^{-1/4}\)), contradicting the concrete non-uniform attacks mentioned above.
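Hellman's trade-off can be illustrated with a toy hash-chain sketch. The parameters below are illustrative, the function `f` is an arbitrary random table, and the coverage optimizations of [31] and of rainbow tables [46] (many tables, reduction functions) are omitted:

```python
import random

random.seed(1)
N = 2 ** 16                       # toy domain size; a real attacker fixes n in advance
t, m = 256, 256                   # chain length and number of chains

# A fixed function f: [N] -> [N], standing in for the hash function H.
table = [random.randrange(N) for _ in range(N)]

def f(x):
    return table[x]

# Offline phase (one-time preprocessing): store only chain endpoints.
advice = {}
for _ in range(m):
    start = x = random.randrange(N)
    for _ in range(t):
        x = f(x)
    advice[x] = start             # endpoint -> start; ~m words of "advice"

# Online phase: invert y with ~t^2 evaluations of f instead of ~N.
def invert(y):
    z = y
    for _ in range(t):            # walk forward until a stored endpoint appears
        if z in advice:
            x = advice[z]         # rewalk that chain, looking for a preimage of y
            for _ in range(t):
                if f(x) == y:
                    return x
                x = f(x)
        z = f(z)
    return None                   # y is not covered by this table
```

Points covered by a stored chain are inverted quickly from the small advice; covering most of [N] requires many such tables, which is where the \(N^{2/3}\) trade-off arises.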

To put it differently, once non-uniformity is allowed in the standard model, the separations between the random-oracle model and the standard model are no longer contrived and artificial but rather lead to impossibly good exact security of widely deployed applications.

Auxiliary-Input ROM. The above concern regarding the random-oracle methodology is not new and was extensively studied by Unruh [51] and Dodis et al. [18]. Fortunately, these works offered a simple solution, by extending the traditional ROM to also allow for oracle-dependent auxiliary input. The resulting model, called the auxiliary-input random-oracle model (AI-ROM), is parameterized by two parameters S (“space”) and T (“time”) and works as follows: First, as in the traditional random-oracle model, a function \(\mathcal O\) is chosen uniformly from the space of functions with some domain and range. Second, the attacker \(\mathcal A\) in the AI-ROM consists of two entities \(\mathcal A_1\) and \(\mathcal A_2\). The first-stage attacker \(\mathcal A_1\) is computationally unbounded, gets full access to the random oracle \(\mathcal O\), and computes some “non-uniform” advice z of size S. This advice is then passed to the second-stage attacker \(\mathcal A_2\), who may make up to T queries to oracle \(\mathcal O\) (and, unlike \(\mathcal A_1\), might have additional application-specific restrictions, like bounded running time, etc.). This naturally maps to the preprocessing model discussed earlier and can also be used to analyze security against non-uniform circuits of size C by setting \(S=T=C\). [Footnote 2] Indeed, none of the concerns expressed in examples (i)–(iii) remain valid in the AI-ROM: (i) \(\mathcal O\) itself is no longer collision-resistant since \(\mathcal A_1\) can precompute a collision; (ii)–(iii) the generic non-uniform PRG or OWF attacks mentioned earlier can also be performed on \(\mathcal O\) itself (by letting \(\mathcal A_1\) treat \(\mathcal O\) as any other function H and computing the corresponding advice for \(\mathcal A_2\)). In sum, the AI-ROM allows us to restate a modified variant of the random-oracle methodology as follows:

AI-Random-Oracle Methodology. For “natural” applications of hash functions, the concrete security proven in the AI-ROM is the right bound even in the standard model against non-uniform attackers, assuming the “best possible” concrete hash function H is chosen.
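The two-stage attacker of the AI-ROM can be sketched concretely. In this toy instance (all sizes are illustrative, and the advice is a hardwired collision as in example (i)), the domain exceeds the range, so the unbounded first stage always finds a collision that the bounded second stage outputs with zero queries:

```python
import random

random.seed(7)
N, M = 2 ** 12, 2 ** 8            # toy domain and range sizes (N > M)
oracle = [random.randrange(M) for _ in range(N)]   # the random oracle O

def attacker_stage1(full_table):
    """A_1: computationally unbounded, reads all of O, outputs short advice z."""
    seen = {}
    for x, y in enumerate(full_table):
        if y in seen:
            return (seen[y], x)   # z = one hardwired collision (S is tiny)
        seen[y] = x

def attacker_stage2(z, query_oracle):
    """A_2: receives z and oracle access; here it needs T = 0 queries."""
    return z                      # outputs the precomputed collision

z = attacker_stage1(oracle)
x1, x2 = attacker_stage2(z, lambda x: oracle[x])
```

Since N > M, the pigeonhole principle guarantees stage one succeeds, illustrating why a single unsalted oracle cannot be collision-resistant against auxiliary input.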

Dealing with Auxiliary Information. The AI-ROM yields a clean and elegant way towards obtaining meaningful non-uniform bounds for natural applications. Unfortunately, obtaining such bounds is considerably more difficult than in the traditional ROM. In retrospect, such difficulties are expected, since we already saw several examples showing that non-uniform attackers are very powerful when exact security matters, which means that the security bounds obtained in the AI-ROM might often be noticeably weaker than in the traditional ROM. From a technical point of view, the key difficulty is this: conditioned on the leaked value z, which can depend on the entire function table of \(\mathcal O\) in some non-trivial manner, many of the individual values \(\mathcal O(x)\) are no longer random to the attacker. This ruins many of the key techniques utilized in the traditional ROM, such as: (1) lazy sampling, which allows the reduction to sample the not-yet-queried values of \(\mathcal O\) at random, as needed, without worrying that such lazy sampling will be inconsistent with the past; (2) programmability, which allows the reduction to dynamically define some value of \(\mathcal O\) in a special (still random) way, as this might be inconsistent with the leakage value z it has to produce before knowing how and where to program \(\mathcal O\); (3) the distinguishing-to-extraction argument, which states that the attacker cannot distinguish the value of \(\mathcal O\) from random without explicitly querying it (which again is false given auxiliary input). For these reasons, new techniques are required for dealing with the AI-ROM. Fortunately, two such techniques are known:

  • Pre-sampling technique. This beautiful technique was introduced in the original, pioneering work of Unruh [51]. From our perspective, we will present Unruh’s pre-sampling technique in a syntactically different (but technically equivalent) way which will be more convenient for our presentation. Specifically, Unruh implicitly introduced an intermediate oracle model, which we term the bit-fixing random-oracle model (BF-ROM), [Footnote 3] in which the oracle can be arbitrarily fixed on some P coordinates, while the remaining coordinates are chosen at random and independently of the fixed coordinates. Moreover, the non-uniform S-bit advice of the attacker can only depend on the P fixed points, but not on the remaining truly random points. Intuitively, dealing with the BF-ROM—at least when P is small—appears to be much easier than with the AI-ROM, as many of the traditional ROM proof techniques can be adapted provided that one avoids the “pre-sampled” set. Quite remarkably, for any value of P, Unruh showed that any (S, T)-attack in the AI-ROM will have similar advantage in the (appropriately chosen) P-BF-ROM, up to an additive loss of \(\delta (S,T,P)\), which Unruh upper bounded by \(\sqrt{ST/P}\). This yields a general recipe for dealing with the AI-ROM: (a) prove security \(\varepsilon (S,T,P)\) of the given application in the P-BF-ROM; [Footnote 4] (b) optimize for the right value of P by balancing \(\varepsilon (S,T,P)\) and \(\delta (S,T,P)\) (while also respecting the time and other constraints of the attacker).

  • Compression technique. Unfortunately, Dodis et al. [18] showed that the concrete security loss \(\delta (S,T,P)= \sqrt{ST/P}\) proven by Unruh is not strong enough to get tight bounds for any of the basic applications of hash functions, such as building OWFs, PRGs, PRFs, (salted) CRHFs, and MACs. To remedy the situation, Dodis et al. [18] showed a different, less general technique for dealing with the AI-ROM, by adapting the compression paradigm, introduced by Gennaro and Trevisan [24, 25] in the context of black-box separations, to the AI-ROM. The main idea is to argue that if some AI-ROM attacker succeeds with high probability in breaking a given scheme, then that attacker can be used to reversibly encode (i.e., compress) a random oracle beyond what is possible from an information-theoretic point of view. Since we are considering attackers who perform preprocessing, our encoding must include the S-bit auxiliary information produced by the attacker. Thus, the main technical challenge in applying this technique is to ensure that the constructed encoding compresses by (significantly) more than S bits. Dodis et al. [18] proceeded by successfully applying this idea to show nearly tight (and always better than what was possible by pre-sampling) bounds for a variety of natural applications, including OWFs, PRGs, PRFs, (salted) CRHFs, and MACs.

Pre-sampling or Compression? The pre-sampling and compression techniques each have their pros and cons, as discussed below.

On the positive side, pre-sampling is very general and seems to apply to most applications, as analyzing the security of schemes in the BF-ROM is not much harder than in the traditional ROM. Moreover, as shown by Unruh, the pre-sampling technique appears at least “partially friendly” to computational applications of random oracles (specifically, Unruh applied it to OAEP encryption [9]). Indeed, if the size \(P\) of the pre-sampled set is not too large, then it can be hardwired as part of the non-uniform advice to the (efficient) reduction to the computational assumption. In fact, in the asymptotic domain Unruh even showed that the resulting security remains “negligible in the security parameter \(\lambda \),” despite not being smaller than any concrete negligible function (like the inverse Ackermann function). [Footnote 5]

On the negative side, the concrete security bounds which are currently obtainable using this technique are vastly suboptimal, largely due to the big security loss \(\sqrt{ST/P}\) incurred by using Unruh’s bound [51]. Moreover, for computational applications, the value of P cannot be made larger than the size of the attacker for the corresponding computational assumption. Hence, for fixed (“non-asymptotic”; see Footnote 5) polynomial-size attackers, the loss \(\sqrt{ST/P}\) cannot be made negligible. Motivated by this, Unruh conjectured that the security loss of pre-sampling can be improved by a tighter proof. Dodis et al. [18] showed that the security loss is at least of order ST / P, i.e., \(\delta (S,T,P)=\varOmega (ST/P)\) in general. For computational applications, this asymptotically disproves Unruh’s conjecture, as ST / P is still non-negligible for polynomial values of P (although we will explain shortly that the situation is actually more nuanced).

Moving to the compression technique, we already mentioned that it allowed Dodis et al. [18] to establish nearly tight AI-ROM bounds for several information-theoretic applications of random oracles. Unfortunately, each proof was noticeably more involved than the original ROM proof, or than the proof in the BF-ROM one would do if applying the more intuitive pre-sampling technique. Moreover, each primitive required a completely different set of algorithmic insights to get the required level of compression. And it is not entirely clear how far this can go. For example, we do not see any way to apply the compression paradigm to relatively basic applications of hash functions beyond using the hash function by itself as a given primitive; e.g., to show AI-ROM security of the classical Merkle-Damgård paradigm [16, 42] (whose tight AI-ROM security we will later establish in this work). Moreover, unlike pre-sampling, the compression paradigm cannot be applied at all to computational applications, as the compressor and the decompressor are computationally unbounded.

1.1 Our Results

We obtain a number of results about dealing with the AI-ROM, which, at a high-level, take the best features from pre-sampling (simplicity, generality) and compression (tightness).

Improving Unruh. Recall that Unruh [51] showed that one can move from the AI-ROM to the P-BF-ROM at the additive cost \(\delta (S,T,P)\le \sqrt{ST/P}\), and Dodis et al. [18] showed that \(\delta (S,T,P) = \varOmega \left( ST/P\right) \) in general. We show that the true additive error bound is indeed \(\delta (S,T,P) = \varTheta (ST/P)\), thereby improving Unruh’s bound by a quadratic factor; see Theorem 1. Namely, the effect of S bits of auxiliary information \(z = z(\mathcal O)\) against an attacker making T adaptive random-oracle queries can be simulated to within an additive error O(ST / P) by fixing the value of the random oracle on P points (which depend on the function z), and picking the other points at random and independently of the auxiliary information.

While the quadratic improvement might appear “asymptotically small,” we show that it already matches the near-tight bound for all indistinguishability applications (specifically, PRGs and PRFs) proved by [18] using much more laborious compression arguments. For example, to match the \(\varepsilon = O(\sqrt{ST/N} + T/N)\) bound for PRGs with seed domain of size N, we show using a simple argument that the random oracle is \(\varepsilon ' = O(P/N + T/N)\)-secure in the P-BF-ROM, where the first term corresponds to the seed being chosen from the pre-sampled set, and the second term corresponds to the probability of querying the oracle on the seed in the attack stage. Setting \(P= O(\sqrt{STN})\) to balance the P / N and ST / P terms, we immediately get our final bound, which matches that of [18]. For illustrative purposes, we also apply our improved bound to argue the AI-ROM security of a couple of indistinguishability applications not considered by [18]. First, we show an improved—compared to its use as a (standard) PRF—bound for the random oracle as a weak PRF, which is enough for chosen-plaintext secure symmetric-key encryption. Our proof is a very simple adaptation of the PRF proof in the BF-ROM, while we believe the corresponding compression proof, if possible at all, would involve noticeable changes to the PRF proof of [18] (due to the need for better compression to get the improved bound). Second, we also apply it to a typical example of a computational application, namely, the (KEM-variant of the) TDF-based public-key encryption scheme \(\mathsf {Enc}_{f}(m;x) = (f(x), \mathcal O(x)\oplus m)\) from the original Bellare-Rogaway paper [8], where f is a trapdoor permutation (part of the public key, while the inverse is the secret key) and x is the randomness used for encryption. Recall that the compression technique cannot be applied to such applications.
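The balancing step in the PRG argument can be mimicked numerically. The values of S, T, and N below are illustrative assumptions, and all hidden constants are dropped:

```python
import math

S, T, N = 2 ** 20, 2 ** 20, 2 ** 80    # illustrative (S, T) and seed-domain size

def prg_ai_bound(P):
    # security in the P-BF-ROM plus the additive O(ST/P) transfer cost
    return (P / N + T / N) + S * T / P

P_star = math.sqrt(S * T * N)           # equates the P/N and ST/P terms
best = prg_ai_bound(P_star)
compression_bound = math.sqrt(S * T / N) + T / N   # the bound of [18]
```

At the balance point the two balanced terms each equal \(\sqrt{ST/N}\), so `best` is within a small constant factor of `compression_bound`.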

To sum up, we conjecture that the improved security bound ST / P should be sufficient to get good bounds for most natural indistinguishability applications; these bounds are either tight, or at least they match those attainable via compression arguments (while being much simpler and more general).

Improved Pre-sampling for Unpredictability Applications. Even with our improved bound of ST / P for pre-sampling, we will not match the nearly tight compression bounds obtained by Dodis et al. [18] for OWFs and MACs. In particular, finding the optimal value of P will result in “square root terms” which are not matched by any existing attacks. As our key insight, we notice that this is not due to the limitations of pre-sampling (i.e., going through the BF-ROM), but rather to the fact that achieving an additive error is unnecessarily restrictive for unpredictability applications. Instead, we show that if one is happy with a multiplicative factor of 2 in the probability of breaking the system, then one can achieve this generically by setting the pre-sampling set size \(P\approx ST\); see Theorem 2.

This has a number of implications. First, with this multiplicative pre-sampling technique, we can easily match the compression bounds for the OWF and MAC unpredictability applications considered by Dodis et al. [18], but with much simpler proofs. Second, we also apply it to a natural information-theoretic application where we believe the compression technique will fail to get a good bound; namely, building a (salted) CHRF family via the Merkle-Damgård paradigm, where the salt is the initialization vector for the construction (see Theorem 3). The salient feature of this example is that the random oracle is applied in iteration, which poses little difficulties to adapting the standard-ROM proof to the BF-ROM, but seems to completely blow up the complexity of the compression arguments, as there are too many possibilities for the attacker to cause a collision for different salts when the number of blocks is greater than 1.Footnote 6 The resulting AI-ROM bound \(O(ST^2/M)\) becomes vacuous for circuits of size roughly \(M^{1/3}\), where M is the range of the compression function. This bound is well below the conjectured \(M^{1/2}\) birthday security of CRHFs based on Merkle-Damgård against uniform attackers. Quite unexpectedly, we show that \(M^{1/3}\) security we prove is tight: there exists a (non-uniform) collision-finding attack implementable by a circuit of size \(O\left( M^{1/3}\right) \) (see Theorem 4)! This example illustrates once again the surprising power of non-uniformity.
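For concreteness, here is a minimal sketch of the salted Merkle-Damgård construction. The compression function is built from SHA-256 purely for illustration (the paper models it as an ideal function), and length-strengthening padding is omitted in this toy:

```python
import hashlib

BLOCK = 16  # toy block and chaining-value length, in bytes

def compress(chaining, block):
    """Toy compression function h: 128-bit state x 128-bit block -> 128-bit state."""
    return hashlib.sha256(chaining + block).digest()[:BLOCK]

def merkle_damgard(salt, message):
    """Salted MD: the public salt serves as the initialization vector."""
    message += b"\x00" * (-len(message) % BLOCK)   # zero-pad (toy only)
    state = salt
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state
```

A collision for salt a is a pair of distinct messages with equal `merkle_damgard(a, .)` outputs; the point of salting is that a is drawn only after the attacker's advice is fixed.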

Implications to Computational Reductions. Recall that, unlike compression techniques, pre-sampling can be applied to computational reductions, by “hardwiring” the pre-sampling set of size P into the attacker breaking the computational assumption. However, this means that P cannot be made larger than the maximum allowed running time t of such an attacker. Since standard pre-sampling incurs additive cost \(\varOmega (ST/P)\), one cannot achieve final security better than ST / t, irrespective of the value of \(\varepsilon \) in the \((t,\varepsilon )\)-security of the corresponding computational assumption. For example, when t is polynomial (in the security parameter) and \(\varepsilon \ll 1/t\) is exponentially small, we only get inverse polynomial security (no better than ST / t) when applying standard pre-sampling. In contrast, the multiplicative variant of pre-sampling sets the list size to be roughly \(P\approx ST\), which is polynomial for polynomial S and T and can be made smaller than the complexity t of the standard-model attacker for the computational assumption we use. Thus, when t is polynomial and \(\varepsilon \) is exponentially small, we will get negligible security using multiplicative pre-sampling. For a concrete illustrative example, see the bound in Theorem 5 when we apply our improved pre-sampling to the natural computational unpredictability application of Schnorr signatures [49]. [Footnote 7] To put it differently, while the work of Dodis et al. [18] showed that Unruh’s “pre-sampling conjecture” is false in general—meaning that negligible security is not possible with a polynomial list size P—we show that it is qualitatively true for unpredictability applications, where the list size can be made polynomial (roughly ST).

Moreover, we show that in certain computational indistinguishability applications, we can still apply our improved pre-sampling technique inside the reduction, and get final security higher than the ST / t barrier mentioned above. We illustrate this phenomenon in our analysis of TDF encryption (cf. Theorem 6) by separating the probability of the attacker’s success into two disjoint events: (1) the attacker, given ciphertext f(x), managed to query the random oracle on the TDP preimage x; (2) the attacker succeeds in distinguishing the value \(\mathcal O(x)\) from random without querying \(\mathcal O(x)\). Now, for event (1), we can reduce to the TDP security with polynomial list size using our improved multiplicative pre-sampling (since it is an unpredictability event), while for event (2), we can prove information-theoretic security using standard additive pre-sampling, without the limitation of having to upper bound P by the running time of the TDP attacker. It is an interesting open question to classify precisely the type of indistinguishability applications where such a “hybrid” reduction technique can be applied.
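A toy instantiation of the TDF-based scheme \(\mathsf {Enc}_{f}(m;x) = (f(x), \mathcal O(x)\oplus m)\) makes the two events concrete. The parameters below are illustrative and wildly insecure: tiny RSA plays the trapdoor permutation f, and SHA-256 stands in for \(\mathcal O\):

```python
import hashlib
import random

random.seed(3)

# Toy trapdoor permutation f(x) = x^e mod n (parameters far too small to be secure).
p, q, e = 1009, 1013, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))    # the trapdoor (secret key)

def O(x):
    """Random oracle on [n], instantiated (for this toy only) via SHA-256."""
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:2], "big")

def encrypt(m):
    """Enc_f(m; x) = (f(x), O(x) xor m) for random encryption coins x."""
    x = random.randrange(1, n)
    return pow(x, e, n), O(x) ^ m

def decrypt(ctxt):
    fx, masked = ctxt
    x = pow(fx, d, n)                # invert f using the trapdoor
    return O(x) ^ masked
```

Event (1) corresponds to the adversary querying `O` on the coins x hidden inside `pow(x, e, n)` (an unpredictability event, reduced to inverting f), while event (2) corresponds to distinguishing `O(x)` from random without that query.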

Going to the Traditional ROM. So far, the general paradigm we used is to reduce the hard-to-analyze security of any scheme in the AI-ROM to the much simpler and proof-friendly security of the same scheme in the BF-ROM. However, an even simpler approach, if possible, would be to reduce the security in the AI-ROM all the way to the traditional ROM. Of course, we know that this is impossible without any modifications to the scheme, as we have plenty of examples where the AI-ROM security of the scheme is much weaker than its ROM security (or even disappears completely). Still, when a simple modification is possible without much inconvenience to the users, reducing to the ROM has a number of obvious advantages over the BF-ROM:

  • While much simpler than in the AI-ROM, one must still prove a security bound in the BF-ROM. It would be much easier if one could simply take an already-proven result in the ROM and seamlessly “move it” to the AI-ROM at a small cost.

  • Some natural schemes secure in the traditional ROM are insecure in the BF-ROM (and also in the AI-ROM) without any modifications. Simple examples include the general Fiat-Shamir heuristic [1, 23] and the FDH signature scheme [8] (see the full version of this paper [15]). Thus, to extend such schemes to the AI-ROM, we must modify them anyway, so we might as well try to generically ensure that ROM security is already enough.

As our next set of results, we show two simple compilers which build a hash function \(\mathcal O'\) to be used in an AI-ROM application out of a hash function \(\mathcal O\) used in the traditional-ROM application. Both results are in the common-random-string model. This means that they utilize a public random string (which we call salt and denote \(a\)) chosen after the auxiliary information about \(\mathcal O\) is computed by the attacker. The honest parties are then assumed to have reliable access to this \(a\) value. We note that in basic applications, such as encryption and authentication, the salt can simply be chosen at key generation and be made part of the public key/parameters, so this comes at a small price indeed.

The first transformation analyzed in Sect. 6.1 is simply salting; namely \(\mathcal O_a'(x) = \mathcal O(a,x)\), where \(a\) is a public random string chosen from the domain of size K. This technique is widely used in practice (going back to password hashing [43]), and was analyzed by Dodis et al. [18] in the context of the AI-ROM, by applying the compression argument to show that salting provably defeats preprocessing for the few natural applications they consider (OWFs, PRGs, PRFs, and MACs). What our work shows is that salting provably defeats preprocessing generically, as opposed to the few concrete applications analyzed by [18]. [Footnote 8] Namely, by making the salt domain K large enough, one gets almost the same security in the AI-ROM as in the traditional ROM. To put it differently, when salting is possible, one gets the best of both worlds: security against non-uniform attacks, but with exact security matching that in the traditional ROM.
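A toy version of the salting compiler makes this concrete. All sizes below are illustrative; the offline attacker precomputes a collision for a single guessed salt, and the advice becomes useless once the real salt is drawn afterwards:

```python
import random

random.seed(5)
K, D, M = 2 ** 8, 2 ** 9, 2 ** 8   # salt space, per-salt domain, range
# Random oracle O on (salt, input) pairs; the compiler sets O'_a(x) = O(a, x).
oracle = {(a, x): random.randrange(M) for a in range(K) for x in range(D)}

def salted_oracle(a):
    """The compiled hash O'_a: each salt selects a fresh-looking oracle."""
    return lambda x: oracle[(a, x)]

# Offline phase: precompute a collision for one guessed salt (the advice).
guess, seen, collision = 0, {}, None
for x in range(D):
    y = oracle[(guess, x)]
    if y in seen:
        collision = (seen[y], x)
        break
    seen[y] = x

# Online phase: the real salt is drawn only AFTER the advice is fixed, so the
# precomputed collision helps only when a == guess, i.e., with probability 1/K.
a = random.randrange(K)
```

Since D > M, the offline collision always exists, yet it collides under `salted_oracle(a)` only with probability about 1/K over the choice of a.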

The basic salting technique sacrifices a relatively large factor of K from the domain of the random oracle \(\mathcal O\) in order to build \(\mathcal O'\) (for K large enough to bring the “salting error” down). When the domain of \(\mathcal O\) is an expensive resource, in Sect. 6.2 we also design a more domain-efficient compiler, which sacrifices only a small factor \(k\ge 2\) in the domain of \(\mathcal O\), at the cost that each evaluation of \(\mathcal O'\) takes \(k\ge 2\) evaluations of \(\mathcal O\) (and the “salting error” decays exponentially in k). This transformation is based on an adaptation of the technique of Maurer [41], originally used in the context of key agreement with randomizers. While the basic transformation needs \(O(k\log N)\) bits of public salt, we also show that one can reduce the number of random bits to \(O(k+\log N)\). And since we do not envision k being larger than \(O(\log N)\) for any practical need, the total length of the salt is always \(O(\log N)\).

Our Main Lemma. The key technical contribution of our work is Lemma 1, proved in Sect. 2.1, which roughly shows that a random oracle with auxiliary input is “close” to the convex combination of “P-bit-fixing sources” (see Definition 1). Moreover, we give both additive and multiplicative versions of this “closeness,” so that we can later use different parameters to derive our Theorem 1 (for indistinguishability applications in the AI-ROM) and Theorem 2 (for unpredictability applications in the AI-ROM) in Sect. 2.2.

1.2 Other Related Work

Most of the related work was already mentioned earlier. The realization that multiplicative error is enough for unpredictability applications, and this can lead to non-trivial savings, is related to the work of Dodis et al. [19] in the context of improved entropy loss of key derivation schemes. Tessaro [50] generalized Unruh’s presampling techniques to the random-permutation model, albeit without improving the tightness of the bound.

De et al. [17] study the effect of salting for inverting a permutation \(\mathcal O\) as well as for a specific pseudorandom generator based on one-way permutations. Chung et al. [14] study the effects of salting in the design of collision-resistant hash functions, and used Unruh’s pre-sampling technique to argue that salting defeats preprocessing in this important case. Using salting to obtain non-uniform security was also advocated by Mahmoody and Mohammed [40], who used this technique for obtaining non-uniform black-box separation results.

Finally, the extensive body of work on the bounded storage model [4, 21, 41, 52] is related to the special case of AI-ROM, where all T queries in the second stage are done by the challenger to derive the key (so that one tries to minimize T to ensure local computability), but the actual attacker is not allowed any such queries after S-bit preprocessing.

2 Dealing with Auxiliary Information

Since an attacker with oracle-dependent auxiliary input may obtain the output of arbitrary functions evaluated on a random oracle’s function table, it is not obvious how the security of schemes in the auxiliary-input random-oracle model (AI-ROM) can be analyzed. To remedy this situation, Unruh [51] introduced the bit-fixing random-oracle model (BF-ROM), in which the oracle is fixed on a subset of the coordinates and uniformly random and independent on the remaining ones, and showed that such an oracle is indistinguishable from an AI-RO.

In Sect. 2.1, we improve the security bounds proved by Unruh [51] in the following two ways: First, we show that a BF-RO is indistinguishable from an AI-RO up to an additive term of roughly \(ST/P\), where \(P\) is the size of the fixed portion of the BF-RO; this improves greatly over Unruh’s bound, which was in the order of \(\sqrt{ST/P}\). Second, we prove that the probability that any distinguisher outputs 1 in the AI-ROM is at most twice the probability that said distinguisher outputs 1 in the BF-ROM—already when P is roughly equal to ST.

Section 2.2 contains the formalizations of the AI and BF-ROMs, attackers with oracle-dependent advice, and the notion of application. As a consequence of the connections between the two models, the security of any application in the BF-ROM translates to the AI-ROM at the cost of the \(ST/P\) term, and, additionally, the security of unpredictability applications translates at the mere cost of a multiplicative factor of 2 (as long as \(P\ge ST\)). The corresponding theorems and their proofs can also be found in Sect. 2.2.

2.1 Replacing Auxiliary Information by Bit-Fixing

In this section, we show that any random oracle about which an attacker may have a certain amount of auxiliary information can be replaced by a suitably chosen convex combination of bit-fixing sources. This substitution comes at the price of either an additive term to the distinguishing advantage or a multiplicative one to the probability that a distinguisher outputs 1. To that end, consider the following definition:

Definition 1

An \((N,M)\)-source is a random variable X with range \([M]^N\). A source is called

  • \((1-\delta )\)-dense if for every subset \(I \subseteq [N]\),

    $$\begin{aligned} H_\infty ( X_I ) \ \ge \ (1-\delta ) \cdot |I| \cdot \log M\ = \ (1-\delta ) \cdot \log M^{|I|}. \end{aligned}$$
  • \((P,1-\delta )\)-dense if it is fixed on at most P coordinates and is \((1-\delta )\)-dense on the rest,

  • \(P\)-bit-fixing if it is fixed on at most P coordinates and uniform on the rest.

That is, the min-entropy of every subset of the function table of a \((1-\delta )\)-dense source falls short by at most a fraction \(\delta \) of what it would be for a uniformly random one.
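As a quick illustration of the third item of Definition 1, the following Python sketch (function name and interface are ours, chosen purely for exposition) samples one function table from a \(P\)-bit-fixing \((N,M)\)-source: at most \(P\) coordinates are prescribed, and all others are uniform and independent.

```python
import random

def sample_bit_fixing_source(N: int, M: int, fixed: dict[int, int]) -> list[int]:
    """Sample one function table from a P-bit-fixing (N, M)-source.

    `fixed` maps at most P coordinates in [N] to prescribed values in [M];
    every remaining coordinate is filled uniformly and independently.
    """
    assert all(0 <= i < N and 0 <= v < M for i, v in fixed.items())
    return [fixed[i] if i in fixed else random.randrange(M) for i in range(N)]
```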

Lemma 1

Let X be distributed uniformly over \([M]^N\) and \(Z:= f(X)\), where \(f: [M]^N \rightarrow \{0,1\}^S\) is an arbitrary function. For any \(\gamma >0\) and \(P\in \mathbb N\), there exists a family \(\{Y_z\}_{z\in \{0,1\}^S}\) of convex combinations \(Y_z\) of \(P\)-bit-fixing \((N,M)\)-sources such that for any distinguisher D taking an \(S\)-bit input and querying at most \(T<P\) coordinates of its oracle,

$$\begin{aligned} \ \left| {\mathsf P\big [\mathcal D^X (f(X)) = 1\big ]} - {\mathsf P\big [\mathcal D^{Y_{f(X)}} (f(X)) = 1\big ]} \right| \ \le \ \frac{(S+\log {1/\gamma }) \cdot T}{P}+ \gamma \end{aligned}$$

and

$$\begin{aligned} {\mathsf P\big [\mathcal D^X (f(X)) = 1\big ]} \ \le \ 2^{(S+\log {1/\gamma })T/P} \cdot {\mathsf P\big [\mathcal D^{Y_{f(X)}} (f(X)) = 1\big ]} + \gamma . \end{aligned}$$

Lemma 1 is proved using a technique (cf. the first claim in the proof) put forth by Göös et al. [30] in the area of communication complexity. The technique was also adopted by Kothari et al. [39], who gave a simplified argument for decomposing high-entropy sources into bit-fixing sources with constant density (cf. Definition 1). To keep this paper self-contained, the full version [15] contains a proof of this decomposition technique. Furthermore, the proof uses the well-known H-coefficient technique by Patarin [47], following a recent re-formulation of it due to Hoang and Tessaro [32].

Proof

Fix an arbitrary \(z \in \{0,1\}^S\) and let \(X_z\) be the distribution of X conditioned on \(f(X)=z\). Let \(S_z= N \log M - H_\infty (X_z)\) be the min-entropy deficiency of \(X_z\). Let \(\gamma > 0\) be arbitrary.

Claim 1

For every \(\delta > 0\), \(X_z\) is \(\gamma \)-close to a convex combination of finitely many \((P',1-\delta )\)-dense sources for

$$\begin{aligned} P' \ = \ \frac{S_z+\log {1/\gamma }}{\delta \cdot \log M} \ . \end{aligned}$$

The proof of the above claim can be found in the full version of this paper [15].

Let \(X'_z\) be the convex combination of \((P',1-\delta )\)-dense sources that is \(\gamma \)-close to \(X_z\) for a \(\delta = \delta _z\) to be determined later. For every \((P',1-\delta )\)-dense source \(X'\) in said convex combination, let \(Y'\) be the corresponding \(P'\)-bit-fixing source, i.e., \(X'\) and \(Y'\) are fixed on the same coordinates to the same values. The following claim bounds the distinguishing advantage between \(X'\) and \(Y'\) for any \(T\)-query distinguisher.

Claim 2

For any \((P',1-\delta )\)-dense source \(X'\) and its corresponding \(P'\)-bit-fixing source \(Y'\), it holds that for any (adaptive) distinguisher \(\mathcal D\) that queries at most T coordinates of its oracle,

$$\begin{aligned} \ \left| {\mathsf P\big [\mathcal D^{X'} = 1\big ]} - {\mathsf P\big [\mathcal D^{Y'} = 1\big ]} \right| \ \le \ T \delta \cdot \log M, \end{aligned}$$

and

$$\begin{aligned} {\mathsf P\big [\mathcal D^{X'} = 1\big ]} \ \le \ M^{T\delta } \cdot {\mathsf P\big [\mathcal D^{Y'} = 1\big ]}. \end{aligned}$$

Proof

Assume without loss of generality that \(\mathcal D\) is deterministic and does not query any of the fixed positions. Let \(T_{X'}\) and \(T_{Y'}\) be the random variables corresponding to the transcripts containing the query/answer pairs resulting from \(\mathcal D\)’s interaction with \(X'\) and \(Y'\), respectively. For a fixed transcript \(\tau \), denote by \(\mathsf p_{X'}(\tau )\) and \(\mathsf p_{Y'}(\tau )\) the probabilities that \(X'\) and \(Y'\), respectively, produce the answers in \(\tau \) if the queries in \(\tau \) are asked. Observe that these probabilities depend only on \(X'\) resp. \(Y'\) and are independent of \(\mathcal D\).

Observe that for every transcript \(\tau \),

$$\begin{aligned} \mathsf p_{X'}(\tau )\ \le \ M^{-(1-\delta )T} \qquad \text {and} \qquad \mathsf p_{Y'}(\tau )\ = \ M^{-T} \end{aligned}$$
(1)

as \(X'\) is \((1-\delta )\)-dense and \(Y'\) is uniformly distributed.

Since \(\mathcal D\) is deterministic, \({\mathsf P[T_{X'} = \tau ]} \in {\{0,\mathsf p_{X'}(\tau )\}}\), and similarly, \({\mathsf P[T_{Y'} = \tau ]} \in {\{0,\mathsf p_{Y'}(\tau )\}}\). Denote by \(\mathcal T_{X}\) the set of all transcripts \(\tau \) for which \({\mathsf P[T_{X'} = \tau ]} > 0\). For such \(\tau \), \({\mathsf P[T_{X'} = \tau ]} = \mathsf p_{X'}(\tau )\) and also \({\mathsf P[T_{Y'} = \tau ]} = \mathsf p_{Y'}(\tau )\). Towards proving the first part of the lemma, observe that

$$\begin{aligned} \left| {\mathsf P\big [\mathcal D^{X'} = 1\big ]} - {\mathsf P\big [\mathcal D^{Y'} = 1\big ]} \right|&\le \ \mathsf {SD}(T_{X'},T_{Y'}) \\&= \ \sum _{\tau } \ \max {\{0, {\mathsf P[T_{X'} = \tau ]} - {\mathsf P[T_{Y'} = \tau ]}\}} \\&= \ \sum _{\tau \in \mathcal T_{X}} \max {\{0, \mathsf p_{X'}(\tau )- \mathsf p_{Y'}(\tau )\}} \\&= \ \sum _{\tau \in \mathcal T_{X}} \mathsf p_{X'}(\tau )\cdot \max {\left\{ 0, 1 - \frac{\mathsf p_{Y'}(\tau )}{\mathsf p_{X'}(\tau )}\right\} } \\&\le \ 1 - M^{-T\delta } \le \ T\delta \cdot \log M, \end{aligned}$$

where the first sum is over all possible transcripts and where the last inequality uses \(2^{-x}\ge 1-x\) for \(x \ge 0\).

As for the second part of the lemma, observe that due to (1) and the support of \(T_{X'}\) being a subset of that of \(T_{Y'}\),

$$\begin{aligned} {\mathsf P[T_{X'} = \tau ]} \ \le \ M^{T\delta } \cdot {\mathsf P[T_{Y'} = \tau ]} \end{aligned}$$

for any transcript \(\tau \). Let \(\mathcal T_{\mathcal D}\) be the set of transcripts where \(\mathcal D\) outputs 1. Then,

$$\begin{aligned} {\mathsf P[\mathcal D^{X'} = 1]} \ = \ \sum _{\tau \in \mathcal T_{\mathcal D}} {\mathsf P[T_{X'} = \tau ]} \ \le \ M^{T\delta } \cdot \sum _{\tau \in \mathcal T_{\mathcal D}} {\mathsf P[T_{Y'} = \tau ]} \ = \ M^{T\delta } \cdot {\mathsf P[\mathcal D^{Y'} = 1]}. \end{aligned}$$

   \(\square \)

Let \(Y_z'\) be obtained by replacing every \(X'\) by the corresponding \(Y'\) in \(X'_z\). Setting \(\delta _z= (S_z+ \log 1/\gamma )/(P\log M)\), Claims 1 and 2 imply

$$\begin{aligned} \left| {\mathsf P\big [\mathcal D^{X_z}(z) = 1\big ]} - {\mathsf P\big [\mathcal D^{Y'_z}(z) = 1\big ]} \right| \ \le \ \frac{(S_z+ \log {1/\gamma }) \cdot T}{P}+ \gamma , \end{aligned}$$
(2)

as well as

$$\begin{aligned} {\mathsf P\big [\mathcal D^{X_z}(z) = 1\big ]} \ \le \ 2^{(S_z+ \log {1/\gamma })T/P} \cdot {\mathsf P\big [\mathcal D^{Y_z'}(z) = 1\big ]} + \gamma \ . \end{aligned}$$
(3)

Moreover, note that for the above choice of \(\delta _z\), \(P' = P\), i.e., the sources \(Y'\) are fixed on at most \(P\) coordinates, as desired.

Claim 3

\(\mathbf {E}_{z}[S_z] \le S\) and \(\mathbf {E}_{z}[2^{S_zT/P}] \le 2^{ST/P}\).

The proof of the above claim can be found in the full version of this paper [15]. The lemma now follows (using \(Y_z:= Y'_z\)) by taking expectations over \(z\) of (2) and (3) and applying the above claim.    \(\square \)

2.2 From the BF-ROM to the AI-ROM

Capturing the Models. Before Lemma 1 from the preceding section can be used to show how security proofs in the BF-ROM can be transferred to the AI-ROM, it is necessary to formally define the two models as well as attackers with oracle-dependent advice and the notion of an application. The high-level idea is to consider two-stage attackers \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) and (single-stage) challengers \(\mathsf C\) with access to an oracle \(\mathcal O\). Oracles have two interfaces \(\mathsf {pre}\) and \(\mathsf {main}\), where \(\mathsf {pre}\) is accessible only to \(\mathcal A_1\), which may pass auxiliary information to \(\mathcal A_2\), and both \(\mathcal A_2\) and \(\mathsf C\) may access \(\mathsf {main}\).

Oracles. An oracle \(\mathcal O\) has two interfaces \(\mathcal O{.}\mathsf {pre}\) and \(\mathcal O{.}\mathsf {main}\), where \(\mathcal O{.}\mathsf {pre}\) is accessible only once before any calls to \(\mathcal O{.}\mathsf {main}\) are made. Oracles used in this work are:

  • Random oracle \(\mathsf {RO}(N,M)\): Samples a random function table \(F\leftarrow \mathcal F_{N,M}\), where \(\mathcal F_{N,M}\) is the set of all functions from \([N]\) to \([M]\); offers no functionality at \(\mathcal O{.}\mathsf {pre}\); answers queries \(x \in [N]\) at \(\mathcal O{.}\mathsf {main}\) by the corresponding value \(F[x] \in [M]\).

  • Auxiliary-input random oracle \(\mathsf {AI\text {-}RO}(N,M)\): Samples a random function table \(F\leftarrow \mathcal F_{N,M}\); outputs \(F\) at \(\mathcal O{.}\mathsf {pre}\); answers queries \(x \in [N]\) at \(\mathcal O{.}\mathsf {main}\) by the corresponding value \(F[x] \in [M]\).

  • Bit-Fixing random oracle \(\mathsf {BF\text {-}RO}(P,N,M)\): Samples a random function table \(F\leftarrow \mathcal F_{N,M}\); takes a list at \(\mathcal O{.}\mathsf {pre}\) of at most \(P\) query/answer pairs that override \(F\) in the corresponding positions; answers queries \(x \in [N]\) at \(\mathcal O{.}\mathsf {main}\) by the corresponding value \(F[x] \in [M]\).

  • Standard model: Neither interface offers any functionality.

The parameters \(N\), \(M\) are occasionally omitted in contexts where they are of no relevance. Similarly, whenever evident from the context, explicitly specifying which interface is queried is omitted.
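The two-interface structure above is easy to make concrete. The Python sketch below (class and method names are our own) implements \(\mathsf {BF\text {-}RO}(P,N,M)\): a uniformly random table that the preprocessing stage may override in at most \(P\) positions via \(\mathsf {pre}\), exactly once, before any \(\mathsf {main}\) queries.

```python
import random

class BFRO:
    """Bit-fixing random oracle BF-RO(P, N, M).

    A random function table F from [N] to [M]; the pre interface accepts,
    at most once, a list of up to P query/answer pairs that override F,
    after which the main interface answers queries from the table.
    """

    def __init__(self, P: int, N: int, M: int):
        self.P, self.N, self.M = P, N, M
        self.F = [random.randrange(M) for _ in range(N)]
        self.pre_done = False

    def pre(self, overrides: dict[int, int]) -> None:
        # Accessible only once, before any main queries are made.
        assert not self.pre_done and len(overrides) <= self.P
        for x, y in overrides.items():
            self.F[x] = y
        self.pre_done = True

    def main(self, x: int) -> int:
        return self.F[x]
```

Setting \(P = 0\) (ignoring \(\mathsf {pre}\)) recovers the plain \(\mathsf {RO}(N,M)\), while returning the whole table \(F\) at \(\mathsf {pre}\) would model \(\mathsf {AI\text {-}RO}(N,M)\).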

Attackers with Oracle-Dependent Advice. Attackers \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) consist of a preprocessing procedure \(\mathcal A_1\) and a main algorithm \(\mathcal A_2\), which carries out the actual attack using the output of \(\mathcal A_1\). Correspondingly, in the presence of an oracle \(\mathcal O\), \(\mathcal A_1\) interacts with \(\mathcal O{.}\mathsf {pre}\) and \(\mathcal A_2\) with \(\mathcal O{.}\mathsf {main}\).

Definition 2

An \((S,T)\)-attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) in the \(\mathcal O\)-model consists of two procedures

  • \(\mathcal A_1\), which is computationally unbounded, interacts with \(\mathcal O{.}\mathsf {pre}\), and outputs an \(S\)-bit string, and

  • \(\mathcal A_2\), which takes an \(S\)-bit auxiliary input and makes at most \(T\) queries to \(\mathcal O{.}\mathsf {main}\).

In certain contexts, additional restrictions may be imposed on \(\mathcal A_2\), captured by some parameters \(p\); \(\mathcal A\) is referred to as an \((S,T,p)\)-attacker in such cases. Examples of such parameters include time and space requirements of \(\mathcal A_2\) or a limit on the number of queries of a particular type that \(\mathcal A_2\) makes to a challenger it interacts with. Observe that the parameter S is meaningful also in the standard model, where it measures the length of standard non-uniform advice to the attacker. The parameter T, however, is not relevant, as there is no random oracle to query in the attack stage. Consequently, standard-model attackers with resources p are referred to as \((S,*,p)\)-attackers.

Applications. Let \(\mathcal O\) be an arbitrary oracle. An application \(G\) in the \(\mathcal O\)-model is defined by specifying a challenger \(\mathsf C\), which is an oracle algorithm that has access to \(\mathcal O{.}\mathsf {main}\), interacts with the main stage \(\mathcal A_2\) of an attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\), and outputs a bit at the end of the interaction. The success of \(\mathcal A\) on \(G\) in the \(\mathcal O\)-model is defined as

$$\begin{aligned} \mathrm {Succ}_{G,\mathcal O}(\mathcal A)\ := \ {\mathsf P\big [\mathcal A_2^{\mathcal O{.}\mathsf {main}}(\mathcal A_1^{\mathcal O{.}\mathsf {pre}}) \leftrightarrow \mathsf C^{\mathcal O{.}\mathsf {main}} = 1\big ]}, \end{aligned}$$

where \(\mathcal A_2^{\mathcal O{.}\mathsf {main}}(\mathcal A_1^{\mathcal O{.}\mathsf {pre}}) \leftrightarrow \mathsf C^{\mathcal O{.}\mathsf {main}}\) denotes the bit output by \(\mathsf C\) after its interaction with the attacker. This work considers two types of applications, captured by the next definition.

Definition 3

For an indistinguishability application \(G\) in the \(\mathcal O\)-model, the advantage of an attacker \(\mathcal A\) is defined as

$$\begin{aligned} \mathrm {Adv}_{G,\mathcal O}^{\mathsf {}}(\mathcal A)\ := \ 2 \left| \mathrm {Succ}_{G,\mathcal O}(\mathcal A)- \frac{1}{2} \right| . \end{aligned}$$

For an unpredictability application \(G\), the advantage is defined as

$$\begin{aligned} \mathrm {Adv}_{G,\mathcal O}^{\mathsf {}}(\mathcal A)\ := \ \mathrm {Succ}_{G,\mathcal O}(\mathcal A). \end{aligned}$$

An application \(G\) is said to be \(((S,T,p),\varepsilon )\)-secure in the \(\mathcal O\)-model if for every \((S,T,p)\)-attacker \(\mathcal A\),

$$\begin{aligned} \mathrm {Adv}_{G,\mathcal O}^{\mathsf {}}(\mathcal A)\ \le \ \varepsilon . \end{aligned}$$

Combined Query Complexity. In order to enlist Lemma 1 for proving Theorems 1 and 2 below, the interaction of some attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) with a challenger \(\mathsf C\) in the \(\mathcal O\)-model must be “merged” into a single entity \(\mathcal D= (\mathcal D_1,\mathcal D_2)\) that interacts with oracle \(\mathcal O\). That is, \(\mathcal D_1^{(\cdot )} := \mathcal A_1^{(\cdot )}\) and \(\mathcal D_2^{(\cdot )}(z) := \mathcal A_2^{(\cdot )}(z) \leftrightarrow \mathsf C^{(\cdot )}\) for \(z\in \{0,1\}^S\). \(\mathcal D\) is called the combination of \(\mathcal A\) and \(\mathsf C\), and the number of queries it makes to its oracle is referred to as the combined query complexity of \(\mathcal A\) and \(\mathsf C\). For all applications in this work there exists an upper bound \(T^{\mathsf {comb}}_{G}= T^{\mathsf {comb}}_{G}(S,T,p)\) on the combined query complexity of any attacker and the challenger.

Additive Error for Arbitrary Applications. Using the first part of Lemma 1, one proves the following theorem, which states that the security of any application translates from the BF-ROM to the AI-ROM at the cost of an additive term of roughly \(ST/P\), where \(P\) is the maximum number of coordinates an attacker \(\mathcal A_1\) is allowed to fix in the BF-ROM.

Theorem 1

For any \(P\in \mathbb N\) and every \(\gamma > 0\), if an application \(G\) is \(((S,T,p),\varepsilon ')\)-secure in the \(\mathsf {BF\text {-}RO}(P)\)-model, then it is \(((S,T,p),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}\)-model, for

$$\begin{aligned} \varepsilon \ \le \ \varepsilon ' + \frac{(S+\log \gamma ^{-1})\cdot T^{\mathsf {comb}}_{G}}{P} + \gamma , \end{aligned}$$

where \(T^{\mathsf {comb}}_{G}\) is the combined query complexity corresponding to \(G\).

Proof

Fix \(P\) as well as \(\gamma \). Set \(\mathsf {BF\text {-}RO}:= \mathsf {BF\text {-}RO}(P)\) and let \(G\) be an arbitrary application and \(\mathsf C\) the corresponding challenger. Moreover, fix an \((S,T)\)-attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\), and let \(\{Y_z\}_{z\in \{0,1\}^S}\) be the family of distributions guaranteed to exist by Lemma 1, where the function f is defined by \(\mathcal A_1\). Consider the following \((S,T)\)-attacker \(\mathcal A' = (\mathcal A_1',\mathcal A_2')\) (expecting to interact with \(\mathsf {BF\text {-}RO}\)):

  • \(\mathcal A_1'\) internally simulates \(\mathcal A_1\) to compute \(z\leftarrow \mathcal A_1^{\mathsf {AI\text {-}RO}{.}\mathsf {pre}}\). Then, it samples one of the \(P\)-bit-fixing sources \(Y'\) making up \(Y_z\) and presets \(\mathsf {BF\text {-}RO}\) to match \(Y'\) on the at most \(P\) points where \(Y'\) is fixed. The output of \(\mathcal A_1'\) is \(z\).

  • \(\mathcal A_2'\) works exactly as \(\mathcal A_2\).

Let \(\mathcal D\) be the combination of \(\mathcal A_2= \mathcal A_2'\) and \(\mathsf C\). Hence, \(\mathcal D\) is a distinguisher taking an \(S\)-bit input and making at most \(T^{\mathsf {comb}}_{G}\) queries to its oracle. Therefore, by the first part of Lemma 1,

$$\begin{aligned} \mathrm {Succ}_{G,\mathsf {AI\text {-}RO}}(\mathcal A)\ \le \ \mathrm {Succ}_{G,\mathsf {BF\text {-}RO}}(\mathcal A') + \frac{(S+\log \gamma ^{-1})\cdot T^{\mathsf {comb}}_{G}}{P} + \gamma . \end{aligned}$$

Since there is only an additive term between the two success probabilities, the above inequality implies

$$\begin{aligned} \mathrm {Adv}_{G,\mathsf {AI\text {-}RO}}^{\mathsf {}}(\mathcal A)\ \le \ \mathrm {Adv}_{G,\mathsf {BF\text {-}RO}}^{\mathsf {}}(\mathcal A') + \frac{(S+\log \gamma ^{-1})\cdot T^{\mathsf {comb}}_{G}}{P} + \gamma \end{aligned}$$

for both indistinguishability and unpredictability applications.    \(\square \)

Multiplicative Error for Unpredictability Applications. Using the second part of Lemma 1, one proves the following theorem, which states that the security of any unpredictability application translates from the BF-ROM to the AI-ROM at the cost of a multiplicative factor of 2, provided that \(\mathcal A_1\) is allowed to fix roughly \(ST\) coordinates in the BF-ROM.

Theorem 2

For any \(P\in \mathbb N\) and every \(\gamma > 0\), if an unpredictability application \(G\) is \(((S,T,p),\varepsilon ')\)-secure in the \(\mathsf {BF\text {-}RO}(P,N,M)\)-model for

$$\begin{aligned} P\ \ge \ (S+ \log \gamma ^{-1}) \cdot T^{\mathsf {comb}}_{G}, \end{aligned}$$

then it is \(((S,T,p),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(N,M)\)-model for

$$\begin{aligned} \varepsilon \ \le \ 2\varepsilon ' + \gamma , \end{aligned}$$

where \(T^{\mathsf {comb}}_{G}\) is the combined query complexity corresponding to \(G\).

Proof

Using the same attacker \(\mathcal A'\) as in the proof of Theorem 1 and applying the second part of Lemma 1, one obtains, for any \(P\ge (S+ \log \gamma ^{-1}) \cdot T^{\mathsf {comb}}_{G}\),

$$\begin{aligned} \mathrm {Succ}_{G,\mathsf {AI\text {-}RO}}(\mathcal A)&\le \ 2^{(S+\log {1/\gamma })T^{\mathsf {comb}}_{G}/P} \cdot \mathrm {Succ}_{G,\mathsf {BF\text {-}RO}}(\mathcal A') + \gamma \\&\le \ 2 \cdot \mathrm {Succ}_{G,\mathsf {BF\text {-}RO}}(\mathcal A') + \gamma , \end{aligned}$$

which translates into

$$\begin{aligned} \mathrm {Adv}_{G,\mathsf {AI\text {-}RO}}^{\mathsf {}}(\mathcal A)\ \le \ 2 \cdot \mathrm {Adv}_{G,\mathsf {BF\text {-}RO}}^{\mathsf {}}(\mathcal A') + \gamma \end{aligned}$$

for unpredictability applications.    \(\square \)

The Security of Applications in the AI-ROM. The connections between the auxiliary-input random-oracle model (AI-ROM) and the bit-fixing random-oracle model (BF-ROM) established above suggest the following approach to proving the security of particular applications in the AI-ROM: first, deriving a security bound in the easy-to-analyze BF-ROM, and then, depending on whether one deals with an indistinguishability or an unpredictability application, generically inferring the security of the schemes in the AI-ROM, using Theorems 1 or 2.

The three subsequent sections deal with various applications in the AI-ROM: Sect. 3 is devoted to security analyses of basic primitives, where “basic” means that the oracle is directly used as the primitive; Sect. 4 deals with the collision resistance of hash functions built from a random compression function via the Merkle-Damgård construction (MDHFs); and, finally, Sect. 5 analyzes several cryptographic schemes with computational security.

3 Basic Applications in the AI-ROM

This section treats the AI-ROM security of one-way functions (OWFs), pseudorandom generators (PRGs), normal and weak pseudorandom functions (PRFs and wPRFs), and message-authentication codes (MACs). More specifically, the applications considered are:

  • One-way functions: For an oracle \(\mathcal O: [N]\rightarrow [M]\), given \(y = \mathcal O(x)\) for a uniformly random \(x \in [N]\), find a preimage \(x'\) with \(\mathcal O(x') = y\).

  • Pseudo-random generators: For an oracle \(\mathcal O: [N]\rightarrow [M]\) with \(M> N\), distinguish \(y = \mathcal O(x)\) for a uniformly random \(x \in [N]\) from a uniformly random element of \([M]\).

  • Pseudo-random functions: For an oracle \(\mathcal O: [N]\times [L]\rightarrow [M]\), distinguish oracle access to \(\mathcal O(s,\cdot )\) for a uniformly random \(s \in [N]\) from oracle access to a uniformly random function \(F: [L]\rightarrow [M]\).

  • Weak pseudo-random functions: Identical to PRFs, but the inputs to the oracle are chosen uniformly at random and independently.

  • Message-authentication codes: For an oracle \(\mathcal O: [N]\times [L]\rightarrow [M]\), given access to an oracle \(\mathcal O(s,\cdot )\) for a uniformly random \(s \in [N]\), find a pair \((x,y)\) such that \(\mathcal O(s,x) = y\) for an x on which \(\mathcal O(s,\cdot )\) was not queried.
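To make the challenger/attacker format of such applications concrete, here is a toy Python sketch of the first item, the one-wayness experiment in the plain ROM (all names are ours; the sampled table plays the role of \(\mathcal O\), and the attacker is any procedure with query access to it).

```python
import random

def owf_game(N: int, M: int, attacker, trials: int = 1) -> float:
    """One-wayness game in the plain ROM: sample a random O: [N] -> [M]
    and a random x, hand the attacker y = O(x), and count a win if the
    attacker returns any preimage x' with O(x') = y."""
    wins = 0
    for _ in range(trials):
        table = [random.randrange(M) for _ in range(N)]
        x = random.randrange(N)
        y = table[x]
        x_prime = attacker(lambda q: table[q], y)
        wins += (table[x_prime] == y)
    return wins / trials
```

An unbounded attacker that brute-forces all \(N\) inputs wins with probability 1; the bounds in Table 1 concern attackers limited to \(T\) queries and \(S\) bits of advice.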

Table 1. Asymptotic upper and lower bounds on the security of basic primitives against \((S,T)\)-attackers in the AI-ROM, where \(q_{\mathsf {prf}}\) and \({q_{\mathsf {sig}}}\) denote PRF and signing queries, respectively, and where (for simplicity) \(N= M\) for OWFs. Observe that attacks against OWFs also work against PRGs and PRFs.

The asymptotic bounds for the applications in question are summarized in Table 1. For OWFs, PRGs, PRFs, and MACs, the resulting bounds match the corresponding bounds derived by Dodis et al. [18], who used (considerably) more involved compression arguments; weak PRFs have not previously been analyzed.

The precise statements and the corresponding proofs can be found in the full version of this paper [15]; the proofs all follow the paradigm outlined in Sect. 2.2 of first assessing the security of a particular application in the BF-ROM and then generically inferring the final bound in the AI-ROM using Theorems 1 or 2.

4 Collision Resistance in the AI-ROM

A prominent application missing from Sect. 3 is that of collision resistance, i.e., for an oracle \(\mathcal O: [N]\times [L]\rightarrow [M]\), given a uniformly random salt value \(a\in [N]\), finding two distinct \(x,x' \in [L]\) such that \(\mathcal O(a,x) = \mathcal O(a,x')\). The reason for this omission is that in the BF-ROM, the best possible bound is easily seen to be in the order of \(P/N+ T^2/M\). Even applying Theorem 2 for unpredictability applications with \(P\approx ST\) results in a final AI-ROM bound of roughly \(ST/N+ T^2/M\), which is inferior to the optimal bound of \(S/N+ T^2/M\) proved by Dodis et al. [18] using compression.

However, hash functions used in practice, most notably SHA-2, are based on the Merkle-Damgård mode of operation for a compression function \(\mathcal O: [M]\times [L]\rightarrow [M]\), modeled as a random oracle here. Specifically, a B-block message \(y=(y_1,\dots , y_{B})\) with \(y_j \in [L]\) is hashed to \(\mathcal O^B(y)\), where

$$\begin{aligned} \mathcal O^1(y_1)=\mathcal O(a, y_1)\,\, \text { and } \,\, \mathcal O^j(y_1,\dots ,y_j)=\mathcal O(\mathcal O^{j-1}(y_1,\dots ,y_{j-1}),y_j) \,\, \text { for } \,\, j > 1. \end{aligned}$$
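The iteration above translates directly into code. The snippet below is a sketch under the assumption that truncated SHA-256 stands in for the random compression function \(\mathcal O: [M]\times [L]\rightarrow [M]\) (with \(M = L = 2^{32}\) chosen by us for illustration); the chaining variable starts at the IV \(a\) and absorbs one block per step.

```python
import hashlib

def md_hash(iv: int, blocks: list[int]) -> int:
    """Merkle-Damgard iteration O^B(y) for a compression function
    O: [M] x [L] -> [M] with M = L = 2**32, instantiated with
    truncated SHA-256 as a stand-in for the random oracle."""
    def compress(chain: int, block: int) -> int:
        data = chain.to_bytes(4, "big") + block.to_bytes(4, "big")
        return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

    h = iv
    for y in blocks:  # h_j = O(h_{j-1}, y_j), with h_0 = a (the IV)
        h = compress(h, y)
    return h
```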

While—as pointed out above—Dodis et al. [18] provide a tight bound for the one-block case, it is not obvious at all how their compression-based proof can be extended to deal with even two-block messages. Fortunately, no such difficulties appear when we apply our technique of going through the BF-ROM model, allowing us to derive a bound in Theorem 3 below.

Formally, the collision resistance of Merkle-Damgård hash functions (MDHFs) in the \(\mathcal O(ML,M)\)-model is captured by the application \(G^{\mathsf {MDHF},M,L}\), which is defined via the following challenger \(\mathsf C^{\mathsf {MDHF},M,L}\): It initially chooses a public initialization vector (IV) \(a \in [M]\) uniformly at random and sends it to the attacker. The attacker wins if he submits \(y=(y_1,\dots , y_{B})\) and \(y'=(y'_1,\dots , y'_{B'})\) such that \(y\ne y'\) and \(\mathcal O^B(y)=\mathcal O^{B'}(y')\).

For attackers \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) in the following theorem, we make the simplifying assumption that \(T> \max (B,B')\). We prove the following bound on the security of MDHFs in the AI-ROM:

Theorem 3

Application \(G^{\mathsf {MDHF},M,L}\) is \(\left( (S,T,B),\varepsilon \right) \)-secure in the \(\mathsf {AI\text {-}RO}(ML,M)\)-model, where

$$\begin{aligned} \varepsilon \ = \ \tilde{O}\left( \frac{ST^2}{M} + \frac{T^2}{M}\right) . \end{aligned}$$

The proof of Theorem 3 is provided in the full version of this paper [15].

Observe that if \(S\) and \(T\) are taken to be the circuit size, the bound in Theorem 3 becomes vacuous for circuits of size \(M^{1/3}\), i.e., it provides security only well below the birthday bound and may therefore seem extremely loose. Quite surprisingly, however, it is tight:

Theorem 4

There exists an \((S,T)\)-attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) against application \(G:= G^{\mathsf {MDHF},M,L}\) in the \(\mathcal O:= \mathsf {AI\text {-}RO}(ML,M)\)-model with advantage at least

$$\begin{aligned} \mathrm {Adv}_{G,\mathcal O}^{\mathsf {}}(\mathcal A)\ = \ \tilde{\varOmega }\left( \frac{ST^2}{M} + \frac{1}{M}\right) , \end{aligned}$$

assuming \(ST^2 \le M/2\) and \(L\ge M\).

The attack is loosely based on rainbow tables [31] and captured by the following \((S,T)\)-attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\):

  • \(\mathcal A_1\): Obtain the function table \(F: [M] \times [L]\rightarrow [M]\) from \(\mathcal O\). For \(i = 1,\ldots ,m:= S/ (3 \lceil \log L\rceil )\), proceed as follows:

    1. Choose \(a_{i,0} \in [M]\) uniformly at random.

    2. Compute \(a_{i,\ell -1} \leftarrow F^{(\ell -1)}(a_{i,0},0)\), where \(\ell := \lfloor T/2 \rfloor \).Footnote 9

    3. Find values \(x_{i} \ne x_{i}'\) such that \(a_{i,\ell }:= F(a_{i,\ell -1}, x_{i}) = F(a_{i,\ell -1}, x_{i}')\); abort if no such values exist.

    Output the triples \((a_{i,\ell -1},x_{i},x_{i}')\) for \(i = 1,\ldots ,m\).

  • \(\mathcal A_2\): Obtain the public initialization vector a from \(\mathsf C^{\mathsf {MDHF},M,L}\) and the \(m\) triples output by \(\mathcal A_1\). Proceed as follows:

    1. If \(a = a_{i,\ell -1}\) for some i, return \((x_{i},x_{i}')\).

    2. Otherwise, set \(\tilde{a} \leftarrow a\) and for \(j = 1,\ldots ,T\), proceed as follows:

       (a) Query \(\tilde{a} \leftarrow \mathcal O(\tilde{a},0)\).

       (b) If \(\tilde{a} = a_{i,\ell -1}\) for some i, return \((0^j \Vert x_{i},0^j \Vert x_{i}')\); otherwise return (0, 1).

The analysis of the attack can be found in the full version of this paper [15]. It should be noted that in practice hash functions use a fixed IV \(a\), and, therefore—in contrast to, e.g., function inversion, where usually the cost of a single preprocessing stage can be amortized over many inversion challenges—the rather sizeable amount of preprocessing required by the attack to just find a collision may not be justified. However, in some cases, the hash function used in a particular application (relying on collision-resistance) is salted by prepending a random salt value to the input. Such salting essentially corresponds to the random-IV setting considered here, and, therefore, the attack becomes relevant again as one might be able to break many instances of the application using a single preprocessing phase.
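The two stages of the attack can be prototyped for toy parameters. In the sketch below (all names are ours), a lazily sampled table stands in for the random compression function \(F\); `build_advice` plays the role of \(\mathcal A_1\) (walk \(\ell -1\) steps from a random start, then find a one-block collision at the endpoint), and `attack` plays \(\mathcal A_2\) (walk from the challenge IV until an advice endpoint is hit, then pad the stored colliding blocks with zero blocks).

```python
import random

def build_advice(F, M, m, ell):
    """Preprocessing A_1: returns up to m triples (a_end, x, x') with
    F(a_end, x) == F(a_end, x') and x != x', where a_end is reached by
    iterating a -> F(a, 0) for ell - 1 steps from a random start."""
    advice = []
    for _ in range(m):
        a = random.randrange(M)
        for _ in range(ell - 1):
            a = F(a, 0)
        seen = {}
        for x in range(2 * M):  # pigeonhole guarantees a collision within M + 1 tries
            y = F(a, x)
            if y in seen and seen[y] != x:
                advice.append((a, seen[y], x))
                break
            seen[y] = x
    return advice

def attack(F, a, advice, T):
    """Online A_2: walk from the challenge IV a by querying F(., 0); on
    hitting an advice endpoint, output the stored colliding blocks
    prefixed with the matching number of 0-blocks."""
    endpoints = {a_end: (x, xp) for a_end, x, xp in advice}
    if a in endpoints:
        x, xp = endpoints[a]
        return [x], [xp]
    cur = a
    for j in range(1, T + 1):
        cur = F(cur, 0)
        if cur in endpoints:
            x, xp = endpoints[cur]
            return [0] * j + [x], [0] * j + [xp]
    return [0], [1]  # give up: almost surely not a collision
```

The padded outputs collide under the Merkle-Damgård iteration because both messages reach the same chaining value \(a_{i,\ell -1}\) and then diverge in a single colliding block.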

5 Computationally Secure Applications in the AI-ROM

This section illustrates the bit-fixing methodology on two typical computationally secure applications: (1) Schnorr signatures [49], where Theorem 2 can be applied since forging signatures is an unpredictability application, and (2) trapdoor-function (TDF) key-encapsulation mechanisms (KEMs) [8], where an approach slightly more involved than merely analyzing security in the BF-ROM and applying Theorem 1 is required in order to get a tighter security reduction; see below.

(Please refer to Sect. A of the appendix for the definitions of digital signatures, KEMs, TDFs, and other standard concepts used in this section.)

Fiat-Shamir with Schnorr. Let \(G\) be a cyclic group of prime order \(|G|= N\) with generator \(g\). The Schnorr signature scheme \(\varSigma = (\mathsf {Gen},\mathsf {Sig},\mathsf {Vfy})\) in the \(\mathcal O(N^2,N)\)-model works as follows:

  • Key generation: Choose \(x\in \mathbb Z_N\) uniformly at random, compute \(y\leftarrow g^x\), and output \(\mathsf {sk}:= x\) and \(\mathsf {vk}:= y\).

  • Signing: To sign a message \(m\in [N]\) with key \(\mathsf {sk}= x\), pick \(r\in \mathbb Z_N\) uniformly at random, compute \(a\leftarrow g^r\), query \(c\leftarrow \mathcal O(a,m)\), set \(z\leftarrow r+ cx\), and output \(\sigma := (a,z)\).

  • Verification: To verify a signature \(\sigma = (a,z)\) for a message \(m\) with key \(\mathsf {vk}= y\), query \(c\leftarrow \mathcal O(a,m)\), and check whether \(g^z{\mathop {=}\limits ^{?}} ay^c\). If the check succeeds and \(c\ne 0\), accept the signature, and reject it otherwise.
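The three algorithms above can be sketched in Python as follows. The group is a deliberately tiny (insecure) order-\(q\) subgroup of \(\mathbb Z_p^*\), SHA-256 reduced modulo \(q\) stands in for the random oracle, and all parameter choices are illustrative only. For simplicity, the signer resamples \(r\) in the rare case \(c = 0\), so that every produced signature passes the \(c \ne 0\) check.

```python
import hashlib, secrets

# Toy Schnorr over a tiny order-q subgroup of Z_p^* (insecure, illustration
# only): g = 4 has prime order q = 1019 modulo p = 2039 = 2q + 1.
p, q, g = 2039, 1019, 4

def O(a, m):
    """O(a, m) -> Z_q: SHA-256 reduced mod q, modeling the random oracle."""
    return int.from_bytes(hashlib.sha256(f"{a}|{m}".encode()).digest(), "big") % q

def gen():
    x = secrets.randbelow(q)              # sk := x
    return x, pow(g, x, p)                # vk := y = g^x

def sign(x, m):
    while True:
        r = secrets.randbelow(q)
        a = pow(g, r, p)
        c = O(a, m)
        if c != 0:                        # resample so the c != 0 check passes
            return a, (r + c * x) % q     # sigma := (a, z)

def verify(y, m, sig):
    a, z = sig
    c = O(a, m)
    return c != 0 and pow(g, z, p) == (a * pow(y, c, p)) % p
```

Correctness follows since \(g^z = g^{r + cx} = a \cdot y^c\) and \(g\) has order \(q\), so reducing \(z\) modulo \(q\) does not affect verification.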

For attackers \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) in Theorem 5, which assesses the security of Fiat-Shamir with Schnorr in the AI-ROM, we make the running time \(t\) and space complexity \(s\) of \(\mathcal A_2\) explicit. Moreover, if \(\mathcal A\) is an attacker against \(G^{\mathsf {DS},\varSigma }\), there is an additional parameter \({q_{\mathsf {sig}}}\) that restricts \(\mathcal A_2\) to making at most \({q_{\mathsf {sig}}}\) signing queries. The proof of Theorem 5 is provided in the full version of this paper [15].

Theorem 5

Assume \(G^{\mathsf {DL},G}\) for a prime \(|G|= N\) is \(((S',*,t',s'),\varepsilon ')\)-secure, and let \(\varSigma = (\mathsf {Gen},\mathsf {Sig},\mathsf {Vfy})\) be the Schnorr scheme. Then, for any \(T, {q_{\mathsf {sig}}}\in \mathbb N\), \(G^{\mathsf {DS},\varSigma }\) is \(((S,T,t,s,{q_{\mathsf {sig}}}),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(N^2,N)\)-model for

$$\begin{aligned} \varepsilon = \tilde{O}\left( \sqrt{T\varepsilon '} + \frac{S{q_{\mathsf {sig}}}({q_{\mathsf {sig}}}+ T)}{N} \right) , \end{aligned}$$

any \(S\le S' / {\tilde{O}\left( T+ {q_{\mathsf {sig}}}\right) }\), \(t\le t' - \tilde{O}\left( S(T+ {q_{\mathsf {sig}}})\right) \), and \(s\le s' - \tilde{O}\left( S(T+ {q_{\mathsf {sig}}})\right) \).

For comparison, note that the security of Schnorr signatures in the standard ROM is \(O\left( \sqrt{T\varepsilon '} + \frac{{q_{\mathsf {sig}}}({q_{\mathsf {sig}}}+ T)}{N}\right) \), i.e., in the AI-ROM the second term worsens by a factor of \(S\).

TDF Key Encapsulation. Let \(F\) be a trapdoor family (TDF) generator. TDF encryption is a key-encapsulation mechanism \(\varPi = (\mathsf {Gen},\mathsf {Enc},\mathsf {Dec})\) that works as follows:

  • Key generation: Run the TDF generator to obtain \((f,f^{-1}) \leftarrow F\), where \(f,f^{-1}: [N]\rightarrow [N]\). Set the public key \(\mathsf {pk}:= f\) and the secret key \(\mathsf {sk}:= f^{-1}\).

  • Encapsulation: To encapsulate a key with public key \(\mathsf {pk}= f\), choose \(x \in [N]\) uniformly at random, query \(k \leftarrow \mathcal O(x)\), compute \(y \leftarrow f(x)\), and output \((c,k) \leftarrow (y,k)\).

  • Decapsulation: To decapsulate a ciphertext \(c= y\) with secret key \(\mathsf {sk}= f^{-1}\), output \(k \leftarrow \mathcal O(f^{-1}(y))\).
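For concreteness, here is a toy Python sketch of the KEM above, with textbook RSA over a tiny modulus playing the role of the trapdoor permutation and SHA-256 modeling \(\mathcal O\); the parameters are illustrative and offer no security.

```python
import hashlib, secrets

# Toy TDF KEM: textbook RSA as the trapdoor permutation over [N] = Z_n
# (tiny, insecure parameters; SHA-256 models the random oracle O).
n, e, d = 3233, 17, 413               # n = 61 * 53, e*d = 1 mod lcm(60, 52)

def O(x):
    return hashlib.sha256(str(x).encode()).hexdigest()

def f(x):     return pow(x, e, n)     # public direction, pk = f
def f_inv(y): return pow(y, d, n)     # trapdoor direction, sk = f^{-1}

def encaps():
    x = secrets.randbelow(n)          # x uniform in [N]
    return f(x), O(x)                 # ciphertext c = f(x), key k = O(x)

def decaps(c):
    return O(f_inv(c))
```

Decapsulation recovers the key since \(f^{-1}(f(x)) = x\) for the permutation \(f\), so both parties hash the same preimage.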

Theorem 6 deals with the security of TDF key encapsulation in the AI-ROM. Once again, for attackers \(\mathcal A= (\mathcal A_1,\mathcal A_2)\), the running time \(t\) and space complexity \(s\) of \(\mathcal A_2\) is made explicit. The proof of Theorem 6 is provided in the full version of this paper [15].

Theorem 6

Let \(\varPi \) be TDF encapsulation. If \(G^{\mathsf {TDF},F}\) is \(((S',*,t',s'),\varepsilon ')\)-secure, then, for any \(T\in \mathbb N\), \(G^{\mathsf {KEM\text {-}{}CPA},\varPi }\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(N,N)\)-model, where

$$\begin{aligned} \varepsilon \ = \ \tilde{O}\left( \varepsilon ' + \sqrt{ \frac{ST}{N} }\right) \end{aligned}$$

and \(S= S' - \tilde{O}\left( ST\right) \), \(t= t' - \tilde{O}\left( t_{\mathsf {tdf}}\cdot T\right) \), and \(s= s' - \tilde{O}\left( ST\right) \), where \(t_{\mathsf {tdf}}\) is the time required to evaluate the TDF.

Moreover, \(G^{\mathsf {KEM\text {-}{}CCA},\varPi }\) is \(((S,T,t,s),\varepsilon )\)-secure with the same parameters, except that \(t= t' - \tilde{O}\left( t_{\mathsf {tdf}}\cdot ST\right) \).

Observe that the above security bound corresponds simply to the sum of the security of the TDF and the security of \(\mathcal O\) as a PRG (cf. Sect. 3); in the standard random-oracle model, the security of TDF encryption is simply upper bounded by \(O\left( \varepsilon '\right) \) (cf. Sect. A.2).

An important point about the proof of Theorem 6 is that it does not follow the usual paradigm of deriving the security of TDF encryption in the BF-ROM and thereafter applying Theorem 1 (since CPA/CCA security is an indistinguishability application). Doing so—as Unruh does for RSA-OAEP [51] (but in an “asymptotic sense,” as explained in Footnote 5)—would immediately incur an additive error of \(ST/P\), which is at least \(ST/t'\) since the size of the list \(P\) is upper bounded by the TDF attacker size \(t'\). Hence, naively applying Theorem 1 would result in poor exact security.

Instead, our tighter proof of Theorem 6 considers two hybrid experiments (one of which is the original CPA/CCA security game in the AI-ROM). The power of the BF-ROM is used twice—with different list sizes: (1) to argue the indistinguishability of the two experiments and (2) to upper bound the advantage of the attacker in the second hybrid. Crucially, a reduction to TDF security is only required for (1), which has an unpredictability flavor and can therefore get by with a list size of roughly \(P\approx ST\); observe that this is polynomial for efficient \((S,T)\)-attackers. The list size for (2) is obtained via the usual balancing between \(ST/P\) and the security bound in the BF-ROM.Footnote 10

6 Salting Defeats Auxiliary Information

There exist schemes that are secure in the standard ROM but not so in the AI-ROM. A simple example is if the random oracle itself is directly used as a collision-resistant hash function \(\mathcal O: [N]\rightarrow [M]\) for some \(N\) and \(M\): in the ROM, \(\mathcal O\) is easily seen to be collision-resistant, while in the AI-ROM, the first phase \(\mathcal A_1\) of an attacker \(\mathcal A= (\mathcal A_1,\mathcal A_2)\) (cf. Sect. 2.2) can simply leak a collision to \(\mathcal A_2\), which then outputs it, thereby breaking the collision-resistance property.
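This generic attack is easy to make concrete. The following toy Python sketch uses a truncated hash as the random oracle; since the range \([M]\) is smaller than the number of preprocessed inputs, \(\mathcal A_1\) is guaranteed to find a collision by exhaustive search and leaks it as advice (the names are illustrative).

```python
import hashlib

def O(x):
    """Toy random oracle [N] -> [M] with a deliberately small range
    (M = 16^3 = 4096), so collisions exist whenever N > M."""
    return hashlib.sha256(str(x).encode()).hexdigest()[:3]

def A1(N):
    """Preprocessing phase: search the oracle offline, leak one collision."""
    seen = {}
    for x in range(N):
        y = O(x)
        if y in seen:
            return (seen[y], x)       # the advice: a colliding pair
        seen[y] = x
    return None

def A2(advice):
    """Online phase: output the leaked collision; zero oracle queries."""
    return advice
```

The online phase makes no oracle queries at all, which is exactly the separation between the ROM and the AI-ROM.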

The full version of this paper [15] briefly highlights two schemes with computational security where the above phenomenon can be observed as well. The first one is a generic transformation of an identification scheme into a signature scheme using the so-called Fiat-Shamir transform, and the second one is the well-known full-domain hash.Footnote 11

To remedy the situation with schemes such as those mentioned above, in this section we prove that the security of any standard ROM scheme can be carried over to the BF-ROM by sacrificing part of the domain of the BF-RO for salting. First, in Sect. 6.1, we analyze the standard way of salting a random oracle by prefixing a randomly chosen (public) value to every oracle query. Second, in Sect. 6.2, we also show how to adapt a technique by Maurer [41], originally used in the context of key-agreement with randomizers, to obtain a more domain-efficient salting technique, albeit with a longer salt value; the salt length can be reduced by standard derandomization techniques based on random walks on expander graphs.

6.1 Standard Salting

The standard way of salting a scheme is to simply prepend a public salt value to every oracle query: Consider an arbitrary application \(G\) with the corresponding challenger \(\mathsf C\). Let \(\mathsf C_{\mathsf {salt}}\) be the challenger that is identical to \(\mathsf C\) except that it initially chooses a uniformly random value \(a\in [K]\), outputs \(a\) to \(\mathcal A_2\), and prepends \(a\) to every oracle query. Denote the corresponding application by \(G_{\mathsf {salt}}\). Observe that the salt value \(a\) is chosen only after the first stage \(\mathcal A_1\) of the attack, and, hence, as long as \(\mathcal A_1\) does not prefix any coordinate starting with \(a\) in the BF-ROM, it is as if the scheme were executed in the standard ROM. Moreover, note that the time and space complexities \(t\) and \(s\), respectively, of \(\mathcal A_2\) increase roughly by \(P\) due to the security reduction used in the proof.
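The mechanics of \(\mathsf C_{\mathsf{salt}}\)'s oracle handling can be sketched as follows (a hypothetical Python helper, not the paper's notation; the salt is drawn only after any preprocessing has taken place):

```python
import secrets

class SaltedOracle:
    """Wrap a base oracle on [K] x [N] as an oracle on [N] by prepending
    a public, uniformly random salt a in [K] to every query."""
    def __init__(self, base_oracle, K):
        self.base = base_oracle
        self.salt = secrets.randbelow(K)   # chosen only after preprocessing
    def query(self, x):
        return self.base(self.salt, x)
```

Because the salt selects one of \(K\) disjoint "rows" of the base oracle's table, preprocessed advice about other rows is useless unless the attacker happened to fix positions in the chosen row.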

Theorem 7

For any \(P\in \mathbb N\), if an application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {BF\text {-}RO}(P,NK,M)\)-model for

$$\begin{aligned} \varepsilon \ = \ \varepsilon ' + \frac{P}{K}, \end{aligned}$$

\(S= S' - \tilde{O}\left( P\right) \), \(T= T'\), \(t= t' - \tilde{O}\left( P\right) \), and \(s= s' - \tilde{O}\left( P\right) \).

The proof of Theorem 7 is provided in the full version of this paper [15]. Combining Theorem 7 with Theorems 1 and 2 from Sect. 2.2 yields the following corollaries:

Corollary 1

For any \(P\in \mathbb N\) and every \(\gamma > 0\), if an arbitrary application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(NK,M)\)-model for

$$ \varepsilon \ = \ \varepsilon ' + \frac{P}{K} + \frac{(S+ \log \gamma ^{-1}) \cdot T^{\mathsf {comb}}_{G_{\mathsf {salt}}}}{P}+ \gamma $$

and any \(S= S' - \tilde{O}\left( P\right) \), \(T= T'\), \(t= t' - \tilde{O}\left( P\right) \), and \(s= s' - \tilde{O}\left( P\right) \), where \(T^{\mathsf {comb}}_{G_{\mathsf {salt}}}\) is the combined query complexity corresponding to \(G_{\mathsf {salt}}\).

Corollary 2

For every \(\gamma > 0\), if an unpredictability application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(NK,M)\)-model for

$$ \varepsilon \ = \ 2 \varepsilon ' + \frac{2 (S+ \log \gamma ^{-1}) \cdot T^{\mathsf {comb}}_{G_{\mathsf {salt}}}}{K} + \gamma $$

and any \(S= S' / \tilde{O}(T^{\mathsf {comb}}_{G_{\mathsf {salt}}})\), \(T= T'\), \(t = t' - \tilde{O}\left( P\right) \), and \(s = s' - \tilde{O}\left( P\right) \), where \(P= (S+ \log \gamma ^{-1}) T^{\mathsf {comb}}_{G_{\mathsf {salt}}}\) and where \(T^{\mathsf {comb}}_{G_{\mathsf {salt}}}\) is the combined query complexity corresponding to \(G_{\mathsf {salt}}\).

Applications. In the full version of this paper [15], we briefly discuss how salting affects the security of the applications presented in Sects. 3 to 5. We also provide examples to illustrate that directly analyzing a salted scheme in the BF-ROM can lead to much better bounds than combining a standard-ROM security bound with one of the above corollaries.

6.2 Improved Salting

One way to think of salting is to view the function table of \(\mathsf {BF\text {-}RO}(KN,M)\) as a \((K\times N)\)-matrix and let the challenger in the salted application randomly pick and announce the row to be used for oracle queries. However, \(K\) has to be around the same size as \(N\) to obtain meaningful bounds. In this section, based on a technique by Maurer [41], we provide a more domain-efficient means of salting, where the error decays exponentially (as opposed to inversely linearly) in the domain-expansion factor \(K\), at the cost that each evaluation of the derived random oracle requires \(K\) evaluations (as opposed to one) of the original random oracle.

Consider an arbitrary application \(G\) with corresponding challenger \(\mathsf C\). Let \(\mathsf C_{\mathsf {salt'}}\) be the challenger that works as follows: It initially chooses a uniformly random value \(a= (a_1,\dots , a_K) \in [N]^K\) and outputs \(a\) to \(\mathcal A_2\). Then, it internally runs \(\mathsf C\), forwards all messages between the attacker and \(\mathsf C\), but answers the queries \(x \in [N]\) that \(\mathsf C\) makes to the oracle by

$$\begin{aligned} \sum _{i=1}^K\mathsf {BF\text {-}RO}{.}\mathsf {main}(i,x+a_i), \end{aligned}$$

where addition is in \(\mathbb Z_{N}\) and \(\mathbb Z_{M}\), respectively. In other words, the function table of \(\mathsf {BF\text {-}RO}\) is arranged as a \(K\times N\) matrix, the \({i}^\text {th}\) row is shifted by \(a_i\), and queries x are answered by computing the sum modulo \(M\) of all the values in the \({x}^\text {th}\) column of the shifted matrix, denoted \(F_{a}\). Denote the corresponding application by \(G_{\mathsf {salt'}}\). The proof of the following theorem is provided in the full version of this paper [15]. Moreover, we present a means of reducing the size of the public salt value.
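The derived oracle \(F_a\) can be sketched as follows (hypothetical Python helper names; in the test a deterministic toy function stands in for \(\mathsf{BF\text{-}RO}\)):

```python
import secrets

def make_salted_oracle(bf_ro, N, M, K):
    """Improved salting sketch: view bf_ro(i, j) as a K x N table over Z_M,
    shift row i by the public salt a_i, and answer a query x with the sum
    modulo M of column x of the shifted matrix. Each derived query costs
    K base-oracle queries."""
    a = [secrets.randbelow(N) for _ in range(K)]      # public salt in [N]^K
    def query(x):
        return sum(bf_ro(i, (x + a[i]) % N) for i in range(K)) % M
    return a, query
```

Intuitively, as long as at least one shifted row avoids all \(P\) prefixed positions in the queried column, the column sum remains uniformly random, which is what drives the exponential decay in \(K\) in Theorem 8.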

Theorem 8

For any \(P\in \mathbb N\), if an application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt'}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {BF\text {-}RO}(P,NK,M)\)-model for

$$\begin{aligned} \varepsilon \ = \ \varepsilon ' + N \cdot \left( \frac{P}{{KN}}\right) ^K, \end{aligned}$$

\(S= S' - \tilde{O}\left( P\right) \), \(T= T'\), \(t= t' - \tilde{O}\left( P\right) \), and \(s= s' - \tilde{O}\left( P\right) \).

In particular, assuming \(P\le KN/2\), setting \(K=O(\log N)\) will result in additive error \(N(P/NK)^K= o(\frac{1}{N})\) and domain size \(O(N\log N)\). But if \(P\le N^{1-\varOmega (1)}\), setting \(K=O(1)\) will result in the same additive error \(o(\frac{1}{N})\) in the original domain of near-optimal size O(N). Hence, for most practical purposes, the efficiency slowdown \(K\) (in both the domain size and the complexity of oracle evaluation) is at most \(O(\log N)\) and possibly constant.

Combining the above results with those in Sect. 2.2 yields the following corollaries:

Corollary 3

For any \(P\in \mathbb N\) and every \(\gamma > 0\), if an application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt'}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(NK,M)\)-model for

$$\begin{aligned} \varepsilon \ = \ \varepsilon ' + N\cdot \left( \frac{P}{KN} \right) ^K+ \frac{(S+ \log \gamma ^{-1}) \cdot T^{\mathsf {comb}}_{G_{\mathsf {salt'}}}}{P} + \gamma \end{aligned}$$

and any \(S= S' - \tilde{O}\left( P\right) \), \(T= T'\), \(t= t' - \tilde{O}\left( P\right) \), and \(s= s' - \tilde{O}\left( P\right) \), where \(T^{\mathsf {comb}}_{G_{\mathsf {salt'}}}\) is the combined query complexity corresponding to \(G_{\mathsf {salt'}}\).

Corollary 4

For every \(\gamma > 0\), if an unpredictability application \(G\) is \(((S',T',t',s'),\varepsilon ')\)-secure in the \(\mathsf {RO}(N,M)\)-model, then \(G_{\mathsf {salt'}}\) is \(((S,T,t,s),\varepsilon )\)-secure in the \(\mathsf {AI\text {-}RO}(NK,M)\)-model for

$$\begin{aligned} \varepsilon \ = \ 2 \varepsilon ' + 2 N\cdot \left( \frac{(S+ \log \gamma ^{-1}) T^{\mathsf {comb}}_{G_{\mathsf {salt'}}}}{KN} \right) ^K + \gamma \end{aligned}$$

and any \(S= S' / \tilde{O}(T^{\mathsf {comb}}_{G_{\mathsf {salt'}}})\), \(T= T'\), \(t = t' - \tilde{O}\left( P\right) \), and \(s = s' - \tilde{O}\left( P\right) \), where \(P= (S+ \log \gamma ^{-1}) T^{\mathsf {comb}}_{G_{\mathsf {salt'}}}\) and where \(T^{\mathsf {comb}}_{G_{\mathsf {salt'}}}\) is the combined query complexity corresponding to \(G_{\mathsf {salt'}}\).