1 Introduction

Several applications rely on so-called “moderately-hard tasks” that are not infeasible to solve, but whose cost is non-trivial. The cost can for example be the hardware or electricity cost of computation as in proofs of work [13, 14, 18] or time-lock puzzles [23], the cost for disk storage space as in proofs of space [15], the cost of “human attention” in captchas [24] or the cost of memory in memory-bound [2, 12] or memory-hard functions [8, 20], the latter being the topic of this work. Applications of such tasks include the prevention of spam [13], protection against denial-of-service attacks [19], metering client access to web sites [16], consensus protocols underlying decentralized cryptocurrencies [10] (known as blockchains) or password hashing [22], which we’ll discuss in more detail next.

In the setting of password hashing, a user’s password (plus a salt and perhaps other system-dependent parameters) is the input to a moderately-hard function f, and the resulting output is the password hash to be kept in a password file. The hope is that even if the password file is compromised, a brute-force dictionary attack remains costly as it would require an attacker to evaluate f on every password guess. Traditional approaches for password hashing have focused on iterating a hash function a certain number (typically a few thousand) of times, as for instance in PBKDF2. An advantage for the honest user results from the fact that he or she needs to compute f only once on the known password, while an attacker is forced to compute f on a large number of passwords. However, this advantage can be eroded, because in contrast to honest users, who typically use general-purpose hardware, attackers may invest into special-purpose hardware like ASICs (Application Specific Integrated Circuits) and recoup the investment over multiple evaluations. Moreover, such special-purpose hardware may exploit parallelism, pipelining, and amortization in ways that the honest user’s single evaluation of f cannot. Consequently, the adversary’s cost per evaluation can be several orders of magnitude lower than that for the honest user.
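To make the iterated-hashing approach concrete, here is a minimal sketch (ours, for illustration only; the function and parameter names are placeholders, and real deployments should use a vetted PBKDF2 implementation):

```python
# Minimal sketch of PBKDF2-style iterated password hashing (illustration only).
import hashlib

def iterated_hash(password: bytes, salt: bytes, iterations: int = 10_000) -> bytes:
    state = hashlib.sha256(salt + password).digest()
    for _ in range(iterations):
        state = hashlib.sha256(state).digest()  # each iteration adds sequential work
    return state
```

The honest user pays the iteration cost once per login, while the attacker pays it once per guess; as noted above, cheap special-purpose hashing hardware shrinks the attacker’s per-guess cost, which is precisely what motivates memory-hard alternatives.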

Memory-Hard Functions. To reduce the disparity between per-evaluation costs of the honest user and a potential attacker, Percival [20] suggested measuring cost by the amount of space used by the algorithm multiplied by the amount of time. A memory-hard function (MHF), in Percival’s definition, is one where this evaluation cost measure is high not only for the honest user’s sequential algorithm, but also no parallel algorithm can do much better. In particular, if a parallel algorithm can cut the time to evaluate f by some factor, it must come at the cost of an increase in space by roughly the same factor. Since memory is inherently general-purpose, measuring cost in terms of space provides a reasonably accurate comparison of resources used by different implementations. We stress that memory hardness is a very different notion from that of memory-bound functions proposed by Abadi, Burrows, Manasse, and Wobber [1, 2], which maximize the number of memory accesses at unpredictable locations so that the inherent memory-access latency (resulting from frequent cache misses) imposes a lower bound on the time needed to evaluate the function which is independent of the actual CPU power. Memory hardness also does not guarantee that a lot of memory will be required, because it allows trading memory for time.

Alwen and Serbinenko [8] observed that Percival’s notion of cost is not robust to amortization: it may be that an algorithm uses a large amount of memory at its peak, but a much smaller amount on average; pipelining multiple evaluations (by multiple CPUs using shared memory) in such a way that peaks occur at different times can thus reduce the per-evaluation cost of f. They propose the notion of cumulative memory complexity (abbreviated \(\mathsf {cc_{mem}}\)), which is robust to amortization. It is defined as the sum of memory actually used at each point in time (rather than the product of peak memory and time). We will use this notion in our work. The \(\mathsf {cc_{mem}}\) of a function is defined as the lowest \(\mathsf {cc_{mem}}\) of all algorithms that evaluate the function.
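The difference between the two measures is visible already in a toy numerical experiment (our own illustration; memory is measured in abstract units per time step):

```python
# Toy memory trace of one evaluation: memory in use at each time step.
trace = [1, 2, 4, 8, 16, 8, 4, 2, 1]

single_st = max(trace) * len(trace)  # Percival-style peak*time: 16 * 9 = 144
single_cc = sum(trace)               # cumulative memory complexity: 46

# Pipeline two evaluations, offset by 4 steps, in shared memory:
shared = [a + b for a, b in zip(trace + [0] * 4, [0] * 4 + trace)]

two_st = max(shared) * len(shared)   # 17 * 13 = 221 < 2 * 144: peak*time amortizes
two_cc = sum(shared)                 # 92 = 2 * 46: cc_mem simply adds up
print(single_st, single_cc, two_st, two_cc)
```

The peak-times-time cost of the two pipelined evaluations (221) is well below twice that of a single evaluation (288), whereas \(\mathsf {cc_{mem}}\) doubles exactly; this robustness to amortization is what the definition is designed to capture.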

Best-Possible Hardness. Given the state of computational complexity theory, where we cannot even prove superlinear lower bounds for problems in NP, all \(\mathsf {cc_{mem}}\) lower-bound results so far necessarily make use of idealized models of computation, like the random oracle model.

Many candidate memory-hard functions (including \(\mathtt{scrypt} \)) can be viewed as a mode of operation for an underlying building block like a cryptographic hash-function \(\mathsf {h}\). Such MHFs come with an evaluation algorithm—which we’ll call “the naïve algorithm”—which makes only sequential access to \(\mathsf {h}\). Note that after t steps (each involving at most one query to \(\mathsf {h}\)), even a naïve algorithm which stores all t outputs it received from \(\mathsf {h}:\{0,1\}^*\rightarrow \{0,1\}^w\) will not use more than \(O(t\cdot w)\) memory; therefore, if the naïve algorithm for an MHF \(f^\mathsf {h}\) runs for n steps total, the \(\mathsf {cc_{mem}}\) of \(f^\mathsf {h}\) will be in \(O(n^2\cdot w)\). Thus a lower bound on the \(\mathsf {cc_{mem}}\) of \(\varOmega (n^2\cdot w)\) is the best we can hope for in any model which captures at least the naïve algorithm.

If the naïve algorithm has the feature that its memory access pattern (addresses read and written) is independent of the input, the MHF is called data-independent. A data-independent \(f^\mathsf {h}\) can be represented as a directed acyclic graph, with a unique source corresponding to the input, a unique sink corresponding to the final output, and the other nodes indicating intermediary values, where the value of a node is computed as a function of the values of its parents (using one invocation of \(\mathsf {h}\)). To derive meaningful bounds, we require that this graph has constant in-degree, so computing an intermediate value takes constant time. The evaluation of \(f^\mathsf {h}\) can now be cast as a graph pebbling problem [8]. Any constant in-degree graph can be pebbled (in the so-called parallel black pebbling game) using “cumulative pebbling complexity” \(\mathsf {cc_{peb}}=O(n^2/\log n)\). As any such pebbling implies an evaluation algorithm with \(\mathsf {cc_{mem}}\approx \mathsf {cc_{peb}}\cdot w\), we get an \(O(w\cdot n^2/\log n)\) upper bound on \(\mathsf {cc_{mem}}\) for any data-independent MHF [3]. This upper bound is matched by [4], who construct a data-independent function with \(\mathsf {cc_{mem}}=\varOmega (w \cdot n^2/\log (n))\) in the parallel random oracle model. This raises the question of whether the lower bound of \(\varOmega (w\cdot n^2)\) is achievable at all; the above discussion shows that to achieve this lower bound, it will not be sufficient to only consider data-independent MHFs.

The \(\mathtt{Scrypt} \) MHF. Percival [20] proposed a candidate (data-dependent) memory-hard function called \(\mathtt{scrypt} \). On input X, the \(\mathtt{scrypt} ^\mathsf {h}\) function (where \(\mathsf {h}\) is a cryptographic hash function modeled as a random oracle for the lower bound proof) computes values \(X_0,X_1,\ldots ,X_{n-1},S_0,\ldots ,S_n\) as defined below, and finally outputs \(S_n\):

  • \(X_0=X\) and for \(i=1,\ldots ,n-1\ :\ X_i=\mathsf {h}(X_{i-1})\)

  • \(S_0=\mathsf {h}(X_{n-1})\) and for \(i=1,\ldots , n\ :\ S_i= \mathsf {h}(S_{i-1}\oplus X_{S_{i-1} \, \bmod \, n})\)

\(\mathtt{Scrypt} \) has found widespread popularity: it is used in proofs-of-work schemes for cryptocurrencies (most notably Litecoin [10], but also Tenebrix or Dogecoin), is described by an RFC [21], and has inspired the design of one of the Password-hashing Competition’s [22] winners, Argon2d [9].

An intuitive explanation for why \(\mathtt{scrypt} \) was conjectured to be memory-hard is as follows. View the first portion of \(\mathtt{scrypt} \) as an n-node line graph, with nodes labeled by \(X_0, \dots , X_{n-1}\). To compute \(S_{i+1}\), an algorithm needs \(X_{S_{i} \, \bmod \, n}\), whose index \((S_i \, \bmod \, n)\) is random and unknown until \(S_{i}\) is computed. If the algorithm stores a subset of the X values of size p before \(S_{i}\) is known, then the label of a random node in the line graph will be on average n / (2p) steps from a stored label, and will therefore take n / (2p) sequential evaluations of \(\mathsf {h}\) to compute, for a total memory \(\cdot \) time cost of \(p\cdot n/(2p)= n/2\). Since there are n \(S_i\) values to compute, this strategy has \(\mathsf {cc_{mem}}\) of \(w\cdot n\cdot n/2=\frac{1}{2}wn^2\).
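This intuition is easy to check empirically. The following sketch (ours, with arbitrary parameters) estimates the expected number of sequential \(\mathsf {h}\)-evaluations needed to recompute a random challenge label from p equidistantly stored labels:

```python
# Estimate the average recomputation distance with p equidistant stored labels.
import random

def avg_distance(n: int, p: int, trials: int = 100_000) -> float:
    stored = range(0, n, n // p)                       # indices of the p stored labels
    total = 0
    for _ in range(trials):
        j = random.randrange(n)                        # random challenge index
        total += j - max(v for v in stored if v <= j)  # sequential h-calls to reach X_j
    return total / trials

print(avg_distance(n=1024, p=16))  # close to n/(2p) = 32
```

Multiplying the \(p\cdot w\) bits held throughout by the \(\approx n/(2p)\) recomputation steps for each of the n challenges recovers the \(\frac{1}{2}wn^2\) figure above.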

This simple argument, however, does not translate easily into a proof. The two main challenges are as follows. First, in general an algorithm computing \(\mathtt{scrypt} \) is not restricted to just store labels of nodes, but can compute and store arbitrary information. Surprisingly, functions f for which storing information other than just labels provably decreases \(\mathsf {cc_{mem}}\) have been constructed in [5, Appendix A]. Second, an algorithm is not compelled to keep all p labels in memory after the index \(S_i \, \bmod \, n\) is known. In fact, [8] show that if one is given the indices \(S_i \, \bmod \, n\) in advance, an evaluation algorithm exists which evaluates \(\mathtt{scrypt} ^\mathsf {h}\) with \(\mathsf {cc_{mem}}\) only \(O(w\cdot n^{1.5})\), because knowing the future indices enables the algorithm to keep or recompute the labels that will be needed in the near future, and delete those that won’t.

Previous work on \(\mathtt{scrypt} \). Percival’s original paper [20] proposed an analysis of \(\mathtt{scrypt} \), but his analysis is incorrect, as we point out in Appendix A, in addition to not targeting \(\mathsf {cc_{mem}}\). Recent progress toward proving that \(\mathtt{scrypt} \) is memory-hard was made by Alwen et al. [6]. They lower bound the \(\mathsf {cc_{mem}}\) of \(\mathtt{scrypt} \) by \(\varOmega (w\cdot n^2/\log ^2n)\), albeit only for a somewhat restricted class of adversaries (informally, adversaries who can store secret shares of labels, but not more general functions). We’ll compare their work with ours in more detail below.

Our Results. We give the first non-trivial unconditional lower bound on \(\mathsf {cc_{mem}}\) for \(\mathtt{scrypt} ^\mathsf {h}\) in the parallel random oracle model, and our bound already achieves optimal \(\mathsf {cc_{mem}}\) of \(\varOmega (w\cdot n^2)\).

We’ll give the exact theorem statement and an overview of the proof in Sect. 3. However, to appreciate the novelty of our results, we note that the only existing proofs to lower bound \(\mathsf {cc_{mem}}\) of MHFs go through some kind of lower bounds for pebbling.

For data-independent MHFs [8] there is an elegant argument (known as “ex post facto”) showing that a lower bound on the cumulative complexity for the parallel black pebbling game translates directly into a lower bound for \(\mathsf {cc_{mem}}\). Thus, the problem is reduced to the purely combinatorial problem of proving a pebbling lower bound on the graph underlying the MHF.

For data-dependent MHFs no such result, showing that pebbling lower bounds imply \(\mathsf {cc_{mem}}\) lower bounds for general adversaries, is known. The lower bound on \(\mathsf {cc_{mem}}\) for \(\mathtt{scrypt} \) from [6] was also derived by first proving a lower bound on the pebbling complexity, but for a more powerful pebbling adversary that can use “entangled” pebbles. This lower bound then translated into a lower bound for \(\mathsf {cc_{mem}}\) for a limited class of adversaries who, apart from labels, can also store “secret shares” of labels. It was conjectured [6] that lower bounds for this entangled pebbling game already imply lower bounds on \(\mathsf {cc_{mem}}\) for arbitrary adversaries, and a combinatorial conjecture was stated which, if true, would imply this. Unfortunately, the strongest (and simplest) version of this conjecture has already been refuted. A weaker version of the conjecture has been “weakly” refuted, in the sense that, even if it were true, one would lose a factor of at least \(\log (n)\) by going from pebbling to memory lower bounds. (The current state of the conjecture is available in the eprint version of the paper [5].)

In this work, in Sect. 5, we also prove an optimal \(\varOmega (n^2)\) lower bound on the parallel cumulative pebbling complexity for a game which abstracts the evaluation of \(\mathtt{scrypt} \): we consider a path of length n, and an adversary must pebble n randomly chosen nodes on this graph, where the ith challenge node is only revealed once challenge \(i-1\)’s node is pebbled. This already gives an optimal \(\varOmega (n^2\cdot w)\) lower bound on \(\mathsf {cc_{mem}}\) for \(\mathtt{scrypt} \) for adversaries who are only allowed to store entire labels, but not any functions thereof. This improves on the \(\varOmega (n^2/\log ^2(n))\) lower bound from [6], who use a rather coarse potential argument which roughly states that, for any challenge, either we pay a lot for pebbling the next challenge node, or the “quality” of our pebbling configuration decreases. As this quality cannot decrease too many times, at least every \(\log (n)\)’th challenge will cost \(n/\log (n)\) in cumulative complexity, giving the overall \(\varOmega (n^2/\log ^2(n))\) lower bound after n challenges. In this work we introduce a new technique for analyzing the cumulative pebbling cost where—for every challenge—we take into account the cumulative cost of the pebbling configurations before this challenge is revealed. Both the potential argument from [6] and our new proof rely on a generalization of the fact that given a configuration with p pebbles and a random challenge, with good probability (say at least \(\frac{1}{2}\)) an adversary needs at least (roughly) n / p steps to pebble the challenge.

As discussed above, pebbling lower bounds are not known to directly imply \(\mathsf {cc_{mem}}\) lower bounds for data-dependent MHFs, so to prove our main result in Sect. 6, we in some sense emulate our proof for the pebbling game directly in the parallel random oracle model. However, there are two problems we will need to overcome. The first is that the adversary’s state is not made of labels (corresponding to pebbles), but could be any function thereof. Still, we will want to show that in order to compute a challenge, an adversary storing any \(p \cdot w\) bits of information about the random oracle will, with good probability (say at least \(\frac{1}{2}\)), need at least (roughly) n / p steps. We will show this using a careful compression argument in Sect. 4. The second problem is the fact that in \(\mathtt{scrypt} \) the challenges are not randomly and externally generated, but come from the random oracle.

2 Preliminaries

We review basic notation and concepts from the literature on memory-hard functions. We will also define the \(\mathtt{scrypt} \) function as needed further below.

The parallel random-oracle model. We first define the parallel random-oracle model (pROM), essentially following the treatment from [8], with some highlighted differences.

Concretely, we consider an oracle-aided deterministic algorithm A which runs in rounds, starting with round 1. Let \(\mathsf {h}\) denote an oracle with w-bit outputs. It does not matter for our model whether oracle inputs are restricted in length, but it will be simpler to assume a general (possibly very large) upper bound on the length of its inputs to make the set of oracles finite.

In general, a state is a pair \(({\tau }, {\mathbf s})\) where data \({\tau } \) is a string and \({\mathbf s} \) is a tuple of strings. In an execution, at the end of round i, algorithm A produces as output an output state \({\bar{\sigma }} _i = ({\tau } _i, {\mathbf q} _i)\) where \({\mathbf q} _i = [q_i^1,\ldots ,q_i^{z_i}]\) is a tuple of queries to \(\mathsf {h}\). At the beginning of the next round \(i+1\), algorithm A gets as input the corresponding input state \({\sigma } _i = ({\tau } _i, \mathsf {h}({\mathbf q} _i))\) where \(\mathsf {h}({\mathbf q} _i) = [\mathsf {h}(q_i^1), \ldots , \mathsf {h}(q_i^{z_i})]\) is the tuple of responses from \(\mathsf {h}\) to the queries \({\mathbf q} _i\). In particular, since A is deterministic, for a given \(\mathsf {h}\) the input state \({\sigma } _{i+1}\) is a function of the input state \({\sigma } _i\).

The initial input state \({\sigma } _0\) is normally empty with length 0 (though in the proof we will also need to consider a non-empty initial input state); an input X is given together with \({\sigma } _0\) in the first round. We require that A eventually terminates and denote its output by \(A^{\mathsf {h}}(X)\).

Complexity measure. For a given execution the complexity measure we are going to be concerned with is the sum of the bit-lengths of the input states. To make this precise we introduce the following notation. For a string x we denote its bit-length by |x|. For state \({\sigma } =({\tau },{\mathbf s})\) where \({\mathbf s} =[s_1, \ldots , s_z]\) we denote the bit-length (or size) of \({\sigma } \) by \(|{\sigma } |=|{\tau } | + \sum _{j=1}^z |s_j|\). We can now define the cumulative (memory) complexity of an execution of algorithm A on input X using oracle \(\mathsf {h}\) resulting in input states \({\sigma } _0, {\sigma } _1, \ldots \) as

$$\begin{aligned} \mathsf {cc_{mem}}(A^{\mathsf {h}}(X)) = \sum _{i \ge 0} {\left| {\sigma } _i\right| }. \end{aligned}$$

We will assume without loss of generality that at each round, the query tuple \({\mathbf q} \) contains at least one query, for otherwise A can proceed directly to the next round where it issues a query, without increasing its cumulative complexity. In particular, this implies \({\left| {\sigma } _i\right| } \ge w\) for \(i>0\).

Note that \(\mathsf {cc_{mem}}\) does not charge anything for computation or memory used within each round itself. We are also allowing inputs to \(\mathsf {h}\) to be arbitrarily long without extra memory cost—only the output length w is charged to the cumulative complexity. This only makes our lower bound stronger. Note however that this also means that \(\mathsf {cc_{mem}}\) gives a good upper bound only when computation is dominated by the memory cost (as is the case for the naïve evaluation algorithm of \(\mathtt{scrypt} ^\mathsf {h}\), which, aside from querying \(\mathsf {h}\) sequentially, performs only a few trivial computations, such as exclusive-ors and modular reductions).

The \(\mathtt{scrypt} \) MHF. We will consider the \(\mathtt{scrypt} ^\mathsf {h}\) function throughout this paper (more specifically, we study its core, ROMix, as defined in [20]). Recall that for a hash function \(\mathsf {h}: {\{0,1\}} ^* \rightarrow {\{0,1\}} ^w\), \(\mathtt{scrypt} ^{\mathsf {h}}\) on input \(X \in {\{0,1\}} ^w\) and parameter \(n\in \mathbb {N}\) computes values \(X_0,X_1,\ldots ,X_{n-1},S_0,\ldots ,S_n\) and outputs \(S_n\), where

  • \(X_0=X\) and for \(i=1,\ldots ,n-1\ :\ X_i=\mathsf {h}(X_{i-1})\)

  • \(S_0=\mathsf {h}(X_{n-1})\) and for \(i=1,\ldots , n\ :\ S_i= \mathsf {h}(S_{i-1}\oplus X_{S_{i-1} \, \bmod \, n})\)

We will also define intermediate variables \(T_0,\ldots ,T_n\) with \(T_0=X_{n-1}\) and \(T_i = S_{i-1}\oplus X_{S_{i-1} \, \bmod \, n}\) for \(1\le i\le n\), so that \(S_i = \mathsf {h}(T_i)\).

Note that one may not want to restrict X to w bits. In this case, one can replace X with \(\mathsf {h}(X)\) in the above construction. For notational simplicity, we will only analyze the w-bit input case in this paper, but the general analysis is very similar.
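For concreteness, here is a minimal sketch of this core computation (our own illustration; we instantiate \(\mathsf {h}\) with truncated SHA-256 as an arbitrary stand-in for the random oracle):

```python
# Minimal sketch of the scrypt core (ROMix) as defined above.
import hashlib

W = 32  # label length in bytes (w = 256 bits)

def h(data: bytes) -> bytes:
    """Stand-in for the random oracle h: {0,1}* -> {0,1}^w."""
    return hashlib.sha256(data).digest()[:W]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def scrypt_core(X: bytes, n: int) -> bytes:
    """Returns S_n; this naive algorithm stores all n labels X_0, ..., X_{n-1}."""
    labels = [X]                              # X_0 = X
    for _ in range(n - 1):
        labels.append(h(labels[-1]))          # X_i = h(X_{i-1})
    S = h(labels[-1])                         # S_0 = h(X_{n-1})
    for _ in range(n):
        j = int.from_bytes(S, "big") % n      # data-dependent index S_{i-1} mod n
        S = h(xor(S, labels[j]))              # S_i = h(T_i) with T_i = S_{i-1} xor X_j
    return S

print(scrypt_core(b"\x00" * W, n=1024).hex())
```

Note how the indices accessed in the second phase depend on the hash values themselves; this is exactly what makes \(\mathtt{scrypt} \) data-dependent.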

Graph and pebbling preliminaries. For some of our partial results below, we will adopt the graph-pebbling view on computing candidate MHFs, following [8]. The parallel black pebbling game is played on a directed acyclic graph \(G = (V, E)\). At each time step t starting with \(t=0\), the adversary maintains a subset \(P_t\) of nodes (“pebbles”). A node v is allowed (but not required) to get a pebble at time t if there is a pebble on all of its predecessors (i.e., all \(v'\) such that \((v', v) \in E\)), or if there was a pebble on v itself at time \(t - 1\). Formally, define \(\mathsf {pre}(v)\) to be the set of all predecessors of v, and for \(U \subseteq V\), define \(U^+ = \{ v \in V \;:\; \mathsf {pre}(v) \subseteq U \}\). Then, at time \(t>0\), the set \(P_t\) must be a subset of \(P_{t-1}\cup P_{t-1}^+\).

We define \(\mathsf {p}_i=|P_i| \ge 1\). The (parallel) cumulative pebbling complexity of a sequence of pebbling configurations \(P_0, P_1, \ldots , P_t\) is \(\sum _{i=0}^t \mathsf {p}_i\). We remark that we modify the pebbling rules slightly from [8] by not permitting the adversary to put a pebble on the source for free: \(v_0\) is contained in \(P_0\) and cannot be added to \(P_t\) if it is absent in \(P_{t-1}\) (this change will simplify calculations, and only increase the size of each set by 1).

Pebbling with challenges. Normally, the goal of pebbling games is to place a pebble on the sink of the graph. Here, we are going to consider pebbling games with \(Q\) challenges on a graph \(G = (V,E)\), where the adversary proceeds in rounds, and in each round i, it receives a random challenge \(c_i \in V\) (usually uniform from a subset \(V' \subseteq V\)), and the goal is to place a pebble on \(c_i\), which enables the adversary to move to the next round (unless this was the last challenge \(c_{Q}\), in which case the game terminates.) For instance, the core of the evaluation of \(\mathtt{scrypt} \) is captured by the line graph with vertices \(v_0,\ldots ,v_{n-1}\) and edges \((v_i, v_{i+1})\) for \(i = 0, \ldots , n-2\), and we will study this pebbling game in detail below.
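As a concrete illustration of these rules (a sketch of ours, using the convention from Sect. 5 that a challenge is answered one step after its node is pebbled), the following simulates the equidistant-pebbles strategy on the line graph and tallies its cumulative pebbling complexity:

```python
# Equidistant-pebbles strategy for the repeated-challenge game on the line graph.
import random

def cc_peb_equidistant(n: int, p: int, Q: int, seed: int = 0) -> int:
    rng = random.Random(seed)
    base = set(range(0, n, n // p))  # ~p equidistant pebbles, including the source v_0
    cc = len(base)                   # charge for the initial configuration P_0
    for _ in range(Q):
        chal = rng.randrange(n)
        cur = max(v for v in base if v <= chal)  # nearest pebble at or before chal
        while cur < chal:
            cur += 1                 # legal move: cur's predecessor held a pebble
            cc += len(base | {cur})  # charge |P_t| for this time step
        cc += len(base | {chal})     # the step in which the challenge is answered
    return cc

print(cc_peb_equidistant(n=1024, p=16, Q=100))  # roughly n*Q/2 for small p
```

The cost of this strategy grows as \(\varTheta (nQ)\), matching (up to constants) the lower bound proven in Sect. 5.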

3 Main Result and Overview

In this section, we state our main result, and give a brief high-level overview of the next sections.

Theorem 1

(Memory-hardness of \(\mathtt{Scrypt} \), main theorem). For any \(X\in \{0,1\}^w\) and \(n\ge 2\), if \(A^\mathsf {h}(X,n)\) outputs \(S_n=\mathtt{scrypt} ^{\mathsf {h}}(X,n)\) with probability \(\chi \), where the probability is taken over the choice of the random oracle \(\mathsf {h}\), then with probability (over the choice of \(\mathsf {h}\)) at least \(\chi - .08n^6\cdot 2^{-w}- 2^{-n/20}\),

$$\begin{aligned} \mathsf {cc_{mem}}(A^\mathsf {h}(X)) > \frac{1}{25} \cdot n^2 \cdot \left( w - 4 \log n\right) . \end{aligned}$$

We note that if w is large enough in terms of n (say, \(4 \log n \le w/2\), which clearly holds for typical values \(w=256,n=2^{20}\)), then \(\mathsf {cc_{mem}}(A^\mathsf {h}(X))\) is in \(\varOmega (n^2 w)\). As discussed in the introduction, this is the best possible bound up to constant factors, as already the (sequential) naïve algorithm for evaluating \(\mathtt{scrypt} ^\mathsf {h}\) has \(\mathsf {cc_{mem}}\in O(n^2w)\). We discuss the constants following Theorem 5.
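As a quick numeric illustration of the bound (our own arithmetic, with logarithms to base 2):

```python
# Evaluate the Theorem 1 bound for the typical parameters w = 256, n = 2**20.
import math

w, n = 256, 2**20
assert 4 * math.log2(n) <= w / 2            # 80 <= 128, so the bound is Omega(n^2 w)
bound = n**2 / 25 * (w - 4 * math.log2(n))  # lower bound on cc_mem, in bit-rounds
naive = n**2 * w                            # order of the naive algorithm's cc_mem
print(f"bound ~ 2^{math.log2(bound):.1f}, within a factor "
      f"{naive / bound:.0f} of the naive upper bound")
```

For these parameters the lower bound is within a small constant factor (about 36) of the naive algorithm’s upper bound, illustrating the optimality discussion above.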

Proof outline. The proof consists of three parts outlined below. The first two parts, in fact, will give rise to statements of independent interest, which will then be combined into the proof of our main theorem.

  • Section 4: Single-shot time complexity. To start with, we consider a pROM game where the adversary \(A^\mathsf {h}(X)\) starts its execution with input X and an M-bit state \(\sigma _0\) that can depend arbitrarily on \(\mathsf {h}\) and X. Then, \(A^\mathsf {h}(X)\) is given a random challenge \(j\in \{0,\ldots ,n-1\}\) and must return \(X_j = \mathsf {h}^j(X)\).

    Clearly, \(\sigma _0\) may contain \(X_j\), and thus in the best case, A may answer very quickly, but this should not be true for all challenges if \(M\ll nw\). We will prove a lower bound on the expected time complexity (in the pROM) of answering such a challenge. We will show that with good probability (e.g., \(\frac{1}{2}\)) over the choice of j, \(A^{\mathsf {h}}(X)\) needs at least (roughly) nw / M steps.

    This validates in particular the intuition that the adversary in this game cannot do much better than an adversary in the corresponding pebbling game on the line graph with vertices \(v_0, v_1, \ldots , v_{n-1}\), where the adversary gets to choose an initial configuration with \(p = M/w\) pebbles, and is then asked to put a pebble on \(v_j\) for a random \(j \in \{0, 1, \ldots , n-1\}\). Here, one can show that at least n / p steps are needed with good probability. In fact, this pebbling game is equivalent to a variant of the above pROM game where the adversary only stores random-oracle output labels, and thus our result shows that an adversary cannot do much better than storing whole labels.

  • Section 5: Multi-challenge cumulative pebbling complexity. In the above scenario, we have only considered the time needed to answer a challenge. There is no guarantee, a priori, that the cumulative complexity is also high: An optimal adversary, for instance, stores p labels corresponding to equidistant pebbles, and then computes the challenge from the closest label, dropping the remainder of the memory contents.

    Here, for the randomized pebbling game with \(Q\) challenges on the line graph, we will show a lower bound of \(\varOmega (n Q)\) on the cumulative pebbling complexity. Our argument will use in particular a generalization of the above single-shot trade-off theorem, i.e., the fact that whenever p pebbles are placed on the line, at least n / p steps are needed with good probability to pebble a randomly chosen node. We will use this to lower bound the cumulative complexity before each particular challenge is answered. Our proof gives a substantial quantitative improvement over the looser lower bound of [6].

  • Section 6: \(\mathsf {cc_{mem}}\) of \(\mathtt{scrypt} \). Finally, we lower bound the cumulative memory complexity of \(\mathtt{scrypt} ^\mathsf {h}\) as stated in Theorem 1. Unfortunately, this does not follow by a reduction from the pebbling lower bound directly. Indeed, as discussed in the introduction (and as explained in [6]), unlike for data-independent MHFs, for data-dependent MHFs like \(\mathtt{scrypt} \) it is an open problem whether one can translate lower bounds on the cumulative pebbling complexity to lower bounds for cumulative memory complexity. Fortunately, however, through a careful analysis, we will be able to employ the same arguments as in the proof of Sect. 5 in the pROM directly.

    In particular, we will use our result from Sect. 4 within an argument along the lines of that of Sect. 5, carried out directly in the pROM. One particularly delicate technical issue we have to address is the fact that in \(\mathtt{scrypt} ^\mathsf {h}\) the challenges are not sampled randomly, but will depend on the random oracle \(\mathsf {h}\), which the adversary can query. We will provide more intuition below in Sect. 6.

Remark 1

Note that in Theorem 1 above, the random oracle \(\mathsf {h}\) is sampled uniformly after the input X is chosen arbitrarily. This is equivalent to saying that X and \(\mathsf {h}\) are independent. In practice this assumption is usually (nearly) satisfied. For example, when used in a blockchain, X will be the output of \(\mathsf {h}\) on some previous inputs, typically a hash of the last block and a public key. This doesn’t make X independent of \(\mathsf {h}\), but its distribution will be dense in the uniform distribution even conditioned on \(\mathsf {h}\), which means it is very close to independent. When used for password hashing, \(X=\mathsf {h}(pwd,S)\) for a password pwd and a random salt S. For a sufficiently long salt, this will make X as good as uniform [11]. We defer a more rigorous treatment of this issue to the full version of this paper.

4 Time Complexity of Answering a Single Challenge in the Parallel Random Oracle Model

We prove the following theorem, and below discuss briefly how this result can be extended beyond the setting of \(\mathtt{scrypt} \).

Fix positive integers n, u and w, a string \(X \in \{0,1\}^u\), a finite domain \(\mathcal {D}\) that contains at least \(\{X\}\,\cup \, \{0,1\}^w\), and let \(\mathcal {R}=\{0,1\}^w\). Given a function \(\mathsf {h}:\mathcal {D}\rightarrow \mathcal {R}\), define \(X_i = \mathsf {h}^i(X)\). Let A be any oracle machine (in the parallel random oracle model as defined in Sect. 2) that on any input and oracle makes at most \(q-1\) total queries to its oracle. Suppose \(A^\mathsf {h}(X,j)\) starts on input state \(\sigma _0\) with the goal of eventually querying \(X_j\) to \(\mathsf {h}\). Let \(t_j\) be the number of the earliest round in which \(A^\mathsf {h}(X,j)\) queries \(X_j\) to \(\mathsf {h}\) (with \(t_j=\infty \) if never). We show that A cannot do much better than if it were doing the following in the corresponding random challenge pebbling game on the line graph: initially placing \(p\approx M/w\) equidistant pebbles, and then pebbling the challenge from the closest pebble preceding it.

Theorem 2

(Single-Challenge Time Lower Bound). There exists a set of random oracles \(\mathsf {good}_\mathsf {h}\) such that \(\Pr _{\mathsf {h}\in \mathcal {R}^\mathcal {D}} [\mathsf {h}\notin \mathsf {good}_\mathsf {h}] \le qn^32^{-w}\), and for every \(\mathsf {h}\in \mathsf {good}_\mathsf {h}\), the following holds: for every memory size M, and every input state \(\sigma _0\) of length at most M bits,

$$ {\mathop {\Pr }\limits _{j\leftarrow \{0,\ldots ,n-1\}}}\left[ t_j> \frac{n}{2p} \right] \ge \frac{1}{2}, $$

where the probability is taken only over the challenge j, and where \(p=\lceil (M+1)/(w-2\log n -\log q)+1 \rceil \).
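As a sanity check on the parameters (our own arithmetic): for, say, \(w=256\), \(n=2^{20}\), and \(q=2^{40}\), we have \(w-2\log n-\log q=256-40-40=176\), so \(p\approx M/176\). More generally, whenever \(2\log n+\log q\le w/2\), we get \(p\le 2M/w+O(1)\); that is, p is, up to a small constant factor, the number of w-bit labels fitting into M bits, exactly as the pebbling intuition suggests.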

We will actually prove a slightly more general result: for any \(0\le \mathrm {pr}_\mathrm {hard}\le 1\),

$$ {\mathop {\Pr }\limits _{j\leftarrow \{0,\ldots ,n-1\}}}\left[ t_j> \frac{n(1-\mathrm {pr}_\mathrm {hard})}{p} \right] \ge \mathrm {pr}_\mathrm {hard}. $$

Proof

Recall that for each j, A performs \(t_j\) rounds of the following process: at round k, it reads an input state containing oracle responses \(\mathsf {h}({\mathbf q} _{k-1})\) (except for \(k=1\), when A reads \(\sigma _0\)), and then (after arbitrary computation) produces an output state containing oracle queries \({\mathbf q} _k\). We count rounds starting from 1. Consider the sequence of such tuples of queries and responses to and from \(\mathsf {h}\). If the first appearance of \(X_i\) in this sequence is a query to \(\mathsf {h}\) in round k (\(k>0\) is minimal such that \(X_i \in {\mathbf q} _k\)), then we assign \(X_i\) position \(\pi _{ij}=k\). If instead the first appearance of \(X_i\) is a response from \(\mathsf {h}\) to query \(X_{i-1}\) made at round k (\(k>0\) is minimal such that \(X_{i-1}\in {\mathbf q} _{k}\)), then we assign \(X_i\) position \(\pi _{ij}=k+1/2\). In all other cases (i.e., if \(X_i\) does not appear, or appears only because of a hash collision in response to some query that is not \(X_{i-1}\)), let \(\pi _{ij}=\infty \).

Let “best position” correspond to the earliest time, over all j, that \(X_i\) appears during the computation of \(X_j\): \(\beta _i :=\min _j \pi _{ij}\); let “best challenge” \(\mathsf {bestchal}_i\) be \({\mathop {\mathrm {argmin}}\nolimits _{j}} \pi _{ij}\) (if argmin returns a set, pick one element arbitrarily). Let i be “blue” if \(\beta _i\) is an integer (i.e., it was produced “out of the blue” by A as a query to \(\mathsf {h}\)).

Let \(B = \{i \text { s.t. } i> 0 \text { and } i \text { is blue} \}\) (that is, all the blue indices except 0). In the rest of the proof, we will show that the size of B cannot exceed \(p-1\) (for most \(\mathsf {h}\)), where p, as defined in the theorem statement, is proportional to the memory size M; and that the amount of time to answer the challenge is at least its distance from the preceding blue index. Thus, blue indices effectively act like pebbles, and the bounds on the time to reach a random node in the line graph by moving pebbles apply.

Claim 1

Given adversary A and input X, there exists a predictor algorithm \(\mathcal {P}\) (independent of \(\mathsf {h}\), but with oracle access to it) with the following property: for every \(\mathsf {h}\), every M, and every length M input state \(\sigma _0\) of A, there exists a hint of length \(|B|(2\log n + \log q)\) such that given \(\sigma _0\) and the hint, \(\mathcal {P}\) outputs every \(X_i\) for \(i\in B\) without querying \(X_{i-1}\) to \(\mathsf {h}\).

Moreover, if we want fewer elements, we can simply give a shorter hint: there exists a predictor algorithm that similarly outputs p elements of B whenever \(p\le |B|\), given \(\sigma _0\) and an additional \(p(2\log n + \log q)\)-bit hint.

Note that the inputs to \(\mathcal {P}\) can vary in size; we assume that the encoding of inputs is such that the size is unambiguous.

Proof

We will focus on the first sentence of the claim and address the second sentence at the end.

\(\mathcal {P}\) depends on input label \(X=X_0\) and algorithm A (which are independent of \(\mathsf {h}\)). \(\mathcal {P}\) will get the state \(\sigma _0\) of A (which may depend on \(\mathsf {h}\)) as input, and, for every \(i\in B\), a hint containing the challenge \(\mathsf {bestchal}_i\) for which \(X_i\) appears earliest, and the sequential order (among all the \(q-1\) queries A makes in answering \(\mathsf {bestchal}_i\)) of the first query to \(X_i\) (using the value q to indicate that this query never occurs). This hint (which depends on \(\mathsf {h}\)) will thus consist of a list of |B| entries, each containing \(i\in B\), \(\mathsf {bestchal}_i\), and \(\log q\) bits identifying the query number, for a total of \(|B|(2\log n + \log q)\) bits.

\(\mathcal {P}\) will build a table containing \(X_i\) for \(i\ge 0\) (initializing \(X_0=X\)). To do so, \(\mathcal {P}\) will run A on every challenge in parallel, one round at a time. After each round k, \(\mathcal {P}\) will obtain, from the output states of A, all the queries A makes for all the challenges in round k. Then \(\mathcal {P}\) will fill in some spots in its table and provide answers to these queries as input states for round \(k+1\) by performing the following three steps:

Step k: put any blue queries into its table (blue queries and their positions in the table can easily be recognized from the hint);

Step k+1/4: answer any query that can be answered using the table (i.e., any query that matches \(X_{i-1}\) in the table for some filled positions \(i-1\) and i);

Step k+1/2: send remaining queries to \(\mathsf {h}\), return the answers to A, and fill in any new spots in the table that can be filled in (i.e., for every query that matches \(X_{i-1}\) in the table for some filled-in position \(i-1\), fill in position i with the answer to that query).

Once every \(X_i\) for \(i\in B\) is in the table, \(\mathcal {P}\) queries \(\mathsf {h}\) to fill in the missing positions in the table, and outputs the prediction that \(\mathsf {h}(X_{i-1})=X_i\) for \(i\in B\).

To prove that \(\mathcal {P}\) simulates \(\mathsf {h}\) correctly to A, it suffices to show that the table contains correct labels. This can be easily argued by induction on i. Assume all the labels in the table are correct up to now. A new label \(X_i\) enters the table either because it is marked as blue (and thus correct by the hint) or is obtained as an answer from \(\mathsf {h}\) to the query that \(\mathcal {P}\) identified as \(X_{i-1}\) using the table (which is correct by inductive hypothesis).

The above also shows that \(\mathcal {P}\) will not output an incorrect prediction. It remains to show that \(\mathcal {P}\) did not query to \(\mathsf {h}\) the value \(X_{i-1}\) for any \(i\in B\). To prove this, we first show that \(X_i\) is placed into the table no later than step \(\beta _i\) of \(\mathcal {P}\), by induction on \(\beta _i\). The base case is \(X_0\), which is in the table at step 0. If \(\beta _i\) is an integer, then \(i\in B\) and this is true because of step \(\beta _i\) of \(\mathcal {P}\). If \(\beta _i\) is not an integer, then \(\beta _{i-1}<\beta _i\) (because \(X_{i-1}\) appears as a query at round \(\lfloor \beta _i \rfloor \)), so at the beginning of step \(\beta _i\) of \(\mathcal {P}\), by the inductive hypothesis, position \(i-1\) in the table will already contain \(X_{i-1}\), and thus position i will get filled in when \(X_{i-1}\) gets queried to \(\mathsf {h}\).

Note also that \(X_i\) cannot be placed into the table earlier than step \(\beta _i\), so it is placed in the table exactly at step \(\beta _i\) (as long as \(\beta _i\ne \infty \), in which case it is placed into the table at the end, when \(\mathcal {P}\) fills in the missing positions).

Now suppose, for purposes of contradiction, that \(\mathcal {P}\) queries \(\mathsf {h}\) for some value \(X_{i-1}\) for some \(i\in B\). That can happen only if at the end of some round k, \(X_{i-1}\) is queried by A as part of the output state, but either \(X_{i-1}\) or \(X_i\) is not in the table at that time.

  • If \(X_{i-1}\) is not in the table at the beginning of step \(k+1/2\) of \(\mathcal {P}\), then \(\beta _{i-1}\ge k+1/2\); but since \(X_{i-1}\) is being queried at the end of round k, \(\beta _{i-1}\le k\), which is a contradiction.

  • If \(X_i\) is not in the table at the beginning of step \(k+1/2\) of \(\mathcal {P}\), then \(\beta _i\ge k+1\) (because \(\beta _i\) is an integer); but since \(X_{i-1}\) appears as query in the output state of round k, \(\beta _i \le k+1/2\), which is also a contradiction.

Thus, \(\mathcal {P}\) always achieves its goal.

For the second sentence of the claim, observe that we can simply give \(\mathcal {P}\) the hint for the p blue labels with the smallest \(\beta \) values.    \(\square \)

In the next claim, we show that for every input to \(\mathcal {P}\), the algorithm \(\mathcal {P}\) cannot be correct for too many oracles \(\mathsf {h}\).

Claim 2

Fix an algorithm \(\mathcal {P}\) and fix its input, a positive integer p, some domain \(\mathcal {D}\), and range \(\mathcal {R}\). For \(\mathsf {h}: \mathcal {D}\rightarrow \mathcal {R}\), call \(\mathcal {P}^\mathsf {h}\) successful if \(\mathcal {P}\) with oracle access to \(\mathsf {h}\) outputs p distinct values \(x_1, \dots , x_p\in \mathcal {D}\) and \(\mathsf {h}(x_1), \dots , \mathsf {h}(x_p)\) without querying \(\mathsf {h}\) on any of \(x_1, \dots , x_p\). Then \(\Pr _{\mathsf {h}\in \mathcal {R}^\mathcal {D}} [\mathcal {P}^\mathsf {h}\text { is successful}]\le |\mathcal {R}|^{-p}\).

Proof

Instead of choosing \(\mathsf {h}\) all at once, consider the equivalent view of choosing answers to fresh queries of \(\mathcal {P}\) uniformly at random, and then choosing the remainder of \(\mathsf {h}\) uniformly at random after \(\mathcal {P}\) produces its output. Since \(\mathcal {P}\) does not query \(x_1, \dots , x_p\), the choices of \(\mathsf {h}\) on those points will agree with the predicted values \(\mathsf {h}(x_1), \dots , \mathsf {h}(x_p)\) with probability at most \(|\mathcal {R}|^{-p}\).    \(\square \)

Using the previous two claims, we now bound the number of random oracles for which the size of the blue set is too large. Recall that B is the set of blue indices excluding 0.

Claim 3

Given adversary A, there exists a set of random oracles \(\mathsf {good}_\mathsf {h}\) such that \(\Pr [\mathsf {h}\notin \mathsf {good}_\mathsf {h}]\le q n^3 2^{-w}\), and for every \(\mathsf {h}\in \mathsf {good}_\mathsf {h}\), every M, and every initial state \(\sigma _0\) of A of size at most M bits, \(|B|\le p-1\), where \(p=\lceil (M+1)/(w-2\log n -\log q)+1 \rceil \).

Proof

The intuition is as follows: if for some \(\mathsf {h}\) and some initial input state of length M, \(|B|>p-1\), then either \(\mathcal {P}\) successfully predicts the output of \(\mathsf {h}\) on p distinct inputs (by Claim 1), or some of the values among \(X_0, \dots , X_{n-1}\) are not distinct. We will define \(\mathsf {bad}_\mathsf {h}\) as the set of random oracles for which this can happen, and then bound its size.

Let \(\mathcal {S}\) be the size of the space of all possible random oracles \(\mathsf {h}\). There are at most \(\frac{1}{2} \mathcal {S} n^2 2^{-w}\) random oracles for which some of the values among \(X_0, \dots , X_{n-1}\) are not distinct (suppose the first colliding pair is \(i,j < n\), so that \(X_{i-1} \ne X_{j-1}\); the probability that \(X_i=X_j\) is then \(2^{-w}\), and the bound follows by a union bound over at most \(n^2/2\) pairs (i, j)). Call this set of random oracles \(\mathsf {colliding}\).

In the next paragraph, we will formally define the set \(\mathsf {predictable}\) as the set of random oracles for which \(\mathcal {P}\) correctly predicts the output on p distinct inputs given the M-bit input state of A and an additional \(p(2\log n + \log q)\)-bit hint. We will bound the size of \(\mathsf {predictable}\) by bounding it for every possible memory state of A and every possible hint, and then taking the union bound over all memory states and hints.

Consider a particular input state of length M for A; recall that \(p=\lceil (M+1)/(w-2\log n -\log q)+1 \rceil \). Assume \(1\le p\le n-1\) (otherwise, the statement of Claim 3 is trivially true). Fix a particular value of the hint for \(\mathcal {P}\) for predicting p elements of B. (Recall that the hint was previously defined dependent on the random oracle; we are now switching the order of events by fixing the hint first and then seeing for how many random oracles this hint can work.) Since the input to \(\mathcal {P}\) is now fixed, there are at most \(\mathcal {S}2^{-pw}\) random oracles for which it can correctly output p distinct values without querying them, by Claim 2. The set \(\mathsf {predictable}\) consists of all such random oracles, for every value of M such that \(p\le n\), every M-bit input state \(\sigma _0\), and every hint.

To count how many random oracles are in \(\mathsf {predictable}\), first fix p. Let \(M_p\) be the largest input state length that gives this particular p. Take all input state lengths that give this p, all possible input states of those lengths (there are at most \(2^{M_p}+2^{M_p-1}+\dots +1<2^{M_p+1}\) of them), and all possible hints for extracting p values (there are at most \(2^{p(2\log n + \log q)}\) of them). This gives us at most \(\mathcal {S}2^{(M_p+1)+p(2\log n + \log q-w)}\) random oracles in \(\mathsf {predictable}\). Since \((M_p+1) \le (p-1) (w-2\log n - \log q)\) by definition of p, this number does not exceed \(\mathcal {S}2^{(2\log n + \log q-w)} = \mathcal {S}n^2q2^{-w}\). Now add up over all possible values of p (from 2 to n), to get \(|\mathsf {predictable}|\le \mathcal {S} (n-1)n^2q 2^{-w}\).

Set \(\mathsf {bad}_\mathsf {h}= \mathsf {colliding}\cup \mathsf {predictable}\) and let \(\mathsf {good}_\mathsf {h}\) be the complement of \(\mathsf {bad}_\mathsf {h}\).

   \(\square \)

Claim 4

For every i, \(0\le i< n\), the value \(t_i\) is at least \(1+i-j\), where \(j=\max \{a \le i \,:\, a \text { is blue}\}\).

Proof

If i is blue, we are done, since \(t_i\ge 1\) simply because we start counting rounds from 1.

We will first show that \(\lceil \beta _i-\beta _j \rceil \ge i-j\). Fix a blue j and proceed by induction on i such that \(i> j\) and there are no blue indices greater than j and less than i.

For the base case, suppose \(i=j+1\). Recall that \(\beta _i\) is not an integer because i is not blue. Then \(\beta _{i-1} \le \beta _i - 1/2\), because \(X_{i-1}\) is present as the query to \(\mathsf {h}\) that produces response \(X_i\) in the sequence of queries that A makes when responding to the challenge \(\mathsf {bestchal}_i\), and we are done. For the inductive case, it suffices to show that \(\beta _{i-1} \le \beta _i -1\), which is true by the same argument as for the base case, except that we add that \(\beta _{i-1}\) is also not an integer (since \(i-1\) is also not blue).

Therefore, \(\lceil \beta _i \rceil \ge i-j+1\), because \(\beta _j \ge 1\). We thus have \(\pi _{ii} = t_i \ge \lceil \beta _i \rceil \ge i-j+1\).    \(\square \)

The number of blue indices (namely, \(|B|+1\), because 0 is blue but not in B) is at most p if \(\mathsf {h}\in \mathsf {good}_\mathsf {h}\). Since at most d indices are within distance \(d-1\) after any given blue index, and there are at most p blue indices, we can plug in \(d=n(1-\mathrm {pr}_\mathrm {hard})/p\) to get

$$ {\mathop {\Pr }\limits _{i}}\left[ t_i\le \frac{n(1-\mathrm {pr}_\mathrm {hard})}{p}\right] \le 1-\mathrm {pr}_\mathrm {hard}. $$

This concludes the proof of Theorem 2.    \(\square \)

Generalizing to other graphs. In general, every single-source directed acyclic graph G defines a (data-independent) function whose evaluation on input X corresponds to labeling G as follows: The source is labeled with X, and the label of every node is obtained by hashing the concatenation of the labels of its predecessors. Rather than evaluating this function, one can instead consider a game with challenges, where in each round, the adversary needs to compute the label of a random challenge node from G. Theorem 2 above can be seen as dealing with the special case where G is a line graph.

Theorem 2 can be generalized, roughly as follows. We replace \(\log q\) with \(\log q+\log d\) (where d is the degree of the graph), because identifying blue nodes now requires both the query number and the position within the query. We modify the proof of Claim 1 to account for the fact that the random oracle query resulting in response \(X_i\) is not necessarily \(X_{i-1}\), but a concatenation of labels. The only conceptually significant change due to this generalization is in Claim 4, whose generalized statement is as follows. For every node i, define the “limiting depth of i” to be the length of a longest possible path that starts at a blue node, goes through no other blue nodes, and ends at i. The generalized version of Claim 4 states that the amount of time required to query \(X_i\) is at least one plus the limiting depth of node i.

With this more general claim in place, it follows that

$$ {\mathop {\Pr }\limits _{i}}\left[ t_i> m\right] \ge \frac{1}{2}, $$

where m is defined as follows. Let S denote the set of blue nodes, and let \(m_S\) denote the median limiting depth of the nodes in G when the blue set is S. We define m to be the minimum of \(m_S\) over all S such that the source is in S and \(|S|=p\).

Of course, other statistical properties of the distribution of \(t_i\) can also be deduced from this claim if we use another measure instead of the median. Essentially, the generalized theorem would show that the best the adversary can do is place p pebbles on the graph and use parallel pebbling.
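To make the generalized claim concrete, here is a minimal sketch (ours; the graph representation and names are our own) computing the limiting depth of every node by dynamic programming over a topological order:

```python
# Limiting depth of v: length of the longest path that starts at a blue node,
# passes through no other blue node, and ends at v.
from typing import Dict, List, Set

def limiting_depths(topo_order: List[int],
                    preds: Dict[int, List[int]],
                    blue: Set[int]) -> Dict[int, int]:
    depth: Dict[int, int] = {}
    for v in topo_order:          # predecessors are processed before v
        if v in blue:
            depth[v] = 0          # a qualifying path may start (length 0) at v itself
        else:
            best = max((depth[u] for u in preds.get(v, []) if u in depth),
                       default=None)
            if best is not None:  # nodes unreachable from blue nodes stay undefined
                depth[v] = best + 1
    return depth

# On the line graph this recovers Claim 4: depth[i] = i - max{blue a <= i}.
line = {i: [i - 1] for i in range(1, 8)}
print(limiting_depths(list(range(8)), line, blue={0, 3}))
```

On this example line graph with blue set {0, 3}, the output is {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 3, 7: 4}, matching the \(i-j\) formula of Claim 4.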

5 Cumulative Complexity of Answering Repeated Challenges in the Parallel Pebbling Model

In the previous part, we showed that, in the parallel random oracle model, an adversary with memory (input state) of size M cannot do much better when answering a random challenge than placing \(\mathsf {p}\approx M/w\) pebbles on the graph and pebbling. In this section, we prove a lower bound on the cumulative complexity of repeated random challenges in the pebbling model. While the result in this section does not directly apply to the random oracle model for reasons explained in Sect. 5.1, all of the techniques are used in the proof of our main theorem in Sect. 6.

The Single Challenge Pebbling Game. Consider now the pebbling game for the line graph G consisting of nodes \(v_0, \dots , v_{n-1}\) and edges \((v_i, v_{i+1})\) for every \(0\le i <n-1\). Recall that in this game, at each time step t starting with \(t=0\), the adversary maintains a subset \(P_t\) of nodes (“pebbles”). If there is a pebble on a node at time \(t-1\), its successor is allowed (but not required) to get a pebble at time t. Formally, at time \(t>0\), the set \(P_t\) must be a subset of \(P_{t-1}\cup \{ v_{i+1}: v_i\in P_{t-1} \}\). Also recall that we modify the game of [8] slightly by not permitting the adversary to put a pebble on the source for free: \(v_0\) is contained in \(P_0\) and cannot be added to \(P_t\) if it is absent in \(P_{t-1}\) (this change simplifies calculations). Let \(\mathsf {p}_i=|P_i| \ge 1\).

We will say that the adversary answers a challenge \(\mathsf {chal}\) (for \(0\le \mathsf {chal}<n\)) in t steps if \(t>0\) is the earliest time when \(v_\mathsf {chal}\in P_{t-1}\) (note that a pebble needs to be on \(v_\mathsf {chal}\) at time \(t-1\)—think of time t as the step when the output to the challenge is presented; this convention again simplifies calculations, and intuitively corresponds to the \(\mathtt{scrypt} \) evaluation, in which the “output” step corresponds to querying \(X_\mathsf {chal}\oplus S_i\) in order to advance to the next challenge).

It is easy to see that t is at least one plus the distance between \(\mathsf {chal}\) and the nearest predecessor of \(\mathsf {chal}\) in \(P_0\). Therefore, for the same reason as in the proof of Theorem 2 (at most \(n/(2\mathsf {p}_0)\) challenges are within distance \(n/(2\mathsf {p}_0)-1\) after a particular node in \(P_0\), and there are \(\mathsf {p}_0\) nodes in \(P_0\)),

$$ {\mathop {\Pr }\limits _{\mathsf {chal}}}\left[ t> \frac{n}{2\mathsf {p}_0}\right] \ge \frac{1}{2}. $$

More generally, the following is true for any \(0\le \mathrm {pr}_\mathrm {hard}\le 1\):

Fact 3

$$ {\mathop {\Pr }\limits _{\mathsf {chal}}}\left[ t> \frac{c}{\mathsf {p}_0}\right] \ge \mathrm {pr}_\mathrm {hard}\quad \text {with}\quad c = n (1-\mathrm {pr}_\mathrm {hard}). $$

Repeated Challenges Pebbling Game. We now consider repeated challenges. At time \(s_1=0\), the adversary receives a challenge \(c_1\), \(0\le c_1<n\). The adversary answers this challenge at the earliest moment \(s_2>s_1\) when \(P_{s_2-1}\) contains \(v_{c_1}\); after \(P_{s_2}\) is determined, the adversary receives the next challenge \(c_2\), and so on, for \(Q\) challenges, until challenge \(c_{Q}\) is answered at time \(s_{Q+1}\). We are interested in the cumulative pebbling complexity \(\mathsf {cc_{peb}}=\sum _{t=0}^{s_{Q+1}} \mathsf {p}_t\).

Note that the adversary can adaptively vary the number of pebbles used throughout the game, while Fact 3 above addresses only the number of pebbles used before a challenge is known. Nevertheless, we are able to show the following.

Theorem 4

(Cumulative pebbling complexity of repeated challenges game). The cumulative pebbling complexity of the repeated challenges pebbling game is with high probability \(\varOmega (nQ)\).

More precisely, suppose the adversary never has fewer than \(\mathsf {p}_0\) pebbles. Then for any \(\epsilon >0\), with probability at least \(1-e^{-2\epsilon ^2Q}\) over the choice of the \(Q\) challenges,

$$ \mathsf {cc_{peb}}\ge \mathsf {p}_0+\frac{n}{2}\cdot Q\cdot \left( \frac{1}{2}-\epsilon \right) \cdot \ln 2. $$

More generally, if we replace the condition that the adversary never has fewer than \(\mathsf {p}_0\) pebbles with the condition \(\mathsf {p}_t\ge \mathsf {p}_\mathrm {min}\) for some \(\mathsf {p}_\mathrm {min}\) and every \(t\ge 1\), then we need to replace \(\ln 2\) with

$$ \ln \left( 1+\left( \frac{\mathsf {p}_\mathrm {min}}{\mathsf {p}_0}\right) ^{\frac{1}{Q\left( \frac{1}{2}-\epsilon \right) }}\right) . $$
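As a quick consistency check (ours): when \(\mathsf {p}_\mathrm {min}=\mathsf {p}_0\), the ratio inside the logarithm is 1, so the expression evaluates to \(\ln (1+1)=\ln 2\), recovering the bound displayed above.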

This result improves [6, Theorem 1] by eliminating the \(\log ^2 n\) factor from the cumulative memory complexity of the pebbling game. In the full version [7], we discuss the general case of \(\mathsf {p}_t\ge \mathsf {p}_\mathrm {min}\) and present an attack showing that a bound like the one above is necessary (up to constant factors in the exponent).

Our approach is general enough to apply to space-time tradeoffs other than inverse proportionality, to other graphs, and even to some other models of computation that do not deal with pebbling. However, we will explain in Sect. 5.1 why it cannot be used without modification in the parallel random oracle model and other models where space is measured in bits of memory.

Proof

Recall time starts at 0, \(\mathsf {p}_t\) denotes the number of pebbles at time t, and \(s_i\) denotes the moment in time when challenge number i (with \(1\le i\le Q\)) is issued. Let \(t_i\) denote the amount of time needed to answer challenge number i (thus, \(s_1 = 0\) and \(s_{i+1}=s_i+t_i\); let \(s_{Q+1} = s_Q+t_Q\)). Let \({\mathsf {cc}} (t_1, t_2)\) denote \(\sum _{t=t_1}^{t_2} \mathsf {p}_t\).

The Main Idea of the Proof. The difficulty in the proof is that we cannot use \(t_i\) to infer anything about the number of pebbles used during each step of answering challenge i. All we know is that the number of pebbles has to be inversely proportional to \(t_i\) immediately before the challenge was issued—but the adversary can then reduce the number of pebbles used once the challenge is known (for example, by keeping pebbles only on \(v_0\) and on the predecessor of the challenge).

The trick to overcome this difficulty is to consider how many pebbles the adversary has to have in order to answer the next challenge not only immediately before the challenge, but one step, two steps, three steps, etc., earlier.

Warm-Up: Starting with a Stronger Assumption. For a warm-up, consider the case when the pebbles/time tradeoff is guaranteed (rather than probabilistic, as in Fact 3): assume, for now, that in order to answer the next random challenge in time t, it is necessary to have a state of size c / t right before the challenge is issued. Now apply this stronger assumption not only to the moment s in time when the challenge is issued, but also to a moment in time some j steps earlier. The assumption implies that the number of pebbles needed at time \(s-j\) is at least \(c/(j+t)\) (because the challenge was answered in \(j+t\) steps starting from time \(s-j\), which would be impossible with a lower number of pebbles even if the challenge had been already known at time \(s-j\)).

We will use this bound for every challenge number \(i\ge 2\), and for every \(j=0,\ldots ,t_{i-1}-1\), i.e., during the entire time the previous challenge is being answered. Thus, the cumulative pebbling complexity during the time period of answering challenge \(i-1\) is at least

$$ \begin{aligned} {\mathsf {cc}} (s_{i-1}+1, s_i)&\ge \sum _{j=0}^{t_{i-1}-1} \mathsf {p}_{s_i-j} \ge c\left( \frac{1}{t_i}+\frac{1}{t_i+1} + \dots + \frac{1}{t_i+t_{i-1}-1}\right) \\&\ge c \int _{t_i}^{t_{i-1}+t_i} \frac{dx}{x} = c(\ln (t_{i-1}+t_i) - \ln t_i). \end{aligned} $$

Then adding these up for each i between 2 and \(Q\), we get the cumulative pebbling complexity of

$$ {\mathsf {cc}} (1, s_{Q+1}) \ge c \sum _{i=2}^{Q} (\ln (t_{i-1}+t_i) - \ln t_i). $$

If all \(t_i\) are equal (which is close to the minimum, as we will show below), this becomes \(c (Q-1) \cdot \ln 2\).
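The integral estimate underlying this computation is easy to verify numerically (a sketch of ours): with all \(t_i\) equal to t, each per-challenge term \(\sum _{j=0}^{t-1} 1/(t+j)\) is at least \(\ln 2\), approaching it from above:

```python
# Each per-challenge harmonic term of the warm-up bound, compared to ln 2.
import math

def term(t: int) -> float:
    return sum(1.0 / (t + j) for j in range(t))  # 1/t + 1/(t+1) + ... + 1/(2t-1)

for t in (1, 10, 100, 1000):
    print(t, round(term(t), 4), ">=", round(math.log(2), 4))
```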

Back to the Actual Assumption. The proof is made messier by the fact that the bound in the assumption is not absolute. Moreover, the bound does not give the number of pebbles in terms of running time, but rather running time in terms of the number of pebbles (it makes no sense to talk probabilistically of the number of pebbles, because the number of pebbles is determined by the adversary before the challenge is chosen). To overcome this problem, we look at the number of pebbles at all times before \(s_i\) and see which one gives us the best lower bound on \(t_i\).

Consider a point \(t\le s_i\) in time. We can apply Fact 3 to the size \(\mathsf {p}_t\) of the set of pebbles \(P_t\) at time t, because the ith challenge is selected at random after the adversary determines \(P_t\). The ith challenge will be answered \(t_i+(s_i-t)\) steps after time t; thus, with probability at least \(\mathrm {pr}_\mathrm {hard}\) over the choice of the ith challenge, \(t_i + (s_i-t) > c/\mathsf {p}_t\), i.e., \(t_i > c/\mathsf {p}_t - (s_i-t)\). Let \(r_i\) be a moment in time that gives the best bound on \(t_i\):

$$ r_i={\mathop {\hbox {argmax}}\limits _{0\le t \le s_i}} \left( \frac{c}{\mathsf {p}_t} - (s_i-t)\right) . $$

Call the ith challenge “hard” if \(t_i + (s_i-r_i) > c/\mathsf {p}_{r_i}\). We claim that if challenge i is hard, then the same fact about the number of pebbles j steps before the challenge as we used in the warm-up proof holds.

Claim 5

If challenge i is hard, then for any j, \(0\le j \le s_i\), \(\mathsf {p}_{s_i-j}> c/(t_i+j)\).

Proof

Indeed, let \(t=s_i-j\). Then \(c/\mathsf {p}_{s_i-j} - j = c/\mathsf {p}_{t} - (s_i-t) \le c/\mathsf {p}_{r_i} - (s_i-r_i)\) by the choice of \(r_i\). This value is less than \(t_i\) by definition of a hard challenge. Therefore, \(c/\mathsf {p}_{s_i-j} - j < t_i\) and the result is obtained by rearranging the terms.

   \(\square \)

We now claim that with high probability, the number of hard challenges is sufficiently high.

Claim 6

For any \(\epsilon >0\), with probability at least \(1-e^{-2\epsilon ^2Q}\), the number \(H\) of hard challenges satisfies \(H\ge Q(\mathrm {pr}_\mathrm {hard}-\epsilon )\).

Proof

The intuitive idea is to apply Hoeffding’s inequality [17], because challenges are independent. However, the hardness of challenges is not independent, because it may be (for example) that one particular challenge causes the adversary to slow down for every subsequent challenge. Fortunately, it can only be “worse than independent” for the adversary. Specifically, for any fixing of the first \(i-1\) challenges \(c_1, \dots , c_{i-1}\), we can run the adversary up to time \(s_i\); at this point, time \(r_i\) is well defined, and we can apply Fact 3 to \(r_i\) to obtain that \(\Pr [c_i \text { is hard } \,|\, c_1, \dots , c_{i-1}]\ge \mathrm {pr}_\mathrm {hard}\). This fact allows us to apply the slightly generalized version of Hoeffding’s inequality stated in Claim 7 (setting \(V_i=1\) if \(c_i\) is hard and \(V_i= 0\) otherwise) to get the desired result.    \(\square \)

Claim 7

(Generalized Hoeffding’s inequality). If \(V_1, V_2, \dots , V_Q\) are binary random variables such that for any i (\(0 \le i < Q\)) and any values of \(v_1, v_2, \dots , v_{i}\), \(\Pr [V_{i+1} = 1 \,|\, V_1=v_1, \dots , V_i=v_i]\ge \rho \), then for any \(\epsilon >0\), with probability at least \(1-e^{-2\epsilon ^2Q}\), \(\sum _{i=1}^QV_i \ge Q(\rho -\epsilon )\).

Proof

For \(0\le i < Q\), define the binary random variable \(F_{i+1}\) as follows: for any fixing of \(v_1, \dots , v_i\) such that \(\Pr [V_1=v_1, \dots , V_i=v_i]>0\), let \(F_{i+1} = 1\) with probability \(\rho /\Pr [V_{i+1} = 1 \,|\, V_1=v_1, \dots , V_i=v_i]\) and 0 otherwise, independently of \(V_{i+1}, \dots , V_Q\). Let \(W_{i+1} = V_{i+1} \cdot F_{i+1}\). Note that \(\Pr [W_{i+1}=1]=\rho \) regardless of the values of \(V_1, \dots , V_i\), and thus \(W_{i+1}\) is independent of \(V_1, \dots , V_i\). Since \(F_1, \dots , F_i\) are correlated only with \(V_1, \dots , V_i\), we have that \(W_{i+1}\) is independent of \((V_1, \dots , V_i, F_1, \dots , F_i)\), and thus independent of \(W_1, \dots , W_i\). Therefore, \(W_1, \dots , W_Q\) are mutually independent (this standard fact can be shown by induction on the number of variables), and thus \(\sum _{i=1}^QV_i \ge \sum _{i=1}^QW_i \ge Q(\rho -\epsilon )\) with probability at least \(1-e^{-2\epsilon ^2Q}\) by Hoeffding’s inequality.    \(\square \)
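
The decoupling trick in this proof is easy to see in simulation. In the Python sketch below (the correlated distribution of the \(V_i\) is an arbitrary example of ours, not anything derived from the adversary), each \(W_i=V_i\cdot F_i\) is Bernoulli(\(\rho \)) regardless of history, and the empirical tails of both \(\sum V_i\) and \(\sum W_i\) stay below the Hoeffding bound:

import math, random

rho, Q, eps, trials = 0.5, 200, 0.1, 20000

def one_run():
    vs, ws = [], []
    for _ in range(Q):
        # Conditional probability of V = 1 depends on history but is always >= rho.
        pr = rho if (vs and vs[-1] == 0) else min(1.0, rho + 0.3)
        v = 1 if random.random() < pr else 0
        f = 1 if random.random() < rho / pr else 0  # F = 1 w.p. rho/pr, independently
        vs.append(v)
        ws.append(v * f)                            # W = V*F is always Bernoulli(rho)
    return sum(vs), sum(ws)

low_v = low_w = 0
for _ in range(trials):
    sv, sw = one_run()
    low_v += sv < Q * (rho - eps)
    low_w += sw < Q * (rho - eps)
print(low_v / trials, low_w / trials, math.exp(-2 * eps**2 * Q))  # tails vs bound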

Now assume \(H\) challenges are hard. What remains to show is a purely algebraic statement about the sum of the \(\mathsf {p}_i\) values when at least \(H\ge Q(\mathrm {pr}_\mathrm {hard}-\epsilon )\) of the challenges satisfy the conclusion of Claim 5.

Claim 8

Let \(c\) be a real value. Let \(t_1, \dots , t_{Q}\) be integers, \(s_1=0\), and \(s_i=s_{i-1}+t_{i-1}\) for \(i=2, \dots , Q+1\). Let \(\mathsf {p}_0, \dots , \mathsf {p}_{s_{Q+1}}\) be a sequence of real values with \(\mathsf {p}_t>\mathsf {p}_\mathrm {min}>0\) for every \(t\ge 1\). Suppose further that there exist at least \(H\) distinct indices i, with \(1\le i \le Q\) (called “hard indices”) such that for any \(0\le j \le s_i\), \(\mathsf {p}_{s_i-j}\ge c/(t_i+j)\). Then

$$ \sum _{i=1}^{s_{Q+1}} \mathsf {p}_i \ge c\cdot H\cdot \ln \left( 1+\left( \frac{\mathsf {p}_\mathrm {min}}{\mathsf {p}_0}\right) ^{\frac{1}{H}}\right) . $$

Proof

Let \(i_1< i_2< \dots < i_{H}\) be the hard indices. Recall the notation \({\mathsf {cc}} (i,j)=\sum _{t=i}^j \mathsf {p}_t\). Then for \(k\ge 2\),

$$ \begin{aligned} {\mathsf {cc}} (s_{i_{k-1}}+1, s_{i_k})&\ge {\mathsf {cc}} (s_{i_k}-t_{i_{k-1}}+1, s_{i_k}) = \sum _{j=0}^{t_{i_{k-1}}-1} \mathsf {p}_{s_{i_k}-j} \\&\ge \sum _{j=0}^{t_{i_{k-1}}-1} \frac{c}{t_{i_k}+j} \ge c\cdot (\ln (t_{i_{k-1}}+t_{i_k}) - \ln t_{i_k}), \end{aligned} $$

(the last inequality follows by the same reasoning as in the warm-up). To bound \({\mathsf {cc}} (1, s_{Q+1})\), we will add up the pebbling complexity during these nonoverlapping time periods for each k and find the minimum over all sets of values of \(t_{i_k}\). Unfortunately, the result will decrease as \(t_{i_1}\) decreases and as \(t_{i_H}\) increases, and we have no bounds on these values. To get a better result, we will need to consider the special cases of \(k=2\) (to replace \(t_{i_1}\) with \(c/\mathsf {p}_0\)) and \(k=H+1\) (to add another term with \(x_{H+1} = c/\mathsf {p}_\mathrm {min}\)).

For \(k=2\), we will bound \({\mathsf {cc}} (1, s_{i_2})\) by noticing that \(s_{i_2} \ge t_{i_1}+s_{i_1} \ge c/\mathsf {p}_0 \) (where the second step follows by Claim 5 with \(j=s_{i_1}\)), and therefore

$$ \begin{aligned} {\mathsf {cc}} (1, s_{i_2})&\ge \sum _{j=0}^{s_{i_2}-1} \mathsf {p}_{s_{i_2}-j} \ge c \sum _{j=0}^{s_{i_2}-1} \frac{1}{t_{i_2}+j} \ge c \int _{t_{i_2}}^{s_{i_2}+t_{i_2}} \frac{dx}{x} \\&= c(\ln (s_{i_2}+t_{i_2}) - \ln t_{i_2}) \ge c\cdot (\ln (c/\mathsf {p}_0+t_{i_2}) - \ln t_{i_2}). \end{aligned} $$

For \(k=H+1\),

$$\begin{aligned} {\mathsf {cc}} (s_{i_{H}}+1, s_{Q+1})&\ge \mathsf {p}_\mathrm {min}\cdot t_{i_H} = c \cdot \frac{t_{i_H}}{c/\mathsf {p}_\mathrm {min}} \\&\ge c \cdot \left( \frac{1}{c/\mathsf {p}_\mathrm {min}}+\dots +\frac{1}{c/\mathsf {p}_\mathrm {min}+t_{i_H}-1}\right) \\&\ge c\cdot (\ln (t_{i_H}+c/\mathsf {p}_\mathrm {min}) - \ln (c/\mathsf {p}_\mathrm {min})). \end{aligned}$$

Adding these up, we get

$$\begin{aligned} {\mathsf {cc}} (0, s_{Q+1}) = \mathsf {p}_0+{\mathsf {cc}} (1, s_{i_2})+{\mathsf {cc}} (s_{i_2}+1, s_{i_3})+\dots \\ +{\mathsf {cc}} (s_{i_{H-1}}+1, s_{i_H})+{\mathsf {cc}} (s_{i_H}+1, s_{Q+1}) \\ \ge \mathsf {p}_0+ c \cdot \sum _{i=1}^H(\ln (x_i+x_{i+1})-\ln x_{i+1}), \end{aligned}$$

where \(x_1=c/\mathsf {p}_0\), \(x_2=t_{i_2}\), \(x_3=t_{i_3}\), ..., \(x_H=t_{i_H}\), and \(x_{H+1}=c/\mathsf {p}_\mathrm {min}\).

To find the minimum of this function, observe that the first derivative with respect to \(x_i\) is \(\frac{1}{x_i+x_{i-1}}+\frac{1}{x_i+x_{i+1}}-\frac{1}{x_i}\), which, assuming all the \(x_i\)s are positive, is zero at \(x_i=\sqrt{x_{i-1}x_{i+1}}\), is negative for \(x_i<\sqrt{x_{i-1}x_{i+1}}\), and is positive for \(x_i>\sqrt{x_{i-1}x_{i+1}}\). Therefore, the minimum of this function occurs when each \(x_i\), for \(2\le i \le H\), is equal to \(\sqrt{x_{i-1} x_{i+1}}\), or equivalently, when \(x_i = c / (\mathsf {p}_\mathrm {min}^{i-1} \mathsf {p}_0^{H-i+1})^{1/H}\). This setting of \(x_i\) gives us \(\ln (x_i+x_{i+1}) - \ln x_{i+1} = \ln (1+(\mathsf {p}_\mathrm {min}/\mathsf {p}_0)^{1/H})\), which gives the desired result.    \(\square \)

Plugging in \(Q\cdot (\mathrm {pr}_\mathrm {hard}-\epsilon )\) for \(H\) and \(\mathrm {pr}_\mathrm {hard}=\frac{1}{2}\) concludes the proof of Theorem 4.    \(\square \)
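
As a sanity check on the minimization at the heart of Claim 8, the following Python sketch (with arbitrary illustrative values of c, H, \(\mathsf {p}_0\), and \(\mathsf {p}_\mathrm {min}\)) evaluates the sum at the geometric sequence of \(x_i\) and confirms that random perturbations of the interior points never do better; indeed, each term \(\ln (x_i+x_{i+1})-\ln x_{i+1}\) is convex in the logarithms of the \(x_i\), so the stationary point found above is the global minimum:

import math, random

c, H, p0, pmin = 1.0, 5, 8.0, 1.0

def objective(xs):
    # sum_{i=1}^{H} ln(x_i + x_{i+1}) - ln(x_{i+1}), with xs = (x_1, ..., x_{H+1})
    return sum(math.log(xs[i] + xs[i + 1]) - math.log(xs[i + 1]) for i in range(H))

# Claimed minimizer: the geometric sequence from x_1 = c/p0 to x_{H+1} = c/pmin.
opt = [c / (pmin ** (i - 1) * p0 ** (H - i + 1)) ** (1.0 / H) for i in range(1, H + 2)]
val = objective(opt)
print(val, H * math.log(1 + (pmin / p0) ** (1.0 / H)))  # the two agree

for _ in range(10000):
    xs = list(opt)
    for i in range(1, H):                # perturb the interior points x_2 .. x_H
        xs[i] *= math.exp(random.uniform(-1.0, 1.0))
    assert objective(xs) >= val - 1e-9   # never below the claimed minimum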

5.1 Why This Proof Needs to be Modified for the Parallel Random Oracle Model

The main idea of the proof above is to apply the space-time tradeoff of Fact 3 to every point in time before the challenge is known, arguing that delaying the receipt of the challenge can only hurt the adversary. While this is true in the pebbling game (via an easy formal reduction, because getting bits does not help get pebbles), it is not clear why this should be true in the parallel random oracle game, where space is measured in bits rather than pebbles. A reduction from an adversary A who does not know the challenge to an adversary B who does would require B to store the challenge in memory, run A until the right moment in time, and then give A the challenge. This reduction consumes some of B’s memory (for storing the challenge and keeping track of time); in other words, A can save on memory by not knowing, and therefore not storing, the challenge. While the amount of memory is small, it has to be accounted for, which makes the formulas even messier. Things get messier still if A is not required to be 100% correct, because, depending on the exact definition of the round game, B may not know when to issue the challenge to A.

Nevertheless, this difficulty can be overcome when challenges come from the random oracle, as they do in \(\mathtt{scrypt} \). We do so in the next section.

6 Main Result: Memory Hardness of scrypt in the Parallel Random Oracle Model

We are ready to restate our main theorem, which we now prove by extending the techniques from the two previous sections. We state it with a bit more detail than Theorem 1.

Fix positive integers \(n\ge 2\) and w, a string \(X\in \{0,1\}^w \), a finite domain \(\mathcal {D}\) that contains at least \(\{0,1\}^w\), and let \(\mathcal {R}=\{0,1\}^w\).

Theorem 5

Let A be any oracle machine (in the parallel random oracle model as defined in Sect. 2) with input X. Assume \(A^\mathsf {h}(X)\) outputs \(S^\mathsf {h}_n = \mathtt{scrypt} ^{\mathsf {h}}(X)\) correctly with probability \(\chi \), where the probability is taken over the choice of \(\mathsf {h}: \mathcal {D}\rightarrow \mathcal {R}\). Then for any \(\epsilon >0\) and \(q\ge 2\), with probability (over the choice of \(\mathsf {h}\)) at least \(\chi - 2qn^4\cdot 2^{-w} - e^{- 2 \epsilon ^2n}\) one of the following two statements holds: either \(A^\mathsf {h}(X)\) makes more than q queries (and thus \(\mathsf {cc_{mem}}(A^{\mathsf {h}}(X))>qw\) by definition) or

$$\begin{aligned} \mathsf {cc_{mem}}(A^{\mathsf {h}}(X)) \ge \frac{\ln 2}{6}\cdot \left( \frac{1}{2} - \epsilon \right) \cdot n^2 \cdot \left( w - 2 \log n - \log q - 1 \right) . \end{aligned}$$

To get the statement of Theorem 1, we set \(\epsilon = 1/7\) and observe that then \(e^{-2\epsilon ^2 n} = 2^{- \frac{2n}{49 \ln 2}}<2^{-n/20}\) and \(\frac{\ln 2}{6} (\frac{1}{2}-\frac{1}{7})> \frac{1}{25}\). We also plug in \(q=\max (2, \frac{n^2}{25})\) (and therefore \(\log q \le 2 \log n -1\)), thus removing the “either/or” clause (this setting of q requires us to manually check that the probability statement is correct when \(q>\frac{n^2}{25}\), i.e., \(2\le n \le 7\)—a tedious process that we omit here).
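
The numeric claims in the previous paragraph are easy to verify directly; a few lines of Python suffice:

import math

# e^{-2 eps^2 n} = 2^{-2n/(49 ln 2)} for eps = 1/7; the exponent coefficient beats 1/20.
print(2 / (49 * math.log(2)))             # ~0.0589 > 0.05

# The constant ln(2)/6 * (1/2 - 1/7) exceeds 1/25.
print(math.log(2) / 6 * (1 / 2 - 1 / 7))  # ~0.0412 > 0.04

# As eps -> 0 the constant approaches ln(2)/12, i.e., roughly 1/18.
print(math.log(2) / 12)                   # ~0.0578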

The multiplicative constant of \(\ln (2) (1/2-\epsilon ) /6\), which becomes 1/25 in Theorem 1 (and can be as large as \(\approx 1/18\) if we use a smaller \(\epsilon \)), is about 9–12 times worse than the constant in the naïve scrypt algorithm described on p. 4, which has \(\mathsf {cc_{mem}}\) of \(n^2w/2\). We describe approaches that may reduce this gap to a factor of only about \(1/\ln 2 \approx 1.44\) in the full version [7]. We note that there is also a gap between w in the naïve algorithm and \((w-2 \log n -\log q -1)\) in the lower bound, which matters for small w (as values of 20–30 for \(\log n\) and \(\log q\) are reasonable).

The rest of this section is devoted to the proof of this theorem.

6.1 Outline of the Approach

Before proceeding with the proof, we justify our proof strategy by highlighting the challenges of extending Theorems 2 and 4 to this setting. Theorem 2 applies to a fixed random oracle \(\mathsf {h}\) and a random challenge. In fact, the proof relies crucially on the ability to try every challenge for a given oracle. However, in the present proof, once the random oracle is fixed, so is every challenge. Moreover, Theorem 4 relies crucially on the uniformity and independence of each challenge, which is issued only when the previous challenge is answered. In contrast, here, again, once the oracle is fixed, so are the challenges. Even if we think of the oracle as being lazily created in response to queries, the challenges implicitly contained in the answers to these queries are not necessarily independent once we condition (as we need to in Theorem 2) on the oracle not being in \(\mathsf {bad}_\mathsf {h}\). We resolve these issues by working with multiple carefully chosen random oracles.

Recall our notation: \(X_0=X\), \(X_1=\mathsf {h}(X_{0}), \dots , X_{n-1}=\mathsf {h}(X_{n-2})\); \(T_0 = X_{n-1}\), \(S_0=\mathsf {h}(T_0)\), and for \(i=1,\ldots , n\), \(T_i = S_{i-1}\oplus X_{S_{i-1} \, \bmod \, n}\) and \(S_i = \mathsf {h}(T_i)\). Because we will need to speak of different random oracles, we will use the notation \(X_i^\mathsf {h}\), \(T_i^\mathsf {h}\), and \(S_i^\mathsf {h}\) when the label values are being computed with respect to the random oracle \(\mathsf {h}\) (to avoid clutter, we will omit the superscript when the specific instance of the random oracle is clear from the context). We will denote by \(A^\mathsf {h}\) the adversary running with oracle \(\mathsf {h}\). (To simplify notation, we will omit the fixed argument X to the adversary A for the remainder of this section.)
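
For concreteness, the label computation just recalled can be transcribed directly into Python. In this minimal sketch, SHA-256 merely stands in for the random oracle \(\mathsf {h}\) (so \(w=256\)), and the input and the parameter n are illustrative:

import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def scrypt_core(X: bytes, n: int) -> bytes:
    Xs = [X]                   # X_0 = X (a w-bit string; here w = 256, i.e., 32 bytes)
    for _ in range(n - 1):     # X_i = h(X_{i-1}) for i = 1 .. n-1
        Xs.append(h(Xs[-1]))
    S = h(Xs[n - 1])           # T_0 = X_{n-1}, S_0 = h(T_0)
    for _ in range(n):         # T_i = S_{i-1} xor X_{S_{i-1} mod n}, S_i = h(T_i)
        j = int.from_bytes(S, "big") % n
        T = bytes(a ^ b for a, b in zip(S, Xs[j]))
        S = h(T)
    return S                   # S_n = scrypt^h(X)

X = h(b"password, salt")       # an illustrative w-bit input
print(scrypt_core(X, 8).hex())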

Let \(\mathrm {changeModn}(S, i)\) be the function that keeps the quotient \(\lfloor S/n\rfloor \) but changes the remainder of S modulo n to i. Consider the following process of choosing a random oracle (this process is described more precisely in the following section). Choose uniformly at random an oracle \(\mathsf {h}_0\). Choose uniformly at random challenges \(c_1, \dots , c_{n}\), each between 0 and \(n-1\). Let \(\mathsf {h}_1\) be equal to \(\mathsf {h}_0\) at every point, except \(\mathsf {h}_1(T_0^{\mathsf {h}_0}) = \mathrm {changeModn}(S_0^{\mathsf {h}_0}, c_1)\). Similarly, let \(\mathsf {h}_2\) be equal to \(\mathsf {h}_1\) at every point, except \(\mathsf {h}_2(T_1^{\mathsf {h}_1})=\mathrm {changeModn}(S_1^{\mathsf {h}_1}, c_2)\), and so on, until \(\mathsf {h}_n\), which is our final random oracle.

This method of choosing \(\mathsf {h}_n\) is close to uniform, and yet explicitly embeds a uniform random challenge. Unless some (rare) bad choices have been made, each challenge has about a \(\frac{1}{2}\) probability of taking a long time to answer, by the same reasoning as in Theorem 2. And since the challenges are independent (explicitly through the choices of \(c_i\) values), we can use the same reasoning as in Theorem 4 to bound the cumulative complexity.

The main technical difficulty that remains is to define exactly what those bad choices are and bound their probability without affecting the independence of the challenges. In particular, there are \(n^n\) possible challenge combinations, and the probability that all of them yield random oracles that are acceptable (cause no collisions and cannot be predicted) is not high enough. We have to proceed more carefully.

The first insight is that if \(\mathsf {h}_{k-1}\) is not predictable (i.e., a predictor \(\mathcal {P}\) with a short input cannot correctly extract many oracle values), and no oracle queries up to \(T_{k-1}\) collide, then \(\Pr _{c_k}[\)time between queries \(T^{\mathsf {h}_{k}}_{k-1}\) and \(T^{\mathsf {h}_{k}}_k\) is high\(]\ge \frac{1}{2}\), by the same reasoning as in Theorem 2 (except that the predictor needs an extra \(\log q\) bits of hint to know when query \(T_{k-1}\) occurs, so as to substitute the answer to \(T_{k-1}\) with \(\mathrm {changeModn}(S_{k-1}, c_k)\) for every possible \(c_k\)). This allows us to ensure that the time between consecutive challenges is likely to be high while worrying about only n random oracles avoiding collisions and the set \(\mathsf {predictable}\) (instead of worrying about \(n^n\) random oracles).

However, the reasoning in the previous paragraph bounds the time required to answer the challenge \(c_k\) only with respect to oracle \(\mathsf {h}_k\). In order to reason about A interacting with oracle \(\mathsf {h}_n\), we observe that if for every k, \(A^{\mathsf {h}_k}\) asks the queries \(X_0, \dots , X_{n-1}=T_0, T_1, \dots , T_n\) in the correct order, then the computation of \(A^{\mathsf {h}_n}\) is the same as the computation of \(A^{\mathsf {h}_k}\) until the kth challenge is answered—i.e., until \(T_k\) is queried. Thus, results about each of the oracles \(\mathsf {h}_k\) apply to \(\mathsf {h}_n\).

The rest of the work involves a careful probability analysis to argue that the challenges \(c_1,\dots , c_n\) are almost independent even when conditioned on all the bad events not happening, and to bound the probability of these events.

6.2 The Detailed Proof

Recall that we assume that the adversary A is deterministic without loss of generality (this fact will be used heavily throughout the proof). In particular, the randomness of the experiment consists solely of the random oracle A is given access to.

Following up on the above high-level overview, we now make precise the definition of \(\mathsf {h}_{k}\). Let \(\mathsf {h}_0\) be a uniformly chosen random oracle. Let \(\mathrm {changeModn}(S, i)\) be a function that keeps the quotient \(\lfloor S/n\rfloor \) but changes the remainder of S modulo n to i if possible: it views S as an integer in \([0,2^w-1]\), computes \(S'=\lfloor S/n\rfloor \cdot n+i\), and outputs \(S'\) (viewed as a w-bit string) if \(S'<2^w\), and S otherwise (which can happen only if n is not a power of 2, and even then is very unlikely for a random S).
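
In code, \(\mathrm {changeModn}\) is a one-liner; the following Python transcription (with the overflow fallback spelled out) follows the definition above:

def change_mod_n(S: int, i: int, n: int, w: int) -> int:
    # Keep floor(S/n), set the remainder mod n to i; leave S unchanged on overflow.
    S_new = (S // n) * n + i
    return S_new if S_new < 2 ** w else S

assert change_mod_n(23, 4, n=5, w=8) == 24   # 23 = 4*5 + 3  ->  4*5 + 4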

Definition 1

Let \(\mathsf {roundingProblem}_{k}\) be the set of all random oracles \(\mathsf {h}\) for which the value of at least one of \(S_0^\mathsf {h}, \dots , S_k^\mathsf {h}\) is greater than \(\lfloor 2^w/n\rfloor \cdot n-1\) (i.e., those for which \(\mathrm {changeModn}\) does not work on some S value up to \(S_k\)).

Definition 2

Let \(\mathsf {colliding}_{k}^*\) be the set of all \(\mathsf {h}\) for which there is at least one collision among the values \(\{X_0, X_1^\mathsf {h}, X_2^\mathsf {h}, \dots , X_{n-2}^\mathsf {h}, T_0^\mathsf {h}, T_1^\mathsf {h}, \dots , T_k^\mathsf {h}\}\). Let \(\mathsf {colliding}_{k} = \mathsf {roundingProblem}_{\min (k,n-1)}\cup \mathsf {colliding}_{k}^*\).

Definition 3

For every k \((0\le k< n)\), let \(\mathsf {h}_{k+1}=\mathsf {h}_k\) if \(\mathsf {h}_k\in \mathsf {colliding}_{k}\); else, choose \(c_{k+1}\) uniformly at random between 0 and \(n-1\), let \(\mathsf {h}_{k+1}(T_k^{\mathsf {h}_k})=\mathrm {changeModn}(S_k^{\mathsf {h}_k}, c_{k+1})\), and let \(\mathsf {h}_{k+1}(x)=\mathsf {h}_k(x)\) for every \(x\ne T_k^{\mathsf {h}_k}\). (Recall that \(\mathsf {h}_0\) is chosen uniformly.)

Note that this particular way of choosing \(\mathsf {h}_{k+1}\) is designed to ensure that it is uniform, as we argue in the full version.
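
The resampling process of Definitions 1–3 can be sketched in a few lines. In the Python sketch below, the helpers compute_T_S and in_colliding (which would evaluate \(T_k^{\mathsf {h}_k}\) and \(S_k^{\mathsf {h}_k}\) and test membership in \(\mathsf {colliding}_{k}\)) are hypothetical placeholders, change_mod_n is the transcription above, and the oracle is modeled as a base function plus a dictionary of patched points:

import random

def resample_chain(h0, compute_T_S, in_colliding, n, w):
    # Returns the patch turning h0 into h_n, plus the embedded challenges c_1 .. c_n.
    patch, challenges = {}, []

    def h(x):                                    # the current oracle h_k
        return patch.get(x, h0(x))

    for k in range(n):
        if in_colliding(h, k):                   # h_k in colliding_k: set h_{k+1} = h_k
            challenges.append(None)
            continue
        T_k, S_k = compute_T_S(h, k)             # T_k^{h_k} and S_k^{h_k} = h_k(T_k)
        c = random.randrange(n)                  # uniform challenge c_{k+1}
        patch[T_k] = change_mod_n(S_k, c, n, w)  # h_{k+1}(T_k) = changeModn(S_k, c_{k+1})
        challenges.append(c)

    return patch, challenges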

The Single Challenge Argument. In the argument in Theorem 2, the predictor issues different challenges to A. Here, the predictor will run A with different oracles. Specifically, given \(1\le k\le n\) and a particular oracle \(\mathsf {h}_{k-1}\not \in \mathsf {colliding}_{k-1}\), consider the n oracles \(\mathsf {h}_{k,j}\) for each \(0\le j<n\), defined to be the same as \(\mathsf {h}_{k-1}\), except \(\mathsf {h}_{k,j}(T_{k-1}^{\mathsf {h}_{k-1}}) = \mathrm {changeModn}(S_{k-1}^{\mathsf {h}_{k-1}}, j)\) (instead of \(S_{k-1}^{\mathsf {h}_{k-1}}\)).

Since \(\mathsf {h}_{k-1} \not \in \mathsf {colliding}_{k-1}\), \(T_{k-1}^{\mathsf {h}_{k-1}}\) is not equal to \(X_i^{\mathsf {h}_{k-1}}\) for any \(0\le i < n-1\) and \(T_i^{\mathsf {h}_{k-1}}\) for any \(0\le i < k-1\). Therefore (since \(\mathsf {h}_{k-1}\) and \(\mathsf {h}_{k,j}\) differ only at the point \(T_{k-1}^{\mathsf {h}_{k-1}}\)), we have \(X_i^{\mathsf {h}_{k-1}}=X_i^{\mathsf {h}_{k,j}}\) for every \(0\le i \le n-1\) and \(T_{i}^{\mathsf {h}_{k-1}} = T_{i}^{\mathsf {h}_{k,j}}\) for any \(0\le i \le k-1\). In particular, the execution of A with oracle \(\mathsf {h}_{k,j}\) will proceed identically for any j (and identically to the execution of \(A^{\mathsf {h}_{k-1}}\)) up to the point when the query \(T_{k-1}\) is first made (if ever). We will therefore omit the superscript on \(T_{k-1}\) for the remainder of this argument.

The observation is that the moment \(T_{k-1}\) is queried is the moment when the predictor argument of Theorem 2 can work, by having the predictor substitute different answers to this query and run A on these different answers in parallel. However, since Sect. 5 requires a time/memory tradeoff for every point in time before the challenge is given, we will prove a more general result for any point in time before \(T_{k-1}\) is queried.

We number all the oracle queries that A makes across all rounds, sequentially. We will only care about the first q oracle queries that A makes, for some q to be set later (because if q is too large, then \(\mathsf {cc_{mem}}\) of A is automatically high). Note that q here is analogous to \(q-1\) in Theorem 2.

Let \(s_k>0\) be the round in which \(T_{k-1}\) is first queried, i.e., contained in \({\mathbf q} _{s_k}\). For an integer \(r\le s_k\), consider the output state \({\bar{\sigma }} _{r}\) of \(A^{\mathsf {h}_{k-1}}\) from round r. Given \({\bar{\sigma }} _r\), consider n different continuations of that execution, one for each oracle \(\mathsf {h}_{k,j},0\le j<n\). For each of these continuations, we let \(t_j>0\) be the smallest value such that \(r+t_j>s_k\) and the query \(T^{\mathsf {h}_{k,j}}_k\) is contained in \({\mathbf q} _{r+t_j}\) (if ever before query number \(q+1\); else, set \(t_j=\infty \)). We can thus define \(\pi _{ij}\), \(\beta _i\), and \(\mathsf {bestchal}_i\), blue nodes, and the set B the same way as in Theorem 2, by counting the number of rounds after round r (instead of from 0) and substituting, as appropriate, “challenge j” with \(\mathsf {h}_{k,j}\) and “query \(X_j\)” with “query \(X_j\) or \(T^{\mathsf {h}_{k,j}}_k\)” (note that because \(\mathsf {h}_{k-1}\notin \mathsf {roundingProblem}_{k-1}\), \(S^{\mathsf {h}_{k,j}}_{k-1} \, \bmod \, n=j\), and so \(T^{\mathsf {h}_{k,j}}_k = X_j\oplus S^{\mathsf {h}_{k,j}}_{k-1}\)). (We stop the execution of A after q total queries in these definitions.)

We now show that, similarly to Claim 1, we can design a predictor algorithm \(\mathcal {P}\) that predicts every \(X^{\mathsf {h}_{k-1}}_i\) in B by interacting with \(\mathsf {h}_{k-1}\) but not querying it at the predecessors of points in B. The difference is that instead of running \(A^{\mathsf {h}_{k-1}}\) on \({\sigma } _0\) and giving A different challenges j, \(\mathcal {P}\) will run A with initial input state \({\sigma } _r\), simulating different oracles \(\mathsf {h}_{k, j}\) (which differ from \(\mathsf {h}_{k-1}\) on only one point—namely, the output on input \(T_{k-1}\)). \(\mathcal {P}\) gets, as input, \({\sigma } _r\) and the same hint as in Claim 1. \(\mathcal {P}\) also needs an additional hint: an integer between 1 and q indicating the sequential number (across all queries made in round r or later) of the first time query \(T_{k-1}\) occurs, in order to know when to reply with \(S^{\mathsf {h}_{k,j}}_{k-1} = \mathrm {changeModn}(S^{\mathsf {h}_{k-1}}_{k-1}, j)\) instead of \(S^{\mathsf {h}_{k-1}}_{k-1}\) itself. Note that this substitution will require \(\mathcal {P}\) to modify the input state \({\sigma } _{s_k}\). If \(s_k>r\), then \(\mathcal {P}\) will not only be able to answer with \(S^{\mathsf {h}_{k,j}}_{k-1}\), but will also see the query \(T_{k-1}\) itself as part of the output state \({\bar{\sigma }} _{s_k}\), and will therefore be able to answer subsequent queries to \(T_{k-1}\) consistently. However, if \(s_k=r\), then we need to give \(T_{k-1}\) to \(\mathcal {P}\) to ensure subsequent queries to \(T_{k-1}\) are answered consistently. In order to do so without lengthening the input of \(\mathcal {P}\), we note that in such a case we do not need \(S^{\mathsf {h}_{k-1}}_{k-1}\) in \({\sigma } _r\) (since \(\mathcal {P}\) can obtain it by querying \(\mathsf {h}_{k-1}\)), and so we can take out \(S^{\mathsf {h}_{k-1}}_{k-1}\) and replace it with \(T_{k-1}\) (\(\mathcal {P}\) will recognize that this happened by looking at the additional hint that contains the query number for \(T_{k-1}\) and noticing that it is smaller than the number of queries \(z_r\) in round r).

There is one more small modification: if \(X_j\in B\) and \(\mathsf {bestchal}_j=j\), then in order to correctly predict \(X_j\) itself, \(\mathcal {P}\) will need one additional bit of hint, indicating whether \(X_j\) is first queried by itself or as part of the “next round,” i.e., as part of the query \(T^{\mathsf {h}_{k,j}}_{k}=S^{\mathsf {h}_{k,j}}_{k-1} \oplus X_j\) (in which case \(\mathcal {P}\) will need to xor the query with \(S^{\mathsf {h}_{k,j}}_{k-1}\), which \(\mathcal {P}\) knows, having produced it when answering the query \(T_{k-1}\)). Finally, note that \(\log q\) bits suffice for the query number of \(X_i\) on challenge \(\mathsf {bestchal}_i\): it cannot equal the query number of \(T_{k-1}\) (because \(\mathsf {h}_{k-1}\notin \mathsf {colliding}_{k-1}\)), so there are \(q-1\) possibilities plus the possibility of “never”.

We thus need to give (in addition to \(\sigma _r\)) \(\log q + |B|(1+2\log n + \log q)\) bits of hint to \(\mathcal {P}\), and \(\mathcal {P}\) is guaranteed to be correct as long as \(\mathsf {h}_{k-1} \notin \mathsf {colliding}_{k-1}\).

Suppose \(\sigma _r\) has \(m_r\) bits. Claim 2 does not change. We modify Claim 3 as follows. We replace p with a function \(p_r\) of the memory size \(m_r\), defined as

$$\begin{aligned} p_r= \lceil (m_r+1+\log q)/(w-2\log n -\log q-1)+1 \rceil \end{aligned}$$
(1)

(note that it is almost the same as the definition of p, but accounts for the longer hint). We now redefine \(\mathsf {predictable}\) according to our new definition of \(\mathcal {P}\), \(p_r\), and hint length.

Definition 4

The set \(\mathsf {predictable}\) consists of all random oracles \(\mathsf {h}\) for which there exists an input state \(\sigma _r\) of size \(m_r\) (such that \(1\le p_r\le n-1\)) and a hint of length \(\log q + p_r(1+2\log n + \log q)\), given which \(\mathcal {P}\) can correctly output \(p_r\) distinct values from among \(X_1^\mathsf {h}, \dots , X_{n-1}^\mathsf {h}\) without querying them.
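
To get a feel for the sizes involved in Eq. (1) and Definition 4, here is a quick Python evaluation; the specific values of w, n, q, and \(m_r\) below are hypothetical:

import math

def p_r(m_r: int, w: int, n: int, q: int) -> int:
    log_n, log_q = math.log2(n), math.log2(q)
    return math.ceil((m_r + 1 + log_q) / (w - 2 * log_n - log_q - 1) + 1)

# E.g., 256-bit labels, n = 2^20 blocks, q = 2^35 queries, 1 MiB of adversary state:
print(p_r(m_r=8 * 2**20, w=256, n=2**20, q=2**35))   # ~46,600 labels to predict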

Finally, we replace \(\mathsf {bad}_\mathsf {h}\) with \( \mathsf {colliding}_{k-1} \cup \mathsf {predictable}\). As long as \(\mathsf {h}_{k-1} \notin \mathsf {colliding}_{k-1} \cup \mathsf {predictable}\), we are guaranteed that \(\Pr _j[t_j> n/(2p_r)]\ge 1/2\), as in Theorem 2.

The discussion above gives us the following lemma (analogous to Theorem 2).

Lemma 1

Fix any k (\(1\le k \le n\)). Assume \(\mathsf {h}_{k-1} \notin \mathsf {colliding}_{k-1}\cup \mathsf {predictable}\). Let \(s_k>0\) be the smallest value such that \(T^{\mathsf {h}_{k-1}}_{k-1}\) is among the queries \({\mathbf q} _{s_k}\) during the computation of \(A^{\mathsf {h}_{k-1}}\). Let \(r\le s_k\) and \(m_r\) be the bit-length of the input state \(\sigma _r\) of \(A^{\mathsf {h}_{k-1}}\) in round \(r+1\). Let \(t_{k,j,r}>0\) be such that the first time \(T^{\mathsf {h}_{k,j}}_k\) is queried by \(A^{\mathsf {h}_{k,j}}\) after round \(s_k\) is in round \(r+t_{k,j,r}\) (let \(t_{k,j,r}=\infty \) if such a query does not occur after round \(s_k\) or does not occur among the first q queries, or if \(T^{\mathsf {h}_{k-1}}_{k-1}\) is never queried). Call j “hard” for time r if \(t_{k,j,r} > n/(2p_r)\), where \(p_r= \lceil (m_r+1+\log q)/(w-2\log n -\log q-1)+1 \rceil \). We are guaranteed that

$$ {\mathop {\Pr }\limits _{j}} \left[ j \text { is hard for time }r \right] \ge \frac{1}{2}. $$

Hardness of Challenge \(c_k\). We continue with the assumptions of Lemma 1. In order to get an analogue of Claim 5, we need to define what it means for a challenge to be hard. Consider running \(A^{\mathsf {h}_k}\). Let \(t_k>0\) be such that \(T^{\mathsf {h}_k}_{k}\) is queried for the first time in round \(s_k+t_k\) (again, letting \(t_k=\infty \) if this query does not occur among the first q queries). Find the round \(r_k \le s_k\) whose input state gives us the best bound on \(t_k\) using the equation of Lemma 1: set \(r_k={\mathop {{\hbox {argmax}}}\nolimits _{0\le r \le s_k}} (n/(2p_r) - (s_k-r))\), where \(m_r\) denotes the size of the state \(\sigma _r\) at the end of round r and \(p_r\) is the function of \(m_r\) defined by Eq. 1, and define \(c_k\) to be “hard” if it is hard for time \(r_k\).

Definition 5

A challenge \(c_k\) is hard if for \(r_k={\mathop {{\mathrm {argmax}}}\nolimits _{0\le r \le s_k}} (n/(2p_r) - (s_k-r))\) we have \(t_{k,c_k,r_k} > n/(2p_{r_k})\), where \(s_k, t_{k, j, r}\), and \(p_r\) are as defined in Lemma 1.

The Multiple Challenges Argument. So far, we considered hardness of \(c_k\) during the run of A with the oracle \(\mathsf {h}_k\). We now need to address the actual situation, in which A runs with \(\mathsf {h}_n\). We need the following claim, which shows that the actual situation is, most of the time, identical. Define \(\mathsf {wrongOrder}_{k}\) as the set of all random oracles \(\mathsf {h}\) for which the values \(\{T^\mathsf {h}_0, T^\mathsf {h}_1, \dots , T^\mathsf {h}_k\}\) are not queried by \(A^\mathsf {h}\) in the same order as they appear in the correct evaluation of \(\mathtt{scrypt} \) (when we look at first-time queries only, and only up to the first q queries).

Definition 6

\(\mathsf {wrongOrder}_{k}\) consists of all \(\mathsf {h}\) for which there exist \(i_1\) and \(i_2\) such that \(0\le i_1 < i_2\le k\) and, in the run of \(A^\mathsf {h}\), query \(T^\mathsf {h}_{i_2}\) occurs, while query \(T^\mathsf {h}_{i_1}\) does not occur before query \(T^\mathsf {h}_{i_2}\) occurs.

Claim 9

If for every j (\(0\le j \le n\)), \(\mathsf {h}_j \notin \mathsf {colliding}_{j}\cup \mathsf {wrongOrder}_{j}\), then for every k and \(i\le k\), \(T^{\mathsf {h}_n}_i=T^{\mathsf {h}_k}_i\), and the execution of \(A^{\mathsf {h}_n}\) is identical to the execution of \(A^{\mathsf {h}_k}\) until the query \(T_k\) is first made, which (for \(1\le k \le n\)) happens later than the moment when query \(T^{\mathsf {h}_n}_{k-1} = T^{\mathsf {h}_k}_{k-1}\) is first made.

Proof

To prove this claim, we will show, by induction, that for every \(j\ge k\) and \(i\le k\), \(T^{\mathsf {h}_k}_{i}=T^{\mathsf {h}_j}_{i}\), and the execution of \(A^{\mathsf {h}_j}\) is identical to the execution of \(A^{\mathsf {h}_k}\) until the query \(T_k\) is first made.

The base of induction (\(j=k\)) is simply a tautology.

The inductive step is as follows. Suppose the statement is true for some \(j\ge k\). We will show it for \(j+1\). We already established that if \(\mathsf {h}_{j}\not \in \mathsf {colliding}_{j}\), then \(T^{\mathsf {h}_{j}}_{i}=T^{\mathsf {h}_{j+1}}_{i}\) for every \(i\le j\), and is therefore equal to \(T^{\mathsf {h}_k}_{i}\) by the inductive hypothesis. Since \(\mathsf {h}_j\) and \(\mathsf {h}_{j+1}\) differ only in their answer to the query \(T^{\mathsf {h}_{j}}_{j}=T^{\mathsf {h}_{j+1}}_j\), the execution of \(A^{\mathsf {h}_{j+1}}\) proceeds identically to the execution of \(A^{\mathsf {h}_{j}}\) until this query is first made. Since \(\mathsf {h}_{j} \not \in \mathsf {wrongOrder}_{j}\), this moment is no earlier than when the query \(T_{k}\) is made; therefore, until the point the query \(T_k\) is first made, the execution of \(A^{\mathsf {h}_{j+1}}\) proceeds identically to the execution of \(A^{\mathsf {h}_{j}}\) and thus (by the inductive hypothesis) identically to the execution of \(A^{\mathsf {h}_k}\).

The last part of the claim follows because \(\mathsf {h}_{n}\not \in \mathsf {wrongOrder}_{n}\).    \(\square \)

We therefore get the following analogue of Claim 5.

Claim 10

Given adversary A, assume for every k (\(0\le k \le n\)), \(\mathsf {h}_k \notin \mathsf {colliding}_{k}\cup \mathsf {wrongOrder}_{k}\). Let \(c=n/2\). If challenge \(c_k\) is hard (i.e., \(t_k + (s_k - r_k) > c / p_{r_k}\)), then, during the run of \(A^{\mathsf {h}_n}\), for any \(0 \le j \le s_k\), \(p_{s_k-j}\ge c/(t_k+j)\).

Definition 7

Let \(E_1\) be the event that the number \(H\) of hard challenges (as defined in Definition 5) satisfies \(H\ge n (\frac{1}{2} -\epsilon )\). Let \(E_2\) be the event that \(\mathsf {h}_k \notin \mathsf {colliding}_{k}\cup \mathsf {wrongOrder}_{k}\) (see Definitions 2 and 6) for every k, and \(A^{\mathsf {h}_n}\) queries \(T^{\mathsf {h}_n}_n\). Let \(E_q\) be the event that \(A^{\mathsf {h}_n}\) makes no more than q total queries.

Claim 11

If \(E_1\cap E_2 \cap E_q\), then

$$ \sum _{r=1}^{s_{n+1}} {p_r} \ge \ln 2 \cdot \left( \frac{1}{2} - \epsilon \right) \cdot \frac{1}{2} \cdot n^2. $$

Proof

Since \(E_2\) holds, every query \(T_0, \dots , T_n\) gets made, in the correct order. Since \(E_q\) holds, all these queries happen no later than query q, thus ensuring that Claim 10 applies and each \(t_k\) is finite. Moreover, by definition of \(p_r\) in Eq. 1, \(p_r\ge 1\) and \(p_0=1\). Therefore, we can apply Claim 8 to the execution of \(A^{\mathsf {h}_n}\) to get the desired result.    \(\square \)

Converting from \(\sum {p_r}\) to \(\mathsf {cc_{mem}}\). Now we need to convert from \(\sum {p_r}\) to \(\sum {m_r}\).

Claim 12

For every \(r>0\),

$$ m_r \ge p_r \cdot (w - 2 \log n - \log q - 1)/3. $$

Proof

By definition of \(p_r\), we have that

$$ p_r = \left\lceil \frac{m_r + 1 + \log q}{w - 2 \log n - \log q - 1} + 1 \right\rceil \le \frac{m_r + 1 + \log q}{w - 2 \log n - \log q - 1} + 2, $$

because the ceiling adds at most 1. Therefore,

$$ (p_r - 2) \cdot (w - 2 \log n - \log q - 1) \le m_r + 1 + \log q, $$

(because we can assume \((w - 2 \log n - \log q - 1)>0\) — otherwise, Theorem 5 is trivially true) and thus

$$\begin{aligned} m_r\ge & {} (p_r - 2) \cdot (w - 2 \log n - \log q - 1) - \log q - 1 \end{aligned}$$
(2)
$$\begin{aligned}= & {} p_r \cdot (w - 2 \log n - \log q - 1) - 2 \cdot (w - 2 \log n - \log q - 1) - \log q - 1 \nonumber \\= & {} p_r \cdot (w - 2 \log n - \log q - 1) - 2 \cdot (w - 2 \log n - 0.5 \log q - 0.5). \end{aligned}$$
(3)

Since \(m_r \ge w\) (see our complexity measure definition in Sect. 2), \(m_r \ge w - 2 \log n - 0.5 \log q - 0.5\) and therefore we can increase the left-hand side by \(2\cdot m_r\) and the right-hand side by \(2\cdot (w - 2 \log n - 0.5 \log q - 0.5)\) and the inequality still holds; and therefore \( 3 m_r \ge p_r \cdot (w - 2 \log n - \log q - 1). \)    \(\square \)

Lemma 2

Assuming \(E_1\cap E_2\) (see Definition 7), for any integer q, either \(A^{\mathsf {h}_n}\) makes more than q queries (and thus \(\mathsf {cc_{mem}}(A^{\mathsf {h}_n})>qw\) by definition) or

$$ \mathsf {cc_{mem}}(A^{\mathsf {h}_n}(X)) \ge \frac{\ln 2}{6}\cdot \left( \frac{1}{2} - \epsilon \right) \cdot n^2 \cdot \left( w - 2 \log n - \log q - 1 \right) \;. $$

Proof

We observe that if \(A^{\mathsf {h}_n}\) makes no more than q queries, then \(E_1\cap E_2\cap E_q\) hold, and we can combine Claims 11 and 12 to get

$$\begin{aligned} \mathsf {cc_{mem}}(A^{\mathsf {h}_n}(X)) = \sum _{r=1}^{s_{n+1}} {m_r}&\ge \frac{1}{3} \cdot \sum _{r=1}^{s_{n+1}} {p_r} \cdot (w - 2 \log n - \log q - 1) \\&\ge \frac{\ln 2}{3} \cdot \left( \frac{1}{2} - \epsilon \right) \cdot \frac{1}{2} \cdot n^2 \cdot \left( w - 2 \log n - \log q - 1 \right) . \end{aligned}$$

This concludes the proof of Lemma 2.    \(\square \)

All that remains is to show a lower bound for the probability of \((E_1\cap E_2 \cap E_q)\cup \bar{E_q}\), and to argue that \(\mathsf {h}_n\) is uniform, because the statement we are trying to prove is concerned with \(A^\mathsf {h}\) for uniform \(\mathsf {h}\) rather than with \(A^{\mathsf {h}_n}\). The details are deferred to the full version [7].