1 Introduction

In cryptographic settings we typically consider tasks which can be done efficiently by honest parties, but are infeasible for potential adversaries. This requires an asymmetry in the capabilities of honest and dishonest parties. One example is trapdoor functions, where the honest party – who knows the secret trapdoor key – can efficiently invert the function, whereas a potential adversary – who does not have this key – cannot.

1.1 Moderately-Hard Functions

Moderately hard functions consider a setting where there is no such asymmetry, or, even worse, where the adversary has more capabilities than the honest party. What we want is that the honest party can evaluate the function with some reasonable amount of resources, whereas the adversary should not be able to evaluate the function at significantly lower cost. Moderately hard functions have several interesting cryptographic applications, including securing blockchain protocols and password hashing.

An early proposal for password hashing is the “Password Based Key Derivation Function 2” (PBKDF2) [Kal00]. This function just iterates a cryptographic hash function like SHA1 several times (1024 is a typical value). Unfortunately, PBKDF2 doesn’t make for a good moderately hard function, as evaluating a cryptographic hash function on dedicated hardware like ASICs (Application Specific Integrated Circuits) can be several orders of magnitude cheaper in terms of hardware and energy cost than evaluating it on a standard x86 CPU. An economic analysis by Blocki et al. [BHZ18] suggests that an attacker will crack almost all passwords protected by PBKDF2. There have been several suggestions for how to construct better, i.e., more “egalitarian”, moderately hard functions. We discuss the most prominent suggestions below.
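
To make the idea concrete, a minimal sketch of such iterated hashing might look as follows (an illustration only: the actual PBKDF2 specification additionally uses HMAC, a salt-and-counter block structure, and a configurable output length):

```python
import hashlib

def iterated_hash(password: bytes, salt: bytes, c: int = 1024) -> bytes:
    """Derive a key by iterating a cryptographic hash c times.
    A simplified sketch in the spirit of PBKDF2, not the actual
    PBKDF2 specification."""
    h = hashlib.sha1(salt + password).digest()
    for _ in range(c - 1):
        h = hashlib.sha1(h).digest()
    return h
```

Each of the c sequential hash calls is cheap on an ASIC, which is exactly the problem discussed above.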

Memory-Bound Functions. Abadi et al. [ABW03] observe that the time required to evaluate a function is dominated by the number of cache-misses, and that these slow down the computation by about the same amount across different architectures. They propose memory-bound functions, i.e., functions whose evaluation will incur many expensive cache-misses (assuming the cache is not too big). Their construction is not very practical, as it requires a fairly large (larger than the cache size) incompressible string as input; the function is then basically pointer jumping on this string. In subsequent work [DGN03] it was shown that this string can also be locally generated from a short seed.

Bandwidth-Hard Functions. Recently, Ren and Devadas [RD17] suggested the notion of bandwidth-hard functions, which is a refinement of memory-bound functions. A major difference is that in their model computation is not completely free, and this assumption – which of course is satisfied in practice – allows for much more practical solutions. Moreover, they do not argue about evaluation time as [ABW03] do, but rather about the more important energy cost: the energy spent evaluating a function consists of the energy required for on-chip computation and for memory accesses, and only the latter is similar across platforms. In a bandwidth-hard function the memory accesses dominate the energy cost on a standard CPU, and thus the function cannot be evaluated at much lower energy cost on an ASIC than on a standard CPU.

Memory-Hard Functions. Whereas memory-bound and bandwidth-hard functions aim at being egalitarian in terms of time and energy, memory-hard functions (MHF), proposed by Percival [Per09], aim at being egalitarian in terms of hardware cost. A memory-hard function, in his definition, is one where the memory used by the algorithm, multiplied by the amount of time, is high, i.e., it has high space-time (ST) complexity. Moreover, parallelism should not help to evaluate this function at significantly lower cost by this measure. The rationale here is that the hardware cost for evaluating an MHF is dominated by the memory cost, and as memory cost does not vary much over different architectures, the hardware cost for evaluating MHFs is not much lower on ASICs than on standard CPUs.

Cumulative Memory Complexity. Alwen and Serbinenko [AS15] observe that ST complexity misses a crucial point: amortization. A function might have high ST complexity because at some point during the evaluation the space requirement is high, but for most of the time a small memory is sufficient. As a consequence, ST complexity is not multiplicative: a function can have ST complexity C, but evaluating X instances of the function can be done with ST complexity much less than \(X\cdot C\), so the amortized ST cost is much less than C. Alwen and Blocki [AB16, AB17] later showed that prominent MHF candidates such as Argon2i [BDK16], winner of the Password Hashing Competition [PHC], do not have high amortized ST complexity.

To address this issue, [AS15] put forward the notion of cumulative-memory complexity (cmc). The cmc of a function is the sum – over all time steps – of the memory required to compute the function by any algorithm. Unlike ST complexity, cmc is multiplicative.

Sustained-Memory Complexity. Although cmc takes into account amortization and parallelism, it has been observed (e.g., [RD16, Cox16]) that it still is not sufficient to guarantee egalitarian hardware cost. The reason is simple: if a function has cmc C, this could mean that the algorithm minimizing cmc uses some T time steps and C / T memory on average, but it could also mean it uses time \(100\cdot T\) and \(C/(100\cdot T)\) memory on average. In practice this can make a huge difference because memory cost doesn’t scale linearly. The length of the wiring required to access memory of size M grows like \(\sqrt{M}\) (assuming a two dimensional layout of the circuit). This means, for one thing, that – as we increase M – the latency of accessing the memory will grow as \(\sqrt{M}\), and moreover the space for the wiring required to access the memory will grow like \(M^{1.5}\).

The exact behaviour of the hardware cost as the memory grows is not crucial here, just the point that it’s superlinear, which cmc fails to capture. In this work we introduce the notion of sustained-memory complexity, which does take this into account. Ideally, we want a function which can be evaluated by a “naïve” sequential algorithm (the one used by the honest parties) in time T using a memory of size S where (1) S should be close to T and (2) any parallel algorithm evaluating the function must use memory \(S'\) for at least \(T'\) steps, where \(T'\) and \(S'\) should be not much smaller than T and S, respectively.

Property (1) is required so the memory cost dominates the evaluation cost already for small values of T. Property (2) means that even a parallel algorithm will not be able to evaluate the function at much lower cost; any parallel algorithm must make almost as many steps as the naïve algorithm during which the required memory is almost as large as the maximum memory S used by the naïve algorithm. So, the cost of the best parallel algorithm is similar to the cost of the naïve sequential one, even if we don’t charge the parallel algorithm anything for all the steps where the memory is below \(S'\).

Ren and Devadas [RD16] previously proposed the notion of “consistent memory hardness”, which requires that any sequential evaluation algorithm must either use space \(S'\) for at least \(T'\) steps, or the algorithm must run for a long time, e.g., \(T \gg n^2\). Our notion of sustained-memory complexity strengthens this notion in that we consider parallel evaluation algorithms, and our guarantees are absolute, e.g., even if a parallel attacker runs for a very long time he must still use memory \(S'\) for at least \(T'\) steps. \(\mathtt {scrypt}\)  [Per09] is a good example of an MHF that has maximal cmc \(\varOmega \left( n^2\right) \) [ACP+17] but does not have high sustained space complexity. In particular, for any memory parameter M and any running time parameter n we can evaluate \(\mathtt {scrypt}\) in time \(n^2/M\) with maximum space M. As was argued in [RD16], an adversary may be able to fit \(M= n/100\) space in an ASIC, which would allow the attacker to speed up the computation by a factor of more than 100 and may explain the availability of ASICs for \(\mathtt {scrypt}\) despite its maximal cmc.

In this work we show that functions with asymptotically optimal sustained-memory complexity exist in the random oracle model. We note that we must make some idealized assumption on our building block, like being a random oracle, as with the current state of complexity theory we cannot even prove superlinear circuit lower bounds for problems in \(\mathcal {NP}\). For a given time T, our function uses maximal space \(S\in \varOmega (T)\) for the naïve algorithm, while any parallel algorithm must have at least \(T'\in \varOmega (T)\) steps during which it uses memory \(S'\in \varOmega (T/\log (T))\).

Graph Labelling. The functions we construct are defined over directed acyclic graphs (DAGs). For a DAG \(G_n=(V,E)\), we order the vertices \(V=\{v_1,\ldots ,v_n\}\) in some topological order (so if there is a path from i to j then \(i<j\)), with \(v_1\) being the unique source and \(v_n\) the unique sink of the graph. The function is now defined by \(G_n\) and the input, which specifies a random oracle H. The output is the label \(\ell _n\) of the sink, where the label of a node \(v_i\) is recursively defined as \(\ell _i=H(i,\ell _{p_1},\ldots ,\ell _{p_d})\), where \(v_{p_1},\ldots ,v_{p_d}\) are the parents of \(v_i\).
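
As a concrete illustration, the following sketch computes the labels in topological order, instantiating H (purely for illustration) with SHA-256 keyed by the function input x:

```python
import hashlib

def sink_label(parents, x: bytes) -> bytes:
    """parents[i-1] lists the parents of node v_i; nodes 1..n are in
    topological order and node n is the sink. The random oracle
    H(i, l_{p_1}, ..., l_{p_d}) is modeled -- an assumption made only
    for this sketch -- by SHA-256 over x, the index i and the parent
    labels."""
    n = len(parents)
    ell = {}
    for i in range(1, n + 1):
        h = hashlib.sha256(x + i.to_bytes(8, "big"))
        for p in parents[i - 1]:
            h.update(ell[p])
        ell[i] = h.digest()
    return ell[n]

# Example: the path v_1 -> v_2 -> v_3.
print(sink_label([[], [1], [2]], b"some input").hex())
```

Note that this naïve evaluation keeps all n labels in memory; the pebbling view below studies how much of this memory is actually needed, and for how long.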

Pebbling. Like many previous works, including [ABW03, RD17, AS15] discussed above, we reduce the task of proving lower bounds – in our case, on sustained memory complexity – for functions as just described, to proving lower bounds on some complexity of a pebbling game played on the underlying graph.

For example, Ren and Devadas [RD17] define a cost function for the so-called red-blue pebbling game, which then implies lower bounds on the bandwidth hardness of the function defined over the corresponding DAG.

Most closely related to this work is [AS15], who show that a lower bound on the so-called sequential (or parallel) cumulative (black) pebbling complexity (cpc) of a DAG implies a lower bound on the sequential (or parallel) cumulative memory complexity (cmc) of the labelling function defined over this graph. Alwen et al. [ABP17] constructed a constant-indegree family of DAGs with parallel cpc \(\varOmega (n^2/\log (n))\), which is optimal [AB16], and thus gives functions with optimal cmc. More recently, Alwen et al. [ABH17] extended these ideas to give the first practical construction of an iMHF with parallel cmc \(\varOmega (n^2/\log (n))\).

The black pebbling game – as considered in cpc – goes back to [HP70, Coo73]. It is defined over a DAG \(G=(V,E)\) and proceeds in rounds as follows. Initially all nodes are empty. In every round, the player can put a pebble on a node if all its parents contain pebbles (arbitrarily many pebbles per round in the parallel game, just one in the sequential game). Pebbles can be removed at any time. The game ends when a pebble is put on the sink. The cpc of such a game is the sum, over all time steps, of the pebbles placed on the graph. The sequential (or parallel) cpc of G is the cpc of the sequential (or parallel) black pebbling strategy which minimizes this cost.

It’s not hard to see that the sequential/parallel cpc of G directly implies the same upper bound on the sequential/parallel cmc of the graph labelling function: to compute the function in the sequential/parallel random oracle model, one simply mimics the pebbling game, where putting a pebble on vertex \(v_i\) with parents \(v_{p_1},\ldots ,v_{p_d}\) corresponds to the query \(\ell _i\leftarrow H(i,\ell _{p_1},\ldots ,\ell _{p_d})\), and where one keeps a label \(\ell _j\) in memory as long as \(v_j\) is pebbled. If the labels \(\ell _i\in \{0,1\}^w\) are w bits long, a cpc of p translates to a cmc of \(p\cdot w\).
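
To make the correspondence concrete, the following sketch evaluates the labelling function by replaying a given pebbling, storing exactly the labels of currently pebbled nodes (H is a stand-in parameter for the random oracle):

```python
def evaluate_via_pebbling(parents, pebbling, H):
    """Replay a legal pebbling (P_0 = set(), ..., P_t): each newly pebbled
    node v triggers the query l_v = H(v, parent labels); labels of
    unpebbled nodes are discarded. Returns the labels of the final
    configuration and the cumulative pebble count, so cmc = w * cpc
    for w-bit labels."""
    labels, cpc, prev = {}, 0, set()
    for P in pebbling[1:]:
        for v in sorted(P - prev):  # newly pebbled nodes this round
            labels[v] = H(v, tuple(labels[p] for p in parents[v - 1]))
        labels = {v: l for v, l in labels.items() if v in P}  # drop unpebbled
        cpc += len(P)
        prev = P
    return labels, cpc
```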

More interestingly, lower bounds have also been shown to carry over in this way. In particular, the ex-post-facto argument [AS15] shows that any adversary who computes the label \(\ell _n\) with high probability (over the choice of the random oracle H) with cmc m can be translated into a black pebbling strategy of the underlying graph with cpc almost m / w.

In this work we define the sustained-space complexity (ssc) of a sequential/parallel black pebbling game, and show that lower bounds on ssc translate to lower bounds on the sustained-memory complexity (smc) of the graph labelling function in the sequential/parallel random oracle model.

Consider a sequential (or parallel) black pebbling strategy (i.e., a valid sequence of pebbling configurations where the last configuration contains the sink) for a DAG \(G_n=(V,E)\) on \(|V|=n\) vertices. For some space parameter \( s \le n\), the s-ssc of this strategy is the number of pebbling configurations of size at least s. The sequential (or parallel) s-ssc of G is the minimum of this value over all sequential (or parallel) strategies. For example, if it’s possible to pebble G using \(s'<s\) pebbles (using arbitrarily many steps), then its s-ssc is 0. As for cpc vs. cmc, an upper bound on s-ssc implies the same upper bound for \((w\cdot s)\)-smc. In Sect. 5 we prove that lower bounds on ssc also translate to lower bounds on smc.

Thus, to construct a function with high parallel smc, it suffices to construct a family of DAGs with constant indegree and high parallel ssc. In Sect. 3 we construct such a family \(\{G_n\}_{n\in \mathbb {N}}\) of DAGs, where \(G_n\) has n vertices and indegree 2, and where the \(\varOmega (n/\log (n))\)-ssc is in \(\varOmega (n)\). This is basically the best we can hope for, as our bound on ssc trivially implies a \(\varOmega (n^2/\log (n))\) bound on cpc, which is optimal for any constant-indegree graph [AS15].

Data-Dependent vs Data-Independent MHFs. There are two categories of memory-hard functions: data-independent memory-hard functions (iMHFs) and data-dependent memory-hard functions (dMHFs). As the name suggests, the algorithm computing an iMHF must induce a memory access pattern that is independent of the potentially sensitive input (e.g., a password), while dMHFs have no such constraint. While dMHFs (e.g., \(\mathtt {scrypt}\)  [PJ12], Argon2d, Argon2id [BDK16]) are potentially easier to construct, iMHFs (e.g., Argon2i [BDK16], DRSample [ABH17]) are resistant to side-channel attacks such as cache-timing. For the cumulative memory complexity metric there is a clear gap between iMHFs and dMHFs. In particular, it is known that \(\mathtt {scrypt}\) has cmc at least \(\varOmega \left( n^2w\right) \) [ACP+17], while any iMHF has cmc at most \(O\left( \frac{n^2 w \log \log n}{\log n}\right) \). Interestingly, the same gap does not hold for smc. In particular, any dMHF can be computed with maximum space \(O\left( nw/\log n + n \log n\right) \) by recursively applying a result of Hopcroft et al. [HPV77]—see more details in the full version [ABP18].

1.2 High Level Description of Our Construction and Proof

Our construction of a family \(\{G_n\}_{n\in \mathbb {N}}\) of DAGs with optimal ssc involves three building blocks:

The first building block is a construction due to Paul et al. [PTC76] of a family of DAGs \(\{{\mathsf {PTC}} _n\}_{n\in \mathbb {N}}\) with \({\mathsf {indeg}} ({\mathsf {PTC}} _n)=2\) and space complexity \(\varOmega (n/\log n)\). More significantly for us, they proved that for any sequential pebbling of \({\mathsf {PTC}} _n\) there is a time interval [i,j] during which at least \(\varOmega (n/\log n)\) new pebbles are placed on sources of \({\mathsf {PTC}} _n\) and at least \(\varOmega (n/\log n)\) pebbles are always on the DAG. We extend the proof of Paul et al. [PTC76] to show that the same holds for any parallel pebbling of \({\mathsf {PTC}} _n\) – a pebbling game first introduced in [AS15] which naturally models parallel computation. We can argue that \(j-i =\varOmega (n/\log n)\) for any sequential pebbling, since it takes at least this many steps to place \(\varOmega (n/\log n)\) new pebbles. However, we stress that this argument does not apply to parallel pebblings, so it does not directly imply anything about the sustained space complexity of parallel pebblings.

To address this issue we introduce our second building block: a family \(\{D_n^\epsilon \}_{n\in \mathbb {N}}\) of extremely depth-robust DAGs with \({\mathsf {indeg}} (D_n^\epsilon )\in O\left( \log n\right) \)—for any constant \(\epsilon >0\) the DAG \(D_n^\epsilon \) is \((e,d)\)-depth robust for any \(e+d \le (1-\epsilon )n\). We remark that this improves upon the construction of Mahmoody et al. [MMV13], which required \({\mathsf {indeg}} (D_n) \in O\left( \log ^2 n {{\mathsf {polylog}}} (\log n)\right) \), and may be of independent interest (e.g., our construction immediately yields a more efficient construction of proofs of sequential work [MMV13]). Our construction of \(D_n^\epsilon \) is (essentially) the same as that of Erdos et al. [EGS75], albeit with a much tighter analysis. By overlaying an extremely depth-robust DAG \(D_m^\epsilon \) on top of the m sources of \({\mathsf {PTC}} _n\), the construction of Paul et al. [PTC76], we can ensure that it takes \(\varOmega (n/\log n)\) steps to pebble \(\varOmega (n/\log n)\) sources of \({\mathsf {PTC}} _n\). However, the resulting graph would have indegree \(O(\log n)\) and would have sustained space \(\varOmega (n/\log n)\) for at most \(O(n/\log n)\) steps. By contrast, we want an n-node DAG G with \({\mathsf {indeg}} (G)=2\) which requires space \(\varOmega (n/\log n)\) for at least \(\varOmega (n)\) steps.

Our final tool is the indegree reduction lemma of Alwen et al. [ABP17], which we apply to \(\{D_t^{\epsilon }\}_{t\in \mathbb {N}}\) to obtain a family of DAGs \(\{J_t^{\epsilon }\}_{t\in \mathbb {N}}\) such that \(J_t^\epsilon \) has \({\mathsf {indeg}} \left( J_t^\epsilon \right) =2\) and \(2t \cdot {\mathsf {indeg}} \left( D_t^{\epsilon }\right) \in O(t \log t)\) nodes. Each node in \(D_t^\epsilon \) is associated with a path of length \(2 \cdot {\mathsf {indeg}} (D_t^{\epsilon })\) in \(J_t^\epsilon \), and each path p in \(D_t^\epsilon \) corresponds to a path \(p'\) of length \(|p'|\ge |p| \cdot {\mathsf {indeg}} (D_t^{\epsilon })\) in \(J_t^{\epsilon }\). We can then overlay the DAG \(J_t^{\epsilon }\) on top of the sources in \({\mathsf {PTC}} _n\), where \(t=\varOmega (n/\log n)\) is the number of sources in \({\mathsf {PTC}} _n\). The final DAG has size O(n) and we can show that any legal parallel pebbling requires \(\varOmega (n)\) steps with at least \(\varOmega (n/\log n)\) pebbles on the DAG.

2 Preliminaries

In this section we introduce common notation, definitions and results from other work which we will be using. In particular the following borrows heavily from [ABP17, AT17].

2.1 Notation

We start with some common notation. Let \({\mathbb {N}} = \{0, 1, 2,\ldots \}\), \({\mathbb {N}} ^+ = \{1, 2,\ldots \}\), and \({\mathbb {N}} _{\ge c} = \{c, c+1, c+2, \ldots \}\) for \(c \in {\mathbb {N}} \). Further, we write \([c] := \{1, 2,\ldots ,c\}\) and \([b,c]= \{b, b+1, \ldots , c\}\) where \(c \ge b \in {\mathbb {N}} \). We denote the cardinality of a set B by |B|.

2.2 Graphs

The central objects of interest in this work are directed acyclic graphs (DAGs). A DAG \(G=(V,E)\) has size \(n = |V|\). The indegree of a node \(v\in V\) is \({\mathsf {indeg}} (v) = |(V \times \{v\}) \cap E|\), the number of its incoming edges. More generally, we say that G has indegree \({\delta } = {\mathsf {indeg}} (G)\) if the maximum indegree of any node of G is \({\delta } \). If \({\mathsf {indeg}} (v) = 0\) then v is called a source node, and if v has no outgoing edges it is called a sink. We use \({\mathsf {parents}} _G(v) = \{u \in V: (u,v) \in E\}\) to denote the parents of a node \(v \in V\). In general, we use \({\mathsf {ancestors}} _G(v) := \bigcup _{i \ge 1} {\mathsf {parents}} _G^i(v)\) to denote the set of all ancestors of v—here, \({\mathsf {parents}} _G^2(v) := {\mathsf {parents}} _G\left( {\mathsf {parents}} _G(v) \right) \) denotes the grandparents of v and \({\mathsf {parents}} ^{i+1}_G(v) := {\mathsf {parents}} _G\left( {\mathsf {parents}} ^i_G(v)\right) \). When G is clear from context we will simply write \({\mathsf {parents}} \) (\({\mathsf {ancestors}} \)). We denote the set of all sinks of G by \({\mathsf {sinks}} (G)=\{v\in V: \not \exists (v,u)\in E\}\)—note that \({\mathsf {ancestors}} \left( {\mathsf {sinks}} (G) \right) = V\). The length of a directed path \(p = (v_1, v_2, \ldots , v_z)\) in G is the number of nodes it traverses: \({\mathsf {length}} (p):=z\). The depth \(d={\mathsf {depth}} (G)\) of DAG G is the length of the longest directed path in G. We often consider the set of all DAGs of fixed size n, \({\mathbb {G}} _n := \{G=(V,E)\ : \ |V|=n\}\), and the subset of those DAGs with at most some fixed indegree, \({\mathbb {G}} _{n,{\delta }} := \{G\in {\mathbb {G}} _n : {\mathsf {indeg}} (G) \le {\delta } \}\). Finally, we denote the graph obtained from \(G=(V,E)\) by removing nodes \(S\subseteq V\) (and incident edges) by \(G-S\), and we denote by \(G[S]=G-(V\setminus S)\) the graph obtained by removing nodes \(V \setminus S\) (and incident edges).

The following is an important combinatorial property of a DAG for this work.

Definition 1

(Depth-Robustness). For \(n\in {\mathbb {N}} \) and \(e,d \in [n]\) a DAG \(G=(V,E)\) is \((e,d)\)-depth-robust if

$$\forall S \subset V~~~|S| \le e \Rightarrow {\mathsf {depth}} (G-S) \ge d.$$
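
On small examples, Definition 1 can be checked by brute force; the following sketch (exponential in e, so for toy graphs only) makes the quantifier explicit:

```python
from itertools import combinations

def depth(nodes, edges):
    """Number of nodes on the longest directed path; assumes the node
    labels are topologically ordered (u < v for every edge (u, v))."""
    longest = {v: 1 for v in nodes}
    for (u, v) in sorted(edges):
        longest[v] = max(longest[v], longest[u] + 1)
    return max(longest.values(), default=0)

def is_depth_robust(n, edges, e, d):
    """Check (e, d)-depth-robustness: depth(G - S) >= d for all |S| <= e."""
    for k in range(e + 1):
        for S in combinations(range(1, n + 1), k):
            keep = set(range(1, n + 1)) - set(S)
            sub = [(u, v) for (u, v) in edges if u in keep and v in keep]
            if depth(keep, sub) < d:
                return False
    return True
```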

The following lemma due to Alwen et al. [ABP17] will be useful in our analysis. Since our statement of the result is slightly different from [ABP17] we include a proof in Appendix A for completeness.

Lemma 1

[ABP17, Lemma 1] (Indegree-Reduction). Let \(G=(V=[n],E)\) be an \((e,d)\)-depth-robust DAG on n nodes and let \({\delta } = {\mathsf {indeg}} (G)\). We can efficiently construct a DAG \(G' = (V'=[2n{\delta } ],E')\) on \(2n{\delta } \) nodes with \({\mathsf {indeg}} (G')=2\) such that for each path \(p=(x_1,\ldots ,x_k)\) in G there exists a corresponding path \(p'\) of length \(\ge k{\delta } \) in \(G'\left[ \bigcup _{i=1}^k [2(x_i-1){\delta } +1,2x_i{\delta } ]\right] \) such that \(2x_i {\delta } \in p'\) for each \(i \in [k]\). In particular, \(G'\) is \((e,d{\delta })\)-depth robust.
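
One natural way to realize such a transformation is sketched below (our own illustration, consistent with the statement of the lemma; the exact wiring in [ABP17] may differ in details): each node v is replaced by a path of \(2{\delta }\) fresh nodes ending in \(2v{\delta }\), and the i-th incoming edge of v is routed into the i-th node of that path, so every node of \(G'\) has indegree at most 2.

```python
def indegree_reduce(n, edges, delta):
    """Build G' on [2*n*delta] from G = ([n], edges) with indeg(G) <= delta.
    Node v of G becomes the path 2*(v-1)*delta+1 -> ... -> 2*v*delta;
    the i-th incoming edge (u, v) of G becomes (2*u*delta, 2*(v-1)*delta + i)."""
    new_edges = set()
    for v in range(1, n + 1):
        base = 2 * (v - 1) * delta
        for k in range(1, 2 * delta):              # internal path edges
            new_edges.add((base + k, base + k + 1))
    incoming = {v: [] for v in range(1, n + 1)}
    for (u, v) in edges:
        incoming[v].append(u)
    for v, us in incoming.items():
        base = 2 * (v - 1) * delta
        for i, u in enumerate(us, start=1):        # i-th parent feeds path node i
            new_edges.add((2 * u * delta, base + i))
    return 2 * n * delta, new_edges
```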

2.3 Pebbling Models

The main computational models of interest in this work are the parallel (and sequential) pebbling games played over a directed acyclic graph. Below we define these models and associated notation and complexity measures. Much of the notation is taken from [AS15, ABP17].

Definition 2

(Parallel/Sequential Graph Pebbling). Let \(G= (V,E)\) be a DAG and let \(T \subseteq V\) be a target set of nodes to be pebbled. A pebbling configuration (of G) is a subset \(P_i\subseteq V\). A legal parallel pebbling of T is a sequence \(P=(P_0,\ldots ,P_t)\) of pebbling configurations of G where \(P_0 = \emptyset \) and which satisfies conditions 1 & 2 below. A sequential pebbling additionally must satisfy condition 3.

  1.

    At some step every target node is pebbled (though not necessarily simultaneously).

    $$ \forall x \in T~ \exists z \le t ~~:~~x\in P_z. $$
  2.

    A pebble can be added only if all its parents were pebbled at the end of the previous step.

    $$ \forall i \in [t]~~:~~ x \in (P_i \setminus P_{i-1}) ~\Rightarrow ~ {\mathsf {parents}} (x) \subseteq P_{i-1}. $$
  3.

    At most one pebble is placed per step.

    $$ \forall i \in [t]~~:~~ |P_i \setminus P_{i-1}|\le 1 \ . $$

We denote with \(\mathcal{P}_{G,T}\) and \(\mathcal{P}^{\parallel }_{G,T}\) the set of all legal sequential and parallel pebblings of G with target set T, respectively. Note that \(\mathcal{P}_{G,T}\subseteq \mathcal{P}^{\parallel }_{G,T}\). We will mostly be interested in the case where \(T = {\mathsf {sinks}} (G)\) in which case we write \(\mathcal{P}_{G}\) and \(\mathcal{P}^{\parallel }_{G}\).
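
Definition 2 translates directly into code; a sketch of a legality check for a given sequence of configurations:

```python
def is_legal_pebbling(parents, T, P, sequential=False):
    """Check the conditions of Definition 2 for P = [P_0, ..., P_t], a list
    of sets of nodes; parents[v] is the set of parents of node v."""
    if P[0] != set():
        return False
    for i in range(1, len(P)):
        new = P[i] - P[i - 1]
        if any(not parents[v] <= P[i - 1] for v in new):  # condition 2
            return False
        if sequential and len(new) > 1:                   # condition 3
            return False
    return all(any(x in Pi for Pi in P) for x in T)       # condition 1
```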

Definition 3

(Pebbling Complexity). The standard notions of time, space, space-time and cumulative (pebbling) complexity (cc) of a pebbling \(P=\{P_0,\ldots ,P_t\}\in \mathcal{P}^{\parallel }_G\) are defined to be:

$$ \varPi _t(P)=t ~~~~~ \varPi _s(P)= \max _{i\in [t]}|P_i| ~~~~~ \varPi _{st}(P)= \varPi _t(P)\cdot \varPi _s(P) ~~~~~ \varPi _{cc}(P)= \sum _{i\in [t]}|P_i| \ . $$

For \(\alpha \in \{s,t,{st},{cc}\}\) and a target set \(T \subseteq V\), the sequential and parallel pebbling complexities of G are defined as

$$ \varPi _\alpha (G,T)=\min _{P\in \mathcal{P}_{G,T}}\varPi _\alpha (P) \qquad {and}\qquad \varPi ^{\parallel }_\alpha (G,T)=\min _{P\in \mathcal{P}^{\parallel }_{G,T}}\varPi _\alpha (P) \ . $$

When \(T = {\mathsf {sinks}} (G)\) we simplify notation and write \(\varPi _\alpha (G)\) and \(\varPi ^{\parallel }_\alpha (G)\).
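
For a concrete pebbling these four measures are immediate to compute (finding the minimizing pebbling is, of course, the hard part):

```python
def pebbling_measures(P):
    """Time, space, space-time and cumulative complexity of a pebbling
    P = [P_0, ..., P_t], given as a list of sets (Definition 3)."""
    t = len(P) - 1
    s = max((len(Pi) for Pi in P[1:]), default=0)
    return {"time": t, "space": s,
            "space-time": t * s,
            "cc": sum(len(Pi) for Pi in P[1:])}
```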

The following defines a sequential pebbling obtained naturally from a parallel one by adding the new pebbles one at a time.

Definition 4

Given a DAG G and \(P = \left( P_0,\ldots ,P_t\right) \in \mathcal{P}^{\parallel }_G\) the sequential transform \({\mathsf {seq}} (P) = P' \in \mathcal{P}_G\) is defined as follows: Let \(D_i = P_i \setminus P_{i-1}\) and let \(a_i = \left| P_i \setminus P_{i-1}\right| \) be the number of new pebbles placed on G at time i. Further, let \(A_j = \sum _{i=1}^j a_i\) (with \(A_0=0\)) and let \(D_i[k]\) denote the \(k^{th}\) element of \(D_i\) (according to some fixed ordering of the nodes). We construct \(P' = \left( P_{0}',\ldots ,P_{A_t}'\right) \in \mathcal{P}_{G}\) as follows: (1) \(P_{A_i}' = P_i\) for all \(i\in [0,t]\), and (2) for \(k\in [1, a_{i+1}]\) let \(P_{A_i+k}' = P_{A_i+k-1}' \cup \{D_{i+1}[k]\} \).
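
In code, the transform simply expands each parallel round into one-pebble-at-a-time steps (a sketch; removal-only rounds, where \(a_{i+1}=0\), collapse into the preceding configuration, exactly as in the definition):

```python
def seq_transform(P):
    """Sequential transform of Definition 4. P is a list of sets with
    P[0] == set(); returns the expanded sequence (P'_0, ..., P'_{A_t})."""
    Q = [set()]
    for i in range(1, len(P)):
        for v in sorted(P[i] - P[i - 1]):  # place new pebbles one at a time
            Q.append(Q[-1] | {v})
        Q[-1] = set(P[i])                  # finally drop the removed pebbles
    return Q
```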

It easily follows from the definition that the parallel and sequential space complexities differ by at most a multiplicative factor of 2.

Lemma 2

For any DAG G and \(P\in \mathcal{P}^{\parallel }_G\) it holds that \({\mathsf {seq}} (P) \in \mathcal{P}_G\) and \(\varPi _s({\mathsf {seq}} (P)) \le 2 \cdot \varPi ^{\parallel }_s(P)\). In particular \(\varPi _s(G) /2 \le \varPi ^{\parallel }_s(G)\).

Proof

Let \(P\in \mathcal{P}^{\parallel }_G\) and \(P' = {\mathsf {seq}} (P)\). Suppose \(P'\) is not a legal pebbling because \(v\in V\) was illegally pebbled in \(P'_{A_i+k}\). If \(k=0\) then \({\mathsf {parents}} _G(v) \not \subseteq P'_{A_{i-1}+a_i-1}\), which implies that \({\mathsf {parents}} _G(v) \not \subseteq P_{i-1}\) since \(P_{i-1} \subseteq P'_{A_{i-1}+a_i-1}\). Moreover \(v\in P_i\), so P would also illegally pebble v at time i. If instead \(k\ge 1\) then \(v\in P_{i+1}\), but since \({\mathsf {parents}} _G(v) \not \subseteq P'_{A_i+k-1}\) it must be that \({\mathsf {parents}} _G(v) \not \subseteq P_i\), so P must have pebbled v illegally at time \(i+1\). Either way we reach a contradiction, so \(P'\) must be a legal pebbling of G. To see that \(P'\) is complete note that \(P_0 = P'_{A_0}\). Moreover, for any sink \(u \in V\) of G there exists a time \(i\in [0,t]\) with \(u\in P_i\) and so \(u\in P'_{A_i}\). Together this implies \(P'\in \mathcal{P}_G\).

Finally, it follows by inspection that for all \(i\ge 0\) we have \(|P'_{A_i}| = |P_i|\) and for all \(0<k<a_{i+1}\) we have \(|P'_{A_i+k}| \le |P_i| + |P_{i+1}|\), which implies that \(\varPi _s(P') \le 2\cdot \varPi ^{\parallel }_s(P)\).

New to this work is the following notion of sustained-space complexity.

Definition 5

(Sustained Space Complexity). For \(s\in {\mathbb {N}} \) the s-sustained-space (s-ss) complexity of a pebbling \(P=\{P_0,\ldots ,P_t\}\in \mathcal{P}^{\parallel }_G\) is:

$$ \varPi _{ss}(P,s) = |\{ i \in [t] : |P_i| \ge s\}|. $$

More generally, the sequential and parallel s-sustained space complexities of G are defined as

$$ \varPi _{ss}(G,T,s)=\min _{P\in \mathcal{P}_{G,T}}\varPi _{ss}(P,s) \qquad {and}\qquad \varPi ^{\parallel }_{ss}(G,T,s)=\min _{P\in \mathcal{P}^{\parallel }_{G,T}}\varPi _{ss}(P,s) \ . $$

As before, when \(T = {\mathsf {sinks}} (G)\) we simplify notation and write \(\varPi _{ss}(G,s)\) and \(\varPi ^{\parallel }_{ss}(G,s)\).

Remark 1

(On Amortization). An astute reader may observe that \(\varPi ^{\parallel }_{ss}\) is not amortizable. In particular, if we let \(G^{\bigotimes m}\) denote the graph which consists of m independent copies of G, then we may have \(\varPi ^{\parallel }_{ss}\left( G^{\bigotimes m},s\right) \ll m \varPi ^{\parallel }_{ss}(G,s)\). However, we observe that the issue can easily be corrected by defining the amortized s-sustained-space complexity of a pebbling \(P=\{P_0,\ldots ,P_t\}\in \mathcal{P}^{\parallel }_G\):

$$ \varPi _{am,{ss}}(P,s)= \sum _{i=1}^t \left\lfloor \frac{\left| P_i\right| }{s} \right\rfloor . $$

In this case we have \(\varPi ^{\parallel }_{am,{ss}}\left( G^{\bigotimes m},s\right) = m \varPi ^{\parallel }_{am,{ss}}(G,s)\) where \(\varPi ^{\parallel }_{am,{ss}}(G,s) \doteq \min _{P\in \mathcal{P}^{\parallel }_{G,{\mathsf {sinks}} (G)}} \varPi _{am,{ss}}(P,s)\). We remark that a lower bound on s-sustained-space complexity is a strictly stronger guarantee than an equivalent lower bound for amortized s-sustained-space since \(\varPi ^{\parallel }_{ss}(G,s) \le \varPi ^{\parallel }_{am,{ss}}(G,s)\). In particular, all of our lower bounds for \(\varPi ^{\parallel }_{ss}\) also hold with respect to \(\varPi ^{\parallel }_{am,{ss}}\).
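
Both the sustained and the amortized sustained space of a concrete pebbling are one-liners:

```python
def sustained_space(P, s):
    """Pi_ss(P, s): number of steps i in [t] with |P_i| >= s (Definition 5)."""
    return sum(1 for Pi in P[1:] if len(Pi) >= s)

def amortized_sustained_space(P, s):
    """Pi_{am,ss}(P, s) = sum over i of floor(|P_i| / s) (Remark 1)."""
    return sum(len(Pi) // s for Pi in P[1:])
```

Note that \(\varPi _{ss}(P,s) \le \varPi _{am,{ss}}(P,s)\) holds term by term, matching the remark above.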

The following shows that the indegree of any graph can be reduced to 2 without losing too much in the parallel sustained space complexity. The technique is similar to the indegree reduction for cumulative complexity in [AS15]. The proof is in Appendix A. While we include the lemma for completeness, we stress that for our specific constructions we will use a more direct approach to lower bound \(\varPi ^{\parallel }_{ss}\), to avoid the factor-\({\delta }\) reduction in space.

Lemma 3

(Indegree Reduction for Parallel Sustained Space).

$$ \forall G \in {\mathbb {G}} _{n,{\delta }}, ~~ \exists H \in {\mathbb {G}} _{n',2}~{such~that }~\forall s\ge 0~~ \varPi ^{\parallel }_{ss}(H,s/({\delta }-1)) = \varPi ^{\parallel }_{ss}(G,s)~{ where}~n' \in [n,{\delta } n]. $$

3 A Graph with Optimal Sustained Space Complexity

In this section we construct and analyse a graph with very high sustained space complexity by modifying the graph of [PTC76] using the graph of [EGS75]. Theorem 1, our main theorem, states that there is a family of constant-indegree DAGs \(\{G_n\}_{n=1}^\infty \) with maximum possible sustained space: \(\varPi ^{\parallel }_{ss}\left( G_n, \varOmega (n/\log n) \right) = \varOmega (n)\).

Theorem 1

For some constants \(c_4,c_5 >0\) there is a family of DAGs \(\{G_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( G_n\right) = 2\), O(n) nodes and \(\varPi ^{\parallel }_{ss}\left( G_n,c_4n/\log n \right) \ge c_5n\).

Remark 2

We observe that Theorem 1 is essentially optimal in an asymptotic sense. Hopcroft et al. [HPV77] showed that any DAG \(G_n\) with \({\mathsf {indeg}} (G_n)\in O(1)\) can be pebbled with space at most \(\varPi ^{\parallel }_s(G_n) \in O\left( n/\log n\right) \). Thus, \(\varPi _{ss}\left( G_n,s_n=\omega \left( n/\log n\right) \right) = 0\) for any DAG \(G_n\) with \({\mathsf {indeg}} (G_n)\in O(1)\), since \(s_n > \varPi _s(G_n)\).

We now overview the key technical ingredients in the proof of Theorem 1.

Technical Ingredient 1: High Space Complexity DAGs. The first key building block is a construction of Paul et al. [PTC76] of a family of n-node DAGs \(\{{\mathsf {PTC}} _n\}_{n=1}^\infty \) with space complexity \(\varPi _s({\mathsf {PTC}} _n)\in \varOmega (n/\log n)\) and \({\mathsf {indeg}} ({\mathsf {PTC}} _n)=2\). Lemma 2 implies that \(\varPi ^{\parallel }_s({\mathsf {PTC}} _n) \in \varOmega (n/\log n)\) since \(\varPi _s({\mathsf {PTC}} _n) /2 \le \varPi ^{\parallel }_s({\mathsf {PTC}} _n)\). However, we stress that this does not imply that the sustained space complexity of \({\mathsf {PTC}} _n\) is large. In fact, by inspection one can easily verify that \({\mathsf {depth}} ({\mathsf {PTC}} _n) \in O(n/\log n)\), so we have \(\varPi _{ss}({\mathsf {PTC}} _n, s) \in O(n/\log n)\) for any space parameter \(s>0\). Nevertheless, one of the core lemmas from [PTC76] will be very useful in our proofs. In particular, \({\mathsf {PTC}} _n\) contains \(O(n/\log n)\) source nodes (as illustrated in Fig. 1a) and [PTC76] proved that for any sequential pebbling \(P = (P_0,\ldots ,P_t) \in \mathcal{P}_{{\mathsf {PTC}} _n}\) we can find an interval \([i,j] \subseteq [t]\) during which \(\varOmega (n/\log n)\) sources are (re)pebbled and at least \(\varOmega (n/\log n)\) pebbles are always on the graph.

Theorem 2 states that the same result holds for all parallel pebblings \(P \in \mathcal{P}^{\parallel }_{{\mathsf {PTC}} _n}\). Since Paul et al. [PTC76] technically only considered sequential black pebblings, we include the straightforward proof of Theorem 2 in the full version of this paper for completeness [ABP18]. Briefly, to prove Theorem 2 we simply consider the sequential transform \({\mathsf {seq}} (P) = (Q_0,\ldots ,Q_{t'}) \in \mathcal{P}_{{\mathsf {PTC}} _n}\) of the parallel pebbling P. Since \({\mathsf {seq}} (P)\) is sequential we can find an interval \([i',j'] \subseteq [t']\) during which \(\varOmega (n/\log n)\) sources are (re)pebbled and at least \(\varOmega (n/\log n)\) pebbles are always on the graph. We can then translate \([i',j']\) to a corresponding interval \([i,j] \subseteq [t]\) during which the same properties hold for P.

Theorem 2

There is a family of DAGs \(\{{\mathsf {PTC}} _n= (V_n=[n],E_n)\}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( {\mathsf {PTC}} _n\right) = 2\) and positive constants \(c_1,c_2,c_3 > 0\) such that for each \(n \ge 1\) the set \(S = \{v \in [n] ~:~ {\mathsf {parents}} (v)=\emptyset \}\) of sources of \({\mathsf {PTC}} _n\) has size \(\left| S\right| \le c_1n/\log n\), and for any legal pebbling \(P=\left( P_1,\ldots ,P_t\right) \in \mathcal{P}^{\parallel }_{{\mathsf {PTC}} _n}\) there is an interval \([i,j] \subseteq [t]\) such that (1) \(\left| S \cap \bigcup _{k =i}^j P_k \setminus P_{i-1} \right| \ge c_2n/\log n\), i.e., at least \(c_2 n/\log n\) nodes in S are (re)pebbled during this interval, and (2) \(\forall k \in [i,j]: \left| P_k \right| \ge c_3n/\log n\), i.e., at least \(c_3n/\log n\) pebbles are always on the graph.

One of the key remaining challenges in establishing high sustained space complexity is that the interval [i,j] we obtain from Theorem 2 might be very short for parallel black pebblings. For sequential pebblings it would take \(\varOmega (n/\log n)\) steps to (re)pebble \(\varOmega (n/\log n)\) source nodes, since we can add at most one new pebble in each round. However, for parallel pebblings we cannot rule out the possibility that all \(\varOmega (n/\log n)\) sources are pebbled in a single step!

A first attempt at a fix is to modify \({\mathsf {PTC}} _n\) by overlaying a path of length \(\varOmega (n)\) on top of these \(\varOmega (n/\log n)\) source nodes to ensure that the length of the interval, \(j-i+1\), is sufficiently large. The hope is that it will now take at least \(\varOmega (n)\) steps to (re)pebble any subset of \(\varOmega (n/\log n)\) of the original sources, since these nodes will be connected by a path of length \(\varOmega (n)\). However, we do not know what the pebbling configuration looks like at time \(i-1\). In particular, if \(P_{i-1}\) contained just \(\sqrt{n}\) (suitably spaced) nodes on this path, then it would be possible to (re)pebble all nodes on the path in at most \(O\left( \sqrt{n}\right) \) steps. This motivates our second technical ingredient: extremely depth-robust graphs.

Technical Ingredient 2: Extremely Depth-Robust Graphs. Our second ingredient is a family \(\{D_n^\epsilon \}_{n=1}^\infty \) of highly depth-robust DAGs with n nodes and \({\mathsf {indeg}} (D_n^\epsilon ) \in O(\log n)\). In particular, \(D_n^\epsilon \) is \((e,d)\)-depth robust for any \(e+d \le n(1-\epsilon )\). We show how to construct such a family for any constant \(\epsilon >0\) in Sect. 4. Assuming for now that such a family exists, we can overlay \(D_m^\epsilon \) over the \(m=m_n\le c_1 n/\log n\) sources of \({\mathsf {PTC}} _n\). Since \(D_m^\epsilon \) is highly depth-robust it will take at least \(c_2n/\log n - \epsilon m \ge c_2 n/\log n - \epsilon c_1 n/\log n \in \varOmega (n/\log n)\) steps to pebble these \(c_2 n/\log n\) sources during the interval [i,j].

Overlaying \(D_m^\epsilon \) over the \(m \in O(n/\log (n))\) sources of \({\mathsf {PTC}} _n\) yields a DAG G with O(n) nodes, \({\mathsf {indeg}} (G)\in O(\log n)\) and \(\varPi ^{\parallel }_{ss}\left( G,c_4n/\log n \right) \ge c_5n/\log n\) for some constants \(c_4,c_5 >0\). While this is progress, it is still a weaker result than Theorem 1, which promised a DAG G with O(n) nodes, \({\mathsf {indeg}} (G)=2\) and \(\varPi ^{\parallel }_{ss}\left( G,c_4n/\log n \right) \ge c_5n\) for some constants \(c_4,c_5 >0\). Thus, we need to introduce a third technical ingredient: indegree reduction.

Technical Ingredient 3: Indegree Reduction. To ensure \({\mathsf {indeg}} (G_n)=2\) we instead apply the indegree reduction algorithm from Lemma 1 to \(D_m^\epsilon \) to obtain a graph \(J_m^\epsilon \) with \(2m{\delta } \in O(n)\) nodes \([2{\delta } m]\) and \({\mathsf {indeg}} (J_m^\epsilon )=2\) before overlaying—here \({\delta } = {\mathsf {indeg}} (D_m^\epsilon )\). This process is illustrated in Fig. 1b. We then obtain our final construction \(G_n\), illustrated in Fig. 1, by associating the m sources of \({\mathsf {PTC}} _n\) with the nodes \(\{2{\delta } v~:~v \in [m]\}\) of \(J_m^\epsilon \), where \(\epsilon >0\) is fixed to be some suitably small constant.
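
Schematically, the overlay step can be described as follows (a sketch only: the constructions of \({\mathsf {PTC}} _n\) and \(J_m^\epsilon \) are treated as given black boxes and are not reproduced here):

```python
def overlay(n, ptc_edges, ptc_sources, delta, J_edges):
    """Overlay J (on nodes [2*delta*m], m = len(ptc_sources), indegree 2)
    on the sources of PTC_n: node 2*delta*v of J is identified with the
    v-th source of PTC_n; every other node of J gets a fresh label > n."""
    def rename(u):
        if u % (2 * delta) == 0:       # u = 2*delta*v: an overlayed node
            return ptc_sources[u // (2 * delta) - 1]
        return n + u                   # fresh node, disjoint from [n]
    edges = set(ptc_edges)
    edges |= {(rename(u), rename(v)) for (u, v) in J_edges}
    return edges                       # edge set of the final DAG G_n
```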

Fig. 1. Building \(G_n\) with \(\varPi ^{\parallel }_{ss}\left( G_n, \frac{c n}{\log n} \right) \in \varOmega (n)\) for some constant \(c>0\).

It is straightforward to show that \(J_m^\epsilon \) is \((e,{\delta } d)\)-depth robust for any \(e+d \le (1-\epsilon ) m\). Thus, it is tempting to conclude that it will take \(\varOmega (n)\) steps to (re)pebble \(c_2 n/\log n\) sources during the interval [i,j] we obtain from Theorem 2. However, we still run into the same problem: suppose that at some point in time k we can find a set \(T \subseteq \{2v{\delta }~:~v \in [m]\}\setminus P_k\) with \(|T| \ge c_2 n/\log n\) (e.g., a set of sources in \({\mathsf {PTC}} _n\)) such that the longest path running through T in \(J_m^\epsilon - P_{k}\) has length less than \(c_5 n\). If the interval [i,j] starts at time \(i=k+1\) then we cannot ensure that it will take time \(\ge c_5 n\) to (re)pebble these \(c_2 n/\log n\) source nodes.

Claim 1 addresses this challenge directly. If such a problematic time k exists then Claim 1 implies that we must have \(\varPi ^{\parallel }_{ss}\left( P, \varOmega (n/\log n) \right) \in \varOmega (n)\). At a high level the argument proceeds as follows: suppose that we find such a problematic time k along with a set \(T \subseteq \{2v{\delta }~:~v \in [m]\}\setminus P_k\) with \(|T| \ge c_2 n/\log n\) such that \({\mathsf {depth}} \left( J_m^\epsilon [T]\right) \le c_5 n\). Then for any time \(r \in [k-c_5 n,k]\) we know that the length of the longest path running through T in \(J_m^\epsilon - P_r\) is at most \({\mathsf {depth}} \left( J_m^\epsilon [T]- P_r\right) \le c_5 n +(k-r) \le 2 c_5 n\), since the depth can decrease by at most one in each round. We can then use the extreme depth-robustness of \(D_m^\epsilon \) and the construction of \(J_m^\epsilon \) to argue that \(\left| P_r \right| = \varOmega (n/\log n)\) for each \(r \in [k-c_5 n,k]\). Finally, if no such problematic time k exists then the interval [i,j] we obtain from Theorem 2 must have length at least \(j-i \ge c_5 n\). In either case we have \(\varPi ^{\parallel }_{ss}\left( P, \varOmega (n/\log n) \right) \ge \varOmega (n)\).

Proof of Theorem 1. We begin with the family of DAGs \(\{{\mathsf {PTC}} _n\}_{n=1}^\infty \) from Theorem 2. Fixing \({\mathsf {PTC}} _n=([n],E_n)\) we let \(S = \{v \in [n]: {\mathsf {parents}} (v) = \emptyset \}\subseteq V\) denote the sources of this graph and we let \(c_1,c_2,c_3 > 0\) be the constants from Theorem 2. Let \(\epsilon \le c_2/(4c_1)\). By Theorem 3 we can find a depth-robust DAG \(D_{|S|}^\epsilon \) on |S| nodes which is \((a|S|, b|S|)\)-depth robust for any \(a+b\le 1-\epsilon \), with indegree \(c' \log n \le {\delta } = {\mathsf {indeg}} (D_{|S|}^\epsilon ) \le c'' \log (n)\) for some constants \(c',c''\). We let \(J_{|S|}^\epsilon \) denote the indegree-reduced version of \(D_{|S|}^\epsilon \) from Lemma 1, with \(2|S|{\delta } \in O(n)\) nodes and \({\mathsf {indeg}} =2\). To obtain our DAG \(G_n\) from \(J_{|S|}^\epsilon \) and \({\mathsf {PTC}} _n\) we associate each of the nodes \(2v{\delta } \) (\(v \in [|S|]\)) of \(J_{|S|}^\epsilon \) with one of the nodes in S. We observe that \(G_n\) has at most \(2|S|{\delta } +n \in O(n)\) nodes and that \({\mathsf {indeg}} (G_n) \le \max \left\{ {\mathsf {indeg}} ({\mathsf {PTC}} _n),{\mathsf {indeg}} \left( J_{|S|}^\epsilon \right) \right\} =2\), since we do not increase the indegree of any node of \(J_{|S|}^\epsilon \) when overlaying, and in \(G_n\) we do not increase the indegree of any nodes other than the sources S of \({\mathsf {PTC}} _n\) (these overlayed nodes have indegree at most 2 in \(J_{|S|}^\epsilon \)).

Let \(P=(P_0,\ldots ,P_t) \in \mathcal{P}^{\parallel }_{G_n}\) be given and observe that by restricting to \(P'_i = P_i \cap V({\mathsf {PTC}} _n) \subseteq P_i\) we obtain a legal pebbling \(P'=(P_0',\ldots ,P_t') \in \mathcal{P}^{\parallel }_{{\mathsf {PTC}} _n}\) of \({\mathsf {PTC}} _n\). Thus, by Theorem 2 we can find an interval [i,j] during which at least \(c_2n/\log n\) nodes in S are (re)pebbled and \(\forall k \in [i,j]\) we have \(\left| P_k\right| \ge c_3n/\log n\). We use \(T=S \cap \bigcup _{x=i}^j P_x \setminus P_{i-1}\) to denote the source nodes of \({\mathsf {PTC}} _n\) that are (re)pebbled during the interval [i,j]. We now set \(c_4 = c_2/4\) and \(c_5 = c_2 c'/4\) and consider two cases:

Case 1: We have \({\mathsf {depth}} \left( {\mathsf {ancestors}} _{G_n-P_i}(T)\right) \ge |T|{\delta }/4\). In other words, at time i there is an unpebbled path of length \(\ge |T|{\delta }/4\) to some node in T. In this case it will take at least \(j-i\ge |T|{\delta }/4\) steps to pebble T, so we will have at least \(|T|{\delta }/4 \in \varOmega (n)\) steps with at least \(c_3n/\log n\) pebbles. Because \(c_5 = c_2 c'/4\), \(|T| \ge c_2 n/\log n\) and \({\delta } \ge c'\log n\), it follows that \(|T|{\delta }/4 \ge c_2c' n/4 = c_5 n\). Finally, since we may assume \(c_4 \le c_3\) (shrinking \(c_4\) if necessary only lowers the space threshold), we have \(\varPi _{ss}\left( P,c_4n/\log n \right) \ge c_5n\).

Case 2: We have \({\mathsf {depth}} \left( {\mathsf {ancestors}} _{G_n-P_i}(T)\right) < |T|{\delta }/4\). In other words, at time i there is no unpebbled path of length \(\ge |T|{\delta }/4\) to any node in T. Now Claim 1 (applied with \(c=1/4\)) directly implies that \(\varPi _{ss}\left( P, |T|-\epsilon |S| - |T|/2 \right) \ge {\delta } |T|/4\). This in turn implies that \(\varPi _{ss}\left( P, (c_2/2)n/(\log n) -\epsilon |S| \right) \ge {\delta } c_2n/(4\log n) \). We observe that \({\delta } c_2n/(4\log n) \ge c'c_2n/4 = c_5n\), since \({\delta } \ge c'\log n\) and \(c_5 = c_2 c'/4\). We also observe that \((c_2/2 )n/\log n -\epsilon |S| \ge (c_2/2-\epsilon c_1)n/\log n \ge (c_2/2-c_2/4)n/\log n = c_2n/(4 \log n) = c_4n/\log n\), since \(|S| \le c_1 n/\log n\), \(\epsilon \le c_2/(4c_1)\) and \(c_4=c_2/4\). Thus, in this case we also have \(\varPi _{ss}\left( P, c_4n/\log n \right) \ge c_5 n\), which implies that \(\varPi ^{\parallel }_{ss}\left( G_n, c_4 n/\log n\right) \ge c_5 n\).    \(\square \)

Claim 1

Let \(D_n^\epsilon \) be a DAG with nodes \(V\left( D_n^\epsilon \right) =[n]\) and indegree \({\delta } = {\mathsf {indeg}} \left( D_n^\epsilon \right) \) that is \((e,d)\)-depth robust for all \(e,d>0\) such that \(e+d\le (1-\epsilon )n\), let \(J_n^\epsilon \) be the indegree-reduced version of \(D_n^\epsilon \) from Lemma 1, with \(2n{\delta } \) nodes and \({\mathsf {indeg}} \left( J_n^\epsilon \right) = 2\), let \(T \subseteq [n]\) and let \(P=(P_1,\ldots ,P_t) \in \mathcal{P}^{\parallel }_{J_n^\epsilon ,\emptyset }\) be a (possibly incomplete) pebbling of \(J_n^\epsilon \). Suppose that during some round i we have \({\mathsf {depth}} \left( {\mathsf {ancestors}} _{J_n^\epsilon -P_i}\left( \bigcup _{v \in T} \{2{\delta } v \}\right) \right) \le c{\delta } |T|\) for some constant \(0< c < \frac{1}{2}\). Then \(\varPi _{ss}\left( P, |T|-\epsilon n - 2c|T|\right) \ge c {\delta } |T|\).

Proof of Claim 1. For each time step r we let \(H_r = {\mathsf {ancestors}} _{J_n^\epsilon -P_r}\left( \bigcup _{v \in T} \{2{\delta } v \}\right) \), and we let \(k < i\) be the last pebbling step before i during which \({\mathsf {depth}} (H_k) \ge 2c|T|{\delta } \). Observe that \(i-k \ge {\mathsf {depth}} (H_k) - {\mathsf {depth}} (H_i) \ge c|T|{\delta } \), since we can decrease the length of any unpebbled path by at most one in each pebbling round. We also observe that \({\mathsf {depth}} (H_k) = 2c|T|{\delta } \) since \({\mathsf {depth}} (H_k)-1 \le {\mathsf {depth}} (H_{k+1}) < 2c|T|{\delta } \).

Now let \(r \in [k,i]\) be given. By the definition of k, we have \({\mathsf {depth}} \left( H_r\right) \le 2c|T|{\delta } \). Let \(P_r' = \{v \in V(D_n^\epsilon ): P_r \cap [2{\delta } (v-1)+1,2{\delta } v] \ne \emptyset \}\) be the set of nodes \(v \in [n]=V\left( D_n^\epsilon \right) \) such that the corresponding path \(2{\delta } (v-1)+1,\ldots ,2{\delta } v\) in \(J_n^\epsilon \) contains at least one pebble at time r. By the depth-robustness of \(D_n^\epsilon \) we have

$$\begin{aligned} {\mathsf {depth}} \left( D_n^\epsilon [T] - P_r'\right) \ge |T|-|P_r'|-\epsilon n \ . \end{aligned}$$
(1)

On the other hand, exploiting the properties of the indegree reduction from Lemma 1, we have

$$\begin{aligned} {\mathsf {depth}} \left( D_n^\epsilon [T] - P_r'\right) {\delta } \le {\mathsf {depth}} \left( H_r\right) \le 2c|T|{\delta }\ . \end{aligned}$$
(2)

Combining Eqs. 1 and 2 we have

$$|T|-|P_r'|-\epsilon n \le {\mathsf {depth}} \left( D_n^\epsilon [T] - P_r'\right) \le 2c|T| \ . $$

It immediately follows that \(\left| P_r\right| \ge |P_r'| \ge |T|- 2c |T| - \epsilon n\) for each \(r \in [k,i]\) and, therefore, \(\varPi ^{\parallel }_{ss}\left( P, |T|-\epsilon n - 2c|T| \right) \ge c {\delta } |T|\).    \(\square \)

Remark 3

(On the Explicitness of Our Construction). Our construction of a family of DAGs with high sustained space complexity is explicit in the sense that there is a probabilistic polynomial time algorithm which, except with very small probability, outputs an n-node DAG G that has high sustained space complexity. In particular, Theorem 1 relies on an explicit construction of [PTC76] and on the extremely depth-robust DAGs from Theorem 3. The construction of [PTC76] in turn uses an object called superconcentrators. Since we have explicit constructions of superconcentrators [GG81], the construction of [PTC76] can be made explicit. While the proof of the existence of a family of extremely depth-robust DAGs is not explicit, it uses a probabilistic argument and can be adapted to obtain a probabilistic polynomial time algorithm which, except with very small probability, outputs an n-node DAG G that is extremely depth-robust. In practice, however, it is also desirable to ensure that there is a local algorithm which, on input v, computes the set \({\mathsf {parents}} (v)\) in time \({{\mathsf {polylog}}} (n)\). It is an open question whether any DAG G with high sustained space complexity allows for such highly efficient computation of the set \({\mathsf {parents}} (v)\).

4 Better Depth-Robustness

In this section we improve on the original analysis of Erdos et al. [EGS75], who constructed a family of DAGs \(\{G_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} (G_n)\in O(\log n)\) such that each DAG \(G_n\) is \(\left( e= \varOmega (n),d=\varOmega (n)\right) \)-depth robust. Such a DAG \(G_n\) is not sufficient for us, since we require that the subgraph \(G_n[T]\) is also highly depth robust for any sufficiently large subset \(T \subseteq V_n\) of nodes, e.g., for any T such that \(|T| \ge n/1000\). For any fixed constant \(\epsilon > 0\), [MMV13] constructs a family of DAGs \(\{G_n^\epsilon \}_{n=1}^\infty \) which is \((\alpha n,\beta n)\)-depth robust for any positive constants \(\alpha ,\beta \) such that \(\alpha + \beta \le 1-\epsilon \), but their construction has indegree \(O\left( \log ^2 n \cdot {{\mathsf {polylog}}} \left( \log n\right) \right) \). By contrast, our results in the previous section assumed the existence of such a family of DAGs with \({\mathsf {indeg}} \left( G_n^\epsilon \right) \in O(\log n)\).

In fact our family of DAGs is essentially the same as that of [EGS75], with one minor modification to make the construction work for all \(n > 0\). Our contribution in this section is an improved analysis showing that the family of DAGs \(\{G_n^\epsilon \}_{n=1}^\infty \) with indegree \(O\left( \log n\right) \) is \((\alpha n,\beta n)\)-depth robust for any positive constants \(\alpha ,\beta \) such that \(\alpha + \beta \le 1-\epsilon \).

We remark that if we allow our family of DAGs to have \({\mathsf {indeg}} \left( G_n^\epsilon \right) \in O(\log n \log ^* n)\) then we can eliminate the dependence on \(\epsilon \) entirely. In particular, we can construct a family of DAGs \(\{G_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} (G_n) = O(\log n \log ^* n)\) such that for any positive constants \(\alpha ,\beta \) with \(\alpha + \beta < 1\) the DAG \(G_n\) is \((\alpha n,\beta n)\)-depth robust for all suitably large n.

Theorem 3

Fix \(\epsilon >0\) then there exists a family of DAGs \(\{G_n^\epsilon \}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( G_n^\epsilon \right) =O(\log n)\) that is \(\left( \alpha n, \beta n \right) \)-depth robust for any constants \(\alpha ,\beta \) such that \(\alpha + \beta < 1-\epsilon \).

The proof of Theorem 3 relies on Lemmas 4, 5 and 6. We say that G is a \(\delta \)-local expander if for every node \(x \in [n]\), every \(r \le \min \{x, n-x\}\) and every pair \(A \subseteq I_r(x)\doteq \{x-r+1,\ldots ,x\}\), \(B \subseteq I_r^*(x) \doteq \{x+1,\ldots ,x+r\}\) of sizes \(\left| A\right| , \left| B \right| \ge \delta r\) we have \((A \times B) \cap E \ne \emptyset \), i.e., there is a directed edge from some node in A to some node in B. Lemma 4 says that for any constant \(\delta > 0\) we can construct a family of DAGs \(\{\mathsf {LE}_n^\delta \}_{n=1}^\infty \) with \({\mathsf {indeg}} =O(\log n)\) such that each \(\mathsf {LE}_n^\delta \) is a \(\delta \)-local expander. Lemma 4 essentially restates [EGS75, Claim 1], except that we require that \(\mathsf {LE}_n^\delta \) is a \(\delta \)-local expander for all \(n >0\) instead of only for n sufficiently large. Since we require a (very) minor modification to achieve \(\delta \)-local expansion for all \(n >0\) we include the proof of Lemma 4 in the full version [ABP18] for completeness.

Lemma 4

[EGS75]. Let \(\delta > 0\) be a fixed constant then there is a family of DAGs \(\{\mathsf {LE}_n^\delta \}_{n=1}^\infty \) with \({\mathsf {indeg}} \in O(\log n)\) such that each \(\mathsf {LE}_n^\delta \) is a \(\delta \)-local expander.

While Lemma 4 essentially restates [EGS75, Claim 1], Lemmas 5 and 6 improve upon the analysis of [EGS75]. We say that a node \(x \in [n]\) is \(\gamma \)-good under a subset \(S\subseteq [n]\) if for all \(r > 0\) we have \(\left| I_r(x)\backslash S \right| \ge \gamma \left| I_r(x)\right| \) and \(\left| I_r^*(x)\backslash S \right| \ge \gamma \left| I_r^*(x)\right| \). Lemma 5 is similar to [EGS75, Claim 3], which also states that all \(\gamma \)-good nodes are connected by a directed path in \(\mathsf {LE}_n^\delta -S\). However, we stress that the argument of [EGS75, Claim 3] requires that \(\gamma \ge 0.5\), while Lemma 5 has no such restriction. This is crucial for the proof of Theorem 3, where we select \(\gamma \) to be very small.
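
Unpacking the definition, \(\gamma \)-goodness of every node under a given set S can be checked in \(O(n^2)\) time; a sketch:

```python
def gamma_good_nodes(n, S, gamma):
    """Return the set of nodes x in [n] that are gamma-good under S,
    i.e. |I_r(x) - S| >= gamma*r and |I*_r(x) - S| >= gamma*r for all r."""
    good = set()
    for x in range(1, n + 1):
        ok = True
        hits = 0
        for r in range(1, x + 1):           # I_r(x) = {x-r+1, ..., x}
            hits += (x - r + 1) in S
            if r - hits < gamma * r:
                ok = False
                break
        if ok:
            hits = 0
            for r in range(1, n - x + 1):   # I*_r(x) = {x+1, ..., x+r}
                hits += (x + r) in S
                if r - hits < gamma * r:
                    ok = False
                    break
        if ok:
            good.add(x)
    return good
```

On small instances this allows one to sanity-check the bound of Lemma 6, \(|GOOD| \ge n-|S|\frac{1+\gamma }{1-\gamma }\).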

Lemma 5

Let \(G = (V=[n],E)\) be a \(\delta \)-local expander and let \(x < y \in [n]\) both be \(\gamma \)-good under \(S \subseteq [n]\). If \(\delta < \min \{\gamma /2,1/4\}\) then there is a directed path from node x to node y in \(G-S\).

Lemma 6 shows that almost all of the remaining nodes in \(\mathsf {LE}_n^\delta -S\) are \(\gamma \)-good. It immediately follows that \(\mathsf {LE}_n^\delta -S\) contains a directed path running through almost all of the nodes \([n]\setminus S\). While Lemma 6 may appear similar to [EGS75, Claim 2] at first glance, we again stress one crucial difference. The proof of [EGS75, Claim 2] is only sufficient to show that at least \(n- 2|S|/(1-\gamma )\) nodes are \(\gamma \)-good. At best this would allow us to conclude that \(\mathsf {LE}_n^\delta \) is \((e,n-2e)\)-depth robust. Together Lemmas 5 and 6 imply that if \(\mathsf {LE}_n^\delta \) is a \(\delta \)-local expander (\(\delta < \min \{\gamma /2,1/4\}\)) then \(\mathsf {LE}_n^\delta \) is \(\left( e,n-e\frac{1+\gamma }{1-\gamma }\right) \)-depth robust.

Lemma 6

For any DAG \(G = ([n],E)\) and any subset \(S \subseteq [n]\) of nodes at least \(n-|S|\frac{1+\gamma }{1-\gamma }\) of the remaining nodes in G are \(\gamma \)-good with respect to S.

Proof of Theorem 3. By Lemma 4, for any \(\delta > 0\), there is a family of DAGs \(\{\mathsf {LE}_n^\delta \}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( \mathsf {LE}_n^\delta \right) \in O(\log n)\) such that for each \(n \ge 1\) the DAG \(\mathsf {LE}_n^\delta \) is a \(\delta \)-local expander. Given \(\epsilon \in (0,1]\) we set \(G_n^{\epsilon } = \mathsf {LE}_n^\delta \) with \(\delta =\epsilon /10 < 1/4\), so that \(G_n^\epsilon \) is an \((\epsilon /10)\)-local expander, and we set \(\gamma = \epsilon /4 > 2\delta \). Let \(S \subseteq V_n\) of size \(|S| \le e\) be given. Then by Lemma 6 at least \(n-e\frac{1+\gamma }{1-\gamma }\) of the nodes are \(\gamma \)-good, and by Lemma 5 there is a path connecting all \(\gamma \)-good nodes in \(G_n^\epsilon -S\). Thus, the DAG \(G_n^\epsilon \) is \(\left( e,n-e\frac{1+\gamma }{1-\gamma }\right) \)-depth robust for any \(e \le n\). In particular, if \(\alpha = e/n\) and \(\beta = 1-\alpha \frac{1+\gamma }{1-\gamma }\) then the graph is \((\alpha n,\beta n)\)-depth robust. Finally we verify that

$$ n- \alpha n -\beta n = -e + e \frac{1+\gamma }{1-\gamma } = e\frac{2\gamma }{1-\gamma } \le n \frac{\epsilon }{2-\epsilon /2} \le \epsilon n \ . $$

   \(\square \)

The proof of Lemma 5 follows by induction on the distance \(|y-x|\) between \(\gamma \)-good nodes x and y. Our proof extends a similar argument from [EGS75] with one important difference. [EGS75] argued inductively that for each good node x and for each \(r>0\) over half of the nodes in \(I_r^*(x)\) are reachable from x, and that x can be reached from over half of the nodes in \(I_r(x)\)—this implies that y is reachable from x, since there is at least one node \(z \in I_{|y-x|}^*(x)=I_{|y-x|}(y)\) such that z can be reached from x and y can be reached from z in \(G-S\). Unfortunately, this argument inherently requires that \(\gamma \ge 0.5\), since otherwise we may have at least \(\left| I_r^*(x) \cap S \right| \ge (1-\gamma ) r\) nodes in the interval \(I_r^*(x)\) that are not reachable from x. To get around this limitation we instead show, see Claim 2, that more than half of the nodes in the set \(I_r^*(x) \setminus S\) are reachable from x, and that x can be reached from more than half of the nodes in the set \(I_r(x) \setminus S\)—this still suffices to show that x and y are connected, since by the pigeonhole principle there is at least one node \(z \in I_{|y-x|}^*(x)\setminus S =I_{|y-x|}(y)\setminus S\) such that z can be reached from x and y can be reached from z in \(G-S\).

Claim 2

Let \(G = (V=[n],E)\) be a \(\delta \)-local expander, let \(x \in [n]\) be a \(\gamma \)-good node under \(S \subseteq [n]\) and let \(r >0\) be given. If \(\delta < \gamma /2\) then all but \(2\delta r\) of the nodes in \(I_r^*(x)\backslash S\) are reachable from x in \(G-S\). Similarly, x can be reached from all but \(2\delta r\) of the nodes in \(I_r(x)\backslash S\). In particular, if \(\delta < 1/4\) then more than half of the nodes in \(I_r^*(x)\backslash S\) are reachable from x (resp. more than half of the nodes in \(I_r(x)\backslash S\) can reach x) in \(G-S\).

Proof of Claim 2. We prove by induction that (1) if \(r = 2^k \delta ^{-1}\) for some integer k then all but \(\delta r\) of the nodes in \(I_r^*(x)\backslash S\) are reachable from x, and (2) if \(2^{k-1}\delta ^{-1}< r < 2^k \delta ^{-1}\) then all but \(2\delta r\) of the nodes in \(I_r^*(x)\backslash S\) are reachable from x. For the base cases we observe that if \(r \le \delta ^{-1}\) then, by the definition of a \(\delta \)-local expander, x is directly connected to all nodes in \(I_r^*(x)\), so all nodes in \(I_r^*(x)\backslash S \) are reachable.

Now suppose that claims (1) and (2) hold for each \(r' \le r= 2^k \delta ^{-1}\). We show that they hold for each \(r < r' \le 2r=2^{k+1} \delta ^{-1}\). In particular, let \(A \subseteq I_r^*(x)\backslash S\) denote the set of nodes in \(I_r^*(x)\backslash S\) that are reachable from x via a directed path in \(G-S\) and let \(B \subseteq I_{r'-r}^*(x+r) \backslash S\) be the set of all nodes in \(I_{r'-r}^*(x+r) \backslash S\) that are not reachable from x in \(G-S\). Clearly, there are no directed edges from A to B in G, and by induction we have \(|A| \ge \left| I_{r}^*(x)\backslash S \right| -\delta r \ge r(\gamma -\delta ) > \delta r\). Thus, by \(\delta \)-local expansion \(|B| \le \delta r\). Since \(\left| I_{r}^*(x)\backslash (S\cup A) \right| \le \delta r\), at most \(|B|+\delta r \le 2\delta r\) nodes in \(I_{r'}^*(x) \backslash S\) are not reachable from x in \(G-S\). Since \(r' > r\), the number of unreachable nodes is at most \(2\delta r \le 2\delta r'\), and if \(r'=2r\) then the number of unreachable nodes is at most \(2\delta r = \delta r'\).

A similar argument shows that x can be reached from all but \(2 \delta r\) of the nodes in \(I_r(x)\backslash S\) in the graph \(G-S\).    \(\square \)

Proof of Lemma 5. By Claim 2 for each r we can reach \(\left| I_r^*(x)\backslash S \right| - \delta r =\left| I_r^*(x)\backslash S \right| \left( 1- \delta \frac{\left| I_r^*(x)\right| }{\left| I_r^*(x)\backslash S \right| }\right) \ge \left| I_r^*(x)\backslash S \right| \left( 1-\frac{\delta }{\gamma }\right) > \frac{1}{2} \left| I_r^*(x)\backslash S \right| \) of the nodes in \(I_r^*(x)\backslash S \) from the node x in \(G-S\). Similarly, we can reach y from more than \(\frac{1}{2} \left| I_r(y)\backslash S \right| \) of the nodes in \(I_r(y)\backslash S\). Thus, taking \(r = |y-x|\), by the pigeonhole principle we can find at least one node \(z \in I_{|y-x|}^*(x)\setminus S =I_{|y-x|}(y)\setminus S\) such that z can be reached from x and y can be reached from z in \(G-S\).    \(\square \)

Lemma 6 shows that almost all of the nodes in \(G-S\) are \(\gamma \)-good. The proof is again similar in spirit to an argument of [EGS75]. In particular, [EGS75] constructed a superset T of the set of all \(\gamma \)-bad nodes and then bounded the size of this superset T. However, they only prove that \(BAD \subseteq T \subseteq F \cup B\) where \(|F|,|B| \le |S|/(1-\gamma )\). Thus, we have \(|BAD| \le |T| \le 2|S|/(1-\gamma )\). Unfortunately, this bound is not sufficient for our purposes. In particular, if \(|S| = n/2\) then this bound does not rule out the possibility that \(|BAD| = n\), so that none of the remaining nodes are good. Instead of bounding the size of the superset T directly, we bound the size of the set \(T\setminus S\), observing that \(|BAD| \le |T| \le |S| + |T\setminus S|\). In particular, we can show that \(|T\setminus S| \le \frac{2 \gamma |S|}{1-\gamma }\). We then have \(|GOOD| \ge n-|T| = n-|S|-|T\backslash S| \ge n-|S|-\frac{2\gamma |S|}{1-\gamma }\).

Proof of Lemma 6. We say that a \(\gamma \)-bad node x has a forward (resp. backwards) witness r if \(\left| I_r^*(x) \backslash S \right| \le \gamma r\) (resp. \(\left| I_r(x) \backslash S \right| \le \gamma r\)). Let \(x_1^*,r_1^*\) be the lexicographically first \(\gamma \)-bad node with a forward witness. Once \(x_1^*,r_1^*,\ldots ,x_k^*,r_k^*\) have been defined, let \(x_{k+1}^*\) be the lexicographically least \(\gamma \)-bad node such that \(x_{k+1}^* > x_k^*+r_k^*\) and \(x_{k+1}^*\) has a forward witness \(r_{k+1}^*\) (if such a node exists). Let \(x_1^*,r_1^*,\ldots ,x_{k^*}^*,r_{k^*}^*\) denote the complete sequence, and similarly define a maximal sequence \(x_1,r_1,\ldots ,x_k,r_{k}\) of \(\gamma \)-bad nodes with backwards witnesses such that \(x_i-r_i > x_{i+1}\) for each i.

Let

$$ F = \bigcup _{i=1}^{k^*} I_{r_i^*}^*\left( x_i^*\right) , ~~~~\text{ and }~~~~B = \bigcup _{i=1}^{k} I_{r_i}\left( x_i\right) $$

Note that for each \(i \le k^*\) we have \(\left| I_{r_i^*}^*\left( x_i^*\right) \backslash S\right| \le \gamma r_i^*\). Similarly, for each \(i \le k\) we have \(\left| I_{r_i}\left( x_i\right) \backslash S\right| \le \gamma r_i\). Because the sets \(I_{r_i^*}^*\left( x_i^*\right) \) are all disjoint (by construction) we have

$$\left| F \backslash S \right| \le \gamma \sum _{i=1}^{k^*} r_i^* = \gamma |F| \ .$$

Similarly, \(\left| B \backslash S \right| \le \gamma |B|\). We also note that at least \((1-\gamma )|F|\) of the nodes in F are in S. Thus, \(|F|(1-\gamma ) \le |S|\) and similarly \(|B|(1-\gamma ) \le |S|\). We conclude that \(\left| F \backslash S \right| \le \frac{\gamma |S|}{1-\gamma }\) and that \(\left| B \backslash S \right| \le \frac{\gamma |S|}{1-\gamma }\).

To finish the proof let \(T = S \cup F \cup B =S \cup \left( F \backslash S \right) \cup \left( B \backslash S \right) \). Clearly, T is a superset of all \(\gamma \)-bad nodes. Thus, at least \(n-|T| \ge n-|S|\left( 1 + \frac{2\gamma }{1-\gamma }\right) = n-|S|\frac{1+\gamma }{1-\gamma }\) nodes are \(\gamma \)-good.    \(\square \)

We also remark that Lemma 4 can be modified to yield a family of DAGs \(\{\mathsf {LE}_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} (\mathsf {LE}_n)\in O\left( \log n \log ^* n\right) \) such that each \(\mathsf {LE}_n\) is a \(\delta _n\)-local expander for some sequence \(\{\delta _n\}_{n=1}^\infty \) converging to 0. We can define a sequence \(\{\gamma _n\}_{n=1}^\infty \) such that \(\frac{1+\gamma _n}{1-\gamma _n}\) converges to 1 and \(\gamma _n > 2\delta _n\) for each n. Lemmas 5 and 6 then imply that each \(\mathsf {LE}_n\) is \(\left( e,n-e\frac{1+\gamma _n}{1-\gamma _n}\right) \)-depth robust for any \(e \le n\).

4.1 Additional Applications of Extremely Depth Robust Graphs

We now discuss additional applications of Theorem 3.

Application 1: Improved Proofs of Sequential Work. As we previously noted, Mahmoody et al. [MMV13] used extremely depth-robust graphs to construct efficient Proofs-of-Sequential-Work. In a proof of sequential work a prover wants to convince a verifier that he computed a hash chain of length n involving the input value x, without requiring the verifier to recompute the entire hash chain. Mahmoody et al. [MMV13] accomplish this by requiring that the prover compute labels \(L_1,\ldots ,L_n\) by “pebbling” an extremely depth-robust DAG \(G_n\), e.g., \(L_{i+1} = H\left( x \Vert L_{v_1} \Vert \ldots \Vert L_{v_{\delta }}\right) \) where \(\{v_1,\ldots ,v_{\delta } \} = {\mathsf {parents}} (i+1)\) and H is a random oracle. The prover then commits to the labels \(L_1,\ldots ,L_n\) using a Merkle tree and sends the root of the tree to the verifier, who can audit randomly chosen labels, e.g., the verifier audits label \(L_{i+1}\) by asking the prover to reveal the values \(L_{i+1}\) and \(L_v\) for each \(v \in {\mathsf {parents}} (i+1)\). If the DAG is extremely depth-robust then either a (possibly cheating) prover must make at least \((1-\epsilon )n\) sequential queries to the random oracle, or the prover will fail to convince the verifier with high probability [MMV13].
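To make the protocol flow concrete, here is a minimal Python sketch of the labeling and Merkle commitment described above. It is our illustration, not the construction of [MMV13]: SHA-256 stands in for the random oracle H, and a simple path DAG stands in for an extremely depth-robust graph.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """SHA-256 stand-in for the random oracle H; length-prefixing each part
    keeps the encoding of the concatenated inputs unambiguous."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def compute_labels(x: bytes, n: int, parents):
    """Labels L_1..L_n with L_{i+1} = H(x || L_{v_1} || ... || L_{v_delta})
    where {v_1,...,v_delta} = parents(i+1)."""
    L = {}
    for i in range(1, n + 1):
        L[i] = H(x, i.to_bytes(4, "big"), *(L[v] for v in parents(i)))
    return L

def merkle_root(leaves):
    """Root of a Merkle tree over the leaves (duplicating the last node on odd layers)."""
    layer = list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        layer = [H(layer[j], layer[j + 1]) for j in range(0, len(layer), 2)]
    return layer[0]

# Toy run on a path DAG, i.e., parents(i) = {i-1}. An audit of L_{i+1} opens
# L_{i+1} and L_v for each v in parents(i+1) against the committed root.
labels = compute_labels(b"statement x", 8, lambda i: [i - 1] if i > 1 else [])
root = merkle_root([labels[i] for i in range(1, 9)])
```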

We note that the parameter \({\delta } = {\mathsf {indeg}} ( G_n)\) is crucial to the efficiency of the Proofs-of-Sequential-Work protocol since each audit challenge requires the prover to reveal \({\delta } +1\) labels in the Merkle tree. The DAG \(G_n\) from [MMV13] has \({\mathsf {indeg}} ( G_n) \in O\left( \log ^2 n \cdot {{\mathsf {polylog}}} \left( \log n\right) \right) \) while our DAG \(G_n\) from Theorem 3 has maximum indegree \({\mathsf {indeg}} ( G_n) \in O\left( \log n\right) \). Thus, we can improve the communication complexity of their protocol by a factor of \(\varOmega (\log n \cdot {{\mathsf {polylog}}} \log n)\). However, Cohen and Pietrzak [CP18] found an alternative Proofs-of-Sequential-Work protocol that does not involve depth-robust graphs and would almost certainly be more efficient than either of the above constructions in practice.

Application 2: Graphs with Maximum Cumulative Cost. We now show that our family of extremely depth-robust DAGs has the highest possible cumulative pebbling cost, even in terms of the constant factors. In particular, for any constant \(\eta >0\) and \(\epsilon < \eta ^2/100\) the family \(\{G_n^\epsilon \}_{n=1}^\infty \) of DAGs from Theorem 3 has \(\varPi ^{\parallel }_{cc}\left( G_n^\epsilon \right) \ge \frac{n^2(1-\eta )}{2}\) and \({\mathsf {indeg}} (G_n^\epsilon )\in O(\log n)\). By comparison, \(\varPi ^{\parallel }_{cc}(G) \le \frac{n^2+n}{2}\) for any DAG \(G \in {\mathbb {G}} _n\)—even if G is the complete DAG.

Previously, Alwen et al. [ABP17] showed that any \((e,d)\)-depth-robust DAG G has \(\varPi ^{\parallel }_{cc}(G) > ed\), which implies that there is a family of DAGs \(G_n\) with \(\varPi ^{\parallel }_{cc}(G_n) \in \varOmega \left( n^2 \right) \) [EGS75]. We stress that we need new techniques to prove Theorem 4. Even if a DAG \(G \in {\mathbb {G}} _n\) were \((e,n-e)\)-depth-robust for every \(e \ge 0\) (the only DAG actually satisfying this property is the complete DAG \(K_n\)), [ABP17] only implies that \(\varPi ^{\parallel }_{cc}(G) \ge \max _{e \ge 0} e(n-e) = n^2/4\). Our basic insight is that at time \(t_i\), the first time a pebble is placed on node i in \(G_n^\epsilon \), the node \(i+\gamma i\) is \(\gamma \)-good and is therefore reachable via a directed path from all of the other \(\gamma \)-good nodes in [i]. If we have \(|P_{t_i}| < \left( 1-\eta /2\right) i\) then we can show that at least \(\varOmega (\eta i)\) of the nodes in [i] are \(\gamma \)-good. We can also show that these \(\gamma \)-good nodes form a depth-robust subset, so by [ABP17] it will cost \(\varOmega \left( (\eta -\epsilon )^2i^2\right) \) to repebble them. Since we would need to pay this cost by time \(t_{i+\gamma i}\), it is less expensive to simply ensure that \(|P_{t_i}| > \left( 1-\eta /2\right) i\). We refer the interested reader to Appendix A for a complete proof.

Theorem 4

Let \(0< \eta < 1\) be a positive constant and let \(\epsilon = \eta ^2/100\). Then the family \(\{G_n^{\epsilon }\}_{n=1}^\infty \) of DAGs from Theorem 3 has \({\mathsf {indeg}} \left( G_n^\epsilon \right) \in O\left( \log n\right) \) and \(\varPi ^{\parallel }_{cc}\left( G_n^\epsilon \right) \ge \frac{n^2\left( 1-\eta \right) }{2}\).

Application 3: Cumulative Space in Parallel-Black Sequential-White Pebblings. The black-white pebble game [CS76] was introduced to model nondeterministic computations. White pebbles correspond to nondeterministic guesses and can be placed on any vertex at any time. However, these pebbles can only be removed from a node when all parents of the node contain a pebble (i.e., when we can verify the correctness of this guess). Formally, a black-white pebbling configuration \(P_i = \left( P_i^W,P_i^B\right) \) of a DAG \(G=([n],E)\) consists of two subsets \(P_i^W, P_i^B \subseteq [n]\) where \(P_i^B\) (resp. \(P_i^W\)) denotes the set of nodes in G with black (resp. white) pebbles on them at time i. For a legal parallel-black sequential-white pebbling \(P = (P_0,\ldots ,P_t) \in \mathcal{P}_{G}^{BW}\) we require that we start with no pebbles on the graph, i.e., \(P_0 = (\emptyset ,\emptyset )\), and that all white pebbles are removed by the end, i.e., \(P_t^W = \emptyset \), so that we verify the correctness of every nondeterministic guess before terminating. If we place a black pebble on a node v during round \(i+1\) then we require that all of v’s parents have a pebble (either black or white) on them during round i, i.e., \({\mathsf {parents}} \left( P_{i+1}^B \setminus P_{i}^B\right) \subseteq P_{i}^B \cup P_i^W\). In the Parallel-Black Sequential-White model we require that at most one new white pebble is placed on the DAG in every round, i.e., \(\left| P_i^W \setminus P_{i-1}^W \right| \le 1\), while no such restriction applies to black pebbles.
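The rules above are easy to mechanize. The following Python sketch (our own illustration; the helper name is hypothetical) checks whether a sequence of configurations is a legal parallel-black sequential-white pebbling.

```python
def is_legal_pbsw(parents, steps):
    """steps = [(W_0, B_0), ..., (W_t, B_t)] lists the white/black pebbled sets.
    Enforces: empty start, all whites removed at the end, new black pebbles and
    removed white pebbles justified by pebbled parents, and at most one new
    white pebble per round."""
    if steps[0] != (set(), set()) or steps[-1][0] != set():
        return False
    for (W0, B0), (W1, B1) in zip(steps, steps[1:]):
        if len(W1 - W0) > 1:                    # sequential-white restriction
            return False
        for v in (B1 - B0) | (W0 - W1):         # new blacks and removed whites
            if not set(parents(v)) <= W0 | B0:  # all parents must be pebbled
                return False
    return True
```

For instance, the trivial “parallel-white” pebbling discussed below, which places white pebbles on every node in a single round, is rejected by the one-white-pebble-per-round check.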

We can use our construction of a family of extremely depth-robust DAGs \(\{G_n^\epsilon \}_{n=1}^\infty \) to establish new upper and lower bounds for parallel-black sequential-white pebblings.

Alwen et al. [AdRNV17] previously showed that in the parallel-black sequential-white pebbling model an \((e,d)\)-depth-robust DAG G requires cumulative space at least \(\varPi ^{BW}_{cc}(G) \doteq \min _{P \in \mathcal{P}_{G}^{BW}}\sum _{i=1}^t \left| P_i^B \cup P_i^W \right| \in \varOmega \left( e\sqrt{d}\right) \), or at least ed in the sequential black-white pebbling game. In this section we show that any \((e,d)\)-reducible DAG admits a parallel-black sequential-white pebbling with cumulative space at most \(O(e^2+dn)\), which implies that any DAG with constant indegree admits a parallel-black sequential-white pebbling with cumulative space at most \(O(\frac{n^2 \log ^2 \log n}{\log ^2 n})\) since any DAG is \((n \log \log n/\log n, n/\log ^2 n)\)-reducible. We also show that this bound is essentially tight (up to \(\log \log n\) factors) using our construction of extremely depth-robust DAGs. In particular, by applying indegree reduction to the family \(\{G_n^{\epsilon }\}_{n=1}^\infty \), we can find a family of DAGs \(\{J_n^\epsilon \}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( J_n^\epsilon \right) =2\) such that any parallel-black sequential-white pebbling has cumulative space at least \(\varOmega (\frac{n^2}{\log ^2 n})\). To show this we start by showing that any parallel-black sequential-white pebbling of an extremely depth-robust DAG \(G_n^{\epsilon }\), with \({\mathsf {indeg}} (G_n^\epsilon )\in O(\log n)\), has cumulative space at least \(\varOmega (n^2)\). We then use Lemma 1 to reduce the indegree of the DAG and obtain a DAG \(J_n^\epsilon \) with \(n'\in O(n \log n)\) nodes and \({\mathsf {indeg}} (J_n^\epsilon )=2\), such that any parallel-black sequential-white pebbling of \(J_n^\epsilon \) has cumulative space at least \(\varOmega (\frac{n^2}{\log ^2 n})\).

To the best of our knowledge no general upper bound on the cumulative space complexity of parallel-black sequential-white pebblings was known prior to our work, other than the parallel black-pebbling attacks of Alwen and Blocki [AB16]. This attack, which doesn’t even use the white pebbles, yields an upper bound of \(O(ne+n\sqrt{nd})\) for \((e,d)\)-reducible DAGs and \(O(n^2 \log \log n/\log n)\) in general. One could also consider a “parallel-white parallel-black” pebbling model in which we are allowed to place as many white pebbles as we would like in each round. However, this model admits a trivial pebbling. In particular, we could place white pebbles on every node during the first round and remove all of these pebbles in the next round, e.g., \(P_1 = (V,\emptyset )\) and \(P_2 = (\emptyset ,\emptyset )\). Thus, any DAG has cumulative space complexity \(\varTheta (n)\) in the “parallel-white parallel-black” pebbling model.

Theorem 5 shows that any \((e,d)\)-reducible DAG admits a parallel-black sequential-white pebbling with cumulative space at most \(O(e^2+dn)\). The basic pebbling strategy is reminiscent of the parallel black-pebbling attacks of Alwen and Blocki [AB16]. Given an appropriate depth-reducing set S, we use the first \(e=|S|\) steps to place white pebbles on all nodes in S. Since \(G-S\) has depth at most d, we can place black pebbles on the remaining nodes during the next d steps. Finally, once we place pebbles on every node we can legally remove the white pebbles. A formal proof of Theorem 5 can be found in the full version of this paper [ABP18].
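A minimal Python sketch of this strategy, assuming we are already handed a depth-reducing set S with \({\mathsf {depth}} (G-S)\le d\) (computing such a set is a separate problem):

```python
def pebble_reducible_dag(parents, n, S, d):
    """Produce the pebbling described above for a DAG on nodes 1..n with
    depth-reducing set S: |S| sequential white rounds, d parallel black
    rounds, then remove every white pebble. Cumulative space is O(|S|^2 + dn)."""
    white, black = set(), set()
    steps = [(set(), set())]
    for v in sorted(S):                        # one white pebble per round
        white.add(v)
        steps.append((set(white), set(black)))
    for _ in range(d):                         # parallel black moves
        ready = {v for v in range(1, n + 1)
                 if v not in white | black and set(parents(v)) <= white | black}
        black |= ready
        steps.append((set(white), set(black)))
    steps.append((set(), set(black)))          # every node pebbled: drop whites
    return steps
```

Since \(G-S\) has depth at most d, every node outside S receives a black pebble within the d parallel rounds, so the final removal of all white pebbles is legal.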

Theorem 5

Let \(G= (V,E)\) be \((e,d)\)-reducible. Then \(\varPi ^{BW}_{cc}(G) \le \frac{e(e+1)}{2} + dn\). In particular, for any DAG G with \({\mathsf {indeg}} (G)\in O(1)\) we have \(\varPi ^{BW}_{cc}(G) \in O\left( \left( \frac{n \log \log n}{\log n} \right) ^2 \right) \).

Theorem 6 shows that our upper bound is essentially tight. In a nutshell, the lower bound of [AdRNV17] was based on the observation that for any integers i, d the DAG \(G-\bigcup _j P_{i+jd}\) has depth at most d since any remaining path must have been pebbled completely in time d—if G is \((e,d)\)-depth-robust this implies that \(\left| \bigcup _j P_{i+jd}\right| \ge e\). The key difficulty in adapting this argument to the parallel-black sequential-white pebbling model is that it is actually possible to pebble a path of length d in \(O(\sqrt{d})\) steps by placing white pebbles on every interval of length \(\sqrt{d}\). This is precisely why Alwen et al. [AdRNV17] were only able to establish the lower bound \(\varOmega (e\sqrt{d})\) for the cumulative space complexity of \((e,d)\)-depth-robust DAGs—observe that we always have \(e\sqrt{d} \le n^{1.5}\) since \(e+d \le n\) for any DAG G. We overcome this key challenge by using extremely depth-robust DAGs.

In particular, we exploit the fact that extremely depth-robust DAGs are “recursively” depth-robust. For example, if a DAG G is \((e,d)\)-depth-robust for any \(e+d \le (1-\epsilon )n\) then the DAG \(G-S\) is \((e,d)\)-depth-robust for any \(e+d \le (n-|S|) - \epsilon n\). Since \(G-S\) is still sufficiently depth-robust we can then show that for some node \(x \in V(G-S)\) any (possibly incomplete) pebbling \(P= (P_0,P_1, \ldots , P_t)\) of \(G-S\) with \(P_0 = P_t = (\emptyset ,\emptyset )\) either (1) requires \(t\in \varOmega (n)\) steps, or (2) fails to place a pebble on x, i.e., \(x \notin \bigcup _{r=0}^t \left( P_r^W \cup P_r^B\right) \). By Theorem 3 it then follows that there is a family of DAGs \(\{G_n^\epsilon \}_{n=1}^\infty \) with \({\mathsf {indeg}} \left( G_n^\epsilon \right) \in O\left( \log n\right) \) and \(\varPi ^{BW}_{cc}(G_n^\epsilon ) \in \varOmega (n^2)\). If we apply the indegree-reduction Lemma 1 to the family \(\{G_n^\epsilon \}_{n=1}^\infty \) we obtain the family \(\{J_n^\epsilon \}_{n=1}^\infty \) with \({\mathsf {indeg}} (J_n^\epsilon )=2\) and \(O(n \log n)\) nodes. A similar argument shows that \(\varPi ^{BW}_{cc}\left( J_n^\epsilon \right) \in \varOmega (n^2/\log ^2 n)\). A formal proof of Theorem 6 can be found in the full version of this paper [ABP18].

Theorem 6

Let \(G= (V = [n],E \supset \{(i,i+1): i < n\})\) be \((e,d)\)-depth-robust for any \(e+d \le (1-\epsilon )n\). Then \(\varPi ^{BW}_{cc}(G) \ge \left( 1/16-\epsilon /2 \right) n^2 \). Furthermore, if \(G'= ([2n{\delta } ],E')\) is the indegree-reduced version of G from Lemma 1 then \(\varPi ^{BW}_{cc}(G') \ge \left( 1/16-\epsilon /2 \right) n^2\). In particular, there is a family of DAGs \(\{G_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} (G_n) \in O\left( \log n\right) \) and \(\varPi ^{BW}_{cc}(G_n) \in \varOmega (n^2)\), and a separate family of DAGs \(\{H_n\}_{n=1}^\infty \) with \({\mathsf {indeg}} (H_n) = 2\) and \(\varPi ^{BW}_{cc}(H_n) \in \varOmega \left( \frac{n^2}{\log ^2 n}\right) \).

5 A Pebbling Reduction for Sustained Space Complexity

As an application of the pebbling results on sustained space, in this section we construct a new type of moderately hard function (MoHF) in the parallel random oracle model (pROM). In slightly more detail, we first fix the computational model and define a particular notion of moderately hard function called sustained memory-hard functions (SMHF). We do this using the framework of [AT17], so, beyond the applications to password-based cryptography, the results in [AT17] for building provably secure cryptographic applications on top of any MoHF can be immediately applied to SMHFs. In particular this results in a proof-of-work and a non-interactive proof-of-work where “work” intuitively means having performed some computation entailing sufficient sustained memory. Finally we prove a “pebbling reduction” for SMHFs; that is, we show how to bound the parameters describing the sustained memory complexity of a family of SMHFs in terms of the sustained space of their underlying graphs.

We note that the pebbling reduction below carries over almost unchanged to the framework of [AS15]. That is, by defining sustained space in the computational model of [AS15] similarly to the definition below, a proof very similar to that of Theorem 7 yields the analogous theorem for the [AS15] framework. Nevertheless, we believe the [AT17] framework results in a more useful definition, as exemplified by the applications inherited from that work.

5.1 Defining Sustained Memory Hard Functions

We very briefly sketch the most important parts of the MoHF framework of [AT17] which is, in turn, a generalization of the indifferentiability framework of [MRH04].

We begin with the following definition which describes a family of functions that depend on a (random) oracle.

Definition 6

(Oracle functions). For (implicit) oracle set \({{\mathbb {H}}} \), an oracle function \(f^{(\cdot )}\) (with domain D and range R), denoted \(f^{(\cdot )}:D \rightarrow R\), is a set of functions indexed by oracles \(h\in {{\mathbb {H}}} \) where each \(f^h\) maps \(D \rightarrow R\).

Put simply, an MoHF is a pair consisting of an oracle family \(f^{(\cdot )}\) and an honest algorithm \({{\mathcal {N}}} \) for evaluating functions in the family using access to a random oracle. Such a pair is secure relative to some computational model M if no adversary \({{\mathcal {A}}} \) with a computational device adhering to M (denoted \({{\mathcal {A}}} \in M\)) can produce output which couldn’t be produced simply by calling \(f^{(h)}\) a limited number of times (where h is a uniform choice of oracle from \({{\mathbb {H}}} \)). It is assumed that algorithm \({{\mathcal {N}}} \) is computable by devices in some (possibly different) computational model \(\bar{M}\) when given sufficient computational resources. Usually M is strictly more powerful than \(\bar{M}\), reflecting the assumption that an adversary could have a more powerful class of device than the honest party. For example, in this work we will let model \(\bar{M}\) contain only sequential devices (say Turing machines which make one call to the random oracle at a time) while M will also include parallel devices.

In this work, both the computational models M and \(\bar{M}\) are parametrized by the same space \({\mathbb {P}} \). For each model, the choice of parameters fixes upper bounds on the power of devices captured by that model, that is, on the computational resources available to the permitted devices. For example \(M_{a}\) could be all Turing machines making at most a queries to the random oracle. The security of a given moderately hard function is parametrized by two functions \({\alpha } \) and \({\beta } \) mapping the parameter space for M to positive integers. Intuitively these functions are used to provide the following two properties.

Completeness:

To ensure the construction is even usable we require that \({{\mathcal {N}}} \) is computable by a device in model \(M_{a}\) and that \({{\mathcal {N}}} \) can evaluate \(f^{(h)}\) (when given access to h) on at least \({\alpha } (a)\) distinct inputs.

Security:

To capture how bounds on the resources of an adversary \({{\mathcal {A}}} \) limit the ability of \({{\mathcal {A}}} \) to evaluate the MoHF, we require that the output of \({{\mathcal {A}}} \), when running on a device in model \(M_{b}\) (and having access to the random oracle), can be reproduced by some simulator \({{\sigma }} \) using at most \({\beta } (b)\) oracle calls to \(f^{(h)}\) (for a uniformly sampled \(h{\leftarrow } {{\mathbb {H}}} \)).

To help build provably secure applications on top of MoHFs, the framework makes use of a distinguisher \({{\mathcal {D}}} \) (similar to the environment in the Universal Composability [Can01] family of models or, more accurately, to the distinguisher in the indifferentiability framework). The job of \({{\mathcal {D}}} \) is to (try to) tell a real-world interaction with \({{\mathcal {N}}} \) and the adversary \({{\mathcal {A}}} \) apart from an ideal-world interaction with \(f^{(h)}\) (in place of \({{\mathcal {N}}} \)) and a simulator (in place of the adversary). Intuitively, \({{\mathcal {D}}} \)’s access to \({{\mathcal {N}}} \) captures whatever \({{\mathcal {D}}} \) could hope to learn by interacting with an arbitrary application making use of the MoHF. The definition then ensures that even leveraging such information the adversary \({{\mathcal {A}}} \) cannot produce anything that could not be simulated (by simulator \({{\sigma }} \)) to \({{\mathcal {D}}} \) using nothing more than a few calls to \(f^{(h)}\).

As we have omitted several details of the framework in the above description, we will also use somewhat simplified notation. We denote the above-described real-world execution with the pair \(({{\mathcal {N}}}, {{\mathcal {A}}})\), and an ideal-world execution where \({{\mathcal {D}}} \) is permitted \(c\in {\mathbb {N}} \) calls to \(f^{(\cdot )}\) and simulator \({{\sigma }} \) is permitted \(d\in {\mathbb {N}} \) calls to \(f^{(h)}\) with the pair \((f^{(\cdot )}, {{\sigma }})_{c,d}\). To denote the statement that no \({{\mathcal {D}}} \) can tell an interaction with \(({{\mathcal {N}}}, {{\mathcal {A}}})\) apart from one with \((f^{(\cdot )}, {{\sigma }})_{c,d}\) with probability greater than \({\epsilon } \) we write \(({{\mathcal {N}}}, {{\mathcal {A}}}) \approx _{{\epsilon }} (f^{(\cdot )}, {{\sigma }})_{c,d}\).

Finally, to accommodate honest parties with varying amounts of resources, we equip the MoHF with a hardness parameter \(n\in {\mathbb {N}} \). The following is the formal security definition of an MoHF. Particular types of MoHF (such as the one we define below for sustained memory complexity) differ in the precise notion of computational model they consider. For further intuition, a much more detailed exposition of the framework, and a description of how the following definition can be used to prove security for applications, we refer to [AT17].

Definition 7

(MoHF security). Let M and \(\bar{M}\) be computational models with bounded resources parametrized by \({\mathbb {P}} \). For each \(n \in {\mathbb {N}} \), let \(f_n^{(\cdot )}\) be an oracle function and \({{\mathcal {N}}} (n,\cdot )\) be an algorithm (computable by some device in \(\bar{M}\)) for evaluating \(f_n^{(\cdot )}\). Let \({\alpha },{\beta }: {\mathbb {P}} \times {\mathbb {N}} \rightarrow {\mathbb {N}} \), and let \({\epsilon }: {\mathbb {P}} \times {\mathbb {P}} \times {\mathbb {N}} \rightarrow {\mathbb {R}} _{\ge 0}\). Then, \((f^{(\cdot )}_n, {{\mathcal {N}}} _n)_{n\in {\mathbb {N}}}\) is an \(({\alpha },{\beta },{\epsilon })\)-secure moderately hard function family (for model M) if

$$\begin{aligned} \forall n\in {\mathbb {N}}, {\mathbf {r}} \in {\mathbb {P}}, {{\mathcal {A}}} \in M_{{\mathbf {r}}}\ \exists {{\sigma }}\ \forall {\mathbf {l}} \in {\mathbb {P}}: \quad ({{\mathcal {N}}} (n,\cdot ),{{\mathcal {A}}}) ~\approx _{{\epsilon } ({\mathbf {l}},{\mathbf {r}},n)}~(f_n^{(\cdot )},{{\sigma }})_{{\alpha } ({\mathbf {l}},n),{\beta } ({\mathbf {r}},n)} \; , \end{aligned}$$
(3)

The function family is asymptotically secure if \({\epsilon } ({\mathbf {l}},{\mathbf {r}},\cdot )\) is a negligible function in the third parameter for all values of \({\mathbf {r}},{\mathbf {l}} \in {\mathbb {P}} \).

Sustained Space Constrained Computation. Next we define the honest and adversarial computational models for which we prove the pebbling reduction. In particular, we first recall (a simplified version of) the pROM of [AT17]. Next we define a notion of sustained memory in that model, naturally mirroring the notion of sustained space for pebbling. Thus we can parametrize the pROM by a memory threshold s and time t to capture all devices in the pROM with no more sustained memory complexity than given by the choice of those parameters.

In more detail, we consider a resource-bounded computational device \(\mathcal S^{\textsc {w-prom}}_{}\). Let \(w\in {\mathbb {N}} \). Upon startup, \(\mathcal S^{\textsc {w-prom}}_{}\) samples a fresh random oracle h with range \(\{0,1\}^w\). Now \(\mathcal S^{\textsc {w-prom}}_{}\) accepts as input a pROM algorithm \({{\mathcal {A}}} \), which is an oracle algorithm with the following behavior.

A state is a pair \(({\tau }, \mathbf{{s}})\) where data \({\tau } \) is a string and \(\mathbf{{s}} \) is a tuple of strings. The output of step i of algorithm \({{\mathcal {A}}} \) is an output state \({\bar{{\sigma }}} _i = ({\tau } _i, \mathbf{{q}} _i)\) where \(\mathbf{{q}} _i = [q_i^1,\ldots ,q_i^{z_i}]\) is a tuple of queries to h. As input to step \(i+1\), algorithm \({{\mathcal {A}}} \) is given the corresponding input state \({\sigma } _i = ({\tau } _i, h(\mathbf{{q}} _i))\), where \(h(\mathbf{{q}} _i) = [h(q_i^1), \ldots , h(q_i^{z_i})]\) is the tuple of responses from h to the queries \(\mathbf{{q}} _i\). In particular, for a given h and random coins of \({{\mathcal {A}}} \), the input state \({\sigma } _{i+1}\) is a function of the input state \({\sigma } _i\). The initial state \({\sigma } _0\) is empty and the input \(x_{{{\mathbf {\mathsf{{in}}}}}}\) to the computation is given as a special input in step 1.

For a given execution of a pROM, we are interested in the following new complexity measure, parametrized by an integer \(s\ge 0\). We call an element of \(\{0,1\}^s\) a block. Moreover, we denote the bit-length of a string r by |r|. The length of a state \({\sigma } =({\tau },\mathbf{{s}})\) with \(\mathbf{{s}} =(s^1, s^2, \ldots , s^y)\) is \(|{\sigma } | = |{\tau } | + \sum _{i\in [y]} |s^i|\). For a given state \({\sigma } \) let \(b({\sigma }) = {\left\lfloor |{\sigma } | / s\right\rfloor } \) be the number of “blocks in \({\sigma } \)”. Intuitively, the s-sustained memory complexity (s-SMC) of an execution is the sum of the number of blocks in each state. More precisely, consider an execution of algorithm \({{\mathcal {A}}} \) on input \(x_{{{\mathbf {\mathsf{{in}}}}}}\) using coins \({\$} \) with oracle h resulting in \(z\in {\mathbb {Z}} _{\ge 0}\) input states \({\sigma } _1, \ldots , {\sigma } _z\), where \({\sigma } _i = ({\tau } _i,\mathbf{{s}} _i)\) and \(\mathbf{{s}} _i = (s_i^1, s_i^2, \ldots , s_i^{y_i})\). Then for integer \(s\ge 0\) the s-sustained memory complexity (s-SMC) of the execution is

$$ s{\text {-}}{\mathsf {smc}} ({{\mathcal {A}}} ^h(x_{{{\mathbf {\mathsf{{in}}}}}};{\$})) = \displaystyle \sum _{i\in [z]} b({\sigma } _i) \; , $$

while the total number of RO calls is \(\sum _{i\in [z]} y_i\). More generally, the s-SMC (and total number of RO calls) of several executions is the sum of the s-SMCs (and total RO calls) of the individual executions.
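For concreteness, the measure depends only on the sequence of state lengths; a toy helper (ours, not part of [AT17]) computes it as follows.

```python
def s_smc(state_bit_lengths, s):
    """s-sustained memory complexity of one execution: the number of s-bit
    blocks floor(|sigma_i| / s), summed over all input states."""
    return sum(length // s for length in state_bit_lengths)

# States shorter than s bits contribute nothing, so a brief memory spike is
# cheap while memory sustained across many steps is expensive:
assert s_smc([10, 1000, 10, 10], s=1000) == 1
assert s_smc([1000] * 4, s=1000) == 4
```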

We can now describe the resource constraints imposed by \(\mathcal S^{\textsc {w-prom}}_{}\) on the pROM algorithms it executes. To quantify the constraints, \(\mathcal S^{\textsc {w-prom}}_{}\) is parametrized by an element of \({{\mathbb {P}} ^{\textsc {prom}}} ={\mathbb {N}} ^3\) which describes the limits on an execution of algorithm \({{\mathcal {A}}} \). In particular, for parameters \((q, s, t)\in {{\mathbb {P}} ^{\textsc {prom}}} \), algorithm \({{\mathcal {A}}} \) is allowed to make a total of q RO calls and have s-SMC at most t (summed across all invocations of \({{\mathcal {A}}} \) in any given experiment).

As usual for moderately hard functions, to ensure that the honest algorithm can be run on realistic devices, we restrict the honest algorithm \({{\mathcal {N}}} \) for evaluating the SMHF to be a sequential algorithm. That is, \({{\mathcal {N}}} \) can make only a single call to h per step. Technically, in any execution, for any step j it must be that \(y_j \le 1\). No such restriction is placed on the adversarial algorithm, reflecting the power (potentially) available to such a highly parallel device as an ASIC. In symbols, we denote the sequential version of the pROM, which we refer to as the sequential ROM (sROM), by \(\mathcal S^{\textsc {w-srom}}_{}\).

We can now (somewhat) formally define a sustained memory-hard function for the pROM. The definition is a particular instance of a moderately hard function (cf. Definition 7).

Definition 8

(Sustained Memory-Hard Function). For each \(n \in {\mathbb {N}} \), let \(f_n^{(\cdot )}\) be an oracle function and \({{\mathcal {N}}} _n\) be an sROM algorithm for computing \(f_n^{(\cdot )}\). Consider the function families:

$$ {\alpha } = \{{\alpha } _w : {{\mathbb {P}} ^{\textsc {prom}}} \times {\mathbb {N}} \rightarrow {\mathbb {N}} \}_{w\in {\mathbb {N}}} \; ,~~~ {\beta } = \{{\beta } _w : {{\mathbb {P}} ^{\textsc {prom}}} \times {\mathbb {N}} \rightarrow {\mathbb {N}} \}_{w\in {\mathbb {N}}} \; ,$$
$${\epsilon } = \{{\epsilon } _w:{{\mathbb {P}} ^{\textsc {prom}}} \times {{\mathbb {P}} ^{\textsc {prom}}} \times {\mathbb {N}} \rightarrow {\mathbb {R}} _{\ge 0}\}_{w\in {\mathbb {N}}} \; . $$

Then \(F=(f^{(\cdot )}_n, {{\mathcal {N}}} _n)_{n\in {\mathbb {N}}}\) is called an \(({\alpha },{\beta },{\epsilon })\)-sustained memory-hard function (SMHF) if \(\forall w\in {\mathbb {N}} \) F is an \(({\alpha } _w,{\beta } _w,{\epsilon } _w)\)-secure moderately hard function family for \(\mathcal S^{\textsc {w-prom}}_{}\).

5.2 The Construction

In this work \(f^{(\cdot )}\) will be a graph function [AS15] (also sometimes called “hash graph”). The following definition is taken from [AT17]. A graph function depends on an oracle \(h\in {{\mathbb {H}}} _w\) mapping bit strings to bit strings. We also assume the existence of an implicit prefix-free encoding such that h is evaluated on unique strings. Inputs to h are given as distinct tuples of strings (or even tuples of tuples of strings). For example, we assume that h(0, 00), h(00, 0), and h((0, 0), 0) all denote distinct inputs to h.

Definition 9

(Graph function). Let function \(h:\{0,1\}^* \rightarrow \{0,1\}^w\in {{\mathbb {H}}} _w\) and DAG \(G=(V,E)\) have source nodes \(\{v^{{\mathsf {in}}}_1, \ldots , v^{{\mathsf {in}}}_a\}\) and sink nodes \((v^{{\mathsf {out}}}_1, \ldots , v^{{\mathsf {out}}}_z)\). Then, for inputs \({\mathbf{x}} = (x_1,\ldots ,x_a) \in (\{0,1\}^*)^{\times a}\), the \((h, {\mathbf{x}})\)-labeling of G is a mapping \({{\mathbf {\mathsf{{lab}}}}}: V \rightarrow \{0,1\}^w\) defined recursively to be:

$$\begin{aligned} \forall v \in V ~~ {{\mathbf {\mathsf{{lab}}}}} (v) := {\left\{ \begin{array}{ll} h\left( {\mathbf{x}}, v, x_{j}\right) &{} : v = v^{{\mathsf {in}}}_j\\ h\left( {\mathbf{x}}, v, {{\mathbf {\mathsf{{lab}}}}} (v_1), \ldots , {{\mathbf {\mathsf{{lab}}}}} (v_d)\right) &{} : \text{ else } \end{array}\right. } \end{aligned}$$

where \(\{v_1, \ldots , v_d\}\) are the parents of v arranged in lexicographic order.

The graph function (of G and \({{\mathbb {H}}} _w\)) is the oracle function

$$f_G:(\{0,1\}^*)^{\times a} \rightarrow (\{0,1\}^w)^{\times z} \; ,$$

which maps \({\mathbf{x}} \mapsto ({{\mathbf {\mathsf{{lab}}}}} (v^{{\mathsf {out}}}_1), \ldots , {{\mathbf {\mathsf{{lab}}}}} (v^{{\mathsf {out}}}_z))\) where \({{\mathbf {\mathsf{{lab}}}}} \) is the \((h,{\mathbf{x}})\)-labeling of G.
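A compact Python rendering of Definition 9 may help. It is a sketch under our own conventions: SHA-256 (with length-prefixing standing in for the prefix-free encoding) plays the role of \(h \in {{\mathbb {H}}} _w\) with \(w=256\), and node identifiers are assumed to fit in one byte.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """Stand-in for h: length-prefixing each component ensures distinct tuples
    of strings map to distinct oracle inputs."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()

def graph_function(nodes, parents, sources, sinks, x):
    """(h,x)-labeling of a DAG whose `nodes` are listed in topological order;
    x is the tuple of inputs, one per source. Returns the tuple of sink labels."""
    enc_x = b"".join(x)                       # encoding of the input tuple
    lab = {}
    for v in nodes:
        if v in sources:
            lab[v] = h(enc_x, bytes([v]), x[sources.index(v)])
        else:
            lab[v] = h(enc_x, bytes([v]), *(lab[u] for u in sorted(parents(v))))
    return tuple(lab[v] for v in sinks)
```

Computing one label per step in topological order and appending it to the state is exactly the behavior of the honest algorithm \({{\mathcal {N}}} _G\) described next.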

Given a graph function we need an honest (sequential) algorithm for computing it in the pROM. For this we use the same algorithm as already used in [AT17]. The honest oracle algorithm \({{\mathcal {N}}} _G\) for graph function \(f_G\) computes one label of G at a time in topological order, appending the result to its state. If G has \(|V|=n\) nodes then \({{\mathcal {N}}} _G\) will terminate in n steps making at most 1 call to h per step, for a total of n calls, and will never store more than \(n \cdot w\) bits in the data portion of its state. In particular, for all inputs \({\mathbf{x}} \) and oracles h (and coins \({\$} \)) we have that for any \(s\in [n]\), if the range of h is \(\{0,1\}^w\) then algorithm \({{\mathcal {N}}} _G\) has sw-SMC of \(n-s\).

Recall that we would like to set \({\alpha } _w:{{\mathbb {P}} ^{\textsc {prom}}} \rightarrow {\mathbb {N}} \) such that for any parameters (q, s, t) constraining the honest algorithm’s resources we are still guaranteed at least \({\alpha } _w(q,s,t)\) evaluations of \(f_G\) by \({{\mathcal {N}}} _G\). Given the above honest algorithm we can thus set:

$$\begin{aligned} \forall (q,s,t) \in {{\mathbb {P}} ^{\textsc {prom}}} ~~ {\alpha } _w(q,s,t) := {\left\{ \begin{array}{ll} 0 &{} : q < n\\ \min \left( {\left\lfloor q/n\right\rfloor },{\left\lfloor t/(n-{\left\lfloor s/w\right\rfloor })\right\rfloor } \right) &{} : \text{ else } \end{array}\right. } \end{aligned}$$
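For example, with hypothetical parameter values \(n = 2^{10}\), \(w = 256\) and honest resource bounds \((q,s,t) = (2^{12},\, 2^{8}\cdot w,\, 2^{12})\), we get

$$ {\alpha } _w(q,s,t) = \min \left( {\left\lfloor 2^{12}/2^{10}\right\rfloor },{\left\lfloor 2^{12}/(2^{10}-2^{8})\right\rfloor }\right) = \min (4,5) = 4 \; , $$

i.e., the q budget allows four passes of n RO calls each while the t budget allows five, so \({{\mathcal {N}}} _G\) is guaranteed at least four evaluations of \(f_G\).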

It remains to determine how to set \({\beta } _w\) and \({\epsilon } _w\), which is the focus of the remainder of this section.

5.3 The Pebbling Reduction

We state the main theorem of this section which relates the parameters of an SMHF based on a graph function to the sustained (pebbling) space complexity of the underlying graph.

Theorem 7

(Pebbling reduction). Let \(G_n=(V_n,E_n)\) be a DAG of size \(|V_n|=n\). Let \(F=(f_{G,n}, {{\mathcal {N}}} _{G,n})_{n\in {\mathbb {N}}}\) be the graph functions for \(G_n\) and their naïve oracle algorithms. Then, for any \(\lambda \ge 0\), F is an \(({\alpha },{\beta },{\epsilon })\)-sustained memory-hard function where

$$ {\alpha } = \left\{ {\alpha } _w(q,s,t)\right\} _{w\in {\mathbb {N}}} \; , $$
$$ {\beta } = \left\{ {\beta } _w(q,s,t) = \frac{\varPi ^{\parallel }_{ss}(G,s)(w-\log q)}{1+\lambda }\right\} _{w\in {\mathbb {N}}} \; ,~~~ {\epsilon } = \left\{ {\epsilon } _w(q,m) \le \frac{q}{2^w} + 2^{-\lambda }\right\} _{w\in {\mathbb {N}}}\; . $$

The technical core of the proof follows that of [AT17] closely. The proof can be found in the full version of this paper [ABP18].

6 Open Questions

We conclude with several open questions for future research. The primary challenge is to provide a practical construction of a DAG G with high sustained space complexity. While we provide a DAG G with asymptotically optimal sustained space complexity, we do not optimize for constant factors. We remark that for practical applications to iMHFs it should be trivial to evaluate the function \({\mathsf {parents}} _G(v)\) without storing the DAG G in memory explicitly. Toward this end it would be useful to either prove or refute the conjecture that depth-robustness alone suffices for high sustained space complexity, e.g., what is the sustained space complexity of the depth-robust DAGs from [EGS75] or [PTC76]? Another interesting direction would be to relax the notion of sustained space complexity and instead require that any pebbling \(P \in \mathcal{P}^{\parallel }(G)\) either (1) has large cumulative complexity, e.g., \(n^3\), or (2) has high sustained space complexity. Is it possible to design a dMHF such that any evaluation algorithm either (1) has sustained space complexity \(\varOmega (n)\) for \(\varOmega (n)\) rounds, or (2) has cumulative memory complexity \(\omega (n^2)\)?