
A Comparison of Cryptanalytic Tradeoff Algorithms

Abstract

Three time-memory tradeoff algorithms are compared in this paper. Specifically, the classical tradeoff algorithm by Hellman, the distinguished point tradeoff method, and the rainbow table method, in their non-perfect table versions, are treated.

We show that, under parameters and assumptions that are typically considered in theoretic discussions of the tradeoff algorithms, the Hellman and distinguished point tradeoffs perform very close to each other and the rainbow table method performs somewhat better than the other two algorithms. Our method of comparison can easily be applied to other situations, where the conclusions could be different.

The analysis of tradeoff efficiency presented in this paper does not ignore the effects of false alarms and also covers techniques for reducing storage, such as ending point truncations and index tables. Our comparison of algorithms fully takes into account success probabilities and precomputation efforts.


Notes

  1. The paper refers to the Hellman tradeoff, but it seems that the DP tradeoff was implied. Many researchers view the Hellman tradeoff as always incorporating the DP technique.

  2. It seems that the DP tradeoff was implied, even though the paper refers to the Hellman tradeoff.

References

  [1] G. Avoine, P. Junod, P. Oechslin, Characterization and improvement of time-memory trade-off based on perfect tables. ACM Trans. Inf. Syst. Secur. 11(4), 17:1–17:22 (2008). Preliminary version in INDOCRYPT 2005

  [2] S.H. Babbage, Improved exhaustive search attacks on stream ciphers, in European Convention on Security and Detection. IEE Conference Publication, vol. 408 (IEE, London, 1995), pp. 161–166

  [3] E.P. Barkan, Cryptanalysis of ciphers and protocols. Ph.D. Thesis, Israel Institute of Technology, March 2006

  [4] E. Barkan, E. Biham, A. Shamir, Rigorous bounds on cryptanalytic time/memory tradeoffs, in Advances in Cryptology—CRYPTO 2006. LNCS, vol. 4117 (Springer, Berlin, 2006), pp. 1–21

  [5] A. Biryukov, A. Shamir, Cryptanalytic time/memory/data tradeoffs for stream ciphers, in Advances in Cryptology—ASIACRYPT 2000. LNCS, vol. 1976 (Springer, Berlin, 2000), pp. 1–13

  [6] A. Biryukov, A. Shamir, D. Wagner, Real time cryptanalysis of A5/1 on a PC, in FSE 2000. LNCS, vol. 1978 (Springer, Berlin, 2001), pp. 1–18

  [7] J. Borst, Block ciphers: Design, analysis, and side-channel analysis. Ph.D. Thesis, Katholieke Universiteit Leuven, September 2001

  [8] J. Borst, B. Preneel, J. Vandewalle, On the time-memory tradeoff between exhaustive key search and table precomputation, in Proceedings of the 19th Symposium on Information Theory in the Benelux (WIC, 1998)

  [9] C. Calik, How to invert one-way functions: time-memory trade-off method. M.S. Thesis, Middle East Technical University, January 2007

  [10] D.E. Denning, Cryptography and Data Security (Addison-Wesley, Reading, 1982)

  [11] P. Flajolet, A.M. Odlyzko, Random mapping statistics, in Advances in Cryptology—EUROCRYPT’89. LNCS, vol. 434 (Springer, Berlin, 1990), pp. 329–354

  [12] S. Goldwasser, M. Bellare, Lecture notes on cryptography. Unpublished manuscript, July 2008. Available at: http://cseweb.ucsd.edu/~mihir/papers/gb.html

  [13] J.Dj. Golić, Cryptanalysis of alleged A5 stream cipher, in Advances in Cryptology—EUROCRYPT’97. LNCS, vol. 1233 (Springer, Berlin, 1997), pp. 239–255

  [14] M.E. Hellman, A cryptanalytic time-memory trade-off. IEEE Trans. Inf. Theory 26, 401–406 (1980)

  [15] J. Hong, The cost of false alarms in Hellman and rainbow tradeoffs. Des. Codes Cryptogr. 57, 293–327 (2010)

  [16] J. Katz, Y. Lindell, Introduction to Modern Cryptography (Chapman & Hall/CRC, London, 2008)

  [17] I.-J. Kim, T. Matsumoto, Achieving higher success probability in time-memory trade-off cryptanalysis without increasing memory size. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E82-A, 123–129 (1999)

  [18] K. Kusuda, T. Matsumoto, Optimization of time-memory trade-off cryptanalysis and its application to DES, FEAL-32, and Skipjack. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 79(1), 35–48 (1996)

  [19] D. Ma, J. Hong, Success probability of the Hellman trade-off. Inf. Process. Lett. 109(7), 347–351 (2009)

  [20] A.J. Menezes, P.C. van Oorschot, S.A. Vanstone, Handbook of Applied Cryptography (CRC Press, Boca Raton, 1997)

  [21] S. Moon, Parameter selection in cryptanalytic time memory tradeoffs. M.S. Thesis, Seoul National University, June 2009

  [22] A. Narayanan, V. Shmatikov, Fast dictionary attacks on passwords using time-space tradeoff, in Proceedings of the 12th ACM CCS (ACM, New York, 2005), pp. 364–372

  [23] P. Oechslin, Making a faster cryptanalytic time-memory trade-off, in Advances in Cryptology—CRYPTO 2003. LNCS, vol. 2729 (Springer, Berlin, 2003), pp. 617–630

  [24] R. Oppliger, Contemporary Cryptography (Artech House, Boston, 2005)

  [25] J.-J. Quisquater, J. Stern, Time-memory tradeoff revisited. Unpublished manuscript, December 1998

  [26] N. Saran, Time memory trade off attack on symmetric ciphers. Ph.D. Thesis, Middle East Technical University, February 2009

  [27] N. Saran, A. Doganaksoy, Choosing parameters to achieve a higher success rate for Hellman time memory trade off attack, in 2009 International Conference on Availability, Reliability and Security (IEEE, New York, 2009), pp. 504–509

  [28] F.-X. Standaert, G. Rouvroy, J.-J. Quisquater, J.-D. Legat, A time-memory tradeoff using distinguished points: New analysis & FPGA results, in Cryptographic Hardware and Embedded Systems—CHES 2002. LNCS, vol. 2523 (Springer, Berlin, 2003), pp. 593–609


Author information

Correspondence to Jin Hong.

Additional information

J. Hong was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012003379).

Communicated by Antoine Joux

Appendices

Appendix A. Technical Approximation

The following lemma shows that the approximation \((1-\frac {1}{{\textup {\textsf {b}}}} )^{{\textup {\textsf {a}}}}\approx e^{-\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}}}\), which we have used frequently in this work, is very accurate for large integers a and b such that a=O(b).

Lemma 39

For positive integers  a and  b, we have

$$ \biggl|\exp \biggl(-\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr) - \biggl(1-\frac {1}{{\textup {\textsf {b}}}} \biggr)^{\textup {\textsf {a}}}\biggr| < \biggl\{ \frac{1}{2}\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}^2} + \frac{1}{({\textup {\textsf {a}}}+1)!} \biggl(\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr)^{{\textup {\textsf {a}}}+1} \biggr\} \exp \biggl(\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr). $$

Proof

We start by writing \(\exp (-\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} )\) in its Taylor series form and fully expanding the term \((1-\frac{1}{{\textup {\textsf {b}}}})^{{\textup {\textsf {a}}}}\).

After noting that the beginning two pairs of terms cancel out, we collect corresponding pairs from the two sequences of terms and bound the above by

$$ \biggl\{ \biggl|\frac{{\textup {\textsf {a}}}^2}{2!} - \binom{{\textup {\textsf {a}}}}{2} \biggr| \frac {1}{{\textup {\textsf {b}}}^2} + \cdots + \biggl|\frac{{\textup {\textsf {a}}}^{\textup {\textsf {a}}}}{{\textup {\textsf {a}}}!} - \binom{{\textup {\textsf {a}}}}{{\textup {\textsf {a}}}} \biggr| \frac{1}{{\textup {\textsf {b}}}^{\textup {\textsf {a}}}} \biggr\} + \biggl\{ \frac{1}{({\textup {\textsf {a}}}+1)!} \biggl( \frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr)^{{\textup {\textsf {a}}}+1} + \cdots \biggr\}. $$
(A.1)

It is easy to see that

for every k≥2, where the last inequality can be checked through induction on k. This shows that the terms of (A.1) that appear inside the first set of braces are bounded by

As for the second set of braces from (A.1), it is easy to see that

$$ \frac{1}{({\textup {\textsf {a}}}+1)!} \biggl(\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr)^{{\textup {\textsf {a}}}+1} \exp \biggl(\frac{{\textup {\textsf {a}}}}{{\textup {\textsf {b}}}} \biggr) $$

can serve as its very rough bound. It now suffices to gather the two bounds to arrive at the claim. □
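Lemma 39 is also easy to check numerically. The following short Python sketch (not part of the original paper; the parameter choices are arbitrary) evaluates the exact difference and the bound of Lemma 39 for a few pairs (a, b) with a = O(b).

    import math

    def lemma39_bound(a, b):
        # Right-hand side of Lemma 39.
        return (0.5 * a / b**2 + (a / b)**(a + 1) / math.factorial(a + 1)) * math.exp(a / b)

    for a, b in [(10, 100), (1000, 1000), (5000, 10000)]:
        exact = (1 - 1 / b)**a          # (1 - 1/b)^a
        approx = math.exp(-a / b)       # e^{-a/b}
        error = abs(approx - exact)
        print(a, b, error, lemma39_bound(a, b), error < lemma39_bound(a, b))

In each case the measured error stays below the stated bound, and the bound itself is already small for moderately large b.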

Appendix B. Random Function Arguments

Any analysis of a tradeoff algorithm treats the one-way function F as a random function, and most results given in this work as equations are values expected of a random function. In other words, we have been stating values that had been averaged over the choice of all functions acting on the search space. In this section, we point out that many of the arguments made during these computations are not strictly correct, and we then try to justify heuristically that the existing logical error may safely be ignored.

B.1 Existence of a Logical Gap

Recall the expected image size of a random function given by (1) and the expected iterated image sizes given by (2). The claim that (1) implies (2) is acceptable in the realm of cryptology. In this subsection, we clarify that there is a small logical gap in such a claim.

Let us rewrite (1) as an explicit self-contained statement which is precisely correct.

Lemma 40

Let F be a random function on a finite set of size N. If the input set is of size \(m_0\), then the size of its image under F is expected to be

$$ m_1 = {\textup {\textsf {N}}}\biggl\{1- \biggl(1 - \frac{1}{{\textup {\textsf {N}}}} \biggr)^{m_0} \biggr\}. $$

The proof of this lemma is quite trivial. It suffices to consider the ratio of points that remain untouched throughout the sequential assignments made to the elements of the input set during the random function construction.

We want to emphasize two things about this lemma. The first is that the value claimed by this lemma is the exact expected value and does not involve any approximation. In fact, the largest reason for rewriting the statement here was to remove the approximate expression. The second point we make is that the statement of this lemma does not contain any averaging over input sets. The expected image size claim holds true for every set of size \(m_0\).
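Since Lemma 40 is used repeatedly in what follows, a quick Monte Carlo check may be reassuring. The sketch below (Python; the function and variable names are ours and not from the paper) samples random functions on a small space and compares the measured average image size with the exact expectation of Lemma 40.

    import random

    def average_image_size(N, m0, trials=20000):
        # Average image size of a fixed m0-element input set over random functions.
        # Choosing each output value independently and uniformly samples a random
        # function restricted to the m0 inputs, which is all the image size depends on.
        total = 0
        for _ in range(trials):
            total += len({random.randrange(N) for _ in range(m0)})
        return total / trials

    N, m0 = 64, 40
    m1 = N * (1 - (1 - 1 / N)**m0)           # exact expectation from Lemma 40
    print(average_image_size(N, m0), m1)     # the two values should agree closely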

Discussing just the double iteration case will be sufficient for our purposes. Let us define

$$ m_1 = {\textup {\textsf {N}}}\biggl\{1- \biggl(1 - \frac{1}{{\textup {\textsf {N}}}} \biggr)^{m_0} \biggr\} \quad\text{and}\quad m_2 = {\textup {\textsf {N}}}\biggl\{1- \biggl(1 - \frac{1}{{\textup {\textsf {N}}}} \biggr)^{m_1} \biggr\}, $$
(B.1)

for any given m 0. One might believe that m 2 is the expected size of , when is a random function and is of size m 0. Since Lemma 40 contains no approximation, one might expect (B.1) to hold exactly. However, this reasonable prediction is not met, at least in the strict sense, by the explicit example given below.

The set of all functions F:{0,1}→{0,1} can be visualized as follows.

(Figure: value tables of the four functions F on {0,1}.)

When the input set is a single point, the image size expectation is clearly 1. This is in agreement with the value \(2 \{1- (1-\frac{1}{2} )^{1} \} = 1\), computed according to Lemma 40. When the input set is the complete domain {0,1}, the image size expectation is \(E_{F} [ |F(\{0,1\} )| ] = \frac{1}{4}\cdot1 + \frac{1}{4}\cdot2 + \frac {1}{4}\cdot2 + \frac{1}{4}\cdot1 = \frac{3}{2}\), and this is again identical to the value \(2 \{1- (1-\frac{1}{2} )^{2} \} = \frac{3}{2}\), computed according to Lemma 40. We have thus verified that Lemma 40, which had already been proved, holds exactly in this \({\textup {\textsf {N}}}=2\) example, regardless of the input set size and the choice of the set itself. Now, the four double iterations \(F^{2}=F\circ F\) can be visualized as follows.

(Figure: value tables of the four double iterations F² on {0,1}.)

When the input set is taken to be the complete domain, the expected image size of the double iteration is

$$ E_{F} \bigl[ \bigl|F^2\bigl(\{0,1\}\bigr)\bigr| \bigr] = \frac{2}{4}\cdot1 + \frac{2}{4}\cdot2 = \frac{3}{2}. $$
(B.2)

In comparison, the corresponding value computed through (B.1) is

$$ 2 \biggl\{1- \biggl(1-\frac{1}{2} \biggr)^{2\{1-(1-1/2)^2\}} \biggr\} = 2 \biggl\{1- \biggl(1-\frac{1}{2} \biggr)^{3/2}\biggr\} \approx1.293. $$
(B.3)

The two values given above are clearly in disagreement.

A cryptographer would naturally attempt to rectify the situation by relaxing the strict correlation between the two functions being composed. Let F and G be two independent random functions operating on a finite set of size N. One would like to claim that if the input set is of size \(m_0\), then the size of its image under the composition \(G\circ F\) is expected to be the \(m_2\) value given by (B.1). This second version of the doubly iterated image size expectation seems structurally much simpler to analyze than the previous attempt, and one might be tempted to say that the modified claim is a trivial consequence of Lemma 40.

We again turn to the example F,G:{0,1}→{0,1}. The complete set of all possible double iterations can be visualized as follows.

(Figure: value tables of all sixteen compositions G∘F on {0,1}.)

When the input set is the full domain {0,1}, after separately counting the number of functions with image sizes one and two, the expected image size can be computed as

$$ E_{F,G} \bigl[ \bigl|G\bigl(F\bigl(\{0,1\}\bigr) \bigr) \bigr| \bigr] = \frac{12}{16}\cdot1 + \frac{4}{16}\cdot2 = \frac{5}{4}. $$
(B.4)

Once again, this disagrees with (B.3), which was computed through (B.1).
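Both counterexamples can be reproduced by brute force enumeration. The following Python sketch (ours, not from the paper) lists all functions on {0,1}, computes the exact expectations (B.2) and (B.4), and compares them with the value (B.3) predicted through (B.1).

    from itertools import product

    domain = (0, 1)
    functions = list(product(domain, repeat=2))   # F is the pair (F(0), F(1)); 4 functions

    # E_F[ |F^2({0,1})| ], the same function iterated twice, cf. (B.2)
    same = sum(len({F[F[x]] for x in domain}) for F in functions) / len(functions)

    # E_{F,G}[ |G(F({0,1}))| ], two independent random functions, cf. (B.4)
    indep = sum(len({G[F[x]] for x in domain})
                for F in functions for G in functions) / len(functions)**2

    # Value (B.3) obtained by reusing the average m_1 inside (B.1)
    predicted = 2 * (1 - (1 - 1/2)**(2 * (1 - (1 - 1/2)**2)))

    print(same, indep, predicted)   # prints 1.5, 1.25, and about 1.293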

It is now clear that (2) does not directly follow from (1). The claims concerning the iterated image sizes are not consequences of the single-step image size result, at least not without additional arguments. The logical gap persists even when all iterations are allowed to be independent random functions.

B.2 Narrowing the Logical Gap

The failed attempt (B.1) at giving a doubly iterated image size expectation substituted the \(m_1\) value in place of \(m_0\) in the single-step result Lemma 40. This reuse of an average value in the computation of another average value was the source of our problem. In reality, as can be seen in the two counterexamples, the inputs to the second-step function are not all of size \(m_1\), but of varying sizes that only average to \(m_1\). After this simple observation, we can state that, if the input set of size \(m_0\) is such that its image size is exactly \(m_1\) for every choice of function F, and the image size is exactly \(m_2\) for every choice of function F and every input set of size \(m_1\), then \(m_2\) is the exact expected size of the doubly iterated image. The assumptions included in this statement cannot be met, but it is reasonable to expect the conclusion to hold approximately when the assumptions are slightly relaxed. We are thus justified in stating that, if for the vast majority of input sets of size \(m_0\) and functions F the image size is very close to \(m_1\), then the \(m_2\) of (B.1) will be a good approximation for the doubly iterated image size expectation.

Therefore, we consider the images of a fixed set under different functions F and discuss how their sizes are distributed around their average. Let us use \(\mu_{{\textup {\textsf {N}}},m}\) and \(\sigma_{{\textup {\textsf {N}}},m}\) to denote the average and standard deviation of the image set size. These are to be computed for a fixed input set of size m and with F running over all possible function choices. We already know \(\mu_{{\textup {\textsf {N}}},m} \approx {\textup {\textsf {N}}}\{ 1- \exp (-\frac{m}{{\textup {\textsf {N}}}} ) \}\). A proof of the following lemma is given in Appendix C.

Lemma 41

We have \(\frac{\sigma_{{\textup {\textsf {N}}},m}}{\mu_{{\textup {\textsf {N}}},m}} < \frac {2}{\sqrt{{\textup {\textsf {N}}}}}\) for all  N and m.

According to Chebyshev’s inequality, at least 99 % of the \({\textup {\textsf {N}}}^{{\textup {\textsf {N}}}}\) image sizes will fall within the range \(\mu_{{\textup {\textsf {N}}},m} \pm 10\sigma_{{\textup {\textsf {N}}},m}\). The above lemma states that this deviation of sizes from the mean is bounded by \(\frac{20\mu_{{\textup {\textsf {N}}},m}}{\sqrt {{\textup {\textsf {N}}}}}\). Hence, the distribution, or clustering, of image sizes around the expected value \(\mu_{{\textup {\textsf {N}}},m}\) will tighten, at least in comparison to the expected value, as N is increased.

This observation can be restated in plainer terms as follows. Suppose we take some input set, measure its image size under a single function chosen at random, and take this to be an estimate of the true average image size. We emphasize that no averaging over multiple measurements made with multiple functions is being performed here. In such a situation, we can expect each measurement to return a larger number of significant digits as N is increased. Let us briefly work with some explicit numbers. For parameters \({\textup {\textsf {N}}}=2^{64}\) and \(m=2^{50}\), the average image size can be computed to be \(\mu_{{\textup {\textsf {N}}},m} \approx 1.13\times10^{15}\). For the same parameters, the standard deviation is bounded by \(\sigma_{{\textup {\textsf {N}}},m} \leq 5.24\times10^{5}\). Chebyshev’s inequality ensures that at least 99 % of the \({\textup {\textsf {N}}}^{{\textup {\textsf {N}}}}\) image sizes will lie in the range \(\mu_{{\textup {\textsf {N}}},m} \pm 10\sigma_{{\textup {\textsf {N}}},m}\), which is \(1.13\times10^{15} \pm 5.24\times10^{6}\) in the current situation. For any practical purpose, we can believe that close to 10 significant digits from any single measurement are highly likely to be identical to those of the true expected value.
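The explicit figures above can be reproduced with a few lines of Python (ours, not from the paper):

    import math

    N, m = 2.0**64, 2.0**50
    mu = -N * math.expm1(-m / N)          # mu = N{1 - exp(-m/N)}; about 1.13e15
    sigma_bound = 2 * mu / math.sqrt(N)   # Lemma 41 bound on sigma; about 5.24e5
    print(mu, sigma_bound, 10 * sigma_bound)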

Let us summarize the discussion of this subsection. For any randomly chosen function acting on a large set and any input set of size \(m_0\), the image size after the first iteration will be very close to the \(m_1\) value given by (B.1). At the second iterated application of the same function, even though the input size is not exactly \(m_1\), we can expect the output size to be very close to the \(m_2\) value given by (B.1). Of course, the output size could differ from \(m_2\) even if the input size were exactly \(m_1\). In any case, the fact that the standard deviation of the image sizes is very small relative to the expected value implies a tight clustering of image sizes, and allows us to believe that formula (2) will predict doubly iterated image sizes accurately, in the sense that a large number of significant digits are returned. The heuristic arguments of this subsection have added further justification to the already acceptable cryptographic argument that (1) implies (2).

B.3 Other Reuses of Average Values

The intention of this section was not to test the validity of (2). In fact, although the authors of the current paper are not in a position to verify its correctness, a full proof is provided in [11], at least for the case when the input set is the full domain. What we have done so far in this section is to first point out that average values have been reused, somewhat erroneously, in the computation of other average values, and then to argue heuristically that such methods are still acceptable as long as the distribution of the values being treated is tightly gathered around the average. This reasoning need not be restricted to the discussion of iterated image sizes, or even to random function arguments.

There are many occasions in this paper where an average value was used during the computation of another average value. It should now be clear that (10), stating the success probability of a single rainbow matrix, is also slightly problematic, but acceptable. The different reduction functions at each rainbow matrix column do not provide independence of the colored iterating functions, and the existing logical gap would not be closed even if different columns were processed with independent random functions. However, the small standard deviation of image sizes justifies (10) as a good approximation.

The success probability (4) of the DP and Hellman tradeoffs, computed from the average number of points in a tradeoff matrix, is another example of average value reuse. We have not checked if the standard deviation of the coverage rate is small, but we know from experience that (4) predicts the correct value accurately, so this should not be a problem. In fact, this situation is less problematic than the iterated image case, because the arguments become strictly correct when independent random functions are used in different tables.

Readers may have noticed that we were more careful in reusing average values in Sect. 4.2. The distribution of chain lengths in a DP matrix can be inferred from (16), and it is clear that the lengths are not at all centered around the average length t. Hence, we were careful to work with the full range of possible chain lengths, rather than treat t as being the typical precomputation or online chain length. In particular, we did not treat the DP matrix as consisting of m chains of identical length t. This cautious handling of chains should not be confused with our free use of the value (16) itself, which is an expected value, in other computations.

Appendix C. Standard Deviation of Image Sizes

The purpose of this section is to provide a proof of Lemma 41, which concerns the standard deviation of image sizes. We first prepare a couple of technical lemmas.

Lemma 42

Let F be a random function on a finite set of size \({\textup {\textsf {N}}}\). Fix a subset of size m and let \(\mathbf {y}_1\) and \(\mathbf {y}_2\) be any two distinct points. The probability for the image of the fixed subset under F to contain both \(\mathbf {y}_1\) and \(\mathbf {y}_2\) is

$$ \biggl\{ 1 - \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^m \biggr \}^2 - \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^m \biggl\{ \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^m - \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^m \biggr\}. $$

Proof

The probability under consideration may be computed as follows:

$$ \sum_{k=1}^{m} \binom{m}{k} \biggl(\frac{1}{{\textup {\textsf {N}}}} \biggr)^{k} \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^{m-k} \biggl\{ 1 - \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^{m-k} \biggr\}. $$

In each additive term, the part \(\binom{m}{k} (\frac{1}{{\textup {\textsf {N}}}} )^{k} (1-\frac{1}{{\textup {\textsf {N}}}} )^{m-k}\) gives the probability for exactly k out of the m inputs to map to \(\mathbf {y}_1\). The remaining \(\{ 1 - (1-\frac{1}{{\textup {\textsf {N}}}-1} )^{m-k} \} \) part is the probability for at least one of the (m−k) inputs that are known not to have reached \(\mathbf {y}_1\) to map to \(\mathbf {y}_2\). The above sum is equal to the expression

To check this claim, it suffices to expand the first two pairs of braces. This expression can be rewritten in the form stated by this lemma. □
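As with Lemma 40, the statement of Lemma 42 is easy to confirm numerically on a small space. The sketch below (Python; the names and parameters are ours) estimates the probability that two fixed points both belong to the image of a random function and compares it with the closed-form expression of the lemma.

    import random

    def both_in_image(N, m, trials=200000):
        # Estimate Pr[y1 and y2 both lie in the image of a fixed m-element set],
        # with y1 = 0 and y2 = 1, over random functions on a set of size N.
        hits = 0
        for _ in range(trials):
            image = {random.randrange(N) for _ in range(m)}
            if 0 in image and 1 in image:
                hits += 1
        return hits / trials

    N, m = 32, 20
    a = (1 - 1 / N)**m
    b = (1 - 1 / (N - 1))**m
    formula = (1 - a)**2 - a * (a - b)    # expression stated in Lemma 42
    print(both_in_image(N, m), formula)   # the two values should agree closely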

Lemma 43

For positive integers N and m, we have

$$ \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^m - \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^m \geq \frac{m}{{\textup {\textsf {N}}}({\textup {\textsf {N}}}-1)} \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^{m-1}. $$

Proof

It suffices to check the following sequence of equalities and inequality:

$$ \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^m - \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^m = \biggl\{ \frac{1}{{\textup {\textsf {N}}}-1} - \frac{1}{{\textup {\textsf {N}}}} \biggr\} \sum_{k=0}^{m-1} \biggl(1-\frac{1}{{\textup {\textsf {N}}}} \biggr)^{k} \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^{m-1-k} \geq \frac{m}{{\textup {\textsf {N}}}({\textup {\textsf {N}}}-1)} \biggl(1-\frac{1}{{\textup {\textsf {N}}}-1} \biggr)^{m-1}. $$

In fact, a similar upper bound is also easy to obtain. □

In the remainder of this section, the input set will be a fixed set of size m. For each point \(\mathbf {y}\) of the codomain, let us define the function \(\chi_{\mathbf {y}}\), on the set of all functions F, by setting \(\chi_{\mathbf {y}}(F)\) to 1 if \(\mathbf {y}\) belongs to the image of the fixed set under F, and to 0 otherwise.

The dependence of \(\chi_{\mathbf {y}}\) on the fixed set was not made explicit in the notation since we will keep the set fixed for the rest of this section. The size of the image of the fixed set under any function F can be expressed in terms of these indicator functions as

$$ \sum_{\mathbf {y}} \chi_{\mathbf {y}}(F). $$

Using this observation, one can present

$$ E_{F} \biggl[ \sum_{\mathbf {y}} \chi_{\mathbf {y}} \biggr] = {\textup {\textsf {N}}}\, E_{F} [ \chi_{\mathbf {y}'} ] = {\textup {\textsf {N}}}\biggl\{1 - \biggl(1 - \frac{1}{{\textup {\textsf {N}}}} \biggr)^{m} \biggr\}, $$
(C.1)

where \(\mathbf {y}'\) is any fixed point, as an alternative way of writing the proof of Lemma 40.

Let us fix the notation

$$ \chi = \sum_{\mathbf {y}} \chi_{\mathbf {y}} $$

and view this as a random variable defined on the space of all functions F, which is given the uniform probability distribution. It maps each function F to the positive integer given by the size of the image of the fixed set under F. Equation (C.1) is equivalent to

$$ E[\chi] = {\textup {\textsf {N}}}\biggl\{1 - \biggl(1 - \frac{1}{{\textup {\textsf {N}}}} \biggr)^m \biggr\} $$
(C.2)

and we need to work with the standard deviation

$$ \textup{stdev}(\chi) = \sqrt{E\bigl[\chi^2\bigr] - \bigl(E[\chi] \bigr)^2}. $$

One can easily check that

$$ E\bigl[\chi^2\bigr] = \sum_{\mathbf {y}_1, \mathbf {y}_2} E [ \chi_{\mathbf {y}_1} \chi_{\mathbf {y}_2} ] = {\textup {\textsf {N}}}\, E [ \chi_{\mathbf {y}'} ] + {\textup {\textsf {N}}}({\textup {\textsf {N}}}-1)\, E \bigl[ \chi_{\mathbf {y}_{1}'} \chi_{\mathbf {y}_{2}'} \bigr], $$

where \(\mathbf {y}_{1}'\) and \(\mathbf {y}_{2}'\) are any two distinct points. The expectation \(E [\chi_{\mathbf {y}_{1}'}\chi_{\mathbf {y}_{2}'} ]\) is equal to the probability for both \(\mathbf {y}_{1}'\) and \(\mathbf {y}_{2}'\) to belong to the image, and this is the content of Lemma 42. Referring also to (C.2) and Lemma 43, we can compute a bound for the variance as follows:

Here, the second inequality follows from the observation \((1-\frac {1}{{\textup {\textsf {N}}}} )^{m} \geq1 - \frac{m}{{\textup {\textsf {N}}}}\). The final expression allows us to state that \(\textup {stdev}(\chi) \leq\frac{{m}}{\sqrt{{\textup {\textsf {N}}}}}\).

On the other hand, from the observation \((1-\frac{1}{{\textup {\textsf {N}}}} )^{m} \leq1 - \frac{m}{{\textup {\textsf {N}}}} + \frac{m(m-1)}{2{\textup {\textsf {N}}}^{2}}\), which holds for every \(m \leq {\textup {\textsf {N}}}\), we know that

$$ E[\chi] \geq {\textup {\textsf {N}}}\biggl\{ \frac{m}{{\textup {\textsf {N}}}} - \frac{m(m-1)}{2{\textup {\textsf {N}}}^2} \biggr\} > {\textup {\textsf {N}}}\biggl(\frac{m}{{\textup {\textsf {N}}}} - \frac{m}{2{\textup {\textsf {N}}}} \biggr) = \frac{m}{2}. $$

Finally, by combining the two bounds, we can state that

$$ \frac{\textup{stdev}(\chi)}{E[\chi]} < \frac{2}{\sqrt{{\textup {\textsf {N}}}}}. $$

This concludes the proof of Lemma 41.

Appendix D. Note on the Index Tables Method

The index table method can be seen as a special case of a more general and widely known data structure, the hash table. To store m starting point and ending point pairs, one first fixes a hash function that maps ending points to (log m)-bit strings. This function need not be a cryptographic hash function, even though the same term is used. Instead of sorting the data, each starting point and ending point pair is recorded at the storage position addressed by the hash value of the ending point. Collisions of addresses are inevitable, but there are various ways to deal with this problem.

Table lookups into a hash table are performed by first hashing the ending point to be searched for and then fetching the data located at the address given by the hash value. Since the address itself holds log m bits of information, even if almost log m bits from each ending point are removed before storage, we can still reliably determine whether or not a match has occurred.

One advantage of the hash table method, other than reducing storage and not requiring any sorting, is that it provides constant time table lookups. In comparison, a lookup to a sorted table requires time that is logarithmic in the table size.

If the hash function is set to return the first {(log m)−ε} bits of its input, and a bucket holding approximately \(2^{\varepsilon}\) table entries is placed at the position pointed to by each hash value, then the hash table technique reduces to the index table technique.
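The following minimal Python sketch (ours; the class and parameter names are not from the paper) illustrates the kind of structure described above: the low-order bits of the ending point serve as the address, only a short remainder of the ending point is stored, and lookups take constant time.

    class EndingPointTable:
        """Hash/index table for (starting point, ending point) pairs."""

        def __init__(self, m, kept_bits=8):
            self.addr_bits = max(m.bit_length() - 1, 1)   # roughly log m address bits
            self.kept_bits = kept_bits                    # truncated ending point bits kept
            self.buckets = [[] for _ in range(1 << self.addr_bits)]

        def _split(self, ep):
            addr = ep & ((1 << self.addr_bits) - 1)       # address = low-order log m bits
            tag = (ep >> self.addr_bits) & ((1 << self.kept_bits) - 1)
            return addr, tag

        def insert(self, sp, ep):
            addr, tag = self._split(ep)
            self.buckets[addr].append((tag, sp))          # the address bits need not be stored

        def lookup(self, ep):
            # Constant time: one address computation and one scan of a small bucket.
            addr, tag = self._split(ep)
            return [sp for t, sp in self.buckets[addr] if t == tag]

A lookup may occasionally return a false match because most ending point bits are discarded; this is exactly the truncation effect analyzed in the main text.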

Appendix E. Experimental Results

In this section we verify that the main parts of our arguments agree well with the experimental results. Experiments are done to check the validity of our results concerning the coverage rate and the cost of false alarms for the DP tradeoff. Analogous testing for the Hellman and rainbow tradeoffs is not provided, as this testing was done in [15]. We also provide experimental evidence supporting our arguments surrounding the effects of the ending point truncation method.

Since averaging over all functions defined on any reasonably large space is not at all possible, all our tests were conducted with a very small subset of explicitly constructed one-way functions. The one-way function used was always the encryption key to ciphertext mapping, under a fixed plaintext, computed with the block cipher AES-128. Different randomly chosen plaintexts were used to provide multiple one-way functions. The size of the input space was controlled by utilizing only a small number of key bits and padding the remaining key bits with zeros. The output space size was controlled by masking the ciphertext to an appropriate bit length. When working with the DP tradeoff, as discussed at the start of Sect. 4, we constructed \(m_{0} = \frac{m}{1-e^{-{\hat {t}}/t}}\) precomputation chains and gathered every resulting DP chain, rather than incrementally generating additional chains until m DP chains were collected.
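For concreteness, the reduced one-way function described in the preceding paragraph can be set up roughly as follows. This is only an illustrative Python sketch; it assumes the pycryptodome package, which the paper does not use or reference, and the constants are merely examples.

    import os
    from Crypto.Cipher import AES       # assumption: pycryptodome provides AES-128 in ECB mode

    SPACE_BITS = 30                     # search space of size N = 2^30, as in Sect. E.1
    PLAINTEXT = os.urandom(16)          # a fresh random plaintext yields a new one-way function

    def one_way(x):
        """Map a SPACE_BITS-bit integer to a SPACE_BITS-bit integer via AES-128."""
        key = x.to_bytes(4, "big") + b"\x00" * 12            # zero-pad the unused key bits
        ct = AES.new(key, AES.MODE_ECB).encrypt(PLAINTEXT)   # key-to-ciphertext mapping
        return int.from_bytes(ct[:4], "big") & ((1 << SPACE_BITS) - 1)  # mask to output size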

E.1 Coverage Rate of DP Tradeoffs

The experimental results supporting Proposition 9, which presents the coverage rate of a DP table, are given in Table 2. The coverage rate was measured by simply storing all DP matrix entries while constructing the DP chains and later counting the number of distinct matrix entries that were used as inputs to the one-way function. Each test result value given in the table is an average over 100 experiments. Different randomly generated plaintexts for AES were used for each of these experiments. All the tests were done on a space of size \({\textup {\textsf {N}}}=2^{30}\). One can check that the test figures are very close to what the theory predicted.

Table 2. Coverage rate of DP tradeoff (N = 2^30)

E.2 Cost of Resolving Alarms for the DP Tradeoff

Our next goal is to check the validity of our arguments concerning the time complexity that incorporates the extra cost of false alarms. We could do this with the expression for time complexity stated during the proof of Theorem 13, but such an approach would hide much of the inner workings. Hence, we decided to verify the following lemma, which allows access to much finer detail.

Lemma 44

Consider the DP tradeoff. The expected number of chain collisions at the ith iteration of the online phase is

$$ \frac {1}{t}\frac{ \textup {\texttt {D}}_{\mathrm {msc}}}{1-e^{-{\hat {t}}/t}} \biggl\{ - e^{-{\hat {t}}/t} + e^{-{\hat {t}}/t} \exp \biggl(-\frac{i}{t} \biggr) + \frac{i}{t} \exp \biggl(- \frac{i}{t} \biggr) \biggr\}. $$

Proof

The expected number of chain collisions is the sum over all rows of the DP matrix of the respective probabilities for the ith iteration to sound an alarm in association with that row. After reading the proof of Lemma 12, it should be clear that the sum of probabilities we are looking for is

$$ \sum_{j=1}^{\hat {t}}\frac{\frac{m}{t}}{1-e^{-{\hat {t}}/t}} \exp \biggl(-\frac{j}{t} \biggr)\cdot \frac{t}{{\textup {\textsf {N}}}} \biggl\{\exp \biggl( \frac{\min\{i,j\}}{t} \biggr)-1 \biggr\} \exp \biggl(-\frac{i}{t} \biggr). $$

In integral form, this is approximately

$$ \frac {1}{t}\frac{\frac{mt^2}{{\textup {\textsf {N}}}}}{1-e^{-{\hat {t}}/t}} \exp \biggl(-\frac{i}{t} \biggr) \int _0^{{\hat {t}}/t} \exp(-v) \biggl\{\exp \biggl(\min \biggl \{ \frac{i}{t}, v \biggr\} \biggr) - 1 \biggr\} \, dv, $$

which simplifies to what is claimed. □
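For readers tracing the final simplification, the integral can be split at \(v = \frac{i}{t}\) (valid for \(i \leq {\hat {t}}\), which covers the whole online phase):

$$ \int _0^{{\hat {t}}/t} e^{-v} \bigl\{ e^{\min\{i/t,\,v\}} - 1 \bigr\} \, dv = \int_0^{i/t} \bigl(1 - e^{-v}\bigr)\, dv + \bigl(e^{i/t}-1\bigr) \int_{i/t}^{{\hat {t}}/t} e^{-v}\, dv = \frac{i}{t} + e^{-{\hat {t}}/t} - e^{-{\hat {t}}/t}\, e^{i/t}. $$

Multiplying this by the prefactor \(\frac{1}{t}\frac{mt^2/{\textup {\textsf {N}}}}{1-e^{-{\hat {t}}/t}} \exp (-\frac{i}{t} )\) appearing in front of the integral recovers exactly the expression stated in Lemma 44.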

This lemma contains the core of our arguments given in the main text concerning the cost of alarms, and its verification through experiments should provide good support for the correctness of our theory.

To test this lemma, we first initialized an array of \({\hat {t}}\) counters to zeros. Next, we fixed a one-way function by randomly choosing a plaintext and constructed a DP table with the fixed function. Then, a random password (= zero-padded encryption key) was generated and the password hash (= masked ciphertext) corresponding to that password was computed. The online chain starting from the password hash was computed until a DP was found or the \({\hat {t}}\)th iteration was reached. If the online chain terminated at a DP and it was found to reside within the DP table, the counter corresponding to the current online iteration count was incremented. The online chain generation was repeated multiple times with the same table, but with newly generated random keys. Note that, since we are not using perfect tables, it is possible for the online chain to collide simultaneously with more than one entry of the DP table. Care was taken to increment the counter corresponding to the current iteration count as many times as the number of collisions found. The whole process described after the counter initialization step was repeated multiple times, with each repetition using a newly generated one-way function and a DP table.

The test results for four different parameter sets are presented in Fig. 7. Each of these experiments was performed with 2000 tables and 5000 random online chains per table. In each of the four boxes, the barely visible thin dashed line represents our theory as given by Lemma 44. There are \({\hat {t}}\)-many tiny dots in each box, and these represent our experimental results. The height of the ith dot, counting from the left, is the value of the ith iteration counter at the end of the experiment divided by 2000×5000, the total number of chains that were utilized. All the experiment results match our theory very well.

Fig. 7.

Expected number of collisions at each iteration of the DP tradeoff (dots: experiment; dashed line: theory).

E.3 Ending Point Truncation

Finally, we test the validity of our arguments concerning the ending point truncation method for reducing storage. The straightforward approach would be to simply test Lemma 16, Lemma 24, and Lemma 32, which present the cost of truncation related alarms, but we decided to work with the probability of alarms related to truncations, so as to expose more of our argument details to the tests.

Lemma 45

Consider the DP tradeoff using ending point truncation with truncated match probability \(\frac {1}{r}\). At the ith iteration of the online processing of a single DP table, the number of pseudo-collisions that are due to the ending point truncations, i.e., those that are not associated with any true chain collisions, is expected to be  \(\frac{m}{r} \exp(-\frac{i}{t})\). The corresponding value for the Hellman tradeoff is  \(\frac{m}{r}\), and that for the rainbow tradeoff is also  \(\frac{m}{r}\), if one decides to fully process a single rainbow table without terminating, even when the correct answer is found.

Proof

The proof of Lemma 16 shows that the claimed expected value for the DP tradeoff case can be computed as

$$ \sum_{j=1}^{\hat {t}}\frac{\frac{m}{t}}{1-e^{-{\hat {t}}/t}} \exp \biggl(-\frac{j}{t} \biggr)\cdot \exp \biggl(-\frac{i}{t} \biggr) \frac{1}{r} \approx \frac{m}{1-e^{-{\hat {t}}/t}} \int_0^{{\hat {t}}/t} \exp(-v) \,dv \exp \biggl(-\frac{i}{t} \biggr)\frac{1}{r}, $$

which simplifies to what is claimed. The statement for the Hellman tradeoff case follows immediately from the proof of Lemma 24, and the rainbow tradeoff case can be inferred from the proof of Lemma 32. □
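Spelling out the simplification in the DP case, which the proof leaves to the reader:

$$ \frac{m}{1-e^{-{\hat {t}}/t}} \int_0^{{\hat {t}}/t} \exp(-v) \,dv\; \exp \biggl(-\frac{i}{t} \biggr)\frac{1}{r} = \frac{m}{1-e^{-{\hat {t}}/t}} \bigl(1-e^{-{\hat {t}}/t}\bigr) \exp \biggl(-\frac{i}{t} \biggr)\frac{1}{r} = \frac{m}{r}\exp \biggl(-\frac{i}{t} \biggr). $$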

The three claims given by this lemma are at the core of our arguments concerning the ending point truncation method, and experimental verification of these statements should provide confidence as to the validity of our arguments given in the main text.

As in the previous section, we generated random tradeoff tables and tested with random online chains for the occurrence of alarms induced from truncations. We stored the full ending points, together with the truncated ending points, in the precomputation table. The full ending point information was used to distinguish between alarms that were caused by ending point truncations and those that arose from true chain collisions.

The test results are given in Figs. 8, 9, and 10. As before, the thin dashed lines are the graphs claimed in Lemma 45 and the numerous tiny dots represent the experimental data. All the test results are in good agreement with the theory. Each of the two diagrams for the DP tradeoff was obtained by averaging over 2000 tables and 5000 online chains per table. For the Hellman tradeoff we generated 2000 tables and 5000 inversion targets per table. The online chain was computed to the full length t for each inversion target, and a search was made for the truncated match with the table elements after each one-way function iteration. In the rainbow tradeoff case, each diagram is the result of 100 tables with 5000 inversion targets per table. Recall that the kth iteration for the rainbow tradeoff refers to a process that consists of (k−1) invocations of the one-way function and one table lookup. Full t iterations were attempted for each inversion target; hence each inversion target generated t searches to the table for truncated matches.

Fig. 8.

Expected number of collisions, induced by ending point truncation, at each iteration of the DP tradeoff (dots: experiment; dashed line: theory).

Fig. 9.

Expected number of collisions, induced by ending point truncation, at each iteration of the Hellman tradeoff (dots: experiment; dashed line: theory).

Fig. 10.

Expected number of collisions, induced by ending point truncation, at each iteration of the rainbow tradeoff (dots: experiment; dashed line: theory).


Cite this article

Hong, J., Moon, S. A Comparison of Cryptanalytic Tradeoff Algorithms. J Cryptol 26, 559–637 (2013). https://doi.org/10.1007/s00145-012-9128-3


Key words

  • Time-memory tradeoff
  • Hellman
  • Distinguished point
  • Rainbow table