Integrals Go Statistical: Cryptanalysis of Full Skipjack Variants
Abstract
Integral attacks form a powerful class of cryptanalytic techniques that have been widely used in the security analysis of block ciphers. Integral distinguishers are based on balanced properties that hold with probability one. To obtain a distinguisher covering more rounds, an attacker normally increases the data complexity by iterating through more plaintexts with a given structure, subject to the strict limit of the full codebook. On the other hand, an integral property can only be verified deterministically if the plaintexts cover all possible values of a bit selection. These constraints have limited the applicability of integral cryptanalysis.
In this paper, we aim to address these limitations and propose a novel statistical integral distinguisher in which only a subset of the values of these input bit selections is taken into consideration instead of all possible values. This enables us to achieve significantly lower data complexities for our statistical integral distinguisher than for the traditional integral distinguisher. As an illustration, we successfully attack the full-round Skipjack-BABABABA for the first time, a variant of NSA’s Skipjack block cipher.
Keywords
Block cipher, Statistical integral, Integral attack, Skipjack-BABABABA
1 Introduction
The integral attack is an important cryptanalytic technique for symmetric-key ciphers. It was originally proposed by Knudsen as a dedicated attack against the Square cipher [7]. Later, Knudsen and Wagner unified it as the integral attack [11]. The integral distinguisher of this attack makes use of the balanced property: one fixes a part of the plaintext bits and takes all possible values for the other plaintext bits such that a specific part of the corresponding ciphertext becomes balanced, i.e., each possible partial value of the ciphertext occurs exactly the same number of times. If one additional linear layer after this distinguisher is considered, the property becomes that the XOR of all possible values of the specific part of the ciphertext is zero, referred to as the zero-sum property [1] throughout this paper^{1}. As variants of the original integral distinguisher, the saturation distinguisher [15] and the multiset distinguisher [3] also use the same balanced property or zero-sum property, holding with probability one.
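The balanced and zero-sum properties can be illustrated on a toy example. The following is a minimal sketch, assuming a hypothetical 8-bit keyed bijection (a key addition followed by a randomly generated S-box); the sizes, the key, and the function `toy_round` are illustrative assumptions and do not correspond to any real cipher.

```python
import random
from collections import Counter
from functools import reduce

S_BITS = 8  # number of active input bits (toy size, an assumption)
T_BITS = 3  # number of observed output bits (toy size, an assumption)

rng = random.Random(2016)
sbox = list(range(1 << S_BITS))
rng.shuffle(sbox)  # hypothetical random bijective S-box, not from any real cipher


def toy_round(x: int, key: int) -> int:
    """Key addition followed by a bijective S-box; a bijection for every key."""
    return sbox[x ^ key]


key = 0xA7  # arbitrary illustrative key
outputs = [toy_round(x, key) for x in range(1 << S_BITS)]

# Balanced property: every t-bit value occurs exactly 2^(s-t) times.
counts = Counter(y & ((1 << T_BITS) - 1) for y in outputs)
balanced = all(counts[v] == 1 << (S_BITS - T_BITS) for v in range(1 << T_BITS))

# Zero-sum property: the XOR of all outputs is zero.
xor_sum = reduce(lambda a, b: a ^ b, outputs, 0)

print(balanced, xor_sum)
```

Because all inputs are taken, the outputs form a permutation of all 8-bit values, so both properties hold for every key; they stop holding deterministically as soon as only a subset of the inputs is used.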
The statistical saturation attack, proposed by Collard and Standaert in [6], is different from the integral attack. Here, by choosing a plaintext set with some bits fixed while the others vary randomly, the statistical saturation distinguisher tracks the evolution of a non-uniform plaintext distribution through the cipher instead of observing the evolution of the plaintext bits as in the integral distinguisher. In other words, the statistical saturation distinguisher requires the same inputs as the integral distinguisher, but uses a different property on the output side to distinguish between right and wrong key guesses. Leander showed in [13] that the statistical saturation distinguisher is on average identical to the multidimensional linear distinguisher; it therefore exploits an advantage (bias or capacity), whereas the balanced property used in the integral distinguisher has no bias. The first publication of the statistical saturation distinguisher came without a method to estimate its complexity. However, this complexity was later shown to be inversely proportional to the capacity, or to the square of the capacity, of the output under the chosen input set [4, 13]. Block ciphers such as PRESENT and PUFFIN are natural targets for statistical saturation attacks as well as linear cryptanalysis, but integral cryptanalysis has not proven efficient for them [21, 22]. This highlights the difference between the integral distinguisher and the statistical saturation distinguisher.
The integral attack has been widely applied to many other block ciphers. To reduce the time complexity of higher-order differential attacks, including integral attacks, against round functions of low degree, Moriai et al. gave an improved method in [16]. Ferguson et al. proposed the partial-sum technique in [8]. Sasaki and Wang presented the meet-in-the-middle technique for integral attacks on Feistel ciphers in [17].
So far, the data complexity for a given integral has been determined by taking all values of a bit selection at the input of the balanced property. However, there are cases where it is possible or even desirable to shift the tradeoff from data towards time. Often it is the data requirement that exceeds the limit while the time complexity budget of an attack is far from exhausted. In these cases, it is therefore of paramount importance to reduce the data complexity of an attack to make it applicable. An interesting example of this behaviour is NSA’s Skipjack variant Skipjack-BABABABA studied at ASIACRYPT’12 [5]. It has been attacked for 31 rounds with an integral distinguisher, whereas the data complexity prevents the attack from applying to the full 32 rounds. In this paper, we aim to remove this restriction by proposing a novel type of integral distinguisher that features a lower data complexity with non-balanced output bits that are still distinguishable from random.
1.1 Our Contributions
Integrals Go Statistical. We propose a new statistical integral distinguisher that applies a statistical technique on top of the original integral distinguisher with the balanced property. The proposed statistical integral distinguisher requires less data than the original integral distinguisher. Although the balanced property does not strictly hold in the statistical integral distinguisher, we prove that the distribution of output values for a cipher can be distinguished from the distribution of output values that originate from a random permutation. This allows us to construct our statistical integral distinguisher. To quantify the advantage, let s be the number of input bits that take all possible values while the other input bits are fixed, and let t be the number of output bits that are balanced. Then, for the original integral distinguisher, the data complexity is \(\mathcal {O}(2^s)\). By contrast, with our new statistical integral distinguisher, the data complexity is reduced to \(\mathcal {O}(2^{s-\frac{t}{2}})\).
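To give a feel for the saving, the two asymptotic expressions can be compared for a few parameter choices. The sketch below only evaluates the exponents s and s − t/2 and ignores the constants hidden in the \(\mathcal{O}\)-notation; the helper name and the (s, t) pairs are illustrative assumptions, not parameters of any specific attack.

```python
def log2_data_complexity(s: int, t: int) -> tuple[float, float]:
    """Return the exponents (traditional, statistical) of the data complexity.

    Constants hidden in the O-notation are ignored, so these are only
    orders of magnitude.
    """
    return float(s), s - t / 2


# Illustrative (s, t) pairs, chosen only to show the trend.
for s, t in [(16, 8), (32, 8), (48, 16)]:
    traditional, statistical = log2_data_complexity(s, t)
    print(f"s={s:2d}, t={t:2d}: 2^{traditional:.0f} vs. about 2^{statistical:.0f}")
```

The saving grows with the number t of balanced output bits that are observed, which is why the approach suits ciphers with wide balanced words.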
In summary, the statistical integral attacks we propose have lower data complexity than traditional integral attacks. From [5, 19], the traditional integral distinguisher with the balanced property can be converted to a zero-correlation integral distinguisher, so our proposed statistical integral attacks can be regarded as chosen-plaintext multidimensional zero-correlation attacks.
Note that the statistical integral attack differs from the statistical saturation attack: the two use different distinguishers, and the statistical integral attack is efficient for word-wise ciphers whereas the statistical saturation attack seems to be suited to bit-wise ciphers.
The effectiveness of our proposed statistical integral distinguisher is demonstrated by the key-recovery attack on the full-round Skipjack-BABABABA.
Summary of attacks on Skipjack-BABABABA
Outline. The new statistical integral distinguisher is established in Sect. 2. Section 3 presents the attack on the full-round Skipjack-BABABABA and the improved attack on 31-round Skipjack-BABABABA. Finally, the paper is concluded in Sect. 4.
2 Statistical Integral Distinguisher
2.1 Integral Distinguisher
2.2 Statistical Integral Distinguisher
Assume that we need N different values of y to distinguish the above two distributions. A t-bit value \(T_\lambda (y)\in \mathbb {F}_2^t\) is computed for each y, and we allocate a counter vector \(V[T_\lambda (y)]\), \(T_\lambda (y)\in \mathbb {F}_2^t\), with all counters initialized to zero. These counters keep track of the number of occurrences of each value \(T_\lambda (y)\). Usually t is much smaller than the block size n.

If there is one or more values of \(T_{\lambda }(y)\) satisfying \(V[T_{\lambda }(y)]>2^{s-t}\), then output random permutation.

If there is no value of \(T_{\lambda }(y)\) satisfying \(V[T_{\lambda }(y)]>2^{s-t}\), then output actual cipher.
This statistic C follows different distributions determined by whether we are dealing with an actual cipher (right key guess) or a random permutation (wrong key guess).
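The counting procedure can be sketched on toy parameters. The code below is a minimal sketch under several assumptions: the sizes s = 12 and t = 4 are arbitrary, `t_lambda` is a hypothetical choice of \(T_\lambda\) (the low t bits), the "cipher" is a stand-in random permutation whose full output set is balanced, and the statistic is assumed to take the Pearson form \(\sum_{i}(V[i]-N2^{-t})^2/(N2^{-t})\), in line with the Pearson statistic used in the proof.

```python
import random
from collections import Counter

S_BITS, T_BITS = 12, 4        # toy sizes, chosen only for illustration
N = 1 << 9                    # number of samples, fewer than 2^s
MASK = (1 << T_BITS) - 1

rng = random.Random(1)
perm = list(range(1 << S_BITS))
rng.shuffle(perm)             # hypothetical stand-in for the cipher's balanced output


def t_lambda(y: int) -> int:
    """Hypothetical T_lambda: keep the low t bits of y."""
    return y & MASK


def counters(values) -> Counter:
    """Counter vector V, indexed by T_lambda(y); absent entries count as zero."""
    return Counter(t_lambda(y) for y in values)


def overflow(v: Counter) -> bool:
    """True if some counter exceeds 2^(s-t), which a balanced set can never produce."""
    return any(c > 1 << (S_BITS - T_BITS) for c in v.values())


def statistic(v: Counter, n: int) -> float:
    """Pearson-type statistic over the 2^t counters (assumed form of C)."""
    expected = n / (1 << T_BITS)
    return sum((v[i] - expected) ** 2 / expected for i in range(1 << T_BITS))


# "Cipher" case: N distinct inputs, so the t-bit values are drawn without
# replacement from a set in which each value occurs exactly 2^(s-t) times.
cipher_vals = [perm[x] for x in rng.sample(range(1 << S_BITS), N)]
# "Random" case: the t-bit values behave like independent uniform draws.
random_vals = [rng.randrange(1 << S_BITS) for _ in range(N)]

v_cipher, v_random = counters(cipher_vals), counters(random_vals)
print(overflow(v_cipher), statistic(v_cipher, N), statistic(v_random, N))
```

A single run with so few samples does not separate the two cases reliably; the point of the sketch is only the bookkeeping, while the required N follows from the distributions analysed next.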
Proposition 1
Proof
For a randomly drawn permutation, the values of \(V[T_\lambda (y)]\) are obtained by counting the occurrences of \(T_\lambda (y)\) when the values are chosen uniformly at random, which follows the multinomial distribution with parameters N and \(\varvec{p}=(p_0, \ldots , p_{2^t-1})\), \(p_i=2^{-t}\) \((0 \le i= T_\lambda (y)< 2^t)\).
The well-known result on Pearson’s \(\chi ^2\) statistic is that \(\sum _{i=1}^{k}\frac{(X_{i}-np_{i})^{2}}{np_{i}}\) asymptotically follows a \(\chi ^{2}\)-distribution with \(k-1\) degrees of freedom, where the vector \(X = (X_1, \ldots , X_k)\) follows a multinomial distribution with parameters n and \(\varvec{p} = (p_1, \ldots , p_k)\). We give a short proof for Pearson’s \(\chi ^2\) statistic in Appendix A.1 based on [9, 14].
If the vector \(X = (X_1, \ldots , X_k)\) follows a multivariate hypergeometric distribution with parameters \((\varvec{K},m,n)\), where \(\varvec{K}=(K_1,\ldots , K_k)\) with \(\sum _{i=1}^k K_i=m\) and \(p_i=K_i/m\), the statistic \(\frac{m-1}{m-n}\sum _{i=1}^{k}\frac{(X_{i}-np_{i})^{2}}{np_{i}}\) follows a \(\chi ^{2}\)-distribution with \(k-1\) degrees of freedom, which is proved in Appendix A.2.
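The effect of the factor \(\frac{m-1}{m-n}\) can be checked numerically. The following Monte Carlo sketch compares the average Pearson statistic when sampling with replacement (multinomial) and without replacement (multivariate hypergeometric); the toy sizes, the seed, and the equal cell sizes \(K_i = m/k\) are assumptions made only for this illustration. The expected averages are \(k-1\) and \((k-1)\frac{m-n}{m-1}\), respectively.

```python
import random
from collections import Counter

K = 8            # number of cells k (toy value)
M = 1 << 10      # population size m, with K_i = m / k items per cell
N = 1 << 9       # sample size n
TRIALS = 2000    # number of Monte Carlo repetitions

rng = random.Random(7)
population = [i % K for i in range(M)]  # each cell label appears m / k times


def pearson(sample) -> float:
    """Pearson statistic of a sample of cell labels with p_i = 1 / k."""
    counts = Counter(sample)
    expected = len(sample) / K
    return sum((counts[i] - expected) ** 2 / expected for i in range(K))


# Sampling with replacement: multinomial counts.
with_repl = sum(pearson(rng.choices(population, k=N)) for _ in range(TRIALS)) / TRIALS
# Sampling without replacement: multivariate hypergeometric counts.
without_repl = sum(pearson(rng.sample(population, N)) for _ in range(TRIALS)) / TRIALS

print("with replacement:   ", with_repl, "expected", K - 1)
print("without replacement:", without_repl, "expected", (K - 1) * (M - N) / (M - 1))
```

The smaller mean in the second case is exactly the gap that the statistical integral distinguisher exploits: distinct chosen plaintexts under the right key behave like sampling without replacement from a balanced set.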
To distinguish these two normal distributions with different means and variances, one can compute the data complexity required as follows, given error probabilities.
Corollary 1
Note that this statistical test is based on the decision threshold \(\tau =\mu _0+\sigma _0q_{1-\alpha _0}=\mu _1-\sigma _1q_{1-\alpha _1}\): if \(C\le \tau \), the test outputs ‘cipher’. Otherwise, if \(C>\tau \), the test outputs ‘random’.
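The threshold equation can be solved for the number of samples N. The sketch below is a worked example under stated assumptions rather than a restatement of the corollary: it assumes that, under a random permutation, C is approximately normal with \(\mu_1 = 2^t-1\) and \(\sigma_1^2 = 2(2^t-1)\) (the normal approximation of a \(\chi^2\) variable), and that under the cipher both moments are scaled by \(r = \frac{2^s-N}{2^s-1}\), as suggested by the hypergeometric result in the proof. The parameter values are illustrative only.

```python
from math import isclose, log2, sqrt
from statistics import NormalDist


def quantile(p: float) -> float:
    """Quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(p)


def moments(s: int, t: int, n: float):
    """Assumed normal approximations (mu0, sigma0, mu1, sigma1)."""
    d = 2 ** t - 1                      # degrees of freedom
    r = (2 ** s - n) / (2 ** s - 1)     # scaling for sampling without replacement
    return r * d, r * sqrt(2 * d), d, sqrt(2 * d)


def data_complexity(s: int, t: int, alpha0: float, alpha1: float) -> float:
    """Solve mu0 + sigma0 * q0 = mu1 - sigma1 * q1 for N."""
    d = 2 ** t - 1
    q0, q1 = quantile(1 - alpha0), quantile(1 - alpha1)
    return (2 ** s - 1) * (q0 + q1) / (sqrt(d / 2) + q0) + 1


s, t, alpha0, alpha1 = 48, 16, 0.1, 2 ** -4   # illustrative parameters only
n = data_complexity(s, t, alpha0, alpha1)
mu0, sigma0, mu1, sigma1 = moments(s, t, n)
tau_cipher = mu0 + sigma0 * quantile(1 - alpha0)   # threshold seen from the cipher side
tau_random = mu1 - sigma1 * quantile(1 - alpha1)   # threshold seen from the random side
print(log2(n), tau_cipher, tau_random)
```

For these parameters the resulting N is within a small constant factor of \(2^{s-t/2}\), and the two expressions for \(\tau\) coincide, which is the consistency condition the decision rule relies on.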
As the integral distinguisher with the balanced property is equivalent to the multidimensional zero-correlation distinguisher [5], the statistical integral attacks can be regarded as chosen-plaintext multidimensional zero-correlation attacks, which require lower data complexity than known-plaintext multidimensional zero-correlation attacks.
2.3 Experiment Results
3 Statistical Integral Attack on Skipjack-BABABABA
3.1 Skipjack and Its Variant Skipjack-BABABABA
Before SIMON and SPECK were proposed in 2013, Skipjack [18] was the only block cipher known to be designed by NSA (declassified in 1998). Skipjack is a 64-bit block cipher with an 80-bit key adopting an unbalanced Feistel network with 32 rounds of two types, namely Rule A and Rule B. The 64-bit block of Skipjack is divided into four 16-bit words, and each round is described in the form of a linear feedback shift register with an additional nonlinear keyed G permutation. The keyed G permutation \(G:\mathbb {F}_2^{32}\times \mathbb {F}_2^{16}\rightarrow \mathbb {F}_2^{16}\) consists of a 4-round Feistel structure whose internal function \(F:\mathbb {F}_2^{8}\rightarrow \mathbb {F}_2^8\) is an \(8\times 8\) S-box. Skipjack applies eight rounds of Rule A, followed by eight rounds of Rule B, once again eight rounds of Rule A, and finally eight rounds of Rule B. The key schedule of Skipjack takes a 10-byte secret key and uses four bytes at a time to key each G permutation; thus Skipjack’s key schedule has a periodicity of five rounds. In this section, we use \(k_0, k_1, \ldots , k_9\) to denote the ten bytes of the secret key. The original Skipjack is often referred to as Skipjack-AABBAABB, where A denotes four rounds of Rule A and B denotes four rounds of Rule B. A variant of Skipjack, namely Skipjack-BABABABA, consisting of four iterations of four rounds of Rule B followed by four rounds of Rule A, is also discussed. This variant has the same number of rounds and the same key schedule as Skipjack-AABBAABB.
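The round structure and the key-schedule periodicity can be made concrete with a few lines of code. This is a minimal sketch, assuming that round r (counted from 0) consumes the four key bytes \(k_{4r}, \ldots , k_{4r+3}\) with indices taken modulo 10; the function names are hypothetical, and the G permutation itself is not implemented.

```python
KEY_BYTES = 10   # k_0, ..., k_9
ROUNDS = 32


def round_key_indices(r: int) -> list[int]:
    """Indices of the four key bytes keying G in round r (0-indexed).

    Assumes the bytes are consumed consecutively and cyclically.
    """
    return [(4 * r + j) % KEY_BYTES for j in range(4)]


def rule_sequence(pattern: str) -> str:
    """Expand a pattern such as 'BABABABA' (one letter = 4 rounds) to one rule per round."""
    return "".join(letter * 4 for letter in pattern)


schedule = [round_key_indices(r) for r in range(ROUNDS)]
original = rule_sequence("AABBAABB")   # Skipjack-AABBAABB
variant = rule_sequence("BABABABA")    # Skipjack-BABABABA

for r in range(6):
    print(r, variant[r], schedule[r])
# Round 5 reuses the key bytes of round 0, i.e. the schedule has period 5.
```

Since 4 × 5 = 20 is the smallest multiple of 4 that is also a multiple of 10, the key bytes repeat every five rounds regardless of the order of the rules.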
Since its declassification, Skipjack-AABBAABB has sparked numerous security analyses. Among them, the best known cryptanalytic result against Skipjack-AABBAABB was reported more than a decade ago by Biham et al. [2] at EUROCRYPT’99, where a 24-round impossible differential was revealed and used to mount an attack against 31-round Skipjack-AABBAABB. Besides this considerable body of security analysis, Skipjack’s structure was also studied in order to discuss variants of Skipjack with improved strength. In [10, 12], Knudsen et al. suggested that putting Rule B before Rule A, as in the aforementioned Skipjack-BABABABA, might improve the resistance to truncated differential attacks. Until now, the only security analysis of Skipjack-BABABABA was reported by Bogdanov et al. [5] at ASIACRYPT’12, where an integral distinguisher over 30-round Skipjack-BABABABA was used to attack a 31-round version.
3.2 Integral Distinguisher of Skipjack-BABABABA
To attack the full-round Skipjack-BABABABA, we use the 30-round integral distinguisher proposed at ASIACRYPT’12 [5]. The 30-round integral distinguisher can be described as follows: when we take all \(2^{48}\) possible values for the input of round 2 \((\alpha ^2,\beta ^2,\gamma ^2,\delta ^2)\) with \(\delta ^2=\alpha ^2\), the set of all corresponding values of \(\beta ^{32}\oplus \gamma ^{32}\) at the output of round 31 is balanced.
3.3 Key Recovery Attack on 32-Round Skipjack-BABABABA
Complexity Estimation. In Step 8 and Step 9, the time complexity is \(2^{61.7} \cdot 2=2^{62.7}\) memory accesses, which is equivalent to \(2^{62.7}\) encryptions. Next, Step 15 needs about \(2^{32}\cdot 2^{16}=2^{48}\) computations of G, equivalent to \(2^{48}\cdot \frac{1}{32}=2^{43}\) encryptions. Suppose that one memory access to an array of size \(2^{24}\) and one to an array of size \(2^{61.7}\) are equivalent to a one-round encryption and a full cipher encryption, respectively; then Steps 17 and 18 need about \(2^{32}\cdot 2^{16}\cdot 2^{16}\cdot 2^{13.7} \cdot (1+\frac{1}{32})\approx 2^{77.7}\) encryptions. The operations in Step 23 and Step 24 are comparable to a half-round encryption, which amounts to about \(2^{32}\cdot 2^{16}\cdot 2^{24}\cdot \frac{1}{2}\cdot \frac{1}{32}=2^{66}\) encryptions. In the same way, we regard the operations in Step 29 and Step 30 as a half-round encryption, so the time complexity of these two steps is about \(2^{32}\cdot 2^{16}\cdot 2^{8}\cdot 2^{16}\cdot \frac{1}{2}\cdot \frac{1}{32}=2^{66}\) encryptions. As we set the wrong-key filtration ratio to \(\alpha _1=2^{-4}\), in Step 33 we need to exhaustively search about \(2^{80-4}=2^{76}\) key values to find the right key. To summarize, the time complexity of our key recovery attack on the full-round Skipjack-BABABABA is about \(2^{62.7}+2^{43}+2^{77.7}+2^{66}+2^{66}+2^{76}\approx 2^{78.1}\) encryptions. As for the data complexity, all possible values of \(\delta ^1\) are iterated through in Step 6. Thus our attack needs about \(2^{61.7}\) chosen plaintexts. The dominant memory requirement is the storage of the plaintext/ciphertext pairs in Step 1, which needs about \(2\times 2^{61.7}\times 8 =2^{65.7}\) bytes.
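The arithmetic of this estimate can be re-checked mechanically by working with base-2 exponents. The sketch below simply re-adds the per-step figures quoted above; the dictionary keys are labels for readability, and no part of the attack itself is implemented.

```python
from math import log2

# Base-2 exponents of the per-step costs, in units of full encryptions.
time_terms = {
    "Steps 8-9": 61.7 + 1,
    "Step 15": 32 + 16 - 5,                                   # 2^48 G computations / 32
    "Steps 17-18": 32 + 16 + 16 + 13.7 + log2(1 + 1 / 32),
    "Steps 23-24": 32 + 16 + 24 - 1 - 5,                      # half round out of 32 rounds
    "Steps 29-30": 32 + 16 + 8 + 16 - 1 - 5,
    "Step 33": 80 - 4,                                        # keys surviving alpha_1 = 2^-4
}

total_time = log2(sum(2.0 ** e for e in time_terms.values()))
memory_bytes = log2(2) + 61.7 + log2(8)   # two 8-byte blocks per stored pair

for name, exponent in time_terms.items():
    print(f"{name:12s} 2^{exponent:.1f}")
print(f"total time  2^{total_time:.1f}")
print(f"memory      2^{memory_bytes:.1f} bytes")
```

The total is dominated by Steps 17-18 and by the final exhaustive search; all other terms change the exponent only in the second decimal place.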
3.4 Improved Integral Attack on 31-Round Skipjack
Complexity Estimation. Assuming that one memory access is equivalent to a one-round encryption, Steps 3 and 4 need about \(2^{46.8}\times \frac{1}{31}\approx 2^{41.8}\) encryptions. The operations in Steps 9 and 10 then amount to about \(2^{16}\times 2^{24}\times \frac{1}{2}\times \frac{1}{31}\approx 2^{34.0}\) encryptions. Steps 15 and 16 need about \(2^{16}\times 2^8 \times 2^{16}\times \frac{1}{2}\times \frac{1}{31}\approx 2^{34.0}\) encryptions. As we set the wrong-key filtration ratio to \(2^{-16}\), about \(2^{24-16}=2^8\) values of the key \((k_5, k_6, k_7)\) remain in Step 19. Up to this point, we have exploited the integral property of \(\beta ^{32}_R \oplus \gamma ^{32}_R\) to filter out most wrong keys. Next, we use the integral property of \(\beta ^{32}_L \oplus \gamma ^{32}_L\) to filter all wrong keys of \((k_4, k_5, k_6, k_7)\). Step 25 needs about \(2^{8}\times 2^8 \times 2^{24}\times \frac{1}{31}\approx 2^{35.0}\) encryptions. Finally, by setting \(\alpha _1=2^{-16}\), we need to exhaustively search about \(2^{80-16-16}=2^{48}\) key values in Step 28 to find the right key. In total, the time complexity is about \(2^{41.8}+2^{34.0}+2^{34.0}+2^{35.0}+2^{48}\approx 2^{48}\) encryptions. The dominant memory complexity is required in Step 1, which is about \(2 \times 2^{24}\times 3 \approx 2^{27.6}\) bytes.
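The time figures of the 31-round attack, including the number of key candidates that survive each filtering stage, can be re-derived in the same exponent arithmetic. This is only a sketch re-adding the quantities stated in the text; the variable names are illustrative.

```python
from math import log2

ROUND_COST = log2(31)  # one round costs 1/31 of a 31-round encryption

# Key candidates that survive the two filtering stages (base-2 exponents).
remaining_after_first_filter = 24 - 16          # candidates for (k5, k6, k7)
remaining_for_exhaustive_search = 80 - 16 - 16  # candidates for the full 80-bit key

time_terms = {
    "Steps 3-4": 46.8 - ROUND_COST,
    "Steps 9-10": 16 + 24 - 1 - ROUND_COST,
    "Steps 15-16": 16 + 8 + 16 - 1 - ROUND_COST,
    "Step 25": remaining_after_first_filter + 8 + 24 - ROUND_COST,
    "Step 28": remaining_for_exhaustive_search,
}

total_time = log2(sum(2.0 ** e for e in time_terms.values()))
for name, exponent in time_terms.items():
    print(f"{name:12s} 2^{exponent:.1f}")
print(f"total time  2^{total_time:.1f}")
```

Here the final exhaustive search alone determines the total, so a stricter filtration ratio would translate directly into a lower time complexity at the price of more data.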
4 Conclusion
In this paper, we propose the statistical integral attack, in which a statistical technique is applied to the original integral distinguisher with the balanced property. The new integral attack has lower data complexity than the original one. Our experiment on a mini version of AES shows that the experimental results are in good accordance with the theoretical results. What is more, with this new distinguisher we improve the previous integral attack on 31-round Skipjack-BABABABA and achieve a full-round attack on Skipjack-BABABABA. In the future, we will apply the statistical integral model to other block ciphers that are vulnerable to integral attacks.
Footnotes
1. Although the balanced property commonly refers to the zero-sum property, the balanced property used in this paper is the active, or ALL, property.
Acknowledgments
This work has been supported by the 973 Program (No. 2013CB834205), NSFC Projects (No. 61133013, No. 61572293), and the Program for New Century Excellent Talents in University of China (NCET-13-0350).
References
1. Aumasson, J.-P., Meier, W.: Zero-Sum Distinguishers for Reduced Keccak-f and for the Core Functions of Luffa and Hamsi. Presented at the rump session of Cryptographic Hardware and Embedded Systems, CHES 2009 (2009)
2. Biham, E., Biryukov, A., Shamir, A.: Cryptanalysis of Skipjack reduced to 31 rounds using impossible differentials. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 12–23. Springer, Heidelberg (1999)
3. Biryukov, A., Shamir, A.: Structural cryptanalysis of SASAS. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 394–405. Springer, Heidelberg (2001)
4. Blondeau, C., Nyberg, K.: Links between truncated differential and multidimensional linear properties of block ciphers and underlying attack complexities. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 165–182. Springer, Heidelberg (2014)
5. Bogdanov, A., Leander, G., Nyberg, K., Wang, M.: Integral and multidimensional linear distinguishers with correlation zero. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 244–261. Springer, Heidelberg (2012)
6. Collard, B., Standaert, F.-X.: A statistical saturation attack against the block cipher PRESENT. In: Fischlin, M. (ed.) CT-RSA 2009. LNCS, vol. 5473, pp. 195–210. Springer, Heidelberg (2009)
7. Daemen, J., Knudsen, L.R., Rijmen, V.: The block cipher SQUARE. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 149–165. Springer, Heidelberg (1997)
8. Ferguson, N., Kelsey, J., Lucks, S., Schneier, B., Stay, M., Wagner, D., Whiting, D.L.: Improved cryptanalysis of Rijndael. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 213–230. Springer, Heidelberg (2001)
9. Ferguson, T.S.: A Course in Large Sample Theory. Chapman and Hall, London (1996)
10. Knudsen, L.R., Robshaw, M., Wagner, D.: Truncated differentials and Skipjack. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 165–180. Springer, Heidelberg (1999)
11. Knudsen, L.R., Wagner, D.: Integral cryptanalysis. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 112–127. Springer, Heidelberg (2002)
12. Knudsen, L.R., Wagner, D.: On the structure of Skipjack. Discrete Appl. Math. 111(1–2), 103–116 (2001). Elsevier
13. Leander, G.: On linear hulls, statistical saturation attacks, PRESENT and a cryptanalysis of PUFFIN. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 303–322. Springer, Heidelberg (2011)
14. Lehmann, E.L.: Elements of Large-Sample Theory. Springer, New York (1999)
15. Lucks, S.: The saturation attack - a bait for Twofish. In: Matsui, M. (ed.) FSE 2001. LNCS, vol. 2355, pp. 1–15. Springer, Heidelberg (2002)
16. Moriai, S., Shimoyama, T., Kaneko, T.: Higher order differential attack of a CAST cipher. In: Vaudenay, S. (ed.) FSE 1998. LNCS, vol. 1372, pp. 17–31. Springer, Heidelberg (1998)
17. Sasaki, Y., Wang, L.: Meet-in-the-middle technique for integral attacks against Feistel ciphers. In: Knudsen, L.R., Wu, H. (eds.) SAC 2012. LNCS, vol. 7707, pp. 234–251. Springer, Heidelberg (2013)
18. Skipjack and KEA Algorithm Specifications, Version 2.0, 29 May 1998. Available at the National Institute of Standards and Technology’s web page. http://csrc.nist.gov/groups/ST/toolkit/documents/skipjack/skipjack.pdf
19. Sun, B., Liu, Z., Rijmen, V., Li, R., Cheng, L., Wang, Q., Alkhzaimi, H., Li, C.: Links among Impossible Differential, Integral and Zero Correlation Linear Cryptanalysis. http://eprint.iacr.org/2015/181.pdf
20. Vaudenay, S.: An experiment on DES statistical cryptanalysis. In: Proceedings of the 3rd ACM Conference on Computer and Communications Security, pp. 139–147. ACM (1996)
21. Wu, S., Wang, M.: Integral attacks on reduced-round PRESENT. In: Qing, S., Zhou, J., Liu, D. (eds.) ICICS 2013. LNCS, vol. 8233, pp. 331–345. Springer, Heidelberg (2013)
22. Z’aba, M.R., Raddum, H., Henricksen, M., Dawson, E.: Bit-pattern based integral attack. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 363–381. Springer, Heidelberg (2008)