
1 Introduction

A tweakable block cipher (TBC) is a highly versatile symmetric-key primitive that has found applications across modern information security, including encryption schemes [8], message authentication codes [21], authenticated encryption [29, 38], and even leakage resilience [42]. The popularity of TBCs is largely credited to the simplicity of TBC-based constructions and, more importantly, to their comparatively simpler proofs of beyond-the-birthday-bound (BBB) security.

In a seminal paper [32] at CRYPTO 2002, Liskov, Rivest, and Wagner (LRW) formalized the notion of tweakable block ciphers (TBCs), although the high-level idea already appeared in some AES candidates such as Hasty Pudding [41] and Misty [13]. Over the years, the design landscape of TBCs has changed progressively. The design of a TBC mainly falls into one of two categories: ad hoc designs based on well-established primitive design paradigms, or provably secure designs based on block ciphers or cryptographic permutations. In recent years, ad hoc designs have gained momentum with the advent of the TWEAKEY framework [22], its chief examples being Deoxys-TBC [23], Skinny [6], and Qarma [2]. These designs are built from scratch, and their security mainly depends on cryptanalysis. On the other hand, the security of provably secure designs is directly linked to the security of the underlying primitives, such as a block cipher, a permutation, or a pseudorandom function. Some prominent examples include LRW’s original constructions [32], LRW1 and LRW2, XEX [40] by Rogaway, and its extensions by Chakraborty and Sarkar [9], Minematsu [35], and Granger et al. [16]. Note that all these schemes are inherently birthday-bound secure due to detectable internal collisions.

Cascading LRW2: Landecker et al. were the first to notice [31] that a cascading of two independent instances of LRW2 results in a BBB secure TBC construction. They proved that 2-round cascaded LRW2 is secure up to approx. \( 2^{2n/3} \) CCA queries, where n denotes the block size in bits. The initial proof was flawed [39], and superseded by a corrected proof by both Landecker et al. and Procter [39]. The construction was later found [26, 34] to be tightly secure up to \( 2^{3n/4} \) CCA queries. For any arbitrary \( r \ge 2 \)-round independent cascading of LRW2, denoted \( {r}\hbox {-}\textsf {LRW2} \), Lampe and Seurin proved [30] CCA security up to approx. \( 2^{\frac{r n}{r+2}} \) queries.

Cascading LRW1: The idea of cascading LRW1 came considerably later in [4], where Bao et al. showed that the 3-round cascade of LRW1, referred to as TNT, is CCA secure up to \( 2^{2n/3} \) queries. The design is highly appreciated in the community for its simplicity and high provable security guarantee. In fact, the CPA security was later improved to \( 2^{3n/4} \) queries, essentially matching the bound for 2-round LRW2. Since this later result, it has been widely believed that the CPA improvement carries over to the CCA setting as well. For the more general case of arbitrary \( r \ge 3 \), denoted \({r}\hbox {-}\textsf {LRW1} \), Zhang et al. proved [43] CCA security up to approx. \(2^{\frac{r-1}{r+1}n} \) queries.

We remark that the aforementioned LRW-based constructions are all studied under the standard assumption on the pseudorandomness of the underlying block ciphers. However, several good constructions are also based on cryptographic permutations [11, 12] and even rekeying of block ciphers [25, 33, 35]. We skip a detailed discussion on these ideal model constructions since the focus here is specific to the LRW design paradigm. We encourage the readers to see [26, 34] for a more inclusive discussion on ideal model constructions.

1.1 Motivation

The primary motivation behind this work is a peculiar non-random behavior exhibited by TNT in the CCA setting.

Suppose \( \pmb {\pi }_1,\pmb {\pi }_2,\pmb {\pi }_3 \) are three independent random permutations of \(\{0,1\}^{n}\). The TNT construction (see Fig. 1) based on \(\pmb {\pi }_1,\pmb {\pi }_2,\pmb {\pi }_3\) is a TBC with n-bit tweak and n-bit block input, defined by the mapping

$$ \textsf {TNT} (t,m) := \pmb {\pi }_3\big (t \oplus \pmb {\pi }_2(t \oplus \pmb {\pi }_1(m))\big ). $$

Fig. 1. The TNT construction [4].
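To make the mapping concrete, the following is a minimal toy sketch in Python, assuming the TNT mapping \( \textsf {TNT} (t,m) = \pmb {\pi }_3(t \oplus \pmb {\pi }_2(t \oplus \pmb {\pi }_1(m))) \) of [4] and replacing the three permutations by small lookup tables. The names `ToyTNT` and `random_permutation` are our own illustrative choices, and Python's `random` module is not cryptographically secure, so this is only suitable for experiments with small n.

```python
import random


def random_permutation(n_bits, rng):
    """Return (forward, inverse) lookup tables of a random permutation of {0,1}^n."""
    size = 1 << n_bits
    forward = list(range(size))
    rng.shuffle(forward)
    inverse = [0] * size
    for x, y in enumerate(forward):
        inverse[y] = x
    return forward, inverse


class ToyTNT:
    """Toy TNT: encrypt(t, m) = pi3(t XOR pi2(t XOR pi1(m))), n-bit tweak and block."""

    def __init__(self, n_bits, seed=0):
        rng = random.Random(seed)  # illustrative only, not a secure key schedule
        self.p1, self.p1_inv = random_permutation(n_bits, rng)
        self.p2, self.p2_inv = random_permutation(n_bits, rng)
        self.p3, self.p3_inv = random_permutation(n_bits, rng)

    def encrypt(self, t, m):
        return self.p3[t ^ self.p2[t ^ self.p1[m]]]

    def decrypt(self, t, c):
        # Undo the three layers in reverse order, re-applying the tweak XORs.
        return self.p1_inv[t ^ self.p2_inv[t ^ self.p3_inv[c]]]


if __name__ == "__main__":
    tnt = ToyTNT(n_bits=8, seed=1)
    t, m = 0x3A, 0x5C
    c = tnt.encrypt(t, m)
    print(hex(c), tnt.decrypt(t, c) == m)
```

For every fixed tweak the toy cipher is a permutation of the block space, so decryption under the same tweak always recovers the plaintext.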

As can be seen from the definition of TNT, the construction has a peculiar property, which we refer to as the final-block cancellation property. Specifically, suppose we have a triple \( (t,m,\widehat{c}) \) such that \( \textsf {TNT} (t,m) = \widehat{c}\). Then, it is easy to see that any inverse query of the form \( (t',\widehat{c}) \) would result in a cancellation of the call to \( \pmb {\pi }_3 \), and this is independent of the tweak values t and \( t' = t \oplus \delta \). Essentially, the construction boils down to the one in Fig. 2. Let us call it \(\textsf {TNT} _{\delta ,m}\) for some fixed \(\delta \ne 0^n\) and \(m \in \{0,1\}^{n}\). For a fixed m, we have \(u_1 \oplus u_2 = t_1 \oplus t_2\). Now, suppose the adversary can find a pair of tweaks \((t_1,t_2)\) such that there is a collision at the output, i.e.,

Fig. 2. TNT with final-block cancellation.

$$ (m'_1 = m'_2) \iff (\widehat{m}'_1 = \widehat{m}'_2) \iff (u'_1 \oplus u'_2 = t_1 \oplus t_2) $$

So, an output collision happens if and only if \( u'_1 \oplus u'_2 = u_1 \oplus u_2 \). Interestingly, for \( \textsf {TNT} _{\delta ,m} \), we have the following property:

$$ (\widehat{u}_1 \oplus \widehat{u}_2 = \delta ) \implies (u'_1 \oplus u'_2 = u_1 \oplus u_2), $$

which implies that there are two sources of collisions in \( \textsf {TNT} _{\delta ,m} \). A collision happens whenever \( \widehat{u}_1 \oplus \widehat{u}_2 = \delta \), or \( \widehat{u}_1 \oplus \widehat{u}_2 \ne \delta \) and \( u'_1 \oplus u'_2 = u_1 \oplus u_2 \). This indicates that one can expect more collisions (roughly twice as many) in \( \textsf {TNT} _{\delta ,m} \) than in a random function.
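The implication above can be checked in a few lines. The derivation below is a sketch under our reading of the intermediate values (the figure itself is not reproduced here, so the labelling is an assumption): \( u_i = t_i \oplus \pmb {\pi }_1(m) \), \( \widehat{u}_i = \pmb {\pi }_2(u_i) \), and \( u'_i = \pmb {\pi }_2^{-1}(\widehat{u}_i \oplus \delta ) \) for \( i \in \{1,2\} \).

$$\begin{aligned} \widehat{u}_1 \oplus \widehat{u}_2 = \delta &\implies \widehat{u}_1 \oplus \delta = \widehat{u}_2 \ \text { and } \ \widehat{u}_2 \oplus \delta = \widehat{u}_1 \\ &\implies u'_1 = \pmb {\pi }_2^{-1}(\widehat{u}_2) = u_2 \ \text { and } \ u'_2 = \pmb {\pi }_2^{-1}(\widehat{u}_1) = u_1 \\ &\implies u'_1 \oplus u'_2 = u_1 \oplus u_2 = t_1 \oplus t_2. \end{aligned}$$

In other words, when the two middle-layer outputs differ by exactly \( \delta \), the inverse calls to \( \pmb {\pi }_2 \) simply swap the two inputs, and the XOR difference is preserved regardless of the actual values.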

1.2 Contributions

Our contributions are threefold:

  1. Birthday-bound CCA Attack on TNT: In Sect. 3, we start by formalizing the aforementioned non-random behavior of TNT. We show (see Sect. 3.1) that the expected number of output collisions for \( \textsf {TNT} _{\delta ,m} \) is approximately twice the expected number for \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \), where \( {\widetilde{\pmb {\pi }}}\) is an n-bit uniform random permutation with n-bit tweaks. Our analysis strongly indicates a global non-random phenomenon that can be detected in roughly \( O(2^{n/2})\) CCA queries. We establish this assertion by giving a fully scalable CCA distinguisher. We provide a rigorous analysis for the query complexity and advantage of our distinguisher, which shows that the distinguisher has an advantage expression of \( 1 - O(2^n/q^2) \), where q denotes the number of CCA queries. We provide details (see Sect. 3.3) for efficient implementation and verification of our attacks, including results for an attack on TNT-GIFT-64, the TNT instantiation using the GIFT-64 block cipher.

    Since the attack clearly contradicts the security claims of the designers of TNT, we study their security proof in Sect. 4 and identify a bug, where a random variable is erroneously assumed to have a uniform distribution, leading to an overestimation of the security.

    See [28] and [27] for two alternative analyses of the attack. The former employs random permutation statistics to estimate the number of collisions, and the latter directly bounds the probability of collisions in the two worlds. The analysis in this paper is more comprehensive and leads to a scalable advantage, but all three analyses come to the same conclusion: TNT can be broken with a birthday-bound number of queries!

  2. Birthday-bound CCA Security of TNT: In Sect. 5, we provide a simple proof of birthday-bound CCA security for TNT. Note that the CCA security bound also follows from the results in [43]. Nevertheless, given the flaws in TNT’s original analysis, we believe that multiple security proofs using different techniques will lead to greater confidence in the revised security claim. In addition to the original TNT, we also analyze the single-keyed variant of TNT, and show that it retains the same level of CCA security as well.

  3. A Generalization of Cascaded LRW Paradigm: In a more abstract direction, in Sect. 6, we present a generalized view of the cascaded LRW design strategy for any arbitrary number of rounds \( r \ge 2 \), called the LRW+ construction. It consists of two block cipher calls sandwiched between a pair of tweakable universal hashes. We show that as long as the tweakable hashes are sufficiently universal, the LRW+ construction is CCA secure up to \( 2^{3n/4} \) queries. Note that LRW+ encompasses both \( {2}\hbox {-}\textsf {LRW2} \) and \( {4}\hbox {-}\textsf {LRW1} \). Thus, as a direct side-effect of our analysis, in Sect. 6.2, we show that \( {2}\hbox {-}\textsf {LRW2} \) and \( {4}\hbox {-}\textsf {LRW1} \) are CCA secure up to \( 2^{3n/4} \) queries. In case of \( {2}\hbox {-}\textsf {LRW2} \), our bound matches the tight analysis in [26], and in case of \( {4}\hbox {-}\textsf {LRW1} \), our bound matches a concurrent result [15] by Datta et al.

    Note that the result on LRW+ directly shows that \( {r}\hbox {-}\textsf {LRW1} \) is at least 3n/4-bit secure for any \( r\ge 4 \), improving on the results for \(r\le 8\). Similarly, for \( {r}\hbox {-}\textsf {LRW2} \) it shows at least 3n/4-bit security for any \( r\ge 2 \), improving on the results for \(r\le 6\). See Table 1 for a summary of the state-of-the-art on the security of cascaded LRW constructions.

    Comparison with [15]: Concurrently, Datta et al. also proposed [15] an improved bound for \( {4}\hbox {-}\textsf {LRW1} \) that matches our \( 2^{3n/4} \) bound. Both the proofs follow the proof strategy [26] used for \( {2}\hbox {-}\textsf {LRW2} \) by Jha and Nandi, although ours is in a more general form (analyzing LRW+) that applies to all the cascaded LRW constructions with two or more block cipher calls.

Table 1. Summary of security bounds for LRW-based construction. We have assumed all hash functions to be \(2^{-n}\)-(XOR) universal. The bottom four rows present our results. LRW+ generalizes both \({2}\hbox {-}\textsf {LRW2}\) and \({4}\hbox {-}\textsf {LRW1} \). So the bound on LRW+ implies similar bounds for \({2}\hbox {-}\textsf {LRW2}\) and \({4}\hbox {-}\textsf {LRW1} \).

1.3 Impact of Our Birthday-Bound Attack

As mentioned before, the authors of [4] claimed the CCA security of TNT to be 2n/3 bits. In Asiacrypt 2020, the authors of [18] conjectured that the CCA security of TNT is probably 3n/4 bits. In [43], the authors have stated:

“A natural open problem is the exact security of \( {r}\hbox {-}\textsf {LRW1} \). Unlike \( {r}\hbox {-}\textsf {LRW2} \), the exact security of \( {r}\hbox {-}\textsf {LRW1} \) for \( r = 3 \) already appears challenging, and might require new proof approaches.”

We believe this work answers a critical research question of both practical and theoretical implications. On one hand, it studies the exact security of an efficient construction that has several practical applications. On the other hand, it offers another cautionary tale on how to use statistical proof techniques such as the \(\chi ^2\) method.

Additionally, the attack applies to practical instances of TNT: TNT-AES in [4] and TNT-SM4-128 in [19]. The authors of [19] also introduced TNT-SM4-32, where the tweak size is limited to 32 bits. Our distinguisher requires \(O(2^{n/2})\) tweaks, where \(n=128\) in case of TNT-SM4. Hence, the distinguisher directly applies to TNT-SM4-128, which has a tweak size of 128 bits. It does not directly apply to TNT-SM4-32, since the tweak space is too small. However, since our distinguisher breaks the BBB security proof in [4], the exact security of TNT-SM4-32 and whether it has BBB security is an open question.

We note that in Eurocrypt 2023, a full-round distinguisher on TNT-AES using truncated boomerang attacks was presented in [5]. However, the attack is particular to TNT-AES and requires almost \(2^{n}\) queries. Our attack applied to any 128-bit instantiation of TNT, including TNT-AES, requires \(\le 2^{69}\) queries to have an almost \(100\%\) success rate, making it the best-known distinguisher for any 128-bit TNT variant, without relying on the properties of the underlying block cipher. We sum up all known distinguishers on TNT-AES in Table 2, which indicates that our distinguisher is not only theoretical but outperforms all cryptanalytic efforts on TNT, so far.

Table 2. Known distinguishers against TNT-AES. CCA stands for adaptive Chosen Ciphertext Adversary. NCPA stands for Non-adaptive Chosen Plaintext Adversary. Rounds is the number of AES rounds in \(\pmb {\pi }_1\), \(\pmb {\pi }_2\) and \(\pmb {\pi }_3\), respectively. \(\star \) means any number of rounds. Generic attacks do not rely on any AES properties and apply to TNT instantiated with any 128-bit block cipher. \(2^{69}\) is the complexity for which our attack is expected to have almost \(100\%\) success rate, while \(2^{68}\) is expected to have \(99\%\) success rate.

2 Preliminaries

Notational Setup: For \( n \in \mathbb {N}\), [n] denotes the set \( \{1,2,\ldots ,n\} \), \( \{0,1\}^n \) denotes the set of bit strings of length n, and \( \textsf{Perm}(n) \) denotes the set of all permutations over \( \{0,1\}^n \). For \( \tau ,n \in \mathbb {N}\), \( \widetilde{\textsf{Perm}}(\tau ,n) \) denotes the set of all families of permutations \( \pi _t := \pi (t,\cdot ) \in \textsf{Perm}(n) \), indexed by \( t \in \{0,1\}^{\tau } \). Any \( \widetilde{\pi } \in \widetilde{\textsf{Perm}}(\tau ,n) \) is referred to as a \( (\tau ,n) \)-tweakable permutation.

For \( n,r \in \mathbb {N}\), such that \( n \ge r \), we define the falling factorial \( (n)_r := n!/(n-r)! = n(n-1)\cdots (n-r+1) \), and define \((n)_0 := 1\).

For \( q \in \mathbb {N}\), \( x^q \) denotes the q-tuple \( (x_1,x_2,\ldots ,x_q) \), and in this context, \( \texttt {M}(x^q) \) and \( \texttt {S}(x^q) \) respectively denote the multiset and set corresponding to \( \{x_i : i \in [q]\} \). For a set \( \mathcal {I}\subseteq [q] \) and a q-tuple \( x^q \), \( x^{\mathcal {I}} \) denotes the tuple \( (x_i)_{i \in \mathcal {I}} \). For a pair of tuples \( x^q \) and \( y^q \), \( (x^q,y^q) \) denotes the 2-ary q-tuple \( ((x_1,y_1),\ldots ,(x_q,y_q)) \). An n-ary q-tuple is defined analogously. For \( q \in \mathbb {N}\), for any set \( \mathcal {X}\), \( (\mathcal {X})_q \) denotes the set of all q-tuples with distinct elements from \( \mathcal {X}\). For \( q \in \mathbb {N}\), a 2-ary tuple \( (x^q,y^q) \) is called permutation compatible, denoted , if \( x_i = x_j \iff y_i =y_j \). Extending notations, a 3-ary tuple \( (t^q,x^q,y^q) \) is called tweakable permutation compatible, denoted by , if \( (t_i,x_i) = (t_j,x_j) \iff (t_i,y_i) = (t_j,y_j) \). For any tuple \( x^q \in \mathcal {X}^q \), and for any function \( f: \mathcal {X}\rightarrow \mathcal {Y}\), \( f(x^q) \) denotes the tuple \( (f(x_1),\ldots ,f(x_q)) \). We use shorthand notation \(\exists ^*\) to represent the phrase “there exists distinct”.

Unless stated otherwise, upper and lower case letters denote variables and values, respectively, and Serif font letters are used to denote random variables. For a finite set \( \mathcal {X}\), denotes the uniform and random sampling of \( \textsf{X} \) from \( \mathcal {X}\). We write \( \textsf{X}^q {\mathop {\longleftarrow }\limits ^{\textrm{wor}}}\mathcal {X}\) to denote WOR (without replacement) sampling of a q-tuple \( \textsf{X}^q \) from the set \( \mathcal {X}\), where \( |\mathcal {X}| \ge q \) is implicitly assumed. More precisely, \( \textsf{X}^q \) is distributed uniformly over \( (\mathcal {X})_q \).

We will use the following proposition, which is a slight variation of [17, Lemma 6].

Proposition 2.1

Let \( \textsf{R}_0 \) and \( \textsf{R}_1 \) be two random variables with variances \( \sigma ^2_0 \) and \( \sigma ^2_1 \), respectively, and suppose their expectations follow the relation \( \textsf{Ex}_{}\left( {\textsf{R}_0}\right) \ge \mu _0 \ge \mu _1 \ge \textsf{Ex}_{}\left( {\textsf{R}_1}\right) \), for some \( \mu _0 \ge \mu _1 \ge 0 \). Then, for \( \mu = (\mu _0+\mu _1)/2 \), we have

$$\begin{aligned} \left| \textsf{Pr}_{}\left( {\textsf{R}_0 > \mu }\right) - \textsf{Pr}_{}\left( {\textsf{R}_1 > \mu }\right) \right| \ge 1 - \frac{4(\sigma ^2_0+\sigma ^2_1)}{(\mu _0 - \mu _1)^2}. \end{aligned}$$

When \( \textsf{Ex}_{}\left( {\textsf{R}_0}\right) = \mu _0 \) and \( \textsf{Ex}_{}\left( {\textsf{R}_1}\right) = \mu _1 \), we get back [17, Lemma 6]. A proof of this proposition can be derived using a similar approach as used in the proof of [17, Lemma 6]. We provide a short alternate proof in the full version of this paper [24] by using the Bienaymé-Chebyshev inequality.
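For intuition, one way to obtain the bound from the Bienaymé-Chebyshev inequality is sketched below. This is our own reconstruction under the additional assumption \( \mu _0 > \mu _1 \), and it is not necessarily the argument given in [24]. Write \( \Delta := (\mu _0 - \mu _1)/2 \), so that \( \mu = \mu _0 - \Delta = \mu _1 + \Delta \). Since \( \textsf{Ex}(\textsf{R}_0) \ge \mu _0 \), the event \( \textsf{R}_0 \le \mu \) forces a deviation of at least \( \Delta \) below the mean, and since \( \textsf{Ex}(\textsf{R}_1) \le \mu _1 \), the event \( \textsf{R}_1 > \mu \) forces a deviation of more than \( \Delta \) above the mean:

$$\begin{aligned} \textsf{Pr}\left( \textsf{R}_0 \le \mu \right) &\le \textsf{Pr}\left( |\textsf{R}_0 - \textsf{Ex}(\textsf{R}_0)| \ge \Delta \right) \le \frac{\sigma ^2_0}{\Delta ^2}, \\ \textsf{Pr}\left( \textsf{R}_1 > \mu \right) &\le \textsf{Pr}\left( |\textsf{R}_1 - \textsf{Ex}(\textsf{R}_1)| \ge \Delta \right) \le \frac{\sigma ^2_1}{\Delta ^2}, \\ \textsf{Pr}\left( \textsf{R}_0 > \mu \right) - \textsf{Pr}\left( \textsf{R}_1 > \mu \right) &\ge 1 - \frac{\sigma ^2_0 + \sigma ^2_1}{\Delta ^2} = 1 - \frac{4(\sigma ^2_0+\sigma ^2_1)}{(\mu _0 - \mu _1)^2}. \end{aligned}$$

The claimed bound then follows because the absolute value of a quantity is at least the quantity itself.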

2.1 (Tweakable) Block Ciphers and Random Permutations

A \( (\kappa ,n) \)-block cipher with key size \( \kappa \) and block size n is a family of permutations \( {E}\in \widetilde{\textsf{Perm}}(\kappa ,n) \). For \( k \in \{0,1\}^\kappa \), we denote \( {E}_k(\cdot ) := {E}(k,\cdot ) \), and \( {E}^{-1}_k(\cdot ) := {E}^{-1}(k,\cdot ) \). A \( (\kappa ,\tau ,n) \)tweakable block cipher with key size \( \kappa \), tweak size \( \tau \) and block size n is a family of permutations \( \widetilde{E}\in \widetilde{\textsf{Perm}}((\kappa ,\tau ),n) \). For \( k \in \{0,1\}^\kappa \) and \( t \in \{0,1\}^\tau \), we denote \( \widetilde{E}_k(t,\cdot ) := \widetilde{E}(k,t,\cdot ) \), and \( \widetilde{E}^{-1}_k(t,\cdot ) := \widetilde{E}^{-1}(k,t,\cdot ) \). Throughout this paper, we fix \( \kappa ,\tau ,n \in \mathbb {N}\) as the key size, tweak size, and block size, respectively, of the given (tweakable) block cipher.

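We say that \( \pmb {\pi }\) is an (ideal) random permutation on block space \( \{0,1\}^{n}\) to indicate that \( \pmb {\pi }\) is drawn uniformly at random from \( \textsf{Perm}(n) \). Similarly, we say that \( {\widetilde{\pmb {\pi }}}\) is an (ideal) tweakable random permutation on tweak space \( \{0,1\}^\tau \) and block space \( \{0,1\}^{n}\) to indicate that \( {\widetilde{\pmb {\pi }}}\) is drawn uniformly at random from \( \widetilde{\textsf{Perm}}(\tau ,n) \).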
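In experiments, an ideal tweakable random permutation is usually simulated by lazy sampling rather than by drawing the whole family up front. The following Python sketch illustrates this standard technique; the class name `LazyTweakablePermutation` and its methods are hypothetical names chosen for illustration, and the rejection sampling used here is only efficient while few points are defined per tweak.

```python
import random


class LazyTweakablePermutation:
    """Lazily sampled tweakable random permutation on n-bit blocks.

    Each tweak gets its own independent permutation, defined point by point:
    a fresh answer is drawn uniformly from the values not yet used for that tweak.
    """

    def __init__(self, n_bits, seed=None):
        self.size = 1 << n_bits
        self.rng = random.Random(seed)  # illustrative, not cryptographically secure
        self.fwd = {}  # tweak -> {input: output}
        self.inv = {}  # tweak -> {output: input}

    def _sample_unused(self, used):
        # Rejection sampling; terminates because at least one value is unused.
        while True:
            v = self.rng.randrange(self.size)
            if v not in used:
                return v

    def encrypt(self, t, m):
        f = self.fwd.setdefault(t, {})
        g = self.inv.setdefault(t, {})
        if m not in f:
            c = self._sample_unused(g)  # fresh output for this tweak
            f[m], g[c] = c, m
        return f[m]

    def decrypt(self, t, c):
        f = self.fwd.setdefault(t, {})
        g = self.inv.setdefault(t, {})
        if c not in g:
            m = self._sample_unused(f)  # fresh input for this tweak
            f[m], g[c] = c, m
        return g[c]
```

Forward and inverse queries share the same per-tweak tables, so answers stay consistent with a single permutation per tweak no matter in which order the queries arrive.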

2.2 Security Definition

In this paper, we assume that the distinguisher is non-trivial, i.e. it never makes a duplicate query, and it never makes a query for which the response is already known due to some previous query. Let \( \mathbb {A}(q,t) \) be the class of all non-trivial distinguishers limited to q oracle queries, and t computations.

In our analyses, especially security proofs, it will be convenient to work in the information-theoretic setting. Accordingly, we always skip the boilerplate hybrid steps and often assume that the adversary is computationally unbounded, i.e., \( t = \infty \), and deterministic. A computational equivalent of all our security proofs can be easily obtained by a simple hybrid argument.

IND-CCA Security: The IND-CCA advantage of distinguisher \( \textbf{A}\) against \( \widetilde{E}\) instantiated with a key \( \textsf{K} \) drawn uniformly at random from \( \{0,1\}^\kappa \) is defined as

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\widetilde{E}}(\textbf{A}) = \textbf{Adv}_{\widetilde{E}^\pm ;{\widetilde{\pmb {\pi }}}^\pm }(\textbf{A}) := \left| \textsf{Pr}_{}\left( {\textbf{A}(\widetilde{E}_{\textsf{K}}^\pm ) = 1}\right) -\textsf{Pr}_{}\left( {\textbf{A}({\widetilde{\pmb {\pi }}}^\pm ) = 1}\right) \right| . \end{aligned}$$
(1)

The IND-CCA security of \( \widetilde{E}\) is defined as

$$\begin{aligned} \displaystyle \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\widetilde{E}}(q,t) := \max _{\textbf{A}\in \mathbb {A}(q,t)} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\widetilde{E}}(\textbf{A}). \end{aligned}$$

2.3 The Expectation Method

Let \( \textbf{A}\) be a computationally unbounded and deterministic distinguisher that tries to distinguish between two oracles \( \mathcal {O}_0 \) and \( \mathcal {O}_1 \) via black box interaction with one of them. We denote the query-response tuple of \( \textbf{A}\)’s interaction with its oracle by a transcript \( \omega \). This may also include any additional information the oracle chooses to reveal to the distinguisher at the end of the query-response phase of the game. We denote by \( \mathsf {\Theta }_1\) (resp. \( \mathsf {\Theta }_0\)) the random transcript variable when \( \textbf{A}\) interacts with \( \mathcal {O}_1 \) (resp. \( \mathcal {O}_0 \)). The probability of realizing a given transcript \( \omega \) in the security game with an oracle \( \mathcal {O}\) is known as the interpolation probability of \( \omega \) with respect to \( \mathcal {O}\). Since \( \textbf{A}\) is deterministic, this probability depends only on the oracle \( \mathcal {O}\) and the transcript \( \omega \). A transcript \( \omega \) is said to be attainable if \( \textsf{Pr}_{}\left( {\mathsf {\Theta }_0= \omega }\right) > 0 \). The expectation method [20] (stated below) is a generalization of Patarin’s H-coefficients technique [36], which is quite useful in obtaining improved bounds in many cases [20, 26].

Lemma 2.1

(Expectation Method [20]). Let \( \varOmega \) be the set of all transcripts. For some \( \epsilon _{\textsf{bad}}\ge 0 \) and a non-negative function \( \epsilon _{\textsf{ratio}}: \varOmega \rightarrow [0,\infty ) \), suppose there is a set \( \varOmega _\textsf{bad}\subseteq \varOmega \) satisfying the following:

  • \( \textsf{Pr}_{}\left( {\mathsf {\Theta }_0\in \varOmega _\textsf{bad}}\right) \le \epsilon _{\textsf{bad}}\);

  • For any \( \omega \notin \varOmega _\textsf{bad}\), \( \omega \) is attainable and \( \displaystyle \frac{\textsf{Pr}_{}\left( {\mathsf {\Theta }_1= \omega }\right) }{\textsf{Pr}_{}\left( {\mathsf {\Theta }_0= \omega }\right) } \ge 1-\epsilon _{\textsf{ratio}}(\omega ) \).

Then for any distinguisher \( \textbf{A}\) trying to distinguish between \( \mathcal {O}_1 \) and \( \mathcal {O}_0 \), we have the following bound on its distinguishing advantage:

$$\begin{aligned} \textbf{Adv}_{\mathcal {O}_1;\mathcal {O}_0}(\textbf{A}) \le \epsilon _{\textsf{bad}}+ \textsf{Ex}_{}\left( {\epsilon _{\textsf{ratio}}(\mathsf {\Theta }_0)}\right) . \end{aligned}$$

When \( \epsilon _{\textsf{ratio}}\) is a constant function, we get the H-coefficients technique.

3 Birthday-Bound Attack on TNT

We consider the TNT construction in an information-theoretic setting. Accordingly, we instantiate TNT based on three independent uniform random permutations \(\pmb {\pi }_1\), \(\pmb {\pi }_2\), and \(\pmb {\pi }_3\) of \(\{0,1\}^{n}\). Recall that, the TNT construction is defined by the mapping

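$$ \textsf {TNT} (t,m) := \pmb {\pi }_3\big (t \oplus \pmb {\pi }_2(t \oplus \pmb {\pi }_1(m))\big ). $$
(2)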

For some non-zero \( \delta \in \{0,1\}^{n}\) and \(m \in \{0,1\}^{n}\), consider the function \( \mathcal {O}_{\delta , m} :\{0,1\}^{n}\rightarrow \{0,1\}^{n}\), associated to each n-bit tweakable permutation \( \mathcal {O}\) with n-bit tweak, defined by the mapping

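$$ \mathcal {O}_{\delta ,m}(t) := \mathcal {O}^{-1}\big (t \oplus \delta ,\, \mathcal {O}(t,m)\big ). $$
(3)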

We are only interested in \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) and \(\textsf {TNT} _{\delta ,m} \) where \( {\widetilde{\pmb {\pi }}}\) is a tweakable uniform random permutation of \( \{0,1\}^{n}\) with n-bit tweaks.

Suppose \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) is executed over q distinct inputs \( (t_1,\ldots ,t_q) \). Observe that, for any valid choice of \( (t_1,\ldots ,t_q) \), \( {\widetilde{\pmb {\pi }}}\) is executed at most twice for any tweak \( t_i \). Thus, one can expect \( {\widetilde{\pmb {\pi }}}_{\delta ,m}(\cdot ) \) to be almost uniform and independent, and thus, indistinguishable from a uniform random function \( \pmb {\rho }:\{0,1\}^{n}\rightarrow \{0,1\}^{n}\) for a large range of q. In fact, as long as

$$\begin{aligned} {\widetilde{\pmb {\pi }}}(t_i,m) \ne {\widetilde{\pmb {\pi }}}(t_j,m) \text { for all } i \ne j \text { such that } t_j = t_i \oplus \delta , \end{aligned}$$

\( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) can be shown to be indistinguishable from \( \pmb {\rho }\) up to \( O(2^n) \) queries. More importantly, as we show in the following discussion, \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) is almost identical to \( \pmb {\rho }\) in terms of the number of output collisions.

\(\textsf {TNT} _{\delta ,m} \), on the other hand, exhibits a rather peculiar and interesting property. Apparently, \( \textsf {TNT} _{\delta ,m} \) is more prone to collisions as compared to \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \), which results in a direct IND-CCA distinguisher for TNT. A formal distinguisher with complete advantage calculation appears later in Sect. 3.2. We first demonstrate the biased behavior by comparing the number of output collisions for \( \textsf {TNT} _{\delta ,m} \) and \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \).

3.1 Comparing the Number of Collision Pairs in \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) and \( \textsf {TNT} _{\delta ,m} \)

Fix some non-negative integer \( q \le 2^n \). Fix a set \( \mathcal {T}= \{t_1, \ldots , t_q\} \subseteq \{0,1\}^{n}\) of size q, an \( m \in \{0,1\}^{n}\), and a non-zero \( \delta \in \{0,1\}^{n}\). Let \(\mathcal {O}\) be a tweakable permutation (which is either \({\widetilde{\pmb {\pi }}}\) in the ideal world or \(\textsf {TNT} \) in the real world). We compute \( \textsf{M}'_i = \mathcal {O}_{\delta ,m}(t_i) \) by making a forward query \( \mathcal {O}(t_i,m) := \mathsf {\widehat{C}}_i \), followed by a backward query \( \textsf{M}'_i = \mathcal {O}^{-1}(t_i \oplus \delta ,\mathsf {\widehat{C}}_i) \). We write \(\texttt{COLL}(\mathcal {O}_{\delta ,m})\) to denote the number of pairs \((i, j)\) with \(i < j\) such that \(\textsf{M}'_i = \textsf{M}'_j\).
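The two-query procedure and the collision count can be sketched in Python as follows. The function names `derived_output` and `count_collisions`, and the toy oracle used in the example, are illustrative assumptions and not part of the constructions analyzed in this paper; the oracle is passed in as a pair of callables `encrypt(t, x)` and `decrypt(t, y)`.

```python
from collections import Counter


def derived_output(encrypt, decrypt, delta, m, t):
    """Compute O_{delta,m}(t): forward query on (t, m), then inverse query on (t ^ delta, c)."""
    c_hat = encrypt(t, m)
    return decrypt(t ^ delta, c_hat)


def count_collisions(encrypt, decrypt, delta, m, tweaks):
    """Return COLL: the number of pairs i < j whose derived outputs coincide."""
    outputs = Counter(derived_output(encrypt, decrypt, delta, m, t) for t in tweaks)
    # A value hit k times contributes k*(k-1)/2 colliding pairs.
    return sum(k * (k - 1) // 2 for k in outputs.values())


# Toy oracle (NOT a secure TBC): O(t, x) = x ^ t, which is its own inverse per tweak.
def toy_encrypt(t, x):
    return x ^ t


def toy_decrypt(t, y):
    return y ^ t


if __name__ == "__main__":
    tweaks = list(range(8))
    # For this toy oracle O_{delta,m}(t) = m ^ delta for every t,
    # so all C(8,2) = 28 pairs collide.
    print(count_collisions(toy_encrypt, toy_decrypt, delta=0b101, m=0b011, tweaks=tweaks))
```

Counting via a histogram keeps the cost linear in q, instead of comparing all \( {q \atopwithdelims ()2} \) pairs explicitly.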

Analyzing \(\textsf{coll}_{\textrm{id}} := \texttt{COLL}({\widetilde{\pmb {\pi }}}_{\delta ,m})\): For any \( i \ne j \in [q] \), let \( \mathbbm {1}_{i,j} \) denote the indicator random variable corresponding to the event: \(\textsf{M}'_j = \textsf{M}'_i\). Then, using linearity of expectation, we have

$$\begin{aligned} \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{id}}}\right) = \sum _{i < j \in [q]}\textsf{Ex}_{}\left( {\mathbbm {1}_{i,j}}\right) = \sum _{i < j \in [q]} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) , \end{aligned}$$
(4)

where we abuse notation slightly and use \( \mathbbm {1}_{i,j} \) to denote the event \( \mathbbm {1}_{i,j} = 1\). Let \( \sim \) be a relation on [q], such that for all \( i \ne j \in [q] \), \( i \sim j \) if and only if \( t_i = t_j \oplus \delta \). Note that \( \sim \) is symmetric. Suppose there are \(\nu \) pairs \((i, j)\), \(i < j\), such that \(i \sim j\). Since each tweak \(t_i\) has at most one partner \(t_i \oplus \delta \) in \(\mathcal {T}\), we have \(\nu \le q/2\). Now, we can split the right-hand side of (4) as follows:

$$\begin{aligned} \sum _{i < j \in [q]} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) = \sum _{\begin{array}{c} i < j \in [q]\\ i \sim j \end{array}} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) + \sum _{\begin{array}{c} i < j \in [q]\\ i \not \sim j \end{array}} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) \end{aligned}$$
(5)

Case \( i \not \sim j \): We must have \( \{t_i,t_j\} \cap \{t_i \oplus \delta ,t_j \oplus \delta \} = \emptyset \). Thus, the two calls to \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) corresponding to the i-th and j-th queries result in exactly 2 calls to \( {\widetilde{\pmb {\pi }}}\) and 2 calls to \( {\widetilde{\pmb {\pi }}}^{-1} \), each with a tweak distinct from the others. Hence, the outputs of \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \) on inputs \( t_i \) and \( t_j \) are mutually independent and uniformly distributed in \( \{0,1\}^{n}\). Thus, for any \( i \not \sim j \), we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) = \frac{1}{2^n}, \end{aligned}$$
(6)

which results in

$$\begin{aligned} \sum _{\begin{array}{c} i < j \in [q]\\ i \not \sim j \end{array}} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) = \left( {q \atopwithdelims ()2} - \nu \right) \frac{1}{2^n}, \end{aligned}$$
(7)

Case \( i \sim j \): In this case we have \( t_i = t_j \oplus \delta \). Let \( \texttt{F}_{i,j} \) be the event that \( {\widetilde{\pmb {\pi }}}(t_i,m) = {\widetilde{\pmb {\pi }}}(t_j,m) \). If \( \texttt{F}_{i,j} \) occurs, then we have \(\textsf{M}'_i = \textsf{M}'_j = m\). Since \( t_i \ne t_j \), \( \textsf{Pr}_{}\left( {\texttt{F}_{i,j}}\right) = 2^{-n} \). So, for any \( i \sim j \), we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) &= \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \texttt{F}_{i,j}}\right) + \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \lnot \texttt{F}_{i,j}}\right) \\ &= \textsf{Pr}_{}\left( {\texttt{F}_{i,j}}\right) + \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \lnot \texttt{F}_{i,j}}\right) \\ &= \frac{1}{2^n} + \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \lnot \texttt{F}_{i,j}}\right) , \end{aligned}$$

which immediately gives

$$\begin{aligned} \frac{1}{2^n} \le \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) \le \frac{1}{2^n} + \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} ~|~ \lnot \texttt{F}_{i,j}}\right) \le \frac{1}{2^n} + \frac{1}{2^{n}-1}. \end{aligned}$$
(8)

Note that the last inequality follows from the observation that given \( \lnot \texttt{F}_{i,j} \), outputs of \( {\widetilde{\pmb {\pi }}}^{-1}(t_i \oplus \delta ) \) and \( {\widetilde{\pmb {\pi }}}^{-1}(t_j \oplus \delta ) \) are sampled independently from a set of size exactly \( 2^n-1 \). This further results in

$$\begin{aligned} \frac{\nu }{2^n} \le \sum _{\begin{array}{c} i < j \in [q]\\ i \sim j \end{array}} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) \le \nu \left( \frac{1}{2^n} + \frac{1}{2^{n}-1}\right) . \end{aligned}$$
(9)

Using (4), (5), (7), (9), and \( \nu \le q/2 \) we have

$$\begin{aligned} {q \atopwithdelims ()2}\frac{1}{2^n} \le \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{id}}}\right) \le {q \atopwithdelims ()2}\frac{1}{2^n} + \frac{q}{2^n}. \end{aligned}$$
(10)
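For completeness, the step from (5), (7), and (9) to (10) can be spelled out as follows; this is a routine calculation that we add for the reader's convenience, assuming \( n \ge 1 \) so that \( 2(2^n-1) \ge 2^n \). The lower bound follows since \( \left( {q \atopwithdelims ()2} - \nu \right) \frac{1}{2^n} + \frac{\nu }{2^n} = {q \atopwithdelims ()2}\frac{1}{2^n} \). For the upper bound, we have

$$\begin{aligned} \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{id}}}\right) &\le \left( {q \atopwithdelims ()2} - \nu \right) \frac{1}{2^n} + \nu \left( \frac{1}{2^n} + \frac{1}{2^{n}-1}\right) \\ &= {q \atopwithdelims ()2}\frac{1}{2^n} + \frac{\nu }{2^n-1} \\ &\le {q \atopwithdelims ()2}\frac{1}{2^n} + \frac{q}{2(2^n-1)} \le {q \atopwithdelims ()2}\frac{1}{2^n} + \frac{q}{2^n}. \end{aligned}$$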

Fig. 3. The execution trace for \(\textsf {TNT} _{\delta ,m}\) on input \(t_i\).

Analyzing \(\textsf{coll}_{\textrm{re}} := \texttt{COLL}(\textsf {TNT} _{\delta ,m})\): The analysis of \(\texttt{COLL}(\textsf {TNT} _{\delta ,m})\) is a bit more subtle and interesting. Figure 3 gives a pictorial view of the i-th execution of \(\textsf {TNT} _{\delta ,m}\). Clearly, the respective calls to \( \pmb {\pi }_3\) and its inverse cancel out each other, resulting in the compressed view illustrated in Fig. 4.

Fig. 4. The effective execution trace for \(\textsf {TNT} _{\delta ,m}\) on input \(t_i\).

Note that for any \( i, j \in [q] \), \( \textsf{U}_i \oplus \textsf{U}_j = t_i \oplus t_j \). Now, fix a pair of inputs \((t_i,t_j)\). A collision at the output is characterized by the chain of equivalences

$$ (\textsf{M}'_i = \textsf{M}'_j) \iff (\mathsf {\widehat{M}}'_i = \mathsf {\widehat{M}}'_j) \iff (\textsf{U}'_i \oplus \textsf{U}'_j = t_i \oplus t_j) \iff (\textsf{U}'_i \oplus \textsf{U}'_j= \textsf{U}_i \oplus \textsf{U}_j), $$

and we let \( \mathbbm {1}_{i,j} \) denote the indicator random variable of this collision event. Observe that \( \textsf {TNT} _{\delta ,m} \) has the following interesting property:

$$ (\mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j = \delta ) \implies (\textsf{U}'_i \oplus \textsf{U}'_j = \textsf{U}_i \oplus \textsf{U}_j = t_i \oplus t_j), $$

which implies that there are two sources of collisions in \( \textsf {TNT} _{\delta ,m} \). A collision happens whenever

  1. \( \mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j = \delta \), or

  2. \( \mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta \) and \( \textsf{U}'_i \oplus \textsf{U}'_j = t_i \oplus t_j \).

From this one can easily get a good upper and lower bound on the expected number of collisions in the real world. Using linearity of expectation, we have

$$\begin{aligned} \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{re}}}\right) = \sum _{i < j \in [q]}\textsf{Ex}_{}\left( {\mathbbm {1}_{i,j}}\right) = \sum _{i < j \in [q]} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) \end{aligned}$$
(11)

Further, from the above discussion, we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) &= \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j = \delta }\right) + \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j} \wedge \mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta }\right) \nonumber \\ &= \textsf{Pr}_{}\left( {\mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j = \delta }\right) + \textsf{Pr}_{}\left( {\mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta }\right) \nonumber \\ &\qquad \qquad \qquad \qquad \times \textsf{Pr}_{}\left( {\textsf{U}'_i \oplus \textsf{U}'_j = t_i \oplus t_j~|~\mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta }\right) \nonumber \\ &= \frac{1}{2^n-1} + \left( 1 - \frac{1}{2^n-1}\right) \nonumber \\ &\qquad \qquad \times \textsf{Pr}_{}\left( {\textsf{U}'_i \oplus \textsf{U}'_j = t_i \oplus t_j~|~\mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta }\right) , \end{aligned}$$
(12)

Note that \( \mathsf {\widehat{U}}_i \oplus \mathsf {\widehat{U}}_j \ne \delta \) implies that \( \textsf{U}'_i,\textsf{U}'_j \notin \{\textsf{U}_i,\textsf{U}_j\} \). Now, fix a valid choice for \( (\textsf{U}_i,\textsf{U}_j,\mathsf {\widehat{U}}_i,\mathsf {\widehat{U}}_j) \), say \( (u_i,u_j,\widehat{u}_i,\widehat{u}_j) \). Then, the valid choices for \((\textsf{U}'_i,\textsf{U}'_j)\) that satisfy the equation \( \textsf{U}'_i \oplus \textsf{U}'_j = t_i \oplus t_j \) are exactly the pairs \( (x,x \oplus t_i \oplus t_j) \) such that

$$ x \in \{0,1\}^{n}\setminus \left( \{ u_i,u_j\} \cup \{u_i \oplus t_i \oplus t_j,u_j \oplus t_i \oplus t_j\}\right) $$

But, observe that \( \{u_i,u_j\} = \{u_i \oplus t_i \oplus t_j,u_j \oplus t_i \oplus t_j\} \) by definition, for any valid choice of \( (u_i,u_j) \). Therefore, the number of valid \( (x,x \oplus t_i \oplus t_j) \) is exactly \( 2^n-2 \). Furthermore, this counting is independent of the choice of \( (u_i,u_j,\widehat{u}_i,\widehat{u}_j) \), whence it holds unconditionally. Now, each such choice for \( (\textsf{U}'_i,\textsf{U}'_j) \) occurs with at most \( 1/((2^n-2)(2^n-3)) \) probability, as they are sampled from \( \{0,1\}^{n}\setminus \{\textsf{U}_i,\textsf{U}_j\} \) without replacement (WOR). Then, using (12), we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathbbm {1}_{i,j}}\right) &= \frac{1}{2^n-1} + \left( 1 - \frac{1}{2^n-1}\right) \times \frac{1}{2^n-3} \\ &= \frac{1}{2^n-1} + \frac{1}{2^n-3} - \frac{1}{(2^n-1)(2^n-3)} \\ &=\frac{2}{2^n} + \frac{1}{2^n(2^n - 1)} + \frac{3}{2^n(2^n - 3)} - \frac{1}{(2^n - 1)(2^n - 3)} \end{aligned}$$

Using (11), we immediately have

$$\begin{aligned} \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{re}}}\right) = {q \atopwithdelims ()2}\left( \frac{1}{2^n-1} + \frac{1}{2^n-3} - \frac{1}{(2^n-1)(2^n-3)}\right) \ge {q \atopwithdelims ()2}\frac{2}{2^n}, \end{aligned}$$
(13)

and on comparing this with (10), we can conclude that

$$\begin{aligned} \textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{re}}}\right) \approx 2\textsf{Ex}_{}\left( {\textsf{coll}_{\textrm{id}}}\right) . \end{aligned}$$

This clearly indicates that the occurrence of collisions in \( \textsf {TNT} _{\delta ,m} \) is approximately twice that of \( {\widetilde{\pmb {\pi }}}_{\delta ,m} \).
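As a sanity check on the computation above, the per-pair collision probability can be evaluated exactly with rational arithmetic, and the count of valid x can be confirmed by brute force for a small block size. This is only an illustrative sketch; the function names are hypothetical and not part of the paper.

```python
from fractions import Fraction


def pr_collision_real(n):
    """Exact per-pair collision probability in the real world, following (12)."""
    N = 2 ** n
    p_delta = Fraction(1, N - 1)     # Pr[U^_i xor U^_j = delta]
    p_cond = Fraction(1, N - 3)      # (2^n - 2) pairs, each w.p. 1/((2^n-2)(2^n-3))
    return p_delta + (1 - p_delta) * p_cond


def count_valid_x(n, u_i, u_j):
    """Brute-force count of x with x and x ^ d outside {u_i, u_j}, d = u_i ^ u_j."""
    d = u_i ^ u_j
    forbidden = {u_i, u_j}
    return sum(
        1 for x in range(2 ** n)
        if x not in forbidden and (x ^ d) not in forbidden
    )


for n in (8, 16, 32):
    # Ratio of the real-world probability to 1/2^n; it approaches 2 as n grows.
    print(n, float(pr_collision_real(n) * 2 ** n))
```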

3.2 The Collision Counting Distinguisher

Based on the observations from the preceding section, we now present a formal distinguisher, called \( \textbf{A}^* \), in Algorithm 1.

Fix a message \( m \in \{0,1\}^{n}\), a set \( \mathcal {T}=\{t_1, \ldots , t_q\} \subseteq \{0,1\}^{n}\) of size q, and a \( \delta \ne 0^n \). Let \(\theta (q, n)\) be some non-negative function of q and n, which will be defined later in the course of analysis.

Let \( \mathcal {O}^\pm \) be the oracle \( \textbf{A}^* \) is interacting with. Then, \( \textbf{A}^* \) works by collecting \( \textsf{M}'_i = \mathcal {O}_{\delta ,m}(t_i) \) for all \( t_i \in \mathcal {T}\) in a multiset \( \mathcal {M}\). As shown in the preceding section and Algorithm 1, this can be easily done by a pair of encryption-decryption queries for each \( i \in [q] \). After this, \( \textbf{A}^* \) counts the number of collisions in \( \mathcal {M}\) using the function \( \texttt{collCount} \). If the number of collisions is greater than \( \theta (q,n) \), the distinguisher returns 1, otherwise, it returns 0.

figure o

Note that the exact implementation of \( \texttt{collCount} \) is not relevant for the forthcoming advantage calculation. So, we postpone a discussion on its implementation and resulting time and space complexity analysis to Sect. 3.3, where we also provide experimental verification for \( \textbf{A}^* \).

However, it is amply evident that the space complexity of the attack is O(q), i.e., dominated by the query complexity. Further, looking ahead momentarily, one can implement \( \texttt{collCount} \) in such a way that it runs in time \( O(q \log _2 q) \). Other than this, \( \textbf{A}^* \) only makes 2q calls to \( \mathcal {O}\), thus the overall time complexity is also in \( O(q \log _2 q) \).
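The query pattern of \( \textbf{A}^* \) can be simulated at toy scale as follows. This is a minimal sketch under stated assumptions, not a reproduction of Algorithm 1. TNT is modelled with three random permutations, the ideal world is modelled by independent uniform values, the threshold is left as a parameter `theta`, and all names and parameter choices (n = 12, q = 1024) are hypothetical.

```python
import random
from collections import Counter


def coll_count(values):
    """Number of pairs i < j with values[i] == values[j]."""
    return sum(k * (k - 1) // 2 for k in Counter(values).values())


def random_perm(size, rng):
    """Random permutation of range(size) and its inverse."""
    p = list(range(size))
    rng.shuffle(p)
    inv = [0] * size
    for x, y in enumerate(p):
        inv[y] = x
    return p, inv


def real_world_outputs(n, q, m, delta, rng):
    """Collect M'_i for q distinct tweaks when the oracle is toy TNT."""
    size = 1 << n
    (p1, i1), (p2, i2), (p3, i3) = (random_perm(size, rng) for _ in range(3))
    outputs = []
    for t in rng.sample(range(size), q):       # q distinct tweaks
        c = p3[p2[t ^ p1[m]] ^ t]              # encryption query under t
        u_prime = i2[i3[c] ^ t ^ delta]        # decryption query under t ^ delta
        outputs.append(i1[u_prime ^ t ^ delta])
    return outputs


def ideal_world_outputs(n, q, rng):
    """Simplifying assumption: ideal answers are independent uniform values."""
    return [rng.randrange(1 << n) for _ in range(q)]


def distinguisher(outputs, theta):
    """Return 1 if the collision count exceeds theta, and 0 otherwise."""
    return 1 if coll_count(outputs) > theta else 0


rng = random.Random(2024)
n, q = 12, 1 << 10
real = coll_count(real_world_outputs(n, q, m=0, delta=1, rng=rng))
ideal = coll_count(ideal_world_outputs(n, q, rng))
print(real, ideal)   # the real-world count is expected to be roughly twice the ideal one
```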

Define

$$\begin{aligned} \mu _{\textrm{re}} := {q \atopwithdelims ()2}\frac{2}{2^n} \qquad \qquad \mu _{\textrm{id}} := {q \atopwithdelims ()2}\frac{1}{2^n} + \frac{q}{2^n}. \end{aligned}$$

Then, from (10) and (13), we have that \( \textsf{Ex}_{}\left( {\texttt{COLL}(\textsf {TNT} _{\delta ,m})}\right) \ge \mu _{\textrm{re}} \ge \mu _{\textrm{id}} \ge \textsf{Ex}_{}\left( {\texttt{COLL}({\widetilde{\pmb {\pi }}}_{\delta ,m})}\right) \), whenever \( q \ge 3 \).

Theorem 3.1

For \( n\ge 4 \), \( 10 \le q \le 2^{n} \), and \( \theta (q,n) = (\mu _{\textrm{re}}+\mu _{\textrm{id}})/2 \), we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\textsf {TNT}}(\textbf{A}^*) \ge 1 - 371\frac{2^n}{q^2}. \end{aligned}$$

Specifically, for \( q \ge 28 \times 2^{\frac{n}{2}} \), \( \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\textsf {TNT}}(\textbf{A}^*) \ge 0.5 \).

Proof

Recall that \( \textsf{coll}_{\textrm{id}} = \texttt{COLL}({\widetilde{\pmb {\pi }}}_{\delta ,m}) \) and \( \textsf{coll}_{\textrm{re}} = \texttt{COLL}(\textsf {TNT} _{\delta ,m}) \). Let \( \sigma ^2_s := \textsf{Var}_{}\left( {\textsf{coll}_{s}}\right) \), for all \( s \in \{\textrm{id},\textrm{re}\} \). In addition, whenever necessary, we also reuse the notations and definitions from the expectation calculation given in Sect. 3.1.

Now, we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{\textsf {TNT}}(\textbf{A}^*) &= \left| \textsf{Pr}_{}\left( {\textbf{A}^*(\textsf {TNT} _{\delta ,m}) = 1}\right) - \textsf{Pr}_{}\left( {\textbf{A}^*({\widetilde{\pmb {\pi }}}_{\delta ,m}) = 1}\right) \right| \nonumber \\ &= \left| \textsf{Pr}_{}\left( {\textsf{coll}_{\textrm{re}} > \theta (q,n)}\right) - \textsf{Pr}_{}\left( {\textsf{coll}_{\textrm{id}} > \theta (q,n)}\right) \right| \nonumber \\ &\ge 1 - \frac{4(\sigma ^2_{\textrm{re}} + \sigma ^2_{\textrm{id}})}{(\mu _{\textrm{re}} - \mu _{\textrm{id}})^2}. \end{aligned}$$
(14)

where the last inequality follows from Proposition 2.1. We make the following claim on \( \sigma ^2_{\textrm{re}} \) and \( \sigma ^2_{\textrm{id}} \).

Claim 3.1

For \( n \ge 4 \), \( 10 \le q \le 2^{n} \), we have

$$\begin{aligned} \sigma ^2_{\textrm{id}} \le \frac{4q^2}{2^n} \qquad \qquad \sigma ^2_{\textrm{re}} \le \frac{11q^2}{2^n} \end{aligned}$$

A proof of this claim is available in the full version of this paper [24]. Next, from (10) and (13), we have

$$\begin{aligned} (\mu _{\textrm{re}} - \mu _{\textrm{id}})^2 &\ge \left( {q \atopwithdelims ()2}\frac{2}{2^n} - {q \atopwithdelims ()2}\frac{1}{2^n} - \frac{q}{2^n}\right) ^2 \nonumber \\ &\ge {q \atopwithdelims ()2}^2\frac{1}{2^{2n}}\left( 1 - \frac{1}{q}\right) ^2 \ge 0.162\frac{q^4}{2^{2n}} \end{aligned}$$
(15)

where the last inequality follows from \( q \ge 10 \). The result then follows from (14), Claim 3.1, and (15).   \(\square \)
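For concreteness, the constant 371 can be traced by substituting Claim 3.1 and (15) into (14). The following is only a worked restatement of that last step:

$$\begin{aligned} 1 - \frac{4(\sigma ^2_{\textrm{re}} + \sigma ^2_{\textrm{id}})}{(\mu _{\textrm{re}} - \mu _{\textrm{id}})^2} \ge 1 - \frac{4 \cdot (11 + 4)\, q^2/2^n}{0.162\, q^4/2^{2n}} = 1 - \frac{60}{0.162}\cdot \frac{2^n}{q^2} \ge 1 - 371\frac{2^n}{q^2}, \end{aligned}$$

since \( 60/0.162 \approx 370.4 \). For \( q = 28 \times 2^{\frac{n}{2}} \), we have \( q^2 = 784 \times 2^n \), so the bound evaluates to \( 1 - 371/784 \approx 0.527 \ge 0.5 \).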

Remark 3.1

Note that the constant in Theorem 3.1 is a bit loose for the sake of simplicity. It is likely that this constant can be improved by a tighter estimation or a more sophisticated concentration inequality. Indeed, in the next section, we show that in practical applications the advantage might already be close to 0.8 when the number of queries is close to \( 4\times 2^{\frac{n}{2}} \).

With that being said, it’s important to highlight that our attack demonstrates full scalability. In other words, as the value of q approaches \( 2^n \), the advantage becomes close to 1.

3.3 Experimental Verification

We have implemented the collision counting Algorithm 1 for different values of n. We have implemented two variants of the \(\texttt{collCount}\) function of the algorithm, which include various optimizations to make the attack practical. The first variant is an adversary without a bound on space complexity and with time complexity O(q), and is given in Algorithm 2. The second is for a space-optimized adversary, with space complexity O(q) and time complexity \(O(q\log _2(q))\), described in Algorithm 3. The underlying random permutations were generated and inverted using Python NumPy’s \(\texttt{shuffle}\) and \(\texttt{argsort}\) functions, respectively. We generated permutations of sizes 16, 20, 24, 28 and 32 bits and performed the distinguishing attack on each generated permutation. Results were averaged over \(1,000\sim 10,000\) random generations (each consisting of 3 independent permutations). In the ideal world, random values are sampled, since the tweaks are never repeated and lazy sampling can be used. Table 3 includes the average number of collisions for \(n=16\) and \(n=20\). The distinguisher reaches 16 expected collisions in the real world \(4\times \) faster than the distinguisher in [18] for \(n=16\) and \(16\times \) faster for \(n=20\).

Algorithm 1 is expected to have twice as many collisions in the real world as in the ideal world. \(\theta (q,n)\) is set to:

$$ \theta (q,n) = 2^{2d-1}+2^{2d-2} $$

when \(q=2^{n/2+d}\), which is roughly 1.5 times the expected number of collisions in the ideal case.
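The relation between this threshold and the ideal-world expectation can be checked numerically. The sketch below is illustrative and assumes that the ideal-world expectation is approximated by its leading term \( {q \atopwithdelims ()2}/2^n \); the function names are hypothetical.

```python
from math import comb


def theta(d):
    """Threshold used in the experiments for q = 2^(n/2 + d)."""
    return 2 ** (2 * d - 1) + 2 ** (2 * d - 2)


def ideal_mean(n, d):
    """Leading term of the expected ideal-world collision count, C(q, 2) / 2^n."""
    q = 2 ** (n // 2 + d)
    return comb(q, 2) / 2 ** n


for d in (2, 3, 4):
    # The two printed values should nearly coincide for each d.
    print(d, theta(d), 1.5 * ideal_mean(16, d))
```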

figure p
Table 3. Average number of collisions using random permutations.

We also calculated the success rate, which is the number of successful distinguishing attempts over the total number of attempts, for different values of q and \(\theta (q,n)\). This is equivalent to the advantage in Theorem 3.1. Table 4 shows the success rate for the different parameters. The distinguisher reaches \(\ge 85\%\) with \(q=2^{n/2+2}\) and \(99\%\) success rate with \(q=2^{n/2+3}\). The attack complexities are \(2^{n/2+3}\) and \(2^{n/2+4}\), respectively, since each iteration includes two queries to the construction. For large n, the factors \(2^3\) and \(2^4\) are small. With complexity \(2^{n/2+5}\), we get a success rate of almost \(100\%\), and an attack that breaks the security claim in practice for \(n\ge 64\). The complexity of the distinguisher is compared to known TNT distinguishers with \(n=128\) in Table 2.

Note that our experimental estimations closely match the advantage curve obtained through theoretical analysis, up to a change in constant. In fact, we get a more optimistic constant in experimental results. In particular, we estimate that the advantage is around

$$ 1-2\frac{2^n}{q^2}, $$

but the discrepancy is expected since the theoretical advantage is more conservative and bound to be a bit loose for the sake of simplicity.

Table 4. The success rate achieved for different values of n and q.

On the Time-Memory Trade-Off. Algorithm 2 runs in time O(q), with space complexity \(O(2^n)\). This is sufficient and provides optimal time complexity for information-theoretic (unbounded) adversaries. On the other hand, Algorithm 3 is more geared towards linear space complexity. Its time complexity is dominated by the \(\texttt{sort}\) function, which can be executed with time complexity \(O(q\log _2(q))\) using merge-sort. The space complexity is dominated by the size of the list L which is O(q).
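The two trade-offs can be illustrated with the following sketch. It is not a transcription of Algorithms 2 and 3, only one plausible way to realise a table-based and a sort-based collision count with the stated asymptotics. The function names are hypothetical, and Python's built-in `sorted` (Timsort) stands in for merge-sort.

```python
def coll_count_table(values, n):
    """Table-based counting: O(q) time, O(2^n) space."""
    seen = [0] * (2 ** n)          # one counter per possible n-bit value
    collisions = 0
    for v in values:
        collisions += seen[v]      # v collides with every earlier copy of itself
        seen[v] += 1
    return collisions


def coll_count_sort(values):
    """Sort-based counting: O(q log q) time, O(q) space."""
    ordered = sorted(values)
    collisions, run = 0, 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur == prev:
            run += 1               # extend the current run of equal values
        else:
            collisions += run * (run - 1) // 2
            run = 1
    return collisions + run * (run - 1) // 2


sample = [5, 1, 5, 3, 1, 5]
print(coll_count_table(sample, n=3), coll_count_sort(sample))
```

Both functions count pairs \( i < j \) with equal values, so a value that occurs k times contributes \( k(k-1)/2 \) collisions.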

In practice, while we assume that the cost of applying encryption is constant, executing q encryptions and decryptions is more costly than sorting a list with q entries. However, the adversary will not actually execute the encryptions and decryptions themselves, but will request them from the challenger, and in that case, the time complexity of Algorithm 2 is indeed superior to that of Algorithm 3, since the former will be able to terminate shortly after all the queries are executed, while the latter needs to execute the costly sorting operation. However, the exponential space complexity of Algorithm 2 makes it unsuitable for attacking practical instances of TNT. In Table 2, we provide the parameters for attacking TNT-AES using Algorithm 3, bounding both time and memory by \(2^{n/2+5}=2^{69}\). We ignore the \(\log _2\) term in the time complexity since this concerns the practical and not the asymptotic performance, which is dominated by the encryptions and decryptions.

Attacking TNT-GIFT-64. We have implemented this variant to attack TNT instantiated with GIFT-64 [3]. We used the implementation of GIFT-64 described in [1] which can encrypt 2 blocks at the same time. We implemented the attack over 16 cores on an Intel Xeon E5-2630 CPU, each doing \(2^{31}\) encryption calls and \(2^{31}\) decryption calls (\(2^{30}\) calls \(\times \ 2\) blocks), generating \(2^{35}\) blocks in total. This process took two hours (32 core-hours). In practice, the adversary is unlikely to be able to parallelize the queries, since that depends on the challenger.

Counting the collisions cannot be parallelized. It requires 40 min to count collisions in a set of \(2^{32}\) blocks and 1 h 20 min in a set of \(2^{33}\) blocks, generating on average 1 collision and 4 collisions, respectively. These results are reported in detail in Table 5. We note that for \(q\ge 2^{34}\), the attack uses less memory and significantly more time than the other cases. This is due to memory limitations: the platform is limited to 256 GB, so the collision counting phase had to be optimized towards memory consumption, leading to a significant slowdown. The time in Table 5 seems (at first glance) dominated by collision counting, which contradicts the statement we made earlier. However, it is to be noted that the collision counting part is serial in nature, while the TNT queries have been parallelized. For instance, performing the attack with \(2^{35}\) complexity needs 68 core-hours, while counting needs 36 core-hours on a limited memory machine, but can be faster on a machine with more memory. In particular, we estimate that with memory of about 384 GB and 768 GB, we can run the attacks with \(2^{34}\) and \(2^{35}\) complexities in slightly more than 20 and 40 core-hours, respectively.
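The reported averages of 1 and 4 collisions are in line with the real-world expectation \( \mu _{\textrm{re}} \). As a rough worked check, assuming that a set of \(2^{32}\) (respectively \(2^{33}\)) blocks corresponds to \(q = 2^{32}\) (respectively \(q = 2^{33}\)) with \(n = 64\):

$$\begin{aligned} \mu _{\textrm{re}} = {q \atopwithdelims ()2}\frac{2}{2^n} \approx \frac{q^2}{2^n}, \qquad \frac{(2^{32})^2}{2^{64}} = 1, \qquad \frac{(2^{33})^2}{2^{64}} = 4. \end{aligned}$$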

Table 5. Results for an attack on TNT-GIFT-64.

4 Spotting the Flaw in the BBB Security Proof of TNT

In [4], Bao et al. presented an IND-CCA security proof for TNT that contradicts our attack. This proof employs the \( \chi ^2 \) technique [14] — a relatively new proof technique — due to Dai et al.

In this section, we carefully revisit the security proof with the distinguisher \( \textbf{A}^* \), and identify an issue that involves a subtle, yet fundamental, case analysis. We temporarily switch to the notation of [4] to follow their proof approach. Namely, the random variable corresponding to plaintext is referred to as \(\textsf{X}\). The random variable corresponding to ciphertext is referred to as \(\textsf{Y}\) and the random variable corresponding to the tweak is referred to as \(\textsf{T}\). The rest of the random variables are related to the internal values of TNT and relate to the first three variables as: \(\textsf{S}=\pmb {\pi }_1(\textsf{X})\), \(\textsf{U}=\textsf{T}\oplus \textsf{S}\), \(\textsf{V}=\pmb {\pi }_2(\textsf{U})\), \(\textsf{W}=\textsf{T}\oplus \textsf{V}\) and \(\textsf{Y}=\pmb {\pi }_3(\textsf{W})\). For the \(l^{th}\) query to the construction, we define a set \(\mathcal {Q}_l\) as the set of the first l queries \(\{(\textsf{T}_1,\textsf{X}_1,\textsf{Y}_1),\ldots ,(\textsf{T}_l,\textsf{X}_l,\textsf{Y}_l)\}\). We follow a slight abuse of notation utilized in [4]: we say \(\textsf{X}\in \mathcal {Q}_l\) to mean \(\exists (\textsf{T},\textsf{X},\textsf{Y})\in \mathcal {Q}_l\), and similarly for \(\textsf{Y}\). We define a random variable \(\textsf{Inter}\) as the vector of internal values in the first \(l-1\) queries:

$$ \left( (\textsf{S}_1,\ldots ,\textsf{S}_{l-1}),(\textsf{U}_1,\ldots ,\textsf{U}_{l-1}),(\textsf{V}_1,\ldots ,\textsf{V}_{l-1}),(\textsf{W}_1,\ldots ,\textsf{W}_{l-1})\right) . $$

The main technique of the proof, from a high level point of view, works as follows:

  • A deterministic distinguisher observes the first \(l-1\) queries and selects whether the next query is a forward or inverse query as well as the tweak \(\textsf{T}_l\) and the plaintext \(\textsf{X}_l\) or ciphertext \(\textsf{Y}_l\) (\(\textsf{M}_l\) and \(\textsf{C}_l\) using our notations, respectively).

  • Find the probability distribution of all the internal values of the construction given the first \(l-1\) queries. We call a set of possible vectors of internal values \(\textsf{Inter}\).

  • For each possible \(\textsf{Inter}\), estimate the probability distribution of each possible response to query l.

The authors then analyze different possible cases and apply the \(\chi ^2\) method on the resulting distribution.

In order to better understand the issue, we analyze our distinguisher in the flow of the security proof. Our distinguisher works as follows:

  • If l is odd, it makes a forward query \((\textsf{X}_0,\textsf{T}_{l-2}+1)\).

  • If l is even, it makes a backward query \((\textsf{Y}_{l-1}, \textsf{T}_{l-1}\oplus \delta )\).

Let \((\textsf{S}_o,\textsf{U}_o,\textsf{V}_o)\) be the output of \(\pmb {\pi }_1\), input of \(\pmb {\pi }_2\) and output of \(\pmb {\pi }_2\) in the last (odd) query \(l-1\). We estimate the probability \( \Pr [\textsf{X}_l=\textsf{X}_i] \) for a given \(\textsf{X}_i\in \mathcal {Q}_{l}\), where i is even.

Let \((\textsf{S}_i,\textsf{U}_i,\textsf{V}_i)\) and \((\textsf{S}_e,\textsf{U}_e,\textsf{V}_e)\) be the corresponding internal values of \(\textsf{X}_i\) and \(\textsf{X}_l\), respectively. Then, we know that \( \textsf{V}_o\oplus \textsf{V}_e = \delta \) and

$$\begin{aligned} \Pr [\textsf{X}_l=\textsf{X}_i] = \Pr [\textsf{S}_e=\textsf{S}_i] &= \Pr [\textsf{U}_e\oplus \textsf{T}_{l-1}\oplus \delta =\textsf{U}_i\oplus \textsf{T}_{i-1}\oplus \delta ] \\ &=\Pr [\textsf{U}_e\oplus \textsf{U}_i=\textsf{T}_{l-1}\oplus \textsf{T}_{i-1}] \end{aligned}$$

Since \(\textsf{X}_0\) is fixed for all odd queries, so is \(\textsf{S}_o\). Thus, \(\textsf{U}_o\oplus \textsf{T}_{l-1}=\textsf{U}_{i-1}\oplus \textsf{T}_{i-1}\). Therefore,

$$ \Pr [\textsf{U}_e\oplus \textsf{U}_i=\textsf{T}_{l-1}\oplus \textsf{T}_{i-1}] = \Pr [\textsf{U}_e\oplus \textsf{U}_o=\textsf{U}_i\oplus \textsf{U}_{i-1}] \approx \frac{c}{2^n}, $$

where c is a small positive integer constant. The security proof considers two possible cases in which such collisions may occur. The first is when \(\textsf{U}_e\) has appeared before in one of the previous queries, and the second is when it has never appeared before. They dubbed these two cases class \(\mathcal {A}\) and class \(\mathcal {B}\), respectively. The collision can occur in either class \(\mathcal {A}\) or class \(\mathcal {B}\), and the proof bounds the corresponding probabilities for the \(l^{th}\) query by \(4l/2^{2n}\) and \(1/(2^n-l)\), respectively. Thus, our analysis deviates from the distribution assumed in [4]. In terms of the proof presented in [4], the event we are discussing belongs to case 5 (case 1 if we swap all the forward and backward queries). In this case, the authors claim [4, (9)]:

$$ \Pr [\textsf{X}_l=\textsf{X}_i]\le \frac{4l}{2^{2n}} + \frac{1}{2^n-l} $$

We argue that the distribution assumed for case 5/case 1 - class \(\mathcal {A}\) erroneously underestimates the probability of certain bad events, and by changing the distribution to account for these bad events, the proof argumentation falls apart. Besides, it is not clear how to do so in the existing proof framework using the \(\chi ^2\) method.

In particular, we look at the term \(4l/2^{2n}\). The term stems from the following argument in [4]:

figure q

Consider the first case of the collision in Fig. 5. We note that if the triplet \((\alpha ,\textsf{S}_o,\textsf{U}_o)\) is known, then the collision happens with probability 1, which puts it in class \(\mathcal {A}\). Then, what remains is to calculate what is the probability that the adversary can force this collision, i.e.,

$$ \Pr [\textsf{Inter}\in \mathcal {A}|\mathcal {Q}_{l-1}]=\Pr [\textsf{U}_e\oplus \textsf{U}_o=\textsf{T}_{l-1}\oplus \textsf{T}_{i-1}|\mathcal {Q}_{l-1}], $$

where \(\textsf{T}_{l-1}=t_{l/2}\) and \(\textsf{T}_{i-1}=t_{i/2}\) are determined by the adversary during previous queries. This means that once \(\textsf{U}_o\) and all other values of \(\textsf{U}\) except \(\textsf{U}_e\) in \(\textsf{Inter}\) are fixed (both \(\textsf{U}_o\) and \(\textsf{U}_e\) belong to queries \(i,j<l\)), \(\textsf{U}_e\) has at most \(2^n-\alpha (\mathcal {Q}_{l-1})\) choices where \(\alpha (\mathcal {Q}_{l-1})\le q\le 2^{n-1}\) is the number of distinct values in \(\{\textsf{U}_1,\dots \textsf{U}_{l}\}\setminus \{\textsf{U}_e\}\), and at most 1 of them (\(\textsf{U}_e=\textsf{U}_o\oplus \alpha \)) enforces the collision. In other words,

$$\begin{aligned} \Pr [\textsf{Inter}\in \mathcal {A}|\mathcal {Q}_{l-1}] = \Pr [\textsf{U}_e\oplus \textsf{U}_o=\textsf{T}_{l-1}\oplus \textsf{T}_{i-1}|\mathcal {Q}_{l-1}] &\ge \frac{1}{2^n-\alpha (\mathcal {Q}_{l-1})} \\ & \ge \frac{1}{2^n} \gg \frac{4l}{2^{2n}}, \end{aligned}$$

when \(l\ll q\), contradicting [4, (9)]. This is reflected in the final analysis of case 1 as follows: In [4, (11)], the maximum of two expressions is taken. One is of the form \(al/2^{2n}\) for a small constant a, and the other is of the form \(4l/2^{2n}+O(l/2^{2n})\). The term \(4l/2^{2n}\) comes from [4, (9)]. However, if the term is of the form \(O(1/2^n)\) instead of \(4l/2^{2n}\), as suggested by our attack, then the maximum function in [4, (11)] would return \(O(1/2^n)+O(l/2^{2n})\). Thus, the squared difference used in the \(\chi ^2\) statistic becomes of the form

$$ \left( \frac{1}{2^n-\alpha (\mathcal {Q}_{l-1})}-\frac{1}{2^n-\mu _l}+\frac{1}{2^{n}-l}\right) ^2 $$

or

$$ \left( \frac{A2^n+B2^{2n}}{2^{3n}}\right) ^2 $$

for some constants A and B. Note that \(\alpha (\mathcal {Q}_{l-1})\) is based on the probabilistic behaviour of the transcript, while \(\mu _l\) is fully controlled by the adversary, and we cannot ensure that \(\mu _l=\alpha (\mathcal {Q}_{l-1})\). Thus, the squared difference cannot be bounded tighter than \(O(1/2^{2n})\). Besides,

$$ \frac{1}{2^n-\alpha (\mathcal {Q}_{l-1})} $$

is a lower bound. The \(\chi ^2\) statistic is then of the form

$$ \sum \frac{O(1/2^{2n})}{O(1/2^n)} \approx \sum O(1/2^n) \approx 2^n \cdot O(1/2^n) \approx O(1) $$

not leading to any meaningful security.

Fig. 5. A class \(\mathcal {A}\) collision occurring in Algorithm 1. If the collision occurs, then permutation calls with the same color compute the same permutation point. Curved arrows represent difference relations (if the collision occurs with these internal values, each two connected nodes differ by \(\alpha \)).

Note that the values of \(\textsf{V}_i\) and \(\textsf{W}_i\) for \(i<l\) did not affect the behaviour of the collision or the probability that \(\textsf{Inter}\) is in class \(\mathcal {A}\). It seems the ambiguity may stem from applying the \(\chi ^2\) method to a primitive with two dependent functions (\(\tilde{\pi }\) and its inverse). By cascading forward and backward queries, we managed to eliminate \(\textsf{W}_i\) for all \(1\le i\le q\) and the values of \(\textsf{W}_l\) do not matter for the attack. Similarly, by fixing the difference between \(\textsf{V}_o\) and \(\textsf{V}_e\) to a constant \(\delta \), we minimize the effect of their exact values on the attack.

5 Birthday-Bound Security of TNT and Its Variant

In light of the above discussion, it is clear that the security of TNT is in limbo. One can rely on the IND-CCA bound by Zhang et al. to demonstrate the tightness of the proposed attacks. However, we observe that the generic bound in [43] introduces some constant factors, and in general, an independent security proof, using a different proof technique, will instill greater confidence in the revised security claims of TNT.

Theorem 5.1

Let \(\pmb {\pi }_1\), \(\pmb {\pi }_2\), and \(\pmb {\pi }_3\) be three independent random permutations of \(\{0,1\}^{n}\). Then, for all \( q \ge 1 \), we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{{\textsf {TNT}}}(q) \le \frac{q^2}{2^{n}}. \end{aligned}$$

In fact, we intend to prove a stronger version of Theorem 5.1, as stated in Theorem 5.2 below. Particularly, we show that even the single-keyed TNT, which we denote as \(\textsf {1k-TNT} \), is sufficient to achieve birthday-bound security. A proof of Theorem 5.1 is available in the full version of this paper [24], for the sake of completeness.

Theorem 5.2

Let \( \pmb {\pi }_1 = \pmb {\pi }_2 = \pmb {\pi }_3 = \pmb {\pi }\), where \( \pmb {\pi }\) is a uniform random permutation of \( \{0,1\}^{n}\). Then, for all \( q \ge 1 \), we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{{\textsf {1k}\hbox {-}\textsf {TNT}}}(q) \le \frac{8q^2}{2^{n}}. \end{aligned}$$

Proof

The statement is vacuously true for \(q \ge 2^{n/2}\). We will use the Expectation method (see Lemma 2.1) to prove the statement for \(1 \le q < 2^{n/2}\).

Let \( \mathcal {O}_1\) and \( \mathcal {O}_0\) be the oracles corresponding to \( \textsf {1k-TNT} \) and a tweakable random permutation \( {\widetilde{\pmb {\pi }}}\), respectively. If \((\textsf{T}_i, \textsf{M}_i)\) is the encryption query with a tweak \(\textsf{T}_i\) we write the response as \(\mathsf {\widehat{C}}_i\). Similarly, if \((\textsf{T}_i, \mathsf {\widehat{C}}_i)\) is the decryption query with a tweak \(\textsf{T}_i\) we write the response as \(\textsf{M}_i\). After all queries have been made, the two oracles release some additional data, \( \mathsf {\widehat{M}}^q \) and \( \textsf{C}^q \), to the adversary, who is free to ignore it.

In the real world, \( \mathsf {\widehat{M}}^q \) and \(\textsf{C}^q\) correspond to the output of the first permutation and input of the third permutation, respectively, and thus they are well defined from the definition of 1k-TNT. The real-world transcript is thus defined as the tuple

$$\begin{aligned} \mathsf {\Theta }_1:= (\textsf{T}^q,\textsf{M}^q,\mathsf {\widehat{C}}^q,\mathsf {\widehat{M}}^q,\textsf{C}^q). \end{aligned}$$

In the ideal system \({\widetilde{\pmb {\pi }}}\), we sample \( \mathsf {\widehat{M}}^q, \textsf{C}^q\) as follows: For every \(i \in [q]\),

  1. \(\mathsf {\widehat{M}}_i = \mathsf {\widehat{M}}_j\) whenever \(\textsf{M}_i = \textsf{M}_j\) for \(j < i\). Otherwise (for all \(j < i\), \(\textsf{M}_j \ne \textsf{M}_i\)), we sample

    figure r
  2. \(\textsf{C}_i = \textsf{C}_j\) whenever \(\mathsf {\widehat{C}}_j = \mathsf {\widehat{C}}_i\) for \(j < i\). Otherwise (for all \(j < i\), \(\mathsf {\widehat{C}}_j \ne \mathsf {\widehat{C}}_i\)), we sample

    figure s

The ideal world transcript is defined as

$$\begin{aligned} \mathsf {\Theta }_0:= (\textsf{T}^q,\textsf{M}^q,\mathsf {\widehat{C}}^q,\mathsf {\widehat{M}}^q,\textsf{C}^q). \end{aligned}$$

Note that we use the same notation to denote the random variables in both worlds. However, their probability distributions will be unambiguously determined at the time of probability computations.

Bad Transcript and Its Analysis: Let \( u^q := \widehat{m}^q \oplus t^q \), and \( \widehat{u}^q := c^q \oplus t^q \). A transcript \((t^q, m^q, \widehat{c}^q, \widehat{m}^q, c^q)\) is called bad if and only if any of the following bad events occur:

\(\textsf{bad}_{1a}\): \( \exists i \ne j \in [q] \) such that \( \widehat{m}_i = \widehat{c}_j \).

\(\textsf{bad}_{1b}\): \( \exists i \ne j \in [q] \) such that \( c_i = m_j \).

\(\textsf{bad}_{2a}\): \( \exists i < j \in [q] \) such that \( u_i = u_j \).

\(\textsf{bad}_{2b}\): \( \exists i < j \in [q] \) such that \( \widehat{u}_i = \widehat{u}_j \).

\(\textsf{bad}_{3a}\): \( \exists i \ne j \in [q] \) such that \( u_i = m_j \).

\(\textsf{bad}_{3b}\): \( \exists i \ne j \in [q] \) such that \( \widehat{u}_i = \widehat{c}_j \).

\(\textsf{bad}_{4a}\): \( \exists i \ne j \in [q] \) such that \( u_i = c_j \).

\(\textsf{bad}_{4b}\): \( \exists i \ne j \in [q] \) such that \( \widehat{u}_i = \widehat{m}_j \).

Let \(\varOmega _{\textsf{bad}}\) denote the set of all bad transcripts. Then, using union bound, we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathsf {\Theta }_0\in \varOmega _{\textsf{bad}}}\right) \le \sum _{\begin{array}{c} i \in [4]\\ \texttt{s}\in \{a,b\} \end{array}} \textsf{Pr}_{}\left( {\textsf{bad}_{i{\texttt{s}}}}\right) \end{aligned}$$
(16)

On the right-hand side, we bound the probability for \( \textsf{bad}_{i\texttt{a}}\) for all \(i \in [4]\). The \(\texttt{s}=b\) cases can be bounded analogously.

  • \( \textsf{Pr}_{}\left( {\textsf{bad}_{1a}}\right) \le \frac{q^2}{2^n} \). This follows from union bound: For a fixed choice of i and j, \( \textsf{bad}_{1a} \) happens with \( 2^{-n} \) probability, and there are \( q^2 \) such (i, j) pairs.

  • \(\textsf{Pr}_{}\left( {\textsf{bad}_{2a}}\right) \le \sum _{i < j} \textsf{Pr}_{}\left( {\mathsf {\widehat{M}}_i + \textsf{T}_i = \mathsf {\widehat{M}}_j + \textsf{T}_j}\right) \le q^2/2^{n} \). This can be argued as follows: For fixed i and j, \( \textsf{Pr}_{}\left( {\mathsf {\widehat{M}}_i + \textsf{T}_i = \mathsf {\widehat{M}}_j + \textsf{T}_j}\right) \le 1/2^{n-1} \le 2^{1-n} \), and there are \( {q \atopwithdelims ()2} \) such (i, j) pairs.

  • \( \textsf{Pr}_{}\left( {\textsf{bad}_{3a}}\right) \le \frac{q^2}{2^n} \). The argumentation is similar to the one for \( \textsf{bad}_{1a} \).

  • \( \textsf{Pr}_{}\left( {\textsf{bad}_{4a}}\right) \le \frac{q^2}{2^n} \). This can be argued as follows: For any fixed i and j, \( \textsf{bad}_{4a} \) happens with at most \( 2^{-n} \) probability, and there are at most \( q^2 \) such (i, j) pairs.

Thus, on combining everything in (16), we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathsf {\Theta }_0\in \varOmega _{\textsf{bad}}}\right) \le \frac{8 q^2}{2^n}. \end{aligned}$$

Analysis of Good Transcripts: For a good transcript \( \omega = (t^q, m^q, \widehat{c}^q, \widehat{m}^q, c^q)\), we know that \((m^q,\widehat{m}^q)\), \((c^q,\widehat{c}^q) \), and \( (u^q,\widehat{u}^q) \) are permutation consistent non-overlapping input-output pairs and hence for the real world we have

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathsf {\Theta }_1= \omega }\right) & = \textsf{Pr}_{}\left( {\pmb {\pi }(m^q) = \widehat{m}^q}\right) \times \textsf{Pr}_{}\left( {\pmb {\pi }(u^q) = \widehat{u}^q}\right) \times \textsf{Pr}_{}\left( {\pmb {\pi }(c^q) = \widehat{c}^q}\right) \nonumber \\ & = \frac{1}{(2^n)_{r+q+s}} \end{aligned}$$

where r and s denote the size of \( \texttt {S}(m^q) \) and \( \texttt {S}(\widehat{c}^q) \) respectively. In the ideal world, we have,

$$\begin{aligned} \textsf{Pr}_{}\left( {\mathsf {\Theta }_0=\omega }\right) & = \textsf{Pr}_{}\left( {{\widetilde{\pmb {\pi }}}(t^q, m^q) = \widehat{c}^q}\right) \times \frac{1}{(2^n)_{r}} \times \frac{1}{(2^n)_{s}} \le \frac{1}{(2^n)_{q}} \times \frac{1}{(2^n)_{r}} \times \frac{1}{(2^n)_{s}}, \end{aligned}$$

where the final inequality follows from the fact that \(\textsf{Pr}_{}\left( {{\widetilde{\pmb {\pi }}}(t^q, m^q) = \widehat{c}^q}\right) \) is maximized when \(t_i=t_j\) for all \(1 \le i<j \le q\). Thus

$$\begin{aligned} \frac{\textsf{Pr}_{}\left( {\mathsf {\Theta }_1= \omega }\right) }{\textsf{Pr}_{}\left( {\mathsf {\Theta }_0= \omega }\right) } \ge \frac{(2^n)_{q} \times (2^n)_{r} \times (2^n)_{s}}{(2^n)_{q + r + s}} \ge 1 \end{aligned}$$

Now the result follows from the Expectation method by setting \( \epsilon _{\textsf{ratio}}\) to be the zero function.
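The last inequality in the ratio can be sanity-checked numerically. The following Python sketch uses a hypothetical helper `falling` for the falling factorial \( (N)_k = N(N-1)\cdots (N-k+1) \) and verifies, for a toy block size chosen only for illustration, that \( (2^n)_q (2^n)_r (2^n)_s \ge (2^n)_{q+r+s} \) for every admissible triple.

```python
from fractions import Fraction
from itertools import product


def falling(N: int, k: int) -> int:
    """Falling factorial (N)_k = N * (N-1) * ... * (N-k+1), with (N)_0 = 1."""
    out = 1
    for i in range(k):
        out *= N - i
    return out


def good_ratio_lower_bound(n: int, q: int, r: int, s: int) -> Fraction:
    """(2^n)_q (2^n)_r (2^n)_s / (2^n)_{q+r+s}: lower bound on the real/ideal ratio."""
    N = 2 ** n
    assert q + r + s <= N, "need q + r + s <= 2^n so the denominator is non-zero"
    numerator = falling(N, q) * falling(N, r) * falling(N, s)
    return Fraction(numerator, falling(N, q + r + s))


# Exhaustive check over all admissible (q, r, s) for a toy block size n = 4.
n = 4
all_at_least_one = all(
    good_ratio_lower_bound(n, q, r, s) >= 1
    for q, r, s in product(range(2 ** n + 1), repeat=3)
    if q + r + s <= 2 ** n
)
```

The check succeeds because \( (N)_{q+r+s} = (N)_q (N-q)_r (N-q-r)_s \), and each of the last two factors is at most \( (N)_r \) and \( (N)_s \) respectively.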

6 The Generalized LRW Paradigm

In this section, we propose a generalized view of the cascaded LRW design that encompasses both cascaded LRW1 and cascaded LRW2 constructions. In addition, we identify some necessary properties to guarantee IND-CCA security up to \( 2^{3n/4} \) queries.

Almost XOR Universal Hash Function: A \( (\tau ,n) \)-hash function family \( \mathcal {H}\) is a family of functions \( \{h:\{0,1\}^{\tau }\rightarrow \{0,1\}^{n}\} \), keyed implicitly by the choice of h. A \( (\tau ,n) \)-hash function family \( \mathcal {H}\) is called an \( \epsilon \)-almost XOR universal hash family (AXUHF) if for all \( t \ne t' \in \{0,1\}^{\tau }\), and \( \delta \in \{0,1\}^{n}\), we have

$$\begin{aligned} \textsf{Pr}_{{\pmb {\textsf{H}}}\leftarrow \mathcal {H}}\left( {{\pmb {\textsf{H}}}(t) \oplus {\pmb {\textsf{H}}}(t') = \delta }\right) \le \epsilon . \end{aligned}$$
(17)

For the special case of \( \delta = 0^n \), \( \mathcal {H}\) is referred to as an \( \epsilon \)-AUHF.
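To make the definition concrete, the following Python sketch exhaustively measures the AXU parameter of one well-known family, the multiplication hash \( h_k(t) = k \cdot t \) over \( \mathrm{GF}(2^4) \). The choice of family, the 4-bit toy size, the reduction polynomial \( x^4 + x + 1 \), and all function names are illustrative assumptions and are not taken from the constructions analyzed in this paper.

```python
from fractions import Fraction

SIZE = 16  # toy parameters: tau = n = 4, so tweaks, keys and outputs are 4-bit values


def gf16_mul(a: int, b: int) -> int:
    """Multiply a and b in GF(2^4), reducing modulo x^4 + x + 1 (0b10011)."""
    acc = 0
    for _ in range(4):
        if b & 1:
            acc ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:          # degree reached 4: reduce
            a ^= 0b10011
    return acc


def axu_epsilon() -> Fraction:
    """Largest value of Pr_k[h_k(t) XOR h_k(t') = delta] over all t != t' and all delta."""
    worst = 0
    for t in range(SIZE):
        for t2 in range(SIZE):
            if t == t2:
                continue
            for delta in range(SIZE):
                hits = sum(
                    1 for k in range(SIZE)
                    if gf16_mul(k, t) ^ gf16_mul(k, t2) == delta
                )
                worst = max(worst, hits)
    return Fraction(worst, SIZE)
```

For this family, `axu_epsilon()` evaluates the left-hand side of (17) for every admissible \( (t, t', \delta ) \); since \( k \cdot (t \oplus t') = \delta \) has exactly one solution k in a field, the family meets the definition with \( \epsilon = 2^{-n} \).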

The LRW+ Construction: Let \(\widetilde{\mathcal {H}}\) be a family of \((\tau ,n)\)-tweakable permutations, and \(\mathcal {H}\) be a \((\tau ,n)\)-hash function family. Let \( \widehat{\mathcal {H}}= (\widetilde{\mathcal {H}}^2 \times \mathcal {H}) \) and \( ({\widetilde{\pmb {\textsf{H}}}}_1,{\widetilde{\pmb {\textsf{H}}}}_2,{\pmb {\textsf{H}}}) \leftarrow \textsf{KG}\left( \widehat{\mathcal {H}}\right) \), where \(\textsf{KG}\left( \widehat{\mathcal {H}}\right) \) is an efficient probabilistic algorithm that returns a random triple from \(\widehat{\mathcal {H}}\), and let \( \pmb {\pi }_1 \) and \( \pmb {\pi }_2 \) be two independent uniform random permutations of \( \{0,1\}^{n}\).

The \(\textsf {LRW+} \) construction is a \((\tau ,n)\)-tweakable permutation family, defined by the following mapping (see Fig. 6 for an illustration):

$$\begin{aligned} (t,m) \mapsto {\widetilde{\pmb {\textsf{H}}}}_2^{-1}\left( t,\pmb {\pi }_2\left( {\pmb {\textsf{H}}}(t) \oplus \pmb {\pi }_1\left( {\widetilde{\pmb {\textsf{H}}}}_1\left( t,m\right) \right) \right) \right) . \end{aligned}$$
(18)
Fig. 6. The \(\textsf {LRW+} \) construction.
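The mapping (18) can be sketched in a few lines of Python. Everything concrete below is an assumption made for illustration: the 4-bit block and tweak size, the use of lookup tables for \( \pmb {\pi }_1 \) and \( \pmb {\pi }_2 \), and in particular the toy choice \( {\widetilde{\pmb {\textsf{H}}}}_i(t,x) = x \oplus \textsf{mask}_i(t) \) for the two tweakable permutations. The sketch only shows how the five components compose and how the inverse is obtained.

```python
import random

N_BITS = 4                      # toy block size n (assumption, small enough to enumerate)
SIZE = 1 << N_BITS              # tweaks are also N_BITS long here, i.e. tau = n


def random_permutation(rng):
    """Return (table, inverse_table) for a random permutation of range(SIZE)."""
    table = list(range(SIZE))
    rng.shuffle(table)
    inverse = [0] * SIZE
    for x, y in enumerate(table):
        inverse[y] = x
    return table, inverse


def make_lrw_plus(seed=0):
    """Sample toy components and return (encrypt, decrypt) closures for mapping (18)."""
    rng = random.Random(seed)
    pi1, pi1_inv = random_permutation(rng)
    pi2, pi2_inv = random_permutation(rng)
    # Three independent random tables indexed by the tweak.
    mask1 = [rng.randrange(SIZE) for _ in range(SIZE)]   # defines H~_1(t, m) = m XOR mask1[t]
    mask2 = [rng.randrange(SIZE) for _ in range(SIZE)]   # defines H~_2(t, c) = c XOR mask2[t]
    h = [rng.randrange(SIZE) for _ in range(SIZE)]       # defines H(t)

    def encrypt(t, m):
        u = m ^ mask1[t]          # H~_1(t, m)
        v = h[t] ^ pi1[u]         # H(t) XOR pi_1(.)
        w = pi2[v]                # pi_2(.)
        return w ^ mask2[t]       # H~_2^{-1}(t, .); XOR masking is its own inverse

    def decrypt(t, c):
        w = c ^ mask2[t]          # H~_2(t, c)
        v = pi2_inv[w]
        u = pi1_inv[v ^ h[t]]
        return u ^ mask1[t]       # H~_1^{-1}(t, .)

    return encrypt, decrypt


encrypt, decrypt = make_lrw_plus()
```

For every fixed tweak the resulting `encrypt(t, ·)` is a permutation of the block space, because each layer of (18) is a permutation once t is fixed.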

6.1 Security of LRW+

We say that \( \textsf{KG}\left( \widehat{\mathcal {H}}\right) \) is a pairwise independent sampling mechanism or PISM, if \( ({\widetilde{\pmb {\textsf{H}}}}_1,{\widetilde{\pmb {\textsf{H}}}}_2,{\pmb {\textsf{H}}}) \leftarrow \textsf{KG}\left( \widehat{\mathcal {H}}\right) \) is a pairwise independent tuple.

We say that \( \widetilde{\mathcal {H}}\) is an \(\epsilon \)-almost universal tweakable permutation family (AUTPF) if and only if for all distinct \( (t,m),(t',m') \in \{0,1\}^{\tau }\times \{0,1\}^{n}\),

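$$\begin{aligned} \textsf{Pr}_{{\widetilde{\pmb {\textsf{H}}}}\leftarrow \widetilde{\mathcal {H}}}\left( {{\widetilde{\pmb {\textsf{H}}}}(t,m) = {\widetilde{\pmb {\textsf{H}}}}(t',m')}\right) \le \epsilon . \end{aligned}$$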

Theorem 6.1

Let \( \tau ,n \in \mathbb {N}\), and \( \epsilon _1,\epsilon _2 \in [0,1] \). If \( \widetilde{\mathcal {H}}\) and \( \mathcal {H}\) are respectively \(\epsilon _1\)-AUTPF and \(\epsilon _2\)-AUHF, and \( \textsf{KG}\left( \widehat{\mathcal {H}}\right) \) is a PISM, then, for \( q \le 2^{n-2} \), we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{{\textsf {LRW+}}}(q) \le \epsilon (q,n), \end{aligned}$$

where

$$\begin{aligned} \epsilon (q,n) = 2q^2\epsilon _1^{1.5} + \frac{4q^4\epsilon _1^2}{2^{n}}+\frac{32q^4\epsilon _1}{2^{2n}} + \frac{13q^4}{2^{3n}} + q^2\epsilon _1^2 + q^2\epsilon _1\epsilon _2 + \frac{2q^2}{2^{2n}}. \end{aligned}$$
(19)

A proof of this theorem follows from a simple generalization of Jha and Nandi’s (JN) proof [26] for \( {2}\hbox {-}\textsf {LRW2} \). In particular, the exact same strategy of using the expectation method with the JN adaptation of mirror theory [10, 37] in the tweakable permutation setting works here as well. For the sake of completeness, we give the complete proof in the full version of this paper [24].
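To get a feeling for the bound (19), the following Python sketch evaluates \( \epsilon (q,n) \) in floating point. The function name and the sample parameters (\( n = 128 \) and \( \epsilon _1 = \epsilon _2 = 2^{-128} \)) are assumptions chosen for illustration; the formula itself is transcribed term by term from (19).

```python
def lrw_plus_bound(q: int, n: int, eps1: float, eps2: float) -> float:
    """Evaluate the right-hand side of (19); meaningful only for q <= 2^(n-2)."""
    N = 2.0 ** n
    return (
        2 * q**2 * eps1**1.5
        + 4 * q**4 * eps1**2 / N
        + 32 * q**4 * eps1 / N**2
        + 13 * q**4 / N**3
        + q**2 * eps1**2
        + q**2 * eps1 * eps2
        + 2 * q**2 / N**2
    )


n = 128
eps = 2.0 ** -n                                        # assumed: ideal 2^{-n}-universal hashes
bound_at_2_64 = lrw_plus_bound(2**64, n, eps, eps)     # well below the 3n/4 threshold
bound_at_2_90 = lrw_plus_bound(2**90, n, eps, eps)     # a few bits below the threshold
bound_at_2_96 = lrw_plus_bound(2**96, n, eps, eps)     # q = 2^{3n/4}: bound is vacuous
```

Under these assumed parameters the term \( 2q^2\epsilon _1^{1.5} \) dominates, and the bound stays small until q approaches \( 2^{3n/4} = 2^{96} \), where it exceeds 1.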

Remark 6.1

The proof presented in [26] appears to have overlooked the analysis of a specific subset of transcripts, which in hindsight of our generalized analysis seems to be a minor issue. Indeed, our proof demonstrates that this omission does not impact the overall bound significantly, with any potential effects being limited to a small constant factor.

6.2 Instantiating LRW+

We show that any cascaded LRW construction with \( r \ge 2 \) rounds can be viewed as an instance of \( \textsf {LRW+} \). Thus, they can be proven secure up to \( 2^{3n/4} \) queries provided the derived hash functions are close to \( 2^{-n} \)-universal. Note that it would be sufficient to define \( {\widetilde{\pmb {\textsf{H}}}}_1 \), \( {\widetilde{\pmb {\textsf{H}}}}_2 \), \( {\pmb {\textsf{H}}}\), \( \pmb {\pi }_1 \) and \( \pmb {\pi }_2 \) for each construction. In the following discussion, let \( \pmb {\pi }'_1,\ldots ,\pmb {\pi }'_r \) be independent uniform random permutations of \( \{0,1\}^{n}\) and let \( {\pmb {\textsf{H}}}'_1,\ldots ,{\pmb {\textsf{H}}}'_r \) be independent hash functions drawn from \( \mathcal {H}\), where \( \mathcal {H}\) is an \( \epsilon \)-AXUHF.

Cascaded LRW1. For \(r \ge 2\), the \({r}\hbox {-}\textsf {LRW1} [\pmb {\pi }^r]\) construction takes as input \( (t,m) \in \{0,1\}^{n}\times \{0,1\}^{n}\) and returns \( c \in \{0,1\}^{n}\), which is defined as follows:

Let \( y_0 = t \oplus m \) and for all \( i \in [r] \):

$$\begin{aligned} x_i := t \oplus y_{i-1} \qquad \text {and}\qquad y_i := \pmb {\pi }'_i(x_i), \end{aligned}$$

and finally \( c := y_r \). The inverse of \( {r}\hbox {-}\textsf {LRW1} \) is analogously defined.
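The round recurrence and its inverse can be sketched as follows. This is a minimal Python illustration, assuming a 4-bit toy block size and random lookup tables standing in for \( \pmb {\pi }'_1,\ldots ,\pmb {\pi }'_r \); the helper names are hypothetical.

```python
import random

N_BITS = 4                      # toy block size n (assumption)
SIZE = 1 << N_BITS


def sample_permutations(r, seed=0):
    """Return r random permutation tables; entry i-1 plays the role of pi'_i."""
    rng = random.Random(seed)
    perms = []
    for _ in range(r):
        table = list(range(SIZE))
        rng.shuffle(table)
        perms.append(table)
    return perms


def invert(table):
    """Inverse lookup table of a permutation."""
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse


def lrw1_encrypt(perms, t, m):
    y = t ^ m                   # y_0 = t XOR m
    for pi in perms:            # rounds i = 1, ..., r
        x = t ^ y               # x_i = t XOR y_{i-1}
        y = pi[x]               # y_i = pi'_i(x_i)
    return y                    # c = y_r


def lrw1_decrypt(perms, t, c):
    y = c                       # y_r = c
    for pi in reversed(perms):  # rounds i = r, ..., 1
        x = invert(pi)[y]       # x_i = pi'_i^{-1}(y_i)
        y = t ^ x               # y_{i-1} = t XOR x_i
    return t ^ y                # m = t XOR y_0


perms4 = sample_permutations(4)
perms3 = sample_permutations(3, seed=1)
```

Note that \( x_1 = t \oplus y_0 = m \), so the tweak first enters between the first and second permutation calls; for \( r = 3 \) the recurrence unrolls to \( \pmb {\pi }'_3(t \oplus \pmb {\pi }'_2(t \oplus \pmb {\pi }'_1(m))) \).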

Cascaded LRW1 as an Instance of LRW+: For some \( r \ge 2 \), \( r' := \left\lfloor r/2\right\rfloor \), and any \( (t,m) \) such that \( {r}\hbox {-}\textsf {LRW1} (t,m) = c \), define \( {\widetilde{\pmb {\textsf{H}}}}_1(t,m) := x_{r'} \), \( {\pmb {\textsf{H}}}(t) := t \), \( {\widetilde{\pmb {\textsf{H}}}}_2(t,c) := y_{r'+1} \), \( \pmb {\pi }_1 := \pmb {\pi }'_{r'} \), and \( \pmb {\pi }_{2} := \pmb {\pi }'^{-1}_{r'+1} \).

Clearly, the LRW+ instance so defined is the same as \( {r}\hbox {-}\textsf {LRW1} \). We have the following corollary on the security of cascaded LRW1.
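For \( r = 4 \) (so \( r' = 2 \)) this identification can be checked exhaustively on a toy block size. The Python sketch below uses illustrative helper names and a 4-bit block, both assumptions. It builds \( {\widetilde{\pmb {\textsf{H}}}}_1(t,m) = x_2 \), \( {\pmb {\textsf{H}}}(t) = t \) and \( {\widetilde{\pmb {\textsf{H}}}}_2(t,c) = y_3 \) from four random permutations, and compares the result with \( {4}\hbox {-}\textsf {LRW1} \) on every input. One assumption should be flagged: inside the mapping (18) the sketch applies \( \pmb {\pi }'_{3} \) in the forward direction, since that is the orientation under which the composition returns c. With \( \pmb {\pi }_2 = \pmb {\pi }'^{-1}_{3} \) as written in the text, the same fact takes the symmetric form \( \pmb {\pi }_1({\widetilde{\pmb {\textsf{H}}}}_1(t,m)) \oplus \pmb {\pi }_2({\widetilde{\pmb {\textsf{H}}}}_2(t,c)) = {\pmb {\textsf{H}}}(t) \), which the sketch also checks. For a uniform random permutation the two orientations are identically distributed.

```python
import random

N_BITS = 4                      # toy block size n (assumption)
SIZE = 1 << N_BITS


def invert(table):
    """Inverse lookup table of a permutation."""
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse


rng = random.Random(1)
p = []                          # p[i-1] plays the role of pi'_i for i = 1..4
for _ in range(4):
    table = list(range(SIZE))
    rng.shuffle(table)
    p.append(table)
p_inv = [invert(table) for table in p]


def lrw1_4(t, m):
    """4-LRW1: y_0 = t XOR m, then y_i = pi'_i(t XOR y_{i-1}) for i = 1..4."""
    y = t ^ m
    for table in p:
        y = table[t ^ y]
    return y


def h1_tilde(t, m):             # x_2 = t XOR pi'_1(m)
    return t ^ p[0][m]


def h2_tilde(t, c):             # y_3 = t XOR pi'_4^{-1}(c)
    return t ^ p_inv[3][c]


def h2_tilde_inv(t, y):         # inverse of h2_tilde in its second argument
    return p[3][t ^ y]


def h(t):                       # H(t) = t
    return t


def lrw_plus(t, m):
    """Mapping (18) with pi_1 = pi'_2 and pi'_3 applied in the forward direction."""
    return h2_tilde_inv(t, p[2][h(t) ^ p[1][h1_tilde(t, m)]])


def symmetric_relation_holds(t, m):
    """pi'_2(H~_1(t,m)) XOR pi'_3^{-1}(H~_2(t,c)) == H(t) for c = 4-LRW1(t,m)."""
    c = lrw1_4(t, m)
    return (p[1][h1_tilde(t, m)] ^ p_inv[2][h2_tilde(t, c)]) == h(t)
```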

Corollary 6.1

  For \( r \ge 4 \), we have

$$\begin{aligned} \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{{r}\hbox {-}\textsf {LRW1}}(q) \le \frac{2q^2}{(2^n-1)^{1.5}} + \frac{49q^4}{(2^n-1)^3} + \frac{3q^2}{(2^{n}-1)^2}. \end{aligned}$$

In particular, for \( r=4 \), we have proved CCA security for \( {4}\hbox {-}\textsf {LRW1} \) up to \( 2^{3n/4} \) queries. A proof of this corollary is available in the full version of this paper [24].
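The denominators \( 2^n - 1 \) are consistent with the collision probability of the hash derived from the cascade. Reading the definitions above with \( r = 4 \) and \( r' = 2 \) gives \( {\widetilde{\pmb {\textsf{H}}}}_1(t,m) = x_2 = t \oplus \pmb {\pi }'_1(m) \), while \( {\pmb {\textsf{H}}}(t) = t \) never collides on distinct tweaks. The following Python sketch is only a plausibility check and not the proof from [24]: it enumerates all permutations of a 2-bit block (a toy size, chosen so that enumeration is feasible) and computes the largest collision probability of this \( {\widetilde{\pmb {\textsf{H}}}}_1 \) over all distinct input pairs. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

N_BITS = 2                      # toy block size n (assumption: all 4! permutations are enumerable)
SIZE = 1 << N_BITS


def autpf_epsilon() -> Fraction:
    """Max over distinct (t,m), (t',m') of Pr_pi[t XOR pi(m) == t' XOR pi(m')]."""
    perms = list(permutations(range(SIZE)))          # every permutation of {0,1}^n
    points = list(product(range(SIZE), repeat=2))    # every input pair (t, m)
    worst = Fraction(0)
    for i, (t, m) in enumerate(points):
        for t2, m2 in points[i + 1:]:                # each unordered pair of distinct inputs
            hits = sum(1 for pi in perms if (t ^ pi[m]) == (t2 ^ pi[m2]))
            worst = max(worst, Fraction(hits, len(perms)))
    return worst
```

On this toy size the maximum equals \( 1/(2^n - 1) \), attained when \( m \ne m' \) and \( t \ne t' \); substituting \( \epsilon _1 = 1/(2^n-1) \) and \( \epsilon _2 = 0 \) into (19) and bounding \( 2^{-n} \le (2^n-1)^{-1} \) yields terms in \( q^2/(2^n-1)^{1.5} \), \( q^4/(2^n-1)^{3} \) and \( q^2/(2^n-1)^{2} \).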

Cascaded LRW2. For \(r \ge 1\), the \({r}\hbox {-}\textsf {LRW2} [\pmb {\pi }^r,{\pmb {\textsf{H}}}'^r]\) construction takes as input \( (t,m) \in \{0,1\}^{\tau }\times \{0,1\}^{n}\) and returns \( c \in \{0,1\}^{n}\), which is defined as follows:

Let \( y_0 = m \), \( {\pmb {\textsf{H}}}'_0 \) be a constant function that returns \( 0^n \), and for all \( i \in [r] \):

$$\begin{aligned} x_i := {\pmb {\textsf{H}}}'_{i-1}(t) \oplus {\pmb {\textsf{H}}}'_{i}(t) \oplus y_{i-1} \qquad \text {and}\qquad y_i := \pmb {\pi }'_i(x_i), \end{aligned}$$

and finally \( c := {\pmb {\textsf{H}}}'_r(t) \oplus y_r \). The inverse of \( {r}\hbox {-}\textsf {LRW2} \) is analogously defined.
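The recurrence for cascaded LRW2 can be sketched in the same way. In the Python illustration below, the 4-bit block and tweak sizes, the random lookup tables standing in for \( \pmb {\pi }'_i \) and \( {\pmb {\textsf{H}}}'_i \), and the helper names are all assumptions made for the sake of a runnable example.

```python
import random

N_BITS = 4                      # toy sizes: tau = n = 4 (assumption)
SIZE = 1 << N_BITS


def invert(table):
    """Inverse lookup table of a permutation."""
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse


def sample_instance(r, seed=0):
    """Return (perms, hashes); index i-1 plays the role of pi'_i and H'_i."""
    rng = random.Random(seed)
    perms, hashes = [], []
    for _ in range(r):
        table = list(range(SIZE))
        rng.shuffle(table)
        perms.append(table)
        hashes.append([rng.randrange(SIZE) for _ in range(SIZE)])
    return perms, hashes


def lrw2_encrypt(perms, hashes, t, m):
    y, prev = m, 0                       # y_0 = m and H'_0(t) = 0^n
    for pi, h in zip(perms, hashes):
        x = prev ^ h[t] ^ y              # x_i = H'_{i-1}(t) XOR H'_i(t) XOR y_{i-1}
        y = pi[x]                        # y_i = pi'_i(x_i)
        prev = h[t]
    return prev ^ y                      # c = H'_r(t) XOR y_r


def lrw2_decrypt(perms, hashes, t, c):
    y = hashes[-1][t] ^ c                # y_r = H'_r(t) XOR c
    for i in range(len(perms) - 1, -1, -1):
        x = invert(perms[i])[y]          # x_i = pi'_i^{-1}(y_i)
        prev = hashes[i - 1][t] if i > 0 else 0
        y = x ^ prev ^ hashes[i][t]      # y_{i-1}
    return y                             # m = y_0


perms3, hashes3 = sample_instance(3)
perms2, hashes2 = sample_instance(2, seed=7)
```

For \( r = 2 \) the recurrence unrolls to \( c = {\pmb {\textsf{H}}}'_2(t) \oplus \pmb {\pi }'_2\left( {\pmb {\textsf{H}}}'_1(t) \oplus {\pmb {\textsf{H}}}'_2(t) \oplus \pmb {\pi }'_1({\pmb {\textsf{H}}}'_1(t) \oplus m)\right) \).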

Cascaded LRW2 as an Instance of LRW+: For some \( r \ge 2 \), \( r' = \left\lfloor r/2\right\rfloor \), and any \( (t,m) \) such that \( {r}\hbox {-}\textsf {LRW2} (t,m) = c \), define \( {\widetilde{\pmb {\textsf{H}}}}_1(t,m) := x_{r'} \), \( {\pmb {\textsf{H}}}(t) := {\pmb {\textsf{H}}}'_{r'}(t) \oplus {\pmb {\textsf{H}}}'_{r'+1}(t) \), \( {\widetilde{\pmb {\textsf{H}}}}_2(t,c) := y_{r'+1} \), \( \pmb {\pi }_1 := \pmb {\pi }'_{r'} \), and \( \pmb {\pi }_{2} := \pmb {\pi }'^{-1}_{r'+1} \).

Clearly, the LRW+ instance so defined is the same as \( {r}\hbox {-}\textsf {LRW2} \). We have the following corollary on the security of cascaded LRW2.

Corollary 6.2

  For \( r \ge 2 \), we have

$$ \textbf{Adv}^{\mathsf {ind\text {-}cca}}_{{r}\hbox {-}\textsf {LRW2}}(q) \le 2q^2\epsilon ^{1.5} + \frac{4q^4\epsilon ^2}{2^{n}}+\frac{32q^4\epsilon }{2^{2n}} + \frac{13q^4}{2^{3n}} + 2q^2\epsilon ^2 + \frac{2q^2}{2^{2n}}. $$

In particular, for \( r=2 \), assuming \( \epsilon = O\left( 2^{-n}\right) \), we have reproved (see footnote 4) the CCA security for \( {2}\hbox {-}\textsf {LRW2} \) up to \( 2^{3n/4} \) queries. A proof of this corollary is available in the full version of this paper [24].
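As a worked check of the constants, and only under the simplifying assumption that \( \epsilon = 2^{-n} \) exactly (the statement above assumes only \( \epsilon = O\left( 2^{-n}\right) \)), the bound of Corollary 6.2 collapses to three terms:

$$\begin{aligned} 2q^2\epsilon ^{1.5} + \frac{4q^4\epsilon ^2}{2^{n}}+\frac{32q^4\epsilon }{2^{2n}} + \frac{13q^4}{2^{3n}} + 2q^2\epsilon ^2 + \frac{2q^2}{2^{2n}} = \frac{2q^2}{2^{1.5n}} + \frac{49q^4}{2^{3n}} + \frac{4q^2}{2^{2n}}. \end{aligned}$$

The first two terms become constant at \( q = 2^{3n/4} \), since \( q^2/2^{1.5n} = q^4/2^{3n} = 1 \) there, while the third is still \( 4 \cdot 2^{-n/2} \); this is where the \( 2^{3n/4} \) query threshold comes from.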

7 Conclusion and Future Directions

In this paper, we gave a birthday-bound CCA distinguisher on TNT, thereby completely invalidating its beyond-the-birthday-bound security claims. Further, we showed that our attack is tight by reestablishing birthday-bound security for TNT and its single-keyed variant.

In addition, we showed that by adding just one more block cipher call, the security can be amplified to 3n/4-bit even in the CCA setting. We note that our generalization of the cascaded LRW constructions could be of independent interest.

Open Problems: This work opens several new research avenues in (block cipher-based) TBC constructions. Some prominent problems that are worth exploring include:

  1. Optimal LRW Construction for BBB Security: \( {4}\hbox {-}\textsf {LRW1} \) makes 4 block cipher calls. Similarly, \( {2}\hbox {-}\textsf {LRW2} \) with block cipher-based hash functions also requires 4 calls. This raises a natural question regarding their optimality. In other words, are 4 block cipher calls necessary for BBB security?

  2. Reduced-key Version of \( {4}\hbox {-}\textsf {LRW1} \): \( {4}\hbox {-}\textsf {LRW1} \) needs 4 independent keys. Is it possible to reduce the number of keys from 4 to 3, or 2?

  3. Exact Security of \({4}\hbox {-}\textsf {LRW1} \): We do not have an attack against \( {4}\hbox {-}\textsf {LRW1} \). Neither Mennink’s \( O(\sqrt{n}2^{3n/4}) \)-distinguisher, nor any variant of our \( O(2^{n/2}) \)-distinguisher seems to work. The additional permutation calls seem to help in avoiding these attack strategies. It would be interesting to see whether there exists an attack that matches our bound, or whether the construction is secure beyond 3n/4 bits.

  4. Security of short-tweak TNT: Our attack requires a tweak space of size roughly \( 2^{n/2} \). It is therefore natural to ask whether TNT is still BBB secure when the tweak space is much smaller than \( 2^{n/2} \).

  5. Security of Longer Cascades of LRW: This is a long-standing problem even for the more analyzed case of \( {r}\hbox {-}\textsf {LRW2} \), for \( r \ge 3 \). The best bounds [11, 43] that we have in this case are coupling-based. It is clear from our bound for LRW+ that these bounds are rather loose. It would be interesting to explore the possibility of better security bounds for the general case with a dedicated and tighter analysis.