1 Introduction

A cryptographic hash \(\left( \mathrm {CH}\right) \) maps data from an arbitrary-length domain to a fixed-length range [1, 2, 6–8, 10]. It has many applications, including message verification, password verification, pseudorandom generation, and message authentication [13, 7]. Furthermore, the cryptographic hash is an efficient security primitive for IoT-end devices, RFID, and resource-constrained devices [35–39, 44]. Usually, the internal construction of a \(\mathrm {CH}\) depends on a compression function [16, 17], which is either built from scratch or based on a blockcipher [6, 8, 16, 17, 31]. A blockcipher based compression function is a combination of component functions (Fig. 1), and the component functions follow the 16 modes of the PGV construction known so far [8, 16, 17]. Additionally, the classical Merkle–Damgård structure is used to process a message that is longer than the block size [13]. According to Fig. 1, the message \(\left( M\right) \) is a multiple of the block length, so it is partitioned as \(M=m_{1}||\ldots ||m_{l}\). The message blocks are then fed in as input together with the initial value \(\left( IV\right) \). The function \(F_i\) is the compression function, which is built from a blockcipher or from scratch. Usually, one of the PGV modes is selected as the component function of the compression function [8, 16, 17]. A blockcipher based compression function is more suitable than one built from scratch for constrained and IoT-end devices, because an existing blockcipher implementation can be used instead of a dedicated encryption function [5, 6, 13, 14].

Fig. 1. Basic concept of cryptographic hash [2, 6, 8, 34]
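The iteration described above (partition the message, start from the IV, chain the compression function) can be sketched as follows. This is a minimal illustration, not the construction of any cited scheme: `toy_compress`, `iterate_hash`, and the 16-byte block length are hypothetical names and values, and a truncated SHA-256 call merely stands in for \(F_i\).

```python
import hashlib

BLOCK_BYTES = 16  # hypothetical block length (n = 128 bits)


def toy_compress(chaining: bytes, block: bytes) -> bytes:
    """Stand-in for F_i. NOT a blockcipher based design, only a placeholder."""
    return hashlib.sha256(chaining + block).digest()[:BLOCK_BYTES]


def iterate_hash(message: bytes, iv: bytes) -> bytes:
    """Split M into m_1 || ... || m_l and chain F_i starting from the IV."""
    if len(message) % BLOCK_BYTES != 0:
        # The text assumes M is a multiple of the block length (no padding here).
        raise ValueError("message length must be a multiple of the block length")
    state = iv
    for offset in range(0, len(message), BLOCK_BYTES):
        state = toy_compress(state, message[offset:offset + BLOCK_BYTES])
    return state


digest = iterate_hash(b"A" * 32, iv=bytes(BLOCK_BYTES))
print(digest.hex())
```

A real design would add a padding rule so that arbitrary message lengths can be handled; that step is deliberately left out of the sketch.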

Usually, blockcipher based compression functions are classified as single block-length \(\left( \mathcal {SBL}\right) \) or double block-length \(\left( \mathcal {DBL}\right) \). Because of its short output, the \(\mathcal {SBL}\) construction now has limited application [2, 9, 33]. The \(\mathcal {DBL}\) construction is more reliable because it resists the birthday attack better [2, 13, 16, 18, 21, 28]. Moreover, \(\mathcal {DBL}\) constructions are categorized by the underlying blockcipher as \(\left( n, n\right) \) or \(\left( n, 2n\right) \), where the second parameter is the key size. The \(\left( n, 2n\right) \) blockcipher is preferable because its larger key space gives a higher security bound [6, 8, 13, 20, 23]. Generally, the following parameters indicate the strength of a blockcipher based compression function:

  • security bound \(\left( CR:{\text {collision}} \text { and } PR:{\text {preimage resistance}}\right) \)

  • efficiency-rate \(\left( r\right) \)

  • number of blockcipher calls \(\left( \#E\right) \)

  • key scheduling \(\left( KS\right) \)

  • operational mode \(\left( OM\right) \)

CR is defined as a game in which an adversary tries to find two different inputs that yield the same output, while the advantage of the adversary remains very limited [6, 13, 21]. Under PR, it is infeasible for the adversary to find any message \(m\) such that \(y=F\left( m\right) \), where y is fixed in advance by the adversary [2, 6, 16]. \(\#E\) is the number of blockcipher calls per message-block encryption. KS gives the number of key schedules required to encrypt a single message block [16]. Furthermore, OM stands for the operational mode \(\left( \text {parallel or serial}\right) \) [17, 18]. In addition, the efficiency-rate [6, 15] is defined as:
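A common way to write the efficiency-rate, given here as an assumption rather than as the exact formula of the cited works, divides the number of message bits absorbed per compression call by the total number of bits encrypted in that call. The symbol \(\left| m_i\right| \) (length of one message block) is introduced only for this sketch:

$$\begin{aligned} r = \frac{\left| m_i\right| }{\#E \times n} \end{aligned}$$

Under this reading, a scheme that absorbs an n-bit block with \(\#E=2\) has \(r=1/2\), and a scheme that absorbs a \(\left( 2n-1\right) \)-bit block with \(\#E=2\) and \(n=128\) has \(r=255/256\approx 0.996\), which agrees with the rates quoted in this section.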

Table 1. Result of existing familiar schemes

Motivation. The parameters CR, PR, r, \(\#E\), OM, and KS are vital for any satisfactory blockcipher based compression function [1, 6–8, 13, 21]. First, certain gaps are identified in the current well-known schemes with respect to these parameters, and the importance of these findings for efficient and secure communication is shown. For example, the key scheduling cost is analysed with respect to the construction of the compression function. Usually, 176 bytes are needed for a single key schedule [27], so minimizing key scheduling is common practice. Additionally, the operational mode is crucial for resource-limited devices, where the parallel mode offers the most support with respect to the memory system [29, 30]. Moreover, the efficiency-rate should reach the landmark \(\left( {r=1}\right) \) [6, 13, 15, 21]. Well-known blockcipher compression function schemes include MR, Weimar, Hirose, Tandem, Abreast, Nandi, and ISA09 (Table 1). For example, the CR of the MR scheme is bounded by \(q={2^{126.70}}\) \(\left( q:\text {number of queries}\right) \), but its r is 1/2. The Weimar-DM scheme provides a tight security bound of \(q={2^{126.23}}\) [6], but it uses double key scheduling and has an efficiency-rate of 1/2. The Hirose scheme delivers a marginal security bound of \(q={2^{124.55}}\) but needs only a single key schedule. However, the CR and PR bounds of Tandem-DM and Abreast-DM are not as satisfactory as those of MR, Weimar, and Hirose [23]. Moreover, the efficiency-rate of Tandem-DM and Abreast-DM is 1/2, like that of MR, Weimar, and Hirose [6, 11, 12]. The Nandi scheme is bounded only by \(q=O\left( 2^{2n/3}\right) \), but it provides a higher efficiency-rate \(\left( r=2/3\right) \) [20]. Likewise, the ISA09 construction provides a better efficiency-rate \(\left( r={2/3}\right) \) [21].
According to the above discussion and Table 1, most of the existing schemes have a rigorous security margin. However, the efficiency-rates of the MR, Weimar, Hirose, Tandem, and Abreast constructions are low. The Nandi and ISA09 schemes achieve a higher efficiency-rate, but they require \(KS=3\) and \(\#E=3\) [20, 21], and their OM is serial. Thus, the overall efficiency of the ISA09 and Nandi schemes is not adequate.

Nowadays, an efficient blockcipher compression function is of great importance [6, 8, 13, 33, 34, 40, 41, 44]. The blockcipher is one of the important cryptographic primitives for IoT security solutions according to standards such as ISO/IEC 29192-1, ISO/IEC 29192-2, ISO/IEC 29192-3, and ISO/IEC 29192-4 [42–44]. Generally, IoT-end devices, RFID, and constrained devices are used in the IoT environment [39–42]. These devices need to operate fast, but their major drawbacks are limited memory, power, and processing capacity [37, 38, 42–44]. Therefore, the cryptographic solution should be efficient. In summary, the targets for an efficient blockcipher compression function are as follows:

  • higher efficiency-rate

  • reasonable key scheduling

  • fewer blockcipher calls \(\left( \#E\right) \)

  • parallel operational mode

  • satisfactory security bound

Contribution. In this paper, a blockcipher based compression function is proposed. The component function of the proposed construction follows one of the secure modes of PGV. The contributions of the proposed construction are as follows:

  • efficiency rate, \(r=0.996\)

  • \(KS=2\)

  • \(\#E=2\)

  • Parallel mode

  • CR security bound, \(q=2^{125.84}\) \(\left( q:\text {number of queries}\right) \)

In addition, a comparative study of the proposed construction and current familiar schemes is given through Table 2.

Table 2. Comparison: the proposed scheme and existing familiar schemes [6, 14, 15, 20, 21, 23]

Outline. The basic preliminaries are provided in Sect. 2. The technical details of the proposed scheme are given in Sect. 3. Section 4 analyses the security bound. The results, including a performance analysis, are given in Sect. 5. Finally, the conclusions and future work are provided in Sect. 6.

2 Preliminaries

2.1 Ideal Cipher Model (ICM)

In the ideal cipher model, a blockcipher is denoted \(\mathcal {B}\left( n,k\right) \), where n is the block length and k is the key length. It is a map \(\mathcal {E} : {\left\{ {0,1} \right\} ^n} \times {\left\{ {0,1} \right\} ^k} \rightarrow {\left\{ {0,1} \right\} ^n}\). For each key \(\mathcal {K} \in {\left\{ {0,1} \right\} ^k}\), the replies to forward \(\left( \mathcal {E}\right) \) and backward \(\left( \mathcal {E}^{-1}\right) \) queries follow a random permutation that is independent of the permutations under other keys. Let \(\mathcal {BLOCK}_n^k\) be the set of all blockciphers \(\mathcal {B}\left( n,k\right) \). Under the ideal cipher model, \(\mathcal {E}\) is chosen randomly from \(\mathcal {BLOCK}_n^k\). \(\mathcal {E}\) takes a key and a plaintext as input and returns a ciphertext, whereas \({\mathcal {E}}^{-1}\) takes a key and a ciphertext and returns a plaintext. Each query and its response through \({\mathcal {E}}\) or \({\mathcal {E}}^{-1}\) are stored as a triple \(\left( k_i, x_i, y_i\right) \). Moreover, the adversary is not allowed to make any duplicate query [17, 22].
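An ideal cipher oracle of this kind is often simulated by lazy sampling: each key gets its own permutation, filled in only as queries arrive. The sketch below is an illustration under that assumption; the class name `LazyIdealCipher`, its methods, and the seeded generator are hypothetical and serve only to make the example repeatable.

```python
import random


class LazyIdealCipher:
    """Toy ideal cipher: an independent random permutation per key, sampled lazily."""

    def __init__(self, n_bits: int, seed: int = 0):
        self.size = 1 << n_bits
        self.rng = random.Random(seed)  # seeded only so the sketch is repeatable
        self.fwd = {}  # key -> {plaintext: ciphertext}
        self.bwd = {}  # key -> {ciphertext: plaintext}
        self.log = []  # query history as triples (k_i, x_i, y_i)

    def _fresh(self, used) -> int:
        # Rejection-sample a value that is not yet assigned under this key.
        while True:
            value = self.rng.randrange(self.size)
            if value not in used:
                return value

    def enc(self, key: int, x: int) -> int:
        """Forward query E(key, x)."""
        table = self.fwd.setdefault(key, {})
        inverse = self.bwd.setdefault(key, {})
        if x not in table:
            y = self._fresh(inverse)  # y must be an unused ciphertext
            table[x], inverse[y] = y, x
            self.log.append((key, x, y))
        return table[x]

    def dec(self, key: int, y: int) -> int:
        """Backward query E^{-1}(key, y)."""
        table = self.fwd.setdefault(key, {})
        inverse = self.bwd.setdefault(key, {})
        if y not in inverse:
            x = self._fresh(table)  # x must be an unused plaintext
            table[x], inverse[y] = y, x
            self.log.append((key, x, y))
        return inverse[y]


cipher = LazyIdealCipher(n_bits=8)
y = cipher.enc(key=3, x=5)
print(cipher.dec(key=3, y=y) == 5)  # forward and backward answers agree
```

Repeating a query returns the stored answer without adding to `log`, which mirrors the rule that duplicate queries give the adversary nothing new.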

2.2 Security Definition

Several properties are used to analyse the security of a blockcipher compression function. For example, collision resistance \(\left( {CR}\right) \), preimage resistance \(\left( {PR}\right) \), the padding oracle attack, and the initial value \(\left( {CV}\right) \) attack are the most familiar [6, 13, 23, 24]. This section briefly discusses the collision and preimage resistance of the blockcipher compression function [16–19].

Collision Resistance of Compression Function. The adversary \({\mathcal {A}}\) is given access to the blockcipher oracle \(\left( \mathcal {E} \in \mathcal {BLOCK}_n^k\right) \) and outputs two inputs of the compression function, \(\left( {{\alpha _1},{\beta _1},{m_1}} \right) \) and \(\left( {{\alpha _2},{\beta _2},{m_2}} \right) \). An experiment \({\text {Exp-col}}{{\text {l}}_{{f_{\mathcal {E}}}}}\left( \mathcal {A} \right) \) is then defined, whose output is 1 iff the following condition holds.

$$\begin{aligned} {f_\mathcal{E}}\left( {{\alpha _1},{\beta _1},{m_1}} \right) = {f_\mathcal{E}}\left( {{\alpha _2},{\beta _2},{m_2}} \right) \wedge \left\{ {\left( {{\alpha _1},{\beta _1},{m_1}} \right) \ne \left( {{\alpha _2},{\beta _2},{m_2}} \right) } \right\} , \end{aligned}$$

where \(f_\mathcal {E}\) is a blockcipher compression function, \(\alpha , \beta \) are chaining values, and \(m\) is the message. The advantage of the adversary in finding a collision under \(f_\mathcal {E}\) is defined as \(\mathrm{{Adv}}_{{f_{\mathcal {E}}}}^{{\text {coll}}}\left( \mathcal {A} \right) = \Pr \left[ {{\text {Exp-col}}{{\text {l}}_{{f_{\mathcal E}}}}\left( {\mathcal A} \right) = 1} \right] \), where coll stands for collision. The advantage of \(\mathcal {A}\) is quantified in terms of the number of queries it is allowed to ask the blockcipher oracle. Therefore, \(\mathrm{{Adv}}_{{f_{\mathcal {E}}}}^{\mathrm{{coll}}}\left( q \right) = {\max _\mathcal{A}}\left\{ {\mathrm{{Adv}}_{{f_\mathcal{E}}}^{\mathrm{{coll}}}\left( \mathcal{A} \right) } \right\} \), where the maximum is taken over all adversaries that ask at most q oracle queries [16, 19].

Preimage Resistance of Compression Function. The adversary \({\mathcal {A}}\) has access to the blockcipher oracle \(\left( \mathcal {E} \in \mathcal {BLOCK}_n^k\right) \). Furthermore, \({\mathcal {A}}\) selects the values \(\alpha , \beta \) at random before making any query to the blockcipher oracle. Let the feedback of the oracle be \(\alpha ' \text { and }\beta '\) in response to the adversarial queries. In addition, assume an experiment \({\text {Exp-pr}}{\mathrm{{e}}_{{f_\mathcal{E}}}}\left( \mathcal{A} \right) \), where pre stands for preimage. The output of this experiment is 1 iff:

$$\begin{aligned} {f_\mathcal{E}}\left( {{\alpha _1},{\beta _1},{m_1}} \right) = \left( {\alpha ,\beta } \right) , \end{aligned}$$

where \(f_\mathcal {E}\) is a blockcipher compression function, \(\alpha _1, \beta _1\) are chaining values, and \(m_1\) is the message. The advantage of the adversary in finding a preimage under \(f_\mathcal {E}\) is defined by \(\mathrm{{Adv}}_{{f_\mathcal{E}}}^{\mathrm{{pre}}}\left( \mathcal{A} \right) = \Pr \left[ {{\text {Exp-pr}}{\mathrm{{e}}_{{f_\mathcal{E}}}}\left( \mathcal{A} \right) = 1} \right] \). Moreover, the advantage of \(\mathcal {A}\) is evaluated in terms of the total number of queries. Therefore, \(\mathrm{{Adv}}_{{f_\mathcal{E}}}^{\mathrm{{pre}}}\left( q \right) = {\max _{\mathcal {A}}}\left\{ {\mathrm{{Adv}}_{{f_{\mathcal {E}}}}^{\mathrm{{pre}}}\left( \mathcal {A} \right) } \right\} \), where the maximum is taken over all adversaries that ask at most q oracle queries [16, 19].

3 Proposed Scheme

Usually, the efficiency-rate can be increased by using three blockcipher calls, as in Nandi and ISA09 [20, 21]. Another useful method feeds a pair of chaining values together with the message into two blockciphers. This method is used in MDC-2 and later in MDC-4 [4, 9, 32, 45]. The proposed construction is inspired by, and follows, the construction of MDC-2 and MDC-4 [4, 9, 45]. However, this kind of construction \(\left( \text {MDC-2, 4}\right) \) has a security drawback. In MDC-2, two chaining values are used as input, and the message is common to the two blockciphers. There is no dependency between the two input chaining values; in other words, the computations of the two blockciphers used in the compression function are completely isolated. For example, given the input and output \(\left( x_1, y_1 \rightarrow x_2, y_2\right) \), if the input is swapped then the new output is the swapped old output \(\left( y_1, x_1 \rightarrow y_2, x_2\right) \). The construction therefore suffers from a symmetry property. For this reason, certain changes are made in the proposed construction (Fig. 2). For example, a constant bit, 0 for one blockcipher and 1 for the other, is used as part of each key (a common practice in cryptography [14]). Hence, the attacker cannot predict the output chaining values, even under the assumption that the attacker can freely alter the input chaining values and the message. This breaks the symmetry of the proposed scheme, since x||y and y||x are treated as two different values. Moreover, the scheme is secure against generic attacks because of the ideal cipher model primitive [26].
Additionally, MDC-2 and MDC-4 are \(\left( n, n\right) \)-bit \(\mathcal {DBL}\) hash functions with efficiency-rates 1/2 and 1/4 [24], whereas the proposed scheme is based on an \(\left( n, 2n\right) \) blockcipher. Furthermore, its component function differs from that of MDC-2 and MDC-4. The proposed scheme compresses 4n bits into 2n bits, whereas MDC-2 and MDC-4 compress 3n bits into 2n bits. The proposed scheme is of type-1 \(\left( \text {from Stam's conjecture}\right) \), where the two blockciphers \({\mathcal {E}_{\text {l}},\,\mathcal {E}_{\text {r}}}\) are distinct and independent under the ICM [8, 16]. In general, the proposed scheme is defined as a variant of MDC-2 and MDC-4.
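The symmetry property discussed above, and the effect of a lane-separating constant in the key, can be illustrated with a toy two-lane function. This is a simplified sketch and not MDC-2 itself: `toy_cipher` is a truncated SHA-256 call standing in for a blockcipher, the 64-bit lane width is arbitrary, and a whole constant byte stands in for the constant bit.

```python
import hashlib

N_BYTES = 8  # toy lane width (64 bits), chosen only for the demonstration


def toy_cipher(key: bytes, block: bytes) -> bytes:
    """Keyed stand-in for one blockcipher call (a hash, so not invertible)."""
    return hashlib.sha256(key + b"|" + block).digest()[:N_BYTES]


def xor(u: bytes, v: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(u, v))


def symmetric_lanes(a: bytes, b: bytes, m: bytes):
    # Both lanes use identical key material, so the lanes are interchangeable.
    return xor(toy_cipher(m, a), a), xor(toy_cipher(m, b), b)


def separated_lanes(a: bytes, b: bytes, m: bytes):
    # A different constant in each key makes the two lanes distinct functions.
    return xor(toy_cipher(m + b"\x00", a), a), xor(toy_cipher(m + b"\x01", b), b)


a, b, m = b"\x01" * N_BYTES, b"\x02" * N_BYTES, b"message!"

x, y = symmetric_lanes(a, b, m)
print(symmetric_lanes(b, a, m) == (y, x))  # swapped input gives swapped output

x, y = separated_lanes(a, b, m)
print(separated_lanes(b, a, m) == (y, x))  # holds only with negligible probability
```

In the first function the swap relation holds for every input by construction; in the second it would require two independent 64-bit coincidences, which is why the constant is enough to break the symmetry in this toy setting.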

Fig. 2. Proposed scheme \(\left( \text {variant of MDC-2, 4}\right) \)

Definition 1

Let \({\mathcal {E} \in \mathcal {BLOCK}_n^k}\) be a blockcipher with a k-bit key and n-bit block length, \(\mathcal {E}_{\text {l},\text {r}}: {\left\{ {0,1} \right\} ^k} \times {\left\{ {0,1} \right\} ^n} \rightarrow {\left\{ {0,1} \right\} ^n}\). The double block-length \(\left( \text {dbl}\right) \) cipher \(\mathcal {E}^{\text {dbl}}: {\left\{ {0,1} \right\} ^k} \times {\left\{ {0,1} \right\} ^{2n}} \rightarrow {\left\{ {0,1} \right\} ^{2n}}\) is defined as the parallel call of the two independent blockciphers \({\mathcal {E}_{\text {l},\,\text {r}}}\) such that,

$$\begin{array}{*{20}{l}} {{x_i} \leftarrow {\mathcal{E}_{\mathrm{{l}},\left( {{{\overline{m} }_i}||c} \right) }}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) } } \right) }\\ {{y_i} \leftarrow {\mathcal{E}_{\mathrm{{r}},\left( {{m_i}||\bar{c}} \right) }}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) } \end{array}$$

where the parameters are \({m_i} \in {\left\{ {0,1} \right\} ^{2n - 1}},\left( {a,b,x,y} \right) \in {\left\{ {0,1} \right\} ^n}\), \(l\left( {{m_i}} \right) \in {\left\{ {0,1} \right\} ^n}\) denotes the n least significant bits of \({m_i}\), and \(c = \left\{ 1 \right\} \). Thus, the final output is \(f_\mathcal {E} \left( {{a_i},{b_i}} \right) \), where

$$\left\{ {\begin{array}{*{20}{l}} {{a_i} \leftarrow {x_i} \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c}\\ {{b_i} \leftarrow {y_i} \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}} \end{array}} \right. $$

Definition 2

Let \(f_\mathcal {E}: {\left\{ {0,1} \right\} ^{k}} \times {\left\{ {0,1} \right\} ^{2n}} \rightarrow {\left\{ {0,1} \right\} ^{2n}}\) be a blockcipher based compression function such that \(\left( {{a_i},{b_i}} \right) = f_\mathcal {E}\left( {{a_{i-1}},{b_{i-1}},{m_i}} \right) \), where \({a_i} \in {\left\{ {0,1} \right\} ^n}\), \({b_i} \in {\left\{ {0,1} \right\} ^n}\), \({m_i} \in {\left\{ {0,1} \right\} ^{2n - 1}}\), and \(c \in \left\{ {0,1} \right\} \). Then \({f_\mathcal {E}}\) is built from the ideal blockcipher \(\left( \mathcal{E} \right) \) as:

$${\left[ {\begin{array}{*{20}{l}} {{a_i} = {f_\mathrm{{l}}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) },{{\bar{m}}_i}||c} \right) \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c \leftarrow }\\ {{\mathcal{E}_\mathrm{{l}}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) },{{\bar{m}}_i}||c} \right) \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c} \end{array}} \right] }$$
$${\left[ {\begin{array}{*{20}{l}} {{b_i} = {f_\mathrm{{r}}}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) ,{m_i}||\bar{c}} \right) \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c} \leftarrow }\\ {{\mathcal{E}_\mathrm{{r}}}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) ,{m_i}||\bar{c}} \right) \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}} \end{array}} \right] }$$
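The two definitions can be traced in code on a toy block length. The sketch below is one reading of the equations under stated assumptions: the helper names `make_toy_cipher` and `compress` are hypothetical, the blockciphers are replaced by lazily sampled random permutations, the overline is read as bitwise complement, and the single bit \(c\) is assumed to be XORed into the least significant bit of the lane.

```python
import random


def make_toy_cipher(n_bits: int, seed: int):
    """Return E(key, x): a lazily sampled random permutation per key."""
    rng = random.Random(seed)
    tables = {}  # key -> (forward map, set of outputs already used)

    def E(key: int, x: int) -> int:
        fwd, used = tables.setdefault(key, ({}, set()))
        if x not in fwd:
            y = rng.randrange(1 << n_bits)
            while y in used:
                y = rng.randrange(1 << n_bits)
            fwd[x] = y
            used.add(y)
        return fwd[x]

    return E


def compress(E_l, E_r, a_prev: int, b_prev: int, m: int, n: int, c: int = 1):
    """One call of the two-lane compression function on a (2n-1)-bit block m."""
    lane_mask = (1 << n) - 1
    msg_mask = (1 << (2 * n - 1)) - 1
    assert 0 <= m <= msg_mask and c in (0, 1)
    c_bar = c ^ 1
    l_m = m & lane_mask                 # l(m): the n least significant bits of m
    u = a_prev ^ l_m                    # left-lane input before complementing
    v = b_prev ^ l_m                    # right-lane input
    key_l = ((~m & msg_mask) << 1) | c  # complemented message || c  (2n-bit key)
    key_r = (m << 1) | c_bar            # message || complement of c (2n-bit key)
    x = E_l(key_l, ~u & lane_mask)      # left lane encrypts the complemented input
    y = E_r(key_r, v)
    # Assumption: the constant bit is XORed into the least significant bit.
    return x ^ u ^ c, y ^ v ^ c_bar


n = 8
E_l, E_r = make_toy_cipher(n, seed=1), make_toy_cipher(n, seed=2)
a1, b1 = compress(E_l, E_r, a_prev=0x3C, b_prev=0xA5, m=0x1234, n=n)
print(hex(a1), hex(b1))
```

Both lanes read only their own chaining value and the shared message block, so the two cipher calls can run in parallel, which is the operational mode claimed for the scheme.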

4 Security Analysis

The security proof of the proposed scheme is given in the ICM [16, 17], where \(\mathcal {A}\) is not allowed to make any duplicate query. For example, the adversary does not make the query \({\mathcal {E}\left( {k},{x}\right) }={y}\) if the query \({\mathcal {E}^{-1}\left( {k}, {y}\right) = {x}}\) is already in the query storage \(\left( \mathcal {Q}\right) \). The adversary \(\mathcal {A}\) searches for a collision between a pair of different inputs \(\left( \text {queries}\right) \) through the blockcipher oracle. Additionally, \(\mathcal {A}\) tries to find an output of the compression function that collides with the initial chaining value. In the preimage attack, \(\mathcal {A}\) selects \(\alpha ', \beta '\) at random and tries to find \(f\left( \alpha ,\beta ,m\right) =\left( \alpha ',\beta '\right) \). The advantage of \(\mathcal {A}\) in either task is very limited.

4.1 Collision Security Analysis

An adversary \(\mathcal {A}\) has access to a blockcipher oracle for finding a collision. Each query \({Q_i}\) and its response form a triplet \((m:\text {message},\,k: \text {key},\, c: \text {ciphertext})\). In the i-th iteration \(\left( i \le q\right) \), the query is either a forward query \({Q_i}\in \left\{ \left( m, k\right) =c\right\} \) or a backward query \({Q_i}\in \left\{ \left( c, k\right) =m\right\} \). Each \({Q_i}\) is stored in the query storage \({\mathcal {Q}} = \left( {{Q_1},{Q_2},...,{Q_i}} \right) \). Under these circumstances, the target of \(\mathcal {A}\) is to find,

$$\begin{aligned} {\left. {{f_\mathcal{E}}\left( {{m_i},{k_i},{c_i}} \right) = {f_\mathcal{E}}\left( {{m_j},{k_j},{c_j}} \right) } \right| \because \left( {{m_i},{k_i},{c_i}} \right) \ne \left( {{m_j},{k_j},{c_j}} \right) \wedge \left( {i \ne j} \right) } \end{aligned}$$
(1)

According to the definition of the proposed scheme, Eq. 1 is re-defined as:

$$\begin{aligned} \left. {{f_\mathcal{E}}\left( {{a_i},{b_i},{m_i}} \right) = {f_\mathcal{E}}\left( {{a_j},{b_j},{m_j}} \right) } \right| \because \left( {{a_i},{b_i},{m_i}} \right) \ne \left( {{a_j},{b_j},{m_j}} \right) \wedge \left( {i \ne j} \right) \end{aligned}$$
(2)

Theorem 1

Let \(f_\mathcal {E}\) be a double block-length compression function (Definitions 1 and 2), and let \({\mathcal {A}}\) be an adversary that tries to find a collision \(\left( \text {coll}\right) \) under \(f_\mathcal {E}\) with q pairs of queries. Then the advantage of \({\mathcal {A}}\) is bounded by,

$$\begin{aligned} {\text {Adv}}_{f_{\mathcal E}}^{{\text {coll}}}\left( q \right) \le \frac{{6{q^2} - 2q}}{{{{\left( {2^{n} - q} \right) }^2}}} \end{aligned}$$
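To get a feel for the bound, it can be evaluated numerically. The sketch below solves the bound for the number of query pairs q at which the right-hand side reaches a chosen threshold. The threshold 1/2 and the block length n = 128 are assumptions made for the illustration, and the function names are hypothetical.

```python
import math


def coll_bound(q: float, n: int) -> float:
    """Right-hand side of the collision bound, evaluated in floating point."""
    N = 2.0 ** n
    return (6 * q * q - 2 * q) / (N - q) ** 2


def queries_for_advantage(target: float, n: int) -> float:
    """Positive root q of (6q^2 - 2q) / (2^n - q)^2 = target."""
    N = 2.0 ** n
    # Rearranged as a quadratic: (6 - t) q^2 + (2 t N - 2) q - t N^2 = 0
    A = 6.0 - target
    B = 2.0 * target * N - 2.0
    C = -target * N * N
    return (-B + math.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)


q = queries_for_advantage(0.5, n=128)
print(f"log2(q) = {math.log2(q):.2f}")
```

With these assumed parameters the root is about \(2^{125.84}\), which agrees with the CR bound listed among the contributions.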

Proof

An adversary \(\mathcal {A}\) makes relevant queries to the blockcipher oracle, where the number of queries is limited to q. For any i-th query, the replies \({x_i}\) and \({y_i}\) are returned at random by the blockcipher oracle. The main difficulty is to determine the size of the set from which these fresh values come. Three possible incidents can cause a collision-hit in any i-th iteration, and they are grouped under two targets \(\left( {\mathcal {TAR}}1,\, {\mathcal {TAR}}2\right) \). \({\mathcal {TAR}}1\) covers the first incident, whose goal is to find a collision between two distinct queries \(\left( j<i\right) \). \({\mathcal {TAR}}2\) covers the second and third incidents: in the second, \(\mathcal {A}\) aims to find a collision through a single query, and in the third, \(\mathcal {A}\) looks for a collision against the initial chaining values. Finally, three phases, \(\mathcal {QUERY}\), \(\mathcal {RESPONSE}\), and \(\mathcal {CHECK}\), are defined under \({\mathcal {TAR}}1\) and \({\mathcal {TAR}}2\). In the \(\mathcal {QUERY}\) phase, \(\mathcal {A}\) is allowed to ask a query to the blockcipher oracle. The corresponding feedback is assigned in the \(\mathcal {RESPONSE}\) phase, and a collision is checked in the \(\mathcal {CHECK}\) phase.

figure a

Collision probability based on the first incident \(\left( {\mathcal {TAR}}1\right) \). In iteration i, a pair of queries is executed that returns two distinct outputs. According to Algorithm 1, a collision can occur between two different query-pairs after any i-th \(\left( j<i<q\right) \) iteration. For example, the query pair of the j-th iteration is:

$$\left[ {\begin{array}{*{20}{l}} {{a_j} \leftarrow {\mathcal{E}_{\mathrm{{l}},{{\bar{m}}_j}||c}}\left( {\overline{{a_{j - 1}} \oplus l\left( {{m_j}} \right) } } \right) \oplus \left( {{a_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus c,}\\ {{b_j} \leftarrow {\mathcal{E}_{\mathrm{{r}},{m_j}||\bar{c}}}\left( {{b_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus \left( {{b_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus \bar{c}} \end{array}} \right] $$

Moreover, the query responses in the i-th \(\left( j<i\right) \) iteration are \({a_i} \leftarrow {\mathcal{E}_{\mathrm{{l}},{{\bar{m}}_i}||c}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) } } \right) \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c\) and \({{b_i} \leftarrow {\mathcal{E}_{\mathrm{{r}},{m_i}||\bar{c}}}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}}\). Let \({^{{\mathcal {TAR}}1}{\mathcal {C}_i}}\) be the event that the adversary finds a collision between two different iterations \(\left( j<i\le q\right) \). Thus, Eq. 2 is re-defined as:

$$\begin{aligned} \left\{ {\begin{array}{*{20}{l}} {\left\{ {{a_i} \leftarrow \left( {c \oplus {a_{i - 1}} \oplus l\left( {{m_i}} \right) \oplus {x_i}} \right) } \right\} = }\\ {\left\{ {{a_j} \leftarrow \left( {c \oplus {a_{j - 1}} \oplus l\left( {{m_j}} \right) \oplus {x_j}} \right) } \right\} } \end{array} \vee \left\{ {\begin{array}{*{20}{l}} {\left\{ {{a_i} \leftarrow \left( {c \oplus {a_{i - 1}} \oplus l\left( {{m_i}} \right) \oplus {x_i}} \right) } \right\} = }\\ {\left\{ {{b_j} \leftarrow \left( {\bar{c} \oplus {b_{j - 1}} \oplus l\left( {{m_j}} \right) \oplus {y_j}} \right) } \right\} } \end{array}} \right. } \right. \end{aligned}$$
(3)
$$\begin{aligned} \left\{ {\begin{array}{*{20}{l}} {\left\{ {{b_i} \leftarrow \left( {\bar{c} \oplus {b_{i - 1}} \oplus l\left( {{m_i}} \right) \oplus {y_i}} \right) } \right\} = }\\ {\left\{ {{a_j} \leftarrow \left( {c \oplus {a_{j - 1}} \oplus l\left( {{m_j}} \right) \oplus {x_j}} \right) } \right\} } \end{array} \vee } \right. \left\{ {\begin{array}{*{20}{l}} {\left\{ {{b_i} \leftarrow \left( {\bar{c} \oplus {b_{i - 1}} \oplus l\left( {{m_i}} \right) \oplus {y_i}} \right) } \right\} = }\\ {\left\{ {{b_j} \leftarrow \left( {\bar{c} \oplus {b_{j - 1}} \oplus l\left( {{m_j}} \right) \oplus {y_j}} \right) } \right\} } \end{array}} \right. \end{aligned}$$
(4)

From Eqs. 3 and 4, the probability of a collision hit under the event \({^{{\mathcal {TAR}}1}{\mathcal {C}_i}}\) is \(\frac{{2(i - 1)}}{{{{\left( {{2^n} - \left( {i - 1} \right) } \right) }^2}}}\) \(\left( \text {when}\,j <i \le q\right) \). Therefore, the probability of a single event under \({\mathcal {TAR}}1\) is:

Let \({^{{\mathcal {TAR}}1}{\mathcal {C}}}\) be the event that a colliding pair is found under \(f_{\mathcal {E}}\) within q pairs of queries. Hence,

$$\begin{aligned} \Pr \left[ {{}^{\mathcal{T}\mathcal{A}\mathcal{R}1}{\mathcal{C}}} \right] \le \sum \limits _{i = 2}^q {\Pr } \left[ {{}^{\mathcal{T}\mathcal{A}\mathcal{R}1}{\mathcal{C}_i}} \right] \le \sum \limits _{i = 2}^q \frac{{2 \times 2 \times \left( {i - 1} \right) }}{{{{\left( {{2^n} - i} \right) }^2}}} \le \frac{{2{q^2} - 2q}}{{{{\left( {{2^n} - q} \right) }^2}}} \end{aligned}$$
(5)

Collision probability based on the second and third incidents \(\left( {\mathcal {TAR}}2\right) \). Let \(a_i, b_i\) be the output of the compression function \(\left( i<q\right) \), where

$$\left\{ {\left( {{a_i} \leftarrow {x_i} \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c} \right) ,\left( {{b_i} \leftarrow {y_i} \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}} \right) } \right\} $$

Hence, a collision occurs when \(a_i=b_i\). Let \({^{{\mathcal {TAR}}2}{\mathcal {C}_i}}\) be the collision event for this condition in the check phase of iteration \(i < q\). Furthermore, a collision can also occur with the initial chaining values: the output pair \(a_{i},\,b_{i}\) of the proposed scheme may collide with the initial chaining values \(\left( a_0,\,b_0\right) \) at any phase of the query process. Therefore, the conditions for a collision-hit against the initial chaining values are \(\left\{ {{a_i} = \left( {{a_0}} \right) ,\left( {{b_0}} \right) } \right\} \vee \left\{ {{b_i} = \left( {{a_0}} \right) ,\left( {{b_0}} \right) } \right\} \).

figure b

Hence, the probability of a collision under these two incidents is at most \(1/({2^n}-i) \times 2 \times 2/({2^n}-i)\). Finally, the probability of these two incidents under the event \({^{{\mathcal {TAR}}2}{\mathcal {C}}}\) for q pairs of queries is:

$$\begin{aligned} \Pr \left[ {{}^{\mathcal{T}\mathcal{A}\mathcal{R}2}\mathcal{C}} \right]&= \Pr \left[ {{}^{\mathcal{T}\mathcal{A}\mathcal{R}2}{\mathcal{C}_1} \vee \ldots \vee {}^{\mathcal{T}\mathcal{A}\mathcal{R}2}{\mathcal{C}_q}} \right] \le \sum \limits _{i = 1}^q {\Pr \left[ {{}^{\mathcal{T}\mathcal{A}\mathcal{R}2}{\mathcal{C}_i}} \right] } \\&= \sum \limits _{i = 1}^q {\frac{1}{{\left( {{2^n} - i} \right) }}} \times \frac{{2 \times 2}}{{\left( {{2^n} - i} \right) }} \le \frac{q}{{\left( {{2^n} - q} \right) }} \times \frac{{2 \times 2 \times q}}{{\left( {{2^n} - q} \right) }} = \frac{{4{q^2}}}{{{{\left( {{2^n} - q} \right) }^2}}} \end{aligned}$$
(6)

Adding the bounds of Eqs. 5 and 6 yields Theorem 1.
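The arithmetic behind this last step can be spelled out as a short check. It uses the arithmetic series \(\sum _{i=2}^{q}\left( i-1\right) =q\left( q-1\right) /2\) and replaces each denominator by the smaller common value \(\left( 2^n-q\right) ^2\). Reading the \({\mathcal {TAR}}1\) term as a sum over i is an assumption, chosen because it reproduces the closed form of Eq. 5:

$$\begin{aligned} \sum \limits _{i = 2}^q \frac{2 \times 2 \times \left( i - 1 \right) }{\left( 2^n - i \right) ^2}&\le \frac{4}{\left( 2^n - q \right) ^2} \cdot \frac{q\left( q - 1 \right) }{2} = \frac{2q^2 - 2q}{\left( 2^n - q \right) ^2}, \\ \frac{2q^2 - 2q}{\left( 2^n - q \right) ^2} + \frac{4q^2}{\left( 2^n - q \right) ^2}&= \frac{6q^2 - 2q}{\left( 2^n - q \right) ^2}. \end{aligned}$$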

4.2 Preimage Security Analysis

The standard proof technique of Armknecht et al. is used for the preimage security proof of the proposed scheme [14]. The PR security bounds of MR, Weimar, Hirose, Tandem, and Abreast are also based on [14]. Two important concepts are adopted from [6, 14]: super and normal queries, and the adjacent query-pair. Let \(\mathcal {A}\) randomly pick an output value \(\left( {a',\,b'}\right) \) of the compression function. The target of \(\mathcal {A}\) is then a preimage-hit, that is, to satisfy the condition \(f_\mathcal {E}^\mathrm{{p}}\left( {{a_i},{b_i},m} \right) = \left( {a',b'} \right) \), where \({{a_i},{b_i},m}\) are the input of the compression function and \({{a_i} \ne {b_i}}\).

Theorem 2

Let \(f_\mathcal {E}\) be a double block-length compression function. An adversary \(\mathcal {A}\) is defined for finding a preimage-hit under the \(f_\mathcal {E}\) after q pairs of queries. Hence, the advantage of \(\mathcal {A}\) is bounded by,

Proof

An adversary \(\mathcal {A}\) keeps a query database in the form of,

$$\left[ {\begin{array}{*{20}{l}} {\left\{ {{a_i} \leftarrow {\mathcal{E}_{\mathrm{{l}},\overline{m} ||c}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) } } \right) \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c} \right\} }\\ {\mathrm{{and }}\left\{ {{b_i} \leftarrow {\mathcal{E}_{\mathrm{{r}},m||\bar{c}}}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}} \right\} } \end{array}} \right] $$

In such a fashion, when the number of queries under a key reaches N/2 \(\left( N: \text {oracle size}\left( 2^{n}\right) \right) \), the remaining queries under that key are given to the adversary as free queries [6, 14, 25]. This set of free queries forms the super query database \(\left( \mathcal {SQD}\right) \). On the other hand, the first N/2 queries form the normal query database \(\left( \mathcal {NQD}\right) \) [14]. Additionally, the free queries in the \(\mathcal {SQD}\) are asked by the adversary non-adaptively. Therefore, the conditions for a successful preimage-hit are:

$$\begin{aligned} \begin{array}{l} \left\{ {{a_i} \leftarrow {\mathcal {E}_{\mathrm{{l}},{{\bar{m}}_i}||c}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) } } \right) \oplus \left( {{a_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus c} \right\} ,\\ \left\{ {{a_j} \leftarrow {\mathcal {E}_{\mathrm{{l}},{{\bar{m}}_j}||c}}\left( {\overline{{a_{j - 1}} \oplus l\left( {{m_j}} \right) } } \right) \oplus \left( {{a_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus c} \right\} = \left\{ {\left( {a'} \right) ,\left( {b'} \right) } \right\} \end{array} \end{aligned}$$
(7)
$$\begin{aligned} {\begin{array}{*{20}{l}} \begin{array}{l}\left\{ {{b_i} \leftarrow {\mathcal {E}_{\mathrm{{r}},m||{{\bar{c}}_i}}}\left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \left( {{b_{i - 1}} \oplus l\left( {{m_i}} \right) } \right) \oplus \bar{c}} \right\} , \end{array}\\ {\left\{ {{b_j} \leftarrow {\mathcal {E}_{\mathrm{{r}},m||{{\bar{c}}_j}}}\left( {{b_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus \left( {{b_{j - 1}} \oplus l\left( {{m_j}} \right) } \right) \oplus \bar{c}} \right\} = \left\{ {\left( {a'} \right) ,\left( {b'} \right) } \right\} } \end{array}} \end{aligned}$$
(8)

Equations 7 and 8 can be satisfied either in the domain of a normal query win \(\left( \mathcal {NQW}\right) \) or in that of a super query win \(\left( \mathcal {SQW}\right) \). Therefore, the probability of the preimage-hit is \(\Pr \left[ {{\mathcal {NQW}}} \right] + \Pr \left[ {{\mathcal {SQW}}} \right] \).

figure c

Probability of \(\mathcal {NQW}\). The adversary \(\mathcal {A}\) makes each relevant query independently and receives \({a_i},\,{b_i}\). Furthermore, \(\mathcal {A}\) continues until the size of the query set reaches N/2 [6, 14]. According to the above conditions (7) and (8), the hitting probability is .

If \(\mathcal {A}\) makes a query \({\mathcal {E}_{\text {l},{\overline{m}_i}||c}}\left( {\overline{{a_{i - 1}} \oplus l\left( {{m_i}} \right) } } \right) \) \(\left( \text {left block}\right) \), then the answer for the right block is provided to \(\mathcal {A}\) as a free query because of the adjacent query pair [6, 14]. Thereafter, the set size is \((2^{n}-q)/2\), which gives the probability as . Thus, the probability of the normal query win is:

(9)

Probability of \(\mathcal {SQW}\). The concept of a super query oracle is simple [6, 14]. Once the number of queries reaches N/2, the remaining queries are given to the adversary for free [6, 14]. These queries are then asked by the adversary non-adaptively [14] to find a preimage-hit (Algorithm 3). The preimage-hit either occurs in this domain \(\left( \mathcal {SQD}\right) \) or it does not. Thus, the probability is either 2/N or 0 for any output value of \(a_{i}/b_{i}\). The pair of conditions under \(\mathcal {SQW}\) is:

$$\begin{aligned} \left\{ {{a_i} \leftarrow \left( {l\left( {{m_i}} \right) \oplus {a_{i - 1}} \oplus {x_i}} \right) \oplus c} \right\} = \left( {a'} \right) ,\left( {b'} \right) \end{aligned}$$
(10)
$$\begin{aligned} \left\{ {{b_i} \leftarrow \left( {l\left( {{m_i}} \right) \oplus {b_{i - 1}} \oplus {y_i}} \right) \oplus \overline{c} } \right\} = \left( {a'} \right) ,\left( {b'} \right) \end{aligned}$$
(11)

According to (10), the answer \({a_i}\) comes from a set of size N/2. Hence, the probability is 2/N. By the concept of an adjacent query pair \(\left( \text {free query}\right) \) [6, 14], the answer of the other block \(\left( \text {right block}\right) \) also comes from a set of size N/2. As a result, the total probability of (10) is \({4/N^{2}}\). Similarly, the probability of (11) is \({4/N^{2}}\). The final probability of the \(\mathcal {SQW}\) is evaluated from the number of points for an \(\mathcal {SQW}\), the cost of an \(\mathcal {SQW}\), and the probability of obtaining a preimage-hit:

(12)

Adding the values of (9) and (12) gives the bound of Theorem 2.

5 Result Analysis

5.1 Collision Resistance Analysis

Theorem 1 bounds the probability of a collision hit for the given adversary \({\mathcal {A}}\). The number of queries \(\left( q\right) \) determines the upper bound of the collision security. Hence, we determine the value of q at which the adversarial advantage reaches 1/2 \(\left( \text {birthday attack}\right) \).

Let \(N = {2^n}\) and \({\text {Adv}}_{f_E}^{{\text {coll}}}({\mathcal A}) \le \frac{{6{q^2} - 2q}}{{{{\left( {{2^n} - q} \right) }^2}}}\) [Theorem 1], where \(n=128\). Following the birthday attack [1, 6, 13, 20, 21], set \({\text {Adv}}_{f_\mathcal {E}}^{\text {coll}}\left( {\mathcal A} \right) = \frac{1}{2}\). Solving for q gives \(q \approx {2^{125.84}}\) queries.
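The query count above can be checked numerically. The following is a minimal sketch, assuming only the bound quoted from Theorem 1; the function name `collision_query_threshold` is illustrative and not part of the paper. It uses exact integer bisection to find the smallest q at which the bound reaches 1/2.

```python
import math


def collision_query_threshold(n: int) -> int:
    """Smallest q with (6q^2 - 2q) / (2^n - q)^2 >= 1/2.

    The inequality is checked in cross-multiplied integer form,
    2 * (6q^2 - 2q) >= (2^n - q)^2, so no floating-point rounding occurs.
    The left side grows with q and the right side shrinks for 1 <= q < 2^n,
    so bisection over that range is valid.
    """
    N = 1 << n
    lo, hi = 1, N - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if 2 * (6 * mid * mid - 2 * mid) >= (N - mid) ** 2:
            hi = mid          # bound already reaches 1/2 at mid
        else:
            lo = mid + 1      # bound is still below 1/2
    return lo


q = collision_query_threshold(128)
print(f"q is about 2^{math.log2(q):.2f}")
```

For n = 128 the exponent comes out at roughly 125.84, in line with the value stated above. Ignoring the lower-order term 2q, the same result follows from q/(N − q) = 1/√12.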

5.2 Efficiency-rate

The efficiency-rate of a blockcipher based compression function is defined as \(r = \frac{{\left| m \right| }}{{\left( {n \times \# E} \right) }}\), where |m| is the message length, n is the blocklength, and \({\#E}\) is the number of blockcipher calls. According to Definitions 1 and 2 of the proposed scheme, the efficiency-rate is \(r = 0.996 \Rightarrow r \approx 1\). Fig. 3 compares the proposed scheme with the existing schemes in terms of efficiency-rate.

Fig. 3. Comparison of efficiency-rate
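The efficiency-rate definition \(r = |m| / (n \times \#E)\) can be illustrated with a short sketch. The function name `efficiency_rate` and the parameter values below are hypothetical and chosen only to show how the ratio behaves; they are not the parameters of any scheme in Fig. 3.

```python
from fractions import Fraction


def efficiency_rate(message_bits: int, block_bits: int, cipher_calls: int) -> Fraction:
    """Efficiency-rate r = |m| / (n * #E) as an exact fraction."""
    if block_bits <= 0 or cipher_calls <= 0:
        raise ValueError("block length and number of calls must be positive")
    return Fraction(message_bits, block_bits * cipher_calls)


# Hypothetical example: n = 128, two blockcipher calls per compression.
n = 128
rate_one_block = efficiency_rate(n, n, 2)       # one n-bit message block per two calls
rate_two_blocks = efficiency_rate(2 * n, n, 2)  # two n-bit message blocks per two calls

print(float(rate_one_block), float(rate_two_blocks))
```

With the same number of blockcipher calls, absorbing more message bits per compression raises the rate; a rate close to 1 means nearly one n-bit message block is processed per blockcipher call.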

Table 3. Required memory for key scheduling [20, 21, 27]
Table 4. Required memory for key scheduling [6, 23, 27]
Table 5. Required memory for key scheduling, when \(m=tn\)

5.3 Performance Analysis

This section compares the proposed scheme with existing schemes in terms of memory resources. A single key scheduling requires 176 bytes of memory [27]. As an example, consider the encryption of a 2n-bit message. Tables 3 and 4 are based on the characteristics of the current well-known schemes and the proposed scheme. For any \(\mathcal {DBL}\) compression function, the output is 2n-bit. Therefore, assume that at least \(2n \rightarrow {\gamma }\) bits are required to store the output value \(\left( \text {denoted as}\,\, \mathcal {V}\right) \) of the i-th iteration. In Table 4, the message size is 2n-bit. Hence, the proposed scheme does not need memory to store the output. Next, the above cost (Table 4) is generalized in Table 5 to include the number of iterations \(\left( l\right) \) for a tn-bit message \(\left( t > 2\right) \). Additionally, the proposed scheme is faster than MR, Weimar, Tandem, and Abreast in certain cases \(\left( \text {if}\,\, m > 2n\right) \).
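The 176-byte figure coincides with the size of an expanded AES-128 key (11 round keys of 16 bytes each). The sketch below assumes, purely for illustration, that AES is the underlying blockcipher; the function names and the number of key schedules per compression call are hypothetical and are not taken from Tables 3 to 5.

```python
# Number of AES rounds per key size, as specified in FIPS 197.
AES_ROUNDS = {128: 10, 192: 12, 256: 14}


def aes_expanded_key_bytes(key_bits: int) -> int:
    """Bytes needed to hold all AES round keys: (rounds + 1) keys of 16 bytes."""
    return 16 * (AES_ROUNDS[key_bits] + 1)


def key_schedule_memory(num_key_schedules: int, key_bits: int = 128) -> int:
    """Total bytes if every key schedule is stored at the same time (assumption)."""
    return num_key_schedules * aes_expanded_key_bytes(key_bits)


print(aes_expanded_key_bytes(128))   # expanded AES-128 key size in bytes
print(key_schedule_memory(2))        # hypothetical: two key schedules held at once
```

Under this assumption, the memory cost scales linearly with the number of key schedules a scheme must hold, which is the quantity the comparison in this section focuses on.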

6 Conclusion

This paper studied the gap between the security bound and the efficiency of compression functions for cryptographic hashes. The study indicates that a blockcipher based compression function is more suitable than a scratch based construction as a security solution for IoT-end devices, RfID, and constrained devices. Thus, a more efficient blockcipher based compression function is proposed in this paper. The proposed scheme provides an improved efficiency-rate, fewer blockcipher calls, and a reasonable security bound. It makes two calls to a blockcipher with a 2n-bit key, where the two blockciphers are independent. The proof technique of this scheme relies on the ICM. The proposed scheme supports encryption of fixed-size messages. This property opens a window for new applications in which a variable-length message can be encrypted without padding. Finally, the proposed scheme is secure under one of the PGV modes, and it can be extended to be secure under all PGV modes [17–19].