
1 Introduction

In the affine equivalence problem, the input consists of two functions \(\varvec{F},\varvec{G}\) and the goal is to determine whether they are affine equivalent, and if so, output the equivalence relations. More precisely, if there exist invertible affine transformations (over some field) \(A_1,A_2\) such that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\), output \(A_1,A_2\). Otherwise, assert that \(\varvec{F},\varvec{G}\) are not affine equivalent.

Variants of the affine equivalence problem have been studied in several branches of mathematics and are relevant to both asymmetric and symmetric cryptography. In the context of asymmetric cryptography, the problem was first formalized by Patarin [17] and referred to as isomorphism of polynomials. In this setting \(\varvec{F},\varvec{G}\) are typically of low algebraic degree (mainly quadratic) over some field.

The focus of this work is on the affine equivalence variant in which \(\varvec{F},\varvec{G}\) map between n-bit words and the affine transformations \(A_1,A_2\) are over \(GF(2)^n\). This variant arises in several contexts of symmetric-key cryptography. In particular, it is relevant to the classification and analysis of Sboxes (see [6, 14]) as affine equivalent Sboxes share differential, linear and several algebraic properties (refer to [7] for recent results on this subject). Moreover, algorithms for the affine equivalence problem were applied in [3] to generate equivalent representations of AES and other block ciphers. These algorithms also have cryptanalytic applications and were used to break white-box ciphers (e.g., in [15]). Additionally, solving the affine equivalence problem can be viewed as breaking a generalization of the Even-Mansour scheme [11], which has received substantial attention from the cryptographic community in recent years. The original scheme builds a block cipher from a public permutation \(\varvec{F}\) using two n-bit keys \(k_1,k_2\) and its encryption function is defined as \(\varvec{E}(p) = \varvec{F}(p + k_1) + k_2\) (where addition is over \(GF(2)^n\)). The generalized Even-Mansour scheme replaces the key additions with secret affine mappings and breaking it reduces to solving the affine equivalence problem, as originally described in [3].

The best known algorithms for the affine equivalence problem were presented by Biryukov et al. at EUROCRYPT 2003 [3]. The main algorithm described in [3] has complexity of about \(n^3 2^{2n}\) bit operations, while a secondary algorithm has time complexity of about \(n^3 2^{3n/2}\), but also uses about the same amount of memory. Besides its high memory consumption, another disadvantage of the secondary algorithm of [3] is that it cannot be used to prove that \(\varvec{F}\) and \(\varvec{G}\) are not affine equivalent.

In this paper we devise a new algorithm for the affine equivalence problem whose complexity is about \(n^3 2^n\) bit operations with very high probability whenever \(\varvec{F}\) (or \(\varvec{G}\)) is chosen uniformly at random from the set of all permutations on n-bit words. Our algorithm is also applicable without any modification to arbitrary functions (rather than permutations) and seems to perform similarly on random functions. However we focus on permutations as almost all applications actually require solving the affine equivalence problem for permutations. Since our algorithm can be used to prove that \(\varvec{F}\) and \(\varvec{G}\) are not affine equivalent, it does not share the disadvantage of the secondary algorithm of [3].

As a consequence of our improved algorithm, we solve within several minutes affine equivalence problem instances of size up to \(n=28\) on a single core. Optimizing our implementation and exploiting parallelism would most likely allow solving instances of size at least \(n=40\) using an academic budget. Such instances are out of reach of all previous algorithms for the problem.

Technically, the main algorithm devised in [3] for the affine equivalence problem is a guess-and-determine algorithm (which is related to the “to and fro” algorithm of [18], devised to solve the problem of isomorphism of polynomials) whereas the secondary algorithm is based on collision search (it generalizes Daemen’s attack on the original Even-Mansour cipher [8]). On the other hand, algorithms that use algebraic techniques (such as [5]) are mainly known for the asymmetric variant, in which \(\varvec{F},\varvec{G}\) are functions of low degree, and it is not clear how to adapt them to arbitrary functions.

In contrast to previous algorithms, our approach involves analyzing algebraic properties of \(\varvec{F},\varvec{G}\) which are of high algebraic degree. More specifically, we are interested in the polynomial representation (algebraic normal form or ANF) of each of the n output bits of \(\varvec{F}\) (and \(\varvec{G}\)) as a Boolean function in the n input bits. In fact, we are mainly interested in “truncated” polynomials that include only monomials of degree at least d (in particular, we choose \(d=n-2\)). Each such polynomial can be viewed as a vector in a vector space (with the standard basis of all monomials of degree at least \(d = n-2\)). Therefore we can define the rank of the set of n truncated polynomials for each \(\varvec{F},\varvec{G}\) as the rank of the matrix formed by arranging these polynomials as row vectors. In other words, we associate a rank value (which is an integer between 0 and n) to \(\varvec{F}\) (and to \(\varvec{G}\)) by computing the rank of its n truncated polynomials (derived from its n output bits) as vectors. We first show that if \(\varvec{F},\varvec{G}\) are affine equivalent, their associated ranks are equal.

To proceed, we analyze \(\varvec{F},\varvec{G}\) independently. We derive from \(\varvec{F}\) several functions, each one defined by restricting its \(2^n\) inputs to an affine subspace of dimension \(n-1\). Since each such derived function (restricted to an affine subspace) has an associated rank, we assign to each possible \((n-1)\)-dimensional subspace a corresponding rank. As there are \(2^{n+1}\) possible affine subspaces (such a subspace can be characterized using its orthogonal subspace by a single linear expression over n variables and a free coefficient), we obtain \(2^{n+1}\) rank values for \(\varvec{F}\). These values are collected in the rank table of \(\varvec{F}\), where a rank table entry r stores the set of all affine subspaces (more precisely, their compact representations as linear expressions) assigned to rank r.

The main idea of the algorithm is to compute the rank tables of both \(\varvec{F}\) and \(\varvec{G}\) and then use these tables (and additional more complex structures derived from them) to recover the (unknown) affine transformation \(A_1\), assuming that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). In essence, the rank tables allow us to recover matchings between \((n-1)\)-dimensional affine subspaces that are defined by \(A_1\): an affine subspace S is matched with \(S'\) if \(A_1\) transforms S to the affine subspace \(S'\). Each such matching between S and \(S'\) reveals information about \(A_1\) in the form of linear equations. Hence we aim to use the rank tables to recover sufficiently many such matchings and compute \(A_1\) using linear algebra. Once \(A_1\) is derived, computing \(A_2\) is trivial. The main property of the rank table that we prove and exploit to recover the matchings is that if S is matched with \(S'\), then S appears in the rank table of \(\varvec{G}\) in the same entry r as \(S'\) appears in the rank table of \(\varvec{F}\).

Since the number of \((n-1)\)-dimensional affine subspaces is \(2^{n+1}\), each containing \(2^{n-1}\) elements, a naive approach to computing the rank table (which works independently on each subspace) has complexity of at least \(2^{n+1} \cdot 2^{n-1} = 2^{2n}\). However, using symbolic computation of polynomials, we show how to reduce this complexity to about \(n^3 2^n\) bit operations. While this computational step is easy to analyze, this is not the case for the overall algorithm’s performance. Indeed, its success probability and complexity depend on the monomials of degree at least \(n-2\) of \(\varvec{F}\) and \(\varvec{G}\). In particular, if all n output bits of \(\varvec{F}\) and \(\varvec{G}\) are functions of degree \(n-3\) or lower, they do not contain any such monomials. As a result, all affine subspaces for \(\varvec{F}\) and \(\varvec{G}\) are assigned rank zero and the rank tables of these functions contain no useful information, leading to failure of the algorithm.

When \(\varvec{F}\) (or \(\varvec{G}\)) is chosen uniformly at random from the set of all possible n-bit permutations (or n-bit functions in general), the case that its algebraic degree is less than \(n-2\) is extremely unlikely for \(n \ge 8\). Nevertheless, rigorous analysis of the algorithm seems challenging as its performance depends on subtle algebraic properties of random permutations. To deal with this situation, we make a heuristic assumption about the distribution of high degree monomials in random permutations which enables us to use well-known results regarding the rank distribution of random Boolean matrices. Consequently, we derive the distribution of the sizes of the rank table entries for a random permutation. This distribution and additional properties enable us to show that asymptotically the algorithm succeeds with probability close to 1 in complexity of about \(n^3 2^n\) bit operations. This heuristic analysis is backed up by thousands of experiments on various problem instances of different sizes. Rigorously analyzing the algorithm and extending it to succeed on all functions (or permutations) with probability 1 in the same complexity remain open problems.

The properties of the rank table and the algorithm for computing it are of independent interest. In particular, we propose methods to build experimental distinguishers for block ciphers based on the rank table and a method to efficiently detect high-order differential distinguishers based on the algorithm for its computation. Furthermore, our techniques are relevant to decomposition attacks on the white-box ASASA block cipher instances proposed by Biryukov et al. [2]. In this application, we adapt the algorithm for computing the rank table in order to improve the complexity of the integral attack on ASASA published in [10] from \(2^{3n/2}\) to about \(2^n\) (where n is the block size of the instance).

The rest of the paper is organized as follows. In Sect. 2 we describe some preliminaries and give an overview of the new affine equivalence algorithm in Sect. 3. In Sect. 4 we prove the basic property of rank equality for affine equivalent functions, while in Sect. 5 we define and analyze the matching between \((n-1)\)-dimensional affine subspaces that we use to recover \(A_1\). In Sect. 6 we define the rank table and additional objects used in our algorithm, and describe the relation between these objects for affine equivalent functions. In Sect. 7 we analyze properties of rank tables for random permutations under our heuristic assumption. Then, we describe and analyze the new affine equivalence algorithm in Sect. 8. Next, in Sect. 9, we describe applications of the new algorithm and the rank table structure. Finally, we conclude the paper in Sect. 10.

2 Preliminaries

For a finite set R, denote by |R| its size. Given a vector \(u=(u[1],\ldots ,u[n])\in GF(2)^n\), let wt(u) denote its Hamming weight. Throughout this paper, addition between vectors \(u_1,u_2 \in GF(2)^n\) is performed bit-wise over \(GF(2)^n\).

Multivariate Polynomials. Any Boolean function \(F: \{0,1\}^n \rightarrow \{0,1\}\) can be represented as a multivariate polynomial whose algebraic normal form (ANF) is unique and given as \(F(x[1],\ldots ,x[n])=\sum \limits _{u=(u[1],\ldots ,u[n])\in \{0,1\}^n}\alpha _u M_u\), where \(\alpha _u \in \{0,1\}\) is the coefficient of the monomial \(M_u = \prod _{i=1}^{n}x[i]^{u[i]}\), and the sum is over GF(2). The algebraic degree of the function F is defined as \(deg(F) = max\{wt(u) \; | \; \alpha _u \ne 0\}\).

In several cases it will be more convenient to directly manipulate the representation of F as a multivariate polynomial \(P(x[1],\ldots ,x[n]) = \sum _{u \in \{0,1\}^n}\alpha _u M_u\). Note that unlike F, the polynomial P is not treated as a function but rather as a symbolic object. \(P(x[1],\ldots ,x[n])\) can be viewed as a vector in the vector space spanned by the set of all monomials \(\{M_u \; | \; u \in \{0,1\}^n\}\).

Given a multivariate polynomial \(P(x[1],\ldots ,x[n]) = \sum _{u \in \{0,1\}^n}\alpha _u M_u\) and a positive integer d, define \(P_{(\ge d)}\) by taking all the monomials of P of degree at least d, namely, \(P_{(\ge d)}(x[1],\ldots ,x[n]) = \sum \limits _{u \in \{0,1\}^n \wedge wt(u)\ge d}\alpha _u M_u\). Note that \(P_{(\ge d)}\) can be represented using at most \(\sum _{i=d}^n \binom{n}{i}\) non-zero coefficients.

Given a function \(F: \{0,1\}^n \rightarrow \{0,1\}\) represented by a polynomial \(P(x[1],\ldots ,x[n])\), define \(F_{(\ge d)}: \{0,1\}^n \rightarrow \{0,1\}\) as the function represented by \(P_{(\ge d)}\).
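To make these definitions concrete, here is a minimal Python sketch (our own illustration, not the paper's implementation): it interpolates the ANF coefficients of F from its truth table with the standard binary Möbius transform and then keeps only the coefficients of monomials of degree at least d. The function names and the 3-bit example are ours.

```python
# Minimal illustration (our own sketch, not the paper's implementation):
# interpolate the ANF coefficients of F from its truth table with the binary
# Moebius transform, then keep only coefficients of monomials of degree >= d.

def anf_coefficients(truth_table, n):
    """Return a list a with a[u] = alpha_u, u read as an n-bit exponent vector."""
    a = list(truth_table)
    for i in range(n):                       # in-place binary Moebius transform
        for u in range(1 << n):
            if u & (1 << i):
                a[u] ^= a[u ^ (1 << i)]
    return a

def truncate_at_least(coeffs, n, d):
    """Zero out coefficients of monomials of degree < d (this is P_(>= d))."""
    return [c if bin(u).count("1") >= d else 0 for u, c in enumerate(coeffs)]

# Example: F(x) = x[1]x[2] + x[3] with x[i] taken as bit i-1 of the index.
n = 3
tt = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(1 << n)]
print(truncate_at_least(anf_coefficients(tt, n), n, 2))  # only x[1]x[2] remains
```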

Vectorial Functions and Polynomials. Given a vectorial Boolean function \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), let \(F^{(i)}: \{0,1\}^n \rightarrow \{0,1\}\) denote the Boolean function of its i’th output bit.

We say that a sequence of m polynomials \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{m}\) represents \(\varvec{F}\) if for each \(i \in \{1,2,\ldots ,m\}\), the i’th polynomial \(P^{(i)}\) represents \(F^{(i)}\).

Given a positive integer d, denote \(\varvec{P}_{(\ge d)} = \{P^{(i)}_{(\ge d)}(x[1],\ldots ,x[n])\}_{i=1}^{m}\). The vectorial function \(\varvec{F}_{(\ge d)}: \{0,1\}^n \rightarrow \{0,1\}^m\) is defined analogously.

The algebraic degree \(deg(\varvec{P})\) of \(\varvec{P}\) is defined as the maximal degree of its m polynomials. The algebraic degree \(deg(\varvec{F})\) is defined analogously.

As each \(P^{(i)}\) can be viewed as a vector in a vector space, we define the symbolic rank of \(\varvec{P}\) as the rank of the m vectors \(\{P^{(i)}\}_{i=1}^{m}\). We denote the symbolic rank of \(\varvec{P}\) as \(SR(\varvec{P})\). Note that \(SR(\varvec{P}) \in \mathbb {Z}_{m+1}\).
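As an illustration (again our own sketch, under an ad-hoc monomial-set encoding), the symbolic rank can be computed by writing each polynomial as a 0/1 vector over the monomial basis and performing Gaussian elimination over GF(2):

```python
# Our own illustration (not the paper's code): symbolic rank via Gaussian
# elimination over GF(2).  A polynomial is encoded as a set of monomials and a
# monomial as a frozenset of variable indices.

def symbolic_rank(polys):
    monomials = sorted({m for p in polys for m in p},
                       key=lambda m: (len(m), sorted(m)))
    index = {m: i for i, m in enumerate(monomials)}
    rows = [sum(1 << index[m] for m in p) for p in polys]   # coefficient vectors
    rank = 0
    for col in range(len(monomials)):
        piv = next((r for r in range(rank, len(rows)) if (rows[r] >> col) & 1), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and (rows[r] >> col) & 1:
                rows[r] ^= rows[rank]
        rank += 1
    return rank

# Example: the polynomials x[1]x[2] + x[1]x[3], x[1]x[2], x[1]x[3] have rank 2.
x1x2, x1x3 = frozenset({1, 2}), frozenset({1, 3})
print(symbolic_rank([{x1x2, x1x3}, {x1x2}, {x1x3}]))  # -> 2
```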

Affine Transformations and Affine Equivalence. An affine transformation \(A:\{0,1\}^m \rightarrow \{0,1\}^n\) over \(GF(2)^m\) is defined using a Boolean matrix \(L_{n \times m}\) and a word \(a \in \{0,1\}^n\) as \(A(x) = L(x) + a\) (L(x) is simply matrix multiplication). The transformation is invertible if \(m=n\) and L is an invertible matrix. If \(a=0\), then A is called a linear transformation (such functions are a subclass of affine functions).

Two functions \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) are affine equivalent if there exist two invertible affine transformations \(A_1: \{0,1\}^n \rightarrow \{0,1\}^n\) and \(A_2: \{0,1\}^m \rightarrow \{0,1\}^m\) such that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). It is easy to show that the affine equivalence relation partitions the set of all functions into (affine) equivalence classes. We denote \(\varvec{F} \equiv \varvec{G}\) if \(\varvec{F}\) is affine equivalent to \(\varvec{G}\).

Symbolic Composition. Given \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{m}\), and an affine function \(A_1: \{0,1\}^{n'} \rightarrow \{0,1\}^n\), the composition \(\varvec{P} \circ A_1\) is a sequence of m polynomials in \(n'\) variables. For \(i \in \{1,2,\ldots ,m\}\), the i’th polynomial in this composition is \(P^{(i)} \circ A_1\). It can be computed by substituting each variable x[j] in \(P^{(i)}\) with the (affine) symbolic representation of the j’th output bit of \(A_1\) (and simplifying the outcome to obtain the ANF).

For example, given \(P(x[1],x[2],x[3]) = x[1]x[2] + x[1]x[3] + x[2] + 1\) and \(A_1: \{0,1\}^{2} \rightarrow \{0,1\}^3\) defined by the relations \(x[1] = y[1] + y[2] + 1, x[2] = y[2], x[3] = y[1] + y[2]\), then

$$\begin{aligned} P \circ A_1 = (y[1] + y[2] + 1)(y[2]) + (y[1] + y[2] + 1)(y[1] + y[2]) + y[2] + 1 = \\(y[1]y[2] + y[2] + y[2]) + (y[1] + y[1]y[2] + y[1]y[2] + y[2] + y[1] + y[2]) + y[2] + 1 = \\ y[1]y[2] + y[2] + 1.\end{aligned}$$

Thus, we compose each monomial \(M_u\) with coefficient 1 in P with \(A_1\) to obtain a polynomial expression, add all the expressions and simplify the result. Formally, if we denote P’s \(M_u\) coefficient by \(\alpha _u\), then

$$P \circ A_1 = \sum _{u \in \{0,1\}^n} \alpha _u \cdot (M_u \circ A_1).$$

Note that composition with an affine function does not increase the algebraic degree of the composed polynomial, namely \(deg(\varvec{P} \circ A_1) \le deg(\varvec{P})\).

Analogously, given an affine function \(A_2: \{0,1\}^m \rightarrow \{0,1\}^{m'}\), the composition \(A_2 \circ \varvec{P}\) is a sequence of \(m'\) polynomials in n variables. It can be computed by substituting each variable x[j] in the (affine) symbolic representation of \(A_2\) with \(\varvec{P}^{(j)}\). Equivalently, if \(A_2(x) = L(x) + a\), then \(A_2 \circ \varvec{P}\) can be computed by symbolic matrix multiplication (and addition of a) as \(L(\varvec{P}) + a\). In particular, if \(m' = 1\) and \(a=0\), then \(A_2\) reduces to a vector \(v = (v[1],v[2],\ldots ,v[m]) \in \{0,1\}^m\) and \(v(\varvec{P}) = \sum _{i=1}^m v[i]\varvec{P}^{(i)}\) is a symbolic inner product.

By rules of composition, if \(\varvec{F}\) is represented by \(\varvec{P}\), then \(\varvec{P} \circ A_1\) represents \(\varvec{F} \circ A_1\) (which is a standard composition of functions) and \(A_2 \circ \varvec{P}\) represents \(A_2 \circ \varvec{F}\).
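Since \(\varvec{P} \circ A_1\) represents \(\varvec{F} \circ A_1\), one simple (though not necessarily the most efficient) way to obtain the composed ANF is to evaluate \(\varvec{F} \circ A_1\) as a function and interpolate it with the Möbius transform from the sketch above. The helper below is our own illustrative sketch of this route, with \(A_1(x) = L(x) + a\) given by the columns of L packed into integers.

```python
# Illustrative sketch (our own, not the paper's code): evaluate F o A1 on all
# inputs; interpolating the resulting truth table (e.g. with the Moebius
# transform from the sketch above) then yields the ANF of P o A1.
# A1(y) = L(y) + a, with L given column by column as integer bit masks.

def compose_truth_table(F_tt, L_cols, a, n):
    def A1(y):
        x = a
        for j in range(n):
            if (y >> j) & 1:
                x ^= L_cols[j]
        return x
    return [F_tt[A1(y)] for y in range(1 << n)]
```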

Half-Space Masks and Coefficients. Let \(A:\{0,1\}^{n-1} \rightarrow \{0,1\}^{n}\) be an affine transformation such that \(A(x) = L(x)+a\) for a matrix \(L_{n \times n-1}\) with linearly independent columns. Then the (affine) range of A is an \((n-1)\)-dimensional affine subspace spanned by the columns of L with the addition of a. The subspace orthogonal to the range of A is of dimension 1 and hence spanned by a single non-zero vector \(h \in \{0,1\}^n\). Namely, a vector \(v \in \{0,1\}^n\) is in the range of A if and only if \(h(v+a) = 0\), i.e., v satisfies the linear equation \(h(v) + h(a) = 0\).

Since h partitions the space of \(\{0,1\}^n\) into two halves, we call h the half-space mask (HSM) of A and call the bit h(a) the half-space free coefficient (HSC) of A.

We call the linear subspace spanned by the columns of L the linear range of A. A vector \(v\in \{0,1\}^n\) is in the linear range of A if and only if \(h(v) = 0\).

Canonical Affine Transformations. Given non-zero \(h \in \{0,1\}^n\) and \(c \in \{0,1\}\), there exist many affine transformations whose HSM and HSC are equal to \(h,c\), respectively. We will use the fact (stated formally below) that all affine transformations with an identical affine range are related by composition on the right with an invertible affine transformation.

Fact 1

The affine transformations \(A_1:\{0,1\}^{n-1} \rightarrow \{0,1\}^{n}\) and \(A_2:\{0,1\}^{n-1} \rightarrow \{0,1\}^{n}\) have the same affine range if and only if there exists an invertible affine transformation \(A':\{0,1\}^{n-1} \rightarrow \{0,1\}^{n-1}\) such that \(A_1 = A_2 \circ A'\).

Given \(A_1,A_2\) the matrix \(A'\) above can be computed using basic linear algebra.

We now define the canonical affine transformation \(C_{|h,c}:\{0,1\}^{n-1} \rightarrow \{0,1\}^{n}\) with respect to \(h,c\). Let \(\ell \) denote the index of the first non-zero bit of \(h = (h[1],\ldots ,h[n])\). Write \(C_{|h,c}(x) = L(x) + a\). We define \(a = c\cdot e_{\ell }\) (where \(e_{\ell }\) is the \(\ell \)’th unit vector) and define L[i] (the i’th column of L) using h and the unit vectors as follows:

$$ L[i]= \begin{cases} e_i & \text {if } i < \ell \\ e_{i+1} + h[i+1]e_{\ell } & \text {otherwise } (\ell \le i \le n-1) \end{cases} $$

Thus, on input \((y[1],\ldots ,y[n-1])\), the transformation \(C_{|h,c}\) is defined by the symbolic form

$$(x[1],x[2],\ldots ,x[n]) = (y[1],\ldots ,y[\ell -1],\sum _{i=\ell }^{n-1}h[i+1]y[i] + c,y[\ell ],\ldots ,y[n-1]).$$

The motivation behind the definition of \(C_{|h,c}\) is that it allows very simple symbolic composition when applied on the right: its main action is to replace the variable \(x[\ell ]\) with the affine combination that is specified by the coefficients of h and by c. Other variables are just renamed: variables with index \(i<\ell \) remain with the same index, while for each variable with index \(i>\ell \), its index is reduced by 1.
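The sketch below (our own illustration; indices are adapted to 0-based Python lists) builds this substitution explicitly: for each \(x[i]\) it returns the list of y-variables, and the constant, whose sum replaces it under \(C_{|h,c}\).

```python
# Our own illustration (not the paper's code) of the substitution defined by
# the canonical transformation C_{|h,c}.

def canonical_substitution(h_bits, c):
    """h_bits = [h[1], ..., h[n]] as 0/1 values, c in {0, 1}.
    Returns (subst, const): x[i+1] equals the XOR of y[j+1] for j in subst[i],
    plus const[i], following the symbolic form of C_{|h,c} given above."""
    n = len(h_bits)
    l = h_bits.index(1)                  # 0-based position of the first set bit
    subst, const = [], []
    for i in range(n):
        if i < l:                        # x[i+1] = y[i+1]
            subst.append([i]); const.append(0)
        elif i == l:                     # x[l+1] = sum over j of h[j+2]*y[j+1] + c
            ys = [j for j in range(l, n - 1) if h_bits[j + 1]]
            subst.append(ys); const.append(c)
        else:                            # x[i+1] = y[i]
            subst.append([i - 1]); const.append(0)
    return subst, const

# Example: h = 100, c = 0 replaces x[1] by the constant 0 and renames
# x[2], x[3] to y[1], y[2] (the restriction used in the example of Sect. 3).
print(canonical_substitution([1, 0, 0], 0))  # -> ([[], [0], [1]], [0, 0, 0])
```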

Remark 1

Note that we have to show that the definition of \(C_{|h,c}\) is valid. First, the \(n-1\) columns of L are clearly linearly independent. It remains to prove that \(h,c\) are indeed the HSM and HSC of \(C_{|h,c}\). For this purpose, it suffices to show that for each column L[i], the vector \(L[i] + a\) satisfies the equation \(h(L[i] + a) + c =0\). Since \(h(a) = h(c\cdot e_{\ell }) = c\cdot h(e_{\ell }) = c\) (as \(h[\ell ] = 1\)), it remains to show that \(h(L[i]) = 0\). Indeed, if \(1 \le i < \ell \), then \(h(L[i]) = h[i] = 0\) (as \(\ell \) is the index of the first non-zero bit of h). Otherwise, if \(\ell \le i \le n-1\), then \(h(L[i]) = h[i+1] + h[i+1]h[\ell ] = 0\) (as \(h[\ell ] = 1\)).

3 Overview of the New Affine Equivalence Algorithm

We demonstrate the new algorithm using an example. Although it is oversimplified, this example is sufficient to convey the main ideas of our algorithm.

Definition of Functions. We define the function \(\varvec{F}: \{0,1\}^3 \rightarrow \{0,1\}^3\) using its symbolic representation \(\varvec{P} = \{P^{(i)}(x[1],x[2],x[3])\}_{i=1}^3\),

$$\begin{aligned} P^{(1)}(x[1],x[2],x[3]) =&x[1]x[2] + x[1]x[3]+ x[2] + 1 \\ P^{(2)}(x[1],x[2],x[3]) =&x[1]x[2] + x[1] + x[2] \\ P^{(3)}(x[1],x[2],x[3]) =&x[1]x[3] + x[3].\end{aligned}$$

We define \(\varvec{G}: \{0,1\}^3 \rightarrow \{0,1\}^3\) using 2 affine transformations as \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\), where \(A_2\) is simply the identity and \(A_1\) is defined using the relations

$$x[1] = y[1] + y[3] + 1, \; x[2] = y[1] + y[2], \; x[3] = y[2].$$

Composing \(A_2 \circ \varvec{P} \circ A_1\) and simplifying the resultant ANFs, gives the symbolic representation of \(\varvec{G}\) as \(\varvec{Q} = \{Q^{(i)}(y[1],y[2],y[3])\}_{i=1}^3\), where

$$\begin{aligned} Q^{(1)}(y[1],y[2],y[3]) =&y[1]y[3] + y[1] + y[2] + 1 \\ Q^{(2)}(y[1],y[2],y[3]) =&y[1]y[2] + y[1]y[3]+ y[2]y[3] + y[3] + 1 \\ Q^{(3)}(y[1],y[2],y[3]) =&y[1]y[2] + y[2]y[3].\end{aligned}$$

The input to our affine equivalence algorithm is \(\varvec{F},\varvec{G}\) defined above and its goal is to recover the (presumably) unknown affine transformation \(A_1\). The first step of the algorithm is to interpolate \(\varvec{F},\varvec{G}\) and obtain \(\varvec{P},\varvec{Q}\), respectively.

Rank Tables and Histograms. The most basic property that we prove in Theorem 1 is that since \(\varvec{F}\) and \(\varvec{G}\) are affine equivalent, the symbolic ranks of \(\varvec{P}\) and \(\varvec{Q}\) (as vectors) are equal. Indeed, it is easy to verify that both \(\varvec{P}\) and \(\varvec{Q}\) have symbolic rank of 3. More significantly, Theorem 1 is stronger and asserts that \(SR(\varvec{P}_{(\ge d)}) = SR(\varvec{Q}_{(\ge d)})\) for every \(d \ge 1\). Indeed, if we take \(d=2\), we get \(\varvec{P}_{(\ge 2)} = \{x[1]x[2] + x[1]x[3], x[1]x[2], x[1]x[3]\}\), which has symbolic rank 2. This is also the symbolic rank of \(\varvec{Q}_{(\ge 2)} = \{y[1]y[3], y[1]y[2] + y[1]y[3]+ y[2]y[3], y[1]y[2] + y[2]y[3]\}\).

We would like to use this property to recover \(A_1\). Let us examine the 2-dimensional affine subspace defined by the 3-bit HSM \(h' = 100\) (whose bits are \(h'[1]=1, h'[2]=0, h'[3] = 0\)) and the single bit HSC \(c= 0\). We calculate the symbolic form of \((\varvec{F} \circ C_{|h',0})_{(\ge 1)}\) by evaluating \((\varvec{P} \circ C_{|h',0})_{(\ge 1)}\) (i.e., plugging \(x[1] = 0\) into \(\varvec{P}_{(\ge 1)}\)) and obtain \(\{x[2], x[2], x[3]\}\) which has symbolic rank 2. Similarly, for \(c = 1\) we calculate \((\varvec{P} \circ C_{|h',1})_{(\ge 1)}\) (i.e., plug \(x[1] = 1\) into \(\varvec{P}_{(\ge 1)}\)) and obtain \(\{x[3], 0, 0\}\) which has symbolic rank 1. Hence, we attach the symbolic rank pair (2, 1) to \(h' = 100\). We do the same for all 7 non-zero \(h' \in \{0,1\}^{3}\). The result is a table whose entries are pairs of ranks of the form \((maxR,minR) \in \mathbb {Z}_4 \times \mathbb {Z}_4\) (where \(maxR \ge minR\)), such that entry \((maxR, minR)\) stores the set of HSMs that are associated with this pair of ranks.

$$\begin{aligned} (3,2):&\; \{010,011,111,110\} \\ (2,2):&\; \{001\} \\ (2,1):&\; \{100,101\} \end{aligned}$$

This table is called the rank table of \(\varvec{F}\) (with respect to the degree \(d = 1\) as we only considered monomials of degree at least 1). The set of HSMs in an entry \((maxR, minR)\) of the rank table is called a rank group (e.g., the rank group with index (2, 1) is \(\{100,101\}\)). Similarly, we compute the rank table of \(\varvec{G}\) with respect to \(d=1\).

$$\begin{aligned} (3,2):&\; \{100,001,110,011\} \\ (2,2):&\; \{010\} \\ (2,1):&\; \{101,111\} \end{aligned}$$

Although the rank tables are different, the size of each rank group \((maxR, minR)\) of \(\varvec{F},\varvec{G}\) is identical. We define the rank histogram of \(\varvec{F}\) (with respect to d) as a mapping from each \((maxR, minR)\) value to the corresponding rank group size (e.g., the histogram entry for \(\varvec{F}\) with index (2, 1) has value \(|\{100,101\}| =2\)). As we show in Lemma 9, the rank histograms of affine equivalent functions (such as \(\varvec{F},\varvec{G}\)) are identical.
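For concreteness, the two rank tables above can be reproduced with the following self-contained Python sketch (an illustration under our own encoding conventions, not the paper's implementation; in particular it recomputes the restricted ANFs by brute force rather than symbolically). Each half space \(h(x) = c\) is parameterized by an affine map from \(\{0,1\}^{n-1}\), the restricted output bits are interpolated with the Möbius transform, constant terms are dropped (\(d = 1\)), and the GF(2) rank is taken; by Fact 1 and Theorem 1, the particular affine parameterization used is immaterial.

```python
# Self-contained illustration (our own sketch, not the paper's implementation):
# recompute the d = 1 rank tables of the 3-bit example.

def moebius(tt, n):
    """ANF coefficients of an n-variable Boolean function from its truth table."""
    a = list(tt)
    for i in range(n):
        for u in range(1 << n):
            if u & (1 << i):
                a[u] ^= a[u ^ (1 << i)]
    return a

def gf2_rank(rows):
    """Rank over GF(2) of integer-packed row vectors."""
    rows, rank = list(rows), 0
    for col in range(max((r.bit_length() for r in rows), default=0)):
        piv = next((i for i in range(rank, len(rows)) if (rows[i] >> col) & 1), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(len(rows)):
            if i != rank and (rows[i] >> col) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank

def half_space_point(h, c, y, n):
    """Affine parameterization of {x : h(x) = c}: spread the n-1 bits of y over
    the positions other than the lowest set bit of h, then fix that bit."""
    l = (h & -h).bit_length() - 1
    x = (y & ((1 << l) - 1)) | ((y >> l) << (l + 1))
    return x | (((bin(x & h).count("1") + c) & 1) << l)

def rank_pair(F_tt, n, h, d=1):
    """(maxR, minR) for the restrictions of F to the two halves defined by h."""
    ranks = []
    for c in (0, 1):
        rows = []
        for bit in range(n):
            tt = [(F_tt[half_space_point(h, c, y, n)] >> bit) & 1
                  for y in range(1 << (n - 1))]
            coeffs = moebius(tt, n - 1)
            rows.append(sum(a << u for u, a in enumerate(coeffs)
                            if bin(u).count("1") >= d))   # truncation at degree d
        ranks.append(gf2_rank(rows))
    return max(ranks), min(ranks)

def rank_table(F_tt, n, d=1):
    table = {}
    for h in range(1, 1 << n):                            # all non-zero HSMs
        table.setdefault(rank_pair(F_tt, n, h, d), []).append(format(h, f"0{n}b")[::-1])
    return table

# The example: bit i-1 of x encodes x[i], bit i-1 of the output encodes F^(i).
def F(x):
    x1, x2, x3 = x & 1, (x >> 1) & 1, (x >> 2) & 1
    return (((x1 & x2) ^ (x1 & x3) ^ x2 ^ 1)
            | (((x1 & x2) ^ x1 ^ x2) << 1)
            | (((x1 & x3) ^ x3) << 2))

def A1(y):   # x[1] = y[1] + y[3] + 1, x[2] = y[1] + y[2], x[3] = y[2]
    y1, y2, y3 = y & 1, (y >> 1) & 1, (y >> 2) & 1
    return (y1 ^ y3 ^ 1) | ((y1 ^ y2) << 1) | (y2 << 2)

F_tt = [F(x) for x in range(8)]
G_tt = [F(A1(y)) for y in range(8)]                       # A2 is the identity
print(rank_table(F_tt, 3))    # e.g. rank group (2, 1) of F is ['100', '101']
print(rank_table(G_tt, 3))    # e.g. rank group (2, 1) of G is ['101', '111']
```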

To explain this, we look at the HSM \(h = 101\) in the rank group of \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\) with index (2, 1) and note that it partitions the space into halves \(\{000,101,010,111\}\) and \(\{001,011,100,110\}\) (according to whether \(x[1]+x[3] = 0\) or \(x[1]+x[3]=1\)). After applying \(A_1\), these half-spaces are mapped into \(\{100,101,110,111\}\) and \(\{000,001,010,011\}\). This is exactly the partition defined by \(h' = 100\), which is in the rank group of \(\varvec{F}\) with the same index (2, 1). In terms of canonical affine transformations, \(C_{|h',c}\) and \(A_1 \circ C_{|h,0}\) have the same half-space range of \(\{100,101,110,111\}\) (for \(c = 1\) in this case) and we define a mapping \(h \mapsto _{A_1} h'\) to capture this. In Lemma 3 we show that this mapping is a bijection as \(A_1\) is invertible.

Exploiting Matchings. The central property of the mapping \(\mapsto _{A_1}\) is proved in Lemma 6 which asserts that it preserves affine equivalence. Namely, if \(\varvec{F} \equiv \varvec{G}\) and \(h \mapsto _{A_1} h'\), then \(\varvec{F} \circ C_{|h',c} \equiv \varvec{G} \circ C_{|h,0}\) (for some \(c \in \{0,1 \}\)). By flipping the half-space ranges and applying the same argument, we also obtain \(\varvec{F} \circ C_{|h',c+1} \equiv \varvec{G} \circ C_{|h,1}\). Combined with Theorem 1 (which states that symbolic rank is an invariant of affine equivalent functions) we obtain that for any \(d \ge 1\), \(r_0 \triangleq SR((\varvec{F} \circ C_{|h',c})_{(\ge d)})= SR((\varvec{G} \circ C_{|h,0})_{(\ge d)})\) and \(r_1 \triangleq SR((\varvec{F} \circ C_{|h',c+1})_{(\ge d)})= SR((\varvec{G} \circ C_{|h,1})_{(\ge d)})\). Since the ordered rank pairs for \(h',h\) in \(\varvec{F},\varvec{G}\) are equal to \((maxR, minR)\) (for \(maxR = max\{r_0,r_1\},minR = min\{r_0,r_1\}\)), they belong to rank groups with the same index \((maxR, minR)\) in the rank tables of \(\varvec{F}, \varvec{G}\), respectively. The fact that \(\mapsto _{A_1}\) is a bijection leads to Lemma 9 (which asserts that the rank histograms of \(\varvec{F},\varvec{G}\) are identical).

The main goal of our affine equivalence algorithm is to recover matchings \(h \mapsto _{A_1} h'\) for several pairs \(h,h'\). This is useful, as in Lemma 5 we show that each such matching gives n linear equations on the unknown matrix L of \(A_1(x) = L(x)+a\). Furthermore, the constant c associated with \(h \mapsto _{A_1} h'\) (which determines whether \(\varvec{F} \circ C_{|h',0} \equiv \varvec{G} \circ C_{|h,0}\) or \(\varvec{F} \circ C_{|h',1} \equiv \varvec{G} \circ C_{|h,0}\)) gives a linear equation on a (once again, by Lemma 5). In total, we need to find about n matchings \(h \mapsto _{A_1} h'\) along with their associated constants to completely recover \(A_1\).

Going back to the example, the rank group with index (2, 2) for \(\varvec{G}\) is \(\{010\}\), while the rank group with the same index for \(\varvec{F}\) is \(\{001\}\). Therefore, after computing the rank tables we know that

$$\begin{aligned} 010 \mapsto _{A_1} 001. \qquad (1)\end{aligned}$$

Remark 2

We matched \(010 \mapsto _{A_1} 001\) in the rank group \((maxR,minR) = (2,2)\) and since \(maxR = minR\) we cannot derive the constant c associated with this matching (hence we cannot derive a linear equation on a). Such constants can only be derived for matchings \(h \mapsto _{(A_1)} h'\) in rank groups where \(maxR > minR\), as in such cases we know whether \(\varvec{F} \circ C_{|h',0} \equiv \varvec{G} \circ C_{|h,0}\) or \(\varvec{F} \circ C_{|h',1} \equiv \varvec{G} \circ C_{|h,0}\) according to the equality \(SR((\varvec{F} \circ C_{|h',c})_{(\ge d)})= SR((\varvec{G} \circ C_{|h,0})_{(\ge d)})\). More precisely, if \(maxR > minR\), then either \(SR((\varvec{F} \circ C_{|h',0})_{(\ge d)})= SR((\varvec{G} \circ C_{|h,0})_{(\ge d)})\) or \(SR((\varvec{F} \circ C_{|h',1})_{(\ge d)})= SR((\varvec{G} \circ C_{|h,0})_{(\ge d)})\), but not both. In this sense, it is more useful to recover matchings for HSMs in rank groups \((maxR, minR)\) such that \(maxR > minR\).

By applying similar arguments to the rank group (2, 1), we know that either \(101 \mapsto _{A_1} 100\) or \(101 \mapsto _{A_1} 101\) (and similarly \(111 \mapsto _{A_1} 100\) or \(111 \mapsto _{A_1} 101\)). Since we have very few possibilities, we can guess which matchings hold, derive \(A_1\) and test our guess. Unfortunately, for larger n we expect the rank groups to be much bigger and it would be inefficient to exhaustively match HSMs for \(\varvec{F}, \varvec{G}\) only based on their ranks. Thus, to narrow down the number of possibilities and eventually uniquely match sufficiently many pairs \(h,h'\) such that \(h \mapsto _{A_1} h'\), we need to attach more data to each HSM for \(\varvec{F},\varvec{G}\).

HSM Rank Histograms. The main observation that allows attaching more data to each HSM is given in Lemma 4 which shows that the mapping \(\mapsto _{A_1}\) is additive.

Consider the two rank groups with index (3, 2) for \(\varvec{F}, \varvec{G}\). They are of size 4 and their HSMs cannot be uniquely matched. We first focus on \(\varvec{G}\) and examine the rank group (3, 2) which is \(\{100,001,110,011\}\). We take \(h_1 = 011\) and compute its HSM rank histogram with respect to the rank group (2, 1) (which is \(\{101,111\}\)). This is done by computing the (maxRminR) rank pairs for the set defined by adding all elements of rank group (2, 1) to 011, namely \(\{011+101, 011+111\} = \{110, 100\}\). Looking for 110 and 100 in the rank table of \(\varvec{G}\), both HSMs have ranks (3, 2). Thus, the HSM rank histogram of \(h_1 = 011\) with respect to rank group (2, 1) has a single non-zero entry (3, 2) with the value of 2. We write this HSM rank histogram in short as [(3, 2) : 2].

We now consider the match of \(h_1 = 011\) under \(A_1\) which is \(h'_1 = 110\) (namely, \(h_1 \mapsto _{(A_1)} h'_1\)). Similarly to \(h_1\), we compute the HSM rank histogram of \(h'_1\) with respect to the rank group (2, 1) for \(\varvec{F}\) (which is \(\{100,101\}\)) and obtain the same HSM rank histogram [(3, 2) : 2]. This is a particular case of Lemma 10, which shows that matching HSMs for \(\varvec{F}, \varvec{G}\) have identical HSM rank histograms (with respect to a fixed rank group). Lemma 10 is derived using Lemma 4 which asserts that the mapping \(\mapsto _{A_1}\) is additive: if \(h_1 \mapsto _{(A_1)} h'_1\) and \(h_2 \mapsto _{(A_1)} h'_2\), then \((h_1 + h_2) \mapsto _{(A_1)} (h'_1 + h'_2)\).

Fixing \(h_1 = 011\) for \(\varvec{G}\) and its match \(h'_1 = 110\) under \(A_1\), let \(h^i_2,h'^i_2\) for \(i \in \{1,2 \}\) vary over the 2 elements of the rank groups with index (2, 1) in \(\varvec{G},\varvec{F}\), respectively. Then, as \(h_1 \mapsto _{(A_1)} h'_1\) and \(h^i_2 \mapsto _{(A_1)} h'^i_2\) for \(i \in \{1,2 \}\), we get \((h_1 + h^i_2) \mapsto _{(A_1)} (h'_1 + h'^i_2)\). By the aforementioned Theorem 1 and Lemma 6 (equating ranks for matching HSMs) we conclude that indeed the HSM rank histograms of 011 and 110 with respect to rank group (2, 1) are identical (which is a special case of Lemma 10).

HSM Rank Histogram Multi-Sets. Since we do not know in advance that \(h_1 \mapsto _{(A_1)} h'_1\), we have to compute the HSM rank histograms (with respect to rank group (2, 1)) for all HSMs in rank group (3, 2). The outcome is the HSM rank histogram multi-set of rank group (3, 2) with respect to rank group (2, 1). It is computed by considering all the HSMs in the rank group (3, 2), namely \(\{100,001,110,011\}\) for \(\varvec{G}\) and \(\{010,011,111,110\}\) for \(\varvec{F}\). Lemma 11 (whose proof is based on Lemma 10) asserts that these HSM rank histogram multi-sets are identical as \(\varvec{F},\varvec{G}\) are affine equivalent.

We hope that these multi-sets contain unique HSM rank histograms (with multiplicity 1), which would allow us to derive more matchings between HSMs. Unfortunately, the resultant multi-set (for both \(\varvec{F}\) and \(\varvec{G}\)) is \(\{[(3,2):2],[(3,2):2],[(3,2):2],[(3,2):2]\}\). It contains 4 identical elements and does not give us any new information about \(A_1\). If the multiplicity of the element [(3, 2) : 2] (calculated above for \(h_1,h'_1\)) in this multi-set had been 1, we could have derived the relation \(h_1 \mapsto _{(A_1)} h'_1\).
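Continuing the sketch above, the HSM rank histograms and their multi-sets can be computed as follows (our own illustration; it assumes a dictionary ranks mapping each non-zero integer-encoded HSM to its (maxR, minR) pair, e.g. ranks = {h: rank_pair(G_tt, 3, h) for h in range(1, 8)} with the functions from the earlier sketch).

```python
# Our own illustration (not the paper's code): HSM rank histograms and their
# multi-set, computed from a dict ranks[h] = (maxR, minR) over all non-zero
# integer-encoded HSMs h.

from collections import Counter

def hsm_rank_histogram(ranks, h1, group):
    """Histogram of the ranks of h1 + h2 over all h2 != h1 in rank group `group`."""
    return Counter(ranks[h1 ^ h2] for h2, r in ranks.items()
                   if r == group and h2 != h1)

def histogram_multiset(ranks, group, rel_group):
    """Multi-set of the HSM rank histograms of the HSMs in `group`, w.r.t. `rel_group`."""
    return Counter(frozenset(hsm_rank_histogram(ranks, h, rel_group).items())
                   for h, r in ranks.items() if r == group)
```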

Remark 3

Generally, when n is very small (as in our case), the direct application of the algorithm is more likely to fail to completely recover \(A_1\). As we show later in this paper, for \(n \ge 8\) the fraction of instances for which this occurs is very small (and tends to 0 as n grows). In some cases a failure to retrieve \(A_1\) occurs since the affine mappings \(A_1,A_2\) are not uniquely defined. In particular, if there are several solutions for \(A_1\), then we cannot hope to obtain unique matchings that completely define \(A_1\), but we can recover all possible solutions to the affine equivalence problem by enumerating several possibilities for the matchings.

In conclusion, we attached to each HSM in rank group (3, 2) for \(\varvec{F}, \varvec{G}\) its HSM rank histogram with respect to rank group (2, 1) and in general such data may allow us to derive additional matchings \(h \mapsto _{(A_1)} h'\). Once we obtain about n matchings, we can recover \(A_1\) by solving a system of linear equations.

4 A Basic Property of Affine Equivalent Functions

Before proving the main result of this section, we state two useful lemmas (the first is proved in the extended version of this paper [9]).

Lemma 1

Let \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{m}\), let \(A_1: \{0,1\}^{n'} \rightarrow \{0,1\}^n\), \(A_2: \{0,1\}^m \rightarrow \{0,1\}^{m'}\) be affine functions, and d be a positive integer. Then,

  1. \((\varvec{P}_{(\ge d)} \circ A_1)_{(\ge d)} = (\varvec{P} \circ A_1)_{(\ge d)}\)

  2. \((A_2 \circ (\varvec{P}_{(\ge d)}))_{(\ge d)} = (A_2 \circ \varvec{P})_{(\ge d)}\) and if \(A_2\) is a linear function, \(A_2 \circ (\varvec{P}_{(\ge d)}) = (A_2 \circ \varvec{P})_{(\ge d)}\).

Essentially, the lemma states that removing all monomials of degree less than d from \(\varvec{P}\) can be done before or after composing it with an affine function and the outcomes are identical.

Note that a potentially simplified first part of the lemma which equates \(\varvec{P}_{(\ge d)} \circ A_1\) and \((\varvec{P} \circ A_1)_{(\ge d)}\) is generally incorrect, as the first expression may contain monomials of degree less than d. For example, if \(d=2\) and we compose the affine transformation defined by \(x[1] = y[1] + y[2]\) and \(x[2] = y[2]\) with the polynomial x[1]x[2], then we get the polynomial \(y[1]y[2] + y[2]\) which has a monomial of degree 1.

Lemma 2

Let \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{m}\), and let \(A_1: \{0,1\}^n \rightarrow \{0,1\}^n\) be an invertible affine function. Then, \(deg(\varvec{P}) = deg(\varvec{P} \circ A_1)\).

Proof

We show that for \(i \in \{1,2,\ldots ,m\}\), \(deg(P^{(i)}) = deg(P^{(i)} \circ A_1)\). Observe that \(deg(P^{(i)}) \ge deg(P^{(i)} \circ A_1)\) as composition with an affine function cannot increase the algebraic degree of a polynomial. By the same argument and by the invertibility of \(A_1\), we also obtain \(deg(P^{(i)} \circ A_1) \ge deg(P^{(i)} \circ A_1 \circ (A_1)^{-1}) = deg(P^{(i)}).\)    \(\blacksquare \)

Theorem 1

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions, represented by \(\varvec{P},\varvec{Q}\), respectively. Then, for every positive integer d, \(SR(\varvec{P}_{(\ge d)}) = SR(\varvec{Q}_{(\ge d)})\).

Proof

At a high level, the fact that \(\varvec{P}\) and \(\varvec{Q}\) have the same symbolic rank follows since rank is preserved by composition with invertible affine transformations. Moreover, this rank equality is preserved after truncating low degree monomials since they cannot affect the high degree monomials when composing with invertible affine transformations. The formal proof is below.

Write \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\), implying that \(\varvec{Q} = A_2 \circ \varvec{P} \circ A_1\). Denote \(\varvec{P'} = \varvec{P} \circ A_1\) and observe that

$$SR(\varvec{P'}_{(\ge d)}) = SR((A_2 \circ (\varvec{P'}_{(\ge d)}))_{(\ge d)}) = SR((A_2 \circ \varvec{P'})_{(\ge d)}) = SR(\varvec{Q}_{(\ge d)}),$$

where the first equality holds since rank is preserved by invertible linear transformations and the second equality is due to the second part of Lemma 1.

It remains to show that \(SR(\varvec{P'}_{(\ge d)}) = SR(\varvec{P}_{(\ge d)})\), or \(SR((\varvec{P} \circ A_1)_{(\ge d)}) = SR(\varvec{P}_{(\ge d)})\). We first show that \(SR(\varvec{P}_{(\ge d)}) \ge SR((\varvec{P} \circ A_1)_{(\ge d)})\).

If \(\varvec{P}_{(\ge d)}\) has full rank of m then the claim is trivial. Otherwise, let \(v \in \{0,1\}^m\) be a non-zero vector in the kernel of \(\varvec{P}_{(\ge d)}\), namely \(v(\varvec{P}_{(\ge d)}) = 0\). Then,

$$v((\varvec{P} \circ A_1)_{(\ge d)}) = (v((\varvec{P} \circ A_1)_{(\ge d)}))_{(\ge d)} = (v(\varvec{P}_{(\ge d)}) \circ A_1)_{(\ge d)} = 0,$$

where the first equality follows from the second part of Lemma 1 and the second equality follows from the first part of this lemma. This implies that v is also in the kernel of \((\varvec{P} \circ A_1)_{(\ge d)}\), as required.

To prove that \(SR(\varvec{P}_{(\ge d)}) \le SR((\varvec{P} \circ A_1)_{(\ge d)})\), observe that if v is a non-zero vector in the kernel of \((\varvec{P} \circ A_1)_{(\ge d)}\), then by the equality above we have \(0 = v((\varvec{P} \circ A_1)_{(\ge d)}) = (v(\varvec{P}_{(\ge d)}) \circ A_1)_{(\ge d)}\). This implies that \(deg(v(\varvec{P}_{(\ge d)}) \circ A_1) < d\) and since \(A_1\) is invertible, by Lemma 2, \(deg(v(\varvec{P}_{(\ge d)})) = deg(v(\varvec{P}_{(\ge d)}) \circ A_1) < d\). This gives \(v(\varvec{P}_{(\ge d)})=0\), as the polynomial does not contain monomials of degree less than d. Hence, v is in the kernel of \(\varvec{P}_{(\ge d)}\) which completes the proof.    \(\blacksquare \)

5 The Half-Space Mask Bijection and Its Properties

Definition 1

Let \(A: \{0,1\}^n \rightarrow \{0,1\}^n\) be an invertible affine transformation. Define a mapping between HSMs using A as follows: \(h \in \{0,1\}^n\) is mapped to \(h'\) if there exists \(c \in \{0,1\}\) such that the affine ranges of \(A \circ C_{|h,0}\) and \(C_{|h',c}\) are equal. We write \(h \mapsto _{(A)} h'\) and say that h and \(h'\) match (under A). The bit c is called the associated constant of \(h \mapsto _{(A)} h'\).

Lemma 3

Let \(A: \{0,1\}^n \rightarrow \{0,1\}^n\) be an invertible affine transformation. The mapping \(\mapsto _{(A)}\) is a bijection and its inverse is given by \(\mapsto _{(A^{-1})}\).

Proof

The proof follows from the invertibility of A. Given that \(h \mapsto _{(A)} h'\), there exists \(c \in \{0,1\}\) such that the affine ranges of \(A \circ C_{|h,0}\) and \(C_{|h',c}\) are equal. According to Fact 1, this implies that there exists an invertible affine transformation \(A': \{0,1\}^{n-1} \rightarrow \{0,1\}^{n-1}\) such that \(A \circ C_{|h,0} = C_{|h',c} \circ A'\). Consequently, \(C_{|h,0} \circ (A')^{-1} = A^{-1} \circ C_{|h',c}\) and the affine ranges of \(C_{|h,0}\) and \(A^{-1} \circ C_{|h',c}\) are equal (again, according to Fact 1). This implies that the affine ranges of \(A^{-1} \circ C_{|h',0}\) and \(C_{|h,c}\) are equal (flipping the HSC of both sides if \(c=1\)), namely \(h' \mapsto _{(A^{-1})} h\).    \(\blacksquare \)

A property of \(\mapsto _{(A)}\) which will be very useful is that it is additive. This is established by the lemma below (proved in the extended version of this paper [9]).

Lemma 4

Let \(A: \{0,1\}^n \rightarrow \{0,1\}^n\) be an invertible affine transformation. Let \(h_1,h_1',h_2,h'_2 \in \{0,1\}^n\) be HSMs where \(h_1 \ne h_2\) and \(h_1 \mapsto _{(A)} h'_1\), \(h_2 \mapsto _{(A)} h'_2\) with associated constants \(c_1,c_2\), respectively. Then \((h_1 + h_2) \mapsto _{(A)} (h'_1 + h'_2)\) with the associated constant \(c_1 + c_2\).

The following lemma (proved in the extended version of this paper [9]) shows that the bijection reveals information about the presumably unknown transformation A.

Lemma 5

Let \(A: \{0,1\}^n \rightarrow \{0,1\}^n\) be an invertible affine transformation such that \(A(x) = L(x) + a\). Let \(h,h' \in \{0,1\}^n\) be HSMs such that \(h \mapsto _{(A)} h'\) with associated constant c. Then, A satisfies the following constraints.

  1. For each \(i \in \{1,2,\ldots ,n\}\), the i’th column of L, denoted by L[i], satisfies the equation \(h'(L[i]) = h[i]\), where h[i] is the i’th bit of h.

  2. The vector a satisfies the equation \(h'(a) = c\).
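To illustrate how these constraints accumulate, the sketch below (our own illustration with hypothetical names, not the paper's code) turns a recovered matching \(h \mapsto _{(A)} h'\) with associated constant c into linear equations over the unknown bits of L and a; collecting the equations of about n matchings and solving the resulting GF(2) system (e.g. by Gaussian elimination) recovers A.

```python
# Our own illustration (not the paper's code): each matching h -> h' with
# associated constant c contributes, by Lemma 5, n equations h'(L[i]) = h[i]
# (one per column of L) and one equation h'(a) = c.  The unknowns are packed
# as the n*n bits of L (column by column) followed by the n bits of a.

def equations_from_matching(h, h_prime, c, n):
    """h, h_prime: integer bit masks (bit i-1 encodes coordinate i); c in {0,1}.
    Returns a list of (mask, rhs) pairs over the n*n + n unknown bits."""
    eqs = []
    for i in range(n):                       # equation h'(L[i]) = h[i]
        mask = h_prime << (i * n)            # h' selects the bits of column i
        eqs.append((mask, (h >> i) & 1))
    eqs.append((h_prime << (n * n), c))      # equation h'(a) = c
    return eqs
```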

The following lemma asserts that affine equivalence is preserved under composition with matching HSMs.

Lemma 6

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions such that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Let \(h,h' \in \{0,1\}^n\) be HSMs such that \(h \mapsto _{(A_1)} h'\) with associated constant c. Then, \(\varvec{F} \circ C_{|h',c} \equiv \varvec{G} \circ C_{|h,0}\) and \(\varvec{F} \circ C_{|h',c+1} \equiv \varvec{G} \circ C_{|h,1}\).

Proof

Since \(h \mapsto _{(A_1)} h'\) with associated constant c, the affine ranges of \(A_1 \circ C_{|h,0}\) and \(C_{|h',c}\) are equal. According to Fact 1, there exists an invertible affine transformation \(A'_1:\{0,1\}^{n-1} \rightarrow \{0,1\}^{n-1}\) such that \(A_1 \circ C_{|h,0} = C_{|h',c} \circ A'_1\).

We obtain, \(A_2 \circ \varvec{F} \circ C_{|h',c} \circ A'_1 = A_2 \circ \varvec{F} \circ A_1 \circ C_{|h,0} = \varvec{G} \circ C_{|h,0}\), implying that \(\varvec{F} \circ C_{|h',c}\) and \(\varvec{G} \circ C_{|h,0}\) are affine equivalent.

The claim that \(\varvec{F} \circ C_{|h',c+1}\) and \(\varvec{G} \circ C_{|h,1}\) are affine equivalent follows by considering the complementary half-space and observing that the affine ranges of \(A_1 \circ C_{|h,1}\) and \(C_{|h',c+1}\) are equal. The remainder of the proof is similar.    \(\blacksquare \)

Definition 2

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\) be a function represented by \(\varvec{P}\), let d be a positive integer and let \(h \in \{0,1\}^n\) be a HSM. Let \(r_0 = SR((\varvec{P} \circ C_{|h,0})_{(\ge d)})\), \(r_1 = SR((\varvec{P} \circ C_{|h,1})_{(\ge d)})\), \(maxR = max\{r_0,r_1\}\) and \(minR = min\{r_0,r_1\}\).

  1. The HSM rank of h (with respect to \(\varvec{F},d\)) is the ordered pair of integers \((maxR, minR)\), denoted as \(R_{\varvec{F},d,h}\).

  2. The attached constant of h is the value \(c \in \{0,1\}\) such that \(maxR = SR((\varvec{P} \circ C_{|h,c})_{(\ge d)})\) (if \(maxR = minR\), the attached constant is undefined).

The lemma below states that the HSM ranks of matching HSMs are equal for affine equivalent functions.

Lemma 7

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Let \(h,h' \in \{0,1\}^n\) be HSMs such that \(h \mapsto _{(A_1)} h'\). Then \(R_{\varvec{G},d,h} = R_{\varvec{F},d,h'}\).

Proof

Let \(c \in \{0,1\}\) be the associated constant of \(h \mapsto _{(A_1)} h'\). According to Lemma 6, \(\varvec{F} \circ C_{|h',c} \equiv \varvec{G} \circ C_{|h,0}\) and \(\varvec{F} \circ C_{|h',c+1} \equiv \varvec{G} \circ C_{|h,1}\).

Assume that \(\varvec{F},\varvec{G}\) are represented by \(\varvec{P},\varvec{Q}\), respectively. Denote \(r'_0 = SR((\varvec{F} \circ C_{|h',c})_{(\ge d)})\), \(r'_1 = SR((\varvec{F} \circ C_{|h',c+1})_{(\ge d)})\), \(r_0 = SR((\varvec{G} \circ C_{|h,0})_{(\ge d)})\), \(r_1 = SR((\varvec{G} \circ C_{|h,1})_{(\ge d)})\).

By the above affine equivalences and Theorem 1, we have \(r_0 = r'_0\) and \(r_1 = r'_1\). Hence \(max(r_0,r_1) = max(r'_0,r'_1)\) and \(min(r_0,r_1) = min(r'_0,r'_1)\) and the lemma follows.    \(\blacksquare \)

6 Rank Tables, Rank Histograms and Their Properties

Definition 3

Given a function \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\) and a positive integer d, define the following mappings.

  1. The rank table of \(\varvec{F}\) with respect to d is a mapping \(\mathcal {T}_{\varvec{F},d}\), whose keys (indexes) are integer pairs \((maxR,minR) \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\) such that \(maxR \ge minR\). It is defined as

     $$\mathcal {T}_{\varvec{F},d}(maxR,minR) = \{h \in \{ 0,1\}^n \; | \; R_{\varvec{F},d,h} = (maxR,minR) \}.$$

     Moreover, along with each such HSM h, the table stores its attached constant \(c \in \{0,1\}\) (if defined).

     An entry in the rank table \(\mathcal {T}_{\varvec{F},d}(maxR,minR)\) (containing all HSMs with this rank) is called a rank group.

  2. The rank histogram of \(\varvec{F}\) with respect to d is a mapping \(\mathcal {H}_{\varvec{F},d}: \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1} \rightarrow \mathbb {Z}\) such that \(\mathcal {H}_{\varvec{F},d}(maxR,minR) = |\mathcal {T}_{\varvec{F},d}(maxR,minR)|\).

To simplify our notation, in the following we refer to a HSM rank \((maxR,minR) \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\) such that \(maxR \ge minR\) using a single symbol \(\varvec{r}\).

The lemma below states that if \(\varvec{F} \equiv \varvec{G}\), then each HSM with rank \(\varvec{r}\) for \(\varvec{G}\) is matched in the rank group with the same HSM rank \(\varvec{r}\) for \(\varvec{F}\).

Lemma 8

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\) and let \(\varvec{r} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\). Then, for each \(h \in \mathcal {T}_{\varvec{G},d}(\varvec{r})\), there exists \(h' \in \mathcal {T}_{\varvec{F},d}(\varvec{r})\) such that \(h \mapsto _{(A_1)} h'\).

Proof

By Lemma 7, given \(h \in \mathcal {T}_{\varvec{G},d}(\varvec{r})\), its match \(h'\) under \(A_1\) satisfies \(R_{\varvec{F},d,h'} = R_{\varvec{G},d,h} = \varvec{r}\) hence \(h' \in \mathcal {T}_{\varvec{F},d}(\varvec{r})\) as claimed.    \(\blacksquare \)

Lemma 9

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Then the rank histograms of \(\varvec{F}\) and \(\varvec{G}\) with respect to d are equal, namely \(\mathcal {H}_{\varvec{F},d} = \mathcal {H}_{\varvec{G},d}\).

Proof

Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Given a histogram entry with index \(\varvec{r}\), for each \(h \in \mathcal {T}_{\varvec{G},d}(\varvec{r})\) let \(h'\) be its match \(h \mapsto _{(A_1)} h'\). Then, by Lemma 8, \(h' \in \mathcal {T}_{\varvec{F},d}(\varvec{r})\). Since \(h \mapsto _{(A_1)} h'\) is a bijection, this shows that \(\mathcal {H}_{\varvec{G},d}(\varvec{r}) = |\mathcal {T}_{\varvec{G},d}(\varvec{r})| \le |\mathcal {T}_{\varvec{F},d}(\varvec{r})| = \mathcal {H}_{\varvec{F},d}(\varvec{r})\). On the other hand, as \(\mathcal {H}_{\varvec{G},d}(\varvec{r}) \le \mathcal {H}_{\varvec{F},d}(\varvec{r})\) holds for all histogram entries \(\varvec{r}\) and the sum of entries in both histograms is \(2^n - 1\), this implies that \(\mathcal {H}_{\varvec{F},d} = \mathcal {H}_{\varvec{G},d}\).    \(\blacksquare \)

Definition 4

Given a function \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), a positive integer d, a HSM \(h_1 \in \{0,1\}^n\) and \(\varvec{r} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\), we define the HSM rank histogram of \(h_1\) with respect to (or relative to) the rank group \(\varvec{r}\) and denote it by \(\mathcal {HG}_{\varvec{F},d,h_1,\varvec{r}}\). Like the standard rank histogram, it is a mapping \(\mathcal {HG}_{\varvec{F},d,h_1,\varvec{r}}: \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1} \rightarrow \mathbb {Z}\), where

$$\begin{aligned} \mathcal {HG}_{\varvec{F},d,h_1,\varvec{r}}(\varvec{r'}) = |\{h_1 + h_2 \; | \; h_2 \in \{0,1\}^n \wedge h_1 \ne h_2 \wedge R_{\varvec{F},d,h_2} = \varvec{r} \wedge R_{\varvec{F},d,h_1 + h_2} = \varvec{r'}\}|.\end{aligned}$$

Note that unlike the (standard) rank histogram, the HSM rank histogram is defined for a specific HSM with respect to a rank group. We further remark that the HSM rank histogram of \(h_1\) can also be defined with respect to its own rank group (this is assured by the condition \(h_1 \ne h_2\)).

The following lemma equates HSM rank histograms for matching HSMs in affine equivalent functions.

Lemma 10

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Let \(h_1,h'_1 \in \{0,1\}^n\) be such that \(h_1 \mapsto _{(A_1)} h'_1\). Then, for every \(\varvec{r} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\), \(\mathcal {HG}_{\varvec{G},d,h_1,\varvec{r}} = \mathcal {HG}_{\varvec{F},d,h'_1,\varvec{r}}.\)

Proof

The proof follows from the fact that the mapping \(\mapsto _{(A_1)}\) preserves HSM ranks for affine equivalent functions (Lemma 7), and by exploiting its additive property (Lemma 4).

Fix a HSM rank histogram entry \(\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\). Define the following two sets:

$$\begin{aligned} D_1 = \{h_1 + h_2 \; | \; h_2 \in \{0,1\}^n \wedge h_1 \ne h_2 \wedge R_{\varvec{G},d,h_2} = \varvec{r} \wedge R_{\varvec{G},d,h_1 + h_2} = \varvec{r'}\}\end{aligned}$$

and

$$\begin{aligned} D_2 = \{h'_1 + h'_2 \; | \; h'_2 \in \{0,1\}^n \wedge h'_1 \ne h'_2 \wedge R_{\varvec{F},d,h'_2} = \varvec{r} \wedge R_{\varvec{F},d,h'_1 + h'_2} = \varvec{r'}\}.\end{aligned}$$

To prove the lemma, we need to show that \(|D_1| = |D_2|\). Let \(h_1 + h_2 \in D_1\) and denote by \(\hat{h} \in \{0,1\}^n\) the vector such that \(h_1 + h_2 \mapsto _{(A_1)} \hat{h}\). We show that \(\hat{h} \in D_2\).

Since \(h_1 + h_2 \mapsto _{(A_1)} \hat{h}\), by Lemma 7, \(R_{\varvec{F},d,\hat{h}} = R_{\varvec{G},d,h_1 + h_2} = \varvec{r'}\). Next, write \(\hat{h} = h'_1 + (h'_1 + \hat{h})\). Since \(h_1 \mapsto _{(A_1)} h'_1\) and \(h_1 + h_2 \mapsto _{(A_1)} \hat{h}\), by Lemma 4, \(h_2 \mapsto _{(A_1)} h'_1 + \hat{h}\), and by Lemma 7, \(R_{\varvec{F},d,h'_1 + \hat{h}} = R_{\varvec{G},d,h_2} = \varvec{r}\), giving \(\hat{h} \in D_2\). Since \(\mapsto _{(A_1)}\) is a bijection this implies that \(|D_2| \ge |D_1|\).

As \(|D_2| \ge |D_1|\) holds for all HSM histogram entries \(\varvec{r'}\) and the sum of HSM histogram entries in both \(\mathcal {HG}_{\varvec{G},d,h_1,\varvec{r}}\) and \(\mathcal {HG}_{\varvec{F},d,h'_1,\varvec{r}}\) is equal to the size of the rank group \(\varvec{r}\) (which is equal to \(\mathcal {H}_{\varvec{F},d}(\varvec{r}) = \mathcal {H}_{\varvec{G},d}(\varvec{r})\)), the equality \(|D_1| = |D_2|\) holds.    \(\blacksquare \)

Definition 5

Given a function \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), a positive integer d, and HSM ranks \(\varvec{r},\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\), we define the HSM rank histogram multi-set of rank group \(\varvec{r}\) with respect to rank group \(\varvec{r'}\) as

$$\mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}} = \{\mathcal {HG}_{\varvec{F},d,h,\varvec{r'}} \; | \; R_{\varvec{F},d,h} = \varvec{r} \}.$$

The HSM rank histogram multi-set collects all the HSM histograms for HSMs in rank group \(\varvec{r}\) with respect to the rank group \(\varvec{r'}\). Note that it is possible to have \(\varvec{r} = \varvec{r'}\).

The following lemma equates HSM rank histogram multi-sets of affine equivalent functions.

Lemma 11

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Then, for every \(\varvec{r},\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\), \(\mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}} = \mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}}\).

Proof

Fix \(\varvec{r},\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\). We define a mapping between the elements (HSM histograms) of the multi-sets \(\mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}}\) and \(\mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\). Naturally, the mapping is based on the bijection \(\mapsto _{(A_1)}\).

Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Let \(h \in \{0,1\}^n\) be such that \(R_{\varvec{G},d,h} = \varvec{r}\) which implies that \(\mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} \in \mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}}\). Let \(h' \in \{0,1\}^n\) be the HSM such that \(h \mapsto _{(A_1)} h'\). By Lemma 10, \(\mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h',\varvec{r'}}\). Furthermore, by Lemma 7 we have \(R_{\varvec{F},d,h'} = R_{\varvec{G},d,h} = \varvec{r}\), hence \(\mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h',\varvec{r'}} \in \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\). Since \(\mapsto _{(A_1)}\) is a bijection, we obtain \(\mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}} \subseteq \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\) as multi-sets.

On the other hand, the number of elements (HSM histograms) in both multi-sets is equal to the size of the rank group \(\varvec{r}\) (which is equal to \(\mathcal {H}_{\varvec{F},d}(\varvec{r}) = \mathcal {H}_{\varvec{G},d}(\varvec{r})\)), hence \(\mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}} = \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\).    \(\blacksquare \)

Definition 6

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), let d be a positive integer and let \(\varvec{r},\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\). A HSM \(h \in \{0,1\}^n\) such that \(R_{\varvec{F},d,h} = \varvec{r}\) is called unique (with respect to \(\varvec{F},d,\varvec{r'}\)) if \(\mathcal {HG}_{\varvec{F},d,h,\varvec{r'}} \in \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\) has multiplicity 1 in this multi-set.

The following theorem establishes the importance of unique HSMs in recovering matchings between HSMs for affine equivalent functions.

Theorem 2

Let \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^m\), \(\varvec{G}: \{0,1\}^n \rightarrow \{0,1\}^m\) be two affine equivalent functions and let d be a positive integer. Then for every \(\varvec{r},\varvec{r'} \in \mathbb {Z}_{m+1} \times \mathbb {Z}_{m+1}\), if \(h \in \{0,1\}^n\) (such that \(R_{\varvec{G},d,h} = \varvec{r}\)) is unique with respect to \(\varvec{G},d,\varvec{r'}\), then the following statements hold:

  1. 1.

    There exists \(h' \in \{0,1\}^n\) such that \(R_{\varvec{F},d,h'} = \varvec{r}\) and \(h'\) is unique with respect to \(\varvec{F},d,\varvec{r'}\).

  2.

    \(\mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h',\varvec{r'}}\).

  3.

    Assume that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). Then, \(h \mapsto _{(A_1)} h'\). Moreover, if the attached constants of \(h,h'\) are defined and equal to \(c,c'\), respectively, then the associated constant of \(h \mapsto _{(A_1)} h'\) is \(c + c'\).

Proof

By Lemma 11, we have equality of the multi-sets \(\mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}} = \mathcal {HM}_{\varvec{G},d,\varvec{r},\varvec{r'}}\) which immediately implies the first two statements. Denote by \(h'' \in \{0,1\}^n\) the HSM such that \(h \mapsto _{(A_1)} h''\). To complete the proof of the third statement we show that \(h'' = h'\).

By Lemma 7, we have \(R_{\varvec{F},d,h''} = R_{\varvec{G},d,h} = \varvec{r}\). Hence \(\mathcal {HG}_{\varvec{F},d,h'',\varvec{r'}} \in \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\) (and also \(\mathcal {HG}_{\varvec{F},d,h',\varvec{r'}} \in \mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\) by the first statement). Since \(h'\) is unique with respect to \(\varvec{F},d,\varvec{r'}\), the histogram \(\mathcal {HG}_{\varvec{F},d,h',\varvec{r'}}\) has multiplicity 1 in \(\mathcal {HM}_{\varvec{F},d,\varvec{r},\varvec{r'}}\). Thus, if we show that \(\mathcal {HG}_{\varvec{F},d,h',\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h'',\varvec{r'}}\), then \(h'' = h'\) must hold.

According to Lemma 10, \(\mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h'',\varvec{r'}}\), and by the second statement we obtain \(\mathcal {HG}_{\varvec{F},d,h',\varvec{r'}} = \mathcal {HG}_{\varvec{G},d,h,\varvec{r'}} = \mathcal {HG}_{\varvec{F},d,h'',\varvec{r'}}\), as required.

Finally, we examine the attached constants \(c,c'\) of \(h,h'\), respectively (assuming they are defined). If \(c = c'\), then the affine ranges of \(A_1 \circ C_{|h,0}\) and \(C_{|h',0}\) are equal implying that the associated constant of \(h \mapsto _{(A_1)} h'\) is \(0 = c + c'\). Otherwise \(c = c' + 1\) and the affine ranges of \(A_1 \circ C_{|h,0}\) and \(C_{|h',1}\) are equal implying that the associated constant of \(h \mapsto _{(A_1)} h'\) is \(1 = c + c'\).    \(\blacksquare \)

7 Analysis of the Distribution of Rank Histogram Entries for Random Permutations

In this section we analyze the distribution of entries of the rank histogram \(\mathcal {H}_{\varvec{F},d}\) for a random permutation \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^n\). The analysis is performed for \(d = n-2\), which is the value that we use in our algorithm as explained in detail next.

Assume that \(\varvec{F}\) is represented by \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{n}\). For a given \(h \in \{0,1\}^n\), we consider \(SR((\varvec{P} \circ C_{|h,0})_{(\ge n-2)})\) and \(SR((\varvec{P} \circ C_{|h,1})_{(\ge n-2)})\). For \(c \in \{0,1\}\), every one of the n polynomials of \((\varvec{P} \circ C_{|h,c})_{(\ge n-2)}\) has \(n-1\) variables (the number of variables in \(\varvec{P}\) is reduced by 1 after composition with \(C_{|h,c}\)). Hence, the number of possible non-zero monomial coefficients in each such polynomial is \(\left( {\begin{array}{c}n-1\\ n-1\end{array}}\right) + \left( {\begin{array}{c}n-1\\ n-2\end{array}}\right) = 1 + n-1 = n\). Therefore, \((\varvec{P} \circ C_{|h,c})_{(\ge n-2)}\) can be represented by an \(n \times n\) Boolean matrix and we are interested in its rank.

Choosing \(d=n-1\) would leave at most one non-zero monomial, which would almost always be present in \((\varvec{P} \circ C_{|h,c})_{(\ge n-1)}\). Hence, essentially all HSMs would fall into a single rank group and the affine equivalence algorithm would not be able to distinguish and match them. On the other hand, choosing \(d \le n-3\) would leave \(\varOmega (n^2)\) non-zero monomials and \((\varvec{P} \circ C_{|h,c})_{(\ge d)}\) would almost always have full rank, leading once again to a single rank group. We conclude that \(d = n-2\) is indeed the optimal choice.

Our analysis is based on the following heuristic assumption.

Assumption 1

For a random permutation \(\varvec{F} :\{0,1\}^n \rightarrow \{0,1\}^n\) represented by \(\varvec{P}\), for every \(h \in \{0,1\}^n\) and \(c \in \{0,1\}\), the entries of the \(n \times n\) Boolean matrix \((\varvec{P} \circ C_{|h,c})_{(\ge n-2)}\) are uniform independent random variables.

The \(n \times n\) Boolean matrix \((\varvec{P} \circ C_{|h,c})_{(\ge n-2)}\) is indeed uniform for a random function \(\varvec{F}\) (rather than a random permutation), given any \(h \in \{0,1\}^n\) and \(c \in \{0,1\}\). However, even for a random function the Boolean matrices obtained for different h, c values are correlated. Nevertheless, these correlations (and the fact that \(\varvec{F}\) is a permutation) do not seem to have a noticeable influence on our algorithm in practice (as we demonstrate in Sect. 8.5 and the extended version of this paper [9]).

The rank of random matrices is a well-studied problem. For large n and a non-negative integer \(r \le n\), we denote the probability that a random Boolean \(n \times n\) matrix has rank r by \(\beta _r\). We can lower bound \(\beta _{r}\) by considering the event where we first select r linearly independent rows to form a subspace of size \(2^{r}\) (which occurs with constant probability) and then select the remaining \(n-r\) rows within this subspace (which occurs with probability \(2^{-(r-n)^2}\)). This gives a lower bound of \(\varOmega (2^{-(r-n)^2})\) on \(\beta _{r}\). The exact formula is given by the theorem below, taken and adapted from [13].

Theorem 3

([13], p. 126, adapted). For \(n \rightarrow \infty \), the probability that a random Boolean \(n \times n\) matrix has rank r is

$$\begin{aligned} \beta _{r} = 2^{-(r-n)^2} \cdot \alpha \cdot \prod _{i=1}^{n-r} (1- 1/2^i)^{-2},\end{aligned}$$
(2)

where \(\alpha = \prod _{i=1}^{\infty } (1- 1/2^i) \approx 0.2888\).

Since \(\alpha \le \alpha \cdot \prod _{i=1}^{n-r} (1- 1/2^i)^{-2} < 1/\alpha \), the initial probability estimation of \(\approx 2^{-(r-n)^2}\) is correct up to a small constant. We also note that (2) is a good estimation even for relatively small values of n (e.g., \(n \ge 8\)).
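Equation (2) is easy to check numerically. The following self-contained Python experiment (our own illustration, not part of the paper's algorithms) evaluates \(\beta _r\) from (2) and compares it with the empirical rank distribution of uniformly random Boolean \(n \times n\) matrices, where ranks are computed by Gaussian elimination over GF(2).

import random
from collections import Counter

ALPHA = 1.0
for i in range(1, 100):
    ALPHA *= 1.0 - 2.0 ** (-i)            # alpha ~ 0.2888

def beta(r, n):
    # Probability that a random Boolean n x n matrix has rank r, Eq. (2)
    prod = 1.0
    for i in range(1, n - r + 1):
        prod *= (1.0 - 2.0 ** (-i)) ** (-2)
    return 2.0 ** (-(r - n) ** 2) * ALPHA * prod

def gf2_rank(rows):
    # Rank over GF(2); each row of the matrix is an integer bit mask
    pivots = {}
    for row in rows:
        while row:
            b = row.bit_length() - 1
            if b in pivots:
                row ^= pivots[b]
            else:
                pivots[b] = row
                break
    return len(pivots)

n, trials = 12, 20000
counts = Counter(gf2_rank([random.getrandbits(n) for _ in range(n)])
                 for _ in range(trials))
for r in range(n - 3, n + 1):
    print(r, counts[r] / trials, beta(r, n))

For n around 12 the printed frequencies should agree with (2) up to sampling noise, matching the remark above about small values of n.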

Let

$$\begin{aligned} p_{maxR,minR}= {\left\{ \begin{array}{ll} 2\beta _{maxR}\beta _{minR} &{} \text {if } minR < maxR\\ \beta _{maxR}\beta _{minR} &{} \text {otherwise } (minR = maxR), \end{array}\right. } \end{aligned}$$
(3)

where

$$\begin{aligned} \beta _{maxR}\beta _{minR} = \alpha ^2 \cdot 2^{-(maxR-n)^2-(minR-n)^2} \cdot \prod _{i=1}^{n-maxR} (1- 1/2^i)^{-2} \cdot \prod _{i=1}^{n-minR} (1- 1/2^i)^{-2}. \end{aligned}$$

Then, based on Assumption 1 and Theorem 3, for every \((maxR,minR) \in \mathbb {Z}_{n+1} \times \mathbb {Z}_{n+1}\) such that \(maxR \ge minR\), given \(h \in \{0,1\}^n\), we have \(Pr[R_{\varvec{F},n-2,h} = (maxR,minR)] \approx p_{maxR,minR}\). Hence, according to Assumption 1, the entries of \(\mathcal {H}_{\varvec{F},n-2}\) are distributed multinomially, with parameter \(2^n\) (the number of HSMsFootnote 7) and probabilities given by \(p_{maxR,minR}\). In particular, each individual histogram entry \(\mathcal {H}_{\varvec{F},n-2}(maxR,minR)\) is distributed binomially with parameter \(2^n\) and probability \(p_{maxR,minR}\).

Experimental results that support this conclusion are given in the extended version of this paper [9].

Asymptotic Analysis of Specific Histogram Entries. For large n, the binomial variable \(\mathcal {H}_{\varvec{F},n-2}(maxR,minR)\) is with high probability very close to its expectation, which is about

$$ 2^n \cdot p_{maxR,minR}.$$

If we ignore constant multiplicative factors, we can approximate this expectation by

$$\begin{aligned} 2^n \cdot 2^{-(maxR-n)^2-(minR-n)^2},\end{aligned}$$
(4)

as \(p_{maxR,minR} \approx 2^{-(maxR-n)^2-(minR-n)^2}\).

We now approximate (up to constant multiplicative factors) the expected values of two specific histogram entries which will be useful for our algorithm. Denote \(\gamma _n = \lfloor (n/2)^{1/2} \rfloor \), and let \(\varvec{r_1} = (n + 1 - \gamma _n ,n - \gamma _n)\) and \(\varvec{r_2} = (n,n - \gamma _n)\). Define the random variables \(S_1 = \mathcal {H}_{\varvec{F},d}(\varvec{r_1})\) and \(S_2 = \mathcal {H}_{\varvec{F},d}(\varvec{r_2})\). Below, we estimate their expected values according to (4).

Write \(\gamma _n = \lfloor (n/2)^{1/2} \rfloor = (n/2)^{1/2} - k\), where \( 0 \le k < 1\). Hence, with very high probability we have \(S_2 = \mathcal {H}_{\varvec{F},d}(\varvec{r_2}) = \mathcal {H}_{\varvec{F},d}(n,n - \gamma _n) \approx 2^n \cdot p_{n,n - \gamma _n} \approx 2^n \cdot 2^{-(\gamma _n)^2} = 2^n \cdot 2^{-((n/2)^{1/2} - k)^2} = 2^n \cdot 2^{-n/2 + 2k(n/2)^{1/2} - k^2} = 2^{n/2 + O(n^{1/2})}.\) Therefore, \(S_2\) is close to \(2^{n/2}\).

Similarly \(S_1 = \mathcal {H}_{\varvec{F},d}(\varvec{r_1}) = \mathcal {H}_{\varvec{F},d}(n + 1 - \gamma _n ,n - \gamma _n) \approx 2^n \cdot p_{n + 1 - \gamma _n ,n - \gamma _n} \approx 2^n \cdot 2^{-(\gamma _n-1)^2-(\gamma _n)^2} = 2^n \cdot 2^{-2(\gamma _n)^2+2\gamma _n-1} = 2^n \cdot 2^{-n + (4k+2)(n/2)^{1/2} -2k^2-2k-1} = 2^{\varTheta (n^{1/2} )}\). Hence \(S_1\) is sub-exponential in n.
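For concreteness, the expectations of \(S_1\) and \(S_2\) can be tabulated directly from (3). The short Python sketch below (our own illustration) prints \(\log _2\) of the expected entries for a few values of n, together with the reference value n/2; the growing gap between the two columns illustrates that \(S_1\) is sub-exponential while \(S_2\) stays close to \(2^{n/2}\).

import math

ALPHA = 1.0
for i in range(1, 100):
    ALPHA *= 1.0 - 2.0 ** (-i)

def beta(r, n):
    prod = 1.0
    for i in range(1, n - r + 1):
        prod *= (1.0 - 2.0 ** (-i)) ** (-2)
    return 2.0 ** (-(r - n) ** 2) * ALPHA * prod

def p(max_r, min_r, n):
    # Eq. (3): probability that R_{F,n-2,h} equals (max_r, min_r)
    val = beta(max_r, n) * beta(min_r, n)
    return 2.0 * val if min_r < max_r else val

for n in (16, 32, 64, 128):
    g = int((n / 2) ** 0.5)                       # gamma_n
    exp_s1 = 2.0 ** n * p(n + 1 - g, n - g, n)    # expected S_1 for r_1
    exp_s2 = 2.0 ** n * p(n, n - g, n)            # expected S_2 for r_2
    print(n, round(math.log2(exp_s1), 1), round(math.log2(exp_s2), 1), n / 2)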

8 Details of the New Affine Equivalence Algorithm

In this section we describe and analyze our new affine equivalence algorithm. We start with a description of the auxiliary algorithms it uses.

8.1 The Rank Table and Histogram Algorithm

For \(\varvec{F}: \{0,1\}^n \rightarrow \{0,1\}^n\) represented by \(\varvec{P} = \{P^{(i)}(x[1],\ldots ,x[n])\}_{i=1}^{n}\), the following algorithm computes the rank histogram \(\mathcal {H}_{\varvec{F},d}\) and rank table \(\mathcal {T}_{\varvec{F},d}\) for \(d=n-2\). The algorithm is given as input \(\varvec{P}_{(\ge n-2)}\).

figure a

Note that \((\varvec{P}_{(\ge n-2)} \circ C_{|h,0})_{(\ge n-2)} = (\varvec{P} \circ C_{|h,0})_{(\ge n-2)}\) holds according to the first part of Lemma 1.

The time complexity of the algorithm depends on how a polynomial is represented. Here, we represent it using a bit array that specifies the values of its monomial coefficients.

We first analyze the complexity of computing the composition \((\varvec{P}_{(\ge n-2)} \circ C_{|h,c})_{(\ge n-2)}\) in Step 1.(a), which is performed for each non-zero \(h\in \{0,1\}^n\) and \(c \in \{0,1\}\). Each of the n polynomials of \(\varvec{P}_{(\ge n-2)}\) contains at most \(\left( {\begin{array}{c}n\\ n\end{array}}\right) + \left( {\begin{array}{c}n\\ n-1\end{array}}\right) + \left( {\begin{array}{c}n\\ n-2\end{array}}\right) < n^2\) non-zero monomials. As described in Sect. 2, computing the composition \(\varvec{P}_{(\ge n-2)} \circ C_{|h,c}\) requires substituting one of the n variables with a linear combination of the remaining \(n-1\) variables (while renaming the variables of the monomials).

In total, for each polynomial of \(\varvec{P}_{(\ge n-2)} \circ C_{|h,c}\), we compose its at most \(n^2\) monomials with a linear combination of size n, which requires \(n^2 \cdot n = n^3\) bit operations. However, as we are only interested in monomials of degree at least \(n-2\), the outcome \((\varvec{P}_{(\ge n-2)} \circ C_{|h,c})_{(\ge n-2)}\) is a polynomial with at most \(\left( {\begin{array}{c}n-1\\ n-1\end{array}}\right) + \left( {\begin{array}{c}n-1\\ n-2\end{array}}\right) = 1 + n-1 =n\) monomials, and the average complexity can easily be reduced to \(n^2\) using low-level optimization techniques.Footnote 8

In conclusion, the average complexity of computing the n polynomials of \((\varvec{P}_{(\ge n-2)} \circ C_{|h,0})_{(\ge n-2)}\) is \(n \cdot n^2 = n^3\) and the total time spent on composition is \(n^3 \cdot 2^n\) bit operations (up to multiplicative constant factors). Similarly, Gaussian elimination requires \(n^3\) bit operations, hence the total time complexity of the algorithm is \(n^3 \cdot 2^n\) bit operations.
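Since the algorithm itself is given only as a figure, the following Python sketch outlines its structure under explicit assumptions: coeff_matrix(h, c) is a hypothetical helper (not defined in the paper) that returns the rows of the \(n \times n\) Boolean coefficient matrix of \((\varvec{P}_{(\ge n-2)} \circ C_{|h,c})_{(\ge n-2)}\) as integers; the loop and the GF(2) rank computation then follow the description above.

from collections import defaultdict

def gf2_rank(rows):
    # Gaussian elimination over GF(2); each row is an integer bit mask
    pivots = {}
    for row in rows:
        while row:
            b = row.bit_length() - 1
            if b in pivots:
                row ^= pivots[b]
            else:
                pivots[b] = row
                break
    return len(pivots)

def rank_table_and_histogram(n, coeff_matrix):
    # coeff_matrix(h, c): hypothetical helper returning the n rows of the
    # n x n Boolean matrix of (P_(>=n-2) o C_|h,c)_(>=n-2) as integers.
    table = defaultdict(list)       # T_{F,n-2}: rank group -> list of HSMs
    histogram = defaultdict(int)    # H_{F,n-2}: rank group -> group size
    for h in range(1, 2 ** n):      # every non-zero HSM h
        r0 = gf2_rank(coeff_matrix(h, 0))
        r1 = gf2_rank(coeff_matrix(h, 1))
        group = (max(r0, r1), min(r0, r1))    # R_{F,n-2,h}
        table[group].append(h)
        histogram[group] += 1
    return table, histogram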

8.2 The Unique HSM Algorithm

The following algorithm computes the HSM rank histogram multi-set \(\mathcal {HM}_{\varvec{F},n-2,\varvec{r},\varvec{r'}}\) and uses it to compute a set of unique HSMs, denoted by \(U_{\varvec{F}}\). This set contains triplets of the form \((h,c,\mathcal {HG}_{\varvec{F},n-2,h,\varvec{r'}})\), where \(h \in \mathcal {T}_{\varvec{F},n-2}(\varvec{r})\) is unique with respect to \(\varvec{F},n-2,\varvec{r'}\) and \(c \in \{0,1\}\) is its attached constant. Note that for the attached constant to be defined, we must have \(maxR > minR\), where \((maxR,minR) = \varvec{r}\).

The algorithm is given as input the rank table \(\mathcal {T}_{\varvec{F},n-2}\) and rank group indexes \(\varvec{r},\varvec{r'}\).

figure b

The time complexity of the algorithm is the product of sizes of the rank groups \(|\mathcal {T}_{\varvec{F},n-2}(\varvec{r})| \cdot |\mathcal {T}_{\varvec{F},n-2}(\varvec{r'})| = \mathcal {H}_{\varvec{F},n-2}(\varvec{r}) \cdot \mathcal {H}_{\varvec{F},n-2}(\varvec{r'})\).

Since the goal of the affine equivalence algorithm will be to find n linearly independent unique HSMs, it is useful to estimate their number. In the extended version of this paper [9] we lower bound the expected number of unique HSMs in \(\mathcal {HM}_{\varvec{F},n-2,\varvec{r},\varvec{r'}}\) asymptotically (ignoring constant factors) based on Assumption 1, given that \(\varvec{F}\) is a random permutation. More specifically, we obtain the lower bound of \(S - S^2/\sqrt{S'}\), where \(S = \mathcal {H}_{\varvec{F},n-2}(\varvec{r})\), and \(S' = \mathcal {H}_{\varvec{F},n-2}(\varvec{r'})\).
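A possible realization of this step in Python is sketched below (our own illustration): hsm_histogram(h, group) and attached_constant(h) are hypothetical helpers standing in for the computations of \(\mathcal {HG}_{\varvec{F},n-2,h,\varvec{r'}}\) and of the attached constant; unique HSMs are exactly those whose histogram has multiplicity 1 in the multi-set.

from collections import Counter

def unique_hsms(rank_table, r, r_prime, hsm_histogram, attached_constant):
    # rank_table: the rank table T_{F,n-2}; hsm_histogram and
    # attached_constant are hypothetical helpers (see the text above).
    group_r = rank_table.get(r, [])
    group_r_prime = rank_table.get(r_prime, [])
    histograms = {h: hsm_histogram(h, group_r_prime) for h in group_r}
    multiplicity = Counter(histograms.values())      # the multi-set HM
    return [(h, attached_constant(h), hist)
            for h, hist in histograms.items()
            if multiplicity[hist] == 1]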

8.3 The Affine Transformation \(A_1\) Recovery Algorithm

Assume that we have affine equivalent functions \(\varvec{F}\) and \(\varvec{G}\) such that \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\) and \(A_1(x) = L(x) + a\).

The following algorithm recovers \(A_1\) using the sets of unique HSMs \(U_{\varvec{F}}\) and \(U_{\varvec{G}}\), computed with the previous algorithm of Sect. 8.2 (where its invocations for \(\varvec{F}\) and \(\varvec{G}\) use the same parameter values \(\varvec{r}, \varvec{r'}\)). Since \(\varvec{F}\) and \(\varvec{G}\) are affine equivalent, the HSM histograms of the HSMs in these sets have to match according to Theorem 2. Each equal HSM histogram pair reveals the matching \(h \mapsto _{A_1} h'\), and its associated constant is revealed by adding the attached constants of \(h,h'\) (which are defined in case \(maxR > minR\), where \((maxR,minR) = \varvec{r}\)), again by Theorem 2.

Each matching \(h \mapsto _{A_1} h'\) and its associated constant give linear equations on the columns of L and on a (respectively) according to Lemma 5. Assuming that \(U_{\varvec{F}}\) and \(U_{\varvec{G}}\) contain n linearly independent unique HSMs, \(A_1\) is recovered by linear algebra.

figure c

The complexity of the algorithm is about \(n \cdot n^3 = n^4\) bit operations, which is polynomial in n. Since we solve the same linear system (whose coefficient matrix is given by the \(h'\) vectors) \(n+1\) times with different constant vectors, the complexity can be reduced to \(n^3\) by inverting the matrix that defines the system once.
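The remark above amounts to inverting the coefficient matrix once and applying the inverse to each of the \(n+1\) constant vectors. The Python sketch below (our own illustration; the construction of the equations from Lemma 5 is abstracted away, and rows and vectors are encoded as n-bit integers) shows the GF(2) inversion and the matrix-vector product that would be reused for every right-hand side.

def gf2_invert(rows, n):
    # Gauss-Jordan inversion over GF(2); rows[i] is an n-bit integer whose
    # bit j is the entry of M in row i, column j. Returns the rows of M^-1,
    # or None if M is singular.
    aug = [(rows[i] << n) | (1 << i) for i in range(n)]     # [ M | I ]
    for r, col in enumerate(range(n, 2 * n)):               # columns of M
        pivot = next((i for i in range(r, n) if (aug[i] >> col) & 1), None)
        if pivot is None:
            return None
        aug[r], aug[pivot] = aug[pivot], aug[r]
        for i in range(n):
            if i != r and (aug[i] >> col) & 1:
                aug[i] ^= aug[r]
    return [row & ((1 << n) - 1) for row in aug]

def gf2_mat_vec(rows, vec, n):
    # y = M x over GF(2); x, y and the rows of M are n-bit integers
    return sum((bin(rows[i] & vec).count("1") & 1) << i for i in range(n))

Inverting the matrix once costs about \(n^3\) bit operations, after which each of the \(n+1\) right-hand sides is handled by a single \(n^2\) matrix-vector product.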

8.4 The New Affine Equivalence Algorithm

We describe the new affine equivalence algorithm below. Let \(\varvec{r_1} = (n + 1 - \gamma _n ,n - \gamma _n)\) and \(\varvec{r_2} = (n,n - \gamma _n)\) for \(\gamma _n = \lfloor (n/2)^{1/2} \rfloor \), as defined in Sect. 7.

figure d

The correctness of the algorithm follows from the correctness of the sub-procedures it executes and from the results obtained so far. In particular, Step 2 is correct according to Lemma 9, while the correctness of Step 3 is based on Theorem 2. The correctness of the final step is trivial.

Step 3 is the most complex to analyze in terms of success probability and complexity (which is the product \(\mathcal {H}_{\varvec{F},n-2}(\varvec{r_1}) \cdot \mathcal {H}_{\varvec{F},n-2}(\varvec{r_2})\)). We first focus on the time complexity analysis of the other steps.

The complexities of Steps 4 and 5 are at most polynomial in n and can be neglected. Step 1 interpolates the \(2^n\) ANF coefficients for each of the n output bits of \(\varvec{F},\varvec{G}\). Each such interpolation can be performed in \(n2^n\) bit operations using the Moebius transform [12]. Hence this step requires \(n^2 2^n\) bit operations and \(2^n\) function evaluations in total. The complexity of Step 2 was shown to be \(n^3 2^n\) bit operations, while the complexity of Step 6 is \(2^n\) function evaluations.
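For reference, the Moebius (ANF) transform used in Step 1 can be implemented with the standard in-place butterfly over the truth table. The sketch below is a textbook Python implementation (not taken from the paper); it interpolates all \(2^n\) ANF coefficients of a single output bit in \(n 2^n\) bit operations.

def moebius_transform(truth_table, n):
    # truth_table: the 2^n values of one output bit of F, indexed by the
    # input x. Returns the 2^n ANF coefficients: entry m is the coefficient
    # of the monomial formed by the variables in the support of m.
    t = list(truth_table)
    for i in range(n):
        step = 1 << i
        for x in range(1 << n):
            if x & step:
                t[x] ^= t[x ^ step]
    return t

For example, entry \(2^n-1\) of the result is the coefficient of the full-degree monomial \(x[1] \cdots x[n]\), i.e., the XOR of the output bit over all inputs.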

In total, the time complexity of the algorithm is at most \(n^3 2^n\) bit operations and \(2^n\) function evaluations, assuming that the complexity of Step 3 does not dominate the algorithm (as we show below).

The memory complexity is \(2^n\) words of n bits, but it can be significantly reduced in some cases as described in Sect. 8.6.

Asymptotic Analysis of the Unique HSM Algorithm. As in Sect. 7, denote \(S_1 = \mathcal {H}_{\varvec{F},d}(\varvec{r_1})\) and \(S_2 = \mathcal {H}_{\varvec{F},d}(\varvec{r_2})\) and recall that their expected values are \(2^{\varTheta (n^{1/2})}\) and \(2^{n/2 + O(n^{1/2})}\), respectively.

The expected asymptotic complexity of the unique HSM algorithm is therefore at most

$$S_1 \cdot S_2 = 2^{n/2 + O(n^{1/2})} \ll 2^{n}.$$

Hence, the complexity of Step 3 is negligible compared to the complexity of the remaining steps of the affine equivalence algorithm described above.

According to the analysis of the unique HSM algorithm given in the extended version of this paper [9], the asymptotic lower bound on the expected number of unique HSMs in \(\mathcal {HM}_{\varvec{F},n-2,\varvec{r},\varvec{r'}}\) is

$$S_1 - (S_1)^2/\sqrt{S_2} > 2^{\varTheta (n^{1/2})} - 2^{\varTheta (n^{1/2})}/2^{n/4 + O(n^{1/2})} = 2^{\varTheta (n^{1/2})} \gg n.$$

Out of these unique HSMs, n are very likely to be linearly independent. This shows that asymptotically the algorithm succeeds with overwhelming probability.

We remark that there are many possible ways to select the rank group indexes \(\varvec{r_1}\) and \(\varvec{r_2}\) that give similar results.

8.5 Experimental Results

In practice we do not pre-fix the rank groups of \(\varvec{F},\varvec{G}\) for which we run the unique HSM algorithm. Instead, we select a reference rank group \(\varvec{r'}\) such that \(|\mathcal {T}_{\varvec{F},n-2}(\varvec{r'})| \approx 2^{n/2}\) (as for \(\varvec{r_2}\) defined above). We then iterate over the rank groups \(\varvec{r}\) from the smallest to the largest, while collecting unique HSMs using repeated executions of the unique HSM algorithm with inputs \(\varvec{r},\varvec{r'}\). We stop once we collect n linearly independent unique HSMs. This practical variant is more flexible and succeeds whenever the variant above succeeds.

We implemented the algorithm and tested it for various values of \(8 \le n \le 28\). In each trial we first selected the permutation \(\varvec{F}\) uniformly at random. We then chose invertible affine mappings \(A_1,A_2\) uniformly at random and defined \(\varvec{G} = A_2 \circ \varvec{F} \circ A_1\). After calculating these inputs, we executed the algorithm and verified that it correctly recovered \(A_1,A_2\).

Following this initial verification, our goal was to collect statistics that support the asymptotic complexity analysis of the unique HSM algorithm above. For this purpose, we selected the permutation \(\varvec{F}\) at random and calculated the success rate and complexity of Step 3, which executes the unique HSM algorithm on \(\varvec{F}\) (after running Steps 1 and 2). If Step 3 succeeds in returning n linearly independent unique HSMs with a certain complexity for \(\varvec{F}\), then it would succeed with identical complexity on any affine equivalent \(\varvec{G}\), hence analyzing a single permutation is sufficient for the purpose of gathering statistics.

Our results for \(n \in \{8,12,16,20,24,28\}\) are summarized in Table 1. This table shows that all the trials for the various choices of n were successful. In terms of complexity, for \(n=8\), the unique HSM algorithm had to iterate over \(2^{10.5} > 2^8\) HSMs on average in order to find 8 unique linearly independent HSMs. This relatively high complexity is due to the fact that our asymptotic analysis ignores constants whose effect is more pronounced for smaller values of n. Nevertheless, the complexity of the algorithm for \(n=8\) in terms of bit operations remains roughly \(8^3 \cdot 2^8\), as the unique HSM algorithm does not perform linear algebra.

For \(n \ge 12\), the average complexity of the unique HSM algorithm is below \(2^n\), and this gap increases as n grows (as predicted by the asymptotic analysis). Note that the complexity drops in two cases (between \(n=16\) and \(n=20\) and between \(n=24\) and \(n=28\)) since for larger n we have more non-empty rank groups of various sizes and hence more flexibility in the algorithm (which happens to be quite substantial for \(n=20\) and \(n=28\)). Finally, we note that we did not optimize the index \(\varvec{r'}\) of the reference rank group and better options that improve the complexities are likely to exist. However, since the unique HSM algorithm does not dominate the overall complexity, such improvements would have negligible effect.

In addition to the experiments on random permutations, we also performed simulations on random functions and obtained similar results.

8.6 Additional Variants of the New Affine Equivalence Algorithm

We describe several variants of the affine equivalence algorithm.

Table 1. Experimental results for the unique HSM algorithm

Using Rank Group Sums. The first additional variant we describe uses the rank tables of \(\varvec{F},\varvec{G}\) to directly recover several matchings in the initial stage of the algorithm. It is based on the observation that for each non-empty rank group \(\varvec{r}\), the HSM obtained by summing all HSMs in \(\mathcal {T}_{\varvec{G},n-2}(\varvec{r})\) has to match (under \(\mapsto _{(A_1)}\)) the HSM obtained by summing all HSMs in \(\mathcal {T}_{\varvec{F},n-2}(\varvec{r})\), due to the additive property of the HSM bijection. Simple analysis (based on Assumption 1 and backed up by experimental results) shows that the number of non-empty rank groups for a random permutation \(\varvec{F}\) is at least n/4 with very high probability. Hence we can initially recover at least n/4 matchings using this approach. There are several ways to recover the remaining matchings by exploiting the fact that we have essentially reduced the size of the problem from \(2^n\) to at most \(2^{3n/4}\). We can also continue in a similar way, further exploiting additive properties of the bijection: we take a uniquely matched HSM pair \(h,h'\). For \(\varvec{G}\), we compute the HSM rank table for h with respect to some rank group \(\varvec{r'}\) by adding it to all HSMs in this group. We do the same for \(\varvec{F}\) by computing the HSM rank table of \(h'\) with respect to \(\varvec{r'}\). As in the initial observation, the sums of HSMs in each non-empty rank group of these smaller tables for \(\varvec{F},\varvec{G}\) match under \(\mapsto _{(A_1)}\), revealing additional matchings. We repeat this process for several uniquely matched HSM pairs (computing additional HSM rank tables) until we identify the required n linearly independent matchings.
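As a concrete illustration of the first observation, the per-group sums can be read off the rank tables directly. The Python sketch below (our own illustration, reusing the rank table structure of Sect. 8.1 with HSMs encoded as n-bit integers) pairs the XOR of all HSMs in each non-empty rank group of \(\varvec{G}\) with the XOR of the corresponding group of \(\varvec{F}\).

def group_sum_matchings(table_F, table_G):
    # table_F, table_G: rank tables T_{F,n-2} and T_{G,n-2}
    # (rank group index -> list of HSMs encoded as n-bit integers).
    matchings = []
    for group, hs_G in table_G.items():
        hs_F = table_F.get(group, [])
        if not hs_F:
            continue
        sum_G, sum_F = 0, 0
        for h in hs_G:
            sum_G ^= h
        for h in hs_F:
            sum_F ^= h
        if sum_G and sum_F:
            # sum_G must map to sum_F under the HSM bijection induced by A_1
            matchings.append((sum_G, sum_F))
    return matchings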

Reducing the Memory Complexity. The memory complexity of the algorithm is about \(2^n\) words of n bits. If the functions \(\varvec{F},\varvec{G}\) are given as truth tables, then the memory complexity cannot be reduced by much. However, if we are given access to \(\varvec{F},\varvec{G}\) via oracles (e.g., they are implemented by block ciphers with a fixed key), then we can significantly reduce the memory complexity with no substantial effect on the time complexity.

First, instead of using the Moebius transform in Step 1 in order to interpolate all the coefficients of \(\varvec{F},\varvec{G}\), we simply interpolate each of the relevant \(\approx n^2\) coefficients of degree at least \(n-2\) independently, increasing the complexity of Step 1 by a factor of about n. Next, in Step 2 we do not store the entire rank table, but only the relevant rank groups with indexes \(\varvec{r_1}\) and \(\varvec{r_2}\). As a result, we now have to recompute the ranks of \(S_1 \cdot S_2\) HSMs in Step 3, but this requires much lower complexity than \(2^n\).

Overall, the memory complexity of this low-memory variant is dominated by the size of the largest rank group stored in memory, \(S_2\), which is a bit more than \(2^{n/2}\). Finally, by a different choice of the rank groups with indexes \(\varvec{r_1}\) and \(\varvec{r_2}\), it is possible to reduce the memory complexity to be sub-exponential in n.

Multiple Solutions to the Affine Equivalence Problem. Consider the case where there are two or more solutions of the form \((A_1^{(i)},A_2^{(i)})\) to an instance of the affine equivalence problem \(\varvec{F},\varvec{G}\). This may occur (for example) if \(\varvec{F}\) is self-affine equivalent, namely, there exist \((A_1,A_2)\) (that are not both identities) such that \(\varvec{F} = A_2 \circ \varvec{F} \circ A_1\). We note that this case is extremely unlikely if \(\varvec{F}\) is chosen uniformly at random for \(n \ge 8\), but it may occur for specific choices (e.g., the AES Sbox is self-affine equivalent).

In case of multiple solutions, a straightforward application of the affine equivalence algorithm would fail, as a HSM h would most likely match a different \(h'^{(i)}\) for each solution \(A_1^{(i)}\), namely \(h \mapsto _{A_1^{(i)}} h'^{(i)}\). Consequently, we would not be able to find sufficiently many unique HSMs in Step 4. However, we can tweak the algorithm to deal with this case by working on each match \(h \mapsto _{A_1^{(i)}} h'^{(i)}\) separately. More specifically, according to Lemma 6 we know that \(\varvec{F} \circ C_{|h'^{(i)},c} \equiv \varvec{G} \circ C_{|h,0}\) and we can apply the algorithm recursively on these functions.

Affine Equivalences Among a Set of Functions. We consider a generalization of the affine equivalence problem that was described in [3]. Given a set of K functions \(\{\varvec{F_i}\}_{i=1}^K\), our goal is to partition them into groups of affine equivalent functions. The naive approach is to run the affine equivalence algorithm on each pair of functions, which results in complexity of \(K^2 \cdot n^3 2^n\).

We can improve this complexity by noticing that up to Step 4 of the affine equivalence algorithm the functions \(\varvec{F},\varvec{G}\) are analyzed independently. In particular, we can compute the rank histogram \(\mathcal {H}_{\varvec{F_i},n-2}\) for each function \(\varvec{F_i}\) independently (as done in Step 2) in time \(n^3 2^n\), and then sort the functions and classify them according to their rank histograms.Footnote 9 This reduces the time complexity to about \(K \cdot n^3 2^n + \tilde{O}(K^2)\) (where \(\tilde{O}\) hides a small polynomial factor in n), improving upon the time complexity of \(K \cdot n^3 2^{2n} + \tilde{O}(K^2)\), obtained in [3].
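A sketch of the classification step (our own illustration): each function is mapped to a canonical, hashable encoding of its rank histogram, and the functions are grouped by that key. Functions falling into different classes are provably not affine equivalent, while functions within the same class are candidates that can then be checked pairwise with the full algorithm.

from collections import defaultdict

def group_by_rank_histogram(histograms):
    # histograms[i] is the rank histogram H_{F_i,n-2} of the i-th function,
    # given as a dict: rank group index -> entry size.
    classes = defaultdict(list)
    for i, hist in enumerate(histograms):
        key = tuple(sorted(hist.items()))   # canonical, hashable encoding
        classes[key].append(i)
    return list(classes.values())           # candidate equivalence classes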

9 Applications

We describe applications of the affine equivalence algorithm and then focus on additional applications of the new objects and algorithms defined in this paper.

9.1 Applications of the New Affine Equivalence Algorithm

Algorithms for the affine equivalence problem are useful in several contexts such as classification of Sboxes [6, 14], producing equivalent representations of block ciphers [3] and attacking white-box ciphers [15]. In all of these contexts, if the goal is to apply the algorithm a few times to functions with a small domain size n, then the main algorithm of Biryukov et al. [3] is already practical and there is little to be gained by using our algorithm.

On the other hand, our algorithm may provide an advantage if the goal is to solve the affine equivalence problem on functions with larger domain sizes (e.g., the domain size of the CAST Sbox [1] is \(n=32\)). Furthermore, our algorithm may be beneficial if we need to solve the affine equivalence problem for many functions with domain size \(n \ge 8\). For example, if we want to classify a large set of 8-bit Sboxes produced based on some design criteria, we can use the variant that searches for affine equivalences among a set of functions (described in Sect. 8.6).

An additional application (which is also described in [3]) is cryptanalysis of a generalization of the Even-Mansour scheme. The original scheme [11] builds a block cipher using a public permutation \(\varvec{F} : \{0,1\}^n \rightarrow \{0,1\}^n\) and a pair of n-bit keys \(k_1,k_2\) by defining the encryption function on a plaintext \(p \in \{0,1\}^n\) as \(\varvec{E}(p) = \varvec{F}(p + k_1) + k_2\). Breaking the scheme may be considered as a special case of solving the affine equivalence problem where the linear matrices are identities. Thus, in the generalized scheme, arbitrary affine transformations \(A_1,A_2\) are used as the key and the encryption function is defined as \(\varvec{E}(p) = A_2 \circ \varvec{F} \circ A_1(p)\). Clearly, breaking the generalized Even-Mansour scheme reduces to solving the affine equivalence problem. The currently best known attack on this scheme (given in [3]) requires about \(2^{3n/2}\) time and memory. It uses a birthday paradox based approach that generalizes Daemen’s attack on the original Even-Mansour cipher [8]. Hence, we improve the complexity of the best known attack on the generalized Even-Mansour cipher from about \(2^{3n/2}\) to \(2^n\).

9.2 Additional Applications

We describe additional applications of the rank table and histogram objects defined in this paper, and the algorithm used to compute them.

Application to Decomposition of the ASASA Construction. The ASASA construction is an SP-network that consists of three secret affine layers (A) interleaved with two secret Sbox layers (S). At ASIACRYPT 2014, Biryukov et al. [2] proposed several concrete ASASA block cipher designs as candidates for white-box cryptography, whose security was based on the alleged difficulty of recovering their internal components. These designs were subsequently broken in [16] and [10].

Of particular interest is the integral attack of [10]. While the full details of this attack are out of the scope of this paper, we focus on its heaviest computational step that consists of summing over about \(2^n\) affine subspaces of dimension slightly less than n (where n is the block size of the scheme). This step was performed in [10] in complexity of about \(2^{3n/2}\). We can improve the complexity of this step (and the complexity of the full attack) to about \(2^n\) by using a symbolic algorithm which is similar to the one used for computing the rank table.

Application to Distinguishers on Sboxes and Block Ciphers. In [4] Biryukov and Perrin considered the problem of reverse-engineering Sboxes and proposed techniques to check whether a given Sbox was selected at random or was designed according to some unknown criteria. These techniques are based on the linear approximation table (LAT) and difference distribution table (DDT) of the Sbox. Here, we provide another method based on the distribution of entries in the rank histogram of the Sbox. More specifically, an Sbox would be considered suspicious if its rank histogram entry sizes differ significantly from their expected values according to the distribution derived in Sect. 7 (supported by the experimental results of the extended version of this paper [9]).

An advantage of our proposal is that the LAT and DDT require about \(2^{2n}\) time and memory to compute and store, whereas the rank histogram can be computed in time of about \(2^n\). Hence, our proposal can be used to analyze larger Sboxes. We can also use additional properties of HSM rank histogram multi-sets (such as the number of unique HSMs) as possible distinguishing techniques.

In a related application, the rank table (and additional structures defined in this paper) can be used to experimentally construct distinguishers on block ciphers with a small block size (e.g., 32 bits). This is done by selecting a few keys for the block cipher at random and detecting consistent deviations from random among the resultant permutations. In particular, if there is a linear combination of the output bits that is a low-degree function of some \((n-1)\)-dimensional input subspace, then we can detect it in time complexity of about \(2^n\). Since there are \(2^{n+1}\) possible \((n-1)\)-dimensional affine subspaces and \(2^n\) linear combinations of output bits, we search over a space of \(2^{2n+1}\) possible distinguishers in about \(2^n\) time. This can be viewed as an improvement over known experimental methods [19] that search a much smaller space containing about \(n^2\) potential high-order differential distinguishers in similar complexity (these methods only consider the input and output bits, but not their linear combinations). Finally, the technique can also be used on block ciphers with larger block sizes by considering linear subspaces of the input domain and output range.

10 Conclusions and Open Problems

In this paper we described an improved algorithm for the affine equivalence problem, focusing on randomly chosen permutations. The main open problem is to further improve the algorithm’s complexity and applicability. An additional future work item is to find more applications for the rank table and related structures defined in this paper.