1 Introduction

The analysis of multidimensional time series is a standard problem in data science. Usually, as a first step, features of a time series must be extracted that are (in some sense) robust and that characterize the time series. In many applications the features should additionally be invariant to a particular group acting on the data. In Human Activity Recognition for example, the orientation of the measuring device is often unknown. This leads to the requirement of rotation invariant features [37]. In EEG analysis, invariants to the general linear group are beneficial [12]. In other applications, the labeling of coordinates is arbitrary, which leads to permutation invariant features.

As any time series in discrete time can, via linear interpolation, be thought of as a multidimensional curve, one is naturally led to the search for invariants of curves. Invariant features of curves have been treated using various approaches, mostly focusing on two-dimensional curves. Among the techniques are Fourier series (of closed curves) [21, 27, 52], wavelets [6], curvature based methods [2, 36] and integral invariants [13, 35].

The usefulness of iterated integrals in data analysis has recently been realized, see for example [20, 26, 32, 51] and the introduction in [5]. Let us demonstrate the appearance of iterated integrals in a very simple example. Let \(X: [0,T] \to \mathbb{R}^{2}\) be a smooth curve. Say we are looking for a feature describing this curve that remains unchanged if one is handed a rotated version of \(X\). Perhaps the simplest such feature is the (squared) total displacement length \(|X_{T} - X_{0}|^{2}\). Now,

$$\begin{aligned} |X_{T} - X_{0}|^{2} &= \bigl(X^{1}_{T} - X^{1}_{0}\bigr)^{2} + \bigl(X^{2}_{T} - X ^{2}_{0}\bigr)^{2} \\ &= 2 \int _{0}^{T} \bigl( X^{1}_{r} - X^{1}_{0} \bigr) \dot{X}^{1} _{r} dr + 2 \int _{0}^{T} \bigl( X^{2}_{r} - X^{2}_{0} \bigr) \dot{X}^{2}_{r} dr \\ &= 2 \int _{0}^{T} \biggl( \int _{0}^{r} \dot{X}^{1}_{u} du \biggr) \dot{X}^{1}_{r} dr + 2 \int _{0}^{T} \biggl( \int _{0}^{r} \dot{X}^{2} _{u} du \biggr) \dot{X}^{2}_{r} dr \\ &= 2 \int _{0}^{T} \int _{0}^{r} d X^{1}_{u} d X^{1}_{r} + 2 \int _{0} ^{T} \int _{0}^{r} d X^{2}_{u} d X^{2}_{r}, \end{aligned}$$

where we have applied the fundamental theorem of calculus twice and then introduced the notation \(dX^{i}_{r}\) for \(\dot{X}^{i}_{r} dr\). We see that we have expressed this simple invariant in terms of iterated integrals of \(X\); the collection of which is usually called its signature. The aim of this work can be summarized as describing all invariants that can be obtained in this way. It turns out, when formulated in the right way, this search for invariants reduces to classical problems in invariant theory. We note that already in the early work of Chen (see for example [4, Chap. 3]) the topic of invariants arose, although a systematic study was missing (see also [23]).

The aim of this work is threefold. Firstly, we adapt classical results in invariant theory regarding non-commuting polynomials (or, equivalently, multilinear maps) to our situation. These results are spread out in the literature and sometimes need a little massaging. Secondly, we lay out the usefulness of the iterated-integral signature in the search for invariants of \(d\)-dimensional curves. We show, see Sect. 7, that certain “integral invariants” found in the literature are in fact found in the signature and that our approach simplifies their enumeration. Lastly, we present new geometric insights into some entries found in the signature, Sect. 3.3.

The paper is structured as follows. In the next section we introduce the iterated-integral signature of a multidimensional curve, as well as some algebraic language to work with it. Based on this signature, we present in Sect. 3 and Sect. 4 invariants to the general linear group and the special orthogonal group. Both are based on classical results in invariant theory. For completeness, we present in Sect. 5 the invariants to permutations, which have been constructed in [1]. In Sect. 6 we show how to use all these invariants if an additional (time) coordinate is introduced. In Sect. 7 we relate our work to the integral invariants of [13] and demonstrate that the invariants presented there cannot be complete. We formulate the conjecture of completeness for our invariants and point out open algebraic questions.

For readers who want to use these invariants without having to go into the technical results, we propose the following route. The required notation is presented in the next section. The invariants are presented in Proposition 3.11, Proposition 4.4 and Proposition 5.4. Examples are given in Sect. 3.1 (in particular Remark 3.14), Example 4.7 and Example 5.6. All these invariants are also implemented in the software package [9]. For calculating the iterated-integral signature in Python we propose using the package iisignature, as described in [40].
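For readers who want to experiment right away, the following minimal sketch (assuming numpy and the iisignature package are installed; the data points are made up for illustration) computes a signature with iisignature and evaluates the simplest \(\operatorname{GL}\) invariant of Proposition 3.11:

```python
# Compute the signature of a piecewise linear curve up to level 2 and
# evaluate the GL invariant 12 - 21. We assume iisignature's layout:
# levels are flattened in order, and within a level the entries are
# ordered lexicographically by word.
import numpy as np
import iisignature

X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])  # data points in R^2
sig = iisignature.sig(X, 2)   # entries: (1), (2), (11), (12), (21), (22)
s12, s21 = sig[3], sig[4]
print(s12 - s21)              # <S(X), x1 x2 - x2 x1>, twice the signed area
```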

2 The Signature of Iterated Integrals

By a multidimensional curve \(X\) we will denote a continuous mapping \(X: [0,T] \to \mathbb{R}^{d}\) of bounded variation. The aim of this work is to find features (i.e. complex or real numbers) describing such a curve that are invariant under the general linear group, the group of rotations and the group of permutations. Note that in practical situations one is usually presented with a discrete sequence of data points in \(\mathbb{R}^{d}\), a multidimensional time series. Such a time series can be easily transformed into a (piecewise) smooth curve by linear interpolation.

It was proven in [22], which extends the work of [4], that a curve \(X = (X^{1},.., X^{d})\) is almost completely characterized by the collection of its iterated integrals

$$\begin{aligned} \int _{0}^{T} \int _{0}^{r_{n}} \dots \int _{0}^{r_{2}} dX^{i_{1}}_{r _{1}} \dots dX^{i_{n}}_{r_{n}}, \quad n \ge 1,\quad i_{1}, \dots , i_{n} \in \{1,\dots ,d\}. \end{aligned}$$

The collection of all these integrals is called the signature of \(X\). In a first step, we can hence reduce the goal

Find functions \(\varPsi : \text{curves} \to \mathbb{R}\) that are invariant under the action of a group \(G\) ,

to the goal

Find functions \(\varPsi : \text{signature of curves} \to \mathbb{R}\) that are invariant under the action of a group \(G\) .

By the shuffle identity (Lemma 2.1), any polynomial function on the signature can be re-written as a linear function on the signature. Assuming that arbitrary functions are well-approximated by polynomial functions, we are led to the final simplification, which is the goal of this paper:

Find linear functions \(\varPsi : \text{signature of curves} \to \mathbb{R}\) that are invariant under the action of a group \(G\) .

2.1 Algebraic Underpinning

Let us introduce some algebraic notation in order to work with the collection of iterated integrals. Denote by \(T((\mathbb{R}^{d}))\) the space of formal power series in \(d\) non-commuting variables \(x_{1}, x_{2}, \dots , x_{d}\). We can conveniently store all the iterated integrals of the curve \(X\) in \(T((\mathbb{R}^{d}))\), by defining the signature of \(X\) to be

$$\begin{aligned} S(X)_{0,T} := \sum x_{i_{1}} \dots x_{i_{n}} \int _{0}^{T} \int _{0} ^{r_{n}} \dots \int _{0}^{r_{2}} dX^{i_{1}}_{r_{1}} \dots dX^{i_{n}} _{r_{n}}. \end{aligned}$$

Here the sum is taken over all \(n \ge 0\) and all \(i_{1}, \dots , i _{n} \in \{1,2,..,d\}\). For \(n=0\) the summand is, for algebraic reasons, taken to be the constant 1.

The algebraic dual of \(T((\mathbb{R}^{d}))\) is \(T(\mathbb{R}^{d})\), the space of polynomials in \(x_{1}, x_{2}, \dots , x _{d}\). The dual pairing, denoted by \(\langle \cdot , \cdot \rangle \) is defined by declaring all monomials to be orthonormal, so for example

$$\begin{aligned} \Big\langle x_{1} + 15\cdot x_{1} x_{2} - 2\cdot x_{1} x_{2} x_{1},\ x _{1} x_{2} \Big\rangle = 15. \end{aligned}$$

Here, we write the element of \(T((\mathbb{R}^{d}))\) on the left and the element of \(T(\mathbb{R}^{d})\) on the right. We can “pick out” iterated integrals from the signature as follows

$$\begin{aligned} \Big\langle S(X)_{0,T},\ x_{i_{1}} \dots x_{i_{n}} \Big\rangle = \int _{0}^{T} \int _{0}^{r_{n}} \dots \int _{0}^{r_{2}} dX^{i_{1}}_{r_{1}} \dots dX^{i_{n}}_{r_{n}}. \end{aligned}$$

The space \(T((\mathbb{R}^{d}))\) becomes an algebra by extending the usual product of monomials, denoted ⋅, to the whole space by bilinearity. Note that ⋅ is non-commutative.

On \(T(\mathbb{R}^{d})\) we often use the shuffle product ⧢ which, on monomials, interleaves them in all order-preserving ways, so for example

$$\begin{aligned} x_{1} ⧢ x_{2} x_{3} = x_{1} x_{2} x_{3} + x_{2} x_{1} x_{3} + x_{2} x_{3} x_{1}. \end{aligned}$$

Note that ⧢ is commutative.

Monomials, and hence homogeneous polynomials, have the usual concept of order or homogeneity. For \(n \ge 0\) we denote the projection on polynomials of order \(n\) by \(\pi _{n}\), so for example

$$\begin{aligned} \pi _{2} ( x_{1} + 15\cdot x_{1} x_{2} - 2\cdot x_{1} x_{2} x _{1} ) = 15\cdot x_{1} x_{2}. \end{aligned}$$

See [41] for more background on these spaces.

As mentioned above, every polynomial expression in terms of the signature can be re-written as a linear expression in (different) terms of the signature. This is the content of the following lemma, which is proven in [39] (see also [41, Corollary 3.5]).

Lemma 2.1

(Shuffle identity)

Let \(X: [0,T] \to \mathbb{R}^{d}\) be a continuous curve of bounded variation, then for every \(a,b \in T(\mathbb{R}^{d})\)

$$\begin{aligned} \bigl\langle S(X)_{0,T}, a \bigr\rangle \bigl\langle S(X)_{0,T}, b \bigr\rangle = \bigl\langle S(X)_{0,T}, a ⧢ b \bigr\rangle . \end{aligned}$$

Remark 2.2

We have used this fact already in the introduction, where we confirmed by hand that

$$\begin{aligned} \bigl\langle S(X)_{0,T}, x_{i} \bigr\rangle \bigl\langle S(X)_{0,T}, x_{i} \bigr\rangle = \bigl\langle S(X)_{0,T}, x_{i} ⧢ x_{i} \bigr\rangle = 2 \bigl\langle S(X)_{0,T}, x_{i} x_{i} \bigr\rangle , \quad i = 1,2. \end{aligned}$$
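The identity is also easy to confirm numerically. A sketch (assuming numpy and iisignature; the random curve is made up for illustration) checking \(\langle S, x_{1} \rangle \langle S, x_{2} \rangle = \langle S, x_{1}x_{2} \rangle + \langle S, x_{2}x_{1} \rangle \), since \(x_{1} ⧢ x_{2} = x_{1}x_{2} + x_{2}x_{1}\):

```python
import numpy as np
import iisignature

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)).cumsum(axis=0)   # a random piecewise linear curve
s1, s2, s11, s12, s21, s22 = iisignature.sig(X, 2)
print(np.isclose(s1 * s2, s12 + s21))             # shuffle identity: True
```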

The concatenation of curves is compatible with the product on \(T((\mathbb{R}^{d}))\) in the following sense (for a proof, see for example [16, Theorem 7.11]).

Lemma 2.3

(Chen’s relation)

For curves \(X: [0,T] \to \mathbb{R}^{d}\), \(Y: [0,T] \to \mathbb{R}^{d}\) denote their concatenation

$$\begin{aligned} X \sqcup Y: [0,2T] \to \mathbb{R}^{d}, \end{aligned}$$

as \(X_{\cdot }\) on \([0,T]\) and \(Y_{\cdot -T} - Y_{0} + X_{T}\) on \([T,2T]\). Then

$$\begin{aligned} S(X \sqcup Y)_{0,2T} = S(X)_{0,T} \cdot S(Y)_{0,T}. \end{aligned}$$
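Chen's relation can likewise be checked numerically at low levels. In the following sketch (assuming numpy and iisignature; the curves are made up), the level-one part of \(S(X \sqcup Y)\) is \(\pi _{1} S(X) + \pi _{1} S(Y)\) and the level-two part is \(\pi _{2} S(X) + \pi _{2} S(Y) + \pi _{1} S(X) \otimes \pi _{1} S(Y)\):

```python
import numpy as np
import iisignature

rng = np.random.default_rng(1)
d = 3
X = rng.standard_normal((10, d)).cumsum(axis=0)
Y = rng.standard_normal((10, d)).cumsum(axis=0)
XY = np.vstack([X, Y - Y[0] + X[-1]])    # concatenation X ⊔ Y as in Lemma 2.3

sX, sY, sXY = (iisignature.sig(P, 2) for P in (X, Y, XY))
lvl1 = sX[:d] + sY[:d]                                      # level 1 of S(X)·S(Y)
lvl2 = sX[d:] + sY[d:] + np.outer(sX[:d], sY[:d]).ravel()   # level 2 of S(X)·S(Y)
print(np.allclose(sXY, np.concatenate([lvl1, lvl2])))       # True
```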

We will use the following fact repeatedly, which also explains the commonly used name tensor algebra for \(T(\mathbb{R}^{d})\).

Lemma 2.4

The space of all multilinear maps on \(\mathbb{R}^{d} \times \cdots \times \mathbb{R}^{d}\) (\(n\)-times) is in a one-to-one correspondence with homogeneous polynomials of order \(n\) in the non-commuting variables \(x_{1},\dots ,x_{d}\) by the following bijective linear map

$$\begin{aligned} \psi \mapsto \mathsf{poly}(\psi ) := \sum_{i_{1}, \dots , i_{n} \in \{1,\dots ,d\}} \psi (e_{i_{1}}, e_{i _{2}}, \dots , e_{i_{n}}) x_{i_{1}} \cdot x_{i_{2}} \cdot .. \cdot x _{i_{n}}, \end{aligned}$$

with \(e_{i}\) being the \(i\)-th canonical basis vector of \(\mathbb{R}^{d}\).

For example, with \(d=2\) and \(n=3\), we can consider the multilinear map \(\psi\) which takes \(\bigl((a_{1},b_{1}),(a_{2},b_{2}),(a_{3},b_{3})\bigr) \in \mathbb{R}^{2} \times \mathbb{R}^{2} \times \mathbb{R}^{2}\) to the number \(a_{1}a_{2}b_{3}\). It maps to \(\mathsf{poly}(\psi)=x_{1}x_{1}x_{2}\).

3 General Linear Group

Let

$$\begin{aligned} \operatorname{GL}\bigl(\mathbb{R}^{d}\bigr) = \bigl\{ A \in \mathbb{R}^{d\times d} : \det ( A ) \ne 0 \bigr\} , \end{aligned}$$

be the general linear group of \(\mathbb{R}^{d}\).

Definition 3.1

For \(w \in \mathbb{N}\), we call \(\phi \in T(\mathbb{R}^{d})\) a \(\operatorname{GL}\) invariant of weight \(w\) if

$$\begin{aligned} \bigl\langle S(A X)_{0,T}, \phi \bigr\rangle = (\det A)^{w} \bigl\langle S(X)_{0,T}, \phi \bigr\rangle \end{aligned}$$

for all \(A \in \operatorname{GL}(\mathbb{R}^{d})\) and all curves \(X\).

Definition 3.2

Define a linear action of \(\operatorname{GL}(\mathbb{R}^{d})\) on \(T((\mathbb{R}^{d}))\) and \(T(\mathbb{R}^{d})\) by specifying it on monomials

$$\begin{aligned} A x_{i_{1}} .. x_{i_{n}} :=& \sum_{j} (A e_{i_{1}})_{j_{1}} x_{j_{1}} .. (A e_{i_{n}})_{j_{n}} x_{j_{n}} \\ =& \sum_{j} A_{j_{1} i_{1}} .. A_{j_{n} i_{n}} x_{j_{1}} .. x_{j_{n}}. \end{aligned}$$

Lemma 3.3

For all \(A \in \mathbb{R}^{d\times d}\) and any curve \(X\),

$$\begin{aligned} \bigl\langle S(A X)_{0,T}, \phi \bigr\rangle = \bigl\langle S(X)_{0,T}, A^{\top }\phi \bigr\rangle . \end{aligned}$$

Proof

It is enough to verify this on monomials \(\phi = x_{\ell _{1}} .. x _{\ell _{m}}\). Then, since the \(\ell _{r}\)-th component of the curve \(A X\) is equal to \((A X)^{\ell _{r}} = \sum_{j_{r}} A_{\ell _{r} j_{r}} X^{j_{r}}\), we get

$$\begin{aligned} \bigl\langle S( A X ), \phi \bigr\rangle &= \int d (A X)^{\ell _{1}} \dots d (A X)^{\ell _{m}} \\ &= \sum_{j} A_{\ell _{1} j_{1}} \dots A_{\ell _{m} j_{m}} \int d X^{j _{1}} \dots d X^{j_{m}} \\ &= \biggl\langle S(X), \sum_{j} A_{\ell _{1} j_{1}} x_{j_{1}} .. A_{ \ell _{m} j_{m}} x_{j_{m}} \biggr\rangle \\ &= \bigl\langle S(X), A^{\top }\phi \bigr\rangle . \end{aligned}$$

 □

We can simplify the concept of \(\operatorname{GL}\) invariants further, using the next lemma. Owing to the shuffle identity, signatures of curves live in a nonlinear subset of the whole tensor algebra \(T((\mathbb{R}^{d}))\), the set of “grouplike elements” (compare [41, Sect. 3.1]). It turns out though that they linearly span all of \(T((\mathbb{R}^{d}))\).

Lemma 3.4

For \(n\ge 1\)

$$\begin{aligned} \operatorname{span} \bigl\{ \pi _{n} S(X)_{0,T} : X \textit{ curve} \bigr\} = \pi _{n} T\bigl( \bigl(\mathbb{R}^{d}\bigr)\bigr). \end{aligned}$$
(1)

Proof

It is clear by definition that the left hand side of (1) is included in \(\pi _{n} T((\mathbb{R}^{d}))\). We show the other direction and use ideas of [3, Proposition 4]. Let \(x_{i_{n}} \cdot \ldots \cdot x_{i_{1}} \in \pi _{n} T((\mathbb{R} ^{d}))\) be given. Let \(X\) be the piecewise linear path that results from the concatenation of the vectors \(t_{1} e_{i_{1}}\), \(t_{2} e_{i_{2}}\) up to \(t_{n} e_{i_{n}}\), where \(e_{i}\), \(i=1,..,d\) is the standard basis of \(\mathbb{R}^{d}\). Its signature is given by (see for example [16, Chap. 6])

$$\begin{aligned} S(X)_{0,1} = \exp ( {t_{n} x_{i_{n}}} ) \cdot \ldots \cdot \exp ( t _{1} x_{i_{1}} ) =: \phi (t_{1}, \dots , t_{n}), \end{aligned}$$

where the exponential function is defined by its power series. Then

$$\begin{aligned} \frac{d}{dt_{n}} \dots \frac{d}{dt_{1}} \phi (0,\dots ,0) = x_{i_{n}} \cdot \ldots \cdot x_{i_{1}}. \end{aligned}$$

Combining this with the fact that the left-hand side of (1) is a closed set we get that

$$\begin{aligned} x_{i_{n}} \cdot \ldots \cdot x_{i_{1}} \in \operatorname{span} \bigl\{ \pi _{n}\bigl( S(X) _{0,1} \bigr) : X \text{ curve} \bigr\} . \end{aligned}$$

These elements span \(\pi _{n} T((\mathbb{R}^{d}))\), which finishes the proof. □

Hence \(\phi \) is a \(\operatorname{GL}\) invariant of weight \(w\) in the sense of Definition 3.1 if and only if for all \(A \in \operatorname{GL}(\mathbb{R}^{d})\)

$$\begin{aligned} A^{\top }\phi = (\det A)^{w} \phi . \end{aligned}$$

Since the action respects homogeneity, we immediately obtain that projections of invariants are invariants (take \(B = (\det A)^{-w} A ^{\top }\) in the following lemma):

Lemma 3.5

If \(\phi \in T(\mathbb{R}^{d})\) satisfies

$$\begin{aligned} B \phi = \phi , \end{aligned}$$

for some \(B \in \operatorname{GL}(\mathbb{R}^{d})\) then

$$\begin{aligned} B \pi _{n} \phi = \pi _{n} \phi , \end{aligned}$$

for all \(n \ge 1\).

Proof

By definition, the action of \(\operatorname{GL}\) on \(T(\mathbb{R}^{d})\) commutes with \(\pi _{n}\). □

In order to apply classical results in invariant theory, we use the bijection \(\mathsf{poly}\) between multilinear functions and non-commuting polynomials, given in Lemma 2.4.

Lemma 3.6

For \(\psi : (\mathbb{R}^{d})^{\times n} \to \mathbb{R}\) multilinear and \(A \in \operatorname{GL}(\mathbb{R}^{d})\),

$$\begin{aligned} \mathsf{poly}\bigl[ \psi ( A \cdot ) \bigr] = A^{\top }\mathsf{poly}[ \psi ]. \end{aligned}$$

Proof

$$\begin{aligned} \mathsf{poly}\bigl[ \psi ( A \cdot ) \bigr] &= \sum _{i} \psi ( A e_{i_{1}}, .. A e_{i_{n}} ) x_{i_{1}} .. x_{i_{n}} \\ &= \sum_{i,j} A_{j_{1} i_{1}} .. A_{j_{n} i_{n}} \psi ( e_{j_{1}}, .. e_{j_{n}} ) x_{i_{1}} .. x_{i_{n}} \\ &= \sum_{j} \psi ( e_{j_{1}}, .. e_{j_{n}} ) A^{\top }x_{j_{1}} .. x _{j_{n}} \\ &= A^{\top }\mathsf{poly}[ \psi ]. \end{aligned}$$

 □

The simplest multilinear function

$$\begin{aligned} \varPsi : \bigl(\mathbb{R}^{d}\bigr)^{\times n} \to \mathbb{R}, \end{aligned}$$

satisfying \(\varPsi ( A v_{1},.., A v_{n} ) = \det ( A ) \varPsi (v_{1},.., v_{n})\) that comes to mind is the determinant itself. That is, \(n=d\) and

$$\begin{aligned} \varPsi (v_{1},..,v_{n}) = \det [ v_{1} v_{2} .. v_{n} ], \end{aligned}$$

where \(v_{1} v_{2} .. v_{n}\) is the \(d \times d\) matrix with columns \(v_{i}\). Up to a scalar this is in fact the only one, and it turns out that invariants of higher weight are built only using determinants as a building block.

To state the following classical result, we introduce the notion of Young diagrams, which play an important role in the representation theory of the symmetric group.

Let \(\lambda = (\lambda _{1},.., \lambda _{r})\) be a partition of \(n \in \mathbb{N}\), which we assume ordered as \(\lambda _{1} \ge \lambda _{2} \ge .. \ge \lambda _{r}\). We associate to it a Young diagram, which is an arrangement of \(n\) boxes into left-justified rows. There are \(r\) rows, with \(\lambda _{i}\) boxes in the \(i\)-th row. For example, the partition \((4,2,1)\) of 7 gives the Young diagram

$$\begin{aligned} \begin{matrix} \square & \square & \square & \square \\ \square & \square & & \\ \square & & & \end{matrix} \end{aligned}$$

A Young tableau is obtained by filling these boxes with the numbers \(1,.., n\). Continuing the example, the following is a Young tableau

$$\begin{aligned} \begin{matrix} 4 & 7 & 1 & 2 \\ 3 & 6 & & \\ 5 & & & \end{matrix} \end{aligned}$$

A Young tableau is standard if the values in every row are increasing (from left to right) and are increasing in every column (from top to bottom). The previous tableau was not standard; the following is.

$$\begin{aligned} \begin{matrix} 1 & 2 & 5 & 7 \\ 3 & 4 & & \\ 6 & & & \end{matrix} \end{aligned}$$

The following result is classical, see for example Dieudonné [10, Sect. 2.5], [50] and [18], none of which explicitly give a basis for the invariants though. See [47, Theorem 4.1.12] for a slightly different basis.

Theorem 3.7

The space of multilinear maps

$$\begin{aligned} \psi : \underbrace{\mathbb{R}^{d}\times \cdots \times \mathbb{R}^{d}} _{n \textit{ times}} \to \mathbb{R} \end{aligned}$$

that satisfy

$$\begin{aligned} \psi (A v_{1}, A v_{2}, \dots , A v_{n}) = (\det A)^{w} \psi (v_{1}, v _{2}, \dots , v_{n}) \end{aligned}$$

for all \(A \in \operatorname{GL}(\mathbb{R}^{d})\) and \(v_{1}, \dots , v_{n} \in \mathbb{R}^{d}\) is non-trivial if and only if \(n = w d\) for some integer \(w \ge 1\).

In that case, a linear basis is given by

$$\begin{aligned} \bigl\{ v \mapsto \det [ v_{C_{1}} ] .. \det [ v_{C_{w}} ] \bigr\} \end{aligned}$$

where \(C_{i}\) are the columns of \(\varSigma \), and \(\varSigma \) ranges over all standard Young tableaux corresponding to the partition \(\lambda = \overbrace{(w, w,.., w)}^{d~\textit{times}}\) of \(n\).

Here, for a sequence \(C = (c_{1},..,c_{d})\), \(v_{C}\) denotes the matrix of column vectors \(v_{c_{i}}\), i.e.

$$\begin{aligned} v_{C} = (v_{c_{1}},.., v_{c_{d}}). \end{aligned}$$

Remark 3.8

A consequence of this theorem is the existence of identities between products of determinants. For example, for vectors \(v_{1},.., v_{4} \in \mathbb{R}^{2}\), one can check by hand

$$\begin{aligned} \det [ v_{1} v_{4} ] \det [ v_{2} v_{3}] = \det [ v_{1} v_{3} ] \det [ v_{2} v_{4}] - \det [ v_{1} v_{2} ] \det [ v_{3} v_{4}]. \end{aligned}$$

This is why the product on the left-hand side here is not part of the basis in the previous theorem for \(d=2\), \(w=2\) (compare Sect. 3.1).

Identities of this type are called Plücker identities. They have a long history and are a major ingredient in the representation theory of the symmetric group. The procedure of reducing certain products of determinants to a basic set of such products is called the straightening algorithm [44, Sect. 2.6]. See also [30] and [48].
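The Plücker identity above is easily confirmed numerically; a sketch with numpy and made-up random vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
v = {i: rng.standard_normal(2) for i in (1, 2, 3, 4)}  # v_1,..,v_4 in R^2

def D(i, j):
    # det [ v_i v_j ], the 2x2 determinant with columns v_i and v_j
    return np.linalg.det(np.column_stack([v[i], v[j]]))

print(np.isclose(D(1, 4) * D(2, 3), D(1, 3) * D(2, 4) - D(1, 2) * D(3, 4)))
```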

Remark 3.9

The only invariant for \(d=2\), \(w=1\) is

$$\begin{aligned} x_{1} x_{2} - x_{2} x_{1} = [x_{1},x_{2}], \end{aligned}$$

a Lie polynomial. One can generally ask for invariant Lie polynomials [41, Sect. 8.6.2]. This seems to be of no relevance to the application of invariant feature extraction for curves though.

Remark 3.10

Let \(C^{(d)}_{w}\) be the number of linearly independent invariants of weight \(w\). By Theorem 3.7, this is the number of standard Young tableaux of shape \((w, w,.., w)\). By the hook length formula [44, Theorem 3.10.2]

$$\begin{aligned} C^{(d)}_{w} &= \frac{\prod_{\ell =1}^{d-1} \ell !}{\prod_{\ell =1} ^{d} (w+\ell )^{d-\ell }} \binom{ d\cdot w }{ w, w,.., w } \\ &= \frac{\prod_{\ell =1}^{d-1} \ell ! \cdot (d\cdot w)!}{\prod_{ \ell =0}^{d-1} (w+\ell )!}. \end{aligned}$$

For example for \(d=2\), the number of invariants for weights \(w=0,1,2,3,\ldots\) (and hence for levels \(n=0,2,4,6,\ldots\)) are (the Catalan numbers, https://oeis.org/A000108)

$$\begin{aligned} 1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, .. \end{aligned}$$

For \(d=3\), the number of invariants for weights \(w=0,1,2,3,\ldots\) (and hence for levels \(n=0,3,6,9,\ldots\)) are (the 3-dimensional Catalan numbers, https://oeis.org/A005789)

$$\begin{aligned} 1, 1, 5, 42, 462, 6006, 87516, 1385670, 23371634, 414315330, .. \end{aligned}$$
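These counts can be reproduced in a few lines of Python; a sketch (the function name is ours) implementing the hook length formula for the rectangular shape \((w,..,w)\):

```python
from math import factorial

def num_invariants(d, w):
    """Number of standard Young tableaux of shape (w,)*d, by the hook length formula."""
    hooks = 1
    for i in range(d):              # cell (i, j) has hook length (w - j) + (d - i) - 1
        for j in range(w):
            hooks *= (w - j) + (d - i) - 1
    return factorial(d * w) // hooks

print([num_invariants(2, w) for w in range(1, 8)])  # 1, 2, 5, 14, 42, 132, 429
print([num_invariants(3, w) for w in range(1, 6)])  # 1, 5, 42, 462, 6006
```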

Proof of Theorem 3.7

Write \(V = (\mathbb{R}^{d})^{*}\), the dual space of \(\mathbb{R}^{d}\). Every \(\phi \in V^{\otimes n}\) that satisfies

$$\begin{aligned} A \phi = (\det A)^{w} \phi , \end{aligned}$$
(2)

clearly spans a one-dimensional irreducible representation of \(\operatorname{GL}(V)\). Hence we need to investigate all one-dimensional irreducible representations of \(\operatorname{GL}(V)\) contained in \(V^{\otimes n}\) (and it will turn out that all of them satisfy (2)).

The (diagonal) action of \(\operatorname{GL}(V)\) on \(V^{\otimes n}\) is best understood by simultaneously studying the left action of \(S_{n}\) on \(V^{\otimes n}\) given by

$$\begin{aligned} \tau \cdot v_{1} \otimes .. \otimes v_{n} := v_{\tau ^{-1}(1)} \otimes .. \otimes v_{\tau ^{-1}(n)}. \end{aligned}$$

By Schur-Weyl duality, [29, Theorem 6.4.5.2], as \(S_{n} \times \operatorname{GL}(V)\) modules,

$$\begin{aligned} V^{\otimes n} \simeq \bigoplus_{\lambda \vdash n} S^{\lambda }\otimes V^{\lambda }, \end{aligned}$$
(3)

where the sum is over integer partitions \(\lambda \) of \(n\), the \(S^{\lambda }\) are irreducible representations of \(S_{n}\), to be detailed below, and the \(V^{\lambda }\) are irreducible representations of \(\operatorname{GL}(V)\). The exact form of the latter is irrelevant here; we only need to know that \(V^{\lambda }\) is one-dimensional if and only if \(\lambda = (w,..,w)\), \(d\)-times, for some integer \(w \ge 1\), [10, p. 21]. This gives the condition \(n = w d\) in the statement. We assume this to hold from now on.

We are hence left with understanding the unique copy of the “Specht module” \(S^{\lambda }\) inside of \(V^{\otimes n}\). We sketch its classical construction. Let us recall that a tabloid is an equivalence class of Young tableaux modulo permutations leaving the set of entries in each row invariant [44, Chap. 2]. For \(t\) a Young tableau denote \(\{ t \}\) its tabloid, so for example

$$\begin{aligned} \biggl\{ \begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix} \biggr\} = \biggl\{ \begin{matrix} 2 & 1 \\ 3 & 4 \end{matrix} \biggr\} . \end{aligned}$$

The symmetric group \(S_{n}\) acts on Young tableaux as

$$\begin{aligned} (\tau \cdot t)_{ij} := \tau ( t_{ij} ). \end{aligned}$$

For example, with \(\tau = (1 2)\),

$$\begin{aligned} (1 2) \cdot \begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix} = \begin{matrix} 2 & 1 \\ 3 & 4 \end{matrix} . \end{aligned}$$

It then acts on tabloids by \(\tau \cdot \{t\} := \{ \tau \cdot t \}\). Define for a Young tableau \(t\)

$$\begin{aligned} e_{t} := \sum_{\pi } \operatorname{sign}( \pi ) \pi \cdot \{ t \}, \end{aligned}$$

where the sum is over all \(\pi \in S_{n}\) that leave the set of values in each column invariant. For example with

$$\begin{aligned} t = \begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix} \end{aligned}$$

we get

$$\begin{aligned} e_{t} = \{ t \} - (1 3) \cdot \{ t \} - (2 4) \cdot \{ t \} + (1 3) (2 4) \cdot \{t\}. \end{aligned}$$

Then

$$\begin{aligned} \operatorname{Irrep}_{(w,..,w)} := \operatorname{span}\bigl\{ e_{t} : t \text{ Young tableau of shape } (w,..,w) \bigr\} \end{aligned}$$

is an irreducible representation of \(S_{n}\) and

$$\begin{aligned} \bigl\{ e_{t} : t \text{ standard Young tableau of shape $(w,..,w)$} \bigr\} , \end{aligned}$$

forms a basis [44, Theorem 2.5.2]. This concludes the reminder on representation theory for \(S_{n}\).

Define the map \(\iota \) from the space of tabloids of shape \((w,..,w)\) into \(V^{\otimes n}\) as follows,

$$\begin{aligned} \iota \bigl( \{ t \} \bigr) := e_{j_{1}}^{*} \otimes .. \otimes e_{j_{n}}^{*}, \end{aligned}$$

where \(e_{i}^{*}\) is the canonical basis of \(V\) and

$$\begin{aligned} j_{\ell }= i \quad \Leftrightarrow \quad \ell \in \text{ $i$-th row of } \{ t \}. \end{aligned}$$

For example

$$\begin{aligned} \iota \biggl( \biggl\{ \begin{matrix} 1 & 3 \\ 2 & 4 \end{matrix} \biggr\} \biggr) = e_{1}^{*} \otimes e_{2}^{*} \otimes e_{1}^{*} \otimes e_{2}^{*}. \end{aligned}$$

This is a homomorphism of \(S_{n}\) representations. Indeed,

$$\begin{aligned} \iota \bigl( \tau \cdot \{t\} \bigr) = e_{j_{1}}^{*} \otimes .. \otimes e_{j _{n}}^{*}, \end{aligned}$$

with

$$\begin{aligned} j_{\ell }= i \quad \Leftrightarrow\quad \ell \in \text{ $i$-th row of } \tau \cdot \{t\}. \end{aligned}$$

On the other hand

$$\begin{aligned} \tau \cdot \iota \bigl( \{t\} \bigr) &= \tau \cdot e_{r_{1}}^{*} \otimes .. \otimes e_{r_{n}}^{*} \\ &= e_{p_{1}}^{*} \otimes .. \otimes e_{p_{n}}^{*}, \end{aligned}$$

with \(p_{\ell }:= r_{\tau ^{-1}(\ell )}\) and

$$\begin{aligned} p_{\ell }= i &\quad \Leftrightarrow\quad r_{\tau ^{-1}(\ell )} = i \\ &\quad \Leftrightarrow\quad \tau ^{-1}(\ell ) \in \text{ $i$-th row of } \{t\} \\ &\quad \Leftrightarrow\quad \ell \in \text{ $i$-th row of } \tau \cdot \{t\}. \end{aligned}$$

So indeed \(\iota ( \tau \cdot \{t\} ) = \tau \cdot \iota ( \{t\} )\), and \(\iota \) is a homomorphism of \(S_{n}\) representations. It is a bijection from the space of \((w,..,w)\) tabloids into the space spanned by the vectors

$$\begin{aligned} e^{*}_{i_{1}} \otimes .. \otimes e^{*}_{i_{n}} : \#\{ \ell : i_{ \ell }= j \} = w, \quad j = 1,.., d. \end{aligned}$$

Restricting to \(\operatorname{Irrep}_{(w,..,w)}\) then yields an isomorphism of irreducible \(S_{n}\) representations. Hence \(\iota ( \operatorname{Irrep}_{(w,..,w)} )\) is the (unique) realization of \(S^{\lambda }\) inside of \(V^{\otimes n}\) in (3). We finish by describing its image.

Consider the standard Young tableau \(t_{\mathrm{first}}\) of shape \((w,w,..,w)\) obtained by filling the columns from left to right, i.e.

$$\begin{aligned} t_{\mathrm{first}} = \begin{matrix} 1 & d+1 & \cdots & (w-1)d+1 \\ 2 & d+2 & \cdots & (w-1)d+2 \\ \vdots & \vdots & & \vdots \\ d & 2d & \cdots & wd \end{matrix} . \end{aligned}$$

Clearly, for any (standard) Young tableau \(t\) there exists a unique \(\sigma _{t} \in S_{n}\) such that

$$\begin{aligned} \sigma _{t} \cdot t_{\mathrm{first}} = t. \end{aligned}$$

We claim

$$\begin{aligned} \iota ( e_{t} ) = \bigl( v \mapsto \det [ v_{\sigma _{t}(1)} .. v_{ \sigma _{t}(d)} ] \cdot ... \cdot \det [ v_{\sigma _{t}((w-1)d+ 1)} .. v _{\sigma _{t}(n)}] \bigr). \end{aligned}$$

Indeed, since \(\iota \) is a homomorphism of \(S_{n}\) representations,

$$\begin{aligned} \iota ( \sigma _{t} \cdot e_{t_{\mathrm{first}}} ) (v_{1},..,v_{n}) &= \iota ( e _{t_{\mathrm{first}}} ) (v_{\sigma _{t}(1)},..,v_{\sigma _{t}(n)}) \end{aligned}$$

It remains to check

$$\begin{aligned} \iota ( e_{t_{\mathrm{first}}} ) = \det [v_{1}..v_{d}] \cdot ... \cdot \det [v _{(w-1)d+1} .. v_{n}]. \end{aligned}$$

Every \(\pi \in S_{n}\) that is column-preserving for \(t_{\mathrm{first}}\) can be written as the product \(\pi _{1} \cdot .. \cdot \pi _{w}\), with \(\pi _{j}\) ranging over the permutations of the entries of the \(j\)-th column of \(t_{\mathrm{first}}\). Then

$$\begin{aligned} \iota ( e_{t_{\mathrm{first}}} ) (v_{1},..,v_{n}) =& \sum _{\pi } \operatorname{sign}\pi \ \iota \bigl( \pi \{t_{\mathrm{first}}\} \bigr) ( v_{1},.., v_{n} ) \\ =& \sum_{\pi _{j}} \prod_{j} \operatorname{sign}\pi _{j}\ \iota \bigl( \pi _{1} .. \pi _{w} \{t_{\mathrm{first}}\} \bigr) ( v_{1},.., v_{n} ) \\ =& \sum_{\pi _{j}} \prod_{j} \operatorname{sign}\pi _{j}\ e^{*}_{\pi ^{-1}_{1}(1)} \otimes .. \otimes e^{*}_{\pi ^{-1}_{1}(d)} \otimes e ^{*}_{((\pi ^{-1}_{2}(d+1)-1) \operatorname{mod}d) + 1} \otimes \cdots \\ &{} \otimes e^{*}_{((\pi ^{-1}_{w}(n)-1) \operatorname{mod}d) + 1} (v _{1},..,v_{n}) \\ =& \det [v_{1}..v_{d}] \cdot ... \cdot \det [v_{(w-1)d+1} .. v_{n}], \end{aligned}$$

as desired. □

Applying Lemma 2.4 to Theorem 3.7 we get the invariants in \(T(\mathbb{R}^{d})\).

Proposition 3.11

A linear basis for the space of \(\operatorname{GL}\) invariants of order \(n = w d\) is given by

$$\begin{aligned} \sum_{i_{1},\dots ,i_{n} \in \{1,\dots ,d\}} g_{\varSigma }(e_{i_{1}},e _{i_{2}},\dots ,e_{i_{n}}) x_{i_{1}} x_{i_{2}} \dots x_{i_{n}}, \end{aligned}$$

where

$$\begin{aligned} g_{\varSigma }(v) = \det [ v_{C_{1}} ] .. \det [ v_{C_{w}} ], \end{aligned}$$

where \(C_{i}\) are the columns of \(\varSigma \), \(\varSigma \) ranges over all standard Young tableaux corresponding to the partition \(\lambda = \underbrace{(w, w,.., w)}_{d~\textit{times}}\) of \(n\), and the notation \(v_{C}\) is as introduced in Theorem 3.7.

Remark 3.12

By Lemma 3.5, for any invariant \(\phi \in T(\mathbb{R}^{d})\) and \(n\ge 1\) we have that \(\pi _{n} \phi \) is also invariant. Hence the previous proposition characterizes all invariants we are interested in (Definition 3.1), not just homogeneous ones.

Remark 3.13

Note that each of these invariants \(\phi \) consists only of monomials that contain every variable \(x_{1}, \dots , x_{d}\) at least once. This implies that \(\langle S(X)_{0,T}, \phi \rangle \) consists only of iterated integrals that contain every component \(X^{1},\dots ,X^{d}\) of the curve at least once. Hence, if at least one of these components is constant, the whole expression will be zero.

Since \(\phi \) is invariant, this implies that \(\langle S(X)_{0,T}, \phi \rangle = 0\) as soon as there is some coordinate transformation under which one component is constant, that is, whenever the curve \(X\) stays in a hyperplane of dimension strictly less than \(d\).

One of the simplest curves in \(d\) dimensions that does not lie in any hyperplane of lower dimension is the moment curve

$$\begin{aligned} t \mapsto \bigl(t,t^{2},..,t^{d}\bigr). \end{aligned}$$

We will come back to this example in Lemma 3.29.

3.1 Examples

We will use the following short notation:

$$\begin{aligned} \mathtt {i_{1} \dots i_{n}} := x_{i_{1}} x_{i_{2}} .. x_{i_{n}}, \end{aligned}$$

so, for example

$$\begin{aligned} \mathtt {1121} := x_{1} x_{1} x_{2} x_{1}. \end{aligned}$$

We present the invariants of Proposition 3.11 for some special cases of \(d\) and \(w\).

The case \(d=2\)

Level 2 (\(w=1\))

$$\begin{aligned} \mathtt {12} - \mathtt {21} \end{aligned}$$

Remark 3.14

Let us make clear that from the perspective of data analysis, the “invariant” of interest is really the action of this element in \(T(\mathbb{R}^{d})\) on the signature of a curve.

In this example, the real number

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \mathtt {12} - \mathtt {21} \bigr\rangle = \int _{0}^{T} \int _{0}^{r_{2}} dX^{1}_{r_{1}} dX^{2}_{r_{2}} - \int _{0}^{T} \int _{0}^{r_{2}} dX^{2}_{r_{1}} dX^{1}_{r_{2}}, \end{aligned}$$

changes only by the determinant of \(A \in \operatorname{GL}( \mathbb{R}^{2})\) when calculating it for the transformed curve \(A X\):

$$\begin{aligned} \bigl\langle S(A X)_{0,T}, \mathtt {12} - \mathtt {21} \bigr\rangle = \det (A)\ \bigl\langle S(X)_{0,T}, \mathtt {12} - \mathtt {21} \bigr\rangle . \end{aligned}$$
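This transformation behaviour can be verified numerically; a sketch (assuming numpy and iisignature; curve and matrix are made-up random data):

```python
import numpy as np
import iisignature

rng = np.random.default_rng(3)
X = rng.standard_normal((15, 2)).cumsum(axis=0)   # a random curve in R^2
A = rng.standard_normal((2, 2))                   # a generic element of GL(R^2)

def area_invariant(path):
    s = iisignature.sig(path, 2)                  # (1), (2), (11), (12), (21), (22)
    return s[3] - s[4]                            # <S, 12 - 21>

print(np.isclose(area_invariant(X @ A.T),         # the transformed curve A X
                 np.linalg.det(A) * area_invariant(X)))
```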

Level 4 (\(w=2\))

$$\begin{aligned} & \mathtt {1212} - \mathtt {1221} - \mathtt {2112} + \mathtt {2121} \\ & \mathtt {1122} - \mathtt {1221} - \mathtt {2112} + \mathtt {2211} \end{aligned}$$

Remark 3.15

This is a linear basis of invariants in the fourth level. If one takes algebraic dependencies into consideration, the set of invariants becomes smaller. To be specific, assume that one already has knowledge of the invariant of level 2 (i.e. \(\langle S(X)_{0,T}, \mathtt {12} - \mathtt {21} \rangle \)). If, say in a machine learning application, the learning algorithm can deal sufficiently well with nonlinearities, one should not be required to provide additionally the square of this number. In other words \(|\langle S(X)_{0,T}, \mathtt {12} - \mathtt {21} \rangle |^{2}\) can also be assumed to be “known”. But, by the shuffle identity (Lemma 2.1), this can be written as

$$\begin{aligned} \bigl|\bigl\langle S(X)_{0,T}, \mathtt {12} - \mathtt {21} \bigr\rangle \bigr|^{2} &= \bigl\langle S(X)_{0,T}, (\mathtt {12} - \mathtt {21}) ⧢ (\mathtt {12} - \mathtt {21}) \bigr\rangle \\ &= \bigl\langle S(X)_{0,T}, 4\cdot \mathtt {1122} - 4\cdot \mathtt {1221} - 4\cdot \mathtt {2112} + 4\cdot \mathtt {2211} \bigr\rangle . \end{aligned}$$

Now, seeing that \(4\cdot \mathtt {1122} - 4\cdot \mathtt {1221} - 4\cdot \mathtt {2112} + 4\cdot \mathtt {2211}\) is invariant, there is only one “new” independent invariant in the fourth level, namely \(\mathtt {1212} - \mathtt {1221} - \mathtt {2112} + \mathtt {2121}\).

A similar analysis can also be carried out for the following invariants, but we refrain from doing so, since it can be easily done with a computer algebra system.

Level 6 (\(w=3\))

$$\begin{aligned} & \mathtt {121212} -\! \mathtt {121221} -\! \mathtt {122112} +\! \mathtt {122121} - \mathtt {211212} + \mathtt {211221} + \mathtt {212112} - \mathtt {212121} \\ & \mathtt {112212} -\! \mathtt {112221} -\! \mathtt {122112} +\! \mathtt {122121} - \mathtt {211212} + \mathtt {211221} + \mathtt {221112} - \mathtt {221121} \\ & \mathtt {121122} -\! \mathtt {121221} -\! \mathtt {122112} +\! \mathtt {122211} - \mathtt {211122} + \mathtt {211221} + \mathtt {212112} - \mathtt {212211} \\ & \mathtt {112122} -\! \mathtt {112221} -\! \mathtt {122112} +\! \mathtt {122211} - \mathtt {211122} + \mathtt {211221} + \mathtt {221112} - \mathtt {221211} \\ & \mathtt {111222} -\! \mathtt {112221} -\! \mathtt {121212} +\! \mathtt {122211} - \mathtt {211122} + \mathtt {212121} + \mathtt {221112} - \mathtt {222111} \end{aligned}$$

The case \(d=3\)

Level \(n=3\) (\(w=1\))

$$\begin{aligned} \mathtt {123} - \mathtt {132} - \mathtt {213} + \mathtt {231} + \mathtt {312} - \mathtt {321} \end{aligned}$$

Level \(n=6\) (\(w=2\))

$$\begin{aligned} \textstyle\begin{array}{l} \mathtt {123123} - \mathtt {312132} + \mathtt {312312} + \mathtt {213132} - \mathtt {213231} - \mathtt {213123} + \mathtt {321213} - \mathtt {312321} - \mathtt {132231} \\ \quad {}- \mathtt {132123} - \mathtt {321231} + \mathtt {321132} + \mathtt {132321} + \mathtt {132213} + \mathtt {231231} + \mathtt {321321} + \mathtt {213321} \\ \quad {}+ \mathtt {123231} + \mathtt {231123} - \mathtt {312213} - \mathtt {321123} - \mathtt {231132} + \mathtt {213213} + \mathtt {132132} + \mathtt {312231} \\ \quad {}- \mathtt {213312} - \mathtt {231321} - \mathtt {132312} - \mathtt {123213} - \mathtt {321312} + \mathtt {312123} - \mathtt {231213} + \mathtt {231312} \\ \quad {}- \mathtt {123321} + \mathtt {123312} - \mathtt {123132} \\ {}+ \text{ $4$ more} \end{array}\displaystyle \end{aligned}$$

The case \(d=4\)

Level \(n=4\) (\(w=1\))

$$\begin{aligned} &\mathtt {1234} - \mathtt {1243} - \mathtt {1324} + \mathtt {1342} + \mathtt {1423} - \mathtt {1432} - \mathtt {2134} + \mathtt {2143} \\ & \qquad + \mathtt {2314} - \mathtt {2341} - \mathtt {2413} + \mathtt {2431} + \mathtt {3124} - \mathtt {3142} - \mathtt {3214} + \mathtt {3241} \\ & \qquad + \mathtt {3412} - \mathtt {3421} - \mathtt {4123} + \mathtt {4132} + \mathtt {4213} - \mathtt {4231} - \mathtt {4312} + \mathtt {4321} \end{aligned}$$

3.2 The Invariant of Weight One, in Dimension Two

Geometric Interpretation

The invariant for \(d=2\), \(w=1\), namely \(\phi = x_{1} x_{2} - x_{2} x_{1}\) has a simple geometric interpretation: it picks out (two times) the area (signed, and with multiplicity) between the curve \(X\) and the chord spanned between its starting and end point (compare Fig. 1). For (smooth) non-intersecting curves, this follows from Green's theorem [43, Theorem 10.33]. For self-intersecting curves, the mathematically most convenient definition of “signed area” is the integral (in the plane) of its winding number. The claimed relation to the invariant \(\phi \) is for example proven in [34, Proposition 1].

Fig. 1: A curve \(X = (X^{1}, X^{2})\) is shown, with shaded area given by \(\frac{1}{2} \langle S(X)_{0,T}, x_{1} x_{2} - x_{2} x_{1} \rangle = \frac{1}{2} \int _{0}^{T} \int _{0}^{r_{2}} dX^{1}_{r_{1}} dX^{2}_{r_{2}} - \frac{1}{2} \int _{0}^{T} \int _{0}^{r_{2}} dX^{2}_{r_{1}} dX^{1}_{r_{2}}\)

Connection to Correlation

Assume that \(X\) is a continuous curve, piecewise linear between some time points \(t_{i}\), \(i=0,\dots ,n\). The area is then explicitly calculated as

$$\begin{aligned} & \int _{0}^{T} \int _{0}^{r} d X^{1}_{u} dX^{2}_{r} - \int _{0}^{T} \int _{0}^{r} dX^{2}_{u} dX^{1}_{r} \\ & \quad = \int _{0}^{T} \bigl( X^{1}_{r} - X^{1}_{0} \bigr) dX^{2}_{r} - \int _{0}^{T} \bigl( X^{2}_{r} - X^{2}_{0} \bigr) dX^{1}_{r} \\ & \quad = \sum_{i=0}^{n-1} \int _{0}^{1} \bigl( X^{1}_{t_{i}} + t \bigl(X^{1}_{t _{i+1}} - X^{1}_{t_{i}}\bigr) - X^{1}_{0} \bigr) \bigl( X^{2}_{t_{i+1}} - X^{2}_{t_{i}} \bigr) dt \\ & \qquad{} - \sum_{i=0}^{n-1} \int _{0}^{1} \bigl( X^{2}_{t_{i}} + t \bigl(X^{2}_{t _{i+1}} - X^{2}_{t_{i}}\bigr) - X^{2}_{0} \bigr) \bigl( X^{1}_{t_{i+1}} - X^{1}_{t_{i}} \bigr) dt \\ & \quad = \sum_{i=0}^{n-1} X^{1}_{t_{i}} X^{2}_{t_{i+1}} - \sum _{i=0}^{n-1} X ^{2}_{t_{i}} X^{1}_{t_{i+1}} + X^{2}_{0} \bigl(X^{1}_{t_{n}} - X^{1}_{t _{0}}\bigr) - X^{1}_{0} \bigl(X^{2}_{t_{n}} - X^{2}_{t_{0}}\bigr) \\ & \quad = \operatorname{Corr}\bigl(X^{2},X^{1} \bigr)_{1} - \operatorname{Corr}\bigl(X^{1},X ^{2} \bigr)_{1} + X^{2}_{0} \bigl(X^{1}_{t_{n}} - X^{1}_{t_{0}}\bigr) - X^{1}_{0} \bigl(X ^{2}_{t_{n}} - X^{2}_{t_{0}}\bigr). \end{aligned}$$

Here, for two vectors \(a\), \(b\) indexed by \(0,\dots ,n\)

$$\begin{aligned} \operatorname{Corr}(a,b)_{1} := \sum_{i=0}^{n-1} a_{i+1} b_{i}, \end{aligned}$$

the lag-one cross-correlation, which is a commonly used feature in signal analysis, see for example [38, Chap. 13.2]. In particular, if the curve starts at 0, we have

$$\begin{aligned} \int _{0}^{T} \int _{0}^{r} d X^{1}_{u} dX^{2}_{r} - \int _{0}^{T} \int _{0}^{r} dX^{2}_{u} dX^{1}_{r} = \operatorname{Corr}\bigl(X^{2},X^{1} \bigr)_{1} - \operatorname{Corr}\bigl(X^{1},X^{2} \bigr)_{1}, \end{aligned}$$

which is an antisymmetrized version of the lag-one cross-correlation.
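In code, this reads as follows (a sketch assuming numpy and iisignature; the curve is made-up random data, translated to start at 0):

```python
import numpy as np
import iisignature

rng = np.random.default_rng(4)
X = rng.standard_normal((30, 2)).cumsum(axis=0)
X = X - X[0]                               # start the curve at the origin

def corr1(a, b):
    # lag-one cross-correlation Corr(a, b)_1 = sum_i a_{i+1} b_i
    return float(np.dot(a[1:], b[:-1]))

s = iisignature.sig(X, 2)
area = s[3] - s[4]                         # <S(X), 12 - 21>
print(np.isclose(area, corr1(X[:, 1], X[:, 0]) - corr1(X[:, 0], X[:, 1])))
```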

Remark 3.16

The antisymmetrized version of the lag \(\tau \) cross-correlation, for each \(\tau \ge 2\), is also a \(\operatorname{GL}(\mathbb{R}^{2})\) invariant of the curve. In general these invariants cannot be found in the signature, and we thank the anonymous referee for pointing out the following example. Consider the treelike curve which linearly interpolates the following points

$$\begin{aligned} (0,0), (1,0), (1,1/2), (1,1), (1,0), (0,0). \end{aligned}$$

Its signature is trivial, but

$$\begin{aligned} \operatorname{Corr}\bigl(X^{1},X^{2}\bigr)_{2} - \operatorname{Corr}\bigl(X^{2},X ^{1}\bigr)_{2} = 1 - 0.5 = 0.5. \end{aligned}$$

3.3 The Invariant of Weight One, in Any Dimension

Whatever the dimension \(d\) of the curve’s ambient space, the space of invariants of weight 1 has dimension 1 and is spanned by

$$\begin{aligned} \operatorname{Inv}_{d} := \operatorname{Inv}_{d}(x_{1},..,x_{d}) := \sum_{\sigma \in S_{d}} \operatorname{sign}(\sigma )\ x_{\sigma (1)} .. x_{\sigma (d)} = \det \begin{pmatrix} x_{1} & .. & x_{d} \\ .. & .. & .. \\ x_{1} & .. & x_{d} \end{pmatrix} . \end{aligned}$$
(4)

Here, for a matrix \(C\) of non-commuting variables (compare [14, Definition 3.1]),

$$\begin{aligned} \det C := \sum_{\tau }\operatorname{sign}\tau \prod _{i} C_{i \tau (i)}. \end{aligned}$$
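For experimentation, \(\operatorname{Inv}_{d}\) can be generated symbolically in a few lines; a sketch (our own representation, not taken from [9]) encoding a polynomial in non-commuting variables as a dictionary from words to coefficients:

```python
from itertools import permutations

def sign(p):
    # sign of a permutation, via its number of inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def Inv(d):
    """Inv_d = sum over sigma in S_d of sign(sigma) x_{sigma(1)} .. x_{sigma(d)}."""
    return {sigma: sign(sigma) for sigma in permutations(range(1, d + 1))}

print(Inv(2))   # {(1, 2): 1, (2, 1): -1}, i.e. x1 x2 - x2 x1
```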

This invariant is of homogeneity \(d\). The following lemma tells us that we can write \(\operatorname{Inv}_{d}\) in terms of expressions of lower homogeneity.

To state it, we first define the operation \(\mathsf{InsertAfter}(x _{i},r)\) on monomials of order \(n \ge r\), as the insertion of the variable \(x_{i}\) after position \(r\), and extend it linearly. For example

$$\begin{aligned} \mathsf{InsertAfter}(x_{1},1) \operatorname{Inv}_{2}(x_{2},x_{3}) &= \mathsf{InsertAfter}(x_{1},1) ( x_{2} x_{3} - x_{3} x_{2} ) \\ &= x_{2} x_{1} x_{3} - x_{3} x_{1} x_{2}. \end{aligned}$$

Lemma 3.17

In any dimension \(d\) and for any \(r=0,1,..,d-1\)

$$\begin{aligned} \operatorname{Inv}_{d}(x_{1},..,x_{d}) = (-1)^{r} \sum_{j=1}^{d} (-1)^{j+1} \mathsf{InsertAfter}(x_{j},r) \operatorname{Inv}_{d-1}( x _{1},.., \widehat{x_{j}} .., x_{d}), \end{aligned}$$

where \(\widehat{x_{j}}\) denotes the omission of that argument.

For \(d\) odd,

$$\begin{aligned} \operatorname{Inv}_{d}(x_{1},..,x_{d}) = \sum_{j=1}^{d} (-1)^{j+1}\ x_{j} ⧢ \operatorname{Inv}_{d-1}( x_{1},.., \widehat{x_{j}},.., x_{d}). \end{aligned}$$

Remark 3.18

For completeness, we also note the related de Bruijn's formula. For \(d\) even,

$$\begin{aligned} \operatorname{Inv}_{d} = \operatorname{Pf}( A ), \end{aligned}$$

where

$$\begin{aligned} A_{ij} = \operatorname{Inv}_{2}(x_{i},x_{j}), \end{aligned}$$

and the Pfaffian (with respect to the shuffle product) is

$$\begin{aligned} \operatorname{Pf}( A ) := \frac{1}{2^{d/2} (d/2)!} \sum_{\sigma \in S_{d}} \operatorname{sign}( \sigma )\ A_{\sigma (1) \sigma (2)} ⧢ \cdots ⧢ A_{\sigma (d-1) \sigma (d)}. \end{aligned}$$

For a proof see [7] and [33].

Proof

The first statement follows from expressing the determinant in (4) in terms of minors with respect to the row \(r+1\) (since the \(x_{i}\) are non-commuting, this does not work with columns!).

Regarding the second statement: using \(x_{j} ⧢ w = \sum_{r=0}^{d-1} \mathsf{InsertAfter}(x_{j},r)\, w\) for monomials \(w\) of order \(d-1\), and then the first statement,

$$\begin{aligned} \sum_{j=1}^{d} (-1)^{j+1}\ x_{j} ⧢ \operatorname{Inv}_{d-1}( x_{1},.., \widehat{x_{j}},.., x_{d}) = \sum_{r=0}^{d-1} (-1)^{r} \operatorname{Inv}_{d}(x_{1},..,x_{d}) = \operatorname{Inv}_{d}(x_{1},..,x_{d}), \end{aligned}$$

since \(d\) is odd and hence the alternating sum \(\sum_{r=0}^{d-1} (-1)^{r}\) equals 1,

as claimed. □

An immediate consequence is the following lemma.

Lemma 3.19

If the ambient dimension \(d\) is odd and the curve \(X\) is closed (i.e. \(X_{T} = X_{0}\)) then

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = 0. \end{aligned}$$

Proof

By Lemma 3.17 and then by the shuffle identity (Lemma 2.1)

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle &= \sum_{j=1}^{d} (-1)^{j+1} \bigl\langle S(X)_{0,T}, x_{j} ⧢ \operatorname{Inv}_{d-1}( x_{1},.., \widehat{x_{j}},.., x_{d}) \bigr\rangle \\ &= \sum_{j=1}^{d} (-1)^{j+1} \bigl\langle S(X)_{0,T}, x_{j} \bigr\rangle \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d-1}( x_{1},.., \widehat{x_{j}},.., x_{d}) \bigr\rangle = 0, \end{aligned}$$

since the increment \(\langle S(X)_{0,T}, x_{j} \rangle = X ^{j}_{T} - X^{j}_{0}\) is zero for all \(j\) by assumption. □

In even dimension we have the phenomenon that closing a curve does not change the value of the invariant.

Lemma 3.20

If the ambient dimension \(d\) is even, then for any curve \(X\)

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = \bigl\langle S(\bar{X})_{0,T}, \operatorname{Inv}_{d} \bigr\rangle , \end{aligned}$$

where \(\bar{X}\) is \(X\) concatenated with the straight line connecting \(X_{T}\) to \(X_{0}\).

Proof

Let \(\bar{X}\) be parametrized on \([0,2T]\) as follows: \(\bar{X} = X\) on \([0,T]\) and it is the linear path connecting \(X_{T}\) to \(X_{0}\) on \([T,2T]\). By translation invariance we can assume \(X_{0} = 0\) and by \(\operatorname{GL}(\mathbb{R}^{d})\)-invariance that \(X_{T}\) lies on the \(x_{1}\) axis. Then the only component of \(\bar{X}\) that is non-constant on \([T,2T]\) is the first one, \(\bar{X}^{1}\).

By Lemma 3.17

$$\begin{aligned} \operatorname{Inv}_{d} = - \sum_{j=1}^{d}(-1)^{j+1} \operatorname{Inv}_{d-1}(x_{1},.., \hat{x}_{j}, .. x_{d}) x_{j}. \end{aligned}$$

Letting the summands act on \(S(\bar{X})_{0,t}\) we get \(\pm 1\) times

$$\begin{aligned} \int _{0}^{t} \bigl\langle S(\bar{X})_{0,r}, \operatorname{Inv}_{d-1}(x _{1},.., \hat{x}_{j}, .. x_{d}) \bigr\rangle d\bar{X}^{j}_{r}. \end{aligned}$$

For \(j\ne 1\) these expressions are constant on \([T,2T]\), since we arranged things so that those \(\bar{X}^{j}\) do not move on \([T,2T]\). But also for \(j=1\) this expression is constant on \([T,2T]\). Indeed, the integrand

$$\begin{aligned} \bigl\langle S(\bar{X})_{0,r}, \operatorname{Inv}_{d-1}(x_{2}, x_{3},.., x_{d}) \bigr\rangle , \end{aligned}$$

is zero on \([T,2T]\), since \(X\), projected on the \(x_{2}-..-x_{d}\) hyperplane, is a closed curve, and so Lemma 3.19 applies. □

Lemma 3.21

Let \(X\) be the piecewise linear curve through \(p_{0},..,p_{d} \in \mathbb{R}^{d}\). Then

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = \det \begin{bmatrix} 1 & 1 & .. & 1 \\ p_{0} & p_{1} & .. & p_{d} \end{bmatrix} . \end{aligned}$$

Proof

First, for any \(v \in \mathbb{R}^{d}\),

$$\begin{aligned} \det \begin{bmatrix} 1 & 1 & .. & 1 \\ p_{0} + v & p_{1} + v & .. & p_{d} + v \end{bmatrix} = \det \begin{bmatrix} 1 & 1 & .. & 1 \\ p_{0} & p_{1} & .. & p_{d} \end{bmatrix} . \end{aligned}$$

Since the signature is also invariant to translation, we can assume \(p_{0} = 0\). Now both sides of the statement transform the same way under the action of \(\operatorname{GL}(\mathbb{R}^{d})\) on the points \(p_{1},..,p_{d}\). It is then enough to prove this for

$$\begin{aligned} p_{0} =& 0 \\ p_{1} =& e_{1} \\ p_{2} =& e_{1} + e_{2} \\ ..& \\ p_{d} =& e_{1} + .. + e_{d}. \end{aligned}$$

Now, for this particular choice of points the right hand side is clearly equal to 1. For the left hand side, the only non-zero term is

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \mathtt {12} .. \mathtt {d} \bigr\rangle &= \int dX ^{1} .. dX^{d} \\ &= 1. \end{aligned}$$

 □
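A quick numerical illustration of the lemma for \(d=2\) (a sketch assuming numpy and iisignature; the three points are made-up random data):

```python
import numpy as np
import iisignature

rng = np.random.default_rng(5)
P = rng.standard_normal((3, 2))                    # p_0, p_1, p_2 in R^2
s = iisignature.sig(P, 2)                          # signature of the curve through them
lhs = s[3] - s[4]                                  # <S(X), Inv_2> = <S(X), 12 - 21>
rhs = np.linalg.det(np.vstack([np.ones(3), P.T]))  # det [[1, 1, 1], [p_0, p_1, p_2]]
print(np.isclose(lhs, rhs))
```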

The modulus of the determinant

$$\begin{aligned} \det \begin{bmatrix} 1 & 1 & .. & 1 \\ 0 & p_{1} & .. & p_{d} \end{bmatrix} = \det [ p_{1} .. p_{d} ] \end{aligned}$$

gives the Lebesgue measure of the parallelepiped spanned by the vectors \(p_{1}-p_{0},..,p_{d}-p_{0}\). The polytope spanned by the points \(p_{0},p_{1},..,p_{d}\) fits \(d!\) times into that parallelepiped. We hence have the relation to classical volume as follows.

Lemma 3.22

Let \(p_{0},..,p_{d} \in \mathbb{R}^{d}\), then

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}( p_{0},.., p_{d} )\bigr| = \frac{1}{d!} \biggl| \det \begin{bmatrix} 1 & 1 & .. & 1 \\ p_{0} & p_{1} & .. & p_{d} \end{bmatrix} \biggr| . \end{aligned}$$

We now proceed to piecewise linear curves with more than \(d\) vertices.

Lemma 3.23

Let \(X\) be the piecewise linear curve through \(p_{0},..,p_{n} \in \mathbb{R}^{d}\), with \(n \ge d\). Then,

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = \sum_{i} \det \begin{bmatrix} 1 & 1 & .. & 1 \\ p_{i_{0}} & p_{i_{1}} & .. & p_{i_{d}} \end{bmatrix} . \end{aligned}$$
(5)

For \(d\) even, the subsequences \(i\) are chosen as follows:

$$\begin{aligned} i_{0} = 0 \end{aligned}$$

and \(i_{1},..,i_{d}\) ranges over all possible increasing subsequences of \(1,2,..,n\) such that for \(\ell \) odd: \(i_{\ell } + 1 = i_{\ell +1}\).

For \(d\) odd, they are chosen as follows:

$$\begin{aligned} i_{0} &= 0 \\ i_{d} &= n, \end{aligned}$$

and \(i_{1},..,i_{d-1}\) ranges over all possible increasing subsequences of \(1,2,..,n-1\) such that for \(\ell \) odd: \(i_{\ell } + 1 = i_{\ell +1}\).

Remark 3.24

The number of indices is easily calculated. In the even case, we have \(B := d/2\) “groups of two” to place, \(A := n - d\) “fillers” in between. This gives

$$\begin{aligned} \binom{ A + B }{ B } = \binom{n-d + d/2 }{ d/2 } = \binom{ \lfloor \frac{d}{2}\rfloor + n - d }{ \lfloor \frac{d}{2}\rfloor }, \end{aligned}$$

where \(\lfloor r \rfloor \) is the largest integer less than or equal to \(r\).

In the odd case, we have \(B :=(d-1)/2\) “groups of two” to place, with \(A := n-1 - (d-1)\) “fillers” in between. This gives

$$\begin{aligned} \binom{ A + B }{ B } = \binom{ n-1 - \frac{d-1}{2} }{ \frac{d-1}{2} } = \binom{ \lfloor \frac{d}{2}\rfloor + n - d }{ \lfloor \frac{d}{2} \rfloor }. \end{aligned}$$

Remark 3.25

Consider the case \(d=2\), and a curve \(X\) through the points \(p_{0}, p_{1},.., p_{n} \in \mathbb{R}^{d}\), with \(p_{0} = 0\). Then

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{2} \bigr\rangle = \bigl\langle S(X)_{0,T}, \mathtt{12} - \mathtt{21} \bigr\rangle &= \sum_{i=1}^{n-1} \det \begin{bmatrix} 1 & 1 & 1 \\ p_{0} & p_{i} & p_{i+1} \end{bmatrix} \\ &= \sum_{i=1}^{n-1} \det [ p_{i}\ p_{i+1} ] =: \sum_{i=1}^{n-1} P_{i,i+1}. \end{aligned}$$

We can express \(\bigl\langle S(X)_{0,T}, \operatorname{Inv}_{2} \bigr\rangle \) as a linear combination of the \(2\times 2\) minors \(P_{i,j}\) of the \(2 \times n\) matrix \((p_{1},p_{2},..,p _{n})\). Generally, it is well-known that all invariants to \(\operatorname{GL}(\mathbb{R}^{2})\) of a tuple of points are expressible in terms of these minors [47, Sect. 3.2]. So, for a piecewise linear curve through \(0,p_{1},..,p_{n}\), all our integral invariants are—a fortiori—expressible in terms of them. In the simple case shown here, this expression is just a linear combination. Experimentally, for higher order invariants, polynomial combinations appear with a lot of structure. This poses the question of whether one can set up some kind of “\(\operatorname{GL}\) invariant integration”, where, instead of the classical Riemann integration that uses increments, one “integrates” using only these \(P_{i,j}\).

Example 3.26

For \(d=2\), \(n=4\) we get the subsequences

$$\begin{aligned}{} [0, 1, 2] \\ [0, 2, 3] \\ [0, 3, 4] \end{aligned}$$

For \(d=4\), \(n=6\) we get the subsequences

$$\begin{aligned}{} [0, 1, 2, 3, 4] \\ [0, 1, 2, 4, 5] \\ [0, 1, 2, 5, 6] \\ [0, 2, 3, 4, 5] \\ [0, 2, 3, 5, 6] \\ [0, 3, 4, 5, 6] \end{aligned}$$

For \(d=5\), \(n=7\) we get the subsequences

$$\begin{aligned}{} [0, 1, 2, 3, 4, 7] \\ [0, 1, 2, 4, 5, 7] \\ [0, 1, 2, 5, 6, 7] \\ [0, 2, 3, 4, 5, 7] \\ [0, 2, 3, 5, 6, 7] \\ [0, 3, 4, 5, 6, 7] \end{aligned}$$
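These index sequences are straightforward to enumerate programmatically; a sketch (the function name is ours) reproducing the lists above:

```python
from itertools import combinations

def subsequences(d, n):
    """Index sequences (i_0,..,i_d) of Lemma 3.23 for a curve through p_0,..,p_n."""
    pairs = d // 2                       # number of adjacent pairs (a, a+1) to place
    top = n if d % 2 == 0 else n - 1     # for d odd the last index is pinned to n
    out = []
    for starts in combinations(range(1, top), pairs):
        if any(b - a < 2 for a, b in zip(starts, starts[1:])):
            continue                     # the pairs must not overlap
        seq = [0] + [k for a in starts for k in (a, a + 1)]
        if d % 2 == 1:
            seq.append(n)
        out.append(seq)
    return out

print(subsequences(2, 4))   # [[0, 1, 2], [0, 2, 3], [0, 3, 4]]
print(subsequences(5, 7))   # the six sequences of the last list above
```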

Proof of Lemma 3.23

The case \(d=2\)

Let \(X\) be the curve through the points \(p_{0},p_{1},..,p_{n}\). We can write it as concatenation of the curves \(X^{(i)}\), where \(X^{(i)}\) is the curve through the points \(p_{0}\), \(p_{i}\), \(p_{i+1}\), \(p_{0}\). The time-intervals of definition for these curves (and all curves in this proof) do not matter, so we omit the subscript of \(S(.)\). Then, by Chen's lemma (Lemma 2.3)

$$\begin{aligned} \bigl\langle S(X), \mathtt {12} - \mathtt {21} \bigr\rangle &= \bigl\langle S \bigl(X^{(n-1)}\bigr) \cdot .. \cdot S\bigl(X^{(1)}\bigr), \mathtt {12} - \mathtt {21} \bigr\rangle \\ &= \sum_{i=1}^{n-1} \bigl\langle S \bigl(X^{(i)}\bigr), \mathtt {12} - \mathtt {21} \bigr\rangle . \end{aligned}$$

For the last equality we used that

$$\begin{aligned} \langle g h, \mathtt {12} - \mathtt {21} \rangle = \langle g, \mathtt {12} - \mathtt {21} \rangle + \langle h, \mathtt {12} - \mathtt {21} \rangle + \langle g, \mathtt {1} \rangle \langle h, \mathtt {2} \rangle - \langle g, \mathtt {2} \rangle \langle h, \mathtt {1} \rangle , \end{aligned}$$

and that the increments of all curves \(X^{(i)}\) are zero. Now by Lemma 3.20 we can omit the last straight line in every \(X^{(i)}\) and hence by Lemma 3.21

$$\begin{aligned} \bigl\langle S\bigl(X^{(i)}\bigr), \mathtt{12} - \mathtt{21} \bigr\rangle = \det \begin{bmatrix} 1 & 1 & 1 \\ p_{0} & p_{i} & p_{i+1} \end{bmatrix} , \end{aligned}$$

which finishes the proof for \(d=2\).

Now assume the statement is true for all dimensions strictly smaller than some \(d\). We show it is true for \(d\).

\(d\) is odd

As before we can assume \(p_{0} = 0\) and that \(p_{n}\) lies on the \(x_{1}\) axis. Every sequence summed over on the right-hand side of (5) is of the form \(i = (0, i_{1}, \dots , i_{d-1}, n)\). For each of those, we calculate

$$\begin{aligned} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 \\ p_{i_{0}} & p_{i_{1}} & .. & p_{i_{d-1}} & p_{i_{d}} \end{bmatrix} &= \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 \\ 0 & p_{i_{1}} & .. & p_{i_{d-1}} & \Delta e_{1} \end{bmatrix} \\ &= \Delta \det \begin{bmatrix} 1 & 1 & .. & 1 \\ 0 & \bar{p}_{i_{1}} & .. & \bar{p}_{i_{d-1}} \end{bmatrix} . \end{aligned}$$

Here \(\bar{p}_{j} \in \mathbb{R}^{d-1}\) is obtained by deleting the first coordinate of \(p_{j}\), \(e_{1}\) is the first canonical coordinate vector in \(\mathbb{R}^{d}\) and \(\Delta := (p_{n} - p_{0})_{1} = \langle S(X), x_{1} \rangle \) is the total increment of \(X\) in the \(x_{1}\) direction. Here we used that \(d\) is odd (otherwise we would get a prefactor −1).

The last determinant is the expression for the summands of the right-hand side of (5), but with dimension \(d-1\) and points \(0 = \bar{p}_{0}, \bar{p}_{1},.., \bar{p}_{n-1}\). By assumption, summing up all these determinants gives

$$\begin{aligned} \Delta \cdot \bigl\langle S(\bar{X}), \operatorname{Inv}_{d-1} \bigr\rangle = \bigl\langle S(X), x_{1} \bigr\rangle \bigl\langle S( \bar{X}), \operatorname{Inv}_{d-1} \bigr\rangle , \end{aligned}$$

where \(\bar{X}\) is the curve in \(\mathbb{R}^{d-1}\) through the points \(\bar{p}_{0},.., \bar{p}_{n-1}\). Since \(\bar{p}_{n} = \bar{p}_{0} = 0\), we can attach the additional point \(\bar{p}_{n}\) to \(\bar{X}\) without changing the value here (Lemma 3.20). Hence the sum of determinants is equal to

$$\begin{aligned} \bigl\langle S(X), x_{1} \bigr\rangle \bigl\langle S(X), \operatorname{Inv}_{d-1}(x_{2},..,x_{d}) \bigr\rangle . \end{aligned}$$

Since we arranged matters such that \(\langle S(X), x_{i} \rangle = 0\) for \(i\ne 1\), this is equal to

$$\begin{aligned} \bigl\langle S(X), x_{1} ⧢ \operatorname{Inv}_{d-1}(x_{2},..,x_{d}) \bigr\rangle = \sum_{j=1}^{d} (-1)^{j+1} \bigl\langle S(X), x_{j} ⧢ \operatorname{Inv}_{d-1}(x_{1},.., \widehat{x_{j}},..,x_{d}) \bigr\rangle , \end{aligned}$$

where we used the shuffle identity, Lemma 2.1. By the second part of Lemma 3.17 this is equal to \(\langle S(X), \operatorname{Inv}_{d} \rangle \), which finishes the proof for odd \(d\).

\(d\) is even

We proceed by induction on \(n\). For \(n=d\) the statement follows from Lemma 3.21.

Let it be true for some \(n\), we show it for a piecewise linear curve through some points \(p_{0},.., p_{n+1}\). Write \(X = X' \sqcup X''\) where \(X'\) is the linear interpolation of \(p_{0},.., p_{n}\), \(X''\) is the linear path from \(p_{n}\) to \(p_{n+1}\) and we recall concatenation ⊔ of paths from Lemma 2.3. By assumption, (5) is true for the curve \(X'\). Adding an additional point \(p_{n+1}\), the sum on the right hand side of (5) gets additional indices of the form

$$\begin{aligned} (p_{j_{0}},.., p_{j_{d-1}}, p_{n+1}), \end{aligned}$$

where

$$\begin{aligned} j_{0} &= 0 \\ j_{d-1} &= n, \end{aligned}$$

and where \(j_{1},..,j_{d-2}\) ranges over all possible increasing subsequences of \(1,2,..,n-1\) such that for \(\ell \) odd \(j_{\ell }+ 1 = j_{\ell +1}\).

Assume \(p_{n+1} - p_{n} = \Delta \cdot e_{1}\) lies on the \(x_{1}\)-axis. Then, summing over those \(j\),

$$\begin{aligned} \sum_{j} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 & 1 \\ 0 & p_{j_{1}} & .. & p_{j_{d-2}} & p_{n} & p_{n+1} \end{bmatrix} &= \sum_{j} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 & 1 \\ -p_{n} & p_{j_{1}}-p_{n} & .. & p_{j_{d-2}}-p_{n} & 0 & p_{n+1}-p_{n} \end{bmatrix} \\ &= \sum_{j} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 & 1 \\ -p_{n} & p_{j_{1}}-p_{n} & .. & p_{j_{d-2}}-p_{n} & 0 & \Delta e_{1} \end{bmatrix} \\ &= -\Delta \sum_{j} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 \\ -\bar{p}_{n} & \bar{p}_{j_{1}}-\bar{p}_{n} & .. & \bar{p}_{j_{d-2}}-\bar{p}_{n} & 0 \end{bmatrix} \\ &= -\Delta \sum_{j} \det \begin{bmatrix} 1 & 1 & .. & 1 & 1 \\ 0 & \bar{p}_{j_{1}} & .. & \bar{p}_{j_{d-2}} & \bar{p}_{n} \end{bmatrix} \\ &= -\Delta \bigl\langle S\bigl(\bar{X}'\bigr), \operatorname{Inv}_{d-1} \bigr\rangle \\ &= -\Delta \bigl\langle S\bigl(X'\bigr), \operatorname{Inv}_{d-1}(x_{2},..,x_{d}) \bigr\rangle . \end{aligned}$$

Here \(\bar{X}'\) is the curve in \(\mathbb{R}^{d-1}\) through the points \(\bar{p}_{0},.., \bar{p}_{n}\), and we used the fact that the indices \(j\) here range over the ones used for (5) in dimension \(d-1\) on the points \(\bar{p}_{0},.., \bar{p}_{n}\). On the other hand,

$$\begin{aligned} \bigl\langle S(X), \operatorname{Inv}_{d} \bigr\rangle &= \bigl\langle S\bigl(X'\bigr) S\bigl(X''\bigr), \operatorname{Inv}_{d} \bigr\rangle \\ &= \bigl\langle S\bigl(X'\bigr), \operatorname{Inv}_{d} \bigr\rangle - \bigl\langle S\bigl(X'\bigr), \operatorname{Inv}_{d-1}(x_{2},.., x_{d}) \bigr\rangle \bigl\langle S\bigl(X'' \bigr), x_{1} \bigr\rangle \end{aligned}$$

Here we used that \(S(X'') = \exp ( \Delta \cdot x_{1} ) = 1 + \Delta \cdot x_{1} + O(x_{1}^{2})\) [16, Example 7.21], the fact that each monomial in \(\operatorname{Inv}_{d}\) has exactly one occurrence of \(x_{1}\) and Lemma 3.17. This finishes the proof. □

Definition 3.27

Let \(X: [0,T] \to \mathbb{R}^{d}\) be any curve. Define its signed volume to be the following limit, if it exists,

$$\begin{aligned} \operatorname{Signed-Volume}(X) := \frac{1}{d!} \lim_{ \vert \pi \vert \to 0} \sum_{i} \det \begin{bmatrix} 1 & 1 & .. & 1 \\ X_{t^{\pi }_{i_{0}}} & X_{t^{\pi }_{i_{1}}} & .. & X_{t^{\pi }_{i_{d}}} \end{bmatrix} . \end{aligned}$$

Here \(\pi = (0=t^{\pi }_{0},.., t^{\pi }_{n^{\pi }}=T)\) is a partition of the interval \([0,T]\) and \(|\pi |\) denotes its mesh size. The indices \(i\) are chosen as in Lemma 3.23.

Theorem 3.28

Let \(X: [0,T] \to \mathbb{R}^{d}\) be a continuous curve of bounded variation. Then its signed volume exists and

$$\begin{aligned} \operatorname{Signed-Volume}(X) = \frac{1}{d!} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle \end{aligned}$$

Proof

Fix some sequence \(\{\pi ^{n}\}_{n\in \mathbb{N}}\) of partitions of \([0,T]\) with \(|\pi ^{n}| \to 0\), and interpolate \(X\) linearly along each \(\pi ^{n}\) to obtain a sequence of linearly interpolated curves \(X^{n}\). Then by Lemma 3.23

$$\begin{aligned} \operatorname{Signed-Volume}\bigl(X^{n}\bigr) = \frac{1}{d!} \bigl\langle S\bigl(X ^{n}\bigr)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle \end{aligned}$$

By stability of the signature in the class of continuous curves of bounded variation [16, Proposition 1.28, Proposition 2.7], we get convergence

$$\begin{aligned} \bigl\langle S\bigl(X^{n}\bigr)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle \to \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle \end{aligned}$$

and this is independent of the particular sequence \(\pi ^{n}\) chosen. □

The previous theorem is almost a tautology, but there are relations to classical objects in geometry. For \(d=2\), as we have seen in Sect. 3.2,

$$\begin{aligned} \frac{1}{2} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{2} \bigr\rangle , \end{aligned}$$

is equal to the signed area of the curve \(X\). In general dimension, the value of the invariant is related to a classical “volume” if the curve satisfies a certain monotonicity. This is in particular satisfied for the “moment curve”.

Lemma 3.29

Let \(X\) be the moment curve

$$\begin{aligned} X_{t} = \bigl(t,t^{2},\ldots,t^{d}\bigr) \in \mathbb{R}^{d}. \end{aligned}$$

Then for any \(T > 0\)

$$\begin{aligned} \frac{1}{d!} \bigl\langle S(X)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = \bigl|\operatorname{Convex-Hull}( X_{[0,T]} )\bigr| \end{aligned}$$

Remark 3.30

It is easily verified that for integers \(n_{1},.., n_{d}\) one has

$$\begin{aligned} \frac{1}{n_{1} \cdot .. \cdot n_{d}} \int _{0}^{T} dt_{1}^{n_{1}} .. dt _{d}^{n_{d}} = \frac{1}{n_{1}} \frac{1}{n_{1}+n_{2}} .. \frac{1}{n _{1}+..+n_{d}} T^{n_{1}+..+n_{d}}. \end{aligned}$$

We deduce that

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}( X_{[0,T]})\bigr| = T^{1+2+..+d} \sum _{\sigma \in S_{d}} \operatorname{sign}\sigma \frac{1}{\sigma (1)} \frac{1}{\sigma (1)+\sigma (2)} .. \frac{1}{ \sigma (1)+..+\sigma (d)}. \end{aligned}$$

In [24, Sect. 15], the value of this volume is determined, for \(T=1\), as

$$\begin{aligned} \prod_{\ell =1}^{d}\frac{(\ell -1)! (\ell -1)!}{(2\ell -1)!}. \end{aligned}$$

We hence get the combinatorial identity

$$\begin{aligned} \prod_{\ell =1}^{d}\frac{(\ell -1)! (\ell -1)!}{(2\ell -1)!} = \sum _{\sigma \in S_{d}} \operatorname{sign}\sigma \frac{1}{\sigma (1)} \frac{1}{\sigma (1)+\sigma (2)} .. \frac{1}{ \sigma (1)+..+\sigma (d)}. \end{aligned}$$
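The identity is easily confirmed for small \(d\) with exact rational arithmetic; a sketch:

```python
from fractions import Fraction
from itertools import accumulate, permutations
from math import factorial, prod

def sign(p):
    # sign of a permutation, via its number of inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

for d in range(1, 6):
    lhs = prod(Fraction(factorial(l - 1) ** 2, factorial(2 * l - 1))
               for l in range(1, d + 1))
    rhs = sum(Fraction(sign(s), prod(accumulate(s)))   # partial sums of sigma
              for s in permutations(range(1, d + 1)))
    print(d, lhs == rhs)    # True for every d
```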

Proof

For \(n\ge d\) let \(0 = t_{0} < .. < t_{n} \le T\) be time-points, let \(p_{i} := X_{t_{i}}\) be the corresponding points on the moment curve and denote by \(X^{n}\) the piecewise linear curve through those points. We will show

$$\begin{aligned} \frac{1}{d!} \bigl\langle S\bigl(X^{n}\bigr)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle = \bigl|\operatorname{Convex-Hull}\bigl( X^{n}_{[0,T]} \bigr)\bigr|. \end{aligned}$$

First note that for any \(0 \le i_{0} < i_{1} < .. < i_{d} \le n\),

$$\begin{aligned} \det \begin{bmatrix} 1 & 1 & \dots & 1 \\ p_{i_{0}} & p_{i_{1}} & \dots & p_{i_{d}} \end{bmatrix} = \prod_{0 \le \ell < k \le d} \bigl( t_{i_{k}} - t_{i_{\ell }} \bigr) > 0, \end{aligned}$$
(6)

since it is a Vandermonde determinant.

We will decompose \(P := \{p_{0},..,p_{n}\}\) into (overlapping) sets \(S_{\ell }\) with cardinality \(d+1\) and such thatFootnote 10

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}( p_{0},..,p_{n} )\bigr| = \sum _{\ell }\bigl|\operatorname{Convex-Hull}(S_{\ell })\bigr|. \end{aligned}$$

A face of \(P\) is a subset \(F \subset P\) such that its convex hull \(\operatorname{Convex-Hull}( F )\) equals the intersection of \(\operatorname{Convex-Hull}(P)\) with some affine hyperplane. A face is a facet if its affine span has dimension \(d-1\). The following fact holds for any polytope spanned by a finite point set \(P\): for every point \(x\) in \(\operatorname{Convex-Hull}( P )\), outside a set of measure zero, the line connecting \(p_{0}\) to \(x\) exits \(\operatorname{Convex-Hull}(p_{0},..,p_{n})\) through a unique facet of \(\operatorname{Convex-Hull}( p_{0},..,p_{n} )\) contained in \(\{ p_{1},.., p_{n} \}\). Hence

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}( p_{0},..,p_{n} )\bigr| = \sum_{F} \bigl|\operatorname{Convex-Hull}( p_{0}\cup F )\bigr|, \end{aligned}$$

where the sum is over all such facets.

Our points \(p_{i}\) lie on the moment curve. Hence, by (6), any collection of points \(p_{i_{0}}, p_{i_{1}},.., p_{i_{d}}\) is in general position. This means that every facet of \(P\) must have exactly \(d\) points (and not more). Facets of \(\operatorname{Convex-Hull}(P)\) with \(d\) points are characterized by Gale’s evenness criterion ([17, Theorem 3], [53, Theorem 0.7]):

the points \(p_{i_{1}},.., p_{i_{d}}\), with distinct \(i_{j} \in \{0,..,n \}\) form a facet of \(P\) if and only if any two elements of \(\{0,..,n \} \setminus \{i_{1},.., i_{d}\}\) are separated by an even number of elements in \(\{i_{1},.., i_{d}\}\).Footnote 11

\(d\) odd

We are looking for index sets \(\{i_{j}\}\) with \(i_{1} \ge 1\). These are exactly the indices satisfying

  • \(i_{\ell +1} = i_{\ell }+ 1\) for \(\ell \) odd

  • \(i_{d}= n\).

Together with \(i_{0} := 0\) these form the indices of Lemma 3.23.

\(d\) even

We are looking for index sets \(\{i_{j}\}\) with \(i_{1} \ge 1\). These are exactly the indices satisfying

  • \(i_{\ell +1} = i_{\ell }+ 1\) for \(\ell \) odd.

Together with \(i_{0} := 0\) these form the indices of Lemma 3.23.

Hence

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}\bigl( X^{n}_{[0,T]} \bigr)\bigr| = \sum_{i} \bigl| \operatorname{Convex-Hull}( p_{i_{0}},.., p_{i_{d}} )\bigr|. \end{aligned}$$

Now by Lemma 3.22

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}( p_{i_{0}},.., p_{i_{d}} )\bigr| = \frac{1}{d!} \biggl\vert \det \begin{bmatrix} 1 & 1 & \dots & 1 \\ p_{i_{0}} & p_{i_{1}} & \dots & p_{i_{d}} \end{bmatrix} \biggr\vert . \end{aligned}$$

The determinant is in fact positive here, by (6). We can hence omit the modulus and get

$$\begin{aligned} \bigl|\operatorname{Convex-Hull}\bigl( X^{n}_{[0,T]} \bigr)\bigr| &= \sum_{i} \bigl|\operatorname{Convex-Hull}( p_{i_{0}},.., p_{i_{d}} )\bigr| = \sum_{i} \frac{1}{d!} \det \begin{bmatrix} 1 & 1 & \dots & 1 \\ p_{i_{0}} & p_{i_{1}} & \dots & p_{i_{d}} \end{bmatrix} \\ &= \frac{1}{d!} \bigl\langle S\bigl(X^{n}\bigr)_{0,T}, \operatorname{Inv}_{d} \bigr\rangle , \end{aligned}$$

by Lemma 3.23.

The statement of the lemma now follows by piecewise linear approximation of \(X\) using continuity of the convex hull, which follows from [11, Lemma 3.2], and of iterated integrals [16, Proposition 1.28, Proposition 2.7]. □

4 Rotations

Let

$$\begin{aligned} \operatorname{SO}\bigl(\mathbb{R}^{d}\bigr) = \bigl\{ A \in \operatorname{GL}\bigl( \mathbb{R}^{d}\bigr) : A A^{\top }= \operatorname{id}, \det ( A ) = 1 \bigr\} , \end{aligned}$$

be the group of rotations of \(\mathbb{R}^{d}\).

Definition 1

We call \(\phi \in T(\mathbb{R}^{d})\) an \(\operatorname{SO}\) invariant if

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \phi \bigr\rangle = \bigl\langle S(A X)_{0,T}, \phi \bigr\rangle \end{aligned}$$

for all \(A \in \operatorname{SO}(\mathbb{R}^{d})\) and all curves \(X\). Alternatively, as explained in Sect. 3,

$$\begin{aligned} A^{\top }\phi = \phi , \end{aligned}$$

for all \(A \in \operatorname{SO}(\mathbb{R}^{d})\), where the action on \(T(\mathbb{R}^{d})\) was given in Definition 3.2.

Since \(\det (A) = 1\), any \(\operatorname{GL}\) invariant of weight \(w \ge 1\) (Sect. 3) is automatically an \(\operatorname{SO}\) invariant. But there are \(\operatorname{SO}\) invariants that are not \(\operatorname{GL}\) invariants (of any weight), for example, for \(d=2\), \(\phi := x_{1} x_{1} + x_{2} x_{2}\).

Switching to the perspective of multilinear maps, this is the map \((v_{1},v_{2}) \mapsto \langle v_{1}, v_{2} \rangle \). It is a classical result, see for example [50, Theorem 2.9.A], that all \(\operatorname{SO}\) invariants are built from the inner product and the determinant.

Recently, a linear basis for these invariants has been constructed. To formulate the result, we need to introduce some notation from [28]. Define

$$\begin{aligned} I(r,n) := \bigl\{ ( i_{1},.., i_{r} ) : 1 \le i_{1} < .. < i_{r} \le n \bigr\} . \end{aligned}$$

Use the following partial order on these sequences: for \(a \in I(r,n)\), \(a' \in I(r',n)\)

$$\begin{aligned} a \ge a' \end{aligned}$$

if \(r \le r'\) and \(a_{j} \ge a'_{j}\) for \(j \le r\).

For \(c \in I(d,n)\) and \(v_{1},.., v_{n} \in \mathbb{R}^{d}\), define

$$\begin{aligned} u(c) ( v_{1},..,v_{n} ) := \text{ $d$-minor of the $d\times n$ matrix $(v_{1},..,v_{n})$, with columns given by $c$}. \end{aligned}$$

For \(a, b \in I(r,n)\) with \(r \le d\) and \(v_{1},..,v_{n} \in \mathbb{R}^{d}\), define

$$\begin{aligned} p(a,b) ( v_{1},..,v_{n} ) := \text{ $r$-minor of the matrix $\langle v _{i}, v_{j} \rangle $, rows given by $a$, columns given by $b$ } \end{aligned}$$

Theorem 2

([28, Theorem 12.5.0.8])

Let \(V\) be a \(d\)-dimensional vector space with inner product \(\langle \cdot , \cdot \rangle \). A basis for the space of multilinear maps

$$\begin{aligned} \psi : \underbrace{V \times \cdots \times V}_{n \textit{ times}} \to \mathbb{R} \end{aligned}$$

that satisfy

$$\begin{aligned} \psi (A v_{1}, A v_{2}, \dots , A v_{n}) = \psi (v_{1}, v_{2}, \dots , v_{n}) \end{aligned}$$

for all \(A \in \operatorname{SO}(V)\) and \(v_{1}, \dots , v_{n} \in V\) is given by the maps

$$\begin{aligned} F(v_{1},..,v_{n}) =& p \bigl(a^{(1)},b^{(1)} \bigr) (v_{1},..,v_{n}) \cdot .. \cdot p \bigl(a^{(r)},b^{(r)} \bigr) (v_{1},..,v_{n}) \\ &{}\cdot u \bigl(c^{(1)} \bigr) (v_{1},..,v_{n}) \cdot .. \cdot u \bigl(c ^{(s)} \bigr) (v_{1},..,v_{n}), \end{aligned}$$

satisfying

  • \(c^{(j)} \in I(d,n)\) for each \(j=1,..,s\)

  • \(a^{(j)},b^{(j)} \in I(t_{j},n)\) for some \(1 \le t_{j} \le d-1\) for each \(j=1,..,r\)

  • \(a^{(1)} \ge b^{(1)} \ge a^{(2)} \ge .. \ge b^{(r)} \ge c^{(1)} \ge .. \ge c^{(s)}\)

  • every number \(1,..,n\) appears in exactly one of the sequences \(a^{(1)},.., a^{(r)}, b^{(1)},.., b^{(r)}, c^{(1)},.., c^{(s)}\); (in particular \(n = 2 \cdot C_{1} + d\cdot C_{2}\) for some non-negative integers \(C_{1}\), \(C_{2}\))

Example 3

We give examples of these sequences for \(d=2\).

\(n=1\): There is no such set of sequences, since non-negative integers \(C_{1}\), \(C_{2}\) with \(2\cdot C_{1} + 2 \cdot C_{2} = 1\) cannot be found.

\(n=2\): Allowed sets of sequences are

  • \(c^{(1)} = (1,2)\); meaning that \(F(v_{1},v_{2}) = \langle v_{1}, v _{2} \rangle \)

  • \(a^{(1)} = (2)\), \(b^{(1)} = (1)\); meaning that \(F(v_{1},v_{2}) = \det [ v_{1} v_{2} ]\)

\(n=3\): There is no such set of sequences.

\(n=4\): Allowed sets of sequences are

  • \(a^{(1)} = (4)\), \(b^{(1)} = (3)\), \(a^{(2)} = (2)\), \(b^{(2)} = (1)\); meaning that \(F(v_{1},v_{2},v_{3},v_{4}) = \langle v_{4}, v_{3} \rangle \langle v_{2}, v_{1} \rangle \).

  • \(a^{(1)} = (4)\), \(b^{(1)} = (3)\), \(c^{(1)} = (1,2)\); meaning that \(F(v_{1},v_{2},v_{3},v_{4}) = \langle v_{4}, v_{3} \rangle \det [ v _{1} v_{2} ]\).

  • \(a^{(1)} = (4)\), \(b^{(1)} = (2)\), \(c^{(1)} = (1,3)\)

  • \(a^{(1)} = (3)\), \(b^{(1)} = (2)\), \(c^{(1)} = (1,4)\)

  • \(c^{(1)} = (3,4)\), \(c^{(2)} = (1,2)\)

  • \(c^{(1)} = (2,4)\), \(c^{(2)} = (1,3)\)

In the setting of \(T(\mathbb{R}^{d})\) we have

Proposition 4

The \(\operatorname{SO}\) invariants of homogeneity \(n\) are spanned by

$$\begin{aligned} \mathsf{poly}( \varPsi ), \end{aligned}$$

where \(\varPsi \) ranges over the invariants of the previous theorem and \(\mathsf{poly}\) is given in Lemma 2.4.

In the case \(d=2\), there is another way to arrive at a basis for the invariants. Taking inspiration from [15], which concerns rotation invariants of images, we work in the complex vector space \(T(\mathbb{C}^{2})\).Footnote 12

Theorem 5

Define

$$\begin{aligned} z_{1} &= x_{1} + i x_{2} \\ z_{2} &= x_{1} - i x_{2}. \end{aligned}$$

The space of \(\operatorname{SO}\) invariants on level \(n\) in \(T(\mathbb{C}^{2})\) is spanned freely by

$$\begin{aligned} z = z_{j_{1}} \cdot .. \cdot z_{j_{n}} \quad \textit{with } \#\{ r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \}. \end{aligned}$$

The space of \(\operatorname{SO}\) invariants on level \(n\) in \(T(\mathbb{R}^{2})\) is spanned freely by

$$\begin{aligned} \operatorname{Re}[ z ], \operatorname{Im}[ z ] \quad \textit{with } \#\{r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \} \textit{ and } j_{1} = 1. \end{aligned}$$

Remark 6

In particular for \(d=2\) and \(n\) even, the dimension of rotation invariants on level \(n\) in \(T(\mathbb{R}^{2})\) is equal to \(\binom{n}{n/2}\).

Proof

1. The elements \(z\) are invariant

Let

$$\begin{aligned} A_{\theta }:= \begin{pmatrix} \cos (\theta ) & \sin (\theta ) \\ -\sin (\theta ) & \cos (\theta ) \end{pmatrix} \end{aligned}$$

Then (recall Definition 3.2)

$$\begin{aligned} A_{\theta }^{\top }z_{1} &= A_{\theta }^{\top }(x_{1} + i x_{2}) \\ &= \cos (\theta ) x_{1} + \sin (\theta ) x_{2} + i \bigl( - \sin ( \theta ) x_{1} + \cos (\theta ) x_{2} \bigr) \\ &= e^{-i\theta } z_{1} \\ A_{\theta }^{\top }z_{2} &= e^{i\theta } z_{2}. \end{aligned}$$

Hence

$$\begin{aligned} A_{\theta }^{\top }z_{j_{1}} \cdot .. \cdot z_{j_{n}} = z_{j_{1}} \cdot .. \cdot z_{j_{n}} \quad \text{for all } \theta \quad \text{if and only if} \quad \#\{ r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \}. \end{aligned}$$

2. The elements \(z\) form a basis

Now the monomials \(x_{j_{1}} .. x_{j_{n}}\), \(j_{\ell }\in \{1,2\}\), form a basis of \(\pi _{n} T(\mathbb{C}^{2})\) over ℂ. Hence so do the \(z_{j_{1}} .. z_{j_{n}}\), since the linear map \((x_{1},x_{2}) \mapsto (z_{1},z_{2})\) is invertible. By Step 1 we have therefore exhibited a basis (over ℂ) for all invariants in \(\pi _{n} T(\mathbb{C}^{2})\).

3. Real invariants

The space of \(\operatorname{SO}\) invariants on level \(n\) in \(T(\mathbb{C}^{2})\) is spanned freely by the set of

$$\begin{aligned} z_{j_{1}} \cdot .. \cdot z_{j_{n}} \quad \text{with } \#\{ r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \}. \end{aligned}$$

Adding and subtracting the elements with \(j_{1}=2\) from the elements with \(j_{1}=1\), we get that the space of \(\operatorname{SO}\) invariants on level \(n\) in \(T(\mathbb{C}^{2})\) is spanned freely by the set of

$$\begin{aligned} & (z_{j_{1}} \cdot .. \cdot z_{j_{n}} + z_{3-j_{1}} \cdot .. \cdot z _{3-j_{n}}) \quad \text{and}\quad (z_{j_{1}} \cdot .. \cdot z_{j_{n}} - z_{3-j_{1}} \cdot .. \cdot z_{3-j_{n}}) \\ &\quad \text{with } \#\{ r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \}\text{ and $j_{1}=1$}. \end{aligned}$$

Because \(z_{3-j_{1}} \cdot .. \cdot z_{3-j_{n}}\) is the complex conjugate of \(z_{j_{1}} \cdot .. \cdot z_{j_{n}}\), this means that the space of \(\operatorname{SO}\) invariants on level \(n\) in \(T(\mathbb{C} ^{2})\) is spanned freely by the set of

$$\begin{aligned} & \operatorname{Re}(z_{j_{1}} \cdot .. \cdot z_{j_{n}}) \quad \text{and}\quad \operatorname{Im}(z_{j_{1}} \cdot .. \cdot z_{j_{n}}) \\ &\quad \text{with } \#\{ r : j_{r} = 1 \} = \#\{ r : j_{r} = 2 \}\text{ and $j_{1}=1$}. \end{aligned}$$

This is an expression for a basis of the \(\operatorname{SO}\) invariants in terms of real combinations of basis elements of the tensor space. They thus form a basis for the \(\operatorname{SO}\) invariants for the free real vector space on the same set, namely \(\pi _{n} T( \mathbb{R}^{2})\). □

Example 7

Consider \(d=2\), level \(n=2\)

$$\begin{aligned} \mathtt {11} + \mathtt {22} \\ - \mathtt {12} + \mathtt {21} \end{aligned}$$

Level \(n=4\)

$$\begin{aligned} \mathtt {1111} - \mathtt {1122} + \mathtt {1212} + \mathtt {1221} + \mathtt {2112} + \mathtt {2121} - \mathtt {2211} + \mathtt {2222} \\ - \mathtt {1112} - \mathtt {1121} + \mathtt {1211} - \mathtt {1222} + \mathtt {2111} - \mathtt {2122} + \mathtt {2212} + \mathtt {2221} \\ \mathtt {1111} + \mathtt {1122} - \mathtt {1212} + \mathtt {1221} + \mathtt {2112} - \mathtt {2121} + \mathtt {2211} + \mathtt {2222} \\ - \mathtt {1112} + \mathtt {1121} - \mathtt {1211} - \mathtt {1222} + \mathtt {2111} + \mathtt {2122} - \mathtt {2212} + \mathtt {2221} \\ \mathtt {1111} + \mathtt {1122} + \mathtt {1212} - \mathtt {1221} - \mathtt {2112} + \mathtt {2121} + \mathtt {2211} + \mathtt {2222} \\ \mathtt {1112} - \mathtt {1121} - \mathtt {1211} - \mathtt {1222} + \mathtt {2111} + \mathtt {2122} + \mathtt {2212} - \mathtt {2221} \end{aligned}$$

Consider \(d=3\), level \(n=2\)

$$\begin{aligned} \mathtt {11} + \mathtt {22} + \mathtt {33} \end{aligned}$$

Level \(n=3\).

$$\begin{aligned} \mathtt {123} - \mathtt {132} + \mathtt {312} - \mathtt {321} + \mathtt {231} - \mathtt {213} \end{aligned}$$

Consider \(d=4\), level \(n=2\)

$$\begin{aligned} \mathtt {11} + \mathtt {22} + \mathtt {33} + \mathtt {44}. \end{aligned}$$

Level \(n=4\)

$$\begin{aligned} &\mathtt {1144} + \mathtt {4422} + \mathtt {4444} + \mathtt {3333} + \mathtt {1122} + \mathtt {4433} + \mathtt {1133} + \mathtt {4411} + \mathtt {2211} + \mathtt {3344} \\ & \qquad + \mathtt {1111} + \mathtt {2244} + \mathtt {2222} + \mathtt {3322} + \mathtt {2233} + \mathtt {3311} \\ & + \text{$4$ more} \end{aligned}$$
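The lists above can be generated mechanically from Theorem 5, for \(d=2\). The following sketch (the function names are ours) expands the balanced \(z\)-words and returns their real and imaginary parts as coefficients of \(x\)-monomials; for \(n=2\) it reproduces \(\mathtt{11}+\mathtt{22}\) and \(-\mathtt{12}+\mathtt{21}\), and for \(n=4\) the six invariants listed above, up to ordering and signs.

```python
from itertools import product

# z1 = x1 + i*x2 and z2 = x1 - i*x2, each stored as {letter: coefficient}
Z = {1: {1: 1, 2: 1j}, 2: {1: 1, 2: -1j}}

def expand(jword):
    """Expand the word z_{j_1} .. z_{j_n} into {x-word: complex coefficient}."""
    poly = {(): 1 + 0j}
    for j in jword:
        poly = {w + (letter,): c * cl
                for w, c in poly.items() for letter, cl in Z[j].items()}
    return poly

def so_basis_level(n):
    """Real and imaginary parts of the balanced z-words with j_1 = 1."""
    basis = []
    for jw in product((1, 2), repeat=n):
        if jw[0] == 1 and jw.count(1) == jw.count(2):
            p = expand(jw)
            basis.append({w: c.real for w, c in p.items() if c.real != 0})
            basis.append({w: c.imag for w, c in p.items() if c.imag != 0})
    return basis

for inv in so_basis_level(2):
    print(inv)   # {(1,1): 1, (2,2): 1} and {(1,2): -1, (2,1): 1}
assert len(so_basis_level(4)) == 6   # binomial(4, 2), as in Remark 6
```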

5 Permutations

Denote by \(S_{d}\) the group of permutations of \([d] := \{1,.., d\}\).

Lemma 1

For \(\sigma \in S_{d}\), define \(M(\sigma ) \in \operatorname{GL}( \mathbb{R}^{d})\) as

$$\begin{aligned} M(\sigma )_{ij} = \textstyle\begin{cases} 1 & \textit{if } i = \sigma (j), \\ 0 & \textit{otherwise}. \end{cases}\displaystyle \end{aligned}$$

Then \(M: S_{d}\to \operatorname{GL}(\mathbb{R}^{d})\) is a group homomorphism and moreover \(M(\sigma ^{-1}) = M(\sigma )^{\top }\).Footnote 13

Proof

Regarding the first point, for \(i \in \{1,..,d\}\),

$$\begin{aligned} M(\sigma ) M(\tau ) e_{i} = M(\sigma ) e_{\tau (i)} = e_{\sigma ( \tau (i))} = M(\sigma \tau ) e_{i}. \end{aligned}$$

Regarding the last point, note the following sequence of equivalences.

$$\begin{aligned} M_{ij}\bigl(\sigma ^{-1}\bigr) = 1 \quad \Leftrightarrow\quad i = \sigma ^{-1}(j) \quad \Leftrightarrow\quad j = \sigma (i) \quad \Leftrightarrow\quad M_{ji}(\sigma ) = 1. \end{aligned}$$

This proves the claim. □
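Both properties are quick to check numerically; here is a small sketch (0-based indexing; names ours).

```python
import numpy as np
from itertools import permutations

def M(sigma):
    """Permutation matrix with M[i, j] = 1 iff i = sigma(j) (0-based)."""
    d = len(sigma)
    A = np.zeros((d, d), dtype=int)
    for j in range(d):
        A[sigma[j], j] = 1
    return A

d = 4
for s in permutations(range(d)):
    # the inverse permutation corresponds to the transposed matrix
    s_inv = tuple(sorted(range(d), key=lambda i: s[i]))
    assert (M(s_inv) == M(s).T).all()
    for tau in permutations(range(d)):
        comp = tuple(s[tau_j] for tau_j in tau)   # sigma composed with tau
        assert (M(s) @ M(tau) == M(comp)).all()
```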

\(S_{d}\) then acts on \(T((\mathbb{R}^{d}))\) and \(T(\mathbb{R}^{d})\) via Definition 3.2. Explicitly,

$$\begin{aligned} \sigma \cdot x_{i_{1}} .. x_{i_{n}} = x_{\sigma (i_{1})} .. x_{\sigma (i_{n})}. \end{aligned}$$

Definition 2

We call \(\phi \in T(\mathbb{R}^{d})\) a permutation invariant if

$$\begin{aligned} \bigl\langle S\bigl( M(\sigma ) X\bigr)_{0,T}, \phi \bigr\rangle = \bigl\langle S(X)_{0,T}, \phi \bigr\rangle \end{aligned}$$

for all \(\sigma \in S_{d}\) and all curves \(X\). Alternatively, as explained in Sect. 3,

$$\begin{aligned} M(\sigma )^{\top }\phi = \phi , \end{aligned}$$

for all \(\sigma \in S_{d}\). Equivalently,

$$\begin{aligned} M(\sigma ) \phi = \phi , \end{aligned}$$

for all \(\sigma \in S_{d}\).

We follow [1, Sect. 3]. To a monomial

$$\begin{aligned} x_{i_{1}}\cdot .. \cdot x_{i_{n}}, \end{aligned}$$

we associate the following set partition of \([n] := \{1,.., n \}\)

$$\begin{aligned} \nabla ( x_{i_{1}} \cdot .. \cdot x_{i_{n}} ) := \bigl\{ \{ \ell : i_{ \ell }= p \} : p \in [d] \bigr\} \setminus \bigl\{ \{ \} \bigr\} . \end{aligned}$$

Example 3

Let \(d=3\), then

$$\begin{aligned} \nabla ( x_{2} x_{3} x_{2} x_{2} x_{1} ) = \bigl\{ \{1, 3, 4\}, \{2\}, \{5 \} \bigr\} . \end{aligned}$$

Note that for every permutation \(\sigma \in S_{d}\),

$$\begin{aligned} \nabla ( x_{i_{1}}\cdot .. \cdot x_{i_{n}} ) = \nabla ( x_{\sigma (i _{1})} \cdot .. \cdot x_{\sigma (i_{n})} ). \end{aligned}$$
(7)

Proposition 4

([1, Sect. 3])

Define

$$\begin{aligned} M_{A} := \sum_{i: \nabla ( x_{i_{1}} .. x_{i_{n}} ) = A} x_{i_{1}} .. x_{i_{n}}. \end{aligned}$$

Then \(\{ M_{A} : A~\textit{is a set partition of}~[n]~\textit{and}~|A| \le d\}\) is a linear basis for the space of permutation invariants of homogeneity \(n\).
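Both \(\nabla \) and the basis elements \(M_{A}\) are straightforward to compute. The following sketch (names ours) groups all words of length \(n\) over \(\{1,..,d\}\) by their set partition, reproducing the lists of Example 6 below.

```python
from itertools import product

def nabla(word):
    """Set partition of the positions 1..n induced by equal letters."""
    blocks = {}
    for pos, letter in enumerate(word, start=1):
        blocks.setdefault(letter, set()).add(pos)
    return frozenset(frozenset(b) for b in blocks.values())

def m_basis(d, n):
    """The invariants M_A: words of length n over {1,..,d} grouped by nabla."""
    groups = {}
    for w in product(range(1, d + 1), repeat=n):
        groups.setdefault(nabla(w), []).append(w)
    return groups

# d = 3, n = 2: two invariants, matching Example 6 (order n = 2)
for A, words in m_basis(3, 2).items():
    print(sorted(words))
```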

Remark 5

The generating function for partitions with at most \(d\) blocks is given by

$$\begin{aligned} \frac{\sum_{\ell =1}^{d+1} x^{\ell } \prod_{m=\ell }^{d} (1-m\ x)}{ \prod_{\ell =1}^{d} (1-\ell \ x)}. \end{aligned}$$

This follows from summing up [45, (1.94c)].

For example \(d=2\),

$$\begin{aligned} \frac{(1-x)(1-2x) + x(1-2x) + x^{2}}{(1-x)(1-2x)} = \frac{1-x}{1-2x}, \end{aligned}$$

which is the generating function of the sequence (https://oeis.org/A011782)

$$\begin{aligned} 1, 2^{0}, 2^{1}, 2^{2}, 2^{3}, 2^{4}, .. \end{aligned}$$

For \(d=3\) one gets the generating function

$$\begin{aligned} \frac{(1-x)(1-2x)(1-3x) + x(1-2x)(1-3x) + x^{2} (1-3x) + x^{3}}{(1-x)(1-2x)(1-3x)}, \end{aligned}$$

which is the generating function of the sequence (https://oeis.org/A124302)

$$\begin{aligned} 1, \bigl(3^{0} + 1\bigr)/2, \bigl(3^{1} + 1\bigr)/2, \bigl(3^{2} + 1\bigr)/2, .. \end{aligned}$$

We are not aware of a general explicit formula for the number of partitions (i.e. the coefficients of the generating function).
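As a cross-check, the counts produced by the sketch above agree with these coefficients (a hedged numerical check, reusing m_basis from the previous code block):

```python
# d = 2: 1, 2, 4, 8, 16 for n = 1..5 (A011782, after the leading term)
assert [len(m_basis(2, n)) for n in range(1, 6)] == [1, 2, 4, 8, 16]
# d = 3: (3^{n-1} + 1)/2 for n = 1..5 (A124302, after the leading term)
assert [len(m_basis(3, n)) for n in range(1, 6)] == [1, 2, 5, 14, 41]
```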

Proof of Proposition 5.4

By (7), each \(M_{A}\) is permutation invariant. Moreover, since \(|A| \le d\), \(M_{A}\) is nonzero.

For \(A\), \(A'\) distinct set partitions of \([n]\), the monomials in \(M_{A}\) and the monomials in \(M_{A'}\) do not overlap. Hence the proposed basis is linearly independent.

Now, if \(\phi \) is permutation invariant and if for some \(i\), \(i'\), \(\nabla ( x_{i_{1}} .. x_{i_{n}} ) = \nabla ( x_{i'_{1}} .. x_{i'_{n}} )\), then some \(\sigma \in S_{d}\) maps \(x_{i}\) to \(x_{i'}\), so the coefficients of \(x_{i}\) and \(x_{i'}\) in \(\phi \) must coincide. Hence the proposed basis spans the invariants of homogeneity \(n\). □

Example 6

Consider \(d=3\)

Order \(n=1\)

$$\begin{aligned} \mathtt {1} + \mathtt {2} + \mathtt {3} \end{aligned}$$

Order \(n=2\)

$$\begin{aligned} &\mathtt {33} + \mathtt {22} + \mathtt {11} \\ &\mathtt {32} + \mathtt {31} + \mathtt {23} + \mathtt {21} + \mathtt {13} + \mathtt {12} \end{aligned}$$

Order \(n=3\)

$$\begin{aligned} &\mathtt {333} + \mathtt {222} + \mathtt {111} \\ &\mathtt {332} + \mathtt {331} + \mathtt {223} + \mathtt {221} + \mathtt {113} + \mathtt {112} \\ &\mathtt {323} + \mathtt {313} + \mathtt {232} + \mathtt {212} + \mathtt {131} + \mathtt {121} \\ &\mathtt {322} + \mathtt {311} + \mathtt {233} + \mathtt {211} + \mathtt {133} + \mathtt {122} \\ &\mathtt {321} + \mathtt {312} + \mathtt {231} + \mathtt {213} + \mathtt {132} + \mathtt {123} \end{aligned}$$

6 An Additional (Time) Coordinate

Assume now that \(X = (X^{0},X^{1},..,X^{d}): [0,T] \to \mathbb{R}^{1+d}\). Here \(X^{0}\) plays a special role, in that we assume that it is not affected by the space transformations under consideration.

Adding an “artificial” 0-th component, usually keeping track of time, \(X^{0}_{t} := t\), is a common trick to improve the expressiveness of the signature. In particular, if such an \(X^{0}\) is monotonically increasing, the enlarged curve \((X^{0},X^{1},..,X^{d})\) never has any “tree-like” components (compare Sect. 7), no matter what the original \((X^{1},..,X^{d})\) was.

Consider \(\operatorname{GL}\) invariants for the moment.

Definition 1

Let

$$\begin{aligned} \operatorname{GL}_{0}\bigl(\mathbb{R}^{d}\bigr) := \bigl\{ A \in \operatorname{GL}\bigl( \mathbb{R}^{1+d}\bigr) : A e_{0} = A^{-1} e_{0} = e_{0} \bigr\} , \end{aligned}$$

the space of invertible maps of \(\mathbb{R}^{1+d}\) leaving the direction \(e_{0}\) unchanged. We call \(\phi \in T(\mathbb{R}^{1+d})\) a \(\operatorname{GL}_{0}\) invariant of weight \(w\) if

$$\begin{aligned} A^{\top }\phi = (\det A)^{w} \phi , \end{aligned}$$

for all \(A \in \operatorname{GL}_{0}(\mathbb{R}^{d})\).

Consider the \(\operatorname{GL}(\mathbb{R}^{2})\) invariant of weight 1

$$\begin{aligned} x_{1} x_{2} - x_{2} x_{1}. \end{aligned}$$

Since elements of \(\operatorname{GL}_{0}(\mathbb{R}^{2})\) leave the variable \(x_{0}\) unchanged, a straightforward way to produce \(\operatorname{GL}_{0}\) invariants presents itself: insert \(x_{0}\) at the same position in every monomial. For example

$$\begin{aligned} x_{1} x_{0} x_{2} - x_{2} x_{0} x_{1} \end{aligned}$$

is a \(\operatorname{GL}_{0}(\mathbb{R}^{2})\) invariant of weight 1. We now formalize this idea and show that we get every \(\operatorname{GL}_{0}\) invariant this way.

Define the linear map \(\mathsf{Remove}\) of “removing instances of \(x_{0}\)” on monomials, as

$$\begin{aligned} \mathsf{Remove}\,x_{i_{1}} .. x_{i_{m}} := \prod _{\ell : i_{\ell } \ne 0} x_{i_{\ell }}, \end{aligned}$$

so for example

$$\begin{aligned} \mathsf{Remove}\,x_{0} x_{1} x_{1} x_{0} x_{3} &= x_{1} x_{1} x_{3} \\ \mathsf{Remove}\,x_{0} x_{0} &= 1. \end{aligned}$$

Define for \(U \subset [m]\) and \(i = (i_{1},.., i_{m})\)

$$\begin{aligned} i|_{U} = (i_{\ell }: \ell = 1,.., m; \ell \in U). \end{aligned}$$

Define the linear map of restriction to \(U\) on polynomials of order \(m\) by setting, on monomials,

$$\begin{aligned} x_{i}|_{U} := x_{i|_{U}} \end{aligned}$$

so for example

$$\begin{aligned} x_{i_{1}} x_{i_{2}} x_{i_{3}} |_{\{1,3\}} = x_{i_{1}} x_{i_{3}}. \end{aligned}$$

For \(z = (z_{1},..,z_{m+1}) \in \mathbb{N}^{m+1}\) define \(\mathsf{Insert}_{z}\), a linear operator on polynomials of order \(m\), by its action on monomials as follows. For a monomial \(x_{i_{1}} .. x_{i_{m}}\) of order \(m\), \(\mathsf{Insert}_{z}\) inserts \(z_{1}\) occurrences of \(x_{0}\) before \(x_{i_{1}}\), \(z_{2}\) occurrences of \(x_{0}\) before \(x_{i_{2}}\), .., \(z_{m}\) occurrences of \(x_{0}\) before \(x_{i_{m}}\) and \(z_{m+1}\) occurrences of \(x_{0}\) after \(x_{i_{m}}\). For example

$$\begin{aligned} \mathsf{Insert}_{(2,1,4)} x_{1} x_{2} = x_{0} x_{0} x_{1} x_{0} x_{2} x_{0} x_{0} x_{0} x_{0}. \end{aligned}$$
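Both operators act letter by letter, so they are immediate to implement on words; here is a direct sketch (names ours), checked against the two examples in the text.

```python
def remove(word):
    """Delete every occurrence of the letter 0 (the Remove operator)."""
    return tuple(letter for letter in word if letter != 0)

def insert(z, word):
    """Insert z[k] copies of 0 before the k-th letter and z[-1] at the end."""
    assert len(z) == len(word) + 1
    out = []
    for z_k, letter in zip(z, word):
        out += [0] * z_k + [letter]
    return tuple(out + [0] * z[-1])

assert remove((0, 1, 1, 0, 3)) == (1, 1, 3)
assert insert((2, 1, 4), (1, 2)) == (0, 0, 1, 0, 2, 0, 0, 0, 0)
```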

Theorem 2

A basis for the space of \(\operatorname{GL}_{0}\) invariants of weight \(w\), homogeneous of degree \(m\), is given by the polynomials

$$\begin{aligned} \mathsf{Insert}_{z} \psi , \end{aligned}$$

with \(0 \le n \le m\), \(\psi \) ranging over the basis for \(\operatorname{GL}\) invariants of weight \(w\) and homogeneity \(n\) (Proposition 3.11) and \(z \in \mathbb{N}^{n+1}\) such that \(\sum_{\ell }z_{\ell }= m - n\).

Proof

Let \(n\), \(\psi \), \(z\) be as in the statement. Then, for \(A_{0} = \operatorname{diag}(1, A) \in \operatorname{GL}_{0}(\mathbb{R}^{d})\), with \(A \in \operatorname{GL}(\mathbb{R}^{d})\),

$$\begin{aligned} A_{0}\ \mathsf{Insert}_{z} \psi = \mathsf{Insert}_{z} A \psi = ( \det A)^{w}\,\mathsf{Insert}_{z}\psi . \end{aligned}$$

Therefore \(\mathsf{Insert}_{z} \psi \) is \(\operatorname{GL}_{0}\) invariant of weight \(w\).

On the other hand, let \(\phi \), homogeneous of order \(m\), be a \(\operatorname{GL}_{0}\) invariant of weight \(w\). Define for \(U \subset [m]\)

$$\begin{aligned} \phi ^{U} := \sum_{i : i_{\ell }= 0, \ell \in U; i_{j} \ne 0, j \notin U} \langle \phi , x_{i} \rangle x_{i}, \end{aligned}$$

which collects all monomials having \(x_{0}\) exactly at the positions in \(U\). Then

$$\begin{aligned} \phi = \sum_{U \subset [m]} \phi ^{U}. \end{aligned}$$

Now, since \(\phi \) is \(\operatorname{GL}_{0}\) invariant of weight \(w\) and since \(\operatorname{GL}_{0}\) leaves

$$\begin{aligned} \operatorname{span} \{ x_{i} : i_{\ell }= 0, \ell \in U; i_{j} \ne 0, j \notin U \} \end{aligned}$$

invariant, we get that \(\phi ^{U}\) is \(\operatorname{GL}_{0}\) invariant of weight \(w\). Clearly, there is \(0 \le n \le m\) and \(z \in \mathbb{N}^{n+1}\) such that

$$\begin{aligned} \mathsf{Insert}_{z}\,\mathsf{Remove}\,\phi ^{U} = \phi ^{U}. \end{aligned}$$

Lastly, \(\mathsf{Remove}\,\phi ^{U}\) is \(\operatorname{GL}\) invariant, since for \(A_{0} = \operatorname{diag}(1, A) \in \operatorname{GL} _{0}(\mathbb{R}^{d})\), with \(A \in \operatorname{GL}(\mathbb{R}^{d})\),

$$\begin{aligned} A \, \mathsf{Remove}\,\phi ^{U} = \mathsf{Remove}\,A_{0} \phi ^{U} = ( \det A_{0})^{w}\,\mathsf{Remove}\, \phi ^{U} = (\det A)^{w}\,\mathsf{Remove}\,\phi ^{U}. \end{aligned}$$

Hence every invariant is in the span of the set given in the statement. These elements are linearly independent (distinct \(z\) produce monomials with distinct patterns of \(x_{0}\), and \(\mathsf{Insert}_{z}\) is injective), and hence form a basis. □

The corresponding statements for rotations and permutations are completely analogous, so we omit them.

7 Discussion and Open Problems

We have presented a novel way to extract invariant features of \(d\)-dimensional curves, based on the iterated-integral signature. We have identified all those features that can be written as a finite linear combination of terms in the signature.

Among the techniques used previously for finding invariants of curves, the method of “integral invariants” [13] is closest to ours (it has been used for example in [19] for character recognition). In that work, for a curve \(X: [0,T] \to \mathbb{R}^{d}\), \(d=2,3\), the building blocks for invariants are expressions of the form

$$\begin{aligned} \int _{0}^{T} \bigl(X^{1}_{r} \bigr)^{\alpha _{1}} .. \bigl(X^{d}_{r}\bigr)^{\alpha _{d}} dX ^{i}_{r}, \quad i=1,..,d. \end{aligned}$$
(8)

Using an algorithmic procedure, some invariants to certain subgroups \(G \subset \operatorname{GL}(\mathbb{R}^{d})\) are derived. In particular for \(d=2\) and \(G=\operatorname{GL}(\mathbb{R}^{d})\) the following invariants are given

$$\begin{aligned} I_{1} =& \int _{0}^{T} X^{1}_{0,r} dX^{2}_{r} - \frac{1}{2} X^{1}_{0,T} X^{2}_{0,T} \\ I_{2} =& \int _{0}^{T} X^{1}_{0,r} X^{2}_{0,r} dX^{2}_{r}\ X^{1}_{0,T} - \frac{1}{2} \int _{0}^{T} \bigl(X^{1}_{0,r} \bigr)^{2} dX^{2}_{r}\ X^{2}_{0,T} \\ I_{3} =& \int _{0}^{T} X^{1}_{0,r} \bigl(X^{2}_{0,r}\bigr)^{2} dX^{2}_{r} \ X^{2}_{0,T} - \int _{0}^{T} \bigl(X^{1}_{0,r} \bigr)^{2} X^{2}_{0,r} dX^{2}_{r} \ X^{1}_{0,T} X^{2}_{0,T} \\ &{} +\frac{1}{3} \int _{0}^{T} \bigl(X^{1}_{0,r}\bigr)^{3} dX^{2}_{r}\ X^{2}_{0,T}X^{2}_{0,T} - \frac{1}{12} \bigl(X^{1}_{0,T} \bigr)^{3} \bigl(X^{2}_{0,T}\bigr)^{3}. \end{aligned}$$

By the shuffle identity (Lemma 2.1), we can write these as \(I_{i} = \langle S(X)_{0,T}, \phi _{i} \rangle \), with

$$\begin{aligned} \phi _{1} :=& \frac{1}{2} \mathtt {12} - \frac{1}{2} \mathtt {21} \\ \phi _{2} :=& \frac{1}{3} \mathtt {1221} +\frac{1}{3} \mathtt {1212} - \frac{2}{3} \mathtt {1122} +\frac{1}{3} \mathtt {2121} + \frac{1}{3} \mathtt {2112} -\frac{2}{3} \mathtt {2211} \\ \phi _{3} :=& - \mathtt {121212} - \mathtt {211122} + \mathtt {212121} + \mathtt {221112} - \mathtt {121221} + \mathtt {122211} - \mathtt {112212} \\ &{}+ \mathtt {122112} - \mathtt {211212} - \mathtt {211221} - \mathtt {121122} + \mathtt {122121} -3 \mathtt {222111} +3 \mathtt {111222} \\ &{}+ \mathtt {221121} + \mathtt {212211} - \mathtt {112122} + \mathtt {212112} - \mathtt {112221} + \mathtt {221211}. \end{aligned}$$

One can easily check that these lie in the linear span of the invariants given in Proposition 4.4 (or Theorem 4.5), as expected.

We note that expressions of the form (8) are not enough to uniquely characterize a path. Indeed, the following lemma gives a counterexample to the conjecture on p. 906 in [13] that “signatures of non-equivalent curves are different” (here, the “signature” of a curve means the set of expressions of the form (8)).

Lemma 1

Consider the two closed curves\(X^{+}\)and\(X^{-}\)in\(\mathbb{R}^{2}\), given for\(t\)in\([0,2\pi ]\)as

$$\begin{aligned} X^{\pm ,1}_{t} &=\pm \cos t \\ X^{\pm ,2}_{t} &=\sin 2t. \end{aligned}$$

Then all the expressions (8) coincide on\(X^{+}\)and\(X^{-}\).Footnote 14

These curves both trace out the same figure, the lemniscate of Gerono, which is illustrated in Fig. 2.

Fig. 2

The lemniscate of Gerono. Traversing it once from each of the two starting points indicated gives two distinct closed curves with distinct iterated-integral signatures, but which cannot be distinguished with the “signature” of [13]

Proof

Consider the function \(f^{m}_{n}(t):=\cos ^{m} t\sin ^{n} t\), where \(m\) and \(n\) are nonnegative integers. If \(n\) is odd, then \(f^{m}_{n}(t)=-f ^{m}_{n}(2\pi - t)\) so \(\int _{0}^{2\pi }f^{m}_{n}(t)\,dt\) is zero. If \(m\) is odd, then

$$\begin{aligned} \int _{0}^{2\pi }f^{m}_{n}(t) \,dt=- \int _{\frac{\pi }{2}}^{-\frac{3 \pi }{2}}f^{m}_{n}\biggl( \frac{\pi }{2}-t\biggr)\,dt= \int ^{\frac{\pi }{2}}_{-\frac{3 \pi }{2}}f^{n}_{m}(t) \,dt= \int ^{2\pi }_{0}f^{n}_{m}(t) \,dt=0. \end{aligned}$$

where the final integral vanishes by the first case, since the exponent \(m\) of \(\sin \) in \(f^{n}_{m}\) is odd. Thus \(\int _{0}^{2\pi }f^{m}_{n}(t)\,dt\) can only be nonzero if \(m\) and \(n\) are both even.

Any expression like (8) is either of the form

$$\begin{aligned} A_{m,n}^{\pm } &= \int _{0}^{2\pi } \bigl(X^{\pm ,1}_{t} \bigr)^{m} \bigl(X ^{\pm ,2}_{t} \bigr)^{n}\,dX^{\pm ,1}_{t} \\ &= \int _{0}^{2\pi }(\pm 1)^{m}\cos ^{m} t\sin ^{n} 2t (\mp \sin t)\,dt \\ &=\mp 2^{n}(\pm 1)^{m} \int _{0}^{2\pi }\cos ^{m+n}t\sin ^{n+1}t\,dt \\ &= \textstyle\begin{cases} 0 &\text{$n$ even or $m$ even} \\ -2^{n}\int _{0}^{2\pi }\cos ^{m+n}t\sin ^{n+1}t\,dt &\text{otherwise} \end{cases}\displaystyle \end{aligned}$$

or of the form

$$\begin{aligned} B_{m,n}^{\pm } &= \int _{0}^{2\pi } \bigl(X^{\pm ,1}_{t} \bigr)^{m} \bigl(X ^{\pm ,2}_{t} \bigr)^{n}\,dX^{\pm ,2}_{t} \\ &= \int _{0}^{2\pi }(\pm 1)^{m}\cos ^{m} t\sin ^{n} 2t (2\cos 2t)\,dt \\ &=2^{n+1}(\pm 1)^{m} \int _{0}^{2\pi }\cos ^{m+n}t\sin ^{n}t\bigl(\cos ^{2}t- \sin ^{2}t\bigr)\,dt \\ &= \textstyle\begin{cases} 0 &\text{$n$ odd or $m$ odd} \\ 2^{n+1}\int _{0}^{2\pi }\cos ^{m+n}t\sin ^{n}t(\cos ^{2}t-\sin ^{2}t) \,dt &\text{otherwise}. \end{cases}\displaystyle \end{aligned}$$

Both these expressions are free from the symbols ± and ∓. Therefore these two curves have the same values on terms of the form (8). □
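The cancellation can also be observed numerically. A quick quadrature sketch (numpy; names and tolerances ours) confirms that the moments (8) of small order coincide on \(X^{+}\) and \(X^{-}\).

```python
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 100001)
dt = t[1] - t[0]

def integrate(f):
    """Composite trapezoidal rule on the grid t."""
    return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

def moments(sign, m, n):
    """A_{m,n} and B_{m,n} from (8) for the curve X^{sign}, sign = +1 or -1."""
    x1 = sign * np.cos(t)
    x2 = np.sin(2.0 * t)
    f = x1**m * x2**n
    return integrate(f * np.gradient(x1, dt)), integrate(f * np.gradient(x2, dt))

for m in range(5):
    for n in range(5):
        a_p, b_p = moments(+1, m, n)
        a_m, b_m = moments(-1, m, n)
        assert abs(a_p - a_m) < 1e-6 and abs(b_p - b_m) < 1e-6
```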

Moreover, the algorithmic nature of the construction in [13] makes it difficult to proceed to invariants of higher order. In contrast, our method gives an explicit linear basis for the invariants under consideration up to any order.

Regarding the question of whether our invariants are complete we propose the following conjecture. As shown in [22], if \(S(X)_{0,T} = S(Y)_{0,T}\) for some curves \(X\) and \(Y\), then \(X\) is “tree-like equivalent” to \(Y\). For the concrete definition of this equivalence we refer to their paper, but let us give one example. Consider, in \(d=2\), the constant path \(X_{t} := (0,0)\), \(t \in [0,T]\), and the piecewise linear path \(Y\) through the points \((0,0)\), \((1,0)\) and \((0,0)\). One can check that

$$\begin{aligned} S(X)_{0,T} = S(Y)_{0,T} = 1. \end{aligned}$$

The signature has no chance of picking up these kinds of “excursions” in a path; this concept is formalized in “tree-like equivalence”. We suspect that the following holds true (with corresponding formulations for the other subgroups of \(\operatorname{GL}(\mathbb{R}^{d})\)).

Conjecture 2

Let \(X, Y: [0,T] \to \mathbb{R}^{d}\) be two curves such that

$$\begin{aligned} \bigl\langle S(X)_{0,T}, \phi \bigr\rangle = \bigl\langle S(Y)_{0,T}, \phi \bigr\rangle , \end{aligned}$$

for all \(\operatorname{SO}\) invariants given in Proposition 4.4. Then there is a curve \(\bar{X}\), tree-like equivalent to \(X\), and a rotation \(A \in \operatorname{SO}(\mathbb{R}^{d})\), such that

$$\begin{aligned} A \bar{X} = Y. \end{aligned}$$

In Proposition 3.11, Proposition 4.4 and Proposition 5.4 we have established a linear basis for invariants for every homogeneity. As already mentioned in Remark 3.15, owing to the shuffle identity, there are algebraic relations between elements of different homogeneity. An interesting open problem is then to find a minimal set of generators for the set of invariants, considered as a subalgebra of the shuffle algebra. This applies to all subgroups of \(\operatorname{GL}( \mathbb{R}^{d})\) and their corresponding invariants.

Lastly, a word on (computational) complexity. We have seen in Remark 3.10 the dimensions of \(\operatorname{GL}\) invariant elements (which is a lower bound on the dimensions of \(\operatorname{SO}\) invariant elements).Footnote 15 In Remark 5.5 we have seen the dimensions for the permutation invariant elements.

Computing the signature itself up to level \(n\) has complexity \(\varOmega (d^{n})\), since \(d+ .. + d^{n}\) integrals need to be calculated. So any method that calculates the invariant features of a curve \(X\) by first calculating its signature and extracting them (see Remark 3.14) will have computational complexity dominated by the calculation of the signature. Furthermore, the calculation of the invariant elements is a computation that can be done offline (they do not depend on the curve \(X\)).
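For reference, the signature of a piecewise linear curve can be computed in a few lines via Chen's identity, and the \(d^{n}\) growth is visible directly in the tensor shapes. The following is an unoptimized sketch (names ours); the final lines check the \(\operatorname{SO}\) invariant \(\mathtt{11}+\mathtt{22}\) on a curve and on a rotated copy.

```python
import numpy as np

def sig_segment(v, depth):
    """Signature of a straight segment with increment v: level k is v tensor k, over k!."""
    levels = [np.ones(())]
    for k in range(1, depth + 1):
        levels.append(np.multiply.outer(levels[-1], v) / k)
    return levels

def chen(s, u, depth):
    """Chen's identity: level k of a concatenation is sum_j s_j tensor u_{k-j}."""
    return [sum(np.multiply.outer(s[j], u[k - j]) for j in range(k + 1))
            for k in range(depth + 1)]

def signature(X, depth):
    """Signature (levels 0..depth) of the piecewise linear path through the rows of X."""
    s = sig_segment(X[1] - X[0], depth)
    for a, b in zip(X[1:-1], X[2:]):
        s = chen(s, sig_segment(b - a, depth), depth)
    return s

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)).cumsum(axis=0)
c, s_ = np.cos(0.7), np.sin(0.7)
A = np.array([[c, -s_], [s_, c]])               # a rotation
S, SA = signature(X, 2), signature(X @ A.T, 2)
print(S[2][0, 0] + S[2][1, 1], SA[2][0, 0] + SA[2][1, 1])   # equal values
```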

This leaves several directions of future research.

  • Is it possible to apply kernelization techniques similar to the ones used for the entire (non-invariant) signature in [25]? In the non-invariant setting, these techniques allow one to use information from the signature up to high levels and dimensions in certain learning algorithms.

  • We have studied in this paper linear expressions on the signature that are invariant to a group action. This was justified by the shuffle identity (Lemma 2.1), which tells us that any polynomial functional on the signature can in fact be linearized. One can also consider a fixed level \(n\) of the signature and look for all nonlinear expressions that are invariant under the group action. This is the classical problem of invariant theory for polynomial rings [47, Sect. 4]. On the one hand, this makes it possible to “peek ahead” in the signature, since one gets invariant information that would otherwise only be seen in linear expressions of higher levels than \(n\). On the other hand, except for special cases, there are no explicit expressions for these nonlinear invariants. One has to proceed algorithmically (for example via Derksen’s algorithm [8]), which only works for low dimension \(d\) and low levels \(n\). Since the calculation of those nonlinear invariant elements can also be done offline, it would nonetheless be nice to have a tabulation of nonlinear invariants (as far as existing algorithms can reach).

  • For \(\operatorname{GL}\) invariants, in Remark 3.25 we conjecture the existence of a “\(\operatorname{GL}\) invariant” signature. This could improve computation time, since no non-invariant integrals have to be computed.