
1 Introduction

In 1996, Olsen et al. (see [23]) proved the existence of an isomorphism between the family \(\fancyscript{C}\) of two-dimensional copulas (endowed with the so-called star product) and the family \(\fancyscript{M}\) of all Markov operators (with the standard composition as binary operation). Using disintegration (see [29]) allows one to express the aforementioned Markov operators in terms of Markov kernels, resulting in a one-to-one correspondence of \(\fancyscript{C}\) with the family \(\fancyscript{K}\) of all Markov kernels having the Lebesgue measure \(\lambda \) on \([0,1]\) as fixed point. Identifying every copula with its Markov kernel allows one to define new metrics \(D_1,D_2,D_\infty \) which, contrary to the uniform one, strictly separate independence from complete dependence (full predictability). Additionally, the ‘translation’ of various copula-related concepts from \(\fancyscript{C}\) to \(\fancyscript{M}\) and \(\fancyscript{K}\) has proved useful insofar as it allowed both for alternative simple proofs of already known properties and for new and interesting results. Section 3 of this paper is a quick (and incomplete) survey of some useful consequences of this translation. In particular, we mention the fact that for each copula \(A\in \fancyscript{C}\), the Cesàro averages of the iterates of the star product of \(A\) with itself converge to an idempotent copula \(\hat{A}\) w.r.t. each of the three metrics mentioned before, i.e., we have

$$ \lim _{n \rightarrow \infty } D_1(s_{*n}(A), \hat{A}) = 0 $$

whereby \(s_{*n}(A)=\frac{1}{n} \sum _{i=1}^n A^{*i}\) for every \(n \in \mathbb {N}\). Section 4 contains some new unpublished results and proves that the idempotent limit copula \(\hat{A}\) must have a very simple (ordinal-sum-like) form if the Markov operator \(T_A\) corresponding to \(A\) is quasi-constrictive in the sense of Lasota ([1, 15, 18]).

2 Notation and Preliminaries

As already mentioned before, \(\fancyscript{C}\) will denote the family of all (two-dimensional) copulas and \(d_\infty \) the uniform metric on \(\fancyscript{C}\). For properties of copulas, we refer to [8, 22, 26]. For every \(A\in \fancyscript{C}\), \(\mu _A\) will denote the corresponding doubly stochastic measure and \(\fancyscript{P}_\fancyscript{C}\) the class of all these doubly stochastic measures. Since copulas are the restrictions of two-dimensional distribution functions with \(\fancyscript{U}(0,1)\)-marginals to \([0,1]^2\), the Lebesgue decomposition of every element in \(\fancyscript{P}_\fancyscript{C}\) has no discrete component. The Lebesgue measure on \([0,1]\) and \([0,1]^2\) will be denoted by \(\lambda \) and \(\lambda _2\), respectively. For every metric space \((\varOmega ,d)\), the Borel \(\sigma \)-field on \(\varOmega \) will be denoted by \(\fancyscript{B}(\varOmega )\). A Markov kernel from \(\mathbb {R}\) to \(\fancyscript{B}(\mathbb {R})\) is a mapping \(K: \mathbb {R} \times \fancyscript{B}(\mathbb {R})\rightarrow [0,1]\) such that \(x \mapsto K(x,B)\) is measurable for every fixed \(B \in \fancyscript{B}(\mathbb {R})\) and \(B \mapsto K(x,B)\) is a probability measure for every fixed \(x \in \mathbb {R}\). Suppose that \(X,Y\) are real-valued random variables on a probability space \((\varOmega , \fancyscript{A}, \fancyscript{P})\). Then a Markov kernel \(K:\mathbb {R}\times \fancyscript{B}(\mathbb {R}) \rightarrow [0,1]\) is called a regular conditional distribution of \(Y\) given \(X\) if for every \(B \in \fancyscript{B}(\mathbb {R})\)

$$\begin{aligned} K(X(\omega ),B)=\mathbb {E}(\mathbf {1}_B\circ Y |X)(\omega ) \end{aligned}$$
(1)

holds \(\fancyscript{P}\)-a.s. It is well known that for each pair \((X,Y)\) of real-valued random variables a regular conditional distribution \(K(\cdot ,\cdot )\) of \(Y\) given \(X\) exists, that \(K(\cdot ,\cdot )\) is unique \(\fancyscript{P}^X\)-a.s. (i.e., unique for \(\fancyscript{P}^X\)-almost all \(x \in \mathbb {R}\)) and that \(K(\cdot ,\cdot )\) only depends on the joint distribution \(\fancyscript{P}^{X \otimes Y}\). Hence, given \(A\in \fancyscript{C}\) and \((X,Y)\sim A\), we will denote (a version of) the regular conditional distribution of \(Y\) given \(X\) by \(K_A(\cdot ,\cdot )\) and refer to \(K_A(\cdot ,\cdot )\) simply as the regular conditional distribution of \(A\) or as the Markov kernel of \(A\). Note that for every \(A\in \fancyscript{C}\), its regular conditional distribution \(K_A(\cdot ,\cdot )\), and every Borel set \(G \in \fancyscript{B}([0,1]^2)\) we have

$$\begin{aligned} \int \limits _{[0,1]} K_A(x,G_x)\, d\lambda (x) = \mu _A(G), \end{aligned}$$
(2)

whereby \(G_x:=\{y \in [0,1]: (x,y) \in G\}\) for every \(x \in [0,1]\). Hence, as special case,

$$\begin{aligned} \int \limits _{[0,1]} K_A(x,F)\, d\lambda (x) = \lambda (F) \end{aligned}$$
(3)

for every \(F \in \fancyscript{B}([0,1])\). On the other hand, every Markov kernel \(K:[0,1]\times \fancyscript{B}([0,1]) \rightarrow [0,1]\) fulfilling (3) induces a unique element \(\mu \in \fancyscript{P}_\fancyscript{C}([0,1]^2)\) via (2). For more details and properties of conditional expectation, regular conditional distributions, and disintegration see [13, 14].

\(\fancyscript{T}\) will denote the family of all \(\lambda \)-preserving transformations \(h:[0,1]\rightarrow [0,1]\) (see [34]), \(\fancyscript{T}_p\) the subset of all bijective \(h \in \fancyscript{T}\). A copula \(A\in \fancyscript{C}\) will be called completely dependent if and only if there exists \(h \in \fancyscript{T}\) such that \(K(x,E):=\mathbf {1}_E(hx)\) is a regular conditional distribution of \(A\) (see [17, 29] for equivalent definitions and main properties). For every \(h \in \fancyscript{T}\), the corresponding completely dependent copula will be denoted by \(C_h\), the class of all completely dependent copulas by \(\fancyscript{C}_d\).
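To make the kernel point of view concrete, the following small numerical sketch (ours, not part of the text; the transformation \(h\) and the grid size are illustrative choices) verifies the invariance condition (3) both for the kernel \(K(x,E)=\mathbf {1}_E(h(x))\) of the completely dependent copula \(C_h\) with the \(\lambda \)-preserving map \(h(x)=2x \bmod 1\) and for the kernel \(K(x,[0,y])=y\) of the product copula \(\varPi \).

```python
import numpy as np

# Illustrative sketch: h(x) = 2x mod 1 is lambda-preserving, so
# K(x, E) = 1_E(h(x)) is the Markov kernel of a completely dependent
# copula C_h.  We check the invariance condition (3),
#   int_0^1 K(x, [0, t]) dlambda(x) = t,
# by midpoint quadrature.

def h(x):
    return (2.0 * x) % 1.0

n = 200_000
x = (np.arange(n) + 0.5) / n            # midpoint grid on [0, 1]

for t in (0.1, 0.25, 0.5, 0.9):
    K_Ch = (h(x) <= t).astype(float)    # kernel of C_h at F = [0, t]
    K_Pi = np.full(n, t)                # kernel of the product copula Pi
    assert abs(K_Ch.mean() - t) < 1e-3  # (3) holds for C_h
    assert abs(K_Pi.mean() - t) < 1e-3  # (3) holds for Pi
print("invariance condition (3) verified")
```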

A linear operator \(T\) on \(L^1([0,1]):=L^1([0,1], \fancyscript{B}([0,1]),\lambda )\) is called a Markov operator ([3, 23]) if it fulfills the following three properties:

  1. (M1)

    \(T\) is positive, i.e., \(T(f) \ge 0\) whenever \(f\ge 0 \)

  2. (M2)

    \(T(\mathbf {1}_{[0,1]})=\mathbf {1}_{[0,1]}\)

  3. (M3)

    \(\int _{[0,1]} (Tf)(x)\, d\lambda (x) =\int _{[0,1]} f(x)\, d\lambda (x)\) for every \(f \in L^1([0,1])\)

As mentioned in the introduction \(\fancyscript{M}\) will denote the class of all Markov operators on \(L^1([0,1])\). It is straightforward to see that the operator norm of \(T\) is one, i.e., \(\Vert T \Vert :=\sup \{\Vert Tf \Vert _1:\, \Vert f \Vert _1 \le 1\} =1\) holds. According to [23] there is a one-to-one correspondence between \(\fancyscript{C}\) and \(\fancyscript{M}\)—in fact, the mappings \(\varPhi : \fancyscript{C}\rightarrow \fancyscript{M}\) and \(\varPsi : \fancyscript{M}\rightarrow \fancyscript{C}\), defined by

$$\begin{aligned} \varPhi (A)(f)(x):&=(T_Af)(x):= \frac{d }{d x} \int \limits _{[0,1]} A_{,2}(x,t) f(t)\, d\lambda (t), \\ \varPsi (T)(x,y):&=A_T(x,y) := \int \limits _{[0,x]} (T\mathbf {1}_{[0,y]})(t)\, d\lambda (t) \end{aligned}$$
(4)

for every \( f \in L^1([0,1])\) and \((x,y) \in [0,1]^2\) (\(A_{,2}\) denoting the partial derivative of \(A\) w.r.t. \(y\)), fulfill \(\varPsi \circ \varPhi = id_\fancyscript{C}\) and \(\varPhi \circ \varPsi = id_\fancyscript{M}\). Note that in case of \(f:=\mathbf {1}_{[0,y]}\) we have \((T_A\mathbf {1}_{[0,y]})(x) = A_{,1}(x,y) \,\,\lambda \)-a.s. According to [29] the first equality in (4) can be simplified to

$$\begin{aligned} (T_Af)(x)=\mathbb {E}(f\circ Y|X=x) = \int \limits _{[0,1]} f(y) K_A(x,dy) \qquad \lambda \text {-a.s.} \end{aligned}$$
(5)

It is not difficult to show that the uniform metric \(d_\infty \) is a metrization of the weak operator topology on \(\fancyscript{M}\) (see [23]).

3 Some Consequences of the Markov Kernel Approach

In this section, we give a quick survey showing the usefulness of the Markov kernel perspective of two-dimensional copulas.

3.1 Strong Metrics on \(\fancyscript{C}\)

Expressing copulas in terms of their corresponding Markov kernels, the metrics \(D_1,D_2,D_\infty \) on \(\fancyscript{C}\) can be defined as follows:

$$\begin{aligned} D_1(A,B) := \int \limits _{[0,1]} \int \limits _{[0,1]} \big \vert K_A(x,[0,y]) - K_B(x,[0,y]) \big \vert d \lambda (x)\,d\lambda (y) \end{aligned}$$
(6)
$$\begin{aligned} D_2^2(A,B) := \int \limits _{[0,1]} \int \limits _{[0,1]} \big \vert K_A(x,[0,y]) - K_B(x,[0,y]) \big \vert ^2 d \lambda (x)\,d\lambda (y) \end{aligned}$$
(7)
$$\begin{aligned} D_\infty (A,B) := \sup _{y \in [0,1]} \int \limits _{[0,1]} \big \vert K_A(x,[0,y]) - K_B(x,[0,y]) \big \vert \, d \lambda (x) \quad \quad \,\, \end{aligned}$$
(8)
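As a numerical illustration of definition (6) (our sketch, not part of the paper): the comonotonicity copula \(M(x,y)=\min (x,y)\) is completely dependent with kernel \(K_M(x,[0,y])=\mathbf {1}_{[x \le y]}\), and a midpoint-rule approximation of \(D_1(M,\varPi )\) gives a value close to \(1/3\), the maximal value identified in Theorem 4 below.

```python
import numpy as np

# Numerical sketch (ours): approximate D_1(M, Pi) from (6) on a
# midpoint grid.  M(x,y) = min(x,y) has kernel K_M(x,[0,y]) = 1_{x<=y},
# while K_Pi(x,[0,y]) = y.

n = 2000
x = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(x, x, indexing="ij")

K_M = (X <= Y).astype(float)
K_Pi = Y

D1 = np.abs(K_M - K_Pi).mean()          # double integral over [0,1]^2
print(D1)                               # close to 1/3
assert abs(D1 - 1/3) < 1e-3
```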

The following two theorems state the most important properties of the metrics \(D_1,D_2\) and \(D_\infty \).

Theorem 1

([29]) Suppose that \(A,A_1,A_2,\ldots \) are copulas and let \(T,T_1,T_2,\ldots \) denote the corresponding Markov operators. Then the following four conditions are equivalent:

  1. (a)

    \(\lim _{n \rightarrow \infty } D_1(A_n,A)=0\)

  2. (b)

    \(\lim _{n \rightarrow \infty } D_\infty (A_n,A)=0\)

  3. (c)

    \(\lim _{n \rightarrow \infty } \Vert T_nf - Tf \Vert _1 =0\) for every \(f \in L^1([0,1])\)

  4. (d)

    \(\lim _{n \rightarrow \infty } D_2(A_n,A)=0\)

As a consequence, each of the three metrics \(D_1,D_2\) and \(D_\infty \) is a metrization of the strong operator topology on \(\fancyscript{M}\).

Theorem 2

([29]) The metric space \((\fancyscript{C},D_1)\) is complete and separable. The same holds for \((\fancyscript{C},D_2)\) and \((\fancyscript{C},D_\infty )\). The topology induced on \(\fancyscript{C}\) by \(D_1\) is strictly finer than the one induced by \(d_\infty \).

Remark 3

The idea of constructing metrics via conditioning to the first coordinate can be easily extended to the family \(\fancyscript{C}^m\) of all \(m\)-dimensional copulas for arbitrary \(m\ge 3\). For instance, the multivariate version of \(D_1\) on \(\fancyscript{C}^m\) can be defined by

$$ D_1(A,B)=\int \limits _{[0,1]^{m-1}} \int \limits _{[0,1]} \vert K_A(x,[\mathbf {0},\mathbf {y}]) - K_B(x,[\mathbf {0},\mathbf {y}]) \vert d \lambda (x) d\lambda ^{m-1}(\mathbf {y}), $$

whereby \([\mathbf {0},\mathbf {y}]=\times _{i=1}^{m-1} [0,y_i] \) and \(K_A\) (resp. \(K_B\)) denotes the Markov kernel (regular conditional distribution) of \(\mathbf {Y}\) given \(X\) for \((X,\mathbf {Y}) \sim A\) (resp. \(B\)). As shown in [11], the resulting metric spaces \((\fancyscript{C}^m,D_1), (\fancyscript{C}^m,D_2), (\fancyscript{C}^m,D_\infty )\) are again complete and separable.

3.2 Induced Dependence Measures

The main motivation for the consideration of conditioning-based metrics like \(D_1\) was the need for a metric that, contrary to \(d_\infty \), is capable of distinguishing extreme types of statistical dependence, i.e., independence and complete dependence. For the uniform metric \(d_\infty \), it is straightforward to construct sequences \((C_{h_n})_{n \in \mathbb {N}}\) of completely dependent copulas (in fact, even sequences of shuffles of \(M\), see [9, 22]) fulfilling \(\lim _{n \rightarrow \infty } d_\infty (C_{h_n},\varPi )=0\)—for \(D_1\), however, the following result holds:

Theorem 4

([29]) For every \(A \in \fancyscript{C}\) we have \(D_1(A,\varPi ) \le 1/3\). Furthermore, equality \(D_1(A,\varPi ) = 1/3\) holds if and only if \(A \in \fancyscript{C}_d\).

As a straightforward consequence, we may define \(\tau _1: \fancyscript{C}\rightarrow [0,1]\) by

$$\begin{aligned} \tau _1(A):=3 D_1(A,\varPi ). \end{aligned}$$
(9)

This dependence measure \(\tau _1\) exhibits the seemingly natural properties that (i) exactly members of the family \(\fancyscript{C}_d\) (describing complete dependence) are assigned maximum dependence (equal to one) and (ii) \(\varPi \) is the only copula with minimum dependence (equal to zero). Note that (i) means that \(\tau _1(A)\) is maximal if and only if \(A\) describes the situation of full predictability, i.e., asset \(Y\) is a deterministic function of asset \(X\). In particular, all shuffles of \(M\) have maximum dependence. Dependence measures based on the metric \(D_2\) may be constructed analogously.

Example 5

For the Farlie-Gumbel-Morgenstern family \((G_{\theta })_{\theta \in [-1,1]}\) of copulas (see [22]), given by

$$\begin{aligned} G_{\theta }(x,y)=xy + \theta xy(1-x)(1-y), \end{aligned}$$
(10)

it is straightforward to show that \(\tau _1(G_{\theta }) = \frac{\vert \theta \vert }{4}\) holds for every \(\theta \in [-1,1]\) (for details see [29]).
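Example 5 can be checked numerically (our sketch): the kernel of \(G_{\theta }\) is \(K(x,[0,y]) = y + \theta y(1-y)(1-2x)\), the partial derivative of \(G_{\theta }\) w.r.t. \(x\), so \(3 D_1(G_{\theta },\varPi )\) should reproduce \(\vert \theta \vert /4\).

```python
import numpy as np

# Sketch verifying Example 5: for the FGM copula G_theta the kernel is
# K(x, [0,y]) = y + theta*y*(1-y)*(1-2x), so tau_1 = 3*D_1(G_theta, Pi)
# should equal |theta|/4.

n = 2000
x = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(x, x, indexing="ij")

for theta in (-1.0, -0.3, 0.5, 1.0):
    K_G = Y + theta * Y * (1 - Y) * (1 - 2 * X)
    tau1 = 3 * np.abs(K_G - Y).mean()   # 3 * D_1(G_theta, Pi)
    assert abs(tau1 - abs(theta) / 4) < 1e-4
print("tau_1(G_theta) = |theta|/4 confirmed numerically")
```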

Example 6

For the Marshall-Olkin family \((M_{\alpha ,\beta })_{(\alpha ,\beta ) \in [0,1]^2}\) of copulas (see [22]), given by

$$\begin{aligned} M_{\alpha ,\beta } (x,y) = \left\{ \begin{array}{ll} x^{1-\alpha }\,y &{} \quad \text {if}\; x^\alpha \ge y^\beta \\ xy^{1-\beta } &{} \quad \text {if}\; x^\alpha \le y^\beta , \end{array} \right. \end{aligned}$$
(11)

it can be shown that

$$\begin{aligned} \tau _1(M_{\alpha ,\beta }) = 3\alpha \left( 1-\alpha \right) ^{z} + \frac{6}{\beta }\, \frac{1-(1-\alpha )^z}{z} - \frac{6}{\beta } \,\frac{1-(1-\alpha )^{z+1}}{z+1} \end{aligned}$$
(12)

holds, whereby \(z=\frac{1}{\alpha } + \frac{2}{\beta } - 1\) (for details again see [29]).

Remark 7

The dependence measure \(\tau _1\) is nonmutual, i.e., we do not necessarily have \(\tau _1(A)=\tau _1(A^t)\), whereby \(A^t\) denotes the transpose of \(A\) (i.e., \(A^t(x,y)=A(y,x)\)). This reflects the fact that the dependence structure of random variables might be strongly asymmetric, see [29] for examples as well as [27] for a measure of mutual dependence.

Remark 8

Since most properties of \(D_1\) in dimension two also hold in the general \(m\)-dimensional setting, it might seem natural to simply consider \(\tau _1(A) := aD_1(A, \varPi )\) as a dependence measure on \( \fancyscript{C}^m\) (\(a\) being a normalizing constant). It is, however, straightforward to see that this yields no reasonable notion of dependence quantification insofar as we would also have \(\tau _1(A) > 0\) for copulas \(A\) describing independence of \(X\) and \(\mathbf {Y} = (Y_1,\ldots ,Y_{m-1})\). For a possible way to overcome this problem and to assign maximum dependence to copulas describing the situation in which each component of a portfolio \((Y_1,\ldots ,Y_{m-1})\) is a deterministic function of another asset \(X\), we refer to [11].

Remark 9

It is straightforward to verify that for samples \((X_1,Y_1),\ldots ,(X_n,Y_n)\) from \(A \in \fancyscript{C}\) the empirical copula \(\hat{E}_n\) (see [22, 28]) cannot converge to \(A\) w.r.t. \(D_1\) unless we have \(A \in \fancyscript{C}_d\). Using Bernstein or checkerboard aggregations (smoothing the empirical copula) might make it possible to construct \(D_1\)-consistent estimators of \(\tau _1(A)\). Convergence rates of these aggregations and other related questions are future work.

3.3 The IFS Construction of (Very) Singular Copulas

Using Iterated Function Systems, one can construct copulas exhibiting surprisingly irregular analytic behavior. The aim of this section is to sketch the construction and then state two main results. For general background on Iterated Function Systems with Probabilities (IFSP, for short), we refer to [16]. The IFSP construction of two-dimensional copulas with fractal support goes back to [12] (also see [2]), for the generalization to the multivariate setting we refer to [30].

Definition 10

([12]) An \(n\times m\)-matrix \(\tau =(t_{ij})_{i=1,\ldots , n, \,j=1,\ldots , m }\) is called a transformation matrix if it fulfills the following four conditions: (i) \(\max (n,m)\ge 2\), (ii) all entries are non-negative, (iii) \(\sum _{i,j} t_{ij}=1\), and (iv) no row or column has all entries \(0\). \(\mathfrak {T}\) will denote the family of all transformation matrices.

Given \(\tau \in \mathfrak {T}\) define the vectors \((a_j)_{j=0}^m, (b_i)_{i=0}^n\) of cumulative column and row sums by \(a_0=b_0=0\) and

$$\begin{aligned} a_j = \sum _{j_0\le j} \sum _{i=1}^n t_{ij_0} \quad j\in \{1,\ldots ,m\}, \qquad b_i = \sum _{i_0\le i} \sum _{j=1}^m t_{i_0j} \quad i\in \{1,\ldots ,n\}. \end{aligned}$$
(13)

Since \(\tau \) is a transformation matrix both \((a_j)_{j=0}^m\) and \((b_i)_{i=0}^n\) are strictly increasing and \(R_{ji}:=[a_{j-1},a_j]\times [b_{i-1}, b_i]\) is a compact rectangle with nonempty interior for all \(j\in \{1,\ldots ,m\}\) and \(i\in \{1,\ldots ,n\}\). Set \(\widetilde{I}:=\{(i,j): t_{ij}>0\}\) and consider the IFSP \(\{[0,1]^2,(f_{ji})_{(i,j) \in \widetilde{I}}, (t_{ij})_{(i,j) \in \widetilde{I}}\}\), whereby the affine contraction \(f_{ji}:[0,1]^2 \rightarrow R_{ji}\) is given by

$$\begin{aligned} f_{ji}(x,y)=\big (a_{j-1} + x (a_j - a_{j-1})\,,\, b_{i-1} + y (b_i - b_{i-1})\big ). \end{aligned}$$
(14)

\(Z_\tau ^\star \in \fancyscript{K}([0,1]^2)\) will denote the attractor of the IFSP (see [16]). The induced operator \(\fancyscript{V}_\tau \) on \(\fancyscript{P}([0,1]^2)\) is defined by

$$\begin{aligned} \fancyscript{V}_\tau (\mu ):= \sum _{j=1}^m \sum _{i=1}^n t_{ij}\,\mu ^{f_{ji}} = \sum _{(i,j) \in \widetilde{I}} t_{ij}\,\mu ^{f_{ji}}. \end{aligned}$$
(15)

It is straightforward to see that \(\fancyscript{V}_\tau \) maps \(\fancyscript{P}_\fancyscript{C}\) into itself, so we may view \(\fancyscript{V}_\tau \) also as an operator on \(\fancyscript{C}\). According to [12] there is exactly one copula \(A^\star _\tau \in \fancyscript{C}\), to which we will refer as the invariant copula, such that \(\fancyscript{V}_\tau (\mu _{A_\tau ^\star })=\mu _{A_\tau ^\star }\) holds. The IFSP construction also converges w.r.t. \(D_1\)—the following result holds:

Theorem 11

([29]) Let \(\tau \in \mathfrak {T}\) be a transformation matrix. Then \(\fancyscript{V}_\tau \) is a contraction on the metric space \((\fancyscript{C},D_1)\) and there exists a unique copula \(A_\tau ^\star \) such that \(\fancyscript{V}_\tau A_\tau ^\star = A_\tau ^\star \) and for every \(B \in \fancyscript{C}\) we have \(\lim _{n \rightarrow \infty } D_1(\fancyscript{V}_\tau ^nB,A_\tau ^\star )=0\).

Example 12

Figure 1 depicts the density of \(\fancyscript{V}^n_\tau (\varPi )\) for \(n \in \{1,2,3,5\}\), whereby \(\tau \) is given by

$$\begin{aligned} \tau = \left( \begin{array}{ccc} \frac{1}{6} &{} 0 &{} \frac{1}{6} \\ 0 &{} \frac{1}{3} &{} 0 \\ \frac{1}{6} &{} 0 &{}\frac{1}{6} \end{array} \right) . \end{aligned}$$

Fig. 1 Image plot of the density of \(\fancyscript{V}^n_\tau (\varPi )\) for \(n \in \{1,2,3,5\}\) and \(\tau \) according to Example 12
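For this particular \(\tau \) all row and column sums equal \(1/3\), so the rectangles \(R_{ji}\) form a uniform \(3\times 3\) grid and iterating \(\fancyscript{V}_\tau \) on \(\varPi \) amounts to taking Kronecker powers of the rescaled matrix \(9\tau \). The following sketch (our illustration, exploiting this special structure) computes the piecewise constant densities underlying Fig. 1.

```python
import numpy as np

# Sketch of the IFSP iteration for the matrix tau of Example 12.  All
# row and column sums of tau equal 1/3, so the cells R_ji form a
# uniform 3x3 grid and the density of V_tau^n(Pi) on the 3^n x 3^n grid
# is the n-fold Kronecker power of the one-step density 9*tau.

tau = np.array([[1/6, 0, 1/6],
                [0, 1/3, 0],
                [1/6, 0, 1/6]])

D = np.ones((1, 1))                 # density of Pi
for _ in range(4):                  # four applications of V_tau
    D = np.kron(D, 9 * tau)         # refine each cell by one IFSP step

# V_tau^n(Pi) is doubly stochastic: averaging the density over either
# coordinate must give the constant 1.
assert np.allclose(D.mean(axis=0), 1.0)
assert np.allclose(D.mean(axis=1), 1.0)
print(D.shape)                      # (81, 81)
```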

Moreover (again see [12]), the support \(Supp(\mu _{A_\tau ^\star })\) of \(\mu _{A_\tau ^\star }\) fulfills \(\lambda _2(Supp(\mu _{A_\tau ^\star }))=0\) if \(\tau \) contains at least one zero. Hence, in this case, \(\mu _{A_\tau ^\star }\) is singular w.r.t. the Lebesgue measure \(\lambda _2\); we write \(\mu _{A_\tau ^\star } \perp \lambda _2\). On the other hand, if \(\tau \) contains no zeros we may still have \(\mu _{A_\tau ^\star } \perp \lambda _2\), although in this case \(\mu _{A_\tau ^\star }\) has full support \([0,1]^2\). In fact, an even stronger and quite surprising singularity result holds. Letting \(\hat{\mathfrak {T}}\) denote the family of all transformation matrices \(\tau \) (i) containing no zeros, (ii) fulfilling that the row sum and the column sum through every entry \(t_{ij}\) are identical, and (iii) satisfying \(\mu _{A_\tau ^\star } \not = \lambda _2\), we have the following striking result:

Theorem 13

([33]) Suppose that \(\tau \in \hat{\mathfrak {T}}\). Then the corresponding invariant copula \(A_\tau ^\star \) is singular w.r.t. \(\lambda _2\) and has full support \([0,1]^2\). Moreover, for \(\lambda \)-almost every \(x \in [0,1]\) the conditional distribution function \(y\mapsto F_x^{A_\tau ^\star }(y)=K_{A_\tau ^\star }(x,[0,y])\) is continuous, strictly increasing and has derivative zero \(\lambda \)-almost everywhere.

3.4 The Star Product of Copulas

Given \(A, B \in \fancyscript{C}\) the star product \(A*B \in \fancyscript{C}\) is defined by (see [3, 23] )

$$\begin{aligned} (A * B)(x,y):=\int \limits _{[0,1]} A_{,2}(x,t) B_{,1}(t,y) d\lambda (t) \end{aligned}$$
(16)

and fulfills \(T_{A*B}=\varPhi (A * B) = \varPhi (A) \circ \varPhi (B)=T_A \circ T_B\), so the mapping \(\varPhi \) in equation (4) actually is an isomorphism. A copula \(A \in \fancyscript{C}\) is called idempotent if \(A*A=A\) holds; the family of all idempotent copulas will be denoted by \(\fancyscript{C}^{ip}\). For a complete characterization of idempotent copulas we refer to [4] (also see [26]). The star product can easily be translated to the Markov kernel setting—the following result holds:

Lemma 14

([30]) Suppose that \(A\), \(B \in \fancyscript{C}\) and let \(K_A,K_B\) denote Markov kernels of \(A\) and \(B\). Then the Markov kernel \(K_A\circ K_B\), defined by

$$\begin{aligned} (K_A\circ K_B)(x,F):=\int \limits _{[0,1]} K_B(y,F) K_A(x,dy), \end{aligned}$$
(17)

is a Markov kernel of \(A*B\). Furthermore \(\fancyscript{C}^{ip}\) is closed in \((\fancyscript{C},D_1)\).

Remark 15

Let \(A \in \fancyscript{C}\) be arbitrary. If \((X_n)_{n \in \mathbb {N}}\) is a stationary Markov process on \([0,1]\) with (stationary) transition probability \(K_A(\cdot ,\cdot )\) and \(X_1 \sim \fancyscript{U}(0,1)\) then \((X_n,X_{n+1}) \sim A\) for every \(n \in \mathbb {N}\) and Lemma 14 implies that \((X_1,X_{n+1}) \sim A * A * \cdots * A=:A^{*n}\), i.e., the \(n\)-step transition probability of the process is given by the Markov kernel of \(A^{*n}\).
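The construction in Remark 15 can be simulated (our sketch; the FGM copula and all parameters are illustrative choices): starting from \(X_1 \sim \fancyscript{U}(0,1)\) we draw \(X_{n+1}\) from \(K_{G_\theta }(X_n,\cdot )\) by inverting the conditional distribution function \(y \mapsto K(x,[0,y]) = y + \theta y(1-y)(1-2x)\), which is quadratic in \(y\).

```python
import numpy as np

# Simulation sketch for Remark 15 (illustrative choices ours): a
# stationary Markov chain driven by the kernel of the FGM copula
# G_theta.  X_{n+1} is drawn by inverting the conditional cdf
#   F_x(y) = K(x, [0,y]) = y + theta*y*(1-y)*(1-2x),
# a quadratic in y, so each pair (X_n, X_{n+1}) ~ G_theta.

rng = np.random.default_rng(0)
theta, N = 0.9, 200_000

def next_state(x, u):
    a = theta * (1 - 2 * x)
    if abs(a) < 1e-12:
        return u
    # root of a*y^2 - (1+a)*y + u = 0 lying in [0, 1]
    return ((1 + a) - np.sqrt((1 + a) ** 2 - 4 * a * u)) / (2 * a)

xs = np.empty(N)
xs[0] = rng.uniform()
for n in range(1, N):
    xs[n] = next_state(xs[n - 1], rng.uniform())

# U(0,1) is stationary; the Pearson correlation of G_theta is theta/3.
assert abs(xs.mean() - 0.5) < 0.01
assert abs(np.corrcoef(xs[:-1], xs[1:])[0, 1] - theta / 3) < 0.02
print("stationary FGM chain simulated")
```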

Remark 16

In case the copulas \(A,B\) are absolutely continuous with densities \(k_A\) and \(k_B\) it is straightforward to verify that \(A*B\) is absolutely continuous with density \(k_{A*B}\) given by

$$\begin{aligned} k_{A*B}(x,y)=\int \limits _{[0,1]} k_A(x,z) k_B(z,y) d\lambda (z). \end{aligned}$$
(18)
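As a side calculation of our own based directly on (18): the FGM densities \(k_{\theta }(x,y)=1+\theta (1-2x)(1-2y)\) satisfy \(\int _{[0,1]} k_{\theta _1}(x,z)k_{\theta _2}(z,y)\,d\lambda (z)=1+\frac{\theta _1\theta _2}{3}(1-2x)(1-2y)\), so the FGM family is closed under the star product with \(G_{\theta _1}*G_{\theta _2}=G_{\theta _1\theta _2/3}\). A discretized check:

```python
import numpy as np

# Discretized check of formula (18) for the FGM family (our side
# calculation): the densities k_theta(x,y) = 1 + theta*(1-2x)*(1-2y)
# give k_{A*B} = k_{theta1*theta2/3}, i.e., the FGM family is closed
# under the star product.

n = 1000
x = (np.arange(n) + 0.5) / n
u = 1 - 2 * x                        # (1-2x) on the midpoint grid

def k_fgm(theta):
    return 1 + theta * np.outer(u, u)

t1, t2 = 0.8, -0.6
k_star = k_fgm(t1) @ k_fgm(t2) / n   # Riemann sum of int k1(x,z)k2(z,y)dz
assert np.allclose(k_star, k_fgm(t1 * t2 / 3), atol=1e-6)
print("G_t1 * G_t2 = G_{t1*t2/3}")
```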

Since the star product of copulas is a natural generalization of the multiplication of doubly stochastic matrices and doubly stochastic idempotent matrices are fully characterizable (see [10, 25]) the following result underlines how much more complex the family of idempotent copulas is (also see [12] for the original result without idempotence).

Theorem 17

([30]) For every \(s \in (1,2)\) there exists a transformation matrix \(\tau _s \in \mathfrak {T}\) such that:

  1. 1.

    The invariant copula \(A_{\tau _s}^\star \) is idempotent.

  2. 2.

    The Hausdorff dimension of the support of \(A_{\tau _s}^\star \) is \(s\).

Example 18

For the transformation matrix \(\tau \) from Example 12 the invariant copula \(A^\star _\tau \) is idempotent and its support has Hausdorff dimension \(\ln {5}/\ln {3}\). Hence, setting \(A:=A^\star _\tau \) and considering the Markov process outlined in Remark 15 we have \((X_i,X_{i+n}) \sim A\) for all \(i,n \in \mathbb {N}\). The same holds if we take \(A:=\fancyscript{V}^j_\tau (\varPi )\) for arbitrary \(j \in \mathbb {N}\) since this \(A\) is idempotent too.

We conclude this section with a general result that will be used later on and which, essentially, follows from von Neumann's mean ergodic theorem for Hilbert spaces (see [24]) since Markov operators have operator norm one. As in the Introduction, for every copula \(A \in \fancyscript{C}\) and every \(n \in \mathbb {N}\) we set

$$\begin{aligned} s_{*n}(A)=\frac{1}{n} \sum _{i=1}^n A^{*i}. \end{aligned}$$
(19)

Theorem 19

([32]) For every copula \(A\) there exists a copula \(\hat{A}\) such that

$$\begin{aligned} \lim _{n \rightarrow \infty } D_1 \big (s_{*n}(A),\hat{A} \big )=0. \end{aligned}$$
(20)

This copula \(\hat{A}\) is idempotent, symmetric, and fulfills \(\hat{A} *A=A*\hat{A}=\hat{A}\).
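Theorem 19 can be illustrated in the finite-dimensional setting mentioned above, where the star product reduces to multiplication of doubly stochastic matrices (this matrix example is our own): the Cesàro averages of the powers of a doubly stochastic \(P\) converge to an idempotent, symmetric limit \(\hat{A}\) with \(\hat{A}P=P\hat{A}=\hat{A}\).

```python
import numpy as np

# Matrix analogue of Theorem 19 (our illustration): Cesaro averages of
# the powers of a doubly stochastic matrix P converge to an idempotent,
# symmetric limit A_hat with A_hat P = P A_hat = A_hat.

P = np.array([[0, 1, 0],             # cyclic permutation matrix
              [0, 0, 1],
              [1, 0, 0]], dtype=float)

n = 3000
s = np.zeros_like(P)
Pk = np.eye(3)
for _ in range(n):
    Pk = Pk @ P
    s += Pk
s /= n                               # s_n = (1/n) * sum_{i=1}^n P^i

A_hat = np.full((3, 3), 1/3)         # the expected limit
assert np.allclose(s, A_hat, atol=1e-3)
assert np.allclose(A_hat @ A_hat, A_hat)        # idempotent
assert np.allclose(A_hat @ P, A_hat)            # A_hat * A = A_hat
print("Cesaro limit is the idempotent uniform matrix")
```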

As a nice by-product, Theorem 19 also offers a very simple proof of the fact that idempotent copulas are necessarily symmetric (originally proved in [4]).

4 Copulas Whose Corresponding Markov Operator Is Quasi-constrictive

Quasi-constrictiveness is a very important concept in the study of asymptotic properties of Markov operators. To the best of the authors’ knowledge, there is no natural/simple characterization of copulas whose Markov operator is quasi-constrictive. The objective of this section, however, is to show that the \(D_1\)-limit \(\hat{A}\) of \(s_{*n}(A)\) has a very simple form if \(T_A\) is quasi-constrictive. We start with a definition of quasi-constrictiveness in the general setting: \(T\) is a Markov operator on \(L^1(\varOmega ,\fancyscript{A},\mu )\) if the conditions (M1)-(M3) from Sect. 2 hold with \([0,1]\) replaced by \(\varOmega \), \(\fancyscript{B}([0,1])\) replaced by \(\fancyscript{A}\), and \(\lambda \) replaced by \(\mu \).

Definition 20

([1, 15, 18]) Suppose that \((\varOmega ,\fancyscript{A},\mu )\) is a finite measure space and let \(\fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) denote the family of all probability densities w.r.t. \(\mu \). Then a Markov operator \(T:L^1(\varOmega ,\fancyscript{A},\mu ) \rightarrow L^1(\varOmega ,\fancyscript{A},\mu )\) is called quasi-constrictive if there exist constants \(\delta >0\) and \(\kappa <1\) such that for every probability density \(f \in \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) the following inequality is fulfilled:

$$\begin{aligned} \limsup _{n \rightarrow \infty } \int \limits _E T^nf(x) d\mu (x) \le \kappa \quad \text {for every}\; E\in \fancyscript{A} \,\text {with}\; \mu (E) \le \delta \end{aligned}$$
(21)

Komornik and Lasota (see [15]) have shown in 1987 that quasi-constrictivity is equivalent to asymptotic periodicity—in particular they proved the following spectral decomposition theorem: For every quasi-constrictive Markov operator \(T\) there exist an integer \(r\ge 1\), densities \(g_1,\ldots ,g_r \in \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) with pairwise disjoint support, essentially bounded non-negative functions \(h_1,\ldots ,h_r \in L^\infty (\varOmega ,\fancyscript{A},\mu )\) and a permutation \(\sigma \) of \(\{1,\ldots ,r\}\) such that for every \(f \in L^1(\varOmega ,\fancyscript{A},\mu )\)

$$\begin{aligned} T^nf(x)=\sum _{i=1}^r \Big (\int \limits _\varOmega fh_id\mu \Big ) g_{\sigma ^n(i)}(x) \,+\, R_nf(x) \quad \text {with} \,\, \lim _{n \rightarrow \infty } \Vert R_nf \Vert _1=0 \end{aligned}$$
(22)

holds. Furthermore (see again [1, 15, 18]), in case of \(\mu (\varOmega )=1\) there exists a measurable partition \((E_i)_{i=1}^r\) of \(\varOmega \) in sets with positive measure such that \(g_j\) and \(\sigma \) in (22) fulfill

$$\begin{aligned} g_j=\frac{1}{\mu (E_j)} \mathbf {1}_{E_j} \quad \text {and} \quad \mu (E_j)=\mu (E_{\sigma ^n(j)}) \end{aligned}$$
(23)

for every \(j\in \{1,\ldots ,r\}\) and every \(n \in \mathbb {N}\).

Example 21

For every absolutely continuous copula \(A\) whose density \(k_A\) fulfills \(k_A \le M\) for some constant \(M<\infty \), the corresponding Markov operator is quasi-constrictive. This directly follows from the fact that

$$ T_Af(x) = \int \limits _{[0,1]} f(y)K_A(x,dy)= \int \limits _{[0,1]} f(y)k_A(x,y)dy \le M $$

holds for every \(f \in \fancyscript{D}([0,1]):=\fancyscript{D}([0,1],\fancyscript{B}([0,1]),\lambda )\).

Example 22

There are absolutely continuous copulas \(A\) whose corresponding Markov operator is not quasi-constrictive—one example is the idempotent ordinal-sum-like copula \(O\) with unbounded density \(k_O\) defined by

$$ k_O(x,y):=\sum _{n=1}^\infty 2^n \mathbf {1}_{[1-2^{1-n},1-2^{-n} )^2}(x,y) $$

for all \(x,y \in [0,1]\) (straightforward to verify).
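A quick check of our own that \(k_O\) indeed has uniform marginals despite being unbounded: each \(x<1\) lies in exactly one dyadic block \([1-2^{1-n},1-2^{-n})\), on whose square \(k_O\) equals \(2^n\), so the marginal integral is \(2^n \cdot 2^{-n}=1\).

```python
import numpy as np

# Check (ours) that the ordinal-sum-like density k_O of Example 22 has
# uniform marginals even though it is unbounded: x lies in the n-th
# dyadic block [1-2^{1-n}, 1-2^{-n}), where k_O = 2^n on a square of
# side 2^{-n}, so the marginal integral equals 2^n * 2^{-n} = 1.

def block(x):
    # index n of the dyadic block containing x in [0, 1)
    return int(np.floor(-np.log2(1 - x))) + 1

for x in (0.1, 0.3, 0.6, 0.9, 0.99):
    n = block(x)
    lo, hi = 1 - 2 ** (1 - n), 1 - 2 ** (-n)
    assert lo <= x < hi
    marginal = 2 ** n * (hi - lo)    # k_O times the block's side length
    assert marginal == 1.0
print("k_O has uniform marginals")
```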

Before returning to the copula setting, we prove a first supplement to the spectral decomposition theorem that holds for general Markov operators on \(L^1(\varOmega ,\fancyscript{A},\mu )\), with \((\varOmega ,\fancyscript{A},\mu )\) being a probability space.

Lemma 23

Suppose that \((\varOmega ,\fancyscript{A},\mu )\) is a probability space and that \(T:L^1(\varOmega ,\fancyscript{A},\mu ) \rightarrow L^1(\varOmega ,\fancyscript{A},\mu )\) is a quasi-constrictive Markov operator. Then there exist \(r\ge 1\), a measurable partition \((E_i)_{i=1}^r\) of \(\varOmega \) into sets of positive measure, densities \(h'_1,\ldots ,h'_r \in L^\infty (\varOmega ,\fancyscript{A},\mu ) \cap \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) and a permutation \(\sigma \) of \(\{1,\ldots ,r\}\) such that we have \(\sum _{i=1}^r \mu (E_i) h'_i = 1\) as well as

$$\begin{aligned} T^nf(x)=\sum _{i=1}^r \Big (\int \limits _\varOmega fh'_i d\mu \Big ) \mathbf {1}_{E_{\sigma ^n(i)}}(x) \,+\, R_nf(x) \quad \mathrm{with } \,\, \lim _{n \rightarrow \infty } \Vert R_nf \Vert _1=0 \end{aligned}$$
(24)

for every \(f \in L^1(\varOmega ,\fancyscript{A},\mu )\) and every \(n \in \mathbb {N}\).

Proof

Using (22) and (23) it follows that

$$\begin{aligned} R_n \mathbf {1}_\varOmega (x)&= T^n \mathbf {1}_\varOmega (x) - \sum _{i=1}^r \Big (\int \limits _\varOmega h_i\, d\mu \Big ) g_{\sigma ^n(i)}(x) = \sum _{i=1}^r \mathbf {1}_{E_{\sigma ^n(i)}}(x) - \sum _{i=1}^r \frac{\Vert h_i \Vert _{1}}{\mu (E_i)}\mathbf {1}_{E_{\sigma ^n(i)}}(x) \\&= \sum _{i=1}^r \Big (1-\frac{\Vert h_i\Vert _{1}}{\mu (E_i)} \Big )\mathbf {1}_{E_{\sigma ^n(i)}}(x) \end{aligned}$$

for every \(x \in \varOmega \), which implies

$$ 0 = \lim _{n \rightarrow \infty } \Vert R_n \mathbf {1}_\varOmega \Vert _1 = \lim _{n \rightarrow \infty } \sum _{i=1}^r \bigg \vert 1-\frac{\Vert h_i\Vert _{1}}{\mu (E_i)} \bigg \vert \mu (E_{\sigma ^n(i)}) = \sum _{i=1}^r \bigg \vert 1-\frac{\Vert h_i\Vert _{1}}{\mu (E_i)} \bigg \vert \mu (E_i). $$

Since \(\mu (E_i)>0\) for every \(i \in \{1,\ldots ,r\}\) this shows that \(h'_i:=\frac{h_i}{\mu (E_i)} \in L^\infty (\varOmega ,\fancyscript{A},\mu ) \cap \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) for every \(i \in \{1,\ldots ,r\}\). Furthermore we have \(\lim _{n \rightarrow \infty } \Vert R_n h'_i \Vert _1=0\) for every fixed \(i\), from which

$$\begin{aligned} 1&= \lim _{n \rightarrow \infty } \int \limits _{\varOmega } T^n h'_i(x)\, d\mu (x) = \lim _{n \rightarrow \infty } \int \limits _{\varOmega } \sum _{j=1}^r \Big (\int \limits _{\varOmega } h'_i(z)h'_j(z)\, d\mu (z)\Big ) \mathbf {1}_{E_{\sigma ^n(j)}}(x)\, d\mu (x) \\&= \sum _{j=1}^r \Big (\int \limits _{\varOmega } h'_i(z)h'_j(z)\, d\mu (z)\Big ) \mu (E_j) \end{aligned}$$

follows. Multiplying both sides by \(\mu (E_i)\) and summing over \(i \in \{1,\ldots ,r\}\) yields

$$ 1= \int \limits _{\varOmega } \sum _{i=1}^r h'_i(z) \mu (E_i) \underbrace{\sum _{j=1}^r h'_j(z) \mu (E_j)}_{:=g(z)} d\mu (z) $$

so \(g\in \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\) and at the same time \(g^2\in \fancyscript{D}(\varOmega ,\fancyscript{A},\mu )\). Using the Cauchy-Schwarz inequality, it follows that \(g(x)=1\) for \(\mu \)-almost every \(x \in \varOmega \). \(\square \)

Lemma 24

Suppose that \(A\) is a copula whose corresponding Markov operator \(T_A\) is quasi-constrictive. Then there exists \(r\ge 1\), a measurable partition \((E_i)_{i=1}^r\) of \([0,1]\) in sets with positive measure, and pairwise different densities \(h_1,\ldots ,h_r \in L^\infty ([0,1])\,\cap \,\fancyscript{D}([0,1])\) such that the limit copula \(\hat{A}\) of \(s_{*n}(A)\) is absolutely continuous with density \(k_{\hat{A}}\), defined by

$$\begin{aligned} k_{\hat{A}}(x,y)=\sum _{i=1}^r h_i(y) \mathbf {1}_{E_i}(x) \end{aligned}$$
(25)

for all \(x,y \in [0,1]\).

Proof

Fix an arbitrary \(f \in L^1([0,1])\). Then, using Lemma 23, we have

$$\begin{aligned} \frac{1}{n} \sum _{j=1}^n T^j_A f(x)&= \frac{1}{n} \sum _{j=1}^n \sum _{i=1}^r \int \limits _{[0,1]} f h'_{\sigma ^{-j}(i)}d\lambda \mathbf {1}_{E_i}(x) \,+\, \frac{1}{n} \sum _{j=1}^n R_jf(x) \\&= \sum _{i=1}^r \mathbf {1}_{E_i}(x) \int \limits _{[0,1]} f(z) \underbrace{\frac{1}{n} \sum _{j=1}^n h'_{\sigma ^{-j}(i)}(z)}_{:=g^i_n(z)}d\lambda (z) \,+\, \frac{1}{n} \sum _{j=1}^n R_jf(x) \end{aligned}$$

for every \(x \in [0,1]\) and every \(n \in \mathbb {N}\). Since \(\sigma \) is a permutation, the map \(j \mapsto h'_{\sigma ^{-j}(i)}(z)\) is periodic for every \(z\) and every \(i\), so, for every \(i \in \{1,\ldots ,r\}\), there exists a function \(h_i\) such that

$$ \lim _{n \rightarrow \infty } \frac{1}{n} \sum _{j=1}^n h'_{\sigma ^{-j}(i)}(z)=h_i(z) $$

for every \(z \in [0,1]\) and every \(i \in \{1,\ldots ,r\}\). Obviously \(h_i \in L^\infty ([0,1])\) and, using Lebesgue’s theorem on dominated convergence, \(h_i\) is also a density, so we have \(h_1,\ldots ,h_r \in L^\infty ([0,1]) \cap \fancyscript{D}([0,1])\). Finally, using Theorem 19 and the fact that \(\lim _{n \rightarrow \infty }\Vert R_n f\Vert _1=0\) for every \(f \in L^1([0,1])\), it follows immediately that

$$ T_{\hat{A}} f(x)= \int \limits _{[0,1]} f(y) \sum _{i=1}^r h_i(y) \mathbf {1}_{E_i}(x) d\lambda (y). $$

This completes the proof since mutually different densities can easily be achieved by building unions from elements in the partition \((E_i)_{i=1}^r\) if necessary. \(\square \)

Using the fact that \(\hat{A}\) is idempotent we get the following stronger result:

Lemma 25

The density \(k_{\hat{A}}\) of \(\hat{A}\) in Lemma 24 has the form

$$ k_{\hat{A}}(x,y) =\sum _{i,j=1}^r m_{i,j} \mathbf {1}_{E_i\times E_j}(x,y), $$

i.e., it is constant on all rectangles \(E_i\times E_j\).

Proof

According to Theorem 19 the copula \(\hat{A}\) is idempotent so \(\hat{A}\) is symmetric. Consequently the set

$$ \varDelta :=\{(x,y)\in [0,1]^2: k_{\hat{A}}(x,y)= k_{\hat{A}}(y,x)\} \in \fancyscript{B}([0,1]^2) $$

has full measure \(\lambda _2(\varDelta )=1\). Using Lemma 24 we have

$$ \sum _{i=1}^r h_i(y) \mathbf {1}_{E_i}(x) = \sum _{i=1}^r h_i(x) \mathbf {1}_{E_i}(y) $$

for every \((x,y) \in \varDelta \). Fix arbitrary \(i,j \in \{1,\ldots ,r\}\). Then we can find \(x \in E_i\) such that \(\lambda (\varDelta _x)=1\) holds, whereby \(\varDelta _x=\{y \in [0,1]: (x,y) \in \varDelta \}\). For such \(x\) we have \(h_i(y)=h_j(x)\) for \(\lambda \)-almost every \(y \in E_j\), which firstly implies that \(h_j\) is, up to a set of measure zero, constant on \(E_j\) and, secondly, that \(k_{\hat{A}}\) is constant on \(E_i\times E_j\) outside a set of \(\lambda _2\)-measure zero. Since we may modify the density on a set of \(\lambda _2\)-measure zero we can assume that \(k_{\hat{A}}\) is of the desired form

$$ k_{\hat{A}}(x,y) =\sum _{i,j=1}^r m_{i,j} \mathbf {1}_{E_i\times E_j}(x,y), $$

with \(M=(m_{i,j})_{i,j=1}^r\) being a non-negative, symmetric matrix fulfilling

  1. (a)

    \(\sum _{i,j=1}^r m_{i,j} \lambda (E_i)\lambda (E_j)=1\)

  2. (b)

    \(\sum _{j=1}^r m_{i,j} \lambda (E_j)=1\) for every \(i \in \{1,\ldots ,r\}\)

  3. (c)

    \(\sum _{i=1}^r m_{i,j} \lambda (E_i)=1\) for every \(j \in \{1,\ldots ,r\}\)

  4. (d)

    \(\sum _{i=1}^r \vert m_{i,j}-m_{i,l}\vert >0\) whenever \(j \not = l\). \(\square \)

Before proceeding with the final result it is convenient to take a look at the matrix \(H=(H_{i,j})_{i,j=1}^r\) defined by

$$\begin{aligned} H_{i,j}:=m_{i,j} \lambda (E_j)=\int \limits _{E_j} h_i(z) d\lambda (z) \end{aligned}$$
(26)

for all \(i,j \in \{1,\ldots ,r\}\). According to (a) in the proof of Lemma 25 \(H\) is stochastic. Furthermore, idempotence of \(\hat{A}\) and Remark 16 imply \(k_{\hat{A}}*k_{\hat{A}}=k_{\hat{A}}\), hence

$$\begin{aligned} \sum _{i=1}^r h_i(y) \mathbf {1}_{E_i}(x)&= k_{\hat{A}}(x,y)=k_{\hat{A}}*k_{\hat{A}}(x,y) \\&= \int \limits _{[0,1]} \sum _{i=1}^r h_i(z) \mathbf {1}_{E_i}(x) \sum _{j=1}^r h_j(y) \mathbf {1}_{E_j}(z) \,d\lambda (z)\\&= \sum _{i,j=1}^r \mathbf {1}_{E_i}(x)h_j(y) \int \limits _{E_j} h_i(z) d\lambda (z)= \sum _{i,j=1}^r \mathbf {1}_{E_i}(x)h_j(y) H_{i,j}. \end{aligned}$$

From this is follows immediately that \( h_i(y)=\sum _{j=1}^r H_{i,j} h_j(y) \) is fulfilled for every \(y \in [0,1]\) and \(i \in \{1,\ldots ,r\}\), so, integrating both sides over \(E_l\), we have \( H_{i,l}=\sum _{j=1}^r H_{i,j} H_{j,l}, \) which shows that \(H\) is idempotent. Having this, the proof of the following main result of this section will be straightforward.

Theorem 26

Suppose that \(A\) is a copula whose corresponding Markov operator \(T_A\) is quasi-constrictive. Then there exist \(r\ge 1\) and a measurable partition \((E_i)_{i=1}^r\) of \([0,1]\) in sets with positive measure, such that the limit copula \(\hat{A}\) of \(s_{*n}(A)\) is absolutely continuous with density \(k_{\hat{A}}\) given by

$$\begin{aligned} k_{\hat{A}}(x,y)=\sum _{i=1}^r \frac{1}{\lambda (E_i)} \mathbf {1}_{E_i \times E_i}(x,y) \end{aligned}$$
(27)

for all \(x,y \in [0,1]\). In other words, the limit copula \(\hat{A}\) has an ordinal-sum-of-\(\varPi \)-like structure.

Proof

Since \(H\) is an idempotent stochastic matrix and since \(H\) can not have any column consisting purely of zeros, up to a permutation, \(H\) must have the form (see [5, 21]).

$$\begin{aligned} \left( \begin{array}{cccc} Q_1 &{} 0 &{} \ldots &{}0 \\ 0 &{} Q_2 &{} \ldots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} Q_s \end{array} \right) , \end{aligned}$$
(28)

whereby each \(Q_i\) is a strictly positive \(r_i \times r_i\)-matrix with identical rows and \(s\) is the range of \(H\). We will show that \(r_i=1\) for every \(i \in \{1,\ldots ,s\}\). Suppose, on the contrary, that \(r_l \ge 2\) for some \(l\). Then there would be indices \(I_l:=\{i_1,\ldots ,i_{r_l}\} \subseteq \{1,\ldots ,r\}\) and \(a_1,\ldots ,a_{r_l} \in (0,1)^{r_l}\) with \(\sum _{i=1}^{r_l} a_i =1\) such that \(Q_l\) would have the form

$$\begin{aligned} Q_l= \left( \begin{array}{cccc} a_1 &{} a_2 &{} \ldots &{} a_{r_l} \\ a_1 &{} a_2 &{} \ldots &{} a_{r_l}\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ a_1 &{} a_2 &{} \ldots &{} a_{r_l} \end{array} \right) = \left( \begin{array}{cccc} H_{i_1,i_1} &{} H_{i_1,i_2} &{} \ldots &{} H_{i_1,i_{r_l}} \\ H_{i_2,i_1} &{} H_{i_2,i_2} &{} \ldots &{} H_{i_2,i_{r_l}}\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ H_{i_{r_l},i_1} &{} H_{i_{r_l},i_2} &{} \ldots &{} H_{i_{r_l},i_{r_l}} \end{array} \right) . \end{aligned}$$
(29)

It follows immediately that

$$ H_{i_1,i_v} =m_{i_1,i_1} \lambda (E_{i_v})= H_{i_2,i_v}=m_{i_2,i_v} \lambda (E_{i_v}) = \cdots = H_{i_{r_l},i_v}= m_{i_{r_l},i_v} \lambda (E_{i_v}), $$

so \(m_{i_j,i_v}=m_{i_1,i_v}\) for every \(j \in \{1,\ldots ,r_l\}\) and arbitrary \(v \in \{1,\ldots ,r_l\}\). Having this symmetry of \(M\) implies that all entries of \(Q_l\) are identical, which contradicts the fact that the conditional densities are not identical, i.e., the fact that

$$ \sum _{j \in I_l} \vert m_{j,i_1}-m_{j,i_2} \vert =\sum _{j=1}^r \vert m_{j,i_1}-m_{j,i_2}\vert >0 $$

whenever \(i_1 \not = i_2\). Consequently \(r_i =1 \) for every \(i \in \{1,\ldots ,s\}\) and \(k_{\hat{A}}\) has the desired form. \(\square \)

Remark 27

Consider again the transformation matrix \(\tau \) from Example 12. Then \(\fancyscript{V}^1_\tau (\varPi ), \fancyscript{V}^2_\tau (\varPi ),\ldots \) are examples of the ordinal-sum-of-\(\varPi \)-like copulas mentioned in the last theorem.