Abstract
Weyl’s discrepancy measure induces a norm on ℝ^{n} that exhibits a monotonicity and a Lipschitz property when applied to differences of index-shifted sequences. It turns out that its n-dimensional unit ball is a zonotope that results from a multiple sheared projection of the (n+1)-dimensional hypercube, which can be interpreted as a discrete differentiation. This characterization reveals that this norm is the canonical metric between sequences of differences of values from the unit interval in the sense that the n-dimensional unit ball of the discrepancy norm equals the space of such sequences.
Motivation
In the mathematical literature, discrepancy theory is devoted to problems related to irregularities of distributions. In this context the term discrepancy refers to a measure that quantifies the extent to which a given distribution deviates from total uniformity in measure-theoretic, combinatorial, and geometric settings. This theory goes back to Weyl [39] and is still an active field of research, see, e.g., [3, 12, 19]. Applications can be found in the field of numerical integration, especially for Monte Carlo methods in high dimensions, see, e.g., [28, 36, 40], and in computational geometry, see, e.g., [1, 9, 21]. For applications to data storage problems on parallel disks, see [10, 13]; for halftoning of images, see [31].
This paper is motivated by [24], which applies Weyl’s discrepancy concept to derive an ordering-dependent norm for measuring the (dis)similarity between patterns. In this context the focus lies on evaluating the auto-misalignment, which measures the deviation between a function f(⋅) and its translated version f(⋅−T) as a function of the lag T. The function f represents a signal, the intensity profile of a line of an image, an image, or volumetric data. The interesting point is that, based on Weyl’s discrepancy concept, distance measures can be constructed that guarantee the desirable registration properties: (R1) the measure vanishes if and only if the lag vanishes, (R2) the measure increases monotonically with increasing lag, and (R3) the measure obeys a Lipschitz condition that guarantees smooth changes even for patterns with high frequencies. As proven in [26], properties (R1)–(R3) are not simultaneously satisfied by measures commonly used in this context such as mutual information, the Kullback–Leibler distance, or the Jensen–Rényi divergence measure, which are special variants of f-divergence and f-information measures, see, e.g., [4, 20, 29, 30, 37]; nor are they satisfied by the standard measures based on p-norms or by the widely used correlation measures due to Pearson and Spearman, see [8, 17, 34].
From the point of view of applications, properties (R1)–(R3) are relevant for a variety of problems in which the misalignment of a pattern with a shifted version of itself has to be evaluated. Such problems arise as autocorrelation in signal processing, see [5]. In computer vision they are encountered in particular in stereo matching as the point-correspondence problem, see, e.g., [32] and [26], in template matching, e.g., for the purpose of print inspection, see, e.g., [7] and [23], in superpixel matching [16], and in defect detection in textured images, see [6, 25, 35]. In these cases, for high-frequency patterns, the discrepancy norm leads to cost functions with fewer local extrema and a more distinctive region of convergence in the neighborhood of the global minimum compared to commonly used (dis)similarity measures.
A further promising field of future application is measuring the similarity between event-based signals as encountered in neuroscience, due to the all-or-none characteristics of neural signals, see, e.g., [38], and, closely related, in event-based imaging, see, e.g., [18] and [22]. In this context it is interesting to point out that the asynchronicity of neighboring sensor elements can lead to misaligned response sequences of events in time. Figure 1 illustrates a sequence of all-or-none events and its auto-misalignment functions induced by the normalized cross-correlation on the one hand and by the discrepancy norm on the other. Due to properties (R1)–(R3), the discrepancy norm induces a topology on the space of such signals that is compatible with this asynchronicity effect: slightly shifted versions of a sequence of events are still recognized as similar.
The question addressed in this paper therefore is what makes the discrepancy norm so special when applied to differences of index-shifted sequences. This paper provides a geometric analysis that makes clear that the discrepancy norm is inherently related to measuring the distance between index-shifted sequences.
The paper first recalls Weyl’s definition [39] in Sect. 2, formulates it as a norm on ℝ^{n}, and recalls some of its properties from [23]. As the main result of this paper, in Sect. 3 its unit ball is revealed to be a special zonotope. Section 4 focuses on geometric properties of this zonotope, namely the number of k-dimensional faces in Sect. 4.1 and its volume in Sect. 4.2.
The Discrepancy Norm
In [39] Weyl introduces a concept of discrepancy in the context of pseudo-randomness of sequences of numbers from the unit interval. He proposes the formula

$$D_{N} = \sup_{0 < a < b < 1} \biggl| \frac{|N(a,b)|}{N} - (b-a) \biggr| \qquad(1)$$

to measure the deviation of a sequence (y _{ k })_{ k∈{1,…,N}}⊂(0,1) from a uniformly distributed sequence, where N(a,b)={k∈{1,…,N} : y _{ k }∈(a,b)}, a,b∈(0,1), b>a, and |⋅| denotes cardinality. As a generalization, the discrepancy of measures μ and ν is defined as

$$D(\mu,\nu) = \sup_{A \in \mathcal{A}} \bigl|\mu(A) - \nu(A)\bigr|, \qquad(2)$$

where \(\mathcal{A}\) is a σ-algebra of measurable sets over the domain \(\mathcal{U}\), and μ, ν are signed measures defined on the measurable space \((\mathcal{U}, \mathcal{A})\).
For linear combinations of Dirac measures δ _{{i}} on ℤ given by \(\mu = \sum_{i=1}^{n} x_{i} \delta_{\{i\}}\) and \(\nu = \sum_{i=1}^{n} y_{i} \delta_{\{i\}}\), x _{ i },y _{ i }∈ℝ, and \(\tilde{\mathcal{A}}\) the set of index intervals, Definition (2) yields

$$D(\mu,\nu) = \max_{n_{1} \leq n_{2}} \Biggl| \sum_{i=n_{1}}^{n_{2}} (x_{i} - y_{i}) \Biggr|.$$
Therefore, for an absolutely summable sequence of real values x=(x _{ i })_{ i∈ℤ}, ∑_{ i∈ℤ}|x _{ i }|<∞, Weyl’s discrepancy concept leads to the definition

$$\|\mathbf{x}\|_{D} := \sup_{n_{1} \leq n_{2},\, n_{1},n_{2}\in\mathbb{Z}} \Biggl| \sum_{i=n_{1}}^{n_{2}} x_{i} \Biggr|, \qquad(3)$$

which induces a norm, see the Appendix. Applications of the norm (3) can be found in pattern recognition [27] and in print inspection in the context of pixel classification [2]. In contrast to the p-norms ∥⋅∥_{ p }, ∥x∥_{ p }=(∑_{ i }|x _{ i }|^{p})^{1/p}, the norm ∥⋅∥_{ D } depends strongly on the sign and also on the ordering of the entries, as illustrated by the examples ∥(−1,1,−1,1)∥_{ D }=1 and ∥(−1,−1,1,1)∥_{ D }=2.
In general, x=(x _{ i })_{ i } with x _{ i }≥0 entails ∥x∥_{ D }=∥x∥_{1}, and x=((−1)^{i})_{ i } entails ∥x∥_{ D }=∥x∥_{∞}, indicating that the more often consecutive entries alternate in sign, the smaller the discrepancy norm. Observe that ∥x∥_{∞}≤∥x∥_{ D }≤∥x∥_{1}; hence, by Hölder’s inequality, n ^{−1/p}∥x∥_{ p }≤∥x∥_{ D }≤n ^{1−1/p}∥x∥_{ p }. For convenience, let us consider a sequence (x _{ i })_{ i } with i∈I _{ n } and x _{ i }=0 for i∉I _{ n }, and denote by Δ _{ x }(k)=∥(x _{ i+k }−x _{ i })_{ i }∥_{ D } the misalignment function of x with respect to ∥⋅∥_{ D }. For the proofs of the following properties, see the Appendix:

(P1)
\(\|(x_{i})_{i\in I_{n}}\|_{D}\) induces a norm on ℝ^{n}.

(P2)
Δ _{ x }(0)=0 for all summable real sequences x.

(P3)
\(\|(x_{i})_{i\in I_{n}}\|_{D} = \max\{0,\max_{k\in I_{n}} \sum_{i = 1}^{k} x_{i}\} - \min\{0, \min_{k \in I_{n}} \sum_{i = 1}^{k} x_{i}\}\).

(P4)
Δ _{ x }(k)≤|k|⋅L, where L=max_{ i } x _{ i }−min_{ i } x _{ i }, k∈ℤ, and x _{ i }≥0.

(P5)
Δ _{ x }(k)=Δ _{ x }(−k) for x=(x _{ i })_{ i } with x _{ i }≥0 and k∈ℤ.

(P6)
For x=(x _{ i })_{ i } with x _{ i }≥0, the function Δ _{ x }(⋅) is monotonically increasing on ℕ∪{0}.
The equation in (P3) allows us to compute the discrepancy norm of a sequence of length n with O(n) operations instead of the O(n ^{2}) operations resulting from the original Definition (3). Especially the monotonicity (P6) and the Lipschitz property (P4) are interesting for applications in the field of signal analysis. It is instructive to point out that the Lipschitz constant in (P4) does not depend on frequencies or other characteristics of the sequence x. Properties (P4), (P5), and (P6) are illustrated in Figs. 1(a) and 1(b), which demonstrate the behavior of the misalignment function of a sequence of all-or-none events. While Fig. 1(a) shows the typical local minima of the misalignment function with respect to the Euclidean norm, Fig. 1(b) visualizes the symmetry property (P5), the monotonicity property (P6), and the boundedness of the slope due to the Lipschitz property (P4) of the corresponding misalignment function induced by the discrepancy norm.
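To make the complexity remark concrete, property (P3) translates directly into a one-pass computation over the running partial sums. The following Python sketch is an illustration of ours, not part of the paper (the function name is an assumption); it reproduces the two examples from Sect. 2:

```python
def discrepancy_norm(x):
    """||x||_D via property (P3): track the running partial sums S_k and
    return max{0, max_k S_k} - min{0, min_k S_k} in one O(n) pass."""
    s = 0.0
    s_max = s_min = 0.0
    for xi in x:
        s += xi
        s_max = max(s_max, s)
        s_min = min(s_min, s)
    return s_max - s_min

# Sign and ordering of the entries matter, in contrast to p-norms:
print(discrepancy_norm([-1, 1, -1, 1]))   # 1.0
print(discrepancy_norm([-1, -1, 1, 1]))   # 2.0
```

For nonnegative sequences the result coincides with the 1-norm, and for alternating signs it collapses toward the maximum norm, as stated above.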
The Unit Ball of the Discrepancy Norm as Convex Polytope
In this section we consider the unit ball of the discrepancy norm in dimension n∈ℕ, \(B_{D}^{(n)} = \{\mathbf{x} \in \mathbb{R}^{n} : \|\mathbf{x}\|_{D} \leq 1 \}\), as a geometric object. Definition (3) immediately leads to the representation

$$B_{D}^{(n)} = \bigcap_{I \in \mathcal{I}_{n}} \Biggl\{ \mathbf{x} \in \mathbb{R}^{n} : \Biggl| \sum_{i=1}^{n} x_{i}\, \mathbf{1}_{I}(i) \Biggr| \leq 1 \Biggr\}, \qquad(4)$$

where \(\mathcal{I}_{n}\) denotes the set of subintervals of {1,…,n}, and 1_{ I }(⋅) the indicator function given by 1_{ I }(i)=1 if and only if i∈I. Equation (4) represents the unit ball \(B_{D}^{(n)}\) as a bounded intersection of half-spaces, which shows that the unit balls of the discrepancy norm are convex polytopes. Figures 2(a) and 2(b) illustrate the unit balls \(B_{D}^{(n)}\) for n=2 and n=3. Lemma 1 shows a first relationship between \(B_{D}^{(n)}\) and the (n+1)-hypercube.
Lemma 1
Let x=(x _{ i })_{ i }∈[−1,1]^{n} with ∥x∥_{ D }≤1, and let c∈ℝ. Then,

$$(c,\; c+x_{1},\; \ldots,\; c+x_{1}+\cdots+x_{n}) \in [0,1]^{n+1} \qquad(5)$$

if and only if

$$-\min\Bigl\{0, \min_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\Bigr\} \;\leq\; c \;\leq\; 1 - \max\Bigl\{0, \max_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\Bigr\}. \qquad(6)$$

The constant c is uniquely determined if and only if ∥x∥_{ D }=1.
Proof
Note that the entries of the vector in (5) are \(c + \sum_{j=1}^{k} x_{j}\), k∈{0,1,…,n}, where the empty sum is 0. Hence the smallest entry equals \(c + \min\{0, \min_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\}\), and the largest entry equals \(c + \max\{0, \max_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\}\). Requiring the former to be at least 0 and the latter to be at most 1 is precisely condition (6), which shows the necessity of condition (6). Conversely, according to property (P3), the assumption ∥x∥_{ D }≤1 implies

$$0 \;\leq\; -\min\Bigl\{0, \min_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\Bigr\} \;\leq\; 1 - \max\Bigl\{0, \max_{k\in I_{n}} \sum_{j=1}^{k} x_{j}\Bigr\} \;\leq\; 1,$$

so the interval in (6) is nonempty, and any c satisfying (6) yields a vector (5) with all entries in [0,1]; hence condition (6) implies formula (5). □
Given a sequence x=(x _{1},…,x _{ n }) with ∥x∥_{ D }≤1, Lemma 1 reveals that x can be represented as a sequence of differences y _{ i+1}−y _{ i }, i∈I _{ n }, with y _{ i }∈[0,1], and that such a representation is uniquely determined if ∥x∥_{ D }=1. This observation motivates Lemma 2, which points out a fundamental relationship between the discrepancy norm and the maximum norm.
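Concretely, such a representation can be computed by cumulative summation with the integration constant c = 1 − max{0, max_k ∑_{j≤k} x_j}. The following sketch is an illustration of ours, not taken from the paper (the function name is an assumption):

```python
from itertools import accumulate

def difference_representation(x):
    """For x with ||x||_D <= 1, return y in [0,1]^{n+1} such that
    x_i = y_{i+1} - y_i, choosing an admissible constant c as in Lemma 1."""
    partial = [0.0] + [float(s) for s in accumulate(x)]   # S_0, S_1, ..., S_n
    c = 1.0 - max(0.0, max(partial))                       # admissible by Lemma 1
    y = [c + s for s in partial]
    assert all(0.0 <= yi <= 1.0 for yi in y)               # Lemma 1 guarantee
    return y

print(difference_representation([-1, 1, -1, 1]))   # [1.0, 0.0, 1.0, 0.0, 1.0]
```

Differencing the returned y recovers x exactly; for ∥x∥_D = 1 the constant c, and hence y, is unique.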
Lemma 2
Let x=(x _{ i })_{ i }∈ℝ^{n+1}, n∈ℕ. Then,

$$\bigl\|(x_{i+1}-x_{i})_{i\in I_{n}}\bigr\|_{D} = \max_{i} x_{i} - \min_{i} x_{i}. \qquad(10)$$
Proof
Consider an x with \(\|\mathbf{x} - \min_{i}\{x_{i}\}\cdot\mathbf{1}\|_{\infty} > 0\) and set \(\tilde{x}_{i} = \frac{x_{i} - \min_{i}\{x_{i}\}}{\|\mathbf{x} - \min_{i}\{x_{i}\}\cdot\mathbf{1}\|_{\infty}} \in [0,1]\). Then by the Lipschitz property (P4) we get

$$\bigl\|(\tilde{x}_{i+1} - \tilde{x}_{i})_{i}\bigr\|_{D} \leq 1. \qquad(11)$$

Since \(\max_{i}\{\tilde{x}_{i}\}= 1\) and \(\min_{i}\{\tilde{x}_{i}\}= 0\), there are indices i _{0} and i _{1} such that \(\tilde{x}_{i_{0}}=1\) and \(\tilde{x}_{i_{1}}=0\). Without loss of generality, let us assume that i _{0}<i _{1}. Then \(\|( \tilde{x}_{i+1} - \tilde{x}_{i})_{i}\|_{D} \geq |\tilde{x}_{i_{1}} - \tilde{x}_{i_{1}-1} + \cdots + \tilde{x}_{i_{0}+1} - \tilde{x}_{i_{0}}| = 1\), which, together with (11), yields \(\|( \tilde{x}_{i+1} - \tilde{x}_{i})_{i}\|_{D} = 1\), and hence ∥(x _{ i+1}−x _{ i })_{ i }∥_{ D }=∥x−min_{ i }{x _{ i }}⋅1∥_{∞}=max_{ i } x _{ i }−min_{ i } x _{ i }. The remaining case of a constant sequence x is trivial. □
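Lemma 2 is easy to stress-test numerically. The sketch below is ours (reusing the one-pass (P3) implementation, an assumption of this illustration); it checks on random integer sequences that the discrepancy norm of the difference sequence equals max_i x_i − min_i x_i:

```python
import random

def discrepancy_norm(x):
    # property (P3): one pass over the running partial sums
    s = s_max = s_min = 0.0
    for xi in x:
        s += xi
        s_max, s_min = max(s_max, s), min(s_min, s)
    return s_max - s_min

random.seed(1)
for _ in range(1000):
    x = [random.randint(-5, 5) for _ in range(random.randint(2, 12))]
    diffs = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    # Lemma 2: discrepancy norm of differences = range of the sequence
    assert discrepancy_norm(diffs) == max(x) - min(x)
```

The check is exact here because partial sums of the difference sequence telescope to x_{k+1} − x_1, so (P3) returns precisely the range of x.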
For convenience, let us define that for a sequence \(\mathbf{x} = (x_{i})_{i\in I_{n}}\in \mathbb{R}^{n}\), the index interval C⊆I _{ n } is called a core discrepancy interval with respect to x if and only if for any index interval \(\tilde{C} \subseteq C\) with \(|\sum_{i\in \tilde{C}} x_{i}| = \|\mathbf{x}\|_{D}\), it follows that \(\tilde{C} = C\). Note that for any x, due to the definition of the discrepancy norm, there is at least one core discrepancy interval. Further, for convenience, let 0=(0,…,0)^{T} and 1=(1,…,1)^{T}.
With these prerequisites we come to the central result of this paper that characterizes the vertices \(\mathrm{vert}(B_{D}^{(n)})\) of \(B_{D}^{(n)}\) in terms of vertices of the hypercube of dimension (n+1).
Lemma 3
x∈ℝ^{n} is a vertex of the convex polytope \(B_{D}^{(n)}\) if and only if ∥x∥_{ D }=1 and x∈{−1,0,1}^{n}.
Proof
First of all we show that \(B_{D}^{(n)}\) equals the convex hull of \(\mathcal{D}^{(n)} = \{\mathbf{c}\in \{-1,0,1\}^{n} : \|\mathbf{c}\|_{D}=1\}\).
Observe that \(\mathrm{conv}(\mathcal{D}^{(n)}) \subseteq B_{D}^{(n)}\) follows immediately from Definition (3) and the representation (4), as \(\mathbf{x} \in \mathcal{D}^{(n)}\) implies \(\sup_{n_{1}, n_{2} \in \mathbb{Z}: n_{1} \leq n_{2}} |\sum_{i=n_{1}}^{n_{2}} x_{i}| \leq 1\).
What remains to be shown is that an arbitrary \(\mathbf{x} \in B_{D}^{(n)}\) can be represented as a convex combination of elements from the set \(\mathcal{D}^{(n)}\).
Therefore, suppose that x∉{−1,0,1}^{n} with ∥x∥_{ D }=1. Let C={n _{1},…,n _{2}}⊆I _{ n } be a core discrepancy interval with respect to x. Without loss of generality we may assume that ∑_{ i∈C } x _{ i }>0.
Let us consider the cases n _{1}>1 and n _{1}=1. For the case that n _{1}>1, let us set \(\alpha_{i} := \sum_{j=i}^{n_{1}-1} x_{j}\), i∈{1,…,n _{1}−1}.
Observe that \(\alpha_{i^{*}} > 0\) for some index i ^{∗}∈{1,…,n _{1}−1} entails \(\sum_{i= i^{*}}^{n_{2}} x_{i} = \alpha_{i^{*}} + \sum_{i = n_{1}}^{n_{2}} x_{i} > \sum_{i = n_{1}}^{n_{2}} x_{i} = \|\mathbf{x}\|_{D} \) and, therefore, contradicts the fact that C is a core discrepancy interval. From this and from ∥x∥_{ D }=1 it follows that
−1 ≤ α _{ i } ≤ 0
for all indices i∈{1,…,n _{1}−1}.
Now, arrange the values −α _{ i }, i∈{1,…,n _{1}−1}, in increasing order: \(0 \leq -\alpha_{r_{1}} \leq -\alpha_{r_{2}} \leq \dots \leq -\alpha_{r_{n_{1}-1}} \leq 1\), and set \(\lambda_{1} := -\alpha_{r_{1}}\) and \(\lambda_{k} := -\alpha_{r_{k}} + \alpha_{r_{k-1}}\) for k∈{2,…,n _{1}−1}.
Then, we have λ _{ i }≥0 for i∈{1,…,n _{1}−1}, and due to \(-\alpha_{r_{1}} = \lambda_{1}\) and \(-\alpha_{r_{k}} = \lambda_{1} + \dots + \lambda_{k}\), we obtain \(\sum_{i= 1}^{n_{1}-1} \lambda_{i} = \max_{i= 1}^{n_{1}-1} \{-\alpha_{i}\} \leq 1 \). Consequently,
Finally, we get the representation
where
Next let us define the auxiliary vectors \(\mathbf{s}^{(0)}, \mathbf{s}^{(1)},\ldots, \mathbf{s}^{(n_{1}-1)} \in \{-1,0\}^{n_{1}-1}\) given by
where j∈{1,…,n _{1}−1}. Observe that the vectors (16), the scalars (14) and (15) yield
Hence,
where \(\tilde{\mathbf{g}}^{(0)} = \mathbf{0}\) and
Note that \(\tilde{\mathbf{g}}^{(1)},\ldots, \tilde{\mathbf{g}}^{(n_{1}-1)} \in \mathcal{D}^{(n_{1}-1)} \) because of
for j∈{1,…,n _{1}−1}. Note that also \(\tilde{\mathbf{g}}^{(0)}\) can be represented as a convex combination of vectors of \(\mathcal{D}^{(n_{1}-1)}\), e.g., \(\tilde{\mathbf{g}}^{(0)} = \frac{1}{2} (1,-1, 0, \dots,0)^{T} + \frac{1}{2} (-1,1, 0, \dots,0)^{T}\). This proves that \((x_{1}, \ldots, x_{n_{1}-1})\) can be represented as a convex combination of elements of \(\mathcal{D}^{(n_{1}-1)}\).
For the other case that n _{1}=1, let us set \(\beta_{i} := \sum_{j=n_{1}}^{i} x_{j} \), where i∈{n _{1},…,n _{2}}. If n _{1}=n _{2}, the core discrepancy interval property of C entails that \(x_{n_{1}} = 1\). Therefore, let us consider the case n _{2}>n _{1}. Then, the assumption \(\beta_{i^{*}} < 0\) for some index i ^{∗}∈{n _{1},…,n _{2}−1} leads to \(\sum_{i=n_{1}}^{n_{2}} x_{i} = \beta_{i^{*}} + \sum_{i=i^{*}+1}^{n_{2}} x_{i} \), implying that \(\sum_{i=i^{*}+1}^{n_{2}} x_{i} > \sum_{i=n_{1}}^{n_{2}} x_{i}\). This contradicts the core discrepancy interval property of C; hence, in analogy to the case n _{1}>1, we get that β _{ i }≥0 for all i∈{n _{1},…,n _{2}}. Now, reasoning steps analogous to the case n _{1}>1 can be applied in order to show that \((x_{n_{1}}, \ldots, x_{n_{2}})\) can be represented as a convex combination of elements of \(\mathcal{D}^{(n_{2}-n_{1}+1)}\).
If n _{2}=n, we are done; if n _{2}<n, then let us consider \(\gamma_{i} := \sum_{j=n_{2}+1}^{i} x_{j} \), i∈{n _{2}+1,…,n}, which, in analogy to the case n _{1}>1, leads to γ _{ i }≤0, and the same reasoning as in the case n _{1}>1 can be applied to show that \((x_{n_{2}+1}, \ldots, x_{n})\) can be represented as a convex combination of elements of \(\mathcal{D}^{(n-n_{2})}\).
Putting everything together, formulas (17), (18), and (19) show that x can be represented as a convex combination of elements of \(\mathcal{D}^{(n)}\),
showing that \(\mathrm{conv}(\mathcal{D}^{(n)}) = B_{D}^{(n)}\).
Finally we show that all elements of \(\mathcal{D}^{(n)}\) are vertices of \(\mathrm{conv}(\mathcal{D}^{(n)})\). Suppose that \(\mathbf{v}_{0} \in \mathcal{D}^{(n)}\) can be represented as a convex combination of elements \(\mathbf{v}_{i} \in \mathcal{D}^{(n)}\backslash \{\mathbf{v}_{0}\}\), \(i \in \{1, \ldots, |\mathcal{D}^{(n)}|-1\}\), i.e.,

$$\mathbf{v}_{0} = \sum_{i} \lambda_{i} \mathbf{v}_{i}, \qquad(20)$$

∑_{ i } λ _{ i }=1, λ _{ i }≥0. Then, due to Lemma 1 and the fact that ∥v _{ i }∥_{ D }=1, \(i \in \{0, \ldots, |\mathcal{D}^{(n)}|-1\}\), there are constants c _{ i } such that \(\overline{\mathbf{v}}_{i} = (c_{i}, c_{i} + v_{1}^{i}, \ldots, c_{i} + v_{1}^{i} + \cdots + v_{n}^{i}) \in [0,1]^{n+1}\), where \(\mathbf{v}_{i} = (v_{1}^{i}, \ldots, v_{n}^{i})\). Since \(v_{j}^{i} \in \{-1,0,1\}\), Lemma 1 tells us that \(c_{i} = 1 - \max\{0, \max_{k\in I_{n}} \sum_{j=1}^{k} v_{j}^{i}\} \in \{0,1\}\). From this it follows that \(\overline{\mathbf{v}}_{i} \in \{0,1\}^{n+1}\). Further, ∥v _{ i }∥_{ D }=1 implies \(\overline{\mathbf{v}}_{i} \in \{0,1\}^{n+1}\backslash \{\mathbf{0}, \mathbf{1}\}\). Note that v _{ i }≠v _{ j } implies \(\overline{\mathbf{v}}_{i} \neq \overline{\mathbf{v}}_{j}\). Now, let us consider the linear mapping
$$\mathcal{D}\bigl((x_{i})_{i\in I_{n+1}}\bigr) := (x_{i+1}-x_{i})_{i\in I_{n}}, \qquad(21)$$

where \((x_{i})_{i\in I_{n+1}} \in \{0,1\}^{n+1}\backslash \{\mathbf{0}, \mathbf{1}\}\). Equation (20) expressed in terms of (21) means that \(\mathcal{D}(\overline{\mathbf{v}}_{0}) = \sum_{i} \lambda_{i} \mathcal{D}(\overline{\mathbf{v}}_{i})\), which entails \(\mathcal{D}(\overline{\mathbf{v}}_{0}) = \mathcal{D}( \sum_{i} \lambda_{i} \overline{\mathbf{v}}_{i})\). Since \(\mathcal{D}(\overline{\mathbf{v}}) = \mathcal{D}(\overline{\mathbf{w}})\) implies \(\overline{\mathbf{v}} = \overline{\mathbf{w}} + c \cdot \mathbf{1}\) for some c∈ℝ, we obtain

$$\overline{\mathbf{v}} = \overline{\mathbf{w}} + c \cdot \mathbf{1}. \qquad(22)$$

Since \(\{\overline{\mathbf{v}} + c\cdot \mathbf{1} : c \in \mathbb{R}\} \cap [0,1]^{n+1} = \{\overline{\mathbf{v}}\}\) for \(\overline{\mathbf{v}} \in \{0,1\}^{n+1}\backslash \{\mathbf{0}, \mathbf{1}\}\), we obtain c=0 in Eq. (22), and hence \(\overline{\mathbf{v}} = \overline{\mathbf{w}}\). This proves the injectivity of the mapping (21).
Note that \(\overline{\mathbf{v}}_{0}\) is a vertex of the hypercube [0,1]^{n+1} and \(\sum_{i} \lambda_{i} \overline{\mathbf{v}}_{i}\) is an element of this hypercube. But \(\overline{\mathbf{v}}_{0}\), as a vertex of the hypercube, cannot be represented as a convex combination of other vertices of [0,1]^{n+1} different from \(\overline{\mathbf{v}}_{0}\), which by means of the injectivity of (21) shows that assumption (20) cannot be true. Consequently, we get \(\mathcal{D}^{(n)} = \mathrm{vert}(\mathrm{conv}(\mathcal{D}^{(n)}))\), and together with the first part of the proof, \(\mathrm{conv}(\mathcal{D}^{(n)}) = B_{D}^{(n)}\), we conclude that \(\mathrm{vert}(B_{D}^{(n)}) = \mathcal{D}^{(n)}\), which ends the proof. □
Next, the main result that characterizes the unit ball of the discrepancy norm by means of the mapping (21) is stated.
Theorem 1
Let \(B_{D}^{(n)}\) denote the ndimensional unit ball of the discrepancy norm, n≥1.

(a)
The mapping \(\mathcal{D}:\{0,1\}^{n+1}\backslash{\{\mathbf{0},\mathbf{1}\}} \to \mathrm{vert}(B_{D}^{(n)})\) given by \(\mathcal{D}((x_{i})_{i\in I_{n+1}}) = (x_{i+1}-x_{i})_{i\in I_{n}}\) is a one-to-one correspondence.

(b)
\(B_{D}^{(n)} = \mathrm{conv}(\{(y_{i+1}-y_{i})_{i=1}^{n} : y_{i} \in \{0,1\}\})\).
Proof
The injectivity of the mapping (21) was established in the proof of Lemma 3. In order to prove the surjectivity of the mapping (21), let us consider \(\mathbf{x} \in \mathrm{vert}(B_{D}^{(n)})\), which by Lemma 3 is equivalent to ∥x∥_{ D }=1 and x∈{−1,0,1}^{n}. Due to Lemma 1, there is a uniquely determined integration constant \(c = 1 - \max\{0, \max_{k \in I_{n}} \sum_{j=1}^{k} x_{j}\} \in [0,1] \) such that (c,c+x _{1},…,c+x _{1}+⋯+x _{ n })∈[0,1]^{n+1}. The assumptions x _{ i }∈{−1,0,1} and ∥x∥_{ D }=1 therefore imply (c,c+x _{1},…,c+x _{1}+⋯+x _{ n })∈{0,1}^{n+1}∖{0,1}; hence there is a sequence \(\mathbf{y} = (y_{i})_{i \in {I_{n+1}}} \in \{0,1\}^{n+1}\backslash{ \{\mathbf{0}, \mathbf{1}\}}\) such that \((x_{i})_{i\in I_{n}} = (y_{i+1} - y_{i})_{i\in I_{n}}\). Part (b) of Theorem 1 directly follows from the bijectivity of the mapping (21). □
Figure 3 illustrates the bijectivity of the mapping (21) for n=2.
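For small n, the one-to-one correspondence of Theorem 1(a) can be verified exhaustively. The following sketch is ours, not part of the paper: it maps every non-constant 0/1 sequence of length n+1 to its difference sequence and compares the image with the vertex set characterized in Lemma 3.

```python
from itertools import product

def discrepancy_norm(x):
    # property (P3): one pass over the running partial sums
    s = s_max = s_min = 0
    for xi in x:
        s += xi
        s_max, s_min = max(s_max, s), min(s_min, s)
    return s_max - s_min

n = 4
# The mapping D of Theorem 1(a) applied to {0,1}^{n+1} \ {0, 1}:
images = {tuple(y[i + 1] - y[i] for i in range(n))
          for y in product((0, 1), repeat=n + 1)
          if 0 < sum(y) < n + 1}
# Vertices of B_D^(n) according to Lemma 3:
vertices = {x for x in product((-1, 0, 1), repeat=n)
            if discrepancy_norm(x) == 1}
assert images == vertices
# Injectivity: all 2^(n+1) - 2 preimages give distinct images.
assert len(images) == 2 ** (n + 1) - 2
```

Since the image set has exactly as many elements as the domain, the mapping is injective, and the equality with the vertex set exhibits surjectivity for this n.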
Geometric Characteristics of the nDimensional Unit Ball of the Discrepancy Norm
In this section Theorem 1 is applied in order to determine geometric characteristics of \(B_{D}^{(n)}\), such as the number of k-dimensional faces and its volume.
Number of kDimensional Faces
The following corollary relates the number of k-dimensional faces of the n-dimensional unit ball of the discrepancy norm to the number of corresponding faces of the (n+1)-dimensional hypercube.
Corollary 1
Let D _{ k,n } denote the number of k-faces of \(B_{D}^{(n)}\), n∈ℕ, 0≤k<n. Then

$$D_{k,n} = \bigl(2^{\,n+1-k} - 2\bigr)\binom{n+1}{k}. \qquad(23)$$
Proof
First of all, let us denote by H _{ k,n } the number of k-dimensional faces of the n-dimensional unit hypercube [0,1]^{n}. As is well known from the theory of regular polytopes, see, e.g., [11], we have \(H_{k,n} = 2^{\,n-k}\binom{n}{k}\). Observe that a k-face of the (n+1)-hypercube cannot contain the elements 0 and 1 together if 0≤k<n+1, n≥1. Therefore, for 0≤k<n+1, there are \(Z_{k,n+1} = 2\binom{n+1}{k}\) k-faces of the (n+1)-hypercube that contain either 0 or 1. Further, observe that for k=n, all k-faces contain either 0 or 1, which can also be seen from the identity Z _{ n,n+1}=H _{ n,n+1}.
Now we consider k<n and apply the mapping \(\mathcal{D}\) of Theorem 1 to the (n+1)-hypercube [0,1]^{n+1}. Observe that the elements 0,1∈[0,1]^{n+1} are mapped to the inner point 0 of \(B_{D}^{(n)}\). Note that a k-face F of the (n+1)-hypercube can be represented by means of a linear combination of unit vectors, i.e., \(F = \{\mathbf{e}_{i_{0}} + \sum_{l=1}^{k} \lambda_{l} \mathbf{e}_{i_{l}} : \lambda_{l} \in [0,1]\}\), where e _{ i } denotes the i-th unit vector. We show that a k-dimensional face of the (n+1)-hypercube that contains neither 0 nor 1 is mapped to a k-dimensional face of \(B_{D}^{(n)}\).
For this, let us consider the set of linearly independent vectors \(\{\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{1}}, \ldots, \mathbf{e}_{i_{0}} - \mathbf{e}_{i_{k}} \}\). The linear independence of the mapped vectors \(\{\mathcal{D}(\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{1}}),\ldots, \mathcal{D}(\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{k}}) \}\) follows from the observation that \(\sum_{l=1}^{k} \lambda_{l} \mathcal{D}(\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{l}})= \mathbf{0}\), i.e., \(\mathcal{D} (\sum_{l=1}^{k} \lambda_{l} (\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{l}}))= \mathbf{0}\), can only be satisfied if there is a real c∈ℝ such that \(\sum_{l=1}^{k} \lambda_{l} (\mathbf{e}_{i_{0}} - \mathbf{e}_{i_{l}}) = c \mathbf{1}\). Since k<n and \(\mathbf{e}_{i_{l}} \in \{0,1\}^{n+1}\), there is an index k ^{∗}∈{1,…,n+1} for which the corresponding coordinate is zero for all vectors \(\mathbf{e}_{i_{l}}\), l∈{0,…,k}. This implies c=0, hence λ _{ l }=0 for l∈{1,…,k} due to the linear independence of the set of vectors \(\{ \mathbf{e}_{i_{0}} - \mathbf{e}_{i_{1}},\ldots, \mathbf{e}_{i_{0}} - \mathbf{e}_{i_{k}}\}\). From this and from Theorem 1 it follows that there is a one-to-one mapping between the set of k-faces of the (n+1)-hypercube that contain neither 0 nor 1 and the set of k-faces of \(B_{D}^{(n)}\). This implies D _{ k,n }=H _{ k,n+1}−Z _{ k,n+1}, which equals (23). □
In particular, \(B_{D}^{(n)}\) has D _{0,n }=2^{n+1}−2 vertices, D _{1,n }=(n+1)(2^{n}−2) edges, and D _{ n−1,n }=n(n+1) facets \(\mathcal{D}(F_{ij})\), i≠j, of dimension n−1, where \(F_{ij} = \{\mathbf{e}_{i} + \sum_{k\neq i,j} \lambda_{k} \mathbf{e}_{k} : 0 \leq \lambda_{k} \leq 1\}\) and i,j∈I _{ n+1}. Note that F _{ ij }=F _{ ji }+(e _{ i }−e _{ j }).
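These counts can be cross-checked computationally. The sketch below is ours, not from the paper; the helper `D` implements the closed form (2^{n+1−k} − 2)·C(n+1, k), stated here as an assumption that is consistent with the vertex, edge, and facet counts quoted above, and the k = 0 case is verified against a brute-force enumeration of the vertex set from Lemma 3:

```python
from itertools import product
from math import comb

def discrepancy_norm(x):
    # property (P3): one pass over the running partial sums
    s = s_max = s_min = 0
    for xi in x:
        s += xi
        s_max, s_min = max(s_max, s), min(s_min, s)
    return s_max - s_min

def D(k, n):
    # closed-form face count consistent with the specializations in the text
    return (2 ** (n + 1 - k) - 2) * comb(n + 1, k)

for n in range(1, 7):
    # vertices of B_D^(n) by Lemma 3: x in {-1,0,1}^n with ||x||_D = 1
    n_vertices = sum(1 for x in product((-1, 0, 1), repeat=n)
                     if discrepancy_norm(x) == 1)
    assert n_vertices == D(0, n) == 2 ** (n + 1) - 2

n = 5
assert D(1, n) == (n + 1) * (2 ** n - 2)   # edges
assert D(n - 1, n) == n * (n + 1)          # facets
```

For n = 2 this gives 6 vertices and 6 edges, matching the hexagonal unit ball shown in Fig. 2(a).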
Volume of \(B_{D}^{(n)}\)
Using the terminology of the theory of convex polytopes, see, e.g., [14, 41], the volume can be obtained by viewing \(B_{D}^{(n)}\) as a zonotope generated by a projection of the hypercube [0,1]^{n+1} followed by a product of shearing transformations. The (n+1) unit vectors e _{1},…,e _{ n+1} are mapped to the Minkowski-sum generators g _{1},…,g _{ n+1}. Since the unit ball is decomposed by the sheared projection of n faces out of (n+1) possibilities, with each face mapped to an n-dimensional sub-body of \(B_{D}^{(n)}\) of volume one, we obtain Corollary 2.
Corollary 2
\(V(B_{D}^{(n)}) = n+1\), n∈ℕ.
It is interesting that the volume \(V(B_{D}^{(n)}) = n+1\) of the n-dimensional unit ball of the discrepancy norm increases linearly with its dimension n, while for p-norms with 1≤p<∞, it can be shown that \(r^{n} V(B_{\|\cdot\|_{p}}^{(n)}) \stackrel{n\rightarrow \infty}{\longrightarrow} 0 \) for any r>0, see, e.g., [15, 33].
Conclusion
In this paper Weyl’s discrepancy norm was studied from a geometric point of view by considering its unit ball. Interpreting the differentiation of a sequence as the subtraction of consecutive entries, it was shown that the unit ball of Weyl’s discrepancy norm of dimension n results from differentiating the unit hypercube of dimension n+1. It was shown how this interpretation helps to derive and prove properties of the discrepancy norm, for example, that the volume of the unit ball of dimension n equals n+1. This paper was motivated by considering the discrepancy norm as a dissimilarity measure for pattern analysis. In the near future, it is planned to investigate the relevance of the discrepancy norm in various fields of pure and applied mathematics. In particular, research will be dedicated to the determination of the distribution of the diameter of a random walk, the discrete mathematical foundation of event-based image processing, and the improvement of stereo matching and related algorithms.
References
 1.
Alexander, J.R., Beck, J., Chen, W.W.L.: Geometric discrepancy theory and uniform distribution. In: Handbook of Discrete and Computational Geometry, pp. 185–207. CRC Press, Inc., Boca Raton (1997)
 2.
Bauer, P., Bodenhofer, U., Klement, E.P.: A fuzzy algorithm for pixel classification based on the discrepancy norm. In: Proc. 5th IEEE Int. Conf. on Fuzzy Systems, New Orleans, LA, vol. III, pp. 2007–2012 (1996)
 3.
Beck, J., Chen, W.W.L.: Irregularities of Distribution. Cambridge University Press, New York (2009)
 4.
Bercher, J.F.: On some entropy functionals derived from Rényi information divergence. Inf. Sci. 178(12), 2489–2506 (2008)
 5.
Bouchot, J., Himmelbauer, J., Moser, B.: On autocorrelation based on Hermann Weyl’s discrepancy norm for time series analysis. In: Proc. 2010 Int. Joint Conf. on Neural Networks (IJCNN’10), pp. 1–7 (2010)
 6.
Bouchot, J., Stübl, G., Moser, B.: A template matching approach based on the discrepancy norm for defect detection on regularly textured surfaces. In: Pinoli, J., Debayle, J., Gavet, Y., Gruy, F., Lambert, C. (eds.) Proc. 10th Intern. Conf. on Quality Control by Artificial Vision (QCAV’11), vol. 8000, p. 80000K. SPIE, Bellingham (2011)
 7.
Brunelli, R.: Template Matching Techniques in Computer Vision: Theory and Practice. Wiley, New York (2009)
 8.
Campbell, J., Lo, A., MacKinlay, C.: The Econometrics of Financial Markets. Princeton University Press, Princeton (1996)
 9.
Chazelle, B.: The Discrepancy Method: Randomness and Complexity. Cambridge University Press, New York (2000)
 10.
Chen, C.M., Cheng, C.T.: From discrepancy to declustering: near-optimal multidimensional declustering strategies for range queries. J. ACM 51(1), 46–73 (2004)
 11.
Coxeter, H.S.M.: Regular Polytopes. Dover, New York (1973)
 12.
Doerr, B.: Integral approximations. Habilitation thesis, University of Kiel (2005)
 13.
Doerr, B., Hebbinghaus, N., Werth, S.: Improved bounds and schemes for the declustering problem. Theor. Comput. Sci. 359(1), 123–132 (2006)
 14.
Grünbaum, B.: Convex Polytopes. Graduate Texts in Mathematics, vol. 221. Springer, New York (2003)
 15.
Huang, Z.Y., He, B.W.: Volume of unit ball in an n-dimensional normed space and its asymptotic properties. J. Shanghai Univ. 12, 107–109 (2008)
 16.
Jancosek, M., Pajdla, T.: A criterial function for superpixel matching (2010)
 17.
Kendall, M., Stuart, A.: The Advanced Theory of Statistics, vol. 2. MacMillan, New York (1979)
 18.
Kogler, J., Sulzbachner, C., Kubinger, W.: Bio-inspired stereo vision system with Silicon Retina imagers. In: Fritz, M., Schiele, B., Piater, J. (eds.) Computer Vision Systems. Lecture Notes in Computer Science, vol. 5815, pp. 174–183. Springer, Berlin (2009)
 19.
Kuipers, L., Niederreiter, H.: Uniform Distribution of Sequences. Dover Publications, New York (2005)
 20.
Kullback, S.: Information Theory and Statistics. Wiley, New York (1959)
 21.
Matousek, J.: Geometric Discrepancy: An Illustrated Guide (Algorithms and Combinatorics), 1st edn. Springer, Berlin (1999)
 22.
Moser, B.: Discrete geometric foundation of event-based imaging. Tech. Rep. SCCH-TR-2011-44, Software Competence Center Hagenberg (2011)
 23.
Moser, B.: A similarity measure for image and volumetric data based on Hermann Weyl’s discrepancy. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2321–2329 (2011)
 24.
Moser, B., Hoch, T.: Misalignment measure based on Hermann Weyl’s discrepancy. In: Kuijper, A., Heise, B., Muresan, L. (eds.) Challenges in the Biosciences: Image Analysis and Pattern Recognition Aspects, Linz, Austria, vol. 232, pp. 187–198 (2008)
 25.
Moser, B., Haslinger, P., Kazmar, T.: On the potential of Hermann Weyl’s discrepancy norm for texture analysis. In: Masoud, M. (ed.) Proc. 2008 Int. Conf. on Computational Intelligence for Modelling, Control and Automation; Intelligent Agents, Web Technologies and Internet Commerce; and Innovation in Software Engineering (CIMCA/IAWTIC/ISE), pp. 187–191. IEEE Computer Society, Washington (2008)
 26.
Moser, B., Stübl, G., Bouchot, J.L.: On a non-monotonicity effect of similarity measures. In: Proc. 1st Int. Conf. on Similarity-Based Pattern Recognition, SIMBAD’11, pp. 46–60. Springer, Berlin (2011)
 27.
Neunzert, H., Wetton, B.: Pattern recognition using measure space metrics. Tech. Rep. 28, University of Kaiserslautern, Department of Mathematics (1987)
 28.
Niederreiter, H.: Random Number Generation and QuasiMonte Carlo Methods. Society for Industrial and Applied Mathematics, Philadelphia (1992)
 29.
Pluim, J., Maintz, J., Viergever, M.: f-Information measures in medical image registration. IEEE Trans. Med. Imaging 23(12), 1506–1518 (2004)
 30.
Renyi, A.: On measures of entropy and information. In: Proc. 4th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547–561 (1961)
 31.
Sadakane, K., Chebihi, N., Tokuyama, T.: Discrepancy-based digital halftoning: automatic evaluation and optimization. Interdiscip. Inf. Sci. 8(2), 219–234 (2002)
 32.
Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 47, 7–42 (2001)
 33.
Smith, D.J., Vamanamurthy, M.: How small is a unit ball? Math. Mag. 62, 101–107 (1989)
 34.
Spearman, C.: The proof and measurement of association between two things. Am. J. Psychol. 15, 72–101 (1904)
 35.
Stübl, G., Bouchot, J.L., Haslinger, P., Moser, B.: Discrepancy norm as fitness function for defect detection on regularly textured surfaces. Tech. Rep. SCCH-TR, Software Competence Center Hagenberg, Austria (2012)
 36.
Takhtamyshev, G., Vandewoestyne, B., Cools, R.: Quasi-random integration in high dimensions. Math. Comput. Simul. 73(5), 309–319 (2007)
 37.
Vajda, I.: Theory of Statistical Inference and Information. Kluwer Academic, Dordrecht (1989)
 38.
Victor, J.D.: Spike train metrics. Curr. Opin. Neurobiol. 15(5), 585–592 (2005)
 39.
Weyl, H.: Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77, 313–352 (1916)
 40.
Zaremba, S.: The mathematical basis of Monte Carlo and quasi-Monte Carlo methods. SIAM Rev. 10(3), 303–314 (1968)
 41.
Ziegler, G.M.: Lectures on Polytopes. Graduate Texts in Mathematics, vol. 152. Springer, New York (1995)
Acknowledgements
This work is partially supported by the Austrian Science Fund (FWF), grant no. P21496-N23, and by the Austrian COMET program.
Thanks to Peter Haslinger, Gernot Stübl, and Jean-Luc Bouchot for fruitful discussions and support in preparing the computer graphics, and to Johannes Himmelbauer for careful proofreading.
Appendix
Here the proofs of properties (P1)–(P6) of Sect. 2 are outlined.
\(\|(x_{i})_{i\in I_{n}}\|_{D} = 0\) implies \(\sup_{n_{1} \leq n_{2}} |\sum_{i=n_{1}}^{n_{2}} x_{i}| = 0\); hence x _{ i }=0 for all i∈I _{ n }. Homogeneity and the triangle inequality immediately follow from Eq. (3). For (P3), let us consider the sequence \((x_{i})_{i\in I_{n}}\), and let us set x _{ i }:=0 for i∉I _{ n }. We distinguish two cases. First, let us suppose that there are indices \(\tilde{n}_{1}\), \(\tilde{n}_{2}\) with \(\sum_{i=-\infty}^{\tilde{n}_{1}} x_{i} < 0\) and \(\sum_{i=-\infty}^{\tilde{n}_{2}} x_{i} > 0\). Then

$$\bigl\|(x_{i})_{i\in I_{n}}\bigr\|_{D} = \max_{k \in I_{n}} \sum_{i=1}^{k} x_{i} - \min_{k \in I_{n}} \sum_{i=1}^{k} x_{i} = \max\Bigl\{0, \max_{k \in I_{n}} \sum_{i=1}^{k} x_{i}\Bigr\} - \min\Bigl\{0, \min_{k \in I_{n}} \sum_{i=1}^{k} x_{i}\Bigr\}.$$

Secondly, if either \(\sum_{i=-\infty}^{k} x_{i} \leq 0\) for all k∈I _{ n } or \(\sum_{i=-\infty}^{k} x_{i} \geq 0\) for all k∈I _{ n }, we also get \(\|(x_{i})_{i\in I_{n}} \|_{D} = \max\{0, \max_{k \in I_{n}} \sum_{i=1}^{k} x_{i} \} - \min\{0, \min_{k \in I_{n}} \sum_{i=1}^{k} x_{i} \}\). The Lipschitz property (P4), the symmetry property (P5), and the monotonicity (P6) follow from the observation

$$\Delta_{\mathbf{x}}(k) = \bigl\|(x_{i+k} - x_{i})_{i}\bigr\|_{D} = \max_{m \in \mathbb{Z}} \sum_{i=m+1}^{m+k} x_{i}, \quad k \geq 0,$$

where x _{ i }≥0 is assumed.
Moser, B.A. Geometric Characterization of Weyl’s Discrepancy Norm in Terms of Its n-Dimensional Unit Balls. Discrete Comput Geom 48, 793–806 (2012). https://doi.org/10.1007/s00454-012-9454-0
Keywords
 Discrepancy
 Lipschitz property
 Zonotope
 Hypercube