Supplements

Löffler, Andreas; Kruschwitz, Lutz

doi:10.1007/978-3-030-20103-6_7


Part of the book series: Springer Texts in Business and Economics (STBE)


Abstract

Anyone writing a book will rarely follow a plan that was not revised several times during the process. This was definitely the case when this book was written. We discussed many different versions before we arrived at the current format. In some of these versions mathematical terms like “convergence of functions” or “cardinality of sets” played an important role. In the end, we found a way to discuss the Brownian motion without using these terms explicitly. The obvious consequence could have been to simply drop this material.


Discussions with students and colleagues taught us that these topics can also be of use in several other areas of economics. Therefore we decided to leave the supplements in our book. The four subsequent sections can be read independently of each other. The entire chapter can be skipped for the understanding of the Brownian motion.

7.1 Cardinality of Sets

Imagine adding 0 to the set of scores on a die as another element:

$$\displaystyle \begin{aligned} \{0, 1, 2, 3, 4, 5, 6\}. \end{aligned}$$

Obviously, this set is larger than the original set: it now has seven elements instead of six. With this simple fact in mind, one is inclined to conclude that the same idea also applies to infinite sets. For example, if we compare the set $\mathbb N$ of all natural numbers with the set $\mathbb Z$ of integers, it seems reasonable to suppose that $\mathbb Z$ is larger than $\mathbb N$.

However, one cannot decide whether such a proposition is true or false by looking at the number of elements. This number is infinite in $\mathbb Z$ as well as in $\mathbb N$, and we had already realized that infinity is not a number that can be used to perform simple arithmetic operations such as addition or comparisons. Thus, one has to create another concept if one wants to compare infinite sets. This concept is cardinality.

With infinite sets, intuitions formed on finite sets lead to results that seem to contradict common sense. At first, one might think that the set of natural numbers is smaller than the set of integers since all negative values −1, −2, … are missing. However, a simple consideration shows that this conclusion is mistaken. Rather, the set of integers is exactly as large as the set of natural numbers; both have “the same cardinality,” a term we explain below. This underlines the fact that infinity must be handled very carefully. It is better not to rely on common sense or “intuition”!

The idea of cardinality is to employ a one-to-one correspondence when comparing two sets rather than counting their elements. Two sets are said to have the same cardinality (or to be “equal in size”) if and only if there exists a one-to-one correspondence between their elements, i.e., a pairing in which every element of each set has exactly one partner in the other.

With finite sets, counting elements and using one-to-one correspondences lead to the same result. Figure 7.1 illustrates that the set with seven elements is larger than the set with six elements: one element from the set {0, 1, …, 6} will never find a “partner.”

In the case of the two infinite sets, however, the outcome is surprising. This is demonstrated by the assignment in Fig. 7.2: each natural number is mapped to exactly one integer and this mapping is one-to-one. One can clearly observe that every natural number and every integer appears exactly once. Those preferring formulas might use

$$\displaystyle \begin{aligned} f:\mathbb N \,\rightarrow\,\mathbb Z, \qquad f(n)= \begin{cases} -\frac{n}{2}, \quad \text{if }n\text{ is even}\,;\\ \frac{n+1}{2}, \quad \text{if }n\text{ is odd}\,. \end{cases} \end{aligned} $$

(7.1)

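The function f assigns an integer to each natural number n, and f is invertible in the sense that every integer in $\mathbb Z$ is the image of exactly one natural number. Note that 0 must be counted as a natural number here, since f(0) = 0 is the only way to reach the integer 0.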
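
The mapping (7.1) can be checked on a finite block of numbers. The following is only a minimal sketch: the helper name `f_inverse` and the block size 2001 are our own choices, and we assume that 0 is counted as a natural number.

```python
def f(n: int) -> int:
    """Map a natural number (0, 1, 2, ...) to an integer as in Eq. (7.1)."""
    if n % 2 == 0:
        return -(n // 2)      # even n: 0, 2, 4, ... -> 0, -1, -2, ...
    return (n + 1) // 2       # odd n: 1, 3, 5, ... -> 1, 2, 3, ...


def f_inverse(z: int) -> int:
    """Recover the natural number that f sends to the integer z (hypothetical helper)."""
    return 2 * z - 1 if z > 0 else -2 * z


images = [f(n) for n in range(2001)]
# Every integer from -1000 to 1000 occurs exactly once among the images.
covers_block = sorted(images) == list(range(-1000, 1001))
print(images[:7], covers_block)
```

A finite check of this kind is of course no proof, but it makes the pairing of Fig. 7.2 tangible: no integer in the block is missed and none is hit twice.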

The idea of cardinality will be further illustrated with another example.

Example 7.1 (Cantor’s Diagonal Argument)

The set of positive rational numbers $\mathbb Q_+$ has the same cardinality as the set of natural numbers. To show this it is necessary to prove, analogous to Fig. 7.2, that each positive rational number can be assigned to exactly one natural number and vice versa.

The rational numbers $\mathbb Q_+$ consist of all fractions $\frac {m}{n}$ with m and n being positive natural numbers. These rational numbers are now arranged in an infinite two-dimensional matrix as shown in Fig. 7.3.^{Footnote 1} The arrows shown illustrate how one may imagine the one-to-one correspondence between the natural and the rational numbers: the 1 is assigned to fraction $\frac {1}{1}$, the 2 to fraction $\frac {2}{1}$, the 3 to fraction $\frac {1}{2}$, the 4 to fraction $\frac {1}{3}$, the 5 to fraction $\frac {2}{2}$, and so on.

This procedure would create a one-to-one correspondence if there were not an annoying blemish: the right matrix contains too many elements. The fractions $\frac {1}{1}$, $\frac {2}{2}$, $\frac {3}{3}, \ldots$ or $\frac {3}{17}, \frac {6}{34}, \frac {9}{51},\ldots $ are actually identical and do not represent different rational numbers at all. Therefore, they must not be assigned to different natural numbers. One has to make sure that they are accounted for only once. This is achieved by “thinning out” the right matrix: all fractions $\frac {m}{n}$ in which m and n are not coprime are deleted, so that the diagonal construction is carried out only for coprime pairs. Because of this “thinning out” the formal proof is much more complicated and must, if one wants to be formally precise, be conducted by complete induction. However, we will not present the details of this proof.
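
The diagonal walk with “thinning out” can be sketched as a small generator. This is an illustration under our own assumptions, not the formal proof: the function name `positive_rationals` is hypothetical, and the zigzag direction is chosen so that the first assignments agree with those listed above (1/1, 2/1, 1/2, 1/3, …).

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    for s in count(2):                      # s = m + n labels one diagonal
        numerators = range(1, s)
        if s % 2 == 1:                      # alternate the direction (zigzag)
            numerators = reversed(numerators)
        for m in numerators:
            n = s - m
            if gcd(m, n) == 1:              # thinning out: skip 2/2, 6/34, ...
                yield Fraction(m, n)


first = list(islice(positive_rationals(), 200))
print(first[:6])
```

The position of a fraction in this list is the natural number assigned to it. Any given fraction such as 3/17 is reached after finitely many steps because it lies on the diagonal m + n = 20.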

A set whose cardinality equals the cardinality of the natural numbers is called countable. In this sense the natural numbers, the integers, and the rational numbers are countable. Countable sets are of great importance because they can appear as index sets in sums and products. An expression of the form $\sum _{i\in A} a_i$ makes sense if and only if A is countable. If $A=\mathbb N$ one can even write $\lim _{n\to \infty }\sum _{i=1}^n a_i$ for this sum.

One could suspect that for all infinite sets it can be proven—with ingenious tricks—that they are countable. However, that is not the case and we will show for a very prominent set that it is larger than the set of natural numbers.

Example 7.2 (Uncountability)

We prove that the set of real numbers $\mathbb R$ has a different cardinality than the set of natural numbers. That is quite simple.

To this end we assume that someone claims to be able to map the set of real numbers one-to-one to the set of natural numbers. This person would be able to list all real numbers one after the other. This would constitute a sequence of all real numbers. In particular, this person can name a unique predecessor and successor for each real number. We will show that at least one real number is still missing, which is a contradiction. This proves that the set of real numbers must be larger than the set of natural numbers.

In Fig. 7.4 we present the sequence of real numbers with their (possibly infinite) decimal representation which the above person claims to be complete, i.e., containing all real numbers. Instead of the digits 0, 1, …, 9 we use symbols $a_i, b_i, c_i, d_i, \ldots$ for the decimal places of each real number.^{Footnote 2}

The missing number can be constructed very easily. We consider Fig. 7.4 as a matrix of digits and focus on the diagonal (the diagonal elements are printed in red). Using the diagonal we form a new real number of the form $0.z_1 z_2 z_3 z_4\ldots$. As first decimal place $z_1$ of this new real number, a digit must be selected that does not equal $a_1$. The second decimal place must fulfill the inequality $z_2 \neq b_2$, for the third the inequality $z_3 \neq c_3$ must hold, and so on. (To sidestep the ambiguity of decimal representations such as 0.0999… = 0.1000…, one may additionally avoid the digits 0 and 9.) The new real number formed in this way cannot match any of the numbers in our person’s supposedly complete list: it differs from each element of the list in (at least) one decimal place. We have found the missing number!
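
The diagonal construction can be tried out on the first few rows of any claimed list. In the sketch below the rows are hypothetical digit strings of our own choosing, and the rule “use 5 unless the diagonal digit is 5, then use 4” is just one admissible way of picking a different digit.

```python
def diagonal_number(rows):
    """Return digits z_1 z_2 ... with z_i different from the i-th digit of row i."""
    digits = []
    for i, row in enumerate(rows):
        # Using only 4 and 5 also avoids the ambiguity 0.0999... = 0.1000...
        digits.append("4" if row[i] == "5" else "5")
    return "".join(digits)


# Hypothetical first rows of a claimed "complete" list (digits after "0.").
rows = ["14159", "71828", "41421", "57751", "69314"]
z = diagonal_number(rows)
print("0." + z)
```

Whatever rows are supplied, the constructed number differs from the i-th row in its i-th decimal place, so it cannot appear anywhere in the list.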

These considerations show that the set of real numbers cannot be counted. It is said that the real numbers are uncountable. Therefore, it follows that an expression of the form $\sum _{i\in \mathbb R}a_i$ does not represent a mathematically meaningful term: each element i in an index set must have a unique predecessor and a unique successor, a situation impossible for the real numbers $\mathbb R$.

Example 7.2 shows that there exist infinite sets with different cardinalities. The set of real numbers $\mathbb R$ is “larger” than $\mathbb N$, while the set of natural numbers is “as large” as the sets $\mathbb Z$ and $\mathbb Q_+$. In mathematics this is indicated by appropriate symbols. The number of natural numbers is not indicated by the rather fuzzy infinity sign ∞ but by the symbol $\aleph_0$.^{Footnote 3} Since the cardinality of the real numbers is greater than $\aleph_0$, the symbol $\aleph_1$ is used for it here. (Strictly speaking, $\aleph_1$ denotes the smallest cardinality above $\aleph_0$; that the real numbers have exactly this cardinality is the continuum hypothesis, which cannot be decided within standard set theory.)

Concluding Remark

Finally, we would like to draw the reader’s attention to an interesting issue. We have already shown that the set of natural numbers is smaller than the set of real numbers. Instead of the set of natural numbers, one could use their power set ${\mathcal {P}}(\mathbb N)$, i.e., the set of all subsets of the natural numbers. This power set contains the set of all even numbers, the set of all odd numbers, the set of all natural numbers less than 5, and so on. Without presenting the mathematical details, it can be shown that the power set has the same cardinality as the set of real numbers. On page 20 we had made it clear that for a finite set of n elements the number of subsets is just 2ⁿ. This relationship is carried over to the symbols just introduced by writing the following equation:

$$\displaystyle \begin{aligned} 2^{\aleph_0}=\aleph_1. \end{aligned} $$

(7.2)

However, this symbolic notation should not be confused with real arithmetic operations. One must not write $\aleph_0 =\log_2(\aleph_1)$.

What do these considerations tell us? If mathematicians transfer, as in (7.2), a symbolic notation from one subject area to another, one is tempted to use it in all its dimensions. Unfortunately, such practice can be not only wrong but even dangerous. We have already experienced this situation while discussing the notation of Brownian motion.

7.2 Continuous and Almost Nowhere Differentiable Functions

In order to discuss the Brownian motion thoroughly, it is useful to deal with some remarkable features of functions. The paths of Brownian motion are continuous functions which cannot be differentiated at (almost) any point. Anyone wanting to handle such functions properly must recognize that the use of mathematical operations known from ordinary analysis is inadmissible. Compared to ordinary analysis, dealing with Brownian paths can be considered “exotic.”

Non-mathematicians probably cannot imagine continuous functions that are not differentiable (almost) anywhere. We would like to assist this understanding by an example developed by Weierstraß.^{Footnote 4} He also showed that in mathematics such functions are anything but rare. Prior to Weierstraß these functions had been regarded as “monster curves.”^{Footnote 5} It was assumed that these functions were either only special cases or that the points where differentiation is not possible were indeed rare.

Weierstraß considered the function

$$\displaystyle \begin{aligned} w(x)=\sum_{n=0}^\infty \frac{\sin (3^n\, x)}{2^n}. \end{aligned} $$

(7.3)

To give an idea of the appearance of this function, Fig. 7.5 shows the partial sum formed by only the first seven summands of the series (7.3).^{Footnote 6} We concentrate on two characteristics of the Weierstraß function: first its continuity and second its differentiability.

Non-mathematicians state that a function is continuous if one can draw its graph without lifting the drawing pen. Although this is not a precise definition, one may suspect that the Weierstraß function is continuous when looking at Fig. 7.5. A more precise argument gives the same result: the numerator of each fraction is at most 1 in absolute value and the denominator grows exponentially. Therefore, the sum converges for each x. Furthermore, it also converges uniformly. This means that the difference between $\sum _{n=0}^m \frac {\sin (3^n\, x)}{2^n}$ and w(x), which goes to zero, can be bounded independently of x. In such cases the continuity of the summands $\frac {\sin (3^n\, x)}{2^n}$ carries over to the function w(x).

The above considerations do not represent a complete proof but only give an indication of the evidence: the result is intuitively appealing. Looking at the definition of the function w(x) the following observation is decisive. The numerator of each additional summand lies in the interval [−1, 1]. On the other hand, the denominator of each new summand grows exponentially. Hence, each new summand (however it may behave) contributes only marginally to the change of the function value. Therefore, continuity is maintained in the limit.
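
The uniform estimate can be made concrete numerically. The sketch below is only an illustration: the names `w_partial` and `tail_bound`, the grid of x values, and the use of 41 summands as a stand-in for w(x) are our own assumptions. It compares the seven-summand partial sum with this reference and with the geometric bound $\sum_{n>m} 2^{-n} = 2^{-m}$, which does not depend on x.

```python
import math


def w_partial(x: float, m: int) -> float:
    """Partial sum of Eq. (7.3) with the summands n = 0, ..., m."""
    return sum(math.sin(3**n * x) / 2**n for n in range(m + 1))


def tail_bound(m: int) -> float:
    """Geometric bound on the neglected summands n > m; it does not depend on x."""
    return 2.0 ** (-m)


xs = [k / 10 for k in range(-30, 31)]
reference = [w_partial(x, 40) for x in xs]       # stands in for w(x)
worst_gap = max(abs(w_partial(x, 6) - r) for x, r in zip(xs, reference))
print(worst_gap, tail_bound(6))
```

On every grid point the gap stays below the same bound, which is exactly what uniform convergence demands.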

Let us turn to the second characteristic of the function w(x). Weierstraß was able to show that the function cannot be differentiated except for a few values x. While the proof is difficult, one can illustrate the result as follows: differentiating the sum term by term with respect to x one obtains^{Footnote 7}

$$\displaystyle \begin{aligned} \frac{dw(x)}{dx}=\lim_{N\to\infty}\sum_{n=0}^N \left(\frac{3}{2}\right)^n\cos (3^n\, x). \end{aligned} $$

(7.4)

To examine this limit in more detail we first ignore the factor $\left (\frac {3}{2}\right )^n$ and draw several graphs of the function $\cos (3^n x)$ for different n (see Fig. 7.6).

It can easily be seen that the frequency of the cosine function increases with every exponent n. Since the increasingly rapid fluctuations are multiplied by the factor $\left (\frac {3}{2}\right )^n$, their impact on the sum grows with n. Obviously, the sum can only converge for numbers x where the cosine terms approach zero. The zeros of these cosine functions are very thinly scattered.^{Footnote 8} For all other x the sum does not converge, and this represents the default case. Thus, the first derivative of this function exists almost nowhere. This heuristic suggests that the function cannot be differentiated (almost) anywhere.
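
The growth of the summands in (7.4) can be inspected directly. The following sketch uses function names of our own choosing and looks at the point x = 0, where every cosine equals 1 and the partial sums reduce to a geometric series; it illustrates the heuristic and is not a proof of non-differentiability.

```python
import math


def derivative_terms(x: float, n_max: int):
    """Summands (3/2)^n * cos(3^n x) of Eq. (7.4) for n = 0, ..., n_max."""
    return [(1.5**n) * math.cos(3**n * x) for n in range(n_max + 1)]


def derivative_partial_sum(x: float, n_max: int) -> float:
    """Partial sum of the term-by-term derivative up to index n_max."""
    return sum(derivative_terms(x, n_max))


# At x = 0 every cosine equals 1, so the partial sums form a geometric series.
for n_max in (5, 10, 20):
    print(n_max, derivative_partial_sum(0.0, n_max))
```

At x = 0 the partial sums equal $2\left(\left(\frac{3}{2}\right)^{N+1}-1\right)$ and therefore grow without bound as N increases.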

7.3 Convergence Terms

From numerous discussions with students and colleagues we learned that there is considerable interest in looking more closely at the convergence of functions. For sequences of numbers it is irrelevant which precise definition of convergence one uses: all reasonable definitions turn out to be equivalent. However, this is entirely different when dealing with sequences of functions. There are many different ways to define convergence, and they differ fundamentally from one another. While most non-mathematicians can imagine what a sequence of numbers is, a sequence of functions is much harder to picture.

To illustrate this phenomenon we use an analogy. Finding the shortest route from Berlin to San Francisco depends on the way the earth is looked at. Using a conventional map of the world one will conclude that the shortest route between the two cities always stays south of 53° North. However, when using a globe one finds that the shortest route in fact leads via Greenland. The convergence of functions behaves similarly: there is not just one but several ways of defining the convergence of a sequence of functions, and the results depend on the chosen definition.

Convergence is important in the context of limits. To understand the applications, it is useful to realize how proofs are conducted in the theory of Lebesgue integration^{Footnote 9}: if one wants to prove that a certain property or a given proposition applies in general, one can make life easier by first proving the proposition for linear or piecewise linear functions. In order to show the general validity, one has to move from these simple functions to more general ones. To this end one has to consider the limit of a sequence of functions. A proposition applying to each (piecewise linear or simple) element of a function sequence should then also apply to the limit of this sequence and thus to a general function. For this to work, it must not matter whether one integrates first and subsequently passes to the limit or vice versa. Integration and limit must be interchangeable:

$$\displaystyle \begin{aligned} \lim_n \int_\Omega \stackrel{!}{=} \int_\Omega \lim_n. \end{aligned} $$

(7.5)

Let us look at random variables as an example of functions. For random variables expectation and variance are (Lebesgue) integrals.^{Footnote 10} From (7.5) it should follow

$$\displaystyle \begin{aligned} \lim_{n\to\infty} \operatorname*{\mathrm{E}} \left[Z_n\right] \stackrel{!}{=}\operatorname*{\mathrm{E}} \left[\lim_{n\to\infty} Z_n\right] \end{aligned} $$

(7.6)

and

$$\displaystyle \begin{aligned} \lim_{n\to\infty} \operatorname*{\mathrm{Var}} \left[Z_n\right] \stackrel{!}{=} \operatorname*{\mathrm{Var}} \left[\lim_{n\to\infty} Z_n\right]. \end{aligned} $$

(7.7)

Remember that $Z_n$ is a random variable and thus a measurable function.

The above claims deserve two remarks. First, there is an exclamation mark above the equal signs: we need a definition of a limit such that the right and left sides are identical. Only then can limit and expectation, or limit and variance, be swapped. Second, the left sides of Eqs. (7.6) and (7.7) are limits of sequences of numbers since expected values and variances are numbers. The right sides, in contrast, do not contain a sequence of numbers but a sequence of functions. While students of economics know how to determine the limit of a sequence of numbers, they may not know what a sequence of functions is, let alone how to determine its limit.

Before introducing two important concepts of convergence, namely pointwise convergence and mean square convergence,^{Footnote 11} we will start with sequences of numbers.

Sequences of Numbers

In mathematical analysis, a sequence of numbers is said to converge to a limit if the numbers with a sufficiently large index come arbitrarily close to a particular value. For example, if you look at the sequence of numbers

$$\displaystyle \begin{aligned} s_n = a+\frac{1}{n} \qquad \text{with } n=1, 2, \ldots, \end{aligned} $$

(7.8)

we have

$$\displaystyle \begin{aligned} s_1=a+1, \quad s_2=a+\frac{1}{2}, \quad s_3=a+\frac{1}{3}, \end{aligned} $$

(7.9)

and so on. By letting n increase the second summand decreases and approaches zero.^{Footnote 12} For n →∞ the summand can be neglected. Thus, the sequence converges to a which is written as

$$\displaystyle \begin{aligned} \lim_{n\to\infty}s_n=\lim_{n\to\infty}\left(a+\frac{1}{n}\right)=a. \end{aligned} $$

(7.10)

After exploring sequences of numbers we will now concentrate on sequences of functions.

Sequences of Functions

We look at the simple example

$$\displaystyle \begin{aligned} f_n(t) = a+\frac{t}{n}. \end{aligned} $$

(7.11)

With increasing n one obtains

$$\displaystyle \begin{aligned} f_1(t)=a+t, \quad f_2(t)=a+\frac{t}{2}, \quad f_3(t)=a+\frac{t}{3}, \end{aligned} $$

(7.12)

and so on. It seems clear that such a sequence of functions converges and how its limit is determined. In a sequence of numbers the individual values should approach a certain value in the limit. With a sequence of functions it is quite plausible to expect that with increasing n the functions “cling to a limit function.” In the above example the functions $f_n(t)$ approach the limit function f(t) = a. Figure 7.7 illustrates this vividly. With increasing n the influence of the term $\frac {t}{n}$ in Eq. (7.11) becomes less and less significant. The limit function takes the form $\lim_{n\to\infty} f_n(t) = a$.

Pointwise Convergence

Pointwise convergence can be regarded as the “natural” candidate for a definition based on the above example.

Definition 7.1 (Pointwise Convergence)

Consider a sequence of functions of the form $f_n: \Omega \,\rightarrow \,\mathbb {R}$.

A sequence of functions $f_n$ converges pointwise^{Footnote 13} to a function f if and only if the following is valid^{Footnote 14}:

$$\displaystyle \begin{aligned} \lim_{n\to\infty}f_n(\omega)=f(\omega)\qquad \forall\omega\in\Omega\,. \end{aligned} $$

(7.13)

With this definition of convergence, integration and limit can be swapped only under certain conditions.^{Footnote 15}

We will now present an example which demonstrates that the interchangeability of integration and limit can be lost if one uses pointwise convergence: the expected value of the limit does not equal the limit of the expectations.

Let us consider the state space $\Omega =\mathbb {R}$ and a function $f_n$ which is zero on the real line except in the neighborhood of $n\in \mathbb {N}$. The area below the function should be exactly one. Figure 7.8 illustrates such a function, which shows a rectangle at index n. With increasing index the rectangle moves off to infinity.^{Footnote 16}

We look at this sequence of functions and apply the definition of pointwise convergence. Doing so we will show that the limit of this sequence is zero although the rectangle neither changes its form nor disappears. This might be surprising.

The functions $f_n$ converge pointwise to zero: consider a fixed value t. For t the following applies

$$\displaystyle \begin{aligned} \lim_{n\to\infty}f_n(t)=0\,, \end{aligned} $$

(7.14)

because the rectangle around n eventually lies entirely to the right of t. This is why the following must hold:

$$\displaystyle \begin{aligned} \lim_{n\to\infty}f_n(t)=0\quad \Longrightarrow \quad \int_{-\infty}^\infty \lim_{n\to\infty}f_n(t)\,dt=0. \end{aligned} $$

(7.15)

On the other hand, the area under each function is 1 and therefore

$$\displaystyle \begin{aligned} \int_{-\infty}^\infty f_n(t)\,dt=\int_{n-\frac{1}{2}}^{n+\frac{1}{2}} f_n(t)\,dt=n+\frac{1}{2}-\left(n-\frac{1}{2}\right)=1, \end{aligned} $$

(7.16)

and consequently

$$\displaystyle \begin{aligned} \lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(t)\,dt=\lim_{n\to\infty} 1 = 1. \end{aligned} $$

(7.17)

Equations (7.15) and (7.17) show that one must not interchange integration and limit in the sequence of functions considered here. This conclusion can be expressed as

$$\displaystyle \begin{aligned} \lim_n\int \neq \int\lim_n. \end{aligned} $$

(7.18)
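
The moving rectangle can also be examined numerically. The sketch below rests on our own assumptions: the rectangle has height 1 on the interval from n − 1/2 to n + 1/2, and the real line is replaced by the finite window from −5 to 105 with a midpoint Riemann sum; the function names are hypothetical.

```python
def f(n: int, t: float) -> float:
    """Rectangle of height 1 on [n - 1/2, n + 1/2] and zero elsewhere (assumed shape)."""
    return 1.0 if n - 0.5 <= t <= n + 0.5 else 0.0


def integral(n: int, lo: float = -5.0, hi: float = 105.0, steps: int = 110_000) -> float:
    """Midpoint Riemann sum of f_n over [lo, hi], a finite stand-in for the real line."""
    h = (hi - lo) / steps
    return h * sum(f(n, lo + (k + 0.5) * h) for k in range(steps))


t = 3.2
# Pointwise view: for the fixed t the values vanish as soon as n > t + 1/2.
print([f(n, t) for n in range(1, 8)])
# Integral view: the area stays (numerically) 1 for every index inside the window.
print([round(integral(n), 6) for n in (1, 10, 100)])
```

For a fixed t the sequence of values is eventually zero, while the area under each single function remains 1: the two sides of (7.18) really differ.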

For the reasons described above such a result is useless. We must therefore note that pointwise convergence is not an appropriate concept. Rather, it is advisable to find another concept of convergence which permits the interchangeability of integration and limit.

Mean Square Convergence

This concept of convergence^{Footnote 17} is used to ensure that integration (in particular, forming expected value and variance) and limit can be interchanged. To this end we assume a measure space $(\Omega \,, {\mathcal {F}},\, \mu )$ and a sequence of measurable functions $f_n$.

Mean square convergence measures the difference between a function of the sequence and the limit. The sequence is said to converge in mean square if the expectation of the squared difference goes to zero; as argued further below, this forces both the expectation and the variance of the difference to go to zero. The formal definition reads as follows.

Definition 7.2 (Mean Square Convergence)

A sequence of measurable functions $f_n$ converges in mean square to a function f

$$\displaystyle \begin{aligned} \lim_{n\to\infty}f_n=f\,, \end{aligned} $$

(7.19)

if and only if

$$\displaystyle \begin{aligned} \lim_{n\to\infty}\int_\Omega \left|f_n(\omega)-f(\omega)\right|{}^2\,d\mu(\omega)=0 \end{aligned} $$

(7.20)

applies.

We will show that mean square convergence ensures that integration and limit can be interchanged. For this we concentrate again on a probability measure, i.e., we consider random variables. We use the definition of mean square convergence and rely on the identity (5.36). Assume $\lim_{n\to\infty} f_n = f$ in mean square. Thus we get from (7.20)

$$\displaystyle \begin{aligned} 0=\lim_{n\to\infty} \operatorname*{\mathrm{E}}\left[(f_n-f)^2\right]=\lim_{n\to\infty}\left( \operatorname*{\mathrm{Var}}[f_n-f]+\left(\operatorname*{\mathrm{E}}[f_n-f]\right)^2\right). \end{aligned} $$

(7.21)

Since neither of the two summands can be negative, both $\lim _{n\to \infty } \operatorname *{\mathrm {Var}}[f_n-f]=0$ and $\lim _{n\to \infty } \left (\operatorname *{\mathrm {E}}[f_n-f]\right )^2=0$ apply. If the squared expectation goes to zero, $\lim _{n\to \infty } \operatorname *{\mathrm {E}}[f_n-f]=0$ must hold. The expectation is linear, and therefore $\lim _{n\to \infty } \operatorname *{\mathrm {E}}[f_n]= \operatorname *{\mathrm {E}}[f]$ is true. Thus $\lim _{n\to \infty } \operatorname *{\mathrm {E}}[f_n]= \operatorname *{\mathrm {E}}[\lim _{n\to \infty }f_n]$. That was what we had to show.
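
The split of the mean square into a variance and a squared expectation, on which this argument rests, can be checked exactly on a small example. Everything in the following sketch is a hypothetical illustration of our own: a four-point state space with equal probabilities, the limit f(ω) = ω, and the sequence f_n(ω) = ω + (ω + 1)/n.

```python
from fractions import Fraction

OMEGA = [0, 1, 2, 3]                       # hypothetical finite state space
P = {w: Fraction(1, 4) for w in OMEGA}     # equal probabilities (assumption)


def expectation(g):
    """Expected value of the random variable g on the finite space."""
    return sum(P[w] * g(w) for w in OMEGA)


def variance(g):
    """Variance of g, computed as the expected squared deviation from the mean."""
    mean = expectation(g)
    return expectation(lambda w: (g(w) - mean) ** 2)


def f(w):
    return Fraction(w)


def f_n(n):
    return lambda w: Fraction(w) + Fraction(w + 1, n)


def mean_square_gap(n):
    """E[(f_n - f)^2], the quantity that must go to zero."""
    return expectation(lambda w: (f_n(n)(w) - f(w)) ** 2)


for n in (1, 10, 100):
    diff = lambda w, n=n: f_n(n)(w) - f(w)
    print(n, mean_square_gap(n), variance(diff) + expectation(diff) ** 2)
```

Both printed columns coincide for every n, and they shrink like 1/n². Since each part of the split is nonnegative, the expectation of the difference is forced to zero as well.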

7.4 Conditional Expectations Are Random Variables

Finally, we want to draw the reader’s attention to an aspect of conditional expectations that goes back to Kolmogoroff.^{Footnote 18} So far we have understood a conditional expectation as a real number that refers to an event A (the condition).^{Footnote 19} The expectation depends on this event A. If we choose a different event, a different expectation will usually result. Therefore, Kolmogoroff proposed that the conditional expectation should be interpreted as a random variable.^{Footnote 20}

To understand this idea we need to remember how we had defined random variables. We wanted to perceive them as functions of elementary events. On page 15 we have shown that a random variable X can be characterized as a function

$$\displaystyle \begin{aligned} X: \Omega\,\rightarrow\, \mathbb{R} \end{aligned} $$

(7.22)

with its conditional expectation

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{E}}[X| {\mathcal{F}}]: \Omega\,\rightarrow\, \mathbb{R} \end{aligned} $$

(7.23)

also being interpreted as a random variable. The following two examples will help to better understand this concept.

Example 7.3 (Binomial Model)

With Table 7.1 we refer to Example 5.6 from page 15. While the first column of this table shows the states, the second column represents the cash flows $CF_3$. The conditional expectation (at time t = 2) is given in the third column.

Table 7.1 States, cash flows $CF_3$, and conditional expectations of the cash flows in the binomial model of Example 5.6


The σ-algebra ${\mathcal {F}}_2$ corresponds to the set of information that the decision-maker assumes today he will have available at the time t = 2. On the basis of this information the decision-maker forms his expectations. In Table 7.1 we have grouped by parentheses those states that cannot be discriminated at time t = 2. Let us call the combination of two such states a “box.” At time t = 2 he only knows which box he will be in but he cannot discriminate the states within the box.

Example 7.3 demonstrates the following: if a specific elementary event ω is given, ω and other elementary events are combined into a set A (the above-mentioned “box”). The set A contains exactly those elementary events that the decision-maker cannot discriminate from ω on the basis of his information. In this example he is able to observe the uu node at t = 2 but does not (yet) know whether the state uuu or uud will occur at t = 3. The conditional expected value $ \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]$ assigns the real number $ \operatorname *{\mathrm {E}}[X|A]$ to the elementary event ω. To determine the conditional expected values, the payments associated with the elementary events in A are weighted with their respective probabilities of occurrence, normalized by the probability of A.
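
The “box” logic can be written down in a few lines. Please note that the numbers below are purely hypothetical: the cash flows and the equal state probabilities are illustrative assumptions of ours and are not the values of Table 7.1; the function names are likewise our own.

```python
from fractions import Fraction

# Hypothetical three-period model: the cash flows CF_3 and the equal
# probabilities 1/8 are illustrative assumptions, not the values of Table 7.1.
CF3 = {"uuu": 133, "uud": 110, "udu": 110, "udd": 91,
       "duu": 110, "dud": 91, "ddu": 91, "ddd": 75}
PROB = {state: Fraction(1, 8) for state in CF3}


def box(state: str) -> list:
    """States that cannot be told apart from `state` at t = 2 (same first two moves)."""
    return [s for s in CF3 if s[:2] == state[:2]]


def conditional_expectation(state: str) -> Fraction:
    """Value of E[CF_3 | F_2] at the elementary event `state`."""
    members = box(state)
    mass = sum(PROB[s] for s in members)
    return sum(PROB[s] * CF3[s] for s in members) / mass


for state in CF3:
    print(state, box(state), conditional_expectation(state))
```

The function `conditional_expectation` takes an elementary event as its argument and returns a number, which is exactly what makes the conditional expectation a random variable. It is constant within each box.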

Example 7.4 (Share Price)

To further deepen our reflections we consider a state space Ω = [0, 1]. Each real number ω ∈ [0, 1] represents an elementary event. If we choose the Lebesgue measure^{Footnote 21} λ with the corresponding σ-algebra, a probability space is generated since λ(Ω) = 1 holds.

Let us consider the random variable

$$\displaystyle \begin{aligned} X(\omega)=\omega^2. \end{aligned} $$

(7.24)

With the elementary event $\omega =\frac {1}{2}$ the random variable assumes the value $X(\omega )=\frac {1}{4}$. We present the graph of this random variable in Fig. 7.9 as a dashed curve.

Let us determine the conditional expectation for the following σ-algebra

$$\displaystyle \begin{aligned} {\mathcal{F}}=\left\{ \emptyset,\; \left[0, \frac{1}{2}\right), \;\left[\frac{1}{2}, 1\right],\; [0, 1] \;\right\}. \end{aligned} $$

(7.25)

In this case the decision-maker cannot tell with certainty which specific elementary event ω ∈ [0, 1] is present; instead he receives only the information whether the elementary event is greater or less than $\frac {1}{2}$.^{Footnote 22} This is all he knows. What is the conditional expectation of the random variable X?

Concentrating on the first subinterval we get according to (5.37)^{Footnote 23} a conditional expectation of

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{E}}\left[X|\omega<\frac{1}{2}\right]=\frac{1}{\frac{1}{2}}\int_{0}^{\frac{1}{2}}\omega^2\,d\lambda(\omega)=2\left[\frac{\omega^3}{3}\right]_{0}^{\frac{1}{2}}=\frac{1}{12} \end{aligned} $$

(7.27)

and for the second subinterval

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{E}}\left[X|\omega\ge\frac{1}{2}\right]=\frac{1}{\frac{1}{2}}\int_{\frac{1}{2}}^{1}\omega^2\,d\lambda(\omega)=2\left[\frac{\omega^3}{3}\right]_{\frac{1}{2}}^{1}=\frac{7}{12}\,. \end{aligned} $$

(7.28)

Thus, we can present the conditional expectation simply by

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{E}}[X|{\mathcal{F}}]=\begin{cases} \frac{1}{12}, & \text{if } \omega\in \left[0,\,\frac{1}{2}\right)\,, \\ & \\ \frac{7}{12}, & \text{if } \omega \in \left[\frac{1}{2},\,1\right]. \end{cases} \end{aligned} $$

(7.29)

Figure 7.9 shows the conditional expectation, which is a piecewise constant function with a jump at $\omega =\frac {1}{2}$.

As before we recognize the idea of conditional expectation. Beginning with an elementary event ω one must first determine the smallest set A which is part of the σ-algebra ${\mathcal {F}}$ and also includes ω. The conditional expectation $ \operatorname *{\mathrm {E}}[X|A]$ is calculated using Eq. (5.37) and represents the value of the random variable $ \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]$ at ω.
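
The two values in (7.29) can be reproduced with exact fractions. The sketch below is a minimal illustration with function names of our own choosing; it uses the antiderivative ω³/3 of X(ω) = ω² and divides by the Lebesgue measure of the subinterval that contains ω.

```python
from fractions import Fraction


def integral_of_square(a: Fraction, b: Fraction) -> Fraction:
    """Exact integral of omega**2 over [a, b] via the antiderivative omega**3 / 3."""
    return (b**3 - a**3) / 3


def conditional_expectation(omega: Fraction) -> Fraction:
    """Value of E[X | F] at omega for X(omega) = omega**2 and the two subintervals."""
    half = Fraction(1, 2)
    a, b = (Fraction(0), half) if omega < half else (half, Fraction(1))
    return integral_of_square(a, b) / (b - a)   # divide by lambda(A) = b - a


print(conditional_expectation(Fraction(1, 4)), conditional_expectation(Fraction(3, 4)))
```

Averaging the two values with the weights 1/2 each gives 1/3, the unconditional expectation of X, as one would expect.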

Finally, let us present the following rules for calculating with conditional expectations.

Expected value of known quantities:

If $X \in {\mathcal {F}}$ (it is also said that X is ${\mathcal {F}}$-measurable), then $ \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]=X$ applies.

In order to illustrate this rule imagine having to determine the conditional expectation of an uncertain quantity X(ω). However, the situation is such that the value of X can be derived directly from the available information ${\mathcal {F}}$. Thus the quantity is not really uncertain, and its conditional expectation is the quantity itself.

Further, if Z is ${\mathcal {F}}$-measurable and bounded, then $ \operatorname *{\mathrm {E}}[Z\cdot X|{\mathcal {F}}]=Z\cdot \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]$ holds.

Linearity :

For any numbers a, b the following is true: $ \operatorname *{\mathrm {E}}[aX+bY|{\mathcal {F}}]=a \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]+b \operatorname *{\mathrm {E}}[Y|{\mathcal {F}}]\,$.

Since the conditional expectation represents a generalization of the classic (unconditional) expectation, the property of linearity remains valid. That is the substance of this theorem.

Monotonicity:

If X ≥ 0, then $ \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]\ge 0$ applies.

Since probabilities are nonnegative the expected value of nonnegative variables remains nonnegative. This applies to conditional expectations as well.

Limit almost everywhere:

If $X_n$ is a monotonically increasing sequence of random variables which converges to X almost everywhere and if X has a finite expectation, then $\lim _{n\to \infty } \operatorname *{\mathrm {E}}[X_n|{\mathcal {F}}]= \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]$ holds.

We emphasized in Sect. 7.3 that the interchangeability of limit and expectation is of considerable importance in probability theory. This is one of the strengths of the concept of conditional expectation. Under certain conditions limit and expectation can be swapped when the sequence converges almost everywhere.
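A numerical sketch may make the rule more tangible. It rests on assumptions that are not part of the text: ω is taken to be uniformly distributed on [0, 1], the σ-algebra has the two atoms [0, 1/2) and [1/2, 1] from the example of Eq. (7.29), and the hypothetical sequence is $X_n(\omega )=\sum _{k=1}^{n}\omega ^k/k!$, which increases monotonically to $X(\omega )=e^{\omega }-1$ for ω ≥ 0.

```python
import math

ATOMS = [(0.0, 0.5), (0.5, 1.0)]  # atoms of F, as in the example of Eq. (7.29)

def cond_exp_partial_sum(n, a, b):
    """E[X_n | A] on an atom A from a to b: each term w**k / k! is integrated
    in closed form and the result is divided by the length of A."""
    total = sum((b**(k + 1) - a**(k + 1)) / ((k + 1) * math.factorial(k))
                for k in range(1, n + 1))
    return total / (b - a)

def cond_exp_limit(a, b):
    """E[X | A] for the limit X(w) = exp(w) - 1."""
    return (math.exp(b) - math.exp(a)) / (b - a) - 1.0

for a, b in ATOMS:
    values = [cond_exp_partial_sum(n, a, b) for n in (1, 2, 5, 10, 20)]
    print(values, "->", cond_exp_limit(a, b))
```

On each atom the printed conditional expectations grow with n and approach the conditional expectation of the limit, as the rule states.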

Iterated expectation:

If ${\mathcal {F}}\subset {\mathcal {G}}$, then $ \operatorname *{\mathrm {E}}[ \operatorname *{\mathrm {E}}[X|{\mathcal {G}}]|{\mathcal {F}}]= \operatorname *{\mathrm {E}}[X|{\mathcal {F}}]$.

If iterated conditional expectations are to be calculated, the inner conditioning on the finer σ-algebra ${\mathcal {G}}$ can be omitted: only the expectation conditional on the coarser σ-algebra ${\mathcal {F}}$ remains.
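The rule of iterated expectation can be verified on a finite example. The sketch below is an illustration under assumptions of our own choosing: a fair die serves as probability space, ${\mathcal {F}}$ is generated by a coarse partition, ${\mathcal {G}}$ by a finer one, and X is the squared score. The helper `cond_exp` and the partition names are hypothetical.

```python
from fractions import Fraction

OMEGA = [1, 2, 3, 4, 5, 6]                  # scores of a fair die
PROB = {w: Fraction(1, 6) for w in OMEGA}
COARSE = [{1, 2, 3}, {4, 5, 6}]             # atoms of F
FINE = [{1}, {2, 3}, {4, 5}, {6}]           # atoms of G; each atom of F is a union of these

def cond_exp(X, partition):
    """Return the conditional expectation of X as a dict omega -> value,
    given the sigma-algebra generated by `partition`."""
    result = {}
    for atom in partition:
        p_atom = sum(PROB[w] for w in atom)
        mean = sum(X[w] * PROB[w] for w in atom) / p_atom
        for w in atom:
            result[w] = mean
    return result

X = {w: Fraction(w * w) for w in OMEGA}

inner = cond_exp(X, FINE)            # E[X | G]
iterated = cond_exp(inner, COARSE)   # E[ E[X | G] | F ]
direct = cond_exp(X, COARSE)         # E[X | F]

print(iterated == direct)            # True
print(direct[1], direct[6])          # 14/3 77/3
```

Averaging first over the fine atoms and then over the coarse ones gives the same result as averaging over the coarse atoms directly, because every coarse atom is a union of fine atoms.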

Notes

1.
The idea of this proof goes back to the founder of set theory, Georg Ferdinand Ludwig Philipp Cantor (1845–1918, German mathematician).
2.
Without loss of generality we can ignore all digits before the decimal point.
3.
The symbol ℵ is the first letter of the Hebrew alphabet and is pronounced aleph.
4.
Karl Theodor Wilhelm Weierstraß (1815–1897, German mathematician). In 1872 Weierstraß introduced this function in a lecture and claimed that Riemann had knowledge of such an example. However, no such reference has been found in Riemann’s estate. Around 1830 Bolzano had already found the first example of a function that is differentiable almost nowhere; it appeared in a manuscript that was published only in 1922.
5.
The French mathematician Charles Hermite (1822–1901) wrote in 1893 in a letter to Stieltjes: “I avert myself with horror and shock from this lamentable plague of functions that have no derivative at all.”
6.
The picture does not change very much if additional summands are added, although the approximation error is reduced.
7.
We will interchange differentiation and infinite summation in our calculation, which is mathematically inadmissible under these circumstances. The following argument therefore does not constitute a full proof.
8.
The set of those x has Lebesgue measure zero.
9.
See page 3 ff.
10.
See page 12.
11.
In addition to these two types of convergence, there exist in mathematics a few other definitions that will not be discussed here.
12.
One easily realizes that, for example, the sequence $s_n = (-1)^n$ does not converge with increasing n. Such sequences are called divergent.
13.
The noun is “pointwise convergence,” and the verb is “to converge pointwise.”
14.
The definition is easy to interpret: it is required here that for each value ω the sequence $f_n(\omega )$ converges to the number f(ω). So you concentrate on each value f(ω) and ignore the values f(ω ± δ) “next to it” when considering convergence.
15.
Sufficient conditions are formulated in the theorem of monotone convergence. The theorem is due to Beppo Levi and can be found in any textbook on measure theory, for example, Rudin (1976), theorem 11.28.
16.
For example, consider $f_3(t)$ and $f_1(t)$. At t = 1 we have $f_3(t) = 0$ and $f_1(t) = 1$ and thus $f_3(t) \not \ge f_1(t)$.
17.
In the literature mean square convergence is also labeled as $L^2$-convergence.
18.
Andrei Nikolayevich Kolmogoroff (1903–1987), Russian mathematician.
19.
See page 12 ff.
20.
See Kolmogoroff (1933), page 41 ff.
21.
See page 53.
22.
For mathematical reasons, the second set in the σ-algebra must be a half-open interval. If we added the set $[0,\frac {1}{2}]$ to the σ-algebra, the intersection
$$\displaystyle \begin{aligned} \left[0,\frac{1}{2}\right]\cap \left[\frac{1}{2},1\right]=\left\{\frac{1}{2}\right\} \end{aligned} $$
(7.26)

would also be measurable and the decision-maker could determine whether the state $\omega =\frac {1}{2}$ has occurred. But that would be more than we wanted to assume.
23.
See page 15.

References

Kolmogoroff AN (1933) Grundbegriffe der Wahrscheinlichkeitsrechnung. Ergebnisse der Mathematik und ihrer Grenzgebiete, Springer, Berlin
Rudin W (1976) Principles of mathematical analysis, 3rd edn. McGraw-Hill, New York


Author information

Authors and Affiliations

Department of Finance, Accounting & Taxation, Free University of Berlin, Berlin, Germany
Andreas Löffler & Lutz Kruschwitz

Authors

Andreas Löffler
Lutz Kruschwitz

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


About this chapter

Cite this chapter

Löffler, A., Kruschwitz, L. (2019). Supplements. In: The Brownian Motion. Springer Texts in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-20103-6_7


DOI: https://doi.org/10.1007/978-3-030-20103-6_7
Published: 03 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20102-9
Online ISBN: 978-3-030-20103-6
eBook Packages: Economics and Finance (R0)
