# Probability Theory and Random Processes

Marcelo S. Alencar
Valdemar C. da Rocha Jr.

## Abstract

The  theory of sets, in its more general form, began with Georg Cantor (1845–1918)  in the nineteenth century. Cantor established the basis for this theory and demonstrated some of its most important results, including the concept of set cardinality. Cantor was born in St. Petersburg, Russia, but lived most of his life in Germany (Boyer 1974).  The ideas relative to the notions of universal set, empty set, set partition, discrete systems, continuous systems and infinity are, in reality, as old as philosophy itself.

## 2.1 Set Theory, Functions, and Measure


In the time of Zenon, one of the most famous pre-Socratic philosophers, the notion of infinity was already under discussion. Zenon, considered the creator of the dialectic, was born in Elea, Italy, around 504 B.C. and defended Parmenides, his master, against criticism from the followers of Pythagoras. Pythagoras was born in Samos around 580 B.C. and created a philosophical current based on the quantification of the universe.

For the Pythagoreans, unity itself is the result of a being and of a not being. It can be noticed that the concept of emptiness is expressed in this sentence, as well as the concept of a universal set. The Pythagoreans established an association between the number one and the point, between the number two and the line, between the number three and the surface, and between the number four and the volume (de Souza 1996). Although they grasped the notion of emptiness, the Greeks still did not have the concept of zero.

Zenon, in turn, defended Parmenides’ idea of a unique being, continuous and indivisible, against the multiple, discontinuous, and divisible being of Pythagoras. Aristotle presents several of Zenon’s arguments relative to movement, with the objective of establishing the concept of a continuum.

Aristotle was born in Estagira, Macedonia, in the year 384 B.C. The first of Aristotle’s arguments suggests the idea of an infinite set, to be discussed later (Durant 1996). This argument is as follows: If a mobile object has to cover a given distance, in a continuous space, it must then cover first the first half of the distance before covering the whole distance.

Reasoning in this manner, Zenon’s argument implies that infinitely many segments of distance cannot be successively covered in a finite time. The expository logic in Aristotle’s counterargument is impressive (de Souza 1996):

In effect, length and time and in general all continua are called infinite in two senses, whether meaning division or whether with respect to extremes. No doubt, infinities in quantity cannot be traversed in a finite time; but infinities in division can, since time itself is also infinite in this manner. As a consequence, it is in an infinite time, and not in a finite time, that one can cross the infinite and, if infinities are touched, they are touched by infinities and not by finites.

Despite the reflections of pre-Socratic philosophers and of others that followed, no one had managed to characterize the infinite until 1872. In that year, J. W. R. Dedekind (1831–1916) pointed out the universal property of infinite sets, which has found applications as far afield as the study of fractals (Boyer 1974):

A system S is called infinite when it is similar to a part of itself. On the contrary, S is said to be finite.

Cantor also recognized this fundamental property of infinite sets but, differing from Dedekind, he noticed that not all infinite sets are equal. This notion originated the cardinal numbers, to be covered later, which establish a hierarchy of infinite sets in accordance with their respective powers. Cantor’s results led him to establish set theory as a fully developed subject. As a consequence of his results on transfinite arithmetic, too advanced for his time, Cantor suffered attacks from mathematicians like Leopold Kronecker (1823–1891), who blocked him from obtaining a position at the University of Berlin.
Cantor spent most of his career at the smaller University of Halle, in a city of medieval aspect with the same name in Germany, famous for its rock salt mines, and died there in an institution for people with mental health problems. However, his theory received from David Hilbert (1862–1943), one of the greatest mathematicians of the twentieth century, the following citation (Boyer 1974):

The new transfinite arithmetic is the most extraordinary product of human thought, and one of the most beautiful achievements of human activity in the domain of the purely intelligible.

Set Theory

The notion of a set is of fundamental importance and is axiomatic, in the sense that a set does not admit a problem-free definition, i.e., a definition that does not resort to the notion of a set itself. The mathematical concept of a set can serve as a foundation for all known mathematics.

Set theory will be developed based on a set of axioms, called fundamental axioms: Axiom of Extension, Axiom of Specification, Peano’s Axioms, Axiom of Choice, besides Zorn’s Lemma, and Schröder–Bernstein’s Theorem (Halmos 1960).

The objective of this section is to develop the theory of sets in an informal manner, just quoting these fundamental axioms, since this theory  is used as a basis for establishing a probability measure. Some examples of common sets are given next.

• The set of faces of a coin: $$A = \{ C_H, C_T \}$$;

• The binary set: $$B = \{ 0, 1 \}$$;

• The set of natural numbers: $$N = \{ 1, 2, 3, \dots \}$$;

• The set of integer numbers: $$Z = \{ \dots , -2, -1, 0, 1, 2, \dots \}$$;

The most important relations in set theory are the belonging relation, denoted as $$a \in A$$, in which a is an element of the set A, and the inclusion relation, $$A \subset B$$, which is read “A is a subset of the set B,” or B is a superset of the set A.

Sets may be specified by means of propositions, as, for example, “The set of students that return their homework on time,” or more formally $$A = \{ a\, | \, a$$ returns the homework on time$$\}$$. In some cases, this is a way of denoting the empty set! Incidentally, the empty set can be written formally as $$\emptyset = \{ a\, | \, a \ne a \}$$, i.e., the set whose elements are not equal to themselves.

The  notion of a universal set is of fundamental interest. A universal set is understood as that set which contains all other sets of interest. An example of a universal set is provided by the sample space in probability theory, usually denoted as S or $$\Omega$$. The empty set is that set which contains no element and which is usually denoted as $$\emptyset$$ or $$\{\ \}$$. It is implicit that the empty set is contained in any set, i.e., that $$\emptyset \subset A$$, for any given set A. However,  the empty set is not in general an element of any other set.

A practical way of representing sets is by means of   the Venn diagram, as illustrated in Fig. 2.1.
Two  sets are said to be disjoint if they have no element in common, as illustrated in Fig. 2.2. Thus, for example, the set of even natural numbers and the set of odd natural numbers are disjoint.
Operations on Sets
• The operation $$\overline{A}$$ represents the complement of A with respect to the sample space $$\Omega$$;

• The subtraction of sets, denoted $$C = A - B$$, gives as a result the set the elements of which belong to A and do not belong to B.

Note: $$A - B$$ = $$A \cap \overline{B}$$;

• The set of elements belonging to A or to B, but not belonging to $$(A\,\cap \,B)$$, is specified by the symmetric difference $$A\,\Delta \,B = A\,\cup \,B - A\,\cap \,B$$.

The generalization of these concepts to families of sets, as, for example, $$\cup _{i=1}^{N}A_{i}$$ and $$\cap _{i=1}^{N}A_{i}$$, is immediate.  The following properties are usually employed as axioms in developing the theory of sets (Lipschutz 1968).

• Idempotent

$$A\,\cup \,A\,=\,A , \qquad A\,\cap \,A\,=\,A$$.

• Associative

$$(A\,\cup \,B)\,\cup \,C\,=\,A\,\cup \,(B\,\cup \,C) , \qquad (A\,\cap \,B)\,\cap \,C\,=\,A\,\cap \,(B\,\cap \,C)$$.

• Commutative

$$A\,\cup \,B\,=\,B\,\cup \,A , \qquad A\,\cap \,B\,=\,B\,\cap \,A$$.

• Distributive

$$A\,\cup \,(B\,\cap \,C)\,=\,(A\,\cup \,B)\,\cap \,(A\,\cup \,C)$$ ,

$$A\,\cap \,(B\,\cup \,C)\,=\,(A\,\cap \,B)\,\cup \,(A\,\cap \,C)$$.

• Identity

$$A\,\cup \,\emptyset \,=\,A , \qquad A\,\cap \,U\,=\,A$$

$$A\,\cup \,U\,=\,U , \qquad A\,\cap \,\emptyset \,=\,\emptyset$$.

• Complementary

$$A\,\cup \, \overline{A}\,=\,U , \qquad A\,\cap \, \overline{A}\,=\,\emptyset$$        $$\overline{( \overline{A})}\,=\,A$$

$$\overline{U}\,=\,\emptyset , \qquad \overline{\emptyset }\,=\,U$$.

• de Morgan laws

$$\overline{A\,\cup \,B}\,=\, \overline{A}\,\cap \, \overline{B} , \qquad \overline{A\,\cap \,B}\,=\, \overline{A}\,\cup \, \overline{B}$$.
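Identities like these can be verified mechanically on small finite sets. The sketch below uses Python’s built-in set type to check several of the listed properties; the particular sets and the universal set U are arbitrary illustrative choices.

```python
# A mechanical check of some of the listed identities on small finite sets,
# with U standing in for the universal set; the sets are arbitrary examples.
U = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}
C = {3, 5, 7}

def comp(S):
    """Complement of S with respect to U."""
    return U - S

# Idempotent, commutative, associative
assert A | A == A and A & A == A
assert A | B == B | A and A & B == B & A
assert (A | B) | C == A | (B | C)

# Distributive
assert A | (B & C) == (A | B) & (A | C)
assert A & (B | C) == (A & B) | (A & C)

# Complementary and de Morgan
assert A | comp(A) == U and A & comp(A) == set()
assert comp(A | B) == comp(A) & comp(B)
assert comp(A & B) == comp(A) | comp(B)

print("all set identities verified")
```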

Families of Sets

Among  the most interesting families of sets, it is worth mentioning an increasing sequence of sets, such that $$\lim _{i\rightarrow \infty }\,\cup \,A_{i} = A$$, as shown in Fig. 2.3. This sequence is used in proofs of limits over sets.
A decreasing sequence of sets is defined in a similar manner with $$\lim _{i\rightarrow \infty }\,\cap \,A_{i} = A$$, as shown in Fig. 2.4.

Indexing

The Cartesian product is a way of expressing the idea of indexing of sets. The indexing of sets expands the possibilities for their use, allowing one to produce entities known as vectors and signals.

Application 1: Consider $$B_i = \{0,1\}$$. Starting from this set, it is possible to construct an indexed sequence of sets by defining its indexing: $$\{B_{i}\}_{i \in I}$$, $$I=\{0,\ldots ,7\}$$. This family of indexed sets $$B_{i}$$ constitutes a finite discrete sequence, i.e., a vector. The ASCII set of binary symbols is an example.

Application 2: Consider $$B_i = \{-A,A\}$$, but now let $$I=Z$$ (the set of positive and negative integers plus zero). It follows that $$\{B_{i}\}_{i \in Z}$$ represents an infinite sequence of $$-A$$’s and A’s, i.e., it represents a binary digital signal. For example, Fig. 2.5 represents a signal that is discrete in both amplitude and time.

Application 3: Still letting $$B_i = \{-A,A\}$$, but considering the indexing over the set of real numbers, $$\{B_{i}\}_{i \in I}$$, in which $$I = R$$, a signal is formed which is discrete in amplitude but continuous in time, such as the telegraph signal in Fig. 2.6.

Application 4: Finally, consider $$B = R$$ and $$I = R$$. In this manner, the result is an analog signal, i.e., continuous in time and in amplitude, as shown in Fig. 2.7.

Algebra of Sets

In  order to construct an algebra of sets or, equivalently, to construct a field over which operations involving sets make sense, a few properties have to be obeyed.

1. If $$A\,\in \,\mathcal {F}$$ then $$\overline{A}\,\in \,\mathcal {F}$$. A is the set containing desired results, or over which one wants to operate;

2. If $$A\,\in \,\mathcal {F}$$ and $$B\,\in \,\mathcal {F}$$ then $$A\,\cup \,B\ \ \in \, \mathcal {F}$$.

The above properties guarantee the closure of the algebra with respect to finite operations over sets. It is noticed that the universal set $$\Omega$$ always belongs to the algebra, i.e., $$\Omega \in \mathcal {F}$$, because $$\Omega = A \cup \overline{A}$$. The empty set also belongs to the algebra, i.e., $$\emptyset \in \mathcal {F}$$, since $$\emptyset = \overline{\Omega }$$, by property 1.

Example: The family $$\{\emptyset , \,\Omega \}$$ complies with the above properties and therefore represents an algebra. In this case $$\emptyset = \{\}$$ and $$\overline{\emptyset } = \Omega$$. Closure with respect to the union operation can also be easily checked.

Example: Given the sets $$\{C_{H}\}$$ and $$\{C_{T}\}$$, representing the faces of a coin, respectively, if $$\{C_H\} \in \mathcal {F}$$ then $$\{\overline{C_{H}}\} = \{C_{T}\}\in \mathcal {F}$$. It follows that $$\{C_{H},C_{T}\} \in \mathcal {F}$$ $$\Rightarrow \Omega \in \mathcal {F}$$ $$\Rightarrow \emptyset \in \mathcal {F}$$.

The previous example can be translated by the following expression. If there is a measure for heads, then there must be also a measure for tails, in order for the algebra to be properly defined. Whenever a probability is assigned to an event, then a probability must also be assigned to the complementary event.

The  cardinality of a finite set is defined as the number of elements belonging to this set. Sets with an infinite number of elements are said to have the same cardinality if they are equivalent, i.e., $$A \sim B$$ if $$\sharp A = \sharp B$$. Some examples of sets and their respective cardinals are presented next.

• $$I = \{1,\ldots ,k\} \Rightarrow C_{I}=k$$;

• $$N = \{0,1,\ldots \} \Rightarrow C_{N}$$ or $$\aleph _0$$;

• $$Z = \{\ldots ,-2,-1,0,1,2,\ldots \} \Rightarrow C_{Z}$$;

• $$Q = \{\ldots ,-1/3,0,1/3,1/2,\ldots \} \Rightarrow C_{Q}$$;

• $$R = (-\infty ,\infty ) \Rightarrow C_{R}$$ or $$\aleph$$.

For the above examples, the following relations are verified: $$C_{R}>C_{Q}=C_{Z}=C_{N}>C_{I}$$. The notation $$\aleph _0$$, for the cardinality of the set of natural numbers, was employed by Cantor.

The cardinality of the power set, i.e., of the family of sets consisting of all subsets of a given set I, $$\mathcal {F}$$ = $$2^{I}$$, is $$2^{C_I}$$.

Borel Algebra

The  Borel algebra $$\mathcal {B}$$, or $$\sigma$$-algebra, is an extension of the algebra so far discussed to operate with limits at infinity. The following properties are required from a $$\sigma$$-algebra.

1. $$A\,\in \,\mathcal {B}$$ $$\Rightarrow \overline{A}\,\in \,\mathcal {B}$$,

2. $$A_{i}\,\in \,\mathcal {B}$$ $$\Rightarrow \bigcup _{i=1}^{\infty }A_{i}\,\in \,\mathcal {B}$$.

The above properties guarantee the closure of the $$\sigma$$-algebra with respect to enumerable operations over sets. These properties allow the definition of limits in the Borel field.

Examples: Considering the above properties, it can be verified that $$A_{1}\,\cap \,A_{2}\,\cap \,A_{3}\cdots \in \mathcal {B}$$. In effect, it is sufficient to notice that
$$A\,\in \,\mathcal {B}\ \text{ and }\ \ B\,\in \,\mathcal {B} \ \Rightarrow A\,\cup \,B\,\in \,\mathcal {B},$$
and
$$\overline{A}\,\in \,\mathcal {B}\ \ \text{ and }\ \overline{B}\,\in \,\mathcal {B}\ \Rightarrow \overline{A}\,\cup \,\overline{B}\,\in \,\mathcal {B},$$
and finally
$$\overline{\overline{A}\,\cup \,\overline{B}}\,\in \,\mathcal {B}\ \ \Rightarrow A\,\cap \,B\,\in \,\mathcal {B}.$$
In summary, any combination of unions and intersections of sets belongs to the Borel algebra. In other words, operations of union or intersection of sets, or a combination of these operations, produce a set that belongs to the $$\sigma$$-algebra.

## 2.2 Probability Theory

This section summarizes the basic definitions related to probability theory, random variables, and stochastic processes, the main results and conclusions of which will be used in subsequent chapters.

Probability theory began in France with studies about games of chance. Antoine Gombaud (1607–1684), known as Chevalier de Méré, was very keen on card games and would discuss with Blaise Pascal (1623–1662) the probabilities of success in this game. Pascal, also interested in the subject, began a correspondence with Pierre de Fermat (1601–1665) in 1654, which originated the theory of finite probability (Zumpano and de Lima 2004).

However, the first known work about probability is De Ludo Aleae (About Games of Chance), by the Italian medical doctor and mathematician Girolamo Cardano (1501–1576), published in 1663, almost 90 years after his death. This book was a handbook for players, containing some discussion about probability.

The first published treatise about the Theory of Probability, dated 1657, was written by the Dutch scientist Christian Huygens (1629–1695), a pamphlet titled De Ratiociniis in Ludo Aleae (About Reasoning in Games of Chance).

Another Italian, the physicist and astronomer Galileo Galilei (1564–1642), was also concerned with random events. In a fragment probably written between 1613 and 1623, entitled Sopra le Scoperte dei Dadi (About Dice Games), Galileo answers a question asked, it is believed, by the Grand Duke of Tuscany: When three dice are thrown, although both the number 9 and the number 10 may each result in six distinct manners, in practice the chances of getting a 9 are lower than those of obtaining a 10. How can that be explained?

The six distinct manners by which these numbers (9 and 10) can be obtained are (1 3 6), (1 4 5), (2 3 5), (2 4 4), (2 2 6) and (3 3 4) for the number 10 and (1 2 6), (1 3 5), (1 4 4), (2 2 5), (2 3 4) and (3 3 3) for the number 9. Galileo concluded that, for this game, the permutations of the triplets must also be considered, since (1 3 6) and (3 1 6) are distinct possibilities. He then calculated that there are in fact 25 possibilities of obtaining the number 9, while there are 27 possibilities for the number 10. Therefore, combinations leading to the number 10 are more frequent.
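Galileo’s counting argument can be reproduced by brute force, enumerating all 216 ordered outcomes of three dice:

```python
from itertools import product

# Exhaustive count over the 6**3 = 216 ordered outcomes of three dice,
# reproducing Galileo's conclusion that 10 is more likely than 9.
counts = {9: 0, 10: 0}
for dice in product(range(1, 7), repeat=3):
    total = sum(dice)
    if total in counts:
        counts[total] += 1

print(counts)  # {9: 25, 10: 27}
```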

Abraham de Moivre (1667–1754) was another important mathematician for the development of probability theory. He wrote a book of great influence at the time, called Doctrine of Chances. The law of large numbers was discussed by Jacques Bernoulli (1654–1705), Swiss mathematician, in his work Ars Conjectandi (The Art of Conjecturing).

The study of probability was deepened in the eighteenth and nineteenth centuries, with works worth mentioning by the French mathematicians Pierre-Simon de Laplace (1749–1827) and Siméon Poisson (1781–1840), as well as the German mathematician Karl Friedrich Gauss (1777–1855).

Axiomatic Approach to Probability

Probability theory is usually presented in one of the following manners: the classical approach, the relative frequency approach, and the axiomatic approach. The classical approach is based on the symmetry of an experiment, but employs the concept of probability in a circular manner, because it is defined only for equiprobable events. The relative frequency approach to probability is more recent and relies on experiments.

Considering the difficulties found in the two previous approaches to probability, respectively, the circular definition in the first case and the problem of convergence in a series of experiments in the second case, henceforth only the axiomatic approach will be followed in this text. Those readers interested in the classical or in the relative frequency approach are referred to the literature (Papoulis 1981).

The axioms of probability established by Andrei N. Kolmogorov (1903–1987), allowing the development of the complete theory, are just three statements as follows (Papoulis 1983):
Axiom 1

$$P(S) = 1$$, in which S denotes the sample space or universal set and $$P(\cdot )$$ denotes the associated probability.

Axiom 2

$$P(A) \ge 0$$, in which A denotes an event belonging to the sample space.

Axiom 3

$$P(A \cup B) = P(A) + P(B)$$, in which A and B are mutually exclusive events and $$A \cup B$$ denotes the union of events A and B.

Using his axiomatic approach to probability theory, Kolmogorov established a firm mathematical basis on which other theories rely as, for example, the Theory of Stochastic Processes, Communications Theory, and Information Theory.

Kolmogorov’s fundamental work was published in 1933, in Russian, and soon afterward was published in German with the title Grundbegriffe der Wahrscheinlichkeitsrechnung (Fundamentals of Probability Theory) (James 1981). In this work, Kolmogorov managed to combine Cantor’s Advanced Set Theory with Lebesgue’s Measure Theory in order to produce what to this date is the modern approach to probability theory.

By applying the above axioms, it is possible to deduce all results relative to probability theory. For example, the probability of the empty set, $$\emptyset = \{\}$$, is easily calculated as follows. First, it is noticed that
$$\emptyset \,\cup \,\Omega = \Omega ,$$
since the sets $$\emptyset$$ and $$\Omega$$ are disjoint. Thus, it follows that
$$P(\emptyset \,\cup \,\Omega ) = P(\Omega ) = P(\emptyset )+P(\Omega ) = 1 \Rightarrow P(\emptyset )=0.$$
In the case of sets A and B which are not disjoint, it follows that
$$P(A\,\cup \,B)=P(A)+P(B)-P(A\,\cap \,B).$$
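As a quick illustration, the union formula can be checked on the roll of a fair die, with each face assigned probability 1/6 (an illustrative choice):

```python
from fractions import Fraction

# Numeric check of P(A u B) = P(A) + P(B) - P(A n B) on a fair die,
# where every face has probability 1/6.
Omega = frozenset({1, 2, 3, 4, 5, 6})

def P(E):
    return Fraction(len(E & Omega), len(Omega))

A = {2, 4, 6}   # "the result is even"
B = {1, 2, 3}   # "the result is at most 3"

# A and B are not disjoint: they share the outcome 2.
assert P(A | B) == P(A) + P(B) - P(A & B) == Fraction(5, 6)
print(P(A | B))
```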
Bayes’ Rule
Bayes’ rule concerns the computation of conditional probabilities and can be expressed by the following rule:
$$P(A | B) = \frac{P(A\,\cap \,B)}{P(B)},$$
assuming $$P(B) \ne 0$$. An equivalent manner of expressing the same result is the following:
$$P(A\,\cap \,B) = P(A | B)\cdot P(B)\ \ ,\ \ P(B) \ne 0.$$
Some important properties of sets are presented next, in which A and B denote events from a given sample space.
• If A is independent of B, then $$P(A | B) = P(A)$$. It then follows that $$P(B|A) = P(B)$$ and that B is independent of A.

• If $$B \subset A$$, then $$P(A | B) = 1$$.

• If $$A \subset B$$, then $$P(A | B) = \frac{P(A)}{P(B)} \ge P(A)$$.

• If A and B are independent events then $$P(A\,\cap \,B) = P(A)\cdot P(B)$$.

• If $$P(A) = 0$$ or $$P(A) = 1$$, then event A is independent of itself.

• If $$P(B) = 0$$, then P(A|B) can assume any arbitrary value. Usually in this case one assumes $$P(A | B) = P(A)$$.

• If events A and B are disjoint, and have nonzero probabilities, then they are dependent.

A partition is a possible splitting of the sample space into a family of subsets, in a manner that the subsets in this family are disjoint and their union coincides with the sample space. It follows that any set in the sample space can be expressed by using a partition of that sample space, and thus be written as a union of disjoint events.
The following property can be illustrated by means of a Venn diagram, as illustrated in Fig. 2.8:
$$B = B\,\cap \,\Omega = B\,\cap \,\cup _{i=1}^{N}\,A_{i} = \cup _{i=1}^{N}\,B\,\cap \,A_{i}.$$
It now follows that
\begin{aligned} P(B)= & {} P(\cup _{i=1}^{N}\,B\,\cap \,A_{i}) = \sum _{i=1}^{N}\,P(B\,\cap \,A_{i}),\\ P(A_{i} | B)= & {} \frac{P(A_{i}\,\cap \,B)}{P(B)} = \frac{P(B | A_{i})\cdot P(A_{i})}{\sum _{j=1}^{N}\,P(B\,\cap \,A_{j})} = \frac{P(B | A_{i})\cdot P(A_{i})}{\sum _{j=1}^{N}\,P(B | A_{j})\cdot P(A_{j})}. \end{aligned}
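The total probability and Bayes expressions above can be illustrated numerically. The sketch below uses a hypothetical two-urn experiment; all probabilities are made-up illustrative values.

```python
from fractions import Fraction

# Hypothetical two-urn experiment: the partition is A_1 = "urn 1 chosen",
# A_2 = "urn 2 chosen"; B is the event "a white ball is drawn".
# All numbers below are illustrative assumptions.
P_A = [Fraction(1, 3), Fraction(2, 3)]           # P(A_i)
P_B_given_A = [Fraction(3, 4), Fraction(1, 2)]   # P(B | A_i)

# Total probability: P(B) = sum_i P(B | A_i) P(A_i)
P_B = sum(pb * pa for pb, pa in zip(P_B_given_A, P_A))
assert P_B == Fraction(7, 12)

# Bayes' rule: P(A_1 | B) = P(B | A_1) P(A_1) / P(B)
P_A1_given_B = P_B_given_A[0] * P_A[0] / P_B
assert P_A1_given_B == Fraction(3, 7)
print(P_B, P_A1_given_B)
```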

## 2.3 Random Variables

A  random variable (r.v.) X represents a mapping of the sample space on the line (the set of real numbers). A random variable is usually characterized by a cumulative probability function (CPF) $$P_X(x)$$, or by a probability density function (pdf) $$p_X(x)$$.

Example: A random variable with a uniform probability density function $$p_{X}(x)$$ is described by the equation $$p_X(x) = u(x) - u(x - 1)$$. It follows by axiom 1 that
\begin{aligned} \int _{- \infty }^{+ \infty } p_{X}(x) dx = 1. \end{aligned}
(2.1)
In general, for a given probability distribution, the probability of X belonging to the interval $$(a, b]$$ is given by
\begin{aligned} P(a < x \le b) = \int _{a}^{b} p_{X}(x) dx . \end{aligned}
(2.2)
The cumulative probability function $$P_{X}(x)$$, of a random variable X, is defined as the integral of $$p_{X}(x)$$, i.e.,
\begin{aligned} P_{X}(x) = \int _{- \infty }^{x} p_{X}(t) dt . \end{aligned}
(2.3)
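Equations (2.1)–(2.3) can be checked numerically for the uniform pdf of the previous example, replacing the integrals by midpoint Riemann sums:

```python
# Numerical check of (2.1)-(2.3) for the uniform pdf p_X(x) = u(x) - u(x - 1),
# using a midpoint Riemann sum in place of symbolic integration.
def p_X(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

total = integrate(p_X, -1.0, 2.0)     # axiom 1: the pdf integrates to 1
prob = integrate(p_X, 0.25, 0.75)     # P(0.25 < X <= 0.75) = 0.5
cpf = integrate(p_X, -10.0, 0.5)      # P_X(0.5) = 0.5

assert abs(total - 1.0) < 1e-3
assert abs(prob - 0.5) < 1e-3
assert abs(cpf - 0.5) < 1e-3
print(round(total, 3), round(prob, 3), round(cpf, 3))
```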

### 2.3.1 Average Value of a Random Variable

Let f(X) denote a function of a random variable X. The average value (or expected value) of f(X) with respect to X is defined as
\begin{aligned} E[f(X)] = \int _{- \infty }^{+ \infty } f(x) p_{X}(x) dx . \end{aligned}
(2.4)
The following properties of the expected value follow from (2.4):
\begin{aligned} E[\alpha X] = \alpha E[X], \end{aligned}
(2.5)
\begin{aligned} E[X + Y] = E[X] + E[Y] \end{aligned}
(2.6)
and if X and Y are statistically independent random variables then
\begin{aligned} E[X Y] = E[X] E[Y]. \end{aligned}
(2.7)

### 2.3.2 Moments of a Random Variable

The $$i^{th}$$ moment  of a random variable X is defined as
\begin{aligned} m_{i} = E[X^{i}] = \int _{- \infty }^{+ \infty } x^{i} p_{X}(x) dx . \end{aligned}
(2.8)
Various moments of X have special importance and physical interpretation.
• $$m_{1} = E[X]$$, arithmetic mean, average value, average voltage, statistical mean;

• $$m_{2} = E[X^{2}]$$, quadratic mean, total power;

• $$m_{3} = E[X^{3}]$$, measure of asymmetry of the probability density function;

• $$m_{4} = E[X^{4}]$$, measure of flatness of the probability density function.

### 2.3.3 The Variance of a Random Variable

The variance of a random variable X is an important quantity in communication theory (meaning AC power), defined as follows:
\begin{aligned} V[X] = \sigma ^{2}_{X} = E[(X - m_{1})^{2}] = m_{2} - m_{1}^{2}. \end{aligned}
(2.9)
The standard deviation $$\sigma _{X}$$ is defined as the square root of the variance of X.
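As an illustration, the first two moments and the variance of a uniform random variable on [0, 1] can be computed numerically from (2.8) and (2.9):

```python
# Moments of the uniform random variable on [0, 1], computed from (2.8)
# by a midpoint Riemann sum; the variance then follows from (2.9).
def integrate(f, a, b, n=100_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

m1 = integrate(lambda x: x, 0.0, 1.0)        # E[X]   = 1/2
m2 = integrate(lambda x: x * x, 0.0, 1.0)    # E[X^2] = 1/3
variance = m2 - m1 ** 2                      # = 1/12

assert abs(m1 - 0.5) < 1e-6
assert abs(m2 - 1.0 / 3.0) < 1e-6
assert abs(variance - 1.0 / 12.0) < 1e-6
print(round(m1, 4), round(m2, 4), round(variance, 4))
```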

### 2.3.4 The Characteristic Function of a Random Variable

The characteristic function $$P_{X}(\omega )$$, or moment-generating function, of a random variable X is usually defined from the Fourier transform of the probability density function (pdf) of X, which is equivalent to making $$f(x) = e^{ - j \omega x}$$ in (2.4), i.e.,
\begin{aligned} P_{X}(\omega ) = E[ e^{- j \omega X}] = \int _{- \infty }^{+ \infty } e^{ - j \omega x} p_{X}(x) dx, \text{ in } \text{ which }\ \ j = \sqrt{-1} . \end{aligned}
(2.10)
The moments of a random variable X can also be obtained directly from the characteristic function as follows:
\begin{aligned} m_{i} = \left. \frac{1}{ (-j)^{i}} \frac{ \partial ^{i} P_{X}(\omega ) }{\partial \omega ^{i}} \right| _{ \omega = 0 }. \end{aligned}
(2.11)
Given that X is a random variable, it follows that $$Y = f(X)$$ is also a random variable, obtained by the application of the transformation $$f(\cdot )$$. The probability density function of Y is related to that of X by the formula (Blake 1987)
\begin{aligned} p_{Y}(y) = \left. \frac{ p_{X}(x) }{ |dy/dx| } \right| _{ x = f^{-1}(y) }, \end{aligned}
(2.12)
in which $$f^{-1}(\cdot )$$ denotes the inverse function of $$f(\cdot )$$. This formula assumes the existence of the inverse function of $$f(\cdot )$$ as well as its derivative at all points.
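Formula (2.12) can be illustrated by simulation. The sketch below uses the hypothetical linear transformation Y = 2X + 1 of a uniform random variable on [0, 1), for which the formula predicts a constant density of 1/2 on [1, 3):

```python
import random

# Monte Carlo illustration of (2.12) for the hypothetical transformation
# Y = 2X + 1, with X uniform on [0, 1): here |dy/dx| = 2, so the formula
# predicts p_Y(y) = p_X((y - 1)/2) / 2 = 1/2 on [1, 3).
random.seed(1)
N = 200_000
ys = [2.0 * random.random() + 1.0 for _ in range(N)]

# Estimate the density of Y over the interval [1.5, 2.0]:
# probability mass in the interval divided by its width.
width = 0.5
mass = sum(1 for y in ys if 1.5 <= y < 2.0) / N
density = mass / width

assert abs(density - 0.5) < 0.02   # close to the predicted density 1/2
print(round(density, 2))
```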

#### 2.3.4.1 Some Important Random Variables

1. Gaussian random variable

The random variable X with pdf
\begin{aligned} p_{X}(x) = \frac{1}{ \sigma _{X} \sqrt{2 \pi } } e^{ - \frac{ ( x - m_X )^2 }{ 2 \sigma _X^2 } } \end{aligned}
(2.13)
is called a Gaussian (or Normal) random variable. The Gaussian random variable plays an extremely important role in engineering, considering that many well-known processes can be described or approximated by this pdf. The noise present in either analog or digital communications systems usually can be considered Gaussian as a consequence of the influence of many factors (Leon-Garcia 1989). In (2.13), $$m_{X}$$ represents the average value and $$\sigma ^{2}_{X}$$ represents the variance of X. Figure 2.9 illustrates the Gaussian pdf and its corresponding cumulative probability function.

2. Rayleigh random variable

An often used model to represent the behavior of the amplitudes of signals subjected to fading employs the following pdf (Kennedy 1969), (Proakis 1990):
\begin{aligned} p_{X}( x ) = \frac{ x }{ \sigma ^{2}} e^{ - \frac{ x^{2}}{2 \sigma ^{2}}} u( x ) \end{aligned}
(2.14)
known as the Rayleigh pdf, with average $$E[X] = \sigma \sqrt{ \pi /2}$$ and variance $$V[X] = \left( 2 - \frac{\pi }{2} \right) \sigma ^{2}$$.

The Rayleigh pdf represents the effect of multiple signals, reflected or refracted, which are captured by a receiver, in a situation in which there is no main signal component or main direction of propagation (Lecours et al. 1988). In this situation, the phase distribution of the received signal can be considered uniform in the interval $$(0,2\pi )$$. It is noticed that it is possible to closely approximate a Rayleigh pdf by considering only six waveforms with independently distributed phases (Schwartz et al. 1966).
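The Rayleigh model can be illustrated by simulation: the envelope $$R = \sqrt{X^2 + Y^2}$$ of two independent zero-mean Gaussian components of equal variance $$\sigma^2$$ is Rayleigh distributed, and its sample mean can be compared with the average $$\sigma \sqrt{\pi/2}$$ quoted above. The sample size and seed below are arbitrary choices.

```python
import math
import random

# The Rayleigh envelope as R = sqrt(X^2 + Y^2) for two independent
# zero-mean Gaussian components of equal variance sigma^2; a Monte Carlo
# check of the mean E[R] = sigma * sqrt(pi / 2).
random.seed(2)
sigma = 1.0
N = 200_000
r = [math.hypot(random.gauss(0.0, sigma), random.gauss(0.0, sigma))
     for _ in range(N)]

mean_est = sum(r) / N
mean_theory = sigma * math.sqrt(math.pi / 2.0)

assert abs(mean_est - mean_theory) < 0.01
print(round(mean_est, 3), round(mean_theory, 3))
```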

3. Sinusoidal random variable

A sinusoidal tone X has the following pdf:
\begin{aligned} p_{X}(x) = \frac{1}{ \pi \sqrt{V^{2} - x^{2}}} , \ |x| < V. \end{aligned}
(2.15)
The pdf and the CPF of X are illustrated in Fig. 2.10.

Joint Random Variables

Considering that X and Y represent a pair of real random variables, with joint pdf $$p_{XY} (x,y)$$, as illustrated in Fig. 2.11, then the probability of x and y being simultaneously inside the rectangle [abcd] is given by the expression
\begin{aligned} \mathrm{Prob} (a< x< b , c< y < d) = \int _{a}^{b} \int _{c}^{d} p_{XY}(x,y) dx dy . \end{aligned}
(2.16)
The individual pdf’s of X and Y, also called marginal pdf’s, result from the integration of the joint pdf as follows:
\begin{aligned} p_{X}(x) = \int _{- \infty }^{+ \infty } p_{XY}(x,y) dy , \end{aligned}
(2.17)
and
\begin{aligned} p_{Y}(y) = \int _{- \infty }^{+ \infty } p_{XY}(x,y) dx . \end{aligned}
(2.18)
The joint average $$E[f(X,Y)]$$ is calculated as
\begin{aligned} E[ f(X,Y) ] = \int _{- \infty }^{+ \infty } \int _{- \infty }^{+ \infty } f(x,y) p_{XY}(x,y) dx dy, \end{aligned}
(2.19)
for an arbitrary function f(X, Y) of X and Y.
The joint moments $$m_{ik}$$, of order ik, are calculated as
\begin{aligned} m_{ik} = E[ X^{i} Y^{k} ] = \int _{- \infty }^{+ \infty } \int _{- \infty }^{+ \infty } x^{i} y^{k} p_{XY}(x,y) dx dy. \end{aligned}
(2.20)
The two-dimensional characteristic function is defined as the two-dimensional Fourier transform of the joint probability density $$p_{XY}(x,y)$$
\begin{aligned} P_{XY}( \omega , \nu ) = E[ e^{ - j \omega X - j \nu Y} ]. \end{aligned}
(2.21)
When the sum $$Z = X + Y$$ of two statistically independent r.v.’s is considered, it is noticed that the characteristic function of Z turns out to be
\begin{aligned} P_{Z}(\omega ) = E[e^{ - j \omega Z}] = E[e^{ - j \omega (X + Y)}] = P_{X}(\omega ) \cdot P_{Y}(\omega ). \end{aligned}
(2.22)
As far as the pdf of Z is concerned, it can be said that
\begin{aligned} p_{Z}(z) = \int _{-\infty }^{\infty } p_{X}(\rho ) p_{Y}(z - \rho ) d\rho , \end{aligned}
(2.23)
or
\begin{aligned} p_{Z}(z) = \int _{-\infty }^{\infty } p_{X}(z - \rho )p_{Y}(\rho ) d\rho . \end{aligned}
(2.24)
Equivalently, the sum of two statistically independent r.v.’s has a pdf given by the convolution of the respective pdf’s of the r.v.’s involved in the sum.
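The convolution (2.23) can be evaluated numerically for a concrete case: the sum of two independent uniform random variables on [0, 1), whose pdf is the well-known triangular density.

```python
# The pdf of Z = X + Y for two independent uniform r.v.'s on [0, 1),
# obtained by evaluating the convolution (2.23) numerically; the result
# is the triangular pdf p_Z(z) = z on [0, 1] and 2 - z on [1, 2].
def p(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def p_Z(z, n=20_000):
    # midpoint evaluation of  integral p(rho) p(z - rho) d rho  over [-1, 2]
    a, b = -1.0, 2.0
    h = (b - a) / n
    return h * sum(p(a + (k + 0.5) * h) * p(z - (a + (k + 0.5) * h))
                   for k in range(n))

assert abs(p_Z(0.5) - 0.5) < 1e-2
assert abs(p_Z(1.0) - 1.0) < 1e-2
assert abs(p_Z(1.5) - 0.5) < 1e-2
print(round(p_Z(0.5), 2), round(p_Z(1.0), 2), round(p_Z(1.5), 2))
```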

The r.v.’s X and Y are called uncorrelated if $$E[XY] = E[X] E[Y]$$. The criterion of statistical independence of random variables, which is stronger than the condition for the r.v.’s being uncorrelated, is satisfied if $$p_{XY}(x,y) = p_{X}(x) \cdot p_{Y}(y)$$.

## 2.4 Stochastic Processes

A random process, or a stochastic process, is an extension of the concept of a random variable, involving a sample space, a set of signals and the associated probability density functions. Figure 2.12 illustrates a random signal and its associated probability density function.

A random process (or stochastic process) X(t) defines a random variable for each point on the time axis. A stochastic process is said to be stationary if the probability densities associated with the process are time-independent.

The Autocorrelation Function

An important joint moment of the random process X(t) is the autocorrelation function
\begin{aligned} R_{X}( \xi , \eta ) = E[ X( \xi ) X( \eta ) ] , \end{aligned}
(2.25)
in which
\begin{aligned} E[ X(\xi ) X(\eta ) ] = \int _{- \infty }^{+ \infty } \int _{- \infty }^{+ \infty } x(\xi ) x(\eta ) p_{X(\xi )X(\eta )}(x(\xi ), x(\eta )) dx(\xi ) dx(\eta ) \end{aligned}
(2.26)
denotes the joint moment of the r.v. X(t) at $$t = \xi$$ and at $$t = \eta$$.
The random process is called wide-sense stationary if its autocorrelation depends only on the interval of time separating $$X(\xi )$$ and $$X(\eta )$$, i.e., depends only on $$\tau = \xi - \eta$$. Equation 2.25 in this case can be written as
\begin{aligned} R_{X}( \tau ) = E[ X( t ) X( t + \tau ) ]. \end{aligned}
(2.27)
Stationarity
In general, the statistical mean of a time signal is a function of time. Thus, the mean value
$$E[X(t)]\,=\,m_{X}(t) ,$$
the power
$$E[X^{2}(t)]\,=\,P_{X}(t),$$
and the autocorrelation
$$R_{X}(\tau ,t)\,=\,E[X(t)X(t\,+\,\tau )]$$
are, in general, time-dependent. However, there exists a set of time signals the mean value of which are time-independent. These signals are called stationary signals, as illustrated in the following example.
Example: Calculate the power (mean square value) of the random signal $$X(t)\,=\,V\cos (\omega t + \phi )$$, in which V is a constant and the phase $$\phi$$ is a r.v. with a uniform probability distribution over the interval $$[0,2\pi ]$$. Applying the definition of power, it follows that
$$E[X^{2}(t)]\,=\,\int _{-\infty }^{\infty }\,V^{2}\cos ^{2}(\omega t\,+\,\phi )\,p_{\Phi }(\phi )\,d\phi \, =\,\frac{1}{2\pi }\,\int _{0}^{2\pi }\,V^{2}\cos ^{2}(\omega t\,+\,\phi )\,d\phi .$$
Recalling that $$\cos ^{2} \theta \,=\,\frac{1}{2}\,+\,\frac{1}{2}\cos 2 \theta$$, it follows that
$$E[X^{2}(t)]\,=\,\frac{V^2}{4\pi }\,\int _{0}^{2\pi }\,(1+\cos (2\omega t\,+\,2\phi ))\,d\phi \, =\,\frac{V^2}{4\pi }\,\phi \,\Big |_{0}^{2\pi }\,=\,\frac{V^2}{2} ,$$
since the cosine term integrates to zero over a full period.
Since the mean value $$m_X$$ of X(t) is zero, i.e.,
$$m_X = E[X(t)]\,=\,E[V\cos (\omega t\,+\,\phi )]\,=\,0 ,$$
the variance, or AC power, becomes
$$V[X]\,=\,E[(X\,-\,m_{X})^2]\,=\,E[X^{2}\,-\,2Xm_{X}\,+\,m_{X}^{2}]\,=\,E[X^2]\,=\,\frac{V^2}{2} .$$
Therefore,
$$\sigma _{X}\,=\,\frac{V}{\sqrt{2}}\,=\,\frac{V\sqrt{2}}{2} .$$
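A quick Monte Carlo check of this example can be sketched as follows (V, $$\omega$$, and the observation instant t are assumed values; any choice of t should give the same power):

```python
import numpy as np

# X(t) = V cos(w t + phi), with phi uniform over [0, 2pi):
# the power E[X^2(t)] should equal V^2 / 2, independently of t.
rng = np.random.default_rng(0)
V, w, t = 2.0, 1.0, 0.37          # assumed values for illustration
phi = rng.uniform(0.0, 2.0 * np.pi, size=500_000)
x = V * np.cos(w * t + phi)

power = np.mean(x**2)             # estimate of E[X^2(t)], theory: V^2/2 = 2.0
mean = np.mean(x)                 # estimate of m_X, theory: 0
print(f"power = {power:.3f}, mean = {mean:.4f}")
```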
Example: Consider the digital telegraphic signal shown in Fig. 2.13, with equiprobable amplitudes A and $$-A$$.
The probability density function, as shown in Fig. 2.14, is given by
$$p_{X}(x)\,=\,\frac{1}{2}\left[ \delta (x\,+\,A)\,+\,\delta (x\,-\,A)\right] .$$
By applying the definition of the mean value, it follows that
$$E[X(t)]\,=\,0 .$$
The power also follows as
$$E[X^{2}(t)] = \frac{1}{2} A^2 + \frac{1}{2} (-A)^2 = A^2 .$$
Finally, the variance and the standard deviation are calculated as
$$\sigma _{X}^{2}\,=\,E[X^{2}(t)]\,=\,A^2\ \ \Rightarrow \ \ \sigma _{X}\,=\,A .$$
Application: The dynamic range of a signal, from a probabilistic point of view, is illustrated in Fig. 2.15. As can be seen, the dynamic range depends on the standard deviation, or RMS voltage, and is usually specified as $$2 \sigma _X$$ or $$4 \sigma _X$$. For a signal with a Gaussian probability distribution of amplitudes, a range of $$\pm 2 \sigma _X$$ or $$\pm 4 \sigma _X$$ encompasses, respectively, approximately 95.4 and $$99.99\%$$ of all signal amplitudes.

However, since the signal is time-varying, its statistical mean can also change with time, as illustrated in Fig. 2.16. In the example considered, the variance is initially diminishing with time and later it is growing with time. In this case, an adjustment in the signal variance, by means of an automatic gain control mechanism, can remove the pdf dependency on time.

A signal is stationary whenever its pdf is time-independent, i.e., whenever $$p_X(x,t) = p_X(x)$$, as illustrated in Fig. 2.17.
Stationarity may occur in various instances:
1. (1)

Stationary mean $$\Rightarrow \ \ m_{X}(t)\,=\,m_{X}$$;

2. (2)

Stationary power $$\Rightarrow \ \ P_{X}(t)\,=\,P_{X}$$ ;

3. (3)

First-order stationarity implies that the first-order moment is also time-independent;

4. (4)

Second-order stationarity implies that the second-order moments are also time-independent;

5. (5)

Narrow-sense stationarity implies that the signal is stationary for all orders, i.e., $$p_{X_{1}\cdots X_{M}}(x_{1},\ldots ,x_{M};t)\,=\,p_{X_{1}\cdots X_{M}}(x_{1},\ldots ,x_{M})$$.

Wide-Sense Stationarity

The following conditions are necessary to guarantee that a stochastic process is wide-sense stationary.

1. (1)

The autocorrelation is time-independent;

2. (2)

The mean and the power are constant;

3. (3)

$$R_X(t_1,t_2) = R_{X}(t_{2}\,-\,t_{1})\,=\,R_{X}(\tau )$$. The autocorrelation depends on the time interval and not on the origin of the time interval.

Stationarity Degrees

A pictorial classification of degrees of stationarity is illustrated in Fig. 2.18. The universal set includes all signals in general. Some important subsets of the universal set are represented by the signals with a stationary mean, wide-sense stationary signals and narrow-sense stationary signals.

Ergodic Signals

Ergodicity is another characteristic of random signals. A given expected value of a function of the pdf is ergodic if the time expected value coincides with the statistical expected value. Thus ergodicity can occur on the mean, on the power, on the autocorrelation, or with respect to other quantities. Ergodicity of the mean implies that the time average is equivalent to the statistical signal average. Therefore,
• Ergodicity of the mean: $$\overline{X(t)}\,\sim \,E[X(t)]$$;

• Ergodicity of the power: $$\overline{X^{2}(t)}\,\sim \,E[X^{2}(t)]\,=\,R_{X}(0)$$;

• Ergodicity of the autocorrelation: $$\overline{R_{X}}(\tau )\,\sim \,R_{X}(\tau )$$.
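These notions can be illustrated with a short simulation: for the random-phase cosine used earlier, the time average of a single long realization should agree with the ensemble average taken over many realizations (amplitude, frequency, and window length are assumed values):

```python
import numpy as np

# Ergodicity of the mean for X(t) = cos(w t + phi), phi ~ uniform[0, 2pi).
rng = np.random.default_rng(1)
w = 2.0 * np.pi                        # a 1 Hz tone
t = np.linspace(0.0, 100.0, 100_001)   # 100 s observation window

# Time average over one realization (one draw of phi).
phi = rng.uniform(0.0, 2.0 * np.pi)
time_avg = np.mean(np.cos(w * t + phi))

# Ensemble average at a fixed instant, over many draws of phi.
phis = rng.uniform(0.0, 2.0 * np.pi, size=100_000)
ensemble_avg = np.mean(np.cos(w * 0.25 + phis))

print(f"time average = {time_avg:.4f}, ensemble average = {ensemble_avg:.4f}")
```

Both averages are close to the theoretical mean $$E[X(t)] = 0$$, as expected for a process that is ergodic in the mean.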

A strictly stationary stochastic process has time-independent joint pdf’s of all orders. A stationary process of second order is that process for which all means are constant and the autocorrelation depends only on the measurement time interval.

Summarizing, a stochastic process is ergodic whenever its statistical means, which are functions of time, can be approximated by the corresponding time averages, which are random variables, with a standard deviation that is close to zero. The ergodicity may hold only for the mean value of the process, in which case the process is said to be ergodic in the mean.

Properties of the Autocorrelation

The autocorrelation function has some important properties as follows:
1. (1)

$$R_{X}(0)\,=\,E[X^{2}(t)]\,=\,P_{X}$$, (Total power);

2. (2)

$$R_{X}(\infty )\,=\,\lim _{\tau \rightarrow \infty }R_{X}(\tau )$$ = $$\lim _{\tau \rightarrow \infty }E[X(t\,+\,\tau )X(t)]\,=\,E^2[X(t)]$$, (DC power, i.e., the squared mean value);

3. (3)

Autocovariance: $$C_{X}(\tau )\,=\,R_{X}(\tau )\,-\,E^2[X(t)]$$;

4. (4)

Variance: $$V[X(t)]\,=\,E[(X(t)\,-\,E[X(t)])^2]\,=\,E[X^2(t)]\,-\,E^2[X(t)]$$,

i.e., the AC power $$P_{AC}\,=\,C_{X}(0)\,=\,R_{X}(0)\,-\,R_{X}(\infty )$$;

5. (5)

$$R_{X}(0)\,\ge \, | R_{X}(\tau ) |$$, (Maximum at the origin);

This property is demonstrated by considering the following tautology:
$$E[(X(t)\,-\,X(t\,+\,\tau ))^2]\,\ge \,0 .$$
Thus,
$$E[X^{2}(t)]\,-\,2E[X(t)X(t\,+\,\tau )]\,+\,E[X^{2}(t\,+\,\tau )]\,\ge \,0 ,$$
i.e.,
$$2R_{X}(0)\,-\,2R_{X}(\tau )\,\ge \,0\ \ \Rightarrow \ \ R_{X}(0)\,\ge \,R_{X}(\tau ) .$$
Starting instead from $$E[(X(t)\,+\,X(t\,+\,\tau ))^2]\,\ge \,0$$ gives $$R_{X}(0)\,\ge \,-R_{X}(\tau )$$, and therefore $$R_{X}(0)\,\ge \,|R_{X}(\tau )|$$.

6. (6)

Symmetry: $$R_{X}(\tau )\,=\,R_{X}(-\tau )$$;

In order to prove this property, it is sufficient to use the definition $$R_{X}(-\tau )\,=\,E[X(t)X(t\,-\,\tau )].$$

Letting $$t - \tau \,=\,\sigma \ \ \Rightarrow \ \ t\,=\,\sigma \,+\,\tau$$

$$R_{X}(-\tau )\,=\,E[X(\sigma \,+\,\tau )\cdot X(\sigma )]\,=\,R_{X}(\tau ).$$

7. (7)

$$|E[X(t)]|\,=\,\sqrt{R_{X}(\infty )}$$ (Magnitude of the signal mean value).

Application: The relationship between the autocorrelation and various other power measures is illustrated in Fig. 2.19.
Example: A digital signal X(t), with equiprobable amplitude levels A and $$-A$$, has the following autocorrelation function:
$$R_X(\tau ) = A^2[ 1 - \frac{|\tau |}{T_b}][ u(\tau + T_b) - u(\tau - T_b)],$$
in which $$T_b$$ is the bit duration.

The Power Spectral Density

By using the autocorrelation function, it is possible to define the following Fourier transform pair, known as the Wiener–Khintchin theorem:
\begin{aligned} S_{X}(\omega ) = \int _{- \infty }^{+ \infty } R_{X}(\tau ) e^{- j \omega \tau } d \tau \end{aligned}
(2.28)
\begin{aligned} R_{X}(\tau ) = \frac{1}{2 \pi } \int _{- \infty }^{+ \infty } S_{X}(\omega ) e^{ j \omega \tau } d \omega . \end{aligned}
(2.29)
The function $$S_{X}(\omega )$$ is called the power spectral density (PSD) of the random process.
The Wiener–Khintchin theorem relates the autocorrelation function with the power spectral density, i.e., it plays the role of a bridge between the time domain and the frequency domain for random signals. This theorem will be proved in the sequel. Figure 2.20 shows a random signal truncated in an interval T.
The Fourier transform of the truncated signal $$x_T(t)$$ shown in Fig. 2.20 is given by $$\mathcal {F}$$ $$[x_{T}(t)]\,=\,X_{T}(\omega )$$. The time power spectral density of x(t) is calculated as follows:
$$\overline{S_{X}}(\omega )\,=\,\lim _{T\rightarrow \infty }\,\frac{1}{T}\,|X_{T}(\omega )|^{2} .$$
The result obtained is obviously a random quantity, and it is possible to compute its statistical mean to obtain the power spectral density
$$S_{X}(\omega )\,=\,E[\overline{S_{X}}(\omega )].$$
Using the fact that
$$|X_{T}(\omega )|^{2}\,=\,X_{T}(\omega )\cdot X_{T}^{*}(\omega )\,=\,X_{T}(\omega )\cdot X_{T}(-\omega )$$
for X(t) real, in which $$X_{T}(\omega )\,=\,\int _{-\infty }^{\infty }\,x_{T}(t)e^{-j\omega t}\,dt$$, it follows that
$$S_{X}(\omega )\,=\,\lim _{T\rightarrow \infty }\frac{1}{T}E[|X_{T}(\omega )|^{2}]\,=\,\lim _{T\rightarrow \infty }\frac{1}{T}E[X_{T}(\omega )\cdot X_{T}(-\omega )]\,=$$
$$=\,\lim _{T\rightarrow \infty }\frac{1}{T}E\left[ \int _{-\infty }^{\infty }\,X_{T}(t)e^{-j\omega t}\,dt\cdot \int _{-\infty }^{\infty }\,X_{T}(\tau )e^{j\omega \tau }\,d\tau \right] \,=$$
$$=\,\lim _{T\rightarrow \infty }\frac{1}{T}\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }E[X_{T}(t)X_{T}(\tau )]e^{-j(t-\tau )\omega }\,dt\,d\tau \,=$$
$$=\,\lim _{T\rightarrow \infty }\frac{1}{T}\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,R_{X_{T}}(t-\tau )e^{-j(t-\tau )\omega }\,dt\,d\tau .$$
Letting $$t-\tau \,=\,\sigma$$, it follows that $$t\,=\,\sigma +\tau$$ and, for fixed $$\tau$$, $$dt\,=\,d\sigma$$. Thus,
$$S_X(\omega ) =\,\lim _{T\rightarrow \infty }\frac{1}{T}\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,R_{X_{T}}(\sigma )e^{-j\sigma \omega }\,d\sigma \,d\tau \ ,$$
in which the integration on $$\tau$$, effectively restricted by the truncation to an interval of duration T, produces a factor T that cancels the $$\frac{1}{T}$$ in the limit.
This result implies that $$S_{X}(\omega )\, = \,$$ $$\mathcal {F}$$ $$[R_{X}(\tau )]$$, i.e., implies that
$$S_{X}(\omega )\,=\,\int _{-\infty }^{\infty }\,R_{X}(\tau )e^{-j\omega \tau }\,d\tau .$$
The function $$S_X(\omega )$$ represents the power spectral density, which measures power per unit frequency. The corresponding inverse transform is the autocorrelation function $$R_{X}(\tau )\,=$$ $$\mathcal {F}$$ $$^{-1}$$ $$[S_{X}(\omega )]$$, or
$$R_{X}(\tau )\,=\,\frac{1}{2\pi }\,\int _{-\infty }^{\infty }\,S_{X}(\omega )e^{j\omega \tau }\,d\omega .$$
Figure 2.21 illustrates the power spectral density function and the autocorrelation function for a bandlimited signal. Figure 2.22 illustrates the power spectral density and the corresponding autocorrelation function for white noise. It is noticed that $$S_X(\omega ) = S_0$$, which indicates a uniform distribution of the power density along the spectrum, and $$R_X(\tau ) = S_0 \delta ( \tau )$$, which shows white noise to be the most uncorrelated, or random, of all signals: its correlation is nonzero only at $$\tau = 0$$. On the other hand, Fig. 2.23 illustrates the power spectral density of a constant signal, with autocorrelation $$R_X(\tau ) = R_0$$. This is, no doubt, the most predictable of all signals.
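A discrete-time sanity check of the Wiener–Khintchin pair for white noise can be sketched as follows (block size, number of blocks, and the noise level $$S_0$$ are assumed values):

```python
import numpy as np

# White Gaussian noise: the averaged periodogram (PSD estimate) should be
# approximately flat at S0, and the sample autocorrelation approximately
# a delta at lag zero.
rng = np.random.default_rng(7)
S0, n_blocks, n = 1.0, 400, 256
x = rng.normal(0.0, np.sqrt(S0), size=(n_blocks, n))

# Averaged periodogram: mean over blocks of |FFT|^2 / n.
psd = np.mean(np.abs(np.fft.fft(x, axis=1))**2, axis=0) / n

# Biased sample autocorrelation of one long realization, first few lags.
y = x.ravel()
lags = np.arange(6)
acorr = np.array([np.mean(y[: y.size - k] * y[k:]) for k in lags])

print("PSD level:", round(float(np.mean(psd)), 3))
print("R(0):", round(float(acorr[0]), 3), "R(1..5):", np.round(acorr[1:], 3))
```

The flat PSD estimate at $$S_0$$ and the near-delta autocorrelation illustrate the pair shown in Fig. 2.22.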

Properties of the Power Spectral Density

In the sequel, a few properties of the power spectral density function are listed.

• The area under the curve of the power spectral density is equal to the total power of the random process, i.e.,
\begin{aligned} P_{X} = \frac{1}{2 \pi } \int _{- \infty }^{+ \infty } S_{X}(\omega ) d \omega . \end{aligned}
(2.30)
This fact can be verified directly as
\begin{aligned} P_X= & {} R_{X}(0)\, = \, \frac{1}{2\pi }\, \int _{-\infty }^{\infty }\, S_{X}(\omega )e^{j\omega 0}\,d\omega \\= & {} \frac{1}{2\pi }\,\int _{-\infty }^{\infty }\,S_{X}(\omega )\,d\omega \,=\,\int _{-\infty }^{\infty }\,S_{X}(f)\,df, \end{aligned}
in which $$\omega \,=\,2\pi f$$.
• $$S_{X}(0)\,=\,\int _{-\infty }^{\infty }\,R_{X}(\tau )\,d\tau$$;

Application: Fig. 2.24 illustrates the fact that the area under the curve of the autocorrelation function is the value of the PSD at the origin.

• If $$R_{X}(\tau )$$ is real and even then
\begin{aligned} S_{X}(\omega )= & {} \int _{-\infty }^{\infty }\,R_{X}(\tau )[\cos \omega \tau \,-\,j \sin \omega \tau ]\,d\tau , \nonumber \\= & {} \int _{-\infty }^{\infty }\,R_{X}(\tau ) \cos \omega \tau \,d\tau , \end{aligned}
(2.31)
i.e., $$S_{X}(\omega )$$ is real and even.
• $$S_{X}(\omega )\ge 0$$, since the density reflects a power measure.

• The following identities hold:
\begin{aligned} \int _{- \infty }^{+ \infty } R_{X}(\tau ) R_{Y}(\tau ) d \tau = \frac{1}{2 \pi } \int _{- \infty }^{+ \infty } S_{X}(\omega ) S_{Y}(\omega ) d \omega . \end{aligned}
(2.32)
\begin{aligned} \int _{- \infty }^{+ \infty } R_{X}^{2}(\tau ) d \tau = \frac{1}{2 \pi } \int _{- \infty }^{+ \infty } S_{X}^{2}(\omega ) d \omega . \end{aligned}
(2.33)
Finally, the cross-correlation between two random processes X(t) and Y(t) is defined as
\begin{aligned} R_{XY}(\tau ) = E[ X(t) Y(t + \tau )], \end{aligned}
(2.34)
which leads to the definition of the cross-power spectral density $$S_{XY}(\omega )$$.
\begin{aligned} S_{XY}(\omega ) = \int _{- \infty }^{+ \infty } R_{XY}(\tau ) e^{- j \omega \tau } d \tau . \end{aligned}
(2.35)
Example: By knowing that $$\hat{m}(t) = \frac{1}{\pi t} *m(t)$$ is the Hilbert transform of m(t) and using properties of the autocorrelation and of the cross-correlation, it can be shown that
$$E [ m(t) ^2 ] = E [ \hat{m}(t)^2 ]$$
and that
$$E [ m(t) \hat{m}(t) ] = 0.$$
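These two identities can be checked numerically with an FFT-based Hilbert transformer (a sketch, not a library routine; the test signal and its frequencies are assumed values chosen to fit whole periods in the window):

```python
import numpy as np

def hilbert_transform(m):
    """Hilbert transform of a real 1-D signal via the FFT: H(w) = -j sgn(w)."""
    M = np.fft.fft(m)
    w = np.fft.fftfreq(m.size)
    return np.real(np.fft.ifft(-1j * np.sign(w) * M))

n = 4096
t = np.arange(n)
# Two tones at bin frequencies 200/n and 450/n (whole periods in the window).
m = np.cos(2 * np.pi * 200 * t / n) + 0.5 * np.sin(2 * np.pi * 450 * t / n)
m_hat = hilbert_transform(m)

power_m = np.mean(m**2)        # theory: 1/2 + (0.5**2)/2 = 0.625
power_hat = np.mean(m_hat**2)  # theory: equal to power_m
cross = np.mean(m * m_hat)     # theory: 0
print(f"E[m^2] = {power_m:.4f}, E[m_hat^2] = {power_hat:.4f}, E[m m_hat] = {cross:.2e}")
```

Both the equal-power and the orthogonality identities hold to machine precision for this periodic test signal.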
If two stationary processes, X(t) and Y(t), are added to form a new process $$Z(t) = X(t) + Y(t)$$, then the autocorrelation function of the new process is given by
\begin{aligned} R_{Z}(\tau )= & {} E[Z(t) \cdot Z(t + \tau )] \nonumber \\= & {} E [ ( X(t) + Y(t) ) ( X(t+ \tau ) + Y(t + \tau ) ) ], \nonumber \end{aligned}
which implies
$$R_Z(\tau ) = E [ X(t) X(t + \tau ) + Y(t) Y(t + \tau ) + X(t) Y(t + \tau ) + X(t + \tau ) Y(t) ] .$$
By applying properties of the expected value to the above expression, it follows that
\begin{aligned} R_{Z}(\tau ) = R_{X}(\tau ) + R_{Y}(\tau ) + R_{XY}(\tau ) + R_{YX}(\tau ). \end{aligned}
(2.36)
If the processes X(t) and Y(t) are uncorrelated, and at least one of them has zero mean, then $$R_{XY}(\tau ) = R_{YX}(\tau ) = 0$$. Thus, $$R_{Z}(\tau )$$ can be written as
\begin{aligned} R_{Z}(\tau ) = R_{X}(\tau ) + R_{Y}(\tau ), \end{aligned}
(2.37)
and the associated power can be written as
$$P_{Z} = R_{Z}(0) = P_{X} + P_{Y}.$$
The corresponding power spectral density is given by
$$S_{Z}(\omega ) = S_{X}(\omega ) + S_{Y}(\omega ).$$

## 2.5 Linear Systems

Linear systems when examined under the light of the theory of stochastic processes provide more general and more interesting solutions than those resulting from classical analysis. This section deals with the response of linear systems to a random input X(t).
For a linear system, as illustrated in Fig. 2.25, the Fourier transform of its impulse response h(t) is given by
\begin{aligned} H(\omega )\,=\,\int _{-\infty }^{\infty }\,h(t)e^{-j\omega t}\,dt . \end{aligned}
(2.38)
The linear system response Y(t) is obtained by means of the convolution of the input signal with the impulse response as follows:
\begin{aligned} Y(t)= & {} X(t)*h(t)\ \ \Rightarrow \ \ Y(t)\,=\,\int _{-\infty }^{\infty }\,X(t-\alpha )\,h(\alpha )\,d\alpha \\= & {} \int _{-\infty }^{\infty }\,X(\alpha )\,h(t-\alpha )\,d\alpha . \end{aligned}
Expected Value of Output Signal
The mean value of the random signal at the output of a linear system is calculated as follows:
$$E[Y(t)] = E\left[ \int _{-\infty }^{\infty }\,X(t-\alpha )\,h(\alpha )\,d\alpha \right] = \int _{-\infty }^{\infty }\,E[X(t-\alpha )]\,h(\alpha )\,d\alpha$$
Considering the random signal X(t) to be stationary, at least in the mean, it follows that $$E[ X(t - \alpha ) ] = E[ X(t) ] = m_X$$, and thus
$$E[Y(t)] = m_X \int _{-\infty }^{\infty } h(\alpha )\,d\alpha = m_X H(0),$$
in which $$H(0) = \int _{-\infty }^{\infty }\,h(\alpha )\,d\alpha$$ follows from (2.38) computed at $$\omega = 0$$. Therefore, the mean value of the output signal depends only on the mean value of the input signal and on the value assumed by the transfer function at $$\omega = 0$$.
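The relation $$E[Y(t)] = m_X H(0)$$ can be verified with a simple discrete-time filter (the moving-average impulse response, the input mean, and the sample size below are assumed values):

```python
import numpy as np

# Noise with mean 2.0 through an FIR filter with H(0) = sum(h) = 2.0:
# the output mean should be close to m_X * H(0) = 4.0.
rng = np.random.default_rng(3)
m_x = 2.0
x = m_x + rng.normal(0.0, 1.0, size=200_000)

h = np.full(8, 0.25)                  # impulse response; H(0) = 8 * 0.25 = 2.0
y = np.convolve(x, h, mode="valid")

H0 = h.sum()
out_mean = np.mean(y)
print(f"E[Y] = {out_mean:.3f}, m_X * H(0) = {m_x * H0}")
```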

The Response of Linear Systems to Random Signals

The computation of the autocorrelation of the output signal, given the autocorrelation of the input signal to a linear system can be performed as follows.

The relationship between the input and the output of a linear system was shown earlier to be given by
$$Y(t)\,=\,\int _{-\infty }^{\infty }\,X(\rho )\,h(t-\rho )\,d\rho \,=\,\int _{-\infty }^{\infty }\,X(t-\rho )\,h(\rho )\,d\rho \,=\,X(t)*h(t).$$
The output autocorrelation function can be calculated directly from its definition as
\begin{aligned} R_{Y}(\tau )= & {} E[Y(t)Y(t+\tau )]\\= & {} E\left[ \int _{-\infty }^{\infty }\,X(t-\rho )\,h(\rho )\,d\rho \cdot \int _{-\infty }^{\infty }\,X(t+\tau -\sigma )\,h(\sigma )\,d\sigma \right] \\= & {} \int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,E[X(t-\rho )X(t+\tau -\sigma )]\cdot h(\rho )\cdot h(\sigma )\,d\rho \,d\sigma \\= & {} \int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,R_{X}(\tau +\rho -\sigma )\,h(\rho )\,h(\sigma )\,d\rho \,d\sigma . \end{aligned}
Example: Suppose that white noise with autocorrelation $$R_{X}(\tau )\,=\,\delta (\tau )$$ is the input signal to a linear system. The corresponding autocorrelation function of the output signal is given by
\begin{aligned} R_{Y}(\tau )= & {} \int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\, \delta (\tau +\rho -\sigma )\,h(\rho )\,h(\sigma )\,d\rho \,d\sigma \\= & {} \int _{-\infty }^{\infty }\,h(\sigma -\tau )\cdot h(\sigma )\,d\sigma \\= & {} h(-\tau ) *h(\tau ). \end{aligned}
The Fourier transform of $$R_{Y}(\tau )$$ leads to the following result:
$$R_Y(\tau ) = h(-\tau )*h(\tau )\ \ \Longleftrightarrow \ \ S_Y(\omega ) = H(-\omega )\cdot H(\omega ),$$
and for $$h(\tau )$$ a real function of $$\tau$$ it follows that $$H(-\omega )\,=\,H^{*}(\omega )$$, and consequently
$$S_Y(\omega ) = H(-\omega )\cdot H(\omega ) = H^{*}(\omega ) \cdot H(\omega ) = |H(\omega )|^2.$$
Summarizing, the output power spectral density is $$S_{Y}(\omega )\,=\,|H(\omega )|^{2}$$ when white noise is the input to a linear system.
In general, the output power spectral density can be computed by applying the Wiener–Khintchin theorem $$S_{Y}(\omega )\,=\,$$ $$\mathcal {F}$$ $$[R_{Y}(\tau )]$$,
$$S_{Y}(\omega )\, =\,\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,R_{X}(\tau +\rho -\sigma )\,h(\rho )\,h(\sigma )\cdot e^{-j\omega \tau }\,d\rho \,d\sigma \,d\tau .$$
Integrating on the variable $$\tau$$, it follows that
$$S_{Y}(\omega )\,=\,\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }\,S_{X}(\omega )e^{j\omega (\rho -\sigma )}\,h(\rho )\,h(\sigma )\,d\rho \,d\sigma .$$
Finally, removing $$S_X(\omega )$$ from the double integral and then separating the two variables in this double integral, it follows that
\begin{aligned} S_Y(\omega )= & {} S_{X}(\omega )\,\int _{-\infty }^{\infty } h(\rho ) e^{j\omega \rho }\,d\rho \,\int _{-\infty }^{\infty } h(\sigma ) e^{-j\omega \sigma }\,d\sigma \\= & {} S_{X}(\omega )\cdot H(-\omega )\cdot H(\omega ). \end{aligned}
Therefore, $$S_{Y}(\omega )\,=\,S_{X}(\omega )\cdot |H(\omega )|^{2}$$ will result whenever the system impulse response is a real function.
Example: Consider again white noise with autocorrelation function $$R_{X}(\tau )\,=\,\delta (\tau )$$ applied to a linear system. The white noise power spectral density is calculated as follows:
$$S_{X}(\omega )\,=\,\int _{-\infty }^{\infty }\,R_{X}(\tau )e^{-j\omega \tau }\,d\tau \, =\,\int _{-\infty }^{\infty }\, \delta (\tau )e^{-j\omega \tau }\,d\tau \,=\,1 ,$$
from which it follows that
$$S_{Y}(\omega )\,=\,|H(\omega )|^{2} ,$$
similar to the previous example.
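The relation $$S_{Y}(\omega ) = |H(\omega )|^{2} S_{X}(\omega )$$ can be checked by filtering unit-variance discrete white noise with a short FIR filter (the filter taps and block sizes below are assumed values) and comparing the averaged output periodogram with $$|H(\omega )|^{2}$$:

```python
import numpy as np

# White noise (S_X = 1) through a low-pass FIR filter: the output PSD
# estimate should follow |H(w)|^2.
rng = np.random.default_rng(11)
n_blocks, n = 2000, 512
h = np.array([0.5, 1.0, 0.5])

x = rng.normal(size=(n_blocks, n))
y = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), 1, x)

# Averaged periodogram of the output vs the theoretical |H|^2.
Sy = np.mean(np.abs(np.fft.rfft(y, axis=1))**2, axis=0) / n
H = np.fft.rfft(h, n)                  # zero-padded transfer function samples
Sy_theory = np.abs(H)**2               # S_X = 1 for unit-variance white noise

rel_err = np.max(np.abs(Sy - Sy_theory)) / np.max(Sy_theory)
print(f"max relative PSD error = {rel_err:.3f}")
```

The small residual error comes from the finite number of averaged blocks and from edge effects of the block-wise convolution.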
Example: The linear system shown in Fig. 2.26 is a differentiator, used in control systems and as a demodulator/detector for frequency-modulated signals.
Since the frequency response of the differentiator is $$H(\omega )\,=\,j\omega$$, the output power spectral density for this circuit is equal to
$$S_{Y}(\omega )\,=\,|j\omega |^{2}\cdot S_{X}(\omega )\,=\,\omega ^{2}S_{X}(\omega ).$$
It is thus noticed that, for frequency-modulated signals, the noise at the detector output follows a square law, i.e., the output power spectral density grows with the square of the frequency. In this manner, in a frequency-division multiplexing of frequency-modulated channels, the noise will affect more intensely those channels occupying the higher frequency region of the spectrum.
Figure 2.27 shows, as an illustration of what has been discussed so far about square noise, the spectrum of a low-pass flat noise (obtained by passing white noise through an ideal low-pass filter). This filtered white noise is applied to the differentiator circuit of the example, which in turn produces at the output the square law noise shown in Fig. 2.28.

Observation: Pre-emphasis circuits are used in FM modulators to compensate for the effect of square noise.

Other relationships among different correlation functions can be derived, as illustrated in Fig. 2.29. The correlation measures between input and output (input–output cross-correlation), and correlation between output and input can also be calculated from the input signal autocorrelation.
The correlation between the input and the output can be calculated with the formula
$$R_{XY}(\tau )\,=\,E[X(t)Y(t+\tau )] ,$$
and in an analogous manner, the correlation between the output and the input can be calculated as
$$R_{YX}(\tau )\,=\,E[Y(t)X(t+\tau )] .$$
For a linear system, the correlation between output and input is given by
$$R_{YX}(\tau )\,=\,E\left[ \int _{-\infty }^{\infty }\,X(t-\rho )\,h(\rho )\,d\rho \cdot X(t+\tau )\right] .$$
Exchanging the order of the expected value and integral computations, due to their linearity, it follows that
$$R_{YX}(\tau )\,=\,\int _{-\infty }^{\infty }\,E[X(t-\rho )X(t+\tau )]\,h(\rho )\,d\rho \,=\,\int _{-\infty }^{\infty }\,R_{X}(\tau +\rho )\,h(\rho )\,d\rho .$$
In a similar manner, the correlation between the input and the output is calculated as
\begin{aligned} R_{XY}(\tau )= & {} E\left[ X(t)\cdot \int _{-\infty }^{\infty }\,X(t+\tau -\rho )\,h(\rho )\,d\rho \right] \\= & {} \int _{-\infty }^{\infty }\,E[X(t)X(t+\tau -\rho )]\,h(\rho )\,d\rho , \end{aligned}
and finally,
$$R_{XY}(\tau )\,=\,\int _{-\infty }^{\infty }\,R_{X}(\tau - \rho )\,h(\rho )\,d\rho .$$
Therefore
$$R_{XY}(\tau )\,=\,R_{X}(\tau )*h(\tau )$$
and
$$R_{YX}(\tau )\,=\,R_{X}(\tau )*h(-\tau ) .$$
The resulting cross-power spectral densities, respectively, between input–output and between output–input are given by
$$S_{XY}(\omega )\,=\,S_{X}(\omega )\cdot H(\omega ),$$
$$S_{YX}(\omega )\,=\,S_{X}(\omega )\cdot H^{*}(\omega ) .$$
By using $$S_{Y}(\omega )\,=\,|H(\omega )|^{2}S_{X}(\omega )$$, the following relationships are immediate:
\begin{aligned} S_{Y}(\omega )= & {} H^{*}(\omega )\cdot S_{XY}(\omega ) \\= & {} H(\omega )\cdot S_{YX}(\omega ). \end{aligned}
Since it is usually easier to determine power spectral densities than autocorrelations, the determination of $$S_Y(\omega )$$ from $$S_X(\omega )$$ is of fundamental interest. Other power spectral densities are calculated afterward, and the correlations are then obtained by means of the Wiener–Khintchin theorem. This procedure is indicated pictorially in the following diagram.
$$\begin{array}{ccccccc} R_{X}(\tau ) &{} \longleftrightarrow &{} S_{X}(\omega ) &{} &{} &{} &{} \\ &{} &{} \downarrow &{} &{} &{} &{} \\ R_{Y}(\tau ) &{} \longleftrightarrow &{} S_{Y}(\omega ) &{} \longrightarrow &{} S_{XY}(\omega ) &{} \longleftrightarrow &{} R_{XY}(\tau ) \\ &{} &{} \downarrow &{} &{} &{} &{} \\ &{} &{} S_{YX}(\omega ) &{} &{} &{} &{} \\ &{} &{} \updownarrow &{} &{} &{} &{} \\ &{} &{} R_{YX}(\tau ) &{} &{} &{} &{} \end{array}$$
Phase Information

The autocorrelation is a special measure of average behavior of a signal. Consequently, it is not always possible to recover a signal from its autocorrelation. Since the power spectral density is a function of the autocorrelation, it also follows that signal recovery from its PSD is not always possible because phase information about the signal has been lost in the averaging operation involved. However, the cross-power spectral densities, relating input–output and output–input, preserve signal-phase information and can be used to recover the phase function explicitly.

The transfer function of a linear system can be written as
$$H(\omega )\,=\,|H(\omega )|e^{j\theta (\omega )},$$
in which the modulus $$|H(\omega )|$$ and the phase $$\theta (\omega )$$ are clearly separated. The complex conjugate of the transfer function is
$$H^{*}(\omega )\,=\,|H(\omega )|e^{-j\theta (\omega )}.$$
Since
$$S_Y(\omega ) = H^{*}(\omega )S_{XY}(\omega )\,=\,H(\omega )\cdot S_{YX}(\omega ) ,$$
it follows for a real h(t) that
$$|H(\omega )|e^{-j\theta (\omega )}\cdot S_{XY}(\omega )\,=\,|H(\omega )|\cdot e^{j\theta (\omega )}\cdot S_{YX}(\omega )$$
and finally,
$$e^{2j \theta (\omega )}\,=\,\frac{S_{XY}(\omega )}{S_{YX}(\omega )} .$$
The function $$\theta (\omega )$$ can then be extracted, thus giving
$$\theta (\omega ) \,=\, \frac{1}{2 j} \ln { \frac{S_{XY}(\omega )}{S_{YX}(\omega )} } ,$$
which is the desired signal-phase information.
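The phase-recovery formula can be exercised numerically: drive a known FIR filter (assumed taps below, chosen so that $$\theta (\omega )$$ stays within $$(-\pi /2, \pi /2)$$ and no phase unwrapping is needed) with white noise, estimate the cross-spectra by averaged cross-periodograms, and compare the recovered phase with the true one:

```python
import numpy as np

# Recover theta(w) from S_XY / S_YX = exp(2j theta(w)).
rng = np.random.default_rng(5)
n_blocks, n = 4000, 256
h = np.array([1.0, 0.6, 0.2])          # filter with a nontrivial phase

x = rng.normal(size=(n_blocks, n))
y = np.apply_along_axis(lambda r: np.convolve(r, h, mode="full")[:n], 1, x)

X = np.fft.rfft(x, axis=1)
Y = np.fft.rfft(y, axis=1)
Sxy = np.mean(np.conj(X) * Y, axis=0) / n   # estimate of S_X(w) H(w)
Syx = np.mean(np.conj(Y) * X, axis=0) / n   # estimate of S_X(w) H*(w)

theta_est = 0.5 * np.angle(Sxy / Syx)       # (1/2j) ln(S_XY / S_YX)
theta_true = np.angle(np.fft.rfft(h, n))

max_phase_err = np.max(np.abs(theta_est - theta_true))
print(f"max phase error = {max_phase_err:.3f} rad")
```

The recovered phase matches the filter phase closely, confirming that the cross-spectra preserve the phase information that the PSD discards.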
Example: The Hilbert transform provides a simple example of application of the preceding theory. The time domain representation (filter impulse response) of the Hilbert transform is shown in Fig. 2.30. The frequency-domain representation (transfer function) of the Hilbert transform is shown in Figs. 2.31 and 2.32.
Since for the Hilbert transform $$H(\omega )\,=\,j\left[ u(-\omega )\,-\,u(\omega ) \right]$$, it follows that $$|H(\omega )|^{2} = 1$$, and thus from $$S_{Y}(\omega )\,=\,|H(\omega )|^{2}\cdot S_{X}(\omega )$$ it follows that
$$S_{Y}(\omega )\,=\,S_{X}(\omega ).$$
The fact that $$S_{Y}(\omega )\,=\,S_{X}(\omega )$$ comes as no surprise since the Hilbert transform acts only on the signal phase and the PSD does not contain phase information.

## 2.6 Mathematical Formulation for the Digital Signal

This section presents a mathematical formulation for the digital signal, including the computation of the autocorrelation function and the power  spectrum density.

The digital signal, which can be produced by the digitization of a speech signal, or generated directly by a computer connected to the Internet, or by other equipment, can be mathematically expressed as
\begin{aligned} m(t) = \sum _{k=-\infty }^{\infty } m_k p(t - k T_b), \end{aligned}
(2.39)
in which $$m_k$$ represents the kth randomly generated symbol from the discrete alphabet, p(t) is the pulse function that shapes the transmitted signal, and $$T_b$$ is the bit interval.

### 2.6.1 Autocorrelation for the Digital Signal

The autocorrelation function for signal m(t), which can be nonstationary, is given by the formula
\begin{aligned} R_{M}( \tau ,t ) = E[ m( t ) m( t + \tau ) ]. \end{aligned}
(2.40)
Substituting m(t) into Eq. 2.40 gives
\begin{aligned} R_{M}( \tau ,t ) = E \left[ \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } m_k p(t - k T_b) m_i p(t + \tau - i T_b) \right] . \end{aligned}
(2.41)
The expected value operator applies directly to the random symbols, because of the linearity property, giving
\begin{aligned} R_{M}( \tau ,t ) = \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } E \left[ m_k m_i \right] p(t - k T_b) p(t + \tau - i T_b) . \end{aligned}
(2.42)
In order to eliminate the time dependency, the time average is taken in Eq. 2.42, producing
\begin{aligned} R_{M}( \tau ) = \frac{1}{T_b} \int _{0}^{T_b} R_{M}( \tau ,t ) dt, \end{aligned}
(2.43)
or, equivalently
\begin{aligned} R_{M}( \tau ) = \frac{1}{T_b} \int _{0}^{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } E \left[ m_k m_i \right] p(t - k T_b) p(t + \tau - i T_b) dt. \end{aligned}
(2.44)
Interchanging the order of integration and summation, it follows that
\begin{aligned} R_{M}( \tau ) = \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } E \left[ m_k m_i \right] \int _{0}^{T_b} p(t - k T_b) p(t + \tau - i T_b) dt. \end{aligned}
(2.45)
Defining the discrete autocorrelation as
\begin{aligned} R( k-i ) = E \left[ m_k m_i \right] , \end{aligned}
(2.46)
the  signal autocorrelation can be written as
\begin{aligned} R_{M}( \tau ) = \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) \int _{0}^{T_b} p(t - k T_b) p(t + \tau - i T_b) dt. \end{aligned}
(2.47)
For a rectangular pulse, with independent and equiprobable symbols, the autocorrelation function is given by
\begin{aligned} R_M(\tau ) = A^2[ 1 - \frac{|\tau |}{T_b}][ u(\tau + T_b) - u(\tau - T_b)], \end{aligned}
(2.48)
in which $$T_b$$ is the bit interval and A represents the pulse amplitude.
Figure 2.33 shows that this function has a triangular shape. Its maximum occurs at the origin (signal power) and is equal to $$A^2$$. The autocorrelation decreases linearly with the time interval and reaches zero at time $$T_b$$.
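The triangular shape can be reproduced by simulation (samples per bit, amplitude, and sequence length below are assumed values):

```python
import numpy as np

# Polar NRZ signal: equiprobable amplitudes +A and -A, rectangular pulses
# lasting sps samples. The sample autocorrelation should approximate
# A^2 (1 - |tau|/T_b) for |tau| < T_b and vanish beyond T_b.
rng = np.random.default_rng(9)
A, sps, n_bits = 1.0, 16, 50_000

symbols = rng.choice([-A, A], size=n_bits)
m = np.repeat(symbols, sps)              # rectangular pulse shaping

lags = np.arange(2 * sps + 1)            # lags from 0 to 2 T_b
R = np.array([np.mean(m[: m.size - k] * m[k:]) for k in lags])
R_theory = np.where(lags < sps, A**2 * (1 - lags / sps), 0.0)

max_dev = np.max(np.abs(R - R_theory))
print(f"R(0) = {R[0]:.3f}, max deviation from triangle = {max_dev:.3f}")
```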

### 2.6.2 Power Spectrum Density for the Digital Signal

The power spectrum density for the digital signal can be obtained by taking the Fourier transform of the autocorrelation function, Eq. 2.47. This is the well-known Wiener–Khintchin theorem.
\begin{aligned} S_M(\omega ) = \int _{-\infty }^{\infty } R_M(\tau ) e^{-j\omega \tau } d\tau , \end{aligned}
(2.49)
Therefore,
\begin{aligned} S_M(\omega )&= \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) \nonumber \\&\cdot \int _{-\infty }^{\infty } \int _{0}^{T_b} p(t - k T_b) p(t + \tau - i T_b) e^{-j\omega \tau } dt d\tau . \end{aligned}
(2.50)
Changing the order of integration, one can compute the Fourier integral of the shifted pulse that can be written as
\begin{aligned} S_M(\omega ) = \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) \int _{0}^{T_b} p(t - k T_b) P(\omega ) e^{-j\omega (i T_b - t)} dt . \end{aligned}
(2.51)
The term $$P(\omega ) e^{-j\omega i T_b }$$ is independent of time and can be taken out of the integral, i.e.,
\begin{aligned} S_M(\omega ) = \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) P(\omega ) e^{-j\omega i T_b } \int _{0}^{T_b} p(t - k T_b) e^{j\omega t} dt . \end{aligned}
(2.52)
Computing the integral in Eq. 2.52 gives
\begin{aligned} S_M(\omega ) = \frac{1}{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) P(\omega ) P(-\omega ) e^{-j\omega (i-k) T_b }. \end{aligned}
(2.53)
As can be observed from the previous equation, the shape of the spectrum for the random digital signal depends on the pulse shape, defined by $$P(\omega )$$, and also on the manner in which the symbols relate to each other, specified by the discrete autocorrelation function $$R(k-i)$$.

Therefore, the signal design involves pulse shaping as well as the control of the correlation between the transmitted symbols, which can be  obtained by signal processing.

For a real pulse shape p(t), one can write $$P(-\omega )= P^{*}(\omega )$$, and the power spectrum density can be written as
\begin{aligned} S_M(\omega ) = \frac{ | P(\omega ) |^2 }{T_b} \sum _{k=-\infty }^{\infty } \sum _{i=-\infty }^{\infty } R( k-i ) e^{-j\omega (i-k) T_b }, \end{aligned}
(2.54)
which can be simplified to
\begin{aligned} S_M(\omega ) = \frac{ | P(\omega ) |^2 }{T_b} S(\omega ), \end{aligned}
(2.55)
in which, letting $$l = k-i$$ and using the symmetry $$R(l)\,=\,R(-l)$$ of the discrete autocorrelation, the summations can be simplified, and the power spectrum density for the discrete sequence of symbols is given by
\begin{aligned} S(\omega ) = \sum _{l=-\infty }^{l=\infty } R( l ) e^{-j\omega l T_b }. \end{aligned}
(2.56)
For the example of Eq. 2.48, the corresponding power spectrum density is
\begin{aligned} S_M(\omega ) = A^2 T_b \, \frac{ \sin ^2 (\omega T_b/2) }{ (\omega T_b/2)^2}, \end{aligned}
(2.57)
which is the sampling function squared, and shows that the random digital signal has a continuous spectrum occupying a large portion of the frequency axis. The function is sketched in Fig. 2.34. The first null is a usual measure of the bandwidth and is given by $$\omega _M = \pi /T_b$$.
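A short numerical check can make the null structure of Eq. 2.57 concrete. The sketch below (illustrative only, with A and $$T_b$$ normalized to unity) evaluates the spectrum at the origin and at the first null $$\omega = \pi /T_b$$:

```python
import math

def psd(omega, A=1.0, Tb=1.0):
    """Eq. 2.57: S_M(omega) = A^2 Tb sin^2(omega Tb) / (omega Tb)^2."""
    x = omega * Tb
    if x == 0.0:
        return A * A * Tb  # limiting value at the origin
    return A * A * Tb * math.sin(x) ** 2 / x ** 2

Tb = 1.0
print(psd(0.0, Tb=Tb))           # peak value A^2 Tb at the origin
print(psd(math.pi / Tb, Tb=Tb))  # ~0: the first spectral null
```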

The signal bandwidth can be defined in several ways. The most common is the half-power bandwidth ($$\omega _{3dB}$$), found by halving the maximum of the power spectrum density and determining the frequency at which the spectrum falls to this value.
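The half-power frequency of the spectrum in Eq. 2.57 has no simple closed form, but it is easily found numerically. The sketch below (illustrative, assuming $$A = T_b = 1$$) bisects on the main lobe for the frequency at which the spectrum falls to half its peak:

```python
import math

def psd(omega, Tb=1.0):
    """Normalized version of Eq. 2.57 with A = 1."""
    x = omega * Tb
    return Tb if x == 0.0 else Tb * math.sin(x) ** 2 / x ** 2

Tb = 1.0
half_peak = 0.5 * psd(0.0, Tb)
lo, hi = 0.0, math.pi / Tb       # the spectrum decreases over the main lobe
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if psd(mid, Tb) > half_peak:
        lo = mid                 # still above half power: move right
    else:
        hi = mid                 # below half power: move left
omega_3dB = 0.5 * (lo + hi)
print(omega_3dB)
```

For $$T_b = 1$$ the search converges to $$\omega _{3dB} \approx 1.39$$ rad/s, the half-power point of the squared sampling function.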

The root mean square (RMS) bandwidth is computed using the frequency deviation around the carrier, if the signal is modulated, or around the origin, for a baseband signal. The squared RMS bandwidth of a baseband signal is given by
\begin{aligned} \omega _{RMS}^2 = \frac{ \int _{-\infty }^{\infty } \omega ^2 S_M( \omega ) d\omega }{ \int _{-\infty }^{\infty } S_M( \omega ) d\omega }. \end{aligned}
(2.58)
The previous formula is equivalent to
\begin{aligned} \omega _{RMS}^2 = \frac{ - R_M^{\prime \prime } (0) }{R_M(0)}. \end{aligned}
(2.59)
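As an illustration of the RMS bandwidth computation, the sketch below (assuming, for the example only, a Gaussian-shaped baseband spectrum $$S_M(\omega ) = e^{-\omega ^2 / 2\sigma ^2}$$) evaluates the ratio of the second spectral moment to the total spectral area, which should equal $$\sigma ^2$$:

```python
import math

def rms_bandwidth_sq(S, w_max, n=20000):
    """Ratio of the second spectral moment to the total area (trapezoidal rule)."""
    dw = 2.0 * w_max / n
    num = den = 0.0
    for i in range(n + 1):
        w = -w_max + i * dw
        weight = 0.5 if i in (0, n) else 1.0
        num += weight * w * w * S(w) * dw
        den += weight * S(w) * dw
    return num / den

sigma = 2.0
gaussian_psd = lambda w: math.exp(-w * w / (2.0 * sigma ** 2))
print(rms_bandwidth_sq(gaussian_psd, w_max=40.0))  # ~ sigma^2 = 4
```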
The white noise bandwidth can be computed by equating the noise power spectrum density to the maximum of the signal power spectrum density, $$S_N = \max S_M(\omega )$$. The powers of signal and noise are then equated, and $$\omega _N$$ is obtained. The noise power is $$P_N = 2 \omega _{N} S_N$$ and the signal power, $$P_M$$, is given by the formula
\begin{aligned} P_M = R_M(0) = \frac{1}{2 \pi } \int _{-\infty }^{\infty } S_M( \omega ) d\omega . \end{aligned}
(2.60)
Some signals exhibit a finite bandwidth, in which case the spectrum vanishes beyond a certain frequency.  Finally, the percent bandwidth is computed by finding the frequency range that includes 90% of the signal power.
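The percent bandwidth of the spectrum in Eq. 2.57 can be found numerically. The sketch below (illustrative, working in the normalized variable $$x = \omega T_b$$ and using the known total $$\int _{-\infty }^{\infty } \sin ^2 x / x^2 \, dx = \pi$$) bisects for the frequency containing 90% of the power:

```python
import math

def sinc2(x):
    """sin^2(x)/x^2, the normalized shape of the spectrum in Eq. 2.57."""
    return 1.0 if x == 0.0 else math.sin(x) ** 2 / x ** 2

def power_fraction(x0, n=20000):
    """Fraction of the total power (= pi) contained in [-x0, x0]."""
    if x0 == 0.0:
        return 0.0
    dx = x0 / n
    s = 0.5 * (sinc2(0.0) + sinc2(x0)) + sum(sinc2(i * dx) for i in range(1, n))
    return 2.0 * s * dx / math.pi

# Bisect for the normalized frequency x90 = omega * Tb holding 90% of the power.
lo, hi = 0.0, 50.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if power_fraction(mid) < 0.9:
        lo = mid
    else:
        hi = mid
x90 = 0.5 * (lo + hi)
print(x90)
```

The result falls inside the main lobe, consistent with the classical observation that the main lobe of the squared sampling function carries roughly 90% of the power.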

## 2.7 Problems

1. (1)
Given two events A and B, under which conditions are the following relations true?
1. a.

$$A \cap B = \Omega$$

2. b.

$$A \cup B = \Omega$$

3. c.

$$A \cap B = \bar{A}$$

4. d.

$$A \cup B = \emptyset$$

5. e.

$$A \cup B = A \cap B$$

2. (2)

If A, B, and C are arbitrary events in a sample space $$\Omega$$, express $$A \cup B \cup C$$ as the union of three disjoint sets.

3. (3)

Show that $$P\{A \cap B \} \le P\{A \} \le P\{A \cup B \} \le P\{A \} + P\{B \}$$ and specify the conditions for which equality holds.

4. (4)

If $$P\{A \} = a$$, $$P\{B \} = b$$ and $$P\{ A \cap B \} = ab$$, find $$P\{ A \cap \bar{B} \}$$ and $$P\{ \bar{A} \cap \bar{B} \}$$.

5. (5)

Given that A, B, and C are events in a given random experiment, show that the probability that exactly one of the events A, B, or C occurs is $$P\{A \} + P\{B \} + P\{C\} - 2P\{ A \cap B \} - 2P\{B \cap C \} - 2P\{ A \cap C \} + 3P\{A \cap B \cap C \}.$$ Illustrate the solution with a Venn diagram.

6. (6)

Prove that a set with N elements has $$2^N$$ subsets.

7. (7)
Let A, B, and C be arbitrary events in a sample space $$\Omega$$, each one with a nonzero probability. Show that the sample space $$\Omega$$, conditioned on A, provides a valid probability measure by proving the following:
1. a.

$$P\{ \Omega | A \} = 1$$,

2. b.

$$P\{ B|A \} \le P\{ C|A \}, \ \mathrm{if} \ B \subset C$$,

3. c.

$$P\{ B|A \} + P\{ C|A \} = P\{ B \cup C | A \} \ \mathrm{if} \ B \cap C = \emptyset$$.

Show also that $$P\{ B|A \} + P\{ \bar{B}|A \} = 1$$.

8. (8)
Show that if $$A_1, A_2, \dots , A_N$$ are independent events then
$$P\{A_1 \cup A_2 \dots \cup A_N \} = 1 - ( 1- P\{ A_1 \} ) ( 1- P\{ A_2 \} ) \dots ( 1- P\{ A_N \} ).$$

9. (9)

Show that if A and B are independent events then A and $$\bar{B}$$ are also independent events.

10. (10)

Consider the events $$A_1, A_2, \dots , A_n$$ belonging to the sample space $$\Omega$$. If $$\sum _{i=1}^{n} P\{A_i\} = 1$$, under which conditions does $$\bigcup _{i=1}^{n} A_i = \Omega$$ hold? If $$A_1, A_2, \dots , A_n$$ are independent and $$P\{ A_i \} = \theta , \ i = 1, \dots , n$$, find an expression for $$P \{ \bigcup _{i=1}^{n} A_i \}$$.

11. (11)

The sample space $$\Omega$$ consists of the interval [0, 1]. If sets represented by equal lengths in the interval are equally likely, find the conditions for which two events are statistically independent.

12. (12)
Using mathematical induction applied to Kolmogorov’s axioms, for $$A_1, A_2, \ldots , A_n$$ mutually exclusive events, prove the following property:
$$P \left[ \bigcup _{k=1}^{n} A_k \right] = \sum _{k=1}^{n} P \left[ A_k \right] , \ \mathrm{for} \ n \ge 2.$$

13. (13)

A family of sets $$A_n,\ n=1,2,\ldots$$ is said to increase to A, denoted $$A_n \uparrow A$$, if $$A_n \subset A_{n+1}$$ and $$\bigcup _{n \ge 1} A_n = A$$. Using countable additivity, show that if $$A_n \uparrow A$$, then $$P(A_n) \uparrow P(A)$$.

14. (14)

For the Venn diagram in Fig. 2.35, consider that the elements $$\omega$$ are represented by points and are equiprobable, i.e., $$P( \omega ) = \frac{1}{4}$$, $$\forall \omega \in \Omega$$. Prove that the events A, B, and C are not independent.

15. (15)
For the r.v. X it is known that
$$P \{ X> t \} = e^{- \mu t} ( \mu t + 1 ), \ \mu> 0, \ t>0.$$
Find $$P_X(x)$$, $$p_X(x)$$, and $$P \{ X > 1/\mu \}$$.

16. (16)
The ratio between the deviations in the length and width of a substrate has the following pdf:
$$p_X(x) = \frac{ a }{ 1 + x^2 }, \ -\infty< x < \infty .$$
Calculate the value of a and find the CPF of X.

17. (17)

The r.v. X is uniformly distributed in the interval (a, b). Derive an expression for the $$n^{th}$$ moments $$E[X^n]$$ and $$E[(X - m_X)^n]$$, in which $$m_X = E[X]$$.

18. (18)

For a Poisson r.v. X with parameter $$\lambda$$, show that $$P\{ X \ \mathrm{even} \} = \frac{1}{2} ( 1 + e^{-2 \lambda } )$$.
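The identity in the previous problem can be verified numerically by summing the Poisson probability mass function over the even integers. A sketch (illustrative parameter values) comparing the direct sum with the closed form:

```python
import math

def prob_even(lam, terms=400):
    """P{X even} for a Poisson r.v. with parameter lam, by summing the pmf."""
    term = math.exp(-lam)       # P{X = 0}
    total = 0.0
    for k in range(terms):
        if k % 2 == 0:
            total += term
        term *= lam / (k + 1)   # recursion: P{X = k+1} from P{X = k}
    return total

for lam in (0.5, 1.0, 3.0):
    print(lam, prob_even(lam), 0.5 * (1.0 + math.exp(-2.0 * lam)))
```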

19. (19)

For a r.v. X with mean $$m_X$$ and variance $$\sigma _X^2$$, compute the moment $$E[(X - C)^2]$$ for an arbitrary constant C. For which value of C is the moment $$E[(X - C)^2]$$ minimized?

20. (20)

An exponential r.v. X, with parameter $$\alpha$$, has $$P\{ X \ge x \} = e^{-\alpha x}$$. Show that $$P\{ X> t + s | X> t \} = P\{ X > s \}$$.
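The memoryless property asked for above follows from $$P\{ X> t + s | X> t \} = P\{ X \ge t+s \} / P\{ X \ge t \}$$, and a quick numeric check (illustrative values for $$\alpha$$, t, and s) confirms it:

```python
import math

alpha = 0.7
survival = lambda x: math.exp(-alpha * x)   # P{X >= x} for the exponential r.v.

t, s = 1.3, 2.1
# P{X > t + s | X > t} = P{X > t + s} / P{X > t}, since {X > t+s} implies {X > t}
conditional = survival(t + s) / survival(t)
print(conditional, survival(s))  # equal: the exponential is memoryless
```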

21. (21)

A communication signal has a normal pdf with zero mean and variance $$\sigma ^2$$. Design a compressor/expander for the ideal quantizer for this distribution.

22. (22)
A voice signal with a Gamma bilateral amplitude probability distribution, given below, is fed through a compressor obeying the $$\mu$$-Law of ITU-T. Compute the pdf of the signal at the compressor output and sketch an input versus output diagram for $$\gamma = 1$$ and $$\alpha = 1/2, 1$$ and 2.
$$p_X(x) = \frac{ \gamma ^{ \alpha } }{ \Gamma ( \alpha ) } |x|^{ \alpha - 1 } e^{ - | \gamma x | }, \ \alpha , \ \gamma > 0.$$

23. (23)

For a given exponential probability distribution with parameter $$\alpha$$, compute the probability that the r.v. takes values exceeding $$2/\alpha$$. Estimate this probability using Chebyshev's inequality.

24. (24)

Calculate all the moments of a Gaussian distribution.

25. (25)
Show, for the Binomial distribution
$$p_X(x) = \sum _{k=0}^{N} \left( \begin{array}{c} N \\ k \end{array} \right) p^k (1-p)^{N-k} \delta (x - k)$$
that its characteristic function is given by
$$P_X(\omega ) = \left[ 1 - p + p e^{j \omega } \right] ^N.$$
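The claimed characteristic function can be checked numerically by comparing the direct expectation $$E[e^{j\omega X}]$$, summed over the pmf, with the closed form. A sketch (arbitrary illustrative values of N, p, and $$\omega$$):

```python
import cmath
import math

N, p, w = 6, 0.3, 1.7

# Characteristic function computed directly from the binomial pmf ...
direct = sum(math.comb(N, k) * p ** k * (1 - p) ** (N - k) * cmath.exp(1j * w * k)
             for k in range(N + 1))
# ... and from the closed form [1 - p + p e^{jw}]^N
closed = (1.0 - p + p * cmath.exp(1j * w)) ** N
print(abs(direct - closed))  # ~0
```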

26. (26)
The Erlang distribution has a characteristic function given by
$$P_X(\omega ) = \left[ \frac{ a }{ a + j \omega } \right] ^N, \ a > 0, \ N = 1, \ 2, \ \dots .$$
Show that $$E[ X ] = N/a$$, $$E[ X^2 ] = N(N+1)/a^2$$, and $$\sigma _X^2 = N/a^2$$.
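Since an Erlang r.v. with parameters (N, a) is the sum of N independent exponentials with parameter a, the stated moments can be checked by simulation. A Monte Carlo sketch (illustrative parameters, fixed seed):

```python
import random

random.seed(42)
a, N, trials = 2.0, 3, 200000

# An Erlang(N, a) r.v. is the sum of N independent exponentials with rate a.
samples = [sum(random.expovariate(a) for _ in range(N)) for _ in range(trials)]
mean = sum(samples) / trials
second = sum(x * x for x in samples) / trials
print(mean, second, second - mean ** 2)  # ~ N/a, N(N+1)/a^2, N/a^2
```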

27. (27)
The Weibull distribution is given by
$$p_X(x) = ab x^{b-1} e^{ - a x^b } u(x) .$$
Calculate the mean, the second moment, and the variance for this distribution.

28. (28)
For the Poisson distribution
$$p_X(x) = e^{-b} \sum _{k=0}^{\infty } \frac{ b^k }{ k ! } \delta (x - k) ,$$
calculate the cumulative probability function, $$P_X(x)$$, and show that the characteristic function is given by
$$P_X(\omega ) = e^{ -b ( 1 - e^{j\omega } ) }.$$

29. (29)
Calculate the statistical mean and the second moment of Maxwell’s distribution
$$p_Y(y) = \sqrt{ \frac{2}{\pi } } \frac{y^2}{\sigma ^3} e^{ - \frac{y^2}{2 \sigma ^2} } u(y),$$
exploiting the relationship between a Gaussian distribution and its characteristic function:
$$p_X(x) = \frac{ 1 }{ \sqrt{ 2 \pi } \sigma } e^{ - \frac{x^2}{2 \sigma ^2} }$$
$$P_X(\omega ) = e^{ - \frac{\sigma ^2 \omega ^2}{2} }.$$

30. (30)
Show that, for a nonnegative r.v. X, it is possible to calculate its mean value through the formula
$$E[X] = \int _0^{\infty } (1 - P_X(x)) dx.$$
Using this formula, calculate the mean of the exponential distribution of the previous problem.
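The tail-integral formula for the mean can be checked numerically in the exponential case, where $$1 - P_X(x) = e^{-\alpha x}$$ and the integral should return $$1/\alpha$$. A sketch (illustrative $$\alpha$$, trapezoidal rule on a long finite interval):

```python
import math

alpha = 2.0
tail = lambda x: math.exp(-alpha * x)   # 1 - P_X(x) = P{X > x}

# E[X] = integral of the tail over [0, infinity), by the trapezoidal rule
n, x_max = 100000, 20.0
dx = x_max / n
mean = dx * (0.5 * (tail(0.0) + tail(x_max)) +
             sum(tail(i * dx) for i in range(1, n)))
print(mean)  # ~ 1/alpha = 0.5
```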

31. (31)

The dynamic range of a discrete-time signal X(n) is defined as $$W = X_{\max } - X_{\min }$$. Assuming that the samples of the signal X(n), $$n = 1, 2, \ldots , N$$, are independent and identically distributed, calculate the probability distribution of W.

32. (32)
For the following joint distribution
$$p_{XY}(x,y) = u(x) u(y) x e^{ - x ( y + 1) }$$
calculate the marginal distributions, $$p_X(x)$$ and $$p_Y(y)$$, and show that the conditional probability density function is given by
$$p_{Y|X}(y|x) = u(x) u(y) x e^{ - x y }.$$

33. (33)

The r.v.’s V and W are defined in terms of X and Y as $$V = X + a Y$$ and $$W = X - aY$$, in which a is a real number. Determine a as a function of the moments of X and Y, such that V and W are orthogonal.

34. (34)
Show that $$E[ E[ Y|X ] ] = E[ Y ]$$, in which
$$E[ E[ Y|X ] ] = \int _{-\infty }^{\infty } E[ Y|x] p_X(x) dx.$$

35. (35)
Find the distribution of the r.v. $$Z = X/Y$$, assuming that X and Y are statistically independent r.v.’s having an exponential distribution with parameter equal to 1. Use the formula
$$p_Z(z) = \int _{-\infty }^{\infty } |y| p_{XY}(yz,y) dy.$$
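One way to sanity-check a solution: applying the quoted quotient formula with $$p_{XY}(x,y) = e^{-x} e^{-y}$$, $$x, y > 0$$, gives $$p_Z(z) = 1/(1+z)^2$$ for $$z > 0$$ and hence the cumulative function $$z/(1+z)$$. The Monte Carlo sketch below (fixed seed, illustrative sample size) compares this with the empirical distribution:

```python
import random

random.seed(7)
trials = 200000
# X and Y independent exponentials with parameter 1; Z = X / Y
zs = [random.expovariate(1.0) / random.expovariate(1.0) for _ in range(trials)]

# Compare the empirical CDF of Z with the analytic z/(1+z)
for z in (0.5, 1.0, 2.0):
    empirical = sum(v <= z for v in zs) / trials
    print(z, empirical, z / (1.0 + z))
```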

36. (36)

A random variable Y is defined by the equation $$Y = X + \beta$$, in which X is a r.v. with an arbitrary probability distribution and $$\beta$$ is a constant. Determine the value of $$\beta$$ which minimizes $$E[Y^2]$$. Use this value and the expression for $$E[(X \pm Y)^2]$$ to determine a lower and an upper limit for E[XY].

37. (37)
Two r.v.’s, X and Y, have the following characteristic functions, respectively:
$$P_X(\omega ) = \frac{\alpha }{\alpha + j \omega } \ \ \mathrm{and} \ \ P_Y(\omega ) = \frac{\beta }{\beta + j \omega }.$$
Calculate the statistical mean of the sum $$X+Y$$.

38. (38)

Determine the probability density function of $$Z = X/Y$$. Given that X and Y are statistically independent r.v.'s with zero mean and Gaussian distribution, show that Z is Cauchy distributed.

39. (39)
Determine the probability density function of $$Z = X Y$$, given that X is a r.v. with zero mean and a Gaussian distribution, and Y is a r.v. with the following distribution:
$$p_Y(y) = \frac{1}{2} [\delta (y+1) + \delta (y-1) ].$$

40. (40)

Let X and Y be statistically independent r.v.’s having exponential distribution with parameter $$\alpha = 1$$. Show that $$Z = X + Y$$ and $$W = X/Y$$ are statistically independent and find their respective probability density function and cumulative probability function.

41. (41)
Show that if X and Y are statistically independent r.v.’s then
$$P(X < Y) = \int _{-\infty }^{\infty } (1 - P_Y(x)) p_X(x) dx.$$
Suggestion: sketch the region $$\{ X<Y \}$$.

42. (42)
By using the Cauchy–Schwarz inequality, show that
$$P_{XY} (x,y) \le \sqrt{ P_X(x) P_Y(y) }.$$

43. (43)
Given the joint distribution
$$p_{XY} (x,y) = k x y e^{- x^2 - y^2},$$
in which X and Y are nonnegative r.v.'s, determine k, $$p_X(x)$$, $$p_Y(y)$$, $$p_{X|Y}(x|y)$$, $$p_{Y|X}(y|x)$$, and the first and second moments of this distribution.

44. (44)

Determine the relationship between the cross-correlation and the mean of two uncorrelated processes.

45. (45)

Design an equipment to measure the cross-correlation between two signals, employing delay units, multipliers, and integrators.

46. (46)
Show that the RMS bandwidth of a signal X(t) is given by
$$B_{RMS}^2 = \frac{ -1 }{ R_X(0) } \frac{ d^2 R_X(\tau ) }{ d \tau ^2 }, \ \mathrm{for}\ \tau = 0.$$

47. (47)
For the complex random process
$$X(t) = \sum _{n=1}^N A_n e^{ j \omega _o t + j \theta _n }$$
in which $$A_n$$ and $$\theta _n$$ are statistically independent r.v.’s, $$\theta _n$$ is uniformly distributed in the interval $$[0,2\pi ]$$ and $$n = 1, \ 2, \dots , N$$, show that
$$R_X(\tau ) = E [ X^*(t) X(t + \tau ) ] = e^{ j \omega _o \tau } \sum _{n=1}^N E[ A_n^2 ].$$

48. (48)
The process X(t) is stationary with autocorrelation $$R_X(\tau )$$. Let
$$Y = \int _a^{a + T} X(t) dt, T>0, \ a \ \mathrm{real},$$
and then show that
$$E[ |Y|^2 ] = \int _{-T}^{T} ( T - |\tau |) R_X(\tau ) d \tau .$$

49. (49)
The processes X(t) and Y(t) are jointly wide-sense stationary. Let
$$Z(t) = X(t) \cos \omega _c t + Y(t) \sin \omega _c t.$$
Under which conditions, in terms of means and correlation functions of X(t) and Y(t), is Z(t) wide-sense stationary? Applying these conditions, compute the power spectral density of the process Z(t). What power spectral density would result if X(t) and Y(t) were uncorrelated?

50. (50)
For a given complex random process $$Z(t) = X(t) + jY(t)$$, show that
$$E[ | Z(t) |^2 ] = R_X(0) + R_Y(0).$$

51. (51)
Considering that the geometric mean of two positive numbers cannot exceed the corresponding arithmetic mean, and that $$E [ ( Y(t + \tau ) + \alpha X(t) )^2 ] \ge 0$$, show that
$$| R_{XY}(\tau ) | \le \frac{1}{2} [ R_X(0) + R_Y(0) ].$$

52. (52)
Let X(t) be a wide-sense stationary process, with mean $$E[X(t)] \ne 0$$. Show that
$$S_{X}(\omega ) = 2 \pi E^2[ X(t) ] \delta (\omega ) + \int _{-\infty }^{\infty } C_X(\tau ) e^{ - j \omega \tau } d \tau ,$$
in which $$C_X(\tau )$$ is the autocovariance function of X(t).

53. (53)
Calculate the RMS bandwidth
$$B_{RMS}^2 = \frac{ \int _{-\infty }^{\infty } \omega ^2 S_{X}(\omega ) d \omega }{ \int _{-\infty }^{\infty } S_{X}(\omega ) d \omega }$$
of a modulated signal having the following power spectral density:
$$S_{X}(\omega ) = \frac{ \pi A^2 }{ 2 \Delta _{FM} } \left[ p_X \left( \frac{ \omega + \omega _c }{ \Delta _{FM} } \right) + p_X \left( \frac{ \omega - \omega _c }{ \Delta _{FM} } \right) \right] ,$$
in which $$p_X( \cdot )$$ denotes the probability density function of the signal X(t).

54. (54)

The autocorrelation $$R_X(\tau )$$ can be seen as a measure of similarity between X(t) and $$X(t+\tau )$$. In order to illustrate this point, consider the process $$Y(t) = X(t) - \rho X(t+\tau )$$ and determine the value of $$\rho$$ which minimizes the mean square value of Y(t).

55. (55)

Calculate the cross-correlation between the processes $$U(t) = X(t) + Y(t)$$ and $$V(t) = X(t) - Y(t)$$, given that X(t) and Y(t) have zero mean and are statistically independent.

56. (56)

Determine the probability density function (pdf) for the stochastic process $$X(t) = e^{A t}$$, in which A is a uniformly distributed r.v. over the interval $$[-1,1]$$. Analyze whether the process is stationary and compute its autocorrelation. Sketch a few typical representations of the process X(t), for varying A, as well as the resulting pdf.

57. (57)
A time series X(t) is used to predict $$X(t + \tau )$$. Calculate the correlation between the current value and the predicted value, given that
1. (a)

The predictor uses only the current value of the series, X(t).

2. (b)

The predictor uses the current value X(t) and its derivative, $$X^{\prime }(t)$$.

58. (58)
Find the correlation between the processes V(t) and W(t), in which
$$V(t) = X \cos \omega _o t - Y \sin \omega _o t,$$
and
$$W(t) = Y \cos \omega _o t + X \sin \omega _o t,$$
in which X and Y are statistically independent r.v.’s with zero mean and variance $$\sigma ^2$$.

59. (59)
Calculate the power spectral density for a signal with autocorrelation given by
$$R(\tau ) = A e^{ - \alpha |\tau | } (1 + \alpha |\tau | + \frac{1}{3} \alpha ^2 \tau ^2 ).$$

60. (60)
Determine the power spectral density of a signal with autocorrelation given by the expression
$$R(\tau ) = A e^{ - \alpha |\tau | } (1 + \alpha |\tau | - 2 \alpha ^2 \tau ^2 + \frac{1}{3} \alpha ^3 |\tau |^3 ).$$

61. (61)

Prove the following properties of narrowband random stationary processes.

1. (a)

$$S_{XY}( \omega ) = S_{YX}(- \omega )$$;

2. (b)

$$\mathrm{Re} [ S_{XY}( \omega ) ]$$ and $$\mathrm{Re} [ S_{YX}( \omega ) ]$$ are both even;

3. (c)

$$\mathrm{Im} [ S_{XY}( \omega ) ]$$ and $$\mathrm{Im} [ S_{YX}( \omega ) ]$$ are both odd;

4. (d)

$$S_{XY}( \omega ) = S_{YX}( \omega ) = 0$$, if X(t) and Y(t) are orthogonal.

62. (62)
Show the following property, for uncorrelated X(t) and Y(t) in a narrowband process,
$$S_{XY}( \omega ) = S_{YX}( \omega ) = 2 \pi E[ X(t) ] E[ Y(t) ] \delta (\omega ).$$

63. (63)
Prove that the RMS bandwidth of a stochastic signal x(t) is given by
$$B_{RMS}^2 = \frac{ -1 }{ R_X(0) } \frac{ d^2 R_X(\tau ) }{ d \tau ^2 }, \ \mathrm{for}\ \tau = 0.$$

64. (64)
For the stochastic processes
$$X(t) = Z(t) \cos (\omega _c t + \theta )$$
and
$$Y(t) = Z(t) \sin (\omega _c t + \theta ),$$
in which $$A, \ \omega _c > 0$$, and $$\theta$$ is a uniformly distributed r.v. over the interval $$[0,2\pi ]$$, and statistically independent of Z(t), show that
$$S_{XY}( \omega ) = \frac{\pi A}{2} E[ Z(t) ] [ \delta ( \omega - \omega _c ) + \delta ( \omega + \omega _c ) ].$$

65. (65)
Calculate the bandwidth
$$B_{N} = \frac{ 1 }{ |H(0)|^2 } \int _{0}^{\infty } | H(\omega ) |^2 d \omega$$
of the noise for a system with the following transfer function:
$$| H(\omega ) |^2 = \frac{ 1 }{ 1 + (\omega / W)^2 }.$$
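The integral defining $$B_N$$ can be evaluated numerically for the given transfer function; since $$\int _0^{\infty } d\omega / (1 + (\omega /W)^2) = W \pi /2$$, that is the expected result. A sketch (illustrative value of W, trapezoidal rule on a long finite interval):

```python
import math

W = 2.0
H2 = lambda w: 1.0 / (1.0 + (w / W) ** 2)   # |H(w)|^2, so |H(0)|^2 = 1

# B_N = (1/|H(0)|^2) * integral of |H(w)|^2 over [0, infinity),
# approximated on a long finite interval with the trapezoidal rule.
n, w_max = 200000, 20000.0
dw = w_max / n
B_N = dw * (0.5 * (H2(0.0) + H2(w_max)) + sum(H2(i * dw) for i in range(1, n)))
print(B_N)  # ~ W * pi / 2
```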

66. (66)

The control system illustrated in Fig. 2.36 has transfer function $$H(\omega )$$.

It is known that the Wiener filter, which minimizes the mean square error, and is called optimum in this sense, has a transfer function given by
$$G(\omega ) = \frac{ S_{X}(\omega ) }{ S_{X}(\omega ) + S_{N}(\omega ) },$$
in which $$S_X(\omega )$$ denotes the power spectral density of the desired signal and $$S_N(\omega )$$ denotes the noise power spectral density. Determine $$H_1(\omega )$$ and $$H_2(\omega )$$ such that this system operates as a Wiener filter.

67. (67)
Calculate the correlation between the input signal and the output signal for the filter with transfer function
$$H(\omega ) = u(\omega + \omega _M) - u(\omega - \omega _M),$$
given that the input has autocorrelation $$R_X(\tau ) = \delta (\tau )$$. Determine the delays for which the input and output are orthogonal. Are the signals uncorrelated at these points? Explain.

68. (68)

Consider the signal $$Y_n = X_n - \alpha X_{n-1}$$, generated by means of white noise $$X_n$$, with zero mean and variance $$\sigma _X^2$$. Compute the mean value, the autocorrelation function, and the power spectral density of the signal $$Y_n$$.

69. (69)
Compute the transfer function
$$H(\omega ) = \frac{ S_{YX}(\omega ) }{ S_{X}(\omega ) }$$
of the optimum filter for estimating Y(t) from $$X(t) = Y(t) + N(t)$$, in which Y(t) and N(t) are zero mean statistically independent processes.

70. (70)

Consider the autoregressive process with moving average $$Y_n + \alpha Y_{n-1} = X_{n} + \beta X_{n-1}$$, built from white noise $$X_n$$, with zero mean and variance $$\sigma _X^2$$. Calculate the mean value, the autocorrelation, and the power spectral density of the signal $$Y_n$$.

71. (71)
When estimating a time series, the following result was obtained by minimizing the mean square error:
$$\theta (t + \tau ) \approx \frac{ R(\tau ) }{R(0)} \theta (t) + \frac{ R^{\prime }(\tau ) }{R^{\prime \prime }(0)} \theta ^{\prime }(t) ,$$
in which $$R(\tau )$$ denotes the autocorrelation of the process $$\theta (t)$$. By simplifying the above expression, show that the increment $$\Delta \theta (t) = \theta (t + \tau ) - \theta (t)$$ can be written as
$$\Delta \theta (t) = \tau \theta ^{\prime } (t) - \frac{ (\tau \omega _M )^2 }{2} \theta (t),$$
in which $$\omega _M$$ denotes the RMS bandwidth of the process. Justify the intermediate steps used in your proof.

72. (72)

Show that the system below, having $$h(t) = \frac{1}{T} [ u(t) - u(t-T) ]$$, works as a meter for the autocorrelation of the signal X(t). Is this measure biased?

73. (73)

Calculate the transfer function of the optimum filter, for estimating Y(t) from $$X(t) = Y(t) + N(t)$$, in which Y(t) and N(t) are zero mean-independent random processes.

74. (74)
The stochastic process $$Z(t) = X(t) X'(t)$$ is built from a Gaussian signal X(t) with zero mean and power spectral density $$S_X(\omega )$$. Calculate the power spectral density and the autocorrelation of Z(t), knowing that for the Gaussian process
$$S_{X^2} (\omega ) = 2 \int _{-\infty }^{\infty } S_X(\omega - \phi ) S_X(\phi ) d \phi .$$
Considering the signal X(t) has a uniform power spectral density $$S_X(\omega ) = S_0$$, between $$-\omega _M$$ and $$\omega _M$$, determine the autocorrelation and the power spectral density of Z(t).

75. (75)
It is desired to design a proportional and derivative (PD) control system to act over a signal with autocorrelation
$$R_X(\tau ) = \frac{1 - \alpha |\tau | }{1 + \alpha |\tau | }, \ |\tau | \le 1, \ \alpha > 0.$$
Determine the optimum estimator, in terms of mean square, and estimate the signal value after an interval of $$\tau = 1/\alpha$$ time units, as a function of the values of X(t) and its derivative $$X'(t)$$.

## References

1. Blake, I. F. (1987). An introduction to applied probability. Malabar, Florida: Robert E. Krieger Publishing Co.
2. Boyer, C. (1974). History of mathematics. São Paulo, Brasil (in Portuguese): Edgard Blucher Publishers Ltd.
3. de Souza, J. C. (1996). The pre-Socratics: Fragments, doxography and comments. São Paulo, Brasil (in Portuguese): Nova Cultural Publishers Ltd.
4. Durant, W. (1996). The history of philosophy. São Paulo, Brasil (in Portuguese): Nova Cultural Publishers Ltd.
5. Halmos, P. R. (1960). Naive set theory. Princeton, USA: D. Van Nostrand Company Inc.
6. James, B. R. (1981). Probability: An intermediate level course. Rio de Janeiro, Brasil (in Portuguese): Institute of Pure and Applied Mathematics - CNPq.
7. Kennedy, R. S. (1969). Fading dispersive communication channels. New York: Wiley Interscience.
8. Lecours, M., Chouinard, J.-Y., Delisle, G. Y., & Roy, J. (1988). Statistical modeling of the received signal envelope in a mobile radio channel. IEEE Transactions on Vehicular Technology, 37(4), 204–212.
9. Leon-Garcia, A. (1989). Probability and random processes for electrical engineering. Reading, Massachusetts: Addison-Wesley Publishing Co.
10. Lipschutz, S. (1968). Set theory. Rio de Janeiro, Brasil (in Portuguese): Ao Livro Técnico S.A. Publishers.
11. Papoulis, A. (1981). Probability, random variables, and stochastic processes. Tokyo: McGraw-Hill.
12. Papoulis, A. (1983). Random modulation: A review. IEEE Transactions on Acoustics, Speech and Signal Processing, 31(1), 96–105.
13. Proakis, J. G. (1990). Digital communications. New York: McGraw-Hill Book Company.
14. Schwartz, M., Bennett, W., & Stein, S. (1966). Communication systems and techniques. New York: McGraw-Hill.
15. Zumpano, A., de Lima, B., & Borges, N. (2004). A Medida do Acaso. Ciência Hoje, 34(201), 76–77.

© Springer Nature Switzerland AG 2020

## Authors and Affiliations

• Marcelo S. Alencar, Institute of Advanced Studies in Communications, Federal University of Bahia, Salvador, Brazil
• Valdemar C. da Rocha Jr., Institute of Advanced Studies in Communications, Federal University of Pernambuco, Recife, Brazil