# Principles of Nonlinear MIMO Receivers

Joachim Speidel
Chapter, part of the Signals and Communication Technology book series (SCT)


## 19.1 Maximum Likelihood MIMO Receiver

Principle

As we have seen in the previous chapter, a linear receiver tries to reduce the impact of inter-channel interference and, in part, of the noise in the receive signal $$\mathbf {y}(k)$$ in Fig. . Next, the signal is subject to a decision, also called detection, to recover the QAM symbols in each component $$y_{i}(k)$$. Various decision strategies are known from communications theory and are outlined in Part I. In this section, we consider a Maximum Likelihood (ML) detector as the receiver. In contrast to the linear receiver, the signal $$\hat{\mathbf {s}}(k)$$ is estimated directly from the receive vector
\begin{aligned} \mathbf {r}(k)=\left( \begin{array}{cccc} r_{1}(k)&r_{2}(k)&\cdots&r_{N}(k)\end{array}\right) ^{T} \end{aligned}
(19.1)
Hence, a receive matrix $$\mathbf {W}$$ is not present. In the following, we drop the discrete-time variable k to simplify the notation. The observed receive vector
\begin{aligned} \mathbf {r}=\mathbf {H}\mathbf {s}+\mathbf {n} \end{aligned}
(19.2)
is corrupted by additive noise $$\mathbf {n}$$, where $$\mathbf {H}\mathbf {s}$$ is the receive signal in case of a noise-free channel. As for the linear receivers, we assume that the channel matrix $$\mathbf {H}$$ is precisely known to the receiver. In a practical system, the entries of $$\mathbf {H}$$ have to be estimated by a separate channel estimator, which is not considered here. The transmit signal vector is given by
\begin{aligned} \mathbf {s}=\left( \begin{array}{cccc} s_{1}&s_{2}&\cdots&s_{M}\end{array}\right) ^{T} \end{aligned}
(19.3)
in which each component $$s_{j}$$ is taken from a finite QAM symbol alphabet $$\mathcal {B}$$, e.g., $$\mathcal {B}$$= $$\left\{ 1,\mathrm {j,}-1,-\mathrm {j}\right\}$$ for 4-ary phase shift keying (4-PSK) or $$\mathcal {B}=\left\{ 1,-1\right\}$$ for 2-PSK. We assume an additive white Gaussian noise (AWGN) vector $$\mathbf {n}=(n_{1}\,n_{2} \ldots n_{N})^{T}$$ with the following properties:
• statistically independent with covariance matrix
\begin{aligned} \mathbf {R}_{nn}=\sigma _{n}^{2}\mathbf {I}_{N} \end{aligned}
(19.4)
• all noise components $$n_{i}$$ possess the same mean power $$\sigma _{n}^{2}$$ and zero mean $${\mathbf {E}}\left[ n_{i}\right] =0\,\,;\,\,i=1,2, \ldots ,N$$.

• the real part $$n_{R,i}$$ and the imaginary part $$n_{I,i}$$ of the noise $$n_{i}=n_{R,i}+\mathrm {j}n_{I,i}$$ are statistically independent, have the same mean power $$\frac{\sigma _{n}^{2}}{2}$$, and the same Gaussian probability density function
\begin{aligned} p_{x}(x)=\frac{1}{\sqrt{2\pi }\sigma _{x}}\mathrm {e}^{-\frac{x^{2}}{2\sigma _{x}^{2}}}\,\,;\,\,\sigma _{x}^{2}=\frac{\sigma _{n}^{2}}{2} \end{aligned}
(19.5)
where x stands for $$n_{R,i}$$ and $$n_{I,i}$$, $$i=1,2, \ldots ,N$$. Consequently, the density function of the noise $$n_{i}$$ is given by the product
\begin{aligned} p_{n_{i}}(n_{i})=p_{n_{R,i}}(n_{R,i})p_{n_{I,i}}(n_{I,i})=\frac{1}{\pi \sigma _{n}^{2}}\mathrm {e}^{-\frac{\left| n_{i}\right| ^{2}}{\sigma _{n}^{2}}}\,\,;\,\,i=1,2, \ldots ,N \end{aligned}
(19.6)
• the multivariate probability density function of the noise vector $$\mathbf {n}$$ then follows as the product
\begin{aligned} p_{\mathbf {n}}\left( n_{1},n_{2}, \ldots ,n_{N}\right) =\left( \frac{1}{\pi \sigma _{n}^{2}}\right) ^{N}\mathrm {e}^{-\frac{\left| n_{1}\right| ^{2}+\left| n_{2}\right| ^{2}+\cdots +\left| n_{N}\right| ^{2}}{\sigma _{n}^{2}}} \end{aligned}
(19.7)
or with shorthand notation
\begin{aligned} p_{\mathbf {n}}\left( \mathbf {n}\right) =\left( \frac{1}{\pi \sigma _{n}^{2}}\right) ^{N}\mathrm {e}^{-\frac{\left\| \mathbf {n}\right\| ^{2}}{\sigma _{n}^{2}}} \end{aligned}
(19.8)
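The noise model above can be checked numerically. The following Monte-Carlo sketch (the values of N, $$\sigma _{n}^{2}$$, and the sample count are our own choices for illustration) draws complex AWGN samples and verifies that each component carries the total mean power $$\sigma _{n}^{2}$$, split equally between independent real and imaginary parts:

```python
import numpy as np

# Monte-Carlo sanity check of the AWGN model (19.4)-(19.8): each complex
# noise component n_i has total mean power sigma_n^2, split equally
# between statistically independent real and imaginary parts.
# (Illustrative sketch; N, sigma_n^2 and the sample count are our choices.)
rng = np.random.default_rng(0)
N, sigma_n2, samples = 4, 2.0, 200_000
n = (rng.normal(0.0, np.sqrt(sigma_n2 / 2), (samples, N))
     + 1j * rng.normal(0.0, np.sqrt(sigma_n2 / 2), (samples, N)))
print(np.mean(np.abs(n) ** 2))   # close to sigma_n2
print(np.mean(n.real ** 2))      # close to sigma_n2 / 2
```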
For the decision process, we first define the following conditional probability density function:
\begin{aligned} p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right) \end{aligned}
(19.9)
which is also called the likelihood probability density function. It can be interpreted as the density function of $$\mathbf {r}$$ under the condition that $$\mathbf {s}$$ was sent, with $$\mathbf {H}$$ known. Please note that $$p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right)$$ describes a finite set of probability density functions generated by all possible transmit vectors $$\mathbf {s}\,\epsilon \,\mathcal {A}$$, where $$\mathcal {A}$$ is the set of all possible transmit vectors. Given that each of the M components of $$\mathbf {s}$$ can take on $$L_{Q}$$ different QAM symbol values, $$\mathcal {A}$$ contains $$L_{Q}^{M}$$ different vectors $$\mathbf {s}_{m}$$, $$m=1,2, \ldots ,L_{Q}^{M}$$. The maximum likelihood detector selects, out of all possible $$\mathbf {H}\mathbf {s}$$, that estimate $$\hat{\mathbf {s}}$$ which is most likely given the receive vector $${\mathbf {r}}$$, i.e., which has the largest $$p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right)$$. Hence, the detection criterion is
\begin{aligned} p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right) =\max _{\mathbf {s}\,\epsilon \,\mathcal {A}} \end{aligned}
(19.10)
from which the optimal estimate
\begin{aligned} \hat{\mathbf {s}}=\arg \max _{\mathbf {s}\,\epsilon \,\mathcal {A}}\,p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right) \end{aligned}
(19.11)
results. As is well known from communications theory, if the transmit vectors $$\mathbf {s}\,\epsilon \,\mathcal {A}$$ are uniformly distributed, then $$\hat{\mathbf {s}}$$ also maximizes the a-posteriori probability and thus minimizes the symbol error probability. With (19.2) and (19.8), we obtain from (19.9)
\begin{aligned} p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right) =p_{\mathbf {n}}\left( \mathbf {r}-\mathbf {H}\mathbf {s}\right) =\left( \frac{1}{\pi \sigma _{n}^{2}}\right) ^{N}\mathrm {e}^{-\frac{\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}}{\sigma _{n}^{2}}} \end{aligned}
(19.12)
The argument of the exponential function is always negative. Consequently, the maximal $$p_{L}\left( \mathbf {r\mid }\mathbf {H}\mathbf {s}\right)$$ must fulfill the condition
\begin{aligned} \left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}=\min _{\mathbf {s}\,\epsilon \,\mathcal {A}} \end{aligned}
(19.13)
and the solution formally is
\begin{aligned} \hat{\mathbf {s}}=\arg \min _{\mathbf {s}\,\epsilon \,\mathcal {A}}\,\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2} \end{aligned}
(19.14)
Obviously, the statistical detection problem (19.10) translates into the minimization of the Euclidean distance between two vectors, namely, the receive vector $$\mathbf {r}$$ and the vector $$\mathbf {H}\mathbf {s}$$, which is the transmit signal $$\mathbf {s}$$ after passing through the known channel $$\mathbf {H}$$. Hence, a maximum likelihood detector can be implemented as an algorithm which calculates the squared error according to (19.13) for all possible transmit signal vectors $$\mathbf {s\,}\epsilon \,\mathcal {A}$$ and selects that $$\mathbf {s}=\varvec{\hat{\mathrm{s}}}$$ which yields the minimal squared error. Of course, the receiver has to know the transmit vector alphabet $$\mathcal {A}$$, which is quite normal for the design of a digital communications system.

Just a few words about the computational complexity. As already mentioned, if the transmitter is equipped with M antennas and each antenna output signal can take on $$L_{Q}$$ different values, then there are $$L_{Q}^{M}$$ different vectors $$\mathbf {s}$$ for which the detector has to evaluate (19.13). We conclude that the number of operations in the maximum likelihood detector grows exponentially with the number M of transmit antennas.

Example 1

As a simple example, we take a MIMO transmitter with $$M=2$$ antennas. The modulation scheme shall be 2-PSK with the symbol alphabet $$\mathbf {\mathcal {B}}=\{1,-1\}$$. Consequently, $$L_{Q}=2$$ and each component of $$\mathbf {s}$$ can take on one value out of $$\mathbf {\mathcal {B}}$$ at time instant k. The channel matrix shall be given as
$$\mathbf {H}=\left( \begin{array}{cc} 1 &{} \,\,0.5\\ 0 &{} \,1\\ 1 &{} \,1 \end{array}\right)$$
and the noisy receive vector is observed as $$\mathbf {r}=\left( \begin{array}{ccc} 1.1&-1.1&0.9\end{array}\right) ^{T}$$. The receiver knows the set $$\mathcal {A}$$ of all $$L_{Q}^{M}=4$$ different transmit vectors:
\begin{aligned} \mathcal {A}=\left\{ \left( \begin{array}{c} 1\\ 1 \end{array}\right) ,\left( \begin{array}{c} \,\,\,1\\ -1 \end{array}\right) ,\left( \begin{array}{c} -1\\ \,\,\,1 \end{array}\right) ,\left( \begin{array}{c} -1\\ -1 \end{array}\right) \right\} \end{aligned}
(19.15)
Then, the maximum likelihood receiver calculates all vectors $$\mathbf {H}\mathbf {s}$$ and $$\mathbf {r}-\mathbf {H}\mathbf {s}$$ as well as the squared error $$\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}$$ in Table 19.1. Finally, the minimal $$\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}$$, which is 1.18 in our example, is selected and the detector concludes that most likely
$$\varvec{\hat{\mathrm{s}}}=\left( \begin{array}{c} \,\,\,1\\ -1 \end{array}\right)$$
was sent.
Table 19.1 Example: calculation steps for maximum likelihood detection

| $$\mathbf {s}^{T}$$ | $$(1,\,1)$$ | $$(1,\,-1)$$ | $$(-1,\,1)$$ | $$(-1,\,-1)$$ |
|---|---|---|---|---|
| $$(\mathbf {H}\mathbf {s})^{T}$$ | $$(1.5,\,1.0,\,2.0)$$ | $$(0.5,\,-1.0,\,0)$$ | $$(-0.5,\,1.0,\,0)$$ | $$(-1.5,\,-1.0,\,-2.0)$$ |
| $$(\mathbf {r}-\mathbf {H}\mathbf {s})^{T}$$ | $$(-0.4,\,-2.1,\,-1.1)$$ | $$(0.6,\,-0.1,\,0.9)$$ | $$(1.6,\,-2.1,\,0.9)$$ | $$(2.6,\,-0.1,\,2.9)$$ |
| $$\left\Vert \mathbf {r}-\mathbf {H}\mathbf {s}\right\Vert ^{2}$$ | 5.78 | 1.18 | 7.78 | 15.18 |

(The vectors are listed transposed to fit the table rows.)
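The exhaustive search of Table 19.1 can be reproduced with a few lines of code. The following Python/NumPy sketch (the function name and structure are ours, not the book's) implements the brute-force minimization (19.14) over all candidate transmit vectors:

```python
import numpy as np
from itertools import product

def ml_detect(H, r, alphabet):
    """Brute-force ML detection (19.14): evaluate ||r - H s||^2 for every
    s in A = alphabet^M and return the minimizer and its squared error.
    (Real-valued example; complex QAM would use abs(.)**2.)"""
    M = H.shape[1]
    best_s, best_err = None, np.inf
    for cand in product(alphabet, repeat=M):
        s = np.array(cand)
        err = float(np.sum((r - H @ s) ** 2))
        if err < best_err:
            best_s, best_err = s, err
    return best_s, best_err

# Data of Example 1 (M = 2 transmit antennas, 2-PSK)
H = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [1.0, 1.0]])
r = np.array([1.1, -1.1, 0.9])

s_hat, err = ml_detect(H, r, alphabet=(1, -1))
print(s_hat, round(err, 2))   # minimal squared error 1.18 at s = (1, -1)
```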

## 19.2 Receiver with Ordered Successive Interference Cancellation

Prerequisites

We are now coming back to the transmission system depicted in Fig.  and are going to combine the linear receiver with the decision device. Our target is to successively detect the transmit signal components $$s_{j}(k),\,\,j=1,2, \ldots ,M$$ of the transmit vector $$\mathbf {s}(k)$$. Again, to simplify the notation, we drop the discrete-time variable k. The starting point of our considerations is the linear receiver. The receive signal is given by
\begin{aligned} \mathbf {r}=\mathbf {H}\mathbf {s}+\mathbf {n} \end{aligned}
(19.16)
with the channel matrix
\begin{aligned} \mathbf {H}=\left( \begin{array}{cccc} \mathbf {h}_{1}&\mathbf {h}_{2}&\mathbf {\cdots }&\mathbf {h}_{M}\end{array}\right) \end{aligned}
(19.17)
in which $$\mathbf {h}_{j}\,\epsilon \,\mathbb {C}^{N\mathrm {x}1},\,\,j=1, \ldots ,M$$ are the column vectors. The receiver matrix
\begin{aligned} \mathbf {W}=\left( \begin{array}{c} \mathbf {w}_{1}\\ \mathbf {w}_{2}\\ \vdots \\ \mathbf {w}_{M} \end{array}\right) \end{aligned}
(19.18)
is structured by its row vectors $$\mathbf {w}_{i}\,\epsilon \,\mathbb {C}^{1\mathrm {x}N},\,\,i=1, \ldots ,M$$ and can be calculated as the pseudo-inverse or the MMSE receive matrix of the channel matrix. Finally, we get the output of the receiver filter
\begin{aligned} \mathbf {y}=\left( \begin{array}{c} y_{1}\\ y_{2}\\ \vdots \\ y_{M} \end{array}\right) \end{aligned}
(19.19)
by multiplication
\begin{aligned} \mathbf {y=\mathbf {W}\mathbf {r}} \end{aligned}
(19.20)
Hence, the output signal component $$y_{i}$$ is obtained as
\begin{aligned} y_{i}=\mathbf {w}_{i}\mathbf {r}\,\,;\,\,i=1,2, \ldots ,M \end{aligned}
(19.21)
According to Fig. , a decision device follows and we characterize the input–output relation by the decision function $$q(\ldots )$$ yielding
\begin{aligned} \hat{s}_{i}=q(y_{i})\,\,;\,\,i=1,2, \ldots ,M \end{aligned}
(19.22)
The decision device can be a simple threshold detector but also a more sophisticated maximum likelihood detector. If the receiver applies the receive matrix $$\mathbf {W}$$, the system of equations is solved for all $$y_{1},y_{2},\ldots ,y_{M}$$ in one step, and the decided signal components $$\hat{s}_{1},\hat{s}_{2}, \ldots ,\hat{s}_{M}$$ are obtained in parallel. Now we are going to discuss a method in which the system of linear equations (19.20) is solved successively in several steps, where in each step the decision operation (19.22) is applied.
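As a small illustration of (19.20)–(19.22), the following sketch applies the zero-forcing (pseudo-inverse) receive matrix and a simple sign-threshold decision for 2-PSK; the channel and receive vector are reused from Example 1 as an assumption:

```python
import numpy as np

# Linear receiver (19.20) followed by a threshold decision (19.22),
# sketched for real-valued 2-PSK. H and r reuse Example 1; the decision
# q(.) maps each soft value to the nearest symbol of B = {1, -1}.
H = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [1.0, 1.0]])
r = np.array([1.1, -1.1, 0.9])

W = np.linalg.pinv(H)              # zero-forcing receive matrix
y = W @ r                          # soft outputs y_i = w_i r  (19.21)
s_hat = np.where(y >= 0, 1, -1)    # decision q(y_i) for B = {1, -1}
print(y, s_hat)                    # y approx. [1.767, -1.022], s_hat = [1, -1]
```

Note that the hard decision here agrees with the ML result of Example 1.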

Ordered Successive Interference Cancellation

As outlined before, we are looking for an algorithm, with which the $$\hat{s}_{i}=q(y_{i}),\,\,i=1,2, \ldots ,M$$ are calculated one after the other rather than in one step. The algorithm is called ordered successive interference cancellation (OSIC). In principle, the operations (19.20) and (19.22) are merged. With (19.17), we obtain from (19.16)
\begin{aligned} \left( \begin{array}{ccccc} \mathbf {h}_{1}&\cdots&\mathbf {h}_{\nu }&\cdots&\mathbf {h}_{M}\end{array}\right) \left( \begin{array}{c} s_{1}\\ \vdots \\ s_{\nu }\\ \vdots \\ s_{M} \end{array}\right) +\mathbf {n}=\mathbf {r} \end{aligned}
(19.23)
which is equivalent to
\begin{aligned} \mathbf {h}_{1}s_{1}+\cdots \mathbf {+h}_{\nu -1}s_{\nu -1}\mathbf {+h}_{\nu +1}s_{\nu +1}+\cdots +\mathbf {h}_{M}s_{M}+\mathbf {n}=\mathbf {r}-\mathbf {h}_{\nu }s_{{{\nu }}} \end{aligned}
(19.24)
and is the key equation, in which we have moved $$\mathbf {h}_{\nu }s_{{{\nu }}}$$ to the right-hand side. The idea is first to find a solution for one component $$s_{\nu }$$, e.g., $$s_{1}$$, and then to reduce the dimension of the system of linear equations by one. In the following, $$\nu$$ also serves as the index of the iteration step.
The algorithm is best explained with $$M=3$$ as an example. Let $$\mathbf {H}=\left( \begin{array}{ccc} \mathbf {h}_{1}&\mathbf {h}_{2}&\mathbf {h}_{3}\end{array}\right)$$, $$\mathbf {r}$$, and the decision rule $$q(\ldots )$$ be given. In the course of the iterations, the matrix $$\mathbf {H}$$ will change and therefore it will be indicated as $$\mathbf {H}^{(\nu )}$$. In each step, only the first row vector $$\mathbf {w}_{1}^{(\nu )}$$ of the receive matrix
\begin{aligned} \mathbf {W}^{(\nu )}=\left( \begin{array}{c} \mathbf {w}_{1}^{(\nu )}\\ \mathbf {w}_{2}^{(\nu )}\\ \mathbf {w}_{3}^{(\nu )} \end{array}\right) \end{aligned}
(19.25)
has to be calculated, either from the pseudo-inverse or from the MMSE matrix with respect to $$\mathbf {H}^{(\nu )}$$. The iteration steps are as follows:

1. Initialize $$\nu =1$$, $$\mathbf {H}^{(1)}=\mathbf {H}$$, and $$\mathbf {r}^{(1)}=\mathbf {r}$$.
2. Compute the first row vector $$\mathbf {w}_{1}^{(\nu )}$$ of $$\mathbf {W}^{(\nu )}$$ and form $$y_{\nu }=\mathbf {w}_{1}^{(\nu )}\mathbf {r}^{(\nu )}$$.
3. Decide $$\hat{s}_{\nu }=q(y_{\nu })$$.
4. Cancel the detected contribution according to (19.24), $$\mathbf {r}^{(\nu +1)}=\mathbf {r}^{(\nu )}-\mathbf {h}_{\nu }\hat{s}_{\nu }$$, and remove the column $$\mathbf {h}_{\nu }$$ from $$\mathbf {H}^{(\nu )}$$ to obtain the deflated matrix $$\mathbf {H}^{(\nu +1)}$$.
5. Increment $$\nu$$ and repeat from step 2 until all M components of $$\mathbf {s}$$ are detected.
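The deflation loop just described can be sketched in code. In the following Python/NumPy illustration, the nulling vectors come from the zero-forcing pseudo-inverse; the ordering rule (detect the stream with the smallest nulling-vector norm, i.e., the best post-detection SNR, first) is the common V-BLAST choice and an assumption on our part, as are the variable names:

```python
import numpy as np

def osic_zf(H, r, alphabet=(1, -1)):
    """Ordered successive interference cancellation with ZF nulling.
    Per step: form the pseudo-inverse of the remaining channel, detect
    the stream with the smallest nulling-vector norm, subtract its
    contribution from r, and deflate H by that column.
    (Real-valued sketch; complex channels would use abs(W)**2.)"""
    H = H.astype(float)
    r = r.astype(float).copy()
    M = H.shape[1]
    remaining = list(range(M))
    s_hat = np.zeros(M)
    for _ in range(M):
        W = np.linalg.pinv(H[:, remaining])
        i = int(np.argmin(np.sum(W ** 2, axis=1)))    # ordering rule
        y = W[i] @ r                                  # nulling (19.21)
        s = min(alphabet, key=lambda a: abs(y - a))   # decision q(.)
        j = remaining.pop(i)
        s_hat[j] = s
        r = r - H[:, j] * s                           # cancellation (19.24)
    return s_hat

H = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [1.0, 1.0]])
r = np.array([1.1, -1.1, 0.9])
print(osic_zf(H, r))   # detects s = (1, -1), matching Example 1
```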

The advantage of this algorithm is its low computational complexity and the feature that it reduces inter-channel interference in every decision step. However, decision errors, which may occur at low signal-to-noise ratios, are critical, because they can impact the next decision and thus may cause error propagation in the following steps. We notice that the algorithm is in principle based on the triangulation of a matrix into a lower or an upper triangular matrix, also called LU decomposition [1], which is applied repeatedly from one step to the next. This is in principle a linear operation. However, the described OSIC algorithm becomes nonlinear owing to the decision made in each step. The algorithm has been used in practice in several systems, such as the layered space-time architecture.

## 19.3 Comparison of Different Receivers

The design criteria for linear and nonlinear receivers have quite some similarities which are now going to be discussed for the zero-forcing (ZF), the minimum mean squared error (MMSE), and the maximum likelihood (ML) receiver. Table 19.2 shows a survey of the different design criteria.
Table 19.2 Comparison of design criteria for various receivers

| Receiver | Target function | Noise | Result | Output | Method |
|---|---|---|---|---|---|
| Zero-Forcing (ZF) | $$\left\Vert \mathbf {r}-\mathbf {H}\mathbf {s}\right\Vert ^{2}=\min _{\mathbf {s}\,\epsilon \,\mathbb {C}^{M\mathrm {x}1}}$$ | not included | matrix $$\mathbf {W}$$ | $$\mathbf {y}=\mathbf {W}\mathbf {r}$$ | linear |
| MMSE | $${\mathbf {E}}\left[ \left\Vert \mathbf {W}\mathbf {r}-\mathbf {s}\right\Vert ^{2}\right] =\min _{\mathbf {W}}$$ | included | matrix $$\mathbf {W}$$ | $$\mathbf {y}=\mathbf {W}\mathbf {r}$$ | linear |
| Maximum Likelihood | $$\left\Vert \mathbf {r}-\mathbf {H}\mathbf {s}\right\Vert ^{2}=\min _{\mathbf {s}\,\epsilon \,\mathcal {A}}$$ | included | symbol $$\varvec{\hat{\mathrm{s}}}$$ | $$\varvec{\hat{\mathrm{s}}}\,\epsilon \,\mathcal {A}$$ | nonlinear |
| OSIC ZF | as ZF | not included | symbol $$\varvec{\hat{\mathrm{s}}}$$ | $$\varvec{\hat{\mathrm{s}}}\,\epsilon \,\mathcal {A}$$ | nonlinear |
| OSIC MMSE | as MMSE | included | symbol $$\varvec{\hat{\mathrm{s}}}$$ | $$\varvec{\hat{\mathrm{s}}}\,\epsilon \,\mathcal {A}$$ | nonlinear |

The design of the zero-forcing receiver, with and without OSIC, does not take the noise at the receiver into account. Computation of the receiver matrix $$\mathbf {W}$$ for the MMSE receiver requires knowledge of the signal-to-noise ratio $$\frac{1}{\alpha }$$, which is not needed for maximum likelihood detection. This method operates without any receiver matrix. Moreover, at first glance the target functions of the zero-forcing algorithm using the pseudo-inverse matrix and of the maximum likelihood receiver look the same. Both receivers minimize the squared error $$\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}$$. However, the zero-forcing receiver provides a “soft” output signal $$\mathbf {y\,\epsilon \,\mathbb {C}}^{M\mathrm {x}1}$$ with continuous amplitude and phase, whereas the output of the maximum likelihood receiver is a discrete vector $$\varvec{\hat{\mathrm{s}}}\,\epsilon \,\mathcal {A}$$. Hence, the maximum likelihood scheme minimizes the same target function as the zero-forcing receiver, however as a discrete minimization problem with the constraint $$\varvec{\hat{\mathrm{s}}}\,\epsilon \,\mathcal {A}$$. This can be formulated as an integer least squares problem, for which several mathematical algorithms from the area of integer programming are known [2, 3]. Such methods have been used for lattice decoding and are summarized as sphere decoding algorithms, because they search in a limited hypersphere of the complex vector space rather than performing a brute-force search over all candidates [4, 5, 6] and thus do not always provide the global optimum. In principle, the hypersphere is centered around the receive vector $$\mathbf {r}$$, and for an efficient search the sphere should cover the lattice points given by the vectors $$\mathbf {H}\mathbf {s}\,\,;\,\,\mathbf {s}\,\epsilon \,\mathbf {\mathcal {A}}$$ located in the vicinity of $$\mathbf {r}$$.
As a result, the complexity of the maximum likelihood algorithm can be reduced significantly, and sphere decoding has become an important alternative to the much simpler but suboptimal linear receivers.
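To make the sphere-search idea concrete, here is a hedged sketch of a depth-first sphere decoder based on a QR decomposition of the channel matrix (all names are ours). Since $$\left\| \mathbf {r}-\mathbf {H}\mathbf {s}\right\| ^{2}$$ differs from $$\left\| \mathbf {z}-\mathbf {R}\mathbf {s}\right\| ^{2}$$ with $$\mathbf {z}=\mathbf {Q}^{T}\mathbf {r}$$ only by a constant, partial distances can be accumulated layer by layer and hopeless branches pruned. This variant starts with an infinite radius and shrinks it, so it still returns the exact ML solution; practical decoders fix an initial radius and may therefore miss it:

```python
import numpy as np

def sphere_decode(H, r, alphabet=(1, -1)):
    """Depth-first sphere decoder sketch (real-valued): after H = Q R,
    symbols are fixed from the last layer upward; any branch whose partial
    distance already exceeds the best metric found so far is pruned."""
    M = H.shape[1]
    Q, R = np.linalg.qr(H)      # reduced QR, R is M x M upper triangular
    z = Q.T @ r
    best = {"s": None, "d": np.inf}

    def search(level, s, dist):
        if dist >= best["d"]:
            return               # outside the current sphere: prune
        if level < 0:
            best["s"], best["d"] = s.copy(), dist
            return               # leaf reached: shrink the radius
        for a in alphabet:
            s[level] = a
            resid = z[level] - R[level, level:] @ s[level:]
            search(level - 1, s, dist + resid ** 2)

    search(M - 1, np.zeros(M), 0.0)
    return best["s"]

H = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [1.0, 1.0]])
r = np.array([1.1, -1.1, 0.9])
print(sphere_decode(H, r))   # agrees with the ML result of Example 1
```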

On the other hand, the complexity of the zero-forcing and the minimum mean squared error (MMSE) receiver can also be reduced considerably by introducing ordered successive interference cancellation (OSIC), because only parts of an inverse matrix have to be calculated rather than a full inverse or pseudo-inverse matrix. However, it should be kept in mind that in the case of ill-conditioned matrices the calculation of the inverse may turn out to be numerically unstable.

Figure 19.1 shows a rough comparison of the symbol error rates of the various receivers, obtained by computer simulation using the platform “webdemo” [7]. As expected, the maximum likelihood (ML) detector shows the best performance, followed by the nonlinear receiver with OSIC. Compared to the zero-forcing (ZF) receiver, the minimum mean squared error (MMSE) approach takes the noise into account and thus outperforms the ZF receiver in general.

## References

1. R.A. Horn, C.R. Johnson, Matrix Analysis (Cambridge University Press, Cambridge, 2013)
2. R. Kannan, Improved algorithms on integer programming and related lattice problems, in Proceedings of the ACM Symposium on Theory of Computing (1983)
3. U. Fincke, M. Pohst, Improved methods for calculating vectors of short length in a lattice, including a complexity analysis. Math. Comput. 44 (1985)
4. E. Viterbo, J. Boutros, A universal lattice code decoder for fading channels. IEEE Trans. Inf. Theory 45 (1999)
5. O. Damen, A. Chkeif, J. Belfiore, Lattice code decoder for space-time codes. IEEE Commun. Lett. 4 (2000)
6. T. Kailath, H. Vikalo, B. Hassibi, MIMO receive algorithms, in Space-Time Wireless Systems: From Array Processing to MIMO Communications, ed. by H. Boelcskei, D. Gesbert, C.B. Papadias, A.-J. van der Veen (Cambridge University Press, Cambridge, 2008)
7. N. Zhao, MIMO detection algorithms, webdemo, Technical report, Institute of Telecommunications, University of Stuttgart, Germany (2018), http://webdemo.inue.uni-stuttgart.de