1 Introduction

Quaternion calculus has been introduced in signal processing with application areas involving three- or four-dimensional signals, such as color image processing (Pei and Cheng, 1999; Sangwine and Ell, 2000; Parfieniuk and Petrovsky, 2010; Ell et al., 2014; Liu et al., 2014), vector-sensor array systems (Le Bihan and Mars, 2004; Miron et al., 2006; Le Bihan et al., 2007; Tao, 2013; Tao and Chang, 2014; Zhang et al., 2014; Hawes and Liu, 2015), three-phase power systems (Talebi and Mandic, 2015), quaternion-valued wireless communications (Liu, 2014), and wind profile prediction (Jiang et al., 2014b). Several quaternion-valued adaptive filtering algorithms have been proposed in Barthelemy et al. (2014), Jiang et al. (2014c), Talebi et al. (2014), Tao and Chang (2014), and Zhang et al. (2014). Notwithstanding the advantages of the quaternionic algorithms, extra care has to be taken in their developments, in particular when the derivatives of quaternion-valued functions are involved, due to the fact that quaternion algebra is non-commutative. A so-called HR gradient operator was proposed in Mandic et al. (2011) and the interesting formulation appears to provide a general and flexible framework that could potentially have wide applications. However, it has been applied only to real-valued functions and linear quaternion-valued functions. To consider more general quaternion-valued functions, we propose a pair of restricted HR gradient operators, the left and the right restricted HR gradient operators, based on the previous work on the HR gradient operator (Mandic et al., 2011) and our recent work (Jiang et al., 2014b).

To summarize, we make the following main contributions. First, we give a detailed derivation of the relation between the gradients and the increment of a quaternion function, highlighting the difference between the left and the right gradients due to the non-commutativity of quaternion algebra. Second, we document several properties of the operators that have not been reported before, in particular several different versions of product rules and chain rules. Third, we derive a general formula for the restricted HR derivatives of a wide class of regular quaternion-valued nonlinear functions, among which are the exponential, logarithmic, and the hyperbolic tangent functions. Finally, we prove that the restricted HR gradients are consistent with the usual definition for the gradient of a real function of a real variable. The application of the operators to the derivation of a quaternion-valued least mean squares (QLMS) adaptive algorithm and a nonlinear adaptive algorithm based on the hyperbolic tangent function is also briefly discussed. As an example of quaternion-valued signal processing, we consider the reference signal based adaptive beamforming problem for vector sensor arrays consisting of multiple crossed-dipoles and provide some simulation results.

2 Restricted HR gradient operators

2.1 Introduction to quaternions

Quaternions are a non-commutative extension of complex numbers. A quaternion q is composed of four parts, i.e., \(q = {q_a} + {q_b}{\rm{i}} + {q_c}{\rm{j}} + {q_d}{\rm{k}}\), where \({q_a}\) is the real part, also denoted as R(q). The other three terms constitute the imaginary part I(q), where i, j, and k are the three imaginary units, satisfying the following rules: \({\rm{ij}} = {\rm{k}},\;\;{\rm{jk}} = {\rm{i}},\;\;{\rm{ki}} = {\rm{j}},\;\;{{\rm{i}}^2} = {{\rm{j}}^2} = {{\rm{k}}^2} = - 1\), and \({\rm{ij}} = - {\rm{ji}},{\rm{ki}} = - {\rm{ik}},{\rm{kj}} = - {\rm{jk}}\). As a result, the product of two quaternions p and q depends in general on the order, i.e., \(qp \neq pq\); however, when one of the factors is real, we have qp = pq.

Let \(v = \vert {\rm{I}}(q)\vert \;{\rm{and}}\;{\bf{\hat v}} = {\rm{I}}(q)/v\). The quaternion q can also be written as \(q = {q_a} + v{\bf{\hat v}}\). Here, \({\bf{\hat v}}\) is a pure unit quaternion, which has the convenient property \({{\bf{\hat v}}^2}: = {\bf{\hat v\hat v}} = - 1\). The quaternionic conjugate of q is \({q^\ast} = {q_a} - {q_b}{\rm{i}} - {q_c}{\rm{j}} - {q_d}{\rm{k}}\), or \({q^\ast} = {q_a} - v{\bf{\hat v}}\). It is easy to show that \(q{q^\ast} = {q^\ast}q = \vert q{\vert ^2}\), and hence \({q^{ - 1}} = {q^\ast}/|q{\vert ^2}\).
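To make these rules concrete, the following minimal sketch (ours, not part of the paper's toolchain) implements the Hamilton product, conjugation, and inversion, with a quaternion stored as a NumPy array [q_a, q_b, q_c, q_d]; the helper names qmul, qconj, and qinv are our own choices.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions p = [a, b, c, d] and q."""
    pa, pb, pc, pd = p
    qa, qb, qc, qd = q
    return np.array([
        pa*qa - pb*qb - pc*qc - pd*qd,   # real part
        pa*qb + pb*qa + pc*qd - pd*qc,   # i part
        pa*qc - pb*qd + pc*qa + pd*qb,   # j part
        pa*qd + pb*qc - pc*qb + pd*qa])  # k part

def qconj(q):
    """Quaternionic conjugate q* = q_a - q_b i - q_c j - q_d k."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def qinv(q):
    """q^{-1} = q* / |q|^2."""
    return qconj(q) / np.dot(q, q)

p = np.array([1.0, 2.0, -1.0, 0.5])
q = np.array([0.3, -1.0, 2.0, 1.5])
print(qmul(p, q) - qmul(q, p))   # nonzero in general: pq != qp
print(qmul(q, qinv(q)))          # ~ [1, 0, 0, 0]
```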

2.2 Definition of the restricted HR gradient operators

Let f : H → H be a quaternion-valued function of a quaternion q, where H is the non-commutative algebra of quaternions. We use the notation \(f(q) = {f_a} + {f_b}{\rm{i}} + {f_c}{\rm{j}} + {f_d}{\rm{k}}\), where \({f_a},{f_b},{f_c},{f_d}\) are the components of f. Here, f can also be viewed as a function of the four components of q, i.e., \(f = f({q_a},{q_b},{q_c},{q_d})\). In this view f is a quaternion-valued function on ℝ⁴, i.e., f : ℝ⁴ → H. To express the four real components of q, it is convenient to use its involutions qν := −νqν, where ν ∈ {i, j, k} (Ell and Sangwine, 2007). Explicitly, we have

$${q^{\rm{i}}} = - {\rm{i}}q{\rm{i}} = {q_a} + {q_b}{\rm{i}} - {q_c}{\rm{j}} - {q_d}{\rm{k}},$$
((1))
$${q^{\rm{j}}} = - {\rm{j}}q{\rm{j}} = {q_a} - {q_b}{\rm{i}} + {q_c}{\rm{j}} - {q_d}{\rm{k}},$$
((2))
$${q^{\rm{k}}} = - {\rm{k}}q{\rm{k}} = {q_a} - {q_b}{\rm{i}} - {q_c}{\rm{j}} + {q_d}{\rm{k}},$$
((3))
$${q_a} = {1 \over 4}(q + {q^{\rm{i}}} + {q^{\rm{j}}} + {q^{\rm{k}}}),$$
((4))
$${q_b} = {1 \over {4{\rm{i}}}}(q + {q^{\rm{i}}} - {q^{\rm{j}}} - {q^{\rm{k}}}),$$
((5))
$${q_c} = {1 \over {4{\rm{j}}}}(q - {q^{\rm{i}}} + {q^{\rm{j}}} - {q^{\rm{k}}}),$$
((6))
$${q_d} = {1 \over {4{\rm{k}}}}(q - {q^{\rm{i}}} - {q^{\rm{j}}} + {q^{\rm{k}}}){.}$$
((7))

Two useful relations are

$$\left\{ {\begin{array}{*{20}c}{q^\ast = 1/2({q^{\rm{i}}} + {q^{\rm{j}}} + {q^{\rm{k}}} - q),} \\ {q + {q^{\rm{i}}} + {q^{\rm{j}}} + {q^{\rm{k}}} = 4{\rm{R}}(q){.}\;\;} \end{array}} \right.$$
((8))
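As a quick numerical sanity check of Eqs. (1)–(8) (a self-contained sketch under our own conventions, with quaternions stored as [q_a, q_b, q_c, q_d] arrays and the helper names ours):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product with quaternions stored as [a, b, c, d]."""
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

UNITS = {'i': np.array([0., 1, 0, 0]),
         'j': np.array([0., 0, 1, 0]),
         'k': np.array([0., 0, 0, 1])}

def involution(q, nu):
    """q^nu := -nu q nu, Eqs. (1)-(3)."""
    return -qmul(qmul(UNITS[nu], q), UNITS[nu])

q = np.array([1.0, -2.0, 0.5, 3.0])
qi, qj, qk = (involution(q, nu) for nu in 'ijk')

# Eqs. (4)-(7): the four real components recovered from q and its involutions
assert np.isclose((q + qi + qj + qk)[0] / 4, q[0])   # q_a
assert np.isclose((q + qi - qj - qk)[1] / 4, q[1])   # q_b (the combination equals 4*q_b*i)
assert np.isclose((q - qi + qj - qk)[2] / 4, q[2])   # q_c
assert np.isclose((q - qi - qj + qk)[3] / 4, q[3])   # q_d

# Eq. (8): q* = (q^i + q^j + q^k - q)/2 and q + q^i + q^j + q^k = 4 R(q)
q_conj = np.array([q[0], -q[1], -q[2], -q[3]])
assert np.allclose((qi + qj + qk - q) / 2, q_conj)
assert np.allclose(q + qi + qj + qk, np.array([4*q[0], 0, 0, 0]))
print("involution identities verified")
```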

A so-called HR gradient of f(q) was introduced in Mandic et al. (2011), which has been applied to real-valued functions and linear quaternion-valued functions. To find the gradients of more general quaternion-valued functions, we follow a similar approach to propose a ‘restricted’ HR gradient operator (some of the derivation was first presented in Jiang et al. (2014b)). To motivate the definitions, we consider the differential df (q) with respect to differential \({\rm{d}}q: = {\rm{d}}{q_a} + {\rm{d}}{q_b}{\rm{i}} + {\rm{d}}{q_c}{\rm{j}} + {\rm{d}}{q_d}{\rm{k}}\). We observe that \({\rm{d}}f: = {\rm{d}}{f_a} + {\rm{id}}{f_b} + {\rm{jd}}{f_c} + {\rm{kd}}{f_d}\), where

$${\rm{d}}{f_a} = {{\partial {f_a}} \over {\partial {q_a}}}{\rm{d}}{q_a} + {{\partial {f_a}} \over {\partial {q_b}}}{\rm{d}}{q_b} + {{\partial {f_a}} \over {\partial {q_c}}}{\rm{d}}{q_c} + {{\partial {f_a}} \over {\partial {q_d}}}{\rm{d}}{q_d}{.}$$
((9))

We have \({\rm{d}}{q_a} = ({\rm{d}}q + {\rm{d}}{q^{\rm{i}}} + {\rm{d}}{q^{\rm{j}}} + {\rm{d}}{q^{\rm{k}}})/4\) according to Eq. (4). Making use of this and similar expressions for \({\rm{d}}{q_b}\), \({\rm{d}}{q_c}\), and \({\rm{d}}{q_d}\), we find an expression for \({\rm{d}}{f_a}\) in terms of the differentials \({\rm{d}}q\), \({\rm{d}}{q^{\rm{i}}}\), \({\rm{d}}{q^{\rm{j}}}\), and \({\rm{d}}{q^{\rm{k}}}\). Repeating the calculation for \({\rm{id}}{f_b}\), \({\rm{jd}}{f_c}\), and \({\rm{kd}}{f_d}\), we finally arrive at

$${\rm{d}}f = D{\rm{d}}q + {D_{\rm{i}}}{\rm{d}}{q^{\rm{i}}} + {D_{\rm{j}}}{\rm{d}}{q^{\rm{j}}} + {D_{\rm{k}}}{\rm{d}}{q^{\rm{k}}},$$
((10))

where

$$D: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} - {{\partial f} \over {\partial {q_b}}}{\rm{i}} - {{\partial f} \over {\partial {q_c}}}{\rm{j}} - {{\partial f} \over {\partial {q_d}}}{\rm{k}}} \right),$$
((11))
$${D_{\rm{i}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} - {{\partial f} \over {\partial {q_b}}}{\rm{i}} + {{\partial f} \over {\partial {q_c}}}{\rm{j}} + {{\partial f} \over {\partial {q_d}}}{\rm{k}}} \right),$$
((12))
$${D_{\rm{j}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} + {{\partial f} \over {\partial {q_b}}}{\rm{i}} - {{\partial f} \over {\partial {q_c}}}{\rm{j}} + {{\partial f} \over {\partial {q_d}}}{\rm{k}}} \right),$$
((13))
$${D_{\rm{k}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} + {{\partial f} \over {\partial {q_b}}}{\rm{i}} + {{\partial f} \over {\partial {q_c}}}{\rm{j}} - {{\partial f} \over {\partial {q_d}}}{\rm{k}}} \right){.}$$
((14))

More details are given in Appendix A. Thus, one may define the partial derivatives of f(q) as follows:

$${{\partial f} \over {\partial q}}: = D,{{\partial f} \over {\partial {q^{\rm{i}}}}}: = {D_{\rm{i}}},{{\partial f} \over {\partial {q^{\rm{j}}}}}: = {D_{\rm{j}}},{{\partial f} \over {\partial {q^{\rm{k}}}}}: = {D_{\rm{k}}}{.}$$
((15))

Introducing operators

$${\nabla _q}: = (\partial /\partial q,\;\partial /\partial {q^{\rm{i}}},\;\partial /\partial {q^{\rm{j}}},\;\partial /\partial {q^{\rm{k}}})$$

and

$${\nabla _r}: = (\partial /\partial {q_a},\;\partial /\partial {q_b},\;\partial /\partial {q_c},\;\partial /\partial {q_d}),$$

Eqs. (11)–(15) can be written as

$${\nabla _q}f = {\nabla _r}f{J^{\rm{H}}},$$
((16))

where the Jacobian matrix is

$$J = {1 \over 4}\left[ {\begin{array}{*{20}c}1 & {\rm{i}} & {\rm{j}} & {\rm{k}} \\ 1 & {\rm{i}} & { - {\rm{j}}} & { - {\rm{k}}} \\ 1 & { - {\rm{i}}} & {\rm{j}} & { - {\rm{k}}} \\ 1 & { - {\rm{i}}} & { - {\rm{j}}} & {\rm{k}} \end{array}} \right],$$
((17))

and \({J^{\rm{H}}}\) is the Hermitian transpose of J (Mandic et al., 2011). Using \(J{J^{\rm{H}}} = {J^{\rm{H}}}J = {1 \over 4}I\), where I is the identity matrix, we can also write

$${\nabla _q}fJ = {1 \over 4}{\nabla _r}f,$$
((18))

which is the inverse formula for the derivatives.

We call the gradient operator defined by Eq. (16) the restricted HR gradient operator. The operator is closely related to the HR operator introduced in Mandic et al. (2011). However, in the original definition of the HR operator, the Jacobian J appears on the left-hand side of ∇ r f, whereas in our definition it appears on the right (as the Hermitian transpose).

The differential df is related to ∇ q f by

$${\rm{d}}f = {{\partial f} \over {\partial q}}{\rm{d}}q + {{\partial f} \over {\partial {q^{\rm{i}}}}}{\rm{d}}{q^{\rm{i}}} + {{\partial f} \over {\partial {q^{\rm{j}}}}}{\rm{d}}{q^{\rm{j}}} + {{\partial f} \over {\partial {q^{\rm{k}}}}}{\rm{d}}{q^{\rm{k}}}{.}$$
((19))

Due to the non-commutativity of quaternion products, the order of the factors in the products of Eq. (19) (as well as Eqs. (11)–(14)) cannot be swapped. In fact, one may call the above operator the left restricted HR gradient operator. As is shown in Appendix A, one can also define a right restricted HR gradient operator by

$${(\nabla _q^{\rm{R}}f)^{\rm{T}}}: = J^\ast {({\nabla _r}f)^{\rm{T}}},$$
((20))

where

$$\nabla _q^{\rm{R}}: = ({\partial ^{\rm{R}}}/\partial q,\;{\partial ^{\rm{R}}}/\partial {q^{\rm{i}}},\;{\partial ^{\rm{R}}}/\partial {q^{\rm{j}}},\;{\partial ^{\rm{R}}}/\partial {q^{\rm{k}}}),$$

and

$${{{\partial ^{\rm{R}}}f} \over {\partial q}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} - {\rm{i}}{{\partial f} \over {\partial {q_b}}} - {\rm{j}}{{\partial f} \over {\partial {q_c}}} - {\rm{k}}{{\partial f} \over {\partial {q_d}}}} \right),$$
((21))
$${{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{i}}}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} - {\rm{i}}{{\partial f} \over {\partial {q_b}}} + {\rm{j}}{{\partial f} \over {\partial {q_c}}} + {\rm{k}}{{\partial f} \over {\partial {q_d}}}} \right),$$
((22))
$${{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{j}}}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} + {\rm{i}}{{\partial f} \over {\partial {q_b}}} - {\rm{j}}{{\partial f} \over {\partial {q_c}}} + {\rm{k}}{{\partial f} \over {\partial {q_d}}}} \right),$$
((23))
$${{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{k}}}}}: = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} + {\rm{i}}{{\partial f} \over {\partial {q_b}}} + {\rm{j}}{{\partial f} \over {\partial {q_c}}} - {\rm{k}}{{\partial f} \over {\partial {q_d}}}} \right){.}$$
((24))

The right restricted HR gradient operator is related to the differential df by

$${\rm{d}}f = {\rm{d}}q{{{\partial ^{\rm{R}}}f} \over {\partial q}} + {\rm{d}}{q^{\rm{i}}}{{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{i}}}}} + {\rm{d}}{q^{\rm{j}}}{{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{j}}}}} + {\rm{d}}{q^{\rm{k}}}{{{\partial ^{\rm{R}}}f} \over {\partial {q^{\rm{k}}}}}{.}$$
((25))

In general, the left and right restricted HR gradients are not the same. For example, even for the simplest linear function \(f(q) = {q_0}q\), with constant \({q_0}\) ∈ H, we have

$${{\partial {q_0}q} \over {\partial q}} = {q_0},\;{{{\partial ^{\rm{R}}}{q_0}q} \over {\partial q}} = {\rm{R}}({q_0}){.}$$
((26))

However, we will show later that the two gradients coincide for a class of functions. In particular, they are the same for real-valued quaternion functions. The relationship between the gradients and the differential is an important ingredient of gradient-based methods, which we will discuss further later.
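The definitions can also be evaluated numerically. The sketch below (ours; the finite-difference step size and the helper names are assumptions for illustration) approximates the four real partial derivatives by central differences, assembles the left derivative of Eq. (11) and the right derivative of Eq. (21), and reproduces Eq. (26) for f(q) = q0 q as well as ∂q*/∂q = −1/2 from Eq. (27).

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

E = np.eye(4)                      # 1, i, j, k as [a, b, c, d] vectors

def real_partials(f, q, h=1e-6):
    """Central differences df/dq_a, ..., df/dq_d (each a quaternion)."""
    return [(f(q + h*E[m]) - f(q - h*E[m])) / (2*h) for m in range(4)]

def left_hr_grad(f, q):
    """Left restricted HR derivative df/dq, Eq. (11): units multiply on the right."""
    da, db, dc, dd = real_partials(f, q)
    return (da - qmul(db, E[1]) - qmul(dc, E[2]) - qmul(dd, E[3])) / 4

def right_hr_grad(f, q):
    """Right restricted HR derivative, Eq. (21): units multiply on the left."""
    da, db, dc, dd = real_partials(f, q)
    return (da - qmul(E[1], db) - qmul(E[2], dc) - qmul(E[3], dd)) / 4

q0 = np.array([0.5, -1.0, 2.0, 0.3])
qq = np.array([1.0, 2.0, -0.5, 0.7])
f = lambda q: qmul(q0, q)                     # f(q) = q0 q

print(left_hr_grad(f, qq))                    # ~ q0, Eq. (26)
print(right_hr_grad(f, qq))                   # ~ [R(q0), 0, 0, 0], Eq. (26)
print(left_hr_grad(lambda p: np.array([p[0], -p[1], -p[2], -p[3]]), qq))  # ~ [-1/2, 0, 0, 0], Eq. (27)
```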

3 Properties and rules of the operator

We will now focus on the left restricted HR gradient and simply call it the restricted HR gradient unless stated otherwise. It can be easily calculated from the definitions that

$${{\partial q} \over {\partial q}} = 1,\;{{\partial {q^\nu}} \over {\partial q}} = 0,\;{{\partial q^\ast} \over {\partial q}} = - {1 \over 2},$$
((27))

where ν ∈ {i, j, k}. However, to find the derivatives for more complex quaternion functions, it is useful to first establish the rules of the gradient operators. We will see that some of the usual rules do not apply due to the non-commutativity of quaternion products.

  1. 1.

    Left-linearity: For arbitrary constant quaternions α and β, and functions f(q) and g(q), we have

    $${{\partial (\alpha f + \beta g)} \over {\partial {q^\nu}}} = \alpha {{\partial f} \over {\partial {q^\nu}}} + \beta {{\partial g} \over {\partial {q^\nu}}}$$
    ((28))

    for ν ∈ {1, i, j, k}, with \({q^1}: = q\). However, linearity does not hold for right multiplications, i.e., in general

    $${{\partial f\alpha} \over {\partial q}} \neq {{\partial f} \over {\partial q}}\alpha{.}$$
    ((29))

    This is because, according to Eq. (11),

    $${{\partial f\alpha} \over {\partial q}} = {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}}\alpha - {{\partial f} \over {\partial {q_b}}}\alpha {\rm{i}} - {{\partial f} \over {\partial {q_c}}}\alpha {\rm{j}} - {{\partial f} \over {\partial {q_d}}}\alpha {\rm{k}}} \right),$$
    ((30))

    where α is an arbitrary constant quaternion. However, \(\alpha \nu \neq \nu \alpha\) in general. Therefore, it is different from (∂f/∂q)α, which is

    $${1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}} - {{\partial f} \over {\partial {q_b}}}{\rm{i}} - {{\partial f} \over {\partial {q_c}}}{\rm{j}} - {{\partial f} \over {\partial {q_d}}}{\rm{k}}} \right)\alpha{.}$$
    ((31))
  2. 2.

    The first product rule: The following product rule holds:

    $${\nabla _q}(fg) = f{\nabla _q}g + [({\nabla _r}f)g]{J^{\rm{H}}}{.}$$
    ((32))

    For example,

    $${{\partial (fg)} \over {\partial q}} = f{{\partial g} \over {\partial q}} + {1 \over 4}\left( {{{\partial f} \over {\partial {q_a}}}g - {{\partial f} \over {\partial {q_b}}}g{\rm{i}} - {{\partial f} \over {\partial {q_c}}}g{\rm{j}} - {{\partial f} \over {\partial {q_d}}}g{\rm{k}}} \right){.}$$
    ((33))

    Thus, the product rule in general is different from the usual one.

  3. 3.

    The second product rule: However, the usual product rule applies to differentiation with respect to real variables, i.e.,

    $${{\partial (fg)} \over {\partial {q_\phi}}} = {{\partial f} \over {\partial {q_\phi}}}g + f{{\partial g} \over {\partial {q_\phi}}}$$
    ((34))

    for φ = a, b, c, or d.

  4. 4.

    The third product rule: The usual product rule also applies if at least one of the two functions f(q) and g(q) is real-valued, i.e.,

    $${{\partial (fg)} \over {\partial q}} = f{{\partial g} \over {\partial q}} + {{\partial f} \over {\partial q}}g{.}$$
    ((35))
  5. 5.

    The first chain rule: For a composite function \(f(g(q)),\;\;g(q): = {g_a} + {g_b}{\rm{i}} + {g_c}{\rm{j}} + {g_d}{\rm{k}}\) being a quaternion-valued function, we have the following chain rule:

    $${\nabla _q}f = (\nabla _q^gf)M,$$
    ((36))

    where \(\nabla _q^g: = (\partial /\partial g,\;\partial /\partial {g^{\rm{i}}},\;\partial /\partial {g^{\rm{j}}},\;\partial /\partial {g^{\rm{k}}})\) and M is a 4 × 4 matrix with element \({M_{\mu \nu}} = \partial {g^\mu}/\partial {q^\nu}\) for μ, ν ∈ {1, i, j, k} and \({g^\mu} = - \mu g\mu\) (\({g^1}\) is understood to be g). Explicitly, we can write

    $${{\partial f} \over {\partial {q^\nu}}} = \sum\limits_\mu {{{\partial f} \over {\partial {g^\mu}}}} {{\partial {g^\mu}} \over {\partial {q^\nu}}}{.}$$
    ((37))

    The proof is outlined in Appendix C.

  6. 6.

    The second chain rule: The above chain rule uses g and its involutions as the intermediate variables. It is sometimes convenient to use the real components of g for that purpose instead. In this case, the following chain rule may be used:

    $${\nabla _q}f = (\nabla _r^gf)O,$$
    ((38))

    where O is a 4 × 4 matrix with entry \({O_{\phi \nu}} = \partial {g_\phi}/\partial {q^\nu}\) with \(\phi \in \{ a,\;b,\;c,\;d\}\) and \(\nu \in \{ 1,{\rm{i}},{\rm{j}},{\rm{k}}\}\), and \(\nabla _r^g: = (\partial /\partial {g_a},\partial /\partial {g_b},\partial /\partial {g_c},\partial /\partial {g_d})\). Explicitly, we have

    $${{\partial f} \over {\partial {q^\nu}}} = \sum\limits_\phi {{{\partial f} \over {\partial {g_\phi}}}} {{\partial {g_\phi}} \over {\partial {q^\nu}}}{.}$$
    ((39))
  7. 7.

    The third chain rule: If the intermediate function g(q) is real-valued, i.e., \(g = {g_a}\), then from the second chain rule, we obtain

    $${{\partial f} \over {\partial {q^\nu}}} = {{\partial f} \over {\partial g}}{{\partial g} \over {\partial {q^\nu}}}{.}$$
    ((40))
  8. 8.

    f(q) is not independent of \({q^{\rm{i}}}\), \({q^{\rm{j}}}\), or \({q^{\rm{k}}}\) in the sense that, in general,

    $${{\partial f(q)} \over {\partial {q^{\rm{i}}}}} \neq 0,\;\;{{\partial f(q)} \over {\partial {q^{\rm{j}}}}} \neq 0,\;\;{{\partial f(q)} \over {\partial {q^{\rm{k}}}}} \neq 0{.}$$
    ((41))

    This can be illustrated by f(q) = q2. Using the first product rule (Eq. (32)), we have

    $${{\partial {q^2}} \over {\partial {q^{\rm{i}}}}} = q{{\partial q} \over {\partial {q^{\rm{i}}}}} + {1 \over 4}\sum\limits_{(\phi ,\nu )} {{{\partial q} \over {\partial {q_\phi}}}q\nu}$$

    for \((\phi ,\nu ) \in \{ (a,1),(b, - {\rm{i}}),(c,{\rm{j}}),(d,{\rm{k}})\}\), following the signs in Eq. (12). It can then be shown that

    $${{\partial {q^2}} \over {\partial {q^{\rm{i}}}}} = {q_b}{\rm{i}},\;{{\partial {q^2}} \over {\partial {q^{\rm{j}}}}} = {q_c}{\rm{j}},\;{{\partial {q^2}} \over {\partial {q^{\rm{k}}}}} = {q_d}{\rm{k}}{.}$$
    ((42))

    This property demonstrates the intriguing difference between the HR derivative and the usual derivatives, although we can indeed show that

    $${{\partial q} \over {\partial {q^\nu}}} = 0{.}$$
    ((43))

    One implication of this observation is that, for a nonlinear algorithm that involves more than one of the gradients ∂f/∂qν simultaneously, care has to be taken to include all the terms. A numerical check of Eq. (42) is sketched below.
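The following self-contained sketch (our own helper names; the finite-difference step size is an assumption) checks Eq. (42), together with the ordinary derivative ∂q²/∂q, by evaluating Eqs. (11)–(14) with central differences:

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

E = np.eye(4)

def hr_grad_wrt(f, q, nu, h=1e-6):
    """df/dq^nu for nu in {0: q, 1: q^i, 2: q^j, 3: q^k}, Eqs. (11)-(14)."""
    d = [(f(q + h*E[m]) - f(q - h*E[m])) / (2*h) for m in range(4)]
    signs = {0: (-1, -1, -1), 1: (-1, +1, +1),
             2: (+1, -1, +1), 3: (+1, +1, -1)}[nu]
    return (d[0] + sum(s*qmul(d[m+1], E[m+1]) for m, s in enumerate(signs))) / 4

q = np.array([0.8, -1.2, 0.4, 2.0])
fsq = lambda p: qmul(p, p)        # f(q) = q^2

print(hr_grad_wrt(fsq, q, 1))     # ~ q_b i = [0, -1.2, 0, 0], Eq. (42)
print(hr_grad_wrt(fsq, q, 2))     # ~ q_c j = [0, 0, 0.4, 0]
print(hr_grad_wrt(fsq, q, 3))     # ~ q_d k = [0, 0, 0, 2.0]
print(hr_grad_wrt(fsq, q, 0))     # ~ q + q_a, cf. Eq. (44) with n = 2
```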

4 Restricted HR derivatives for a class of regular functions

Using the above operation rules, we can find explicit expressions for the derivatives for a whole range of functions. We first introduce the following lemma:

Lemma 1 The derivative of the power function \(f(q) = {(q - {q_0})^n}\), with integer n and constant quaternion q0, is

$${{\partial f(q)} \over {\partial q}} = {1 \over 2}\left( {n{{\tilde q}^{n - 1}} + {{{{\tilde q}^n} - {{\tilde q}^{\ast n}}} \over {\tilde q - {{\tilde q}^\ast}}}} \right)$$
((44))

with \(\tilde q = q - {q_0}\).

Remark 1 The division in \(({\tilde q^n} - {\tilde q^{\ast n}})/(\tilde q - {\tilde q^\ast})\) is understood as \(({\tilde q^n} - {\tilde q^{\ast n}}){(\tilde q - {\tilde q^\ast})^{ - 1}}\)or \({(\tilde q - {\tilde q^\ast})^{ - 1}}({\tilde q^n} - {\tilde q^{\ast n}})\) which are the same since the two factors commute. The division operations in what follows are understood in the same way.

Proof The lemma is obviously true for n = 0. For n ≥ 1, we apply the first product rule and find

$${{\partial {{(q - {q_0})}^n}} \over {\partial q}} = \tilde q{{\partial {{\tilde q}^{n - 1}}} \over {\partial q}} + {\rm{R}}({\tilde q^{n - 1}}),$$
((45))

where \({\rm{R}}({\tilde q^{n - 1}})\) is the real part of \({\tilde q^{n - 1}}\). We then obtain by induction

$${{\partial {{(q - {q_0})}^n}} \over {\partial q}} = \sum\limits_{m = 0}^{n - 1} {{{\tilde q}^m}} {\rm{R}}({\tilde q^{n - 1 - m}}){.}$$
((46))

Using \({\rm{R}}({\tilde q^{n - 1 - m}}) = ({\tilde q^{n - 1 - m}} + {\tilde q^{\ast (n - 1 - m)}})/2\), the summations can be evaluated explicitly, leading to Eq. (44).
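Spelling out this step, since \(\tilde q\) and \({\tilde q^\ast}\) commute,

$$\sum\limits_{m = 0}^{n - 1} {{{\tilde q}^m}} {\rm{R}}({\tilde q^{n - 1 - m}}) = {1 \over 2}\sum\limits_{m = 0}^{n - 1} {{{\tilde q}^{n - 1}}} + {1 \over 2}\sum\limits_{m = 0}^{n - 1} {{{\tilde q}^m}} {\tilde q^{\ast (n - 1 - m)}} = {1 \over 2}\left( {n{{\tilde q}^{n - 1}} + {{{{\tilde q}^n} - {{\tilde q}^{\ast n}}} \over {\tilde q - {{\tilde q}^\ast}}}} \right),$$

where the second sum is a finite geometric-type sum of the commuting quantities \(\tilde q\) and \({\tilde q^\ast}\).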

For n < 0, we use the recurrent relation

$${{\partial ({{(q - {q_0})}^{ - n}})} \over {\partial q}} = {\tilde q^{ - 1}}\left[ {{{\partial {{\tilde q}^{ - (n - 1)}}} \over {\partial q}} - {\rm{R}}({{\tilde q}^{ - n}})} \right]$$
((47))

and the result

$${{\partial {{(q - {q_0})}^{ - 1}}} \over {\partial q}} = - {\tilde q^{ - 1}}{\rm{R}}({\tilde q^{ - 1}}){.}$$
((48))

Eq. (44) is proven by using induction as for n > 0. More details are given in Appendix B.

Theorem 1 Assume that f : H → H admits a power series representation \(f(q): = g(\tilde q): = \sum\nolimits_{n = - \infty}^\infty {{a_n}} {\tilde q^n}\), where the \({a_n}\) are quaternion constants and \(\tilde q = q - {q_0}\), for \({R_1} \leq \vert \tilde q\vert \leq {R_2}\) with constants \({R_1},{R_2} > 0\). Then we have

$${{\partial f(q)} \over {\partial q}} = {1 \over 2}\left[ {f\prime (q) + (g(\tilde q) - g({{\tilde q}^\ast})){{(\tilde q - {{\tilde q}^\ast})}^{ - 1}}} \right],$$
((49))

where f′(q) is the derivative in the usual sense, i.e.,

$$f\prime (q): = \sum\limits_{n = - \infty}^\infty n {a_n}{\tilde q^{n - 1}} = \sum\limits_{n = - \infty}^\infty n {a_n}{(q - {q_0})^{n - 1}}{.}$$
((50))

Proof Using Lemma 1 and the restricted left-linearity of HR gradients, we have

$$\begin{array}{*{20}c}{{{\partial f} \over {\partial q}} = {1 \over 2}\sum\limits_{n = - \infty}^\infty {{a_n}} [n{{\tilde q}^{n - 1}} + ({{\tilde q}^n} - {{\tilde q}^{\ast n}}){{(\tilde q - {{\tilde q}^\ast})}^{ - 1}}]} \\ {\quad = {1 \over 2}f\prime (q) + {1 \over 2}\left[ {\sum\limits_{n = - \infty}^\infty {{a_n}} ({{\tilde q}^n} - {{\tilde q}^{\ast n}})} \right]{{(\tilde q - {{\tilde q}^\ast})}^{ - 1}}} \\ {\quad = {1 \over 2}[f\prime (q) + (g(\tilde q) - g({{\tilde q}^\ast})){{(\tilde q - {{\tilde q}^\ast})}^{ - 1}}],} \end{array}$$

which proves the theorem.

The functions f(q) form a class of regular functions on H. A full discussion of such functions is beyond the scope of this paper. However, we note that a similar class of functions has been discussed in Gentili and Struppa (2007). A parallel development for the class considered here is possible, and will be the topic of a future paper. Meanwhile, we observe that many useful elementary functions satisfy the conditions in Theorem 1. To illustrate the application of the theorem, we list below the derivatives of a number of such functions.

Example 1 The exponential function \(f(q) = {{\rm{e}}^q}\) has the representation

$${{\rm{e}}^q}: = \sum\limits_{n = 0}^\infty {{{{q^n}} \over {n!}}}{.}$$
((51))

Applying Theorem 1 with \({a_n} = 1/n!\) and \({q_0} = 0\), we have

$${{\partial {{\rm{e}}^q}} \over {\partial q}} = {1 \over 2}\left( {{{\rm{e}}^q} + {{{{\rm{e}}^q} - {{\rm{e}}^{q\ast}}} \over {q - {q^\ast}}}} \right){.}$$
((52))

Making use of \({{\rm{e}}^q} = {{\rm{e}}^{{q_a} + {\bf{\hat v}}v}} = {{\rm{e}}^{{q_a}}}{{\rm{e}}^{{\bf{\hat v}}v}} = {{\rm{e}}^{{q_a}}}(\cos v + {\bf{\hat v}}\sin v)\), which follows from the representation \(q = {q_a} + {\bf{\hat v}}v\) and \({{\bf{\hat v}}^2} = - 1\), we have

$${{\partial {{\rm{e}}^q}} \over {\partial q}} = {1 \over 2}({{\rm{e}}^q} + {{\rm{e}}^{{q_a}}}{v^{ - 1}}\sin v){.}$$
((53))
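The closed form of Eq. (53) can be checked against a direct finite-difference evaluation of Eq. (11); the sketch below is ours, with qexp implemented from the closed-form expression above and the step size and test point chosen arbitrarily.

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

E = np.eye(4)

def qexp(q):
    """e^q = e^{q_a}(cos v + v_hat sin v), with v = |I(q)|."""
    v = np.linalg.norm(q[1:])
    out = np.zeros(4)
    out[0] = np.cos(v)
    if v > 0:
        out[1:] = q[1:] / v * np.sin(v)
    return np.exp(q[0]) * out

def left_hr_grad(f, q, h=1e-6):
    """Left restricted HR derivative, Eq. (11), by central differences."""
    d = [(f(q + h*E[m]) - f(q - h*E[m])) / (2*h) for m in range(4)]
    return (d[0] - qmul(d[1], E[1]) - qmul(d[2], E[2]) - qmul(d[3], E[3])) / 4

q = np.array([0.4, 0.9, -0.3, 1.1])
v = np.linalg.norm(q[1:])

numeric = left_hr_grad(qexp, q)
closed  = 0.5 * (qexp(q) + np.exp(q[0]) * np.sin(v) / v * E[0])   # Eq. (53)
print(np.max(np.abs(numeric - closed)))   # small (finite-difference error only)
```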

Example 2 The logarithmic function f(q) = ln q has representation

$$\ln q = \sum\limits_{n = 1}^\infty {{{{{( - 1)}^{n - 1}}} \over {n}}} {(q - 1)^n},$$
((54))

with \({a_n} = {( - 1)^{n - 1}}/n\) and q0 = 1. Since q0 is a real number, \(g({\tilde q^\ast}) = f({q^\ast})\). Therefore, from Theorem 1 we have

$${{\partial \ln q} \over {\partial q}} = {1 \over 2}\left( {{q^{ - 1}} + {{\ln q - \ln {q^\ast}} \over {q - {q^\ast}}}} \right){.}$$
((55))

Using the representation \(\ln q = \ln \vert q\vert + {\bf{\hat v}}\arccos ({q_a}/\vert q\vert )\), the expression can be simplified as

$${{\partial \ln q} \over {\partial q}} = {1 \over 2}\left( {{q^{ - 1}} + {1 \over v}\arccos {{{q_a}} \over {\vert q\vert}}} \right)$$
((56))

where v = |I(q)|.

Example 3 Hyperbolic tangent function f(q) = tanh q is defined as

$$\tanh q: = {{{{\rm{e}}^q} - {{\rm{e}}^{ - q}}} \over {{{\rm{e}}^q} + {{\rm{e}}^{ - q}}}} = q - {{{q^3}} \over 3} + {{2{q^5}} \over {15}} - \cdots{.}$$
((57))

Therefore, Theorem 1 applies. On the other hand, using the relation \({{\rm{e}}^q} = {{\rm{e}}^{{q_a}}}(\cos v + {\bf{\hat v}}\sin v)\), we can show that

$$\tanh q = {1 \over 2}{{\sinh (2{q_a}) + {\bf{\hat v}}\sin (2v)} \over {{{\sinh}^2}{q_a} + {{\cos}^2}v}}{.}$$
((58))

Then the second term in the expression given by Theorem 1 can be simplified. The final expression can be written as

$${{\partial \tanh q} \over {\partial q}} = {1 \over 2}\left( {{\rm{sec}}{{\rm{h}}^2}q + {{{v^{ - 1}}\sin (2v)} \over {\cosh (2{q_a}) + \cos (2v)}}} \right),$$
((59))

where sechq := 1/cosh q is the quaternionic hyperbolic secant function.

Remark 2 Of course, the derivatives of these functions can also be found by direct calculation, without resorting to Theorem 1.

We now turn to a question of more theoretical interest. Even though it might not be obvious from the definitions, the following theorem shows that the restricted HR derivative is consistent with the derivative in the real domain for a class of functions, including those in the above examples:

Theorem 2 For the function f(q) in Theorem 1, if q0 is a real number, then

$${{\partial f(q)} \over {\partial q}} \rightarrow f\prime (q),$$
((60))

when q → R(q), i.e., when q approaches a real number.

Proof Using the polar representation, we write \(\tilde q = \vert \tilde q\vert \exp ({\bf{\hat v}}\theta )\), where \(\theta = \arccos ({\rm{R}}(\tilde q)/\vert \tilde q\vert ) \in [0,\pi ]\) is the argument of \(\tilde q\) and \(v = \vert {\rm{I}}(\tilde q)\vert\). Then \({\tilde q^n} = \vert \tilde q{\vert ^n}\exp (n{\bf{\hat v}}\theta )\), and

$$({\tilde q^n} - {\tilde q^{\ast n}}){(\tilde q - {\tilde q^\ast})^{ - 1}} = {{{\rm{I}}({{\tilde q}^n})} \over {{\rm{I}}(\tilde q)}} = {{\vert \tilde q{\vert ^{n - 1}}\sin (n\theta )} \over {\sin \theta}}{.}$$
((61))

For real \({q_0}\), \(\tilde q \rightarrow {q_a} - {q_0}\) and v → 0 when q → R(q). There are two possibilities. First, if \({q_a} - {q_0} \geq 0\), then θ → 0 in the limit. Thus,

$${{\sin (n\theta )} \over {\sin \theta}}\sim {{\sin (n\theta )} \over \theta} \rightarrow n,\quad \vert \tilde q{\vert ^{n - 1}} \rightarrow {({q_a} - {q_0})^{n - 1}}{.}$$
((62))

Therefore,

$$({\tilde q^n} - {\tilde q^{\ast n}}){(\tilde q - {\tilde q^\ast})^{ - 1}} \rightarrow n{\tilde q^{n - 1}},$$
((63))
$$[g(\tilde q) - g({\tilde q^\ast})]{(\tilde q - {\tilde q^\ast})^{ - 1}} \rightarrow \sum\limits_{n = - \infty}^\infty n {a_n}{\tilde q^{n - 1}} = f\prime (q){.}$$
((64))

Thus,

$${{\partial f(q)} \over {\partial q}} \rightarrow {1 \over 2}[f\prime (q) + f\prime (q)] = f\prime (q){.}$$
((65))

Second, if \({q_a} - {q_0} < 0\), then θ → π. Thus,

$${{\sin (n\theta )} \over {\sin \theta}}\sim {{\sin (n\theta )} \over {\pi - \theta}}{.}$$
((66))

Noting that \(\sin (n\theta ) = \sin [n\pi - n(\pi - \theta )] = {( - 1)^{n - 1}}\sin [n(\pi - \theta )]\), we have

$${{\sin (n\theta )} \over {\sin \theta}}\sim {{{{( - 1)}^{n - 1}}\sin (n(\pi - \theta ))} \over {\pi - \theta}} \rightarrow {( - 1)^{n - 1}}n{.}$$
((67))

On the other hand, in this case \(\vert \tilde q\vert \rightarrow - ({q_a} - {q_0})\), and hence \(\vert \tilde q{\vert ^{n - 1}} \rightarrow {( - 1)^{n - 1}}{({q_a} - {q_0})^{n - 1}}\). Since \(\tilde q \rightarrow {q_a} - {q_0}\), as a consequence, we have

$$({\tilde q^n} - {\tilde q^{\ast n}}){(\tilde q - {\tilde q^\ast})^{ - 1}} \rightarrow n{\tilde q^{n - 1}},$$
((68))

which is the same as Eq. (63). The proof then follows as in the first case.

The functions in the above three examples all satisfy the conditions of Theorem 2. Hence, we expect Theorem 2 to apply, and one can easily verify by direct calculation that the theorem indeed holds.

5 Right restricted HR gradients

In this section, we briefly summarize the results for the right restricted HR gradients, and highlight the difference from the left restricted HR gradients.

  1. 1.

    Right-linearity: For arbitrary quaternion constants α and β, and functions f(q) and g(q), we have

    $${{{\partial ^{\rm{R}}}(f\alpha + g\beta )} \over {\partial {q^\nu}}} = {{{\partial ^{\rm{R}}}f} \over {\partial {q^\nu}}}\alpha + {{{\partial ^{\rm{R}}}g} \over {\partial {q^\nu}}}\beta{.}$$
    ((69))

    However, linearity does not hold for left multiplications, i.e., in general \({\partial ^{\rm{R}}}\alpha f/\partial q \neq \alpha {\partial ^{\rm{R}}}f/\partial q\).

  2. 2.

    The first product rule: For the right restricted HR operator, the following product rule holds:

    $${[\nabla _q^{\rm{R}}(fg)]^{\rm{T}}} = {[(\nabla _q^{\rm{R}}f)g]^{\rm{T}}} + {J^\ast}[f{({\nabla _r}g)^{\rm{T}}}]{.}$$
    ((70))

    The second and third product rules are the same as those of the left restricted operator.

  3. 3.

    The first chain rule: For the composite function f(g(q)), we have

    $${(\nabla _q^{\rm{R}}f)^{\rm{T}}} = {M^{\rm{T}}}{(\nabla _q^{g{\rm{R}}}f)^{\rm{T}}}{.}$$
    ((71))
  4. 4.

    The second chain rule becomes \({(\nabla _q^{\rm{R}}f)^{\rm{T}}} = {O^{\rm{T}}}{(\nabla _r^gf)^{\rm{T}}}\).

  5. 5.

    The third chain rule becomes \({\partial ^{\rm{R}}}f/\partial {q^\nu} = (\partial g/\partial {q^\nu})(\partial f/\partial g)\). Note that \(\partial g/\partial {q^\nu} = {\partial ^{\rm{R}}}g/\partial {q^\nu}\) since g is real-valued. We thus have omitted the superscript ‘R’. Also, ∂f/∂g is a real derivative, so there is no distinction between left and right derivatives.

We can also find the right restricted HR gradients for common quaternion functions. First of all, Lemma 1 is also true for right derivatives.

Lemma 2 For f(q) = (qq0)n with n an integer and q0 a constant quaternion, we have

$${{{\partial ^{\rm{R}}}f(q)} \over {\partial q}} = {1 \over 2}\left( {n{{\tilde q}^{n - 1}} + {{{{\tilde q}^n} - {{\tilde q}^{\ast n}}} \over {\tilde q - {{\tilde q}^\ast}}}} \right),$$
((72))

with \(\tilde q = q - {q_0}\).

Remark 3 To prove the lemma, we use the following recurrent relations:

$${{{\partial ^{\rm{R}}}{{(q - {q_0})}^n}} \over {\partial q}} = {{{\partial ^{\rm{R}}}{{\tilde q}^{n - 1}}} \over {\partial q}}\tilde q + {\rm{R}}({\tilde q^{n - 1}}),$$
((73))
$${{{\partial ^{\rm{R}}}({{(q - {q_0})}^{ - n}})} \over {\partial q}} = \left[ {{{{\partial ^{\rm{R}}}{{\tilde q}^{ - (n - 1)}}} \over {\partial q}} - {\rm{R}}({{\tilde q}^{ - n}})} \right]{\tilde q^{ - 1}}{.}$$
((74))

Using Lemma 2, we can prove the following result:

Theorem 3 Assume that f : H → H admits a power series representation \(f(q): = g(\tilde q): = \sum\nolimits_{n = - \infty}^\infty {{{\tilde q}^n}} {a_n}\), where the \({a_n}\) are quaternion constants and \(\tilde q = q - {q_0}\), for \({R_1} \leq \vert \tilde q\vert \leq {R_2}\) with constants \({R_1},{R_2} > 0\). Then we have

$${{{\partial ^{\rm{R}}}f(q)} \over {\partial q}} = {1 \over 2}\left[ {f\prime (q) + {{(\tilde q - {{\tilde q}^\ast})}^{ - 1}}(g(\tilde q) - g({{\tilde q}^\ast}))} \right],$$
((75))

where f′(q) is the derivative in the usual sense, i.e.,

$$f\prime (q): = \sum\limits_{n = - \infty}^\infty n {\tilde q^{n - 1}}{a_n} = \sum\limits_{n = - \infty}^\infty n {(q - {q_0})^{n - 1}}{a_n}{.}$$
((76))

Note that the functions f(q) in Theorem 3 in general form a class of functions different from the one in Theorem 1, because in the series representation \({a_n}\) appears on the right-hand side of the powers. However, if the \({a_n}\) are real, the two classes of functions coincide. Therefore, we have the following result:

Theorem 4 If the coefficients \({a_n}\) are real, then the left and right restricted HR gradients of f(q) coincide.

Remark 4 As a consequence, we can see immediately that the right derivatives for the exponential, logarithmic, and hyperbolic tangent functions are the same as the left ones.

Clearly, Theorem 2 also holds for the right derivatives. Hence, we have the following:

Theorem 5 The right-restricted HR gradient is consistent with the real gradient in the sense of Theorem 2.

6 Increment of a quaternion function

When f(q) is a real-valued function of a quaternion variable, the left and right restricted HR gradients coincide with the HR gradient. Moreover, we have

$${{{\partial ^{\rm{R}}}f} \over {\partial {q^\nu}}} = {{\partial f} \over {\partial {q^\nu}}} = {\left( {{{\partial f} \over {\partial q}}} \right)^\nu},$$
((77))

where ν ∈ {1, i, j, k}. Thus, only ∂f/∂q is independent. As a consequence (see also Mandic et al. (2011)),

$$\begin{array}{*{20}c}{{\rm{d}}f = \sum\limits_\nu {{{\partial f} \over {\partial {q^\nu}}}} {\rm{d}}{q^\nu} = \sum\limits_\nu {{{\left( {{{\partial f} \over {\partial q}}} \right)}^\nu}} {\rm{d}}{q^\nu}\;\;} \\ {\quad = \sum\limits_\nu {{{\left( {{{\partial f} \over {\partial q}}{\rm{d}}q} \right)}^\nu}} = 4{\rm{R}}\left( {{{\partial f} \over {\partial q}}{\rm{d}}q} \right),} \end{array}$$
((78))

where Eq. (77) has been used. Hence, −(∂f/∂q)* gives the steepest descent direction for f, and the increment is determined by ∂f/∂q.

On the other hand, if f is a quaternion-valued function, the increment will depend on all four derivatives. Taking f(q) = q2 as an example, we have (see Eqs. (42) and (44))

$${\rm{d}}{q^2} = (q + {q_a}){\rm{d}}q + {q_b}{\rm{id}}{q^{\rm{i}}} + {q_c}{\rm{jd}}{q^{\rm{j}}} + {q_d}{\rm{kd}}{q^{\rm{k}}},$$
((79))

even though f(q) appears to be independent of \({q^{\rm{i}}}\), \({q^{\rm{j}}}\), and \({q^{\rm{k}}}\). It can be verified that the above expression is the same as the differential form given in terms of \({\rm{d}}{q_a}\), \({\rm{d}}{q_b}\), \({\rm{d}}{q_c}\), and \({\rm{d}}{q_d}\). Thus, it is essential to include the contributions from \(\partial f/\partial {q^{\rm{i}}}\), etc.

We also note that, if the right gradient is used consistently, the same increment would be produced, since the basis of the definitions is the same, namely, the differential form in terms of \({\rm{d}}{q_a}\), \({\rm{d}}{q_b}\), \({\rm{d}}{q_c}\), and \({\rm{d}}{q_d}\).
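A quick numerical check of Eq. (79) (a sketch with our own helper names and an arbitrary small perturbation) compares the exact increment of q² with the right-hand side of the differential form:

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

E = np.eye(4)

def involution(q, m):
    """q^nu = -nu q nu with nu = E[m], m in {1, 2, 3}."""
    return -qmul(qmul(E[m], q), E[m])

rng = np.random.default_rng(0)
q  = rng.normal(size=4)
dq = 1e-5 * rng.normal(size=4)

lhs = qmul(q + dq, q + dq) - qmul(q, q)      # exact increment of q^2

# Eq. (79): dq^2 = (q + q_a) dq + q_b i dq^i + q_c j dq^j + q_d k dq^k
rhs = qmul(q + q[0]*E[0], dq)
for m in range(1, 4):
    rhs += q[m] * qmul(E[m], involution(dq, m))

print(np.max(np.abs(lhs - rhs)))             # O(|dq|^2), i.e. ~1e-10
```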

6.1 Quaternion-valued LMS algorithm

As an application, we now apply the quaternion-valued restricted HR gradient operator to develop the QLMS algorithm. Different versions of the QLMS algorithm have been derived in Barthelemy et al. (2014), Jiang et al. (2014b), and Tao and Chang (2014). However, with the rules we have derived, some of the calculations can be simplified, as we show below.

In terms of a standard adaptive filter, the output y[n] and error e[n] can be expressed as

$$y[n] = {w^{\rm{T}}}[n]x[n],\;\;e[n] = d[n] - {w^{\rm{T}}}[n]x[n],$$
((80))

where w[n] is the adaptive weight coefficient vector, d[n] the reference signal, and x[n] the input sample vector. The conjugate e*[n] of the error signal e[n] is

$${e^\ast}[n] = {d^\ast}[n] - {x^{\rm{H}}}[n]{w^\ast}[n]{.}$$
((81))

The cost function is defined as \(J[n] = e[n]{e^\ast}[n]\), which is real-valued. According to the discussion above and to Brandwood (1983) and Mandic et al. (2011), the conjugate gradient \({({\nabla _w}J[n])^\ast}\) gives the direction of steepest ascent of the cost surface. Therefore, it is used, with a negative sign, to update the weight vector. Specifically,

$$w[n + 1] = w[n] - \mu {({\nabla _w}J[n])^\ast},$$
((82))

where μ is the step size. To find ∇ w J, we use the first product rule:

$$\begin{array}{*{20}c}{{\nabla _w}J = {{\partial e[n]{e^\ast}[n]} \over {\partial w}}\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ { = e[n]{{\partial {e^\ast}[n]} \over {\partial w}} + {1 \over 4}\left( {{{\partial e[n]} \over {\partial {w_a}}}{e^\ast}[n] - {{\partial e[n]} \over {\partial {w_b}}}{e^\ast}[n]{\rm{i}}} \right.} \\ {\left. { - {{\partial e[n]} \over {\partial {w_c}}}{e^\ast}[n]{\rm{j}} - {{\partial e[n]} \over {\partial {w_d}}}{e^\ast}[n]{\rm{k}}} \right){.}\quad \quad \quad} \end{array}$$
((83))

After some algebra, we find \({\nabla _w}J[n] = - x[n]{e^\ast}[n]/2\), which, with the factor 1/2 absorbed into the step size μ, leads to the following update equation for the QLMS algorithm:

$$w[n + 1] = w[n] + \mu (e[n]{x^\ast}[n]){.}$$
((84))
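As an illustration only, the following sketch runs the update of Eq. (84) on a toy noise-free system-identification problem; the quaternion storage scheme, helper names, step size, and data model are our own assumptions rather than part of the original derivation.

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def wTx(w, x):
    """w^T x for quaternion vectors stored as (M, 4) arrays."""
    return sum(qmul(wm, xm) for wm, xm in zip(w, x))

rng = np.random.default_rng(1)
M, mu, n_iter = 4, 2e-3, 5000
w_true = rng.normal(size=(M, 4))          # unknown system to identify (toy setup)
w = np.zeros((M, 4))

for _ in range(n_iter):
    x = rng.normal(size=(M, 4))           # quaternion-valued input snapshot
    d = wTx(w_true, x)                    # noise-free reference signal
    e = d - wTx(w, x)                     # error, Eq. (80)
    # QLMS update, Eq. (84): w <- w + mu * e x*
    w += mu * np.array([qmul(e, qconj(xm)) for xm in x])

print(np.max(np.abs(w - w_true)))         # should be small after convergence
```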

6.2 Quaternion-valued nonlinear adaptive algorithm

Another application is the derivation of quaternion-valued nonlinear adaptive filtering algorithms. We use the quaternion-valued hyperbolic tangent function as an example (Roberts and Jayabalan, 2015), so that the output s[n] of the adaptive filter is given by \(s[n] = \tanh (y[n]) = \tanh ({w^{\rm{T}}}[n]x[n])\). The cost function is \(J[n] = e[n]{e^\ast}[n]\), with \(e[n] = d[n] - \tanh ({w^{\rm{T}}}[n]x[n])\).

Using the product rules in Eq. (83) and chain rules, and letting \(y[n] = {w^{\rm{T}}}[n]x[n]\), we have

$$\begin{array}{*{20}c}{{{\partial {e^\ast}[n]} \over {\partial w[n]}} = - \left( {{{\partial \tanh ({y^\ast}[n])} \over {\partial {{({y^\ast}[n])}_a}}}{{\partial {{({y^\ast}[n])}_a}} \over {\partial w[n]}}} \right.} \\ {\quad \;\; + {{\partial \tanh ({y^\ast}[n])} \over {\partial {{({y^\ast}[n])}_b}}}{{\partial {{({y^\ast}[n])}_b}} \over {\partial w[n]}}} \\ {\quad \; + {{\partial \tanh ({y^\ast}[n])} \over {\partial {{({y^\ast}[n])}_c}}}{{\partial {{({y^\ast}[n])}_c}} \over {\partial w[n]}}} \\ {\left. {\quad \;\;\;\, + {{\partial \tanh ({y^\ast}[n])} \over {\partial {{({y^\ast}[n])}_d}}}{{\partial {{({y^\ast}[n])}_d}} \over {\partial w[n]}}} \right){.}} \end{array}$$
((85))

Let u = |I(y)| and û = I(y)/u. Then the quaternion \(y = {y_a} + {\rm{I}}(y)\) can also be written as \(y = {y_a} + u{\bf{\hat u}}\). Here, û is a pure unit quaternion. Finally, by using Eq. (58), the gradient can be expressed as follows:

$$\begin{array}{*{20}c}{{\nabla _w}J[n] = {1 \over {4{{({{\sinh}^2}{y_a} + {{\cos}^2}u)}^2}}}\quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ {\cdot\left[ {\left( {2\sin (2u)({e_a}{{\sin}^2}{y_a} + \sin (2u){{(e{\bf{\hat u}})}_a})\quad \quad} \right.} \right.} \\ {\left. { + (\cos u - {{\sin u} \over u})({{\sinh}^2}{y_a} + {{\cos}^2}u){{(e{\bf{\hat u}})}_a}} \right)x{\bf{\hat u}}} \\ { + {e_a}\left( {({{\sinh}^2}{y_a} + {{\cos}^2}u)({{\sin u} \over u} - 4\cosh (2{y_a}))} \right.} \\ {\left. { + \sinh (2{y_a})({{\sinh}^2}{y_a} - \sin (2u){{(e{\bf{\hat u}})}_a})} \right)x\quad \quad} \\ {\left. { + 2{{\sin u} \over u}({{\sinh}^2}{y_a} + {{\cos}^2}u){{(e{x_a} + {e^\ast}x)}_a})} \right]{.}\quad \;} \end{array}$$
((86))

Substituting the above result into Eq. (82) we can then obtain the update equation for the nonlinear adaptive algorithm.

On the other hand, if we use the series representation of tanh q, we can obtain another form of the gradient function and the corresponding update equation becomes

$$\begin{array}{*{20}c}{w[n + 1]\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \;\;} \\ { = w[n] + {1 \over 2}\mu \sum\limits_{m = 0}^\infty {\sum\limits_{r = 0}^{m - 1} {{a_m}}} {{({x^{\rm{H}}}[n]{w^\ast}[n])}^{m - 1 - r}}} \\ { \cdot e[n]{{({x^{\rm{H}}}[n]{w^\ast}[n])}^r}{x^\ast}[n],\quad \quad \quad \quad} \end{array}$$
((87))

where \({a_m}\) is the coefficient in the series representation of tanh(y[n]), i.e., \(\tanh (y[n]) = \sum\nolimits_{m = 0}^\infty {{a_m}} {(y[n])^m}\). It can be shown that, if the factors in the gradient part of the above expression were commutative, it would reduce to the same form as in the real or complex domain.
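Since the closed-form gradient in Eq. (86) is lengthy, the sketch below illustrates the nonlinear algorithm with a finite-difference gradient over the 4M real weight components instead of Eq. (86); because J[n] is real-valued, this is equivalent to the update of Eq. (82) up to a factor of 4 absorbed into the step size. The toy setup, helper names, and step sizes are our own assumptions, not the paper's implementation.

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

def qtanh(q):
    """tanh(q) from Eq. (58), with u = |I(q)|."""
    u = np.linalg.norm(q[1:])
    denom = np.sinh(q[0])**2 + np.cos(u)**2
    out = np.zeros(4)
    out[0] = 0.5 * np.sinh(2*q[0]) / denom
    if u > 0:
        out[1:] = 0.5 * np.sin(2*u) / denom * q[1:] / u
    return out

def wTx(w, x):
    return sum(qmul(wm, xm) for wm, xm in zip(w, x))

def cost(w, x, d):
    e = d - qtanh(wTx(w, x))
    return np.dot(e, e)                       # J = e e* = |e|^2 (real-valued)

rng = np.random.default_rng(2)
M, mu, h = 3, 5e-2, 1e-6
w_true = 0.3 * rng.normal(size=(M, 4))
w = np.zeros((M, 4))

for _ in range(3000):
    x = 0.5 * rng.normal(size=(M, 4))
    d = qtanh(wTx(w_true, x))                 # noise-free reference
    # numeric gradient of J over the 4M real weight components
    g = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        wp, wm = w.copy(), w.copy()
        wp[idx] += h; wm[idx] -= h
        g[idx] = (cost(wp, x, d) - cost(wm, x, d)) / (2*h)
    w -= mu * g                               # cf. Eq. (82), step size rescaled

print(np.max(np.abs(w - w_true)))             # should be small for this toy setup
```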

7 Application to adaptive beamforming based on vector sensor arrays

As an example for the application of quaternion-valued signal processing, we here consider the reference signal based adaptive beamforming problem for vector sensor arrays consisting of multiple crossed-dipoles, where the earlier derived QLMS algorithm can be employed for beamforming.

7.1 Vector sensor arrays with a quaternion model

A general structure for a uniform linear array (ULA) with M crossed-dipole pairs is shown in Fig. 1, where the pairs are located along the y-axis with an inter-element spacing d, and at each location the two crossed components are parallel to the x-axis and the y-axis, respectively. For a far-field incident signal with a direction of arrival (DOA) defined by the angles θ and φ, its spatial steering vector is given by

$${S_c}(\theta ,\phi ) = \left[ {\begin{array}{*{20}c}1 \\ {\exp ( - {\rm{j}}2\pi {{d}}\sin \theta \sin \phi /\lambda )} \\ \vdots \\ {\exp ( - {\rm{j}}2\pi (M - 1)d\sin \theta \sin \phi /\lambda )} \end{array}} \right],$$
((88))

where λ is the wavelength of the incident signal. For a crossed dipole, the spatial-polarization coherent vector is given by (Compton, 1981; Li and Compton, 1991; Zhang et al., 2014; Hawes and Liu, 2015):

$${S_{\rm{p}}}(\theta ,\phi ,\gamma ,\eta ) = \left\{ {\begin{array}{*{20}c}{[ - \cos \gamma ,\cos \theta \sin \gamma {{\rm{e}}^{{\rm{j}}\eta}}],} & {\phi = {\pi \over 2},\quad} \\ {[\cos \gamma , - \cos \theta \sin \gamma {{\rm{e}}^{{\rm{j}}\eta}}],} & {\phi = - {\pi \over 2},\;} \end{array}} \right.$$
((89))

where γ is the auxiliary polarization angle with γ ∈ [0, π/2], and η ∈ [−π, π] is the polarization phase difference.

Fig. 1 A uniform linear array (ULA) with crossed-dipoles

The array structure can be divided into two sub-arrays: one parallel to the x-axis and one to the y-axis. The complex-valued steering vector of the x-axis sub-array is given by

$${S_x}(\theta ,\phi ,\gamma ,\eta ) = \left\{ {\begin{array}{*{20}c}{ - \cos \gamma {S_{\rm{c}}}(\theta ,\phi ),} & {\phi = {\pi \over 2},\quad} \\ {\cos \gamma {S_{\rm{c}}}(\theta ,\phi ),\;\;} & {\phi = {{ - \pi} \over 2},\;} \end{array}} \right.$$
((90))

and for the y-axis it is expressed as

$${S_y}(\theta ,\phi ,\gamma ,\eta ) = \left\{ {\begin{array}{*{20}c}{\cos \theta \sin \gamma {{\rm{e}}^{{\rm{j}}\eta}}{S_{\rm{c}}}(\theta ,\phi ),\;} & {\phi = {\pi \over 2},\quad} \\ { - \cos \theta \sin \gamma {{\rm{e}}^{{\rm{j}}\eta}}{S_{\rm{c}}}(\theta ,\phi ),} & {\phi = {{ - \pi} \over 2}.\;} \end{array}} \right.$$
((91))

Combining the two complex-valued subarray steering vectors together, an overall quaternion-valued steering vector with one real part and three imaginary parts can be constructed as

$$\begin{array}{*{20}c}{{S_{\rm{q}}}(\theta ,\phi ,\gamma ,\eta )\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad} \\ { = \Re \{ {S_x}(\theta ,\phi ,\gamma ,\eta )\} + {\rm{i}}\Re \{ {S_y}(\theta ,\phi ,\gamma ,\eta )\} \quad \quad \quad} \\ { + {\rm{j}}\Im \{ {S_x}(\theta ,\phi ,\gamma ,\eta )\} + {\rm{k}}\Im \{ {S_y}(\theta ,\phi ,\gamma ,\eta )\} ,} \end{array}$$
((92))

where \({\Re \{\cdot \}}\) and \({\Im \{\cdot \}}\) are the real and imaginary parts of a complex number/vector, respectively. Given a set of coefficients, the response of the array is given by

$$r(\theta ,\phi ,\gamma ,\eta ) = {w^{\rm{H}}}{S_{\rm{q}}}(\theta ,\phi ,\gamma ,\eta ),$$
((93))

where w is the quaternion-valued weight vector.
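To illustrate the signal model, the sketch below constructs the quaternion-valued steering vector of Eq. (92) and evaluates the response of Eq. (93); the function names, the toy weight vector, and the default parameters (M = 16 elements, half-wavelength spacing) are our own choices for illustration.

```python
import numpy as np

def qmul(p, q):
    pa, pb, pc, pd = p; qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def steering_quaternion(theta, phi, gamma, eta, M=16, d_over_lambda=0.5):
    """Quaternion-valued steering vector S_q of Eq. (92), shape (M, 4)."""
    m = np.arange(M)
    # Eq. (88): spatial steering vector of the ULA along the y-axis
    Sc = np.exp(-1j * 2*np.pi * m * d_over_lambda * np.sin(theta) * np.sin(phi))
    # Eqs. (89)-(91): x- and y-axis subarray vectors (phi restricted to +/- pi/2)
    sign = 1.0 if np.isclose(phi, np.pi/2) else -1.0
    Sx = -sign * np.cos(gamma) * Sc
    Sy = sign * np.cos(theta) * np.sin(gamma) * np.exp(1j*eta) * Sc
    # Eq. (92): pack the two complex vectors into one quaternion vector
    return np.stack([Sx.real, Sy.real, Sx.imag, Sy.imag], axis=1)

def response(w, Sq):
    """Array response r = w^H S_q, Eq. (93)."""
    return sum(qmul(qconj(wm), sm) for wm, sm in zip(w, Sq))

Sq = steering_quaternion(np.deg2rad(15), np.pi/2, np.deg2rad(30), 0.0)
w = Sq / (16 * 4)            # a toy, conventional-beamformer-like weight for illustration
print(response(w, Sq))       # response toward the assumed look direction
```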

7.2 Reference signal based adaptive beamforming

Suppose one of the incident signals to the array is the desired one and the remaining signals are interferences. Then the aim of beamforming is to receive the desired signal while suppressing the interferences at the output of the beamformer (Liu and Weiss, 2010). When a reference signal d[n] is available, adaptive beamforming can be implemented by the standard adaptive filtering structure (Fig. 2), where \({x_m}[n]\;\;(m = 1,2, \cdots ,M)\) are the received quaternion-valued input signals through the M pairs of crossed dipoles, and \({w_m}[n]\;\;(m = 1,2, \cdots ,M)\) the corresponding quaternion-valued weight coefficients. The beamformer output y[n] and the error signal e[n] are

$$\left\{ {\begin{array}{*{20}c}{y[n] = {w^{\rm{T}}}[n]x[n],\quad \quad \;} \\ {e[n] = d[n] - {w^{\rm{T}}}[n]x[n],} \end{array}} \right.$$
((94))

where

$$\left\{ {\begin{array}{*{20}c}{w[n] = {{[{w_1}[n],{w_2}[n], \ldots ,{w_M}[n]]}^{\rm{T}}},} \\ {x[n] = {{[{x_1}[n],{x_2}[n], \ldots ,{x_M}[n]]}^{\rm{T}}}{.}} \end{array}} \right.$$
((95))
Fig. 2 Reference signal based adaptive beamforming structure

Simulations are performed based on such an array with 16 crossed dipoles and half-wavelength spacing using the QLMS algorithm in Eq. (84). The step size μ is set to \(2 \times {10^{ - 4}}\). A desired signal with a 20 dB signal-to-noise ratio (SNR) impinges from the broadside of the array (θ = 15°) and two interfering signals with a signal-to-interference ratio (SIR) of −10 dB arrive from the directions (30°, 90°) and (15°, −90°), respectively. All the signals have the same polarization of (γ, η) = (30°, 0). The learning curve obtained by averaging results from 200 simulation runs is shown in Fig. 3 and the resultant beam pattern is shown in Fig. 4, where for convenience positive values of θ indicate the value range θ ∈ [0°, 90°] for φ = 90°, while negative values of θ ∈ [−90°, 0°] indicate an equivalent range of θ ∈ [0°, 90°] with φ = −90°. We can see that the ensemble mean square error has reached almost −30 dB and two nulls have been formed successfully in the two interference directions, demonstrating the effectiveness of the quaternion-valued signal model and the derived QLMS algorithm.

Fig. 3 Learning curve of the QLMS algorithm

Fig. 4 Resultant beam pattern of the QLMS algorithm

8 Conclusions

We have proposed a restricted HR gradient operator and discussed its properties, in particular, several different versions of product rules and chain rules. Using the rules that we have established, we derived a general formula for the derivative of a large class of nonlinear quaternion-valued functions. The class includes the common elementary functions such as the exponential function and the logarithmic function. We also proved that, for a wide class of functions, the restricted HR gradient reduces to the usual derivative of a real function with respect to a real variable when the independent quaternion variable tends to the real axis, thus showing the consistency of the definition. Both linear and nonlinear adaptive filtering algorithms are derived to show the applications of the operator. An adaptive beamforming example based on vector sensor arrays has also been provided to demonstrate the effectiveness of the quaternion-valued signal model and the derived signal processing algorithm.