1 Introduction

The automation of tasks linked to video surveillance has recently received a lot of attention from computer vision researchers [1]. This is because cameras located in avenues or public places allow person tracking, anomaly detection or person re-identification (Re-Id). The aim of person re-identification is to identify the same person across different cameras with non-overlapping views [2]. Re-Id is a challenging problem because of illumination changes, occlusion, pedestrian poses, background clutter, viewpoint changes and so on [3].

Illumination changes generate variance in a person’s appearance [4], for instance, false colors in the person’s clothes. Recovering the true colors is therefore necessary to decrease the error of descriptors such as color histograms. Moreover, feature extraction is a stage of Re-Id [5] that strongly influences the image comparison carried out in a second stage. For these reasons, color is an important feature for person re-identification.

The problem of finding colors that are robust to illumination changes is called color constancy (also color invariance, photometric normalization or color correction). To address this problem, several alternatives have been proposed for Re-Id algorithms. Ma et al. [6] propose using the Gray World algorithm to improve the parts of the person’s body. Kviatkovsky et al. [7] group colors in a log-chromaticity space of the upper and lower parts of the human body, assuming that the color clouds are invariant under some conditions. Jung et al. [8] use the Shades of Gray algorithm to improve the person images. Eisenbach et al. [9] improve the image in a learning process that takes as reference the person’s clothes and the variations of illumination as the person passes through the scene. Allouch [10] applies the Retinex algorithm to improve images. Monari [11] developed an algorithm that uses shadow information of the person to estimate local illumination. Liao et al. [12] developed a version of the multiscale Retinex algorithm to get vivid colors in person images. However, previous works on color correction in Re-Id do not take into consideration the connections among the color channels of the person image.

Person images have edges and isolated colors that arise from self-shadows or some anomaly. These isolated colors and edges are considered false colors because they have no relation to the nearby pixel colors, and they are ignored by algorithms such as Gray World, which were designed for generic images (such as landscapes). We propose a new algorithm for color correction in person images that uses a full-quaternion (to preserve the relationships among colors) and the Quaternion Fast Fourier Transform (QFFT) to remove the false colors mentioned above.

The contributions of this paper are summarized as follows. First, a full-quaternion is designed using local and neighborhood information of each pixel. Second, the person image is modified in the frequency domain through its spectral modulus. Third, the image is displayed with an adaptive gamma function.

This paper is organized as follows. In Sect. 2, the algebra of quaternions is explained as background for the following sections. The proposed approach, including the full-quaternion, the QFFT and the gamma function, is explained in Sect. 3. Experimental results are presented and analyzed in Sect. 4. Finally, the conclusions are set out in Sect. 5.

2 Algebra of the Quaternions

In 1843, Hamilton introduced the quaternions [13], denoted by the letter \( \mathcal {H} \). If \( q \in \mathcal {H} \), then it is represented as follows:

$$\begin{aligned} \mathcal {H} = \{ q = t + xi + yj + zk \mid t,x,y,z \in \mathcal {R} \} \end{aligned}$$
(1)

Where the imaginary units \( \mathbf {i,j,k} \) obey the following rules: \( \{ i^2 = j^2 = k^2 = ijk = -1, ij = k = -ji, ki = j = -ik, jk = i = -kj \} \). These rules show that the multiplication of quaternions is not commutative. Another way to represent a quaternion is:

$$\begin{aligned} q = Sq + Vq \end{aligned}$$
(2)

Where \( Sq = t \) denotes the real or scalar part and \( Vq \) denotes the vector or imaginary part:

$$\begin{aligned} Vq = xi + yj + zk \in \mathcal {H}(Vq) \end{aligned}$$
(3)

If \( Sq = 0 \), the quaternion is a pure quaternion and \( q = Vq \); if \( Sq \ne 0 \), the quaternion is a full-quaternion. Other properties and operations are:

$$\begin{aligned} | q | = (t^2 + x^2 + y^2 + z^2)^{1/2} \end{aligned}$$
(4)
$$\begin{aligned} \bar{q} = Sq - Vq \end{aligned}$$
(5)

Where expressions (4) and (5) are the modulus and the conjugate, respectively. If \( | q | = 1 \) and \( q \in \mathcal {H}(Vq) \), then \( q \) is called a unit pure quaternion.

Its representation in polar form is:

$$\begin{aligned} q = |q|e^{\tau \phi } = |q|(\cos \phi + \tau \sin \phi ) \end{aligned}$$
(6)

Where:

$$\begin{aligned} \phi = \tan ^{-1}( \frac{|Vq|}{Sq}) \end{aligned}$$
(7)
$$\begin{aligned} \tau = \frac{Vq}{|Vq|} \end{aligned}$$
(8)

are, respectively, the eigenangle and the eigenaxis. The sine, hyperbolic cosine and inverse hyperbolic cosine [14] are represented as follows:

$$\begin{aligned} \sin (q) = \sin (Sq)\cosh (|Vq|) + \cos (Sq)\sinh (|Vq|)\,\tau \end{aligned}$$
(9)
$$\begin{aligned} \cosh (q) = \frac{\exp (q) + \exp (q)^{-1}}{2} \end{aligned}$$
(10)
$$\begin{aligned} \mathrm {acosh}(q) = \log (q + \sqrt{q^2 - 1}) \end{aligned}$$
(11)
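As an illustration of expressions (4), (7), (8) and (10), the following minimal Python sketch computes the polar components and the hyperbolic cosine of a quaternion stored as a tuple (t, x, y, z). The function names `polar`, `qexp` and `qcosh` are hypothetical, and the closed form used for the exponential, \( e^{t}(\cos |Vq| + \tau \sin |Vq|) \), is the standard quaternion exponential rather than something stated in this paper.

```python
import math

def polar(q):
    """Return (modulus, eigenaxis, eigenangle) of q = (t, x, y, z)."""
    t, x, y, z = q
    v = math.sqrt(x * x + y * y + z * z)          # |Vq|
    modulus = math.sqrt(t * t + v * v)            # expression (4)
    # Expression (8); the axis is undefined for a real q, so zeros are returned.
    axis = (x / v, y / v, z / v) if v > 0 else (0.0, 0.0, 0.0)
    angle = math.atan2(v, t)                      # expression (7), quadrant-safe
    return modulus, axis, angle

def qexp(q):
    """Quaternion exponential: e^t (cos|Vq| + tau sin|Vq|)."""
    t, x, y, z = q
    v = math.sqrt(x * x + y * y + z * z)
    scale = math.exp(t)
    if v == 0.0:
        return (scale, 0.0, 0.0, 0.0)
    f = scale * math.sin(v) / v
    return (scale * math.cos(v), f * x, f * y, f * z)

def qcosh(q):
    """Expression (10), using the identity 1/exp(q) = exp(-q)."""
    pos = qexp(q)
    neg = qexp(tuple(-c for c in q))
    return tuple((a + b) / 2.0 for a, b in zip(pos, neg))
```

Note that for a pure quaternion expression (10) returns the real value \( \cos |Vq| \); a vector part survives only when \( Sq \ne 0 \).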

If \( \hat{q} = a + bi + cj + dk \) and \( \tilde{q} = w + xi + yj + zk \) and \( (\hat{q},\tilde{q}) \in \mathcal {H} \), then addition, subtraction and multiplication are defined as follows:

$$\begin{aligned} \hat{q} \pm \tilde{q} = \ (a \pm w) + (b \pm x )i + (c \pm y )j + (d \pm z )k \end{aligned}$$
(12)
$$\begin{aligned} \hat{q}\tilde{q} = (S\hat{q}S\tilde{q} - V\hat{q} \cdot V\tilde{q}) + (S\hat{q}V\tilde{q} + S\tilde{q}V\hat{q} + V\hat{q} \times V\tilde{q} ) \end{aligned}$$
(13)

Where \( \cdot \) is the scalar product and \( \times \) is the vector cross product.
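Expression (13) can be checked numerically with a short Python sketch. It expands the product component by component for quaternions stored as tuples (t, x, y, z); the function name `qmul` is hypothetical.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (t, x, y, z) tuples."""
    a, b, c, d = p
    w, x, y, z = q
    return (
        a * w - b * x - c * y - d * z,   # scalar part: Sp*Sq - Vp . Vq
        a * x + b * w + c * z - d * y,   # i component
        a * y - b * z + c * w + d * x,   # j component
        a * z + b * y - c * x + d * w,   # k component
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

print(qmul(i, j))  # (0, 0, 0, 1), i.e. k
print(qmul(j, i))  # (0, 0, 0, -1), i.e. -k
```

The two printed products differ in sign, which is the non-commutativity stated by the rules \( ij = k = -ji \).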

3 The Proposed Approach

The person image is represented in the domain \( q \in \mathcal {H} \) (full-quaternion), where the real part is a value that represents the clear-dark contrast effect in the image. Next, the image is transformed using the hyperbolic cosine, and the coefficients of its spectral modulus are modified using the Quaternion Fast Fourier Transform. Finally, the person image is retrieved through the inverse transform (QIFFT), the inverse hyperbolic cosine and the gamma function.

3.1 Full-Quaternion

An image has three color channels (red, green, blue) and can be represented by \( f_{(x,y)} \). Nevertheless, the information among color channels is not associated, which constrains image analysis (the algorithms work only with the intensities of each channel). Also, every color must be analyzed in relation to its neighboring colors. For example, if in a patch of size 3\(\,\times \,\)3 the neighboring pixels are gray and the central pixel is black, the black color (noise) is generally omitted and the patch is classified as a homogeneous gray region. To resolve this difference among colors, we start from expression (9). Since \( f_{(x,y)} \in \mathcal {H}(Vq) \), we have \( Sq = 0 \), so expression (9) reduces to \( \sinh (|Vq|)\tau \), which involves only the vector part. Expression (8) normalizes a quaternion, yielding a unit pure quaternion. When it is applied to all pixels, each takes the value \( | q | = 1 \) and is located on the surface of a unit sphere. It follows that \( \sinh (|Vq|) \) acts as a scale factor on \( \tau \), increasing or decreasing the size of the quaternion. We call this increase or decrease of \( \tau \) the clear-dark contrast effect (CDCE), because dark colors are close to zero and light colors, for example pink or white, are close to higher values (taking the RGB cube as reference).

The transformation \( f_{(x,y)} \longrightarrow \) \( f_{(x,y)}^H \in \mathcal {H} \) is performed as follows. A sliding window is moved over the image and, at each position, a function of \( \sinh (|Vq|) \) is evaluated for the pixels neighboring the central pixel. The real part of the full-quaternion is the average of these values over the neighborhood:

$$\begin{aligned} \xi = \left\{ \frac{1}{\eta } \sum _{\nu = 1}^{\eta } \log ( \sinh (|Vq|) + 1)_\nu \right\} + ri + gj + bk \end{aligned}$$
(14)

Where \( \xi \in f_{(x,y)}^H \), \( \eta = 8 \) is the number of neighboring pixels and \( \{ r,g,b\} \) are the color channels.
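Expression (14) can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the function names `cdce` and `full_quaternion_image` are hypothetical, the channels are assumed to be scaled to [0, 1], and border pixels are handled by clamping indices to the image, a choice the text does not specify.

```python
import math

def cdce(pixel):
    """log(sinh(|Vq|) + 1) for one RGB pixel treated as a pure quaternion."""
    r, g, b = pixel
    return math.log(math.sinh(math.sqrt(r * r + g * g + b * b)) + 1.0)

def full_quaternion_image(img):
    """Map a 2D list of (r, g, b) pixels to (t, r, g, b) full-quaternions.

    The real part t is the mean of cdce() over the 8 neighbors of each pixel.
    Assumption: indices outside the image are clamped to the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue                     # skip the central pixel
                    ny = min(max(y + dy, 0), h - 1)  # clamp row index
                    nx = min(max(x + dx, 0), w - 1)  # clamp column index
                    total += cdce(img[ny][nx])
            r, g, b = img[y][x]
            row.append((total / 8.0, r, g, b))       # eta = 8
        out.append(row)
    return out
```

For a 3\(\,\times \,\)3 gray patch with a black central pixel, the central quaternion keeps a zero vector part, but its real part takes the value computed from the gray neighbors, so the neighborhood information is carried by the pixel itself.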

3.2 Hyperbolic Cosine and Fourier Transforms

Unlike other works, we consider that modifying the coefficients of the spectral modulus of the person image yields more reliable colors (the false colors are treated as noise in the signal). The quaternion hyperbolic cosine (see expression 10) is used to integrate the real part with the vector part of the quaternion, so that a new full-quaternion is obtained before applying the QFFT [15]. The QFFT makes it possible to transform the frequency coefficients by filtering the spectral modulus, where noise present in the spatial domain can be eliminated.

$$\begin{aligned} F(p,s)= S \sum _{m = 0}^{M-1} \sum _{n = 0}^{N-1} e^{-\mu 2\pi \left( \frac{pm}{M}+\frac{sn}{N}\right) }f(m,n) \end{aligned}$$
(15)
$$\begin{aligned} f(m,n)= S \sum _{p = 0}^{M-1} \sum _{s = 0}^{N-1} e^{\mu 2\pi \left( \frac{pm}{M}+\frac{sn}{N}\right) }F(p,s) \end{aligned}$$
(16)

Where \( S = \sqrt{\frac{1}{MN}} \), expressions (15) and (16) are the direct and inverse Quaternion Fourier Transforms, respectively, \(\mu \) is a unit pure quaternion, p and s are the frequency coordinates, m and n are the spatial coordinates of the image, and M and N are the image dimensions. The filter used is a Gaussian filter, applied as a convolution on the modulus \( \varUpsilon \):

$$\begin{aligned} \zeta = \varUpsilon *\varGamma \end{aligned}$$
(17)

Where \( * \) is the convolution operator and \(\varGamma \) is the Gaussian filter. The image is returned to the spatial domain as follows:

$$\begin{aligned} f(m,n)= \exp \left( S \sum _{p = 0}^{M-1} \sum _{s = 0}^{N-1} e^{\mu 2\pi \left( \frac{pm}{M}+\frac{sn}{N}\right) } \left( (\zeta + F(p,s)) \beta \right) \right) \end{aligned}$$
(18)

Where \(\beta \) is a scale factor. Afterwards, the image obtained in the spatial domain, \( f(m,n) \), is transformed by expression (11).
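The transform pair in expressions (15) and (16) can be illustrated with a direct evaluation of the double sums in plain Python. This sketch is not a fast algorithm and is not the QFFT of [15]; it is only practical for very small images. It assumes a left-sided kernel with \( \mu = (i+j+k)/\sqrt{3} \), omits the Gaussian filtering, the scale factor \( \beta \) and the outer exponential of expression (18), and uses the hypothetical names `qmul` and `qft`.

```python
import math

MU = (1.0 / math.sqrt(3.0),) * 3   # vector part of mu = (i + j + k) / sqrt(3)

def qmul(p, q):
    """Hamilton product of two quaternions given as (t, x, y, z) tuples."""
    a, b, c, d = p
    w, x, y, z = q
    return (a * w - b * x - c * y - d * z,
            a * x + b * w + c * z - d * y,
            a * y - b * z + c * w + d * x,
            a * z + b * y - c * x + d * w)

def qft(f, sign=-1):
    """Evaluate expression (15) (sign=-1) or expression (16) (sign=+1).

    f is a 2D list (M rows, N columns) of quaternion tuples. The kernel
    exp(sign * mu * theta) = cos(theta) + sign * mu * sin(theta) multiplies
    each sample from the left.
    """
    M, N = len(f), len(f[0])
    scale = 1.0 / math.sqrt(M * N)                   # S in the text
    out = [[None] * N for _ in range(M)]
    for p in range(M):
        for s in range(N):
            acc = [0.0, 0.0, 0.0, 0.0]
            for m in range(M):
                for n in range(N):
                    theta = sign * 2.0 * math.pi * (p * m / M + s * n / N)
                    sn = math.sin(theta)
                    kernel = (math.cos(theta), MU[0] * sn, MU[1] * sn, MU[2] * sn)
                    term = qmul(kernel, f[m][n])
                    for c in range(4):
                        acc[c] += term[c]
            out[p][s] = tuple(scale * a for a in acc)
    return out
```

Applying `qft` with `sign=-1` and then with `sign=+1` recovers the input up to rounding error, because both kernels share the same axis \( \mu \) and therefore commute.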

3.3 Visualization

A problem with the QIFFT is that \(f(m,n) \in \mathcal {H} \). To view the image in \( \mathcal {H}(Vq) \), a gamma function is defined as follows:

$$\begin{aligned} f_{(x,y)} = |C_i|_{(x,y)}^{1/|t|} \end{aligned}$$
(19)

Where C is a color channel, \( i \in \{r,g,b\} \), and (x, y) are the spatial coordinates of the image. The gamma function is adaptive because each channel is affected by the real part t of the full-quaternion.
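A per-pixel reading of expression (19) can be sketched in Python as follows. The function name `adaptive_gamma` and the guard `eps` against a zero real part are assumptions made for this illustration, and the channel magnitudes are assumed to lie in [0, 1] so that large exponents do not overflow.

```python
def adaptive_gamma(q, eps=1e-6):
    """Map a full-quaternion pixel (t, r, g, b) to displayable (r, g, b).

    Each channel magnitude is raised to 1/|t|, following expression (19).
    Assumption: |t| is floored at eps to avoid division by zero.
    """
    t, r, g, b = q
    gamma = 1.0 / max(abs(t), eps)
    return tuple(abs(c) ** gamma for c in (r, g, b))

# A real part above 1 brightens mid-tones; a real part below 1 darkens them.
brighter = adaptive_gamma((2.0, 0.25, 1.0, 0.0))
darker = adaptive_gamma((0.5, 0.5, 1.0, 0.0))
```

Because the exponent depends on the real part of each pixel, two pixels with the same color but different neighborhoods can be displayed with different intensities.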

4 Experimental Results

Our experiments evaluate the performance of color correction algorithms on person images used for person re-identification. However, to the best of our knowledge, there are no datasets with ground truth for the true colors in person images (for person re-identification). Therefore, to evaluate the algorithms, we use as metric the Area Under the Curve (AUC) of the Cumulative Matching Characteristic (CMC) curve [16].
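The metric can be illustrated with a small Python sketch that builds a CMC curve from a probe-by-gallery distance matrix and summarizes it by a normalized area. Several details are assumptions, since the text does not give them: probe i is taken to match gallery item i, ties are resolved in favor of the true match, and the AUC is normalized as the mean of the CMC values over all ranks. The names `cmc_curve` and `cmc_auc` are hypothetical.

```python
def cmc_curve(dist):
    """CMC values for ranks 1..n from a square distance matrix.

    dist[i][j] is the distance between probe i and gallery item j.
    Assumption: the true match of probe i is gallery item i.
    """
    n = len(dist)
    hits = [0] * n
    for i, row in enumerate(dist):
        # 0-based rank: number of gallery items strictly closer than the match
        rank = sum(1 for d in row if d < row[i])
        hits[rank] += 1
    curve, total = [], 0
    for h in hits:
        total += h
        curve.append(total / n)      # fraction of probes matched within this rank
    return curve

def cmc_auc(curve):
    """Normalized area under the CMC curve (mean over ranks), in [0, 1]."""
    return sum(curve) / len(curve)

dist = [[0.1, 0.5, 0.9],
        [0.4, 0.3, 0.2],
        [0.7, 0.8, 0.6]]
curve = cmc_curve(dist)
auc = cmc_auc(curve)
```

In this toy example two of the three probes are matched at rank 1 and the third at rank 2, so the curve reaches 1.0 from rank 2 onward.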

The datasets used are VIPeR and GRID. VIPeR [17] contains images of the same person passing in front of two different cameras: 1264 images of people, divided into 632 images for camera A and 632 for camera B. It has strong variations in viewpoint and lighting. GRID [18, 19] contains 250 probe images (camera A) captured in one view and 250 images of the same persons (camera B). There are also 775 additional images of persons who do not appear in camera A; we discard them and work only with the 250 images. The images have strong variations in color and pose.

The parameters used in the experiments for our algorithm, called \(\mathbf {FqCC}\) (Full-quaternion Color Correction), are the following: \( \mu = (i+j+k)/\sqrt{3} \) and \( \beta = 1.5 \). The window size and standard deviation of the Gaussian filter are 3 and 0.8, respectively. The person image is segmented with a person parsing algorithm [20] to get the trunk and legs (see Fig. 1(i) and (j)). The feature vector is composed of the histograms of the red, green and blue color channels of the trunk and legs (because these contain the pixels that belong to clothing), together with the hue and contrast histograms of the whole image, using \( \sinh (|Vq|) \) for contrast and the hue channel of the HSV color space. The Bhattacharyya distance [21] is used three times for person matching, with 16, 32 and 64 bins in the feature vector histograms (see Table 1). The algorithms used in the comparison are applied in the pre-processing step of person re-identification: multiscale Retinex (MSR) as adapted in the work of Liao [12], and Shades of Grey (SofG) and Grey-World (GW) from the framework [22]. Other algorithms, namely maxRGB (Mrgb), Grey-Edge (GE) and Weighted Grey-Edge (WGE) [23], are also used, as well as the original images (OI) without transformations.
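The histogram comparison can be sketched in Python as follows. The text does not state which form of the Bhattacharyya distance is used, so this illustration assumes the form \( \sqrt{1 - BC} \), where BC is the Bhattacharyya coefficient of two normalized histograms; the classical definition \( -\ln BC \) is an alternative. The names `histogram` and `bhattacharyya_distance`, and the value range [0, 1], are also assumptions.

```python
import math

def histogram(values, bins, lo=0.0, hi=1.0):
    """Normalized histogram of values assumed to lie in [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def bhattacharyya_distance(p, q):
    """sqrt(1 - BC) for two normalized histograms with the same bin count."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))   # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))               # clamp rounding error

# Compare one channel of two image regions at the three bin counts of the text.
region_a = [0.10, 0.12, 0.50, 0.52, 0.90]
region_b = [0.11, 0.13, 0.51, 0.88, 0.91]
distances = {
    bins: bhattacharyya_distance(histogram(region_a, bins), histogram(region_b, bins))
    for bins in (16, 32, 64)
}
```

A distance of 0 means identical histograms and 1 means histograms with no overlapping bins; the dictionary holds one distance per bin count.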

Fig. 1. Results of the algorithms used for color correction (top row: VIPeR images; bottom row: GRID images): (a) OI, (b) SofG, (c) GW, (d) Mrgb, (e) GE, (f) WGE, (g) MSR, (h) \(\mathbf {FqCC}\). Image (j) shows the trunk and legs of image (i). (Color figure online)

Table 1. Experiments with the Bhattacharyya distance.

In Fig. 1 (h, top) it can be seen that our algorithm removes the noise (small black regions in the dress), replacing it with white in correspondence with the true color of the dress, and that the arm edges are smoothed, unlike with the other algorithms. Also, in Fig. 1 (h, bottom) the image contrast is improved and the yellow color (considered as noise) decreases. The results obtained are shown in Table 1, where our algorithm gets the best results, with a value of 0.7308 (16 bins) on the VIPeR dataset and 0.5876 (32 bins) on the GRID dataset. The values obtained on the GRID dataset are lower than those on the VIPeR dataset due to the high noise level of the person images in GRID. The results show that algorithms used to improve the colors in images do not always achieve their goal: SofG, GW, GE and WGE on VIPeR, and MSR, SofG, GW, Mrgb and WGE on GRID, obtained values lower than those of the untransformed images. In contrast, the algorithm proposed here obtains the best results by contributing to the elimination of false colors (edges and isolated colors). These results are also due to the transformations of the image in the frequency domain and the incorporation of the CDCE as an element of the full-quaternion.

5 Conclusions

Obtaining a value for the real part, based on the clear-dark contrast effect, allows the construction of a full-quaternion, and applying transformations in the frequency domain makes it possible to correct the color and improve the image. It was also possible to establish an adaptive gamma function to display the image without losing the information of the real part. In future work, we will design a local approach to color correction and an image quality metric that does not require ground truth.