
1 Introduction

Developed by Edwin H. Land and John J. McCann, Retinex [22, 23] is an interesting theory proposing a computational model to estimate the human color sensation, i.e. the color perception produced by the human vision system when observing a scene. More precisely, in “The Science of Color” [32], the Committee on Colorimetry of the Optical Society of America defines the human color sensation as a “mode of mental functioning that is directly associated with the stimulation of the organism”.

Retinex originated in the late 1950s from a series of experiments showing that the process of color formation performed by the human vision system strongly differs from that performed by a camera. In particular, experiments carried out on sets of colored patches, called Mondrians, showed that the color appearance as reported by humans looking at a scene does not correlate with the radiances of the observed scene. This means that the human color sensation may differ from the color computed from the quanta catches coming from the observed scene and acquired by the eye photoreceptors. Therefore, the color of an object under a given light as reported by a human may differ from the color of the same object under the same conditions as detected by a camera. This phenomenon is at the basis of human color constancy, that is, the human capability to discount a color cast due to the light illuminating the scene, so that the same object viewed under different light conditions is perceived as the same entity [11].

Land and McCann hypothesized that, when looking at a scene, the human vision system processes independently the long, medium and short wavelengths coming from the scene and acquired by the retina photoreceptors, and produces a novel scene, whose wavebands, termed the lightnesses, have color constancy [22, 23].

To understand the mechanism of the lightness formation, Land and McCann developed new experiments that showed the importance, in the color formation, of the edges and of the relative spatial relationships among the reflectances of the observed regions. They proved that color sensation and thus color vision are spatial processes, related to the local color distribution of the image. The empirical evidence they collected led to a computational model able to estimate the color sensation from an RGB stimulus, i.e. the Retinex algorithm.

The outcomes of their research were also supported by biological studies on human vision, e.g. [3, 6,7,8,9, 19], that revealed the existence of a mechanism of spatial interaction among the responses of the eye photoreceptors, taking place both in the retina and in the primary visual cortex of the brain. Inspired by some of these works, in 1963 Land named his theory Retinex, from the contraction of the words RETina and cortEX.

Since its first announcement, the Retinex theory has continuously attracted the attention of the research world. A complete understanding of how humans see colors is still an open problem. The human vision system is a complex machine, and much effort is still necessary to gain a complete knowledge of it. Retinex represents a significant step in this direction, and it still attracts the interest of researchers from different areas, such as computer scientists, biologists and psychologists. Many variants of the original Retinex algorithm of Land and McCann and of its spatial color sampling have been proposed, with the main aims of further investigating the mechanism of spatial color interaction, proposing more efficient computational algorithmic solutions, and/or solving practical problems of machine vision, such as color image enhancement, color rendition, dynamic range compression and image retrieval based on human color constancy, e.g. [4, 14, 16, 18, 25, 34,35,36, 38,39,40].

This work introduces in Sect. 2 the conceptual framework of Retinex, the main challenges it faced and solved, and the original algorithm. In addition, this paper presents some algorithms of the Milano Retinex family, a special class of Retinex-inspired implementations, mainly employed for image color enhancement (Sect. 3). Final conclusions are drawn in Sect. 4.

2 The Original Retinex Algorithm

This Section describes the experiments at the basis of the Retinex theory and the algorithm proposed by Land and McCann and its algorithmic implementation. More details are available in [22, 23, 27, 28].

2.1 The Experiments

The evidence at the basis of the Retinex theory has been collected through a series of experiments that can be classified into two groups: experiments on color patches (color Mondrians) and experiments on gray-level patches (gray Mondrians).

Fig. 1. Two Mondrians are illuminated by two lights (on top), tuned so that the green circle on left and the red circle on right have the same radiance. An observer looks at them. Despite the identical quanta catches, the observer reports green color on left and red color on right. The experiment shows that color sensation does not correlate with radiance. (Color figure online)

Experiments on Color Mondrians: the color constancy - Retinex theory was born from some experiments that Land carried out in the late 1950s at the Polaroid Corporation, of which he was a co-founder. A colleague projected on a screen a mixture of two monochromatic pictures, one through a red filter and the other one through a simple white light: in the final picture, he observed more than the white, black and reddish colors that were expected. Land explained this phenomenon by supposing the existence of a sort of color adaptation performed by the human vision system.

To better understand this phenomenon, Land prepared other tests by using panels of colored patterns, that he named Mondrians because of their similarity with the artworks of the Dutch painter Pieter Cornelis Mondriaan (1872–1944), known as Mondrian. The Mondrians were built up and used under controlled conditions, e.g. specularities of highlights were avoided.

In an experiment, he considered two identical Mondrians, positioned side by side. He attached a circular green paper on the left Mondrian and a circular red paper on the right Mondrian. Then he illuminated each Mondrian uniformly with a light source, tuned so that the radiance from the green and red circles was the same (see Fig. 1). If the human vision system worked as a standard photo-camera, then an observer should say that the green and the red circles have the same color: on the contrary, the green and the red circles still appeared green and red to the observers. Land repeated the experiments by changing the time of fixation and the color of the patches, but again the result was the same: the observers were able to detect the actual color of the patches. The color as reported by the humans did not correlate with the physical radiances.

These experiments suggested that the human vision system performs a sort of color adaptation that discounts color casts due to the illumination. This mechanism is at the basis of color constancy, that is, the human capability to recover the reflectance of an observed object in spite of the color of the light.

Land arrived at the following, first, important conclusion: when the human vision system observes a scene, the long, medium and short wavebands are processed by the human vision system in order to be color constant. Land hypothesized that, for any observed scene, the human vision system produces a new image, whose long, medium and short wavebands, named lightnesses, are computed independently from the long, medium and short wavebands of the observed scene, acquired by the retina photoreceptors, i.e. by cones and rods.

In line with biological studies revealing that human vision takes place in the retina and in the primary visual cortex of the brain [3, 6,7,8,9, 19], Land named his theory Retinex from the words RETina and cortEX.

Experiments on Gray-Level Mondrians: edge importance and spatial issues - The experiments of Land proceeded with the help of other colleagues of the Polaroid Corporation and in particular of John J. McCann, who joined the Vision Research Laboratory of the company in 1961.

To understand the mechanism of the lightness computation, Land and McCann took into account some visual phenomena suggesting a spatial character of the vision, i.e. simultaneous contrast and edge importance in color vision.

The simultaneous contrast is a phenomenon studied by the French chemist Michel Eugène Chevreul in the 19th century and illustrated in Fig. 2: the same gray square is positioned at the center of two squares with different colors. The square appears darker when it is shown on the left background.

This observation leads to the following, second result: the color sensation at a point depends on the surrounding colors, thus in the lightness the color at a point is modified relative to the colors in its surround.

Fig. 2. Simultaneous contrast: the gray square shown on different backgrounds appears differently colored. This phenomenon suggests the existence of a spatial interaction among the colors.

Expressing a quantity relative to another quantity implies a comparison process, that can be accomplished through the computation of the ratio between these quantities. Therefore, Land and McCann arrived at the conclusion that the lightness depends on ratios between the reflectances of nearby areas, as suggested by the phenomenon illustrated in Fig. 3: the adjacent squares on left appear differently colored, but this does not happen when the central edge is occluded (right). The importance of the edges in color sensation was deeply investigated by a series of experiments on a gray-level Mondrian illuminated by a smooth gradient. Again, observers were asked to report their sensation on the different Mondrian patches, leading to the following outcomes: (1) the ratio between points located across an edge correlates with the appearance; (2) slight edges are irrelevant to color sensation. Finally, the analysis of edges implies a local image processing: the color sensation does not depend on global properties, like for instance the color distribution over the whole image represented by histograms [29].

Another spatial issue had to be considered in color sensation: according to the study in [5], the color sensation at a point is influenced more by the colors of the regions closer to that point than by those of regions located far away.

To sum up, the Retinex theory states that the human color sensation is a complex process, that involves a local, spatial comparison among different areas of any observed scene.

Fig. 3. The picture on left appears composed of two rectangles with two slightly different gray intensities, displayed on a dark background. When the central edge is occluded by a black rectangle, the gray levels of the two rectangles appear equal to each other.

2.2 The Algorithm

The different outcomes of the experiments led to an algorithm for the prediction of the color sensation. The general workflow of this algorithm applied to an RGB image consists of three main steps: (1) pre-calibration; (2) color filtering; (3) post-calibration.

The pre-calibration step matches the digital values of the device used for image acquisition with the actual luminance of the observed scene. The pre-calibrated image undergoes the spatial comparison which is at the basis of the lightness computation: this phase, that we call color filtering, is the core of the Retinex algorithm. The output image is the input of the post-calibration step, which remaps the digits into a scale of appearance. Pre- and post-calibrations are fundamental steps for modeling the human color vision [27, 30].

According to the experiments carried out on color and gray Mondrians, the Retinex algorithm proposed by Land and McCann processes any input image channel by channel as follows. For each channel, a set of paths randomly chosen over the image is used to explore and compare the image intensities of different regions. Given a path connecting two image regions, the algorithm computes the lightness by the so-called chain ratio, i.e. the product of the ratios between the intensity values of adjacent pixels. Ratios are a way to measure the image gradient, and thus to detect the edges that, according to the experiments described before, play an important role in color sensation. The multiplication of the ratios makes it possible to spatially relate the color information among distant regions without losing the information along the bridges system [27]. When the ratio product along a path exceeds the value 1.0, a reset mechanism is implemented: the cumulative product is set to 1.0 and the ratio chain restarts from this value. Reset is a fundamental operation: it implements a normalization process that is performed by our vision system and that makes it possible to express the color we perceive at a point relative to the others, i.e. as a percentage of a local white detected in its surround (see Chaps. 21 and 33 in [27]). When a ratio is close to one, its contribution is cast to 1.0: this operation reproduces the experimental evidence that slight edges do not contribute to the color sensation. An example of the ratio-product-reset procedure is given in Fig. 4.

The original Retinex algorithm is iterative, i.e. the paths are computed one after the other, and it is destructive, i.e. the digital values of the input image are overwritten with the values output by the ratio-reset procedure computed along the path. Many paths are computed over the image to guarantee an accurate exploration of the color distribution around each region and to reduce the chromatic noise due to the random path sampling. Finally, the lightness at a point is computed by averaging the partial results over the total number of paths.

The whole mechanism implemented by Retinex is called ratio-product-reset-average from its main steps.

Fig. 4. Example of the ratio-reset mechanism of the Retinex algorithm. See text for more explanation.

The work in [33] provides the equation of the ratio-product-reset-average mechanism for the computation of the color filtering. Let I be a color channel of a pre-calibrated RGB image and let x be an image pixel. Hereafter, the intensity levels of I are supposed to be normalized in order to range over (0, 1]. The neighborhood of x is explored by a set of n paths \(\gamma _1, \ldots , \gamma _n\), each of them ending at x and starting from a pixel \(y_k\) (k = 1, ..., n) randomly selected over the image. Each path \(\gamma _k \in \{\gamma _1, \ldots , \gamma _n \}\) is modeled as a function defined on a set of natural numbers \(\{1, \ldots , l_k \}\) such that \(\gamma _k(1) := y_k\) and \(\gamma _k(l_k) := x\), while \(\gamma _k(t_k) = x_{t_k}\) and \(\gamma _k(t_k + 1) = x_{t_k + 1}\) are subsequent pixels over \(\gamma _k\) (\(t_k = 1, \ldots , l_k - 1\)). The parameter \(l_k\) denotes the length of the path \(\gamma _k\).

The lightness at x (before post-calibration) is given by

$$\begin{aligned} L(x) = \frac{1}{n}\sum _{k=1}^{n} \prod _{t_k = 1}^{l_k - 1}{\delta _k(R_{t_k})} := \frac{1}{n}\sum _{k=1}^{n} \prod _{t_k = 1}^{l_k - 1}{\delta _k\Big (\frac{I(x_{t_k+1})}{I(x_{t_k})}\Big )} \end{aligned}$$
(1)

Here \(R_{t_k}\) is the ratio of the intensities of two adjacent pixels on \(\gamma _k\), i.e. \(R_{t_k} = \frac{I(\gamma _k(t_k + 1))}{I(\gamma _k(t_k))}\), and \(\delta _k: \mathbf {R}^+ \rightarrow \mathbf {R}^+\) is the function such that

$$\begin{aligned} \delta _k(R_{t_k}) = \left\{ \begin{array}{ll} R_{t_k} &{} \text {if } 0< R_{t_k} \le 1 - \varepsilon \\ 1 &{} \text {if } 1 - \varepsilon< R_{t_k} \le 1 + \varepsilon \\ R_{t_k} &{} \text {if } 1 + \varepsilon < R_{t_k} \le \frac{1 + \varepsilon }{\prod _{m_k = 1}^{t_k - 1}{\delta _k(R_{m_k})}} \\ \frac{1}{\prod _{m_k = 1}^{t_k - 1}{\delta _k(R_{m_k})}} &{} \text {if } R_{t_k} > \frac{1 + \varepsilon }{\prod _{m_k = 1}^{t_k - 1}{\delta _k(R_{m_k})}} \end{array} \right. \end{aligned}$$
(2)

The threshold \(\varepsilon \) is a positive parameter ranging over [0, 1] and introduced to model the insensitivity of the color sensation to slight gradients.
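The ratio-product-reset-average mechanism of Eqs. (1) and (2) can be illustrated by a minimal Python sketch. It is not the implementation of [22, 23] or [33]: the function names are hypothetical, each path is assumed to be given simply as the list of its intensities ordered from the starting pixel to the target, and the cumulative product is assumed to start from 1.0.

```python
def delta(ratio, cumulative, eps):
    """Thresholded ratio of Eq. (2), given the cumulative product so far."""
    if ratio <= 1.0 - eps:
        return ratio                 # clear decrease: keep the ratio
    if ratio <= 1.0 + eps:
        return 1.0                   # slight edge: its contribution is cast to 1.0
    if ratio <= (1.0 + eps) / cumulative:
        return ratio                 # increase that stays below the reset level
    return 1.0 / cumulative          # reset: the cumulative product becomes 1.0


def path_product(intensities, eps=0.0):
    """Ratio-product-reset along one path (assumed input format: intensities
    in (0, 1], listed from the starting pixel to the target)."""
    cumulative = 1.0
    for current, following in zip(intensities, intensities[1:]):
        cumulative *= delta(following / current, cumulative, eps)
    return cumulative


def lightness(paths, eps=0.0):
    """Eq. (1): average of the path products over all paths ending at the target."""
    return sum(path_product(p, eps) for p in paths) / len(paths)
```

For instance, with the two toy paths `[[0.5, 1.0, 0.25], [0.2, 0.5, 0.25]]` and `eps=0.0`, both paths are reset when they meet their brightest pixel, the path products are 0.25 and 0.5, and `lightness` returns 0.375.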

The path-based approach proposed in the pioneering works [22, 23] is not the only possible spatial color sampling for estimating color sensation. In 1986, Land presented an alternative version of the Retinex algorithm, where the path-based color sampling is replaced by a sort of high-pass filter [20, 21]. Precisely, in this work, Land computed the value L(x) as the ratio between the intensity value at x and the average value of a set of pixels located in a surround of x, having density decreasing with the Euclidean distance from x, i.e.

$$\begin{aligned} L(x) = \frac{I(x)}{(I * G_\sigma )(x)} \end{aligned}$$
(3)

where \(G_\sigma \) is a convolution kernel, usually a Gaussian one, e.g. \(G_\sigma (x) = \frac{1}{2\pi \sigma ^2} e^{-\frac{\parallel x \parallel ^2}{2\sigma ^2}}\), \(\forall x \in \mathbf {R^2}\), and \(\sigma \) is a real, strictly positive number. This Retinex version attracted the interest of many researchers, who investigated the properties and the mathematical form of this implementation [17]. This algorithm, also called Single-Scale Retinex, has then been extended to a multi-scale version, termed Multi-Scale Retinex. The latter has been shown to perform better than the Single-Scale version in many applications, such as dynamic range compression, color rendition and contrast enhancement in medical imaging, e.g. [13, 15, 16, 35, 37, 41]. The Multi-Scale Retinex modifies Eq. (3) by computing a weighted average of many Single-Scale Retinex outputs, generally in a logarithmic space:

$$\begin{aligned} \log L(x) = \frac{1}{n}\sum _{i = 1}^n w_i (\log I(x) - \log ((I*G_{\sigma _i})(x))) \end{aligned}$$
(4)

where the \(w_i\)s are parameters weighting the different single-scale Retinex outputs, and \(n > 1\).
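Equations (3) and (4) can be sketched in a few lines of pure Python on a small image stored as a list of rows. This is only an illustration under stated assumptions: the function names are hypothetical, the image is assumed to hold strictly positive intensities, and the Gaussian weights are renormalized over the image support, which is one possible way (chosen here for brevity) to handle the borders.

```python
import math


def gaussian_surround(image, row, col, sigma):
    """Gaussian-weighted average of the image around (row, col).
    Weights are renormalized over the image support (sketch assumption)."""
    total, weight_sum = 0.0, 0.0
    for r, line in enumerate(image):
        for c, value in enumerate(line):
            w = math.exp(-((r - row) ** 2 + (c - col) ** 2) / (2.0 * sigma ** 2))
            total += w * value
            weight_sum += w
    return total / weight_sum


def single_scale_retinex(image, sigma):
    """Eq. (3): ratio between each intensity and its Gaussian surround."""
    return [[image[r][c] / gaussian_surround(image, r, c, sigma)
             for c in range(len(image[0]))]
            for r in range(len(image))]


def multi_scale_log_retinex(image, sigmas, weights):
    """Eq. (4): (1/n) * sum_i w_i * (log I - log(I * G_sigma_i))."""
    n = len(sigmas)
    out = [[0.0] * len(image[0]) for _ in image]
    for sigma, w in zip(sigmas, weights):
        for r in range(len(image)):
            for c in range(len(image[0])):
                surround = gaussian_surround(image, r, c, sigma)
                out[r][c] += w * (math.log(image[r][c]) - math.log(surround)) / n
    return out
```

On a uniform image the surround equals the pixel value everywhere, so Eq. (3) returns 1 and Eq. (4) returns 0 at every pixel; a pixel brighter than its surround gets a value above 1 in Eq. (3).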

Many different spatial color sampling schemes inspired by the Retinex principles have been (and still are) proposed in the literature to solve many different computer vision problems, such as those listed above. The next Section presents some algorithms of the so-called Milano Retinex family [33], a special class of Retinex-inspired color filtering implementations mainly used for color enhancement. This family is of interest because its members perform a color filtering based on an approximated version of Eq. (2) and exploit different spatial exploration schemes, including path-based, 2D, and probabilistic spatial color sampling.

3 Alternative Spatial Color Sampling: Examples from Milano Retinex Family

The Milano Retinex algorithms differ from the original Retinex in three main points. First, they propose alternative ways for the spatial exploration of the image. Second, the computational color filtering procedure is not destructive. Third, they compute the lightness L by an approximation of Eq. (2), obtained by setting \(\varepsilon = 0\). This choice is justified by both mathematical and empirical arguments, showing that the threshold mechanism is in general unessential [33]. When \(\varepsilon = 0\), Eq. (2) becomes simpler: for any color channel I, the lightness at a pixel x (named target) is given by:

$$\begin{aligned} L(x) = \frac{1}{n}\sum _{i=1}^n \frac{I(x)}{I(m_i)} \end{aligned}$$
(5)

where \(m_i\) is a pixel with maximum intensity over the path \(\gamma _i\), i.e.

$$\begin{aligned} I(m_i) = \max \{I(y): y \in \gamma _i(\{1, \ldots , l_i\}) \}. \end{aligned}$$
(6)

Equation (5) expresses the lightness at each point as the average of the ratios between the intensity at x and a local maximum, that becomes the local white reference (see Chap. 33 in [27]).
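The step from Eq. (2) with \(\varepsilon = 0\) to Eq. (5) can be checked by a short induction along a single path. The notation used here (\(P_t\) for the cumulative product after t ratios, \(M_t\) for the running maximum) is introduced only for this illustration and is not taken from [33]; the path pixels \(x_1, \ldots , x_l\) are assumed to be listed from the starting pixel to the target \(x = x_l\).

$$\begin{aligned} P_0&= 1, \qquad P_t = P_{t-1}\, \delta (R_t) = \min \{1,\; P_{t-1} R_t\}, \qquad R_t = \frac{I(x_{t+1})}{I(x_t)}, \\ M_t&= \max \{I(x_1), \ldots , I(x_t)\}, \\ P_{t-1}&= \frac{I(x_t)}{M_t} \;\Longrightarrow \; P_t = \min \Big \{1,\; \frac{I(x_{t+1})}{M_t}\Big \} = \frac{I(x_{t+1})}{M_{t+1}}, \\ P_{l-1}&= \frac{I(x_l)}{M_l} = \frac{I(x)}{I(m)}. \end{aligned}$$

With \(\varepsilon = 0\), Eq. (2) keeps the ratio as long as the cumulative product does not exceed 1 and resets the product to 1 otherwise, which gives the min form in the first line. The induction starts from \(P_0 = I(x_1)/M_1 = 1\), and averaging \(P_{l-1}\) over the n paths yields Eq. (5).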

The color filtering algorithms of the Milano Retinex class differ from each other in the way the spatial analysis is performed. The different spatial color sampling procedures of this family can be categorized into path-based, 2D, and probabilistic approaches.

Fig. 5. Examples of spatial color sampling of an input image (left) performed by the Milano Retinex approaches ETR (middle) and RSR (right) on the red channel of the input. These samples are around the barycenter of the image support (indicated respectively by a red, filled circle in the middle picture, and by a blue, empty circle in the right picture). (Color figure online)

Examples of Path-Based Milano Retinex Spatial Color Sampling. The works in [26, 31] are the pioneer Milano Retinex approaches. The spatial sampling is performed in the first one by lines, and in the second one by Brownian paths.

Fig. 6. Examples of color enhancement performed by some algorithms of the Milano Retinex family. (A) Input image and outputs of (B) TR, (C) ETR, (D) RSR, (E) QBRIX (distance-weighted version).

Image-aware paths have been recently introduced by the methods Termite Retinex (TR) [39], Energy-driven Termite Retinex (ETR) [24] and its light version Light-ETR [40]. These approaches are of interest because they explicitly model the importance of the edges in color sensation. In fact, in these methods, the paths are not randomly selected over the image nor constrained a priori by geometric features (e.g. they are not lines), but they are built so as to adhere as much as possible to the edges of each image color channel. Specifically, the paths are thought of as the traces of termites (i.e. white ants), each of them exiting one after the other from the nest (i.e. the target) in search of a local white reference (i.e. a pixel with the maximum intensity over the path).

In TR, each path is determined by a sort of contrast follower, that starts from the target and proceeds pixel by pixel by maximizing a function f proportional to the contrast value and to the squared Euclidean distance between adjacent pixels. The distance term was introduced to spread the termite swarm across the image in order to take into account the color spatial distribution accurately. A penalty term is introduced to avoid the over-exploration of the image, so that f decreases over pixels already traveled by a termite.

ETR inherits from TR the general, swarm-inspired exploration scheme, but it computes each termite route as the local minimum of an energy functional, designed to favor the visit of pixels having high gradient magnitude, with the preference for pixels close to the target and never traveled before. Differently from the Brownian path scanning and from TR, ETR provides a global mathematical condition for describing the paths. The computational issues of ETR have been analyzed in [40], which presents an approximated, computationally more efficient version of ETR. Figure 5 (middle) shows an example of ETR spatial exploration: flat regions are explored less than the others. TR exhibits a similar behaviour, but its paths become random over image areas with null values of f. In this respect, ETR provides a more deterministic procedure to compute a path connecting two pixels.

The number of the paths and the penalty value are user inputs. TR also requires a value for the maximum length of the path.

Examples of 2D Milano Retinex Spatial Color Sampling. Random Spray Retinex (RSR) [1] replaces the path-based exploration scheme with a 2D sampling, leading to a faster computation of the lightness. This new scheme has been introduced to solve some problems of the path-based sampling, mainly related to the redundancy of the information collected by random paths and to the chromatic noise due to their randomness. For each chromatic channel, RSR scans the neighborhood of each target x by a 2D set of pixels (a spray) randomly selected around x from a radial distribution, in line with the fact that the colors of the pixels closest to x influence its color sensation more than those of the pixels located far away. The lightness at x is computed as the ratio between the intensity I(x) and the maximum intensity over the spray. The chromatic noise due to the random samples is reduced by generating many sprays. The final lightness is obtained as the average value of the lightnesses computed over each spray. The number of sprays and the number of samples per spray are user inputs.

Many different versions of RSR have been published. In particular the works in [1, 2] propose computationally more efficient implementations of RSR. An example of RSR sampling is reported in Fig. 5 (right).
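The spray-based computation described above can be sketched as follows. This is a simplified illustration and not the code of [1]: the function names are hypothetical, the spray is assumed to be drawn with uniform angle and uniform radius (so that the sample density decreases with the distance from the target), samples falling outside the image are simply discarded, and the target is added to each spray so that the ratio never exceeds 1.

```python
import math
import random


def random_spray(center, n_samples, radius, width, height, rng):
    """Draw n_samples pixels around center with uniform angle and uniform radius,
    so that the sample density decreases with the distance from the center."""
    cx, cy = center
    spray = []
    while len(spray) < n_samples:
        rho = radius * rng.random()
        theta = 2.0 * math.pi * rng.random()
        px = int(round(cx + rho * math.cos(theta)))
        py = int(round(cy + rho * math.sin(theta)))
        if 0 <= px < width and 0 <= py < height:   # discard samples outside the image
            spray.append((px, py))
    return spray


def rsr_lightness(channel, target, n_sprays, n_samples, radius, seed=0):
    """Average over the sprays of I(x) / (maximum intensity over the spray).
    channel is a list of rows with intensities in (0, 1]; target is (x, y)."""
    rng = random.Random(seed)
    height, width = len(channel), len(channel[0])
    tx, ty = target
    total = 0.0
    for _ in range(n_sprays):
        spray = random_spray(target, n_samples, radius, width, height, rng)
        white = max(channel[py][px] for px, py in spray + [target])
        total += channel[ty][tx] / white
    return total / n_sprays
```

Increasing `n_sprays` reduces the chromatic noise of the estimate at the price of a longer computation, while `n_samples` and `radius` control how densely and how far the neighborhood of the target is explored.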

Examples of Probabilistic Milano Retinex Spatial Color Sampling. The works QBRIX [12] and RSR-P [10] present respectively a probabilistic approximation and an exact formulation of RSR. These last two methods avoid the random sampling and thus output an image free of chromatic noise.

QBRIX (from Quantile-Based approach to RetIneX) relies on the fact that the color sensation at any image pixel is poorly influenced by (1) colors rarely occurring in the image and (2) colors of pixels located far from it. Issues (1) and (2) lead to two different implementations of QBRIX. Both of them compute the lightness at any pixel of any color channel as the intensity level of a quantile, that the user selects on the probability density function (pdf) of the channel intensities. The first implementation does not use any information about the spatial arrangement of the color, while the second one accounts for this information by weighting the contributions of the channel intensities to the pdf through a function of the distance of the pixels from the target. Thus, this “spatially weighted intensity pdf” must be re-computed for any pixel.

RSR-P (where P stands for Population) is an exact mapping of RSR into a population-based approach, that completely avoids the random sampling and the related chromatic noise. It is based on the estimation of the probability to sample, around each target, n pixels with intensity higher than that of the target and radially distributed around the target. In this framework, RSR turns out to be an approximated version of RSR-P.

Figure 6 shows some examples of color enhancement provided by some algorithms of this family. In particular, the results obtained by RSR-P and by the first implementation of QBRIX are omitted respectively for the high similarity with RSR and for the distance-free color processing that is not in line with the spatial principles of Retinex.

As visible from this figure, all these algorithms produce a new, enhanced image, where the details are more visible and the mean image brightness is higher. The path-based approaches perform similarly, and provide a better contrast enhancement in the dark areas with respect to the 2D and probabilistic methods.

The differences between the path-based methods and the others are mainly due to the different ways to spatially explore each target neighborhood. These examples point out the importance of the spatial exploration scheme. The choice of one method or another depends on the application, which can be to improve the global or local image visibility, to remove a color cast due to the illuminant, or simply to make a picture more pleasant.

4 Conclusions

Despite being developed many years ago, Retinex is still an attractive research field, as proved by the wide range of recent conferences and publications on it. The Retinex theory nurtured the first mathematical model of the human color sensation and is nowadays inspiring new advanced efforts both in biology and computer vision.