An analysis of single image defogging methods using a color ellipsoid framework
The goal of this article is to explain how several single image defogging methods work using a color ellipsoid framework. The foundation of the framework is the atmospheric dichromatic model, which is analogous to the reflectance dichromatic model. A key step in single image defogging is estimating relative depth; therefore, properties of the color ellipsoids are tied to depth cues within an image. The framework is then extended with a Gaussian mixture model to account for multiple mixtures, which gives intuition for more complex observation windows such as those at depth discontinuities, a common problem in single image defogging. A few single image defogging methods are analyzed within this framework and, surprisingly, tied together by a common approach: the use of a dark prior. A new single image defogging method based on the color ellipsoid framework is introduced and compared to existing methods.
Keywords: Gaussian mixture model; Median operator; Sample window; Mixture weight; Depth discontinuity
The phrase single image defogging is used to describe any method that removes atmospheric scattering (e.g., fog) from a single image. In general, the act of removing fog from an image increases the contrast. Thus, single image defogging is a special subset of contrast restoration techniques.
In this article, we refer to fog as the homogeneous scattering medium made up of molecules large enough to equally scatter all wavelengths as described in . Thus, the fog we are referring to is evenly distributed and colorless.
The process of removing fog from an image (defogging) requires knowledge of the physical characteristics of the scene. One of these characteristics is the depth of the scene, measured from the camera sensor to the objects in the scene. If the scene depth is known, then the problem of removing fog becomes much easier. Ideally, given a single image, two images are obtained: a scene depth image and a contrast restored image.
The essential problem that must be solved in most single image defogging methods is scene depth estimation. This is equivalent to converting a two-dimensional image to a three-dimensional image with only one image as the input. The approach to estimating the scene depth for the purpose of defogging is not trivial and requires prior knowledge such as depth cues from fog or atmospheric scattering.
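The depth cue provided by scattering is expressed through the transmission,
$$t_i(\lambda) = e^{-\beta_i(\lambda)\, d_i}, \qquad (1)$$
where, at pixel location i, the transmission $t_i$ is a function of the scattering $\beta_i(\lambda)$ and the distance $d_i$. The term λ is the specific wavelength.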
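As a minimal sketch of the depth cue, the following Python functions evaluate the exponential decay of transmission with distance and invert it. The function names are hypothetical, and the scattering coefficient is assumed to be a single wavelength-independent constant. Because that coefficient is generally unknown, the inversion only yields depth up to a scale factor, which is why the text speaks of relative depth.

```python
import math


def transmission(beta: float, depth: float) -> float:
    """Return t = exp(-beta * depth) for a homogeneous scattering medium."""
    if beta < 0 or depth < 0:
        raise ValueError("beta and depth must be non-negative")
    return math.exp(-beta * depth)


def relative_depth(t: float) -> float:
    """Return beta * d = -ln(t): depth known only up to the unknown scale beta."""
    if not 0.0 < t <= 1.0:
        raise ValueError("transmission must lie in (0, 1]")
    return -math.log(t)
```

For example, `transmission(0.01, 50.0)` is smaller than `transmission(0.01, 10.0)`, so farther objects contribute less direct light and more airlight.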
Even though depth from scattering is a well-known phenomenon, single image defogging is relatively new, and a growing number of methods exist. The first methods trying to achieve single image defogging were presented by Tan  and Fattal . Both authors introduced unique methods that remove fog from a single image by inferring the transmission image or map. Soon afterwards, another unique method called the dark channel prior (DCP) by He et al.  supported the ability to infer a raw estimate of t using a single image with fog present. The DCP method has also influenced many more single image defogging methods (see [10, 11, 12, 13, 14, 15, 16]). Within the same time frame, Tarel and Hautière  introduced a fast single image defogging method that also estimates the transmission map.
The first key message in this article is that these methods estimate the transmission with a common prototype,
$$\hat{t} = 1 - w\,\theta, \qquad (2)$$
where w is a scaling term, and θ is a ‘dark prior’. The DCP method by He et al.  was the first to explicitly use (2); however, we demonstrate that this is the prototype used also by other methods regardless of their approach. We find that the dark prior is dependent on properties from the proposed color ellipsoid framework. The following single image defogging methods are analyzed within the framework: Fattal , He et al. , Tarel and Hautière , and Gibson et al. .
The second key message in this article is that a new single image defogging method is proposed. This method is developed using a lemma from the color ellipsoid framework and also estimates the transmission with the same prototype in (2).
There are eight sections in this article including this section. Section 2 presents a detailed description of the atmospheric dichromatic model. Section 3 introduces the color ellipsoid framework. The framework is analyzed when fog is present in Section 4, and our new defogging method is introduced in Section 5. We then unify four different single image defogging methods using the color ellipsoid model in Section 6. The discussion and conclusion are provided in Sections 7 and 8, respectively.
2 Atmospheric dichromatic model
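The atmospheric dichromatic model,
$$\tilde{x}_i(\lambda) = t_i(\lambda)\, x_i(\lambda) + \bigl(1 - t_i(\lambda)\bigr)\, a(\lambda), \qquad (3)$$
is commonly used in single image defogging methods for characterizing the intensity of a foggy pixel, written here as $\tilde{x}_i(\lambda)$.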
In comparison to the dichromatic reflectance model , the diffuse and specular surface reflections are analogous to the direct transmission, t i (λ)x i (λ), and atmospheric veiling, (1-t i (λ))a(λ), respectively. The atmospheric scattering causes the apparent radiance to have two chromatic artifacts caused by particles in the air that both attenuate direct transmission and add light induced by a diffuse light source.
For obtaining a defogged image, the goal is to estimate the p-channel color image using the dichromatic model (3). For most cases, p = 3 for color images. However, the problem with (3) is that it is under-constrained with one equation and four unknowns for each color channel. Note that there are two unknowns contained within the transmission, t(λ), in (1).
The first unknown is the desired defogged image x. The second unknown variable is the airlight color, a(λ). This is the color and intensity observed from a target when the distance is infinite. A good example is the color of the horizon on a foggy or hazy day.
The third and fourth unknowns are from the transmission introduced in (1). The transmission, t i (λ), is the exponentially decaying function based on scattering, β i (λ), and distance d i .
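Because the fog considered in this article is colorless, the scattering is taken to be independent of wavelength, so that the transmission is the same for every color channel and the model becomes
$$\tilde{x}_i = t_i\, x_i + (1 - t_i)\, a, \qquad (4)$$
bringing the unknown count down to a total of two for gray scale or four for red-green-blue (RGB) color, excluding the estimation of x. The transmission t is the first unknown and airlight a is the second unknown for gray scale. For color (p = 3), transmission t is one unknown and airlight a has three unknowns.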
The single image defogging problem is composed of two estimations using only the input image: the first is to estimate the airlight a, and the second is to estimate the transmission t.
There exist several methods for estimating a [7, 9, 18]. In this article, we will assume that the airlight has been estimated accurately in order to focus the analysis on how transmission is estimated (with possible need for refinement). Therefore, the key problem in single image defogging is estimating transmission given a foggy image.
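To make the roles of the two estimated quantities concrete, the sketch below synthesizes a foggy pixel from a fog-free pixel, an airlight color, and a single transmission value shared by all channels. It assumes the wavelength-independent form of the dichromatic model described in this section; the function name `add_fog` is hypothetical.

```python
def add_fog(x, a, t):
    """Return the foggy pixel t*x + (1 - t)*a, applied per color channel.

    x: fog-free pixel, a: airlight color (same length as x),
    t: scalar transmission in [0, 1] shared by all channels (assumption).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("transmission must lie in [0, 1]")
    if len(x) != len(a):
        raise ValueError("pixel and airlight must have the same number of channels")
    return [t * xc + (1.0 - t) * ac for xc, ac in zip(x, a)]
```

With `t = 1` the pixel is unchanged, and with `t = 0` it equals the airlight. The difference between any two pixels at the same depth is scaled by t, which is the loss of contrast that defogging tries to undo.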
3 Color ellipsoid framework without fog
The general color ellipsoid model and its application to single image defogging was introduced by Gibson and Nguyen in  and . This work will be reproduced here to facilitate the development of additional properties of the model in this article.
The motivation for approximating a color cluster with an ellipsoid is attributed to the color line model in  which is heavily dependent on the work from . The color line model exploits the complex structure of RGB histograms in natural images. This line is actually an approximation of an elongated cluster where Omer and Werman  model the cluster with a skeleton and a 2D Gaussian neighborhood. Likewise, truncated cylinders are used in .
We continue the thought presented by Omer and Werman  that subsets of these clusters are ellipsoidal in shape. We accomplish this by instead generating an RGB histogram using color pixels sampled from a small window within the image.
with the eigenvalues sorted in decreasing order.
parameterized by the sample mean μ i and sample covariance Σ i . We will drop the parameters for clarity so that .
It is common to assume that the distribution of the color values sampled within Ω i is normally distributed or can be modeled with an ellipsoid. The distribution for the tristimulus values of color textures was assumed to be normally distributed by Tan . Even though Devaux et al.  do not state that the sample points are normally distributed, they model the color textures with a three-dimensional ellipsoid using the Karhunen-Loeve transform. Kuo and Chang  sample the entire image and characterize the distribution as a mixture of Gaussians with K clusters.
In Figure 1c, we approximated color ellipsoids to each cluster using principal component analysis, where the sample mean and sample covariances were used. In Figure 1b,c, the upper cluster is from the road and the lower cluster is from the tree trunk. Approximating the RGB clusters with an ellipsoidal shape does well in characterizing the three-dimensional density of the cluster of points.
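The ellipsoid parameters for one sample window can be sketched as follows. This is an illustration only: it assumes the ellipsoid is the set of points within unit Mahalanobis distance of the sample mean, so that its volume is (4/3)π times the square root of the determinant of the sample covariance. All function names are hypothetical.

```python
import math


def window_mean_cov(pixels):
    """Sample mean (3-vector) and 3x3 sample covariance of RGB pixels in a window."""
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    cov = [[sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in pixels) / (n - 1)
            for c in range(3)] for r in range(3)]
    return mean, cov


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def ellipsoid_volume(cov):
    """Volume of the unit-Mahalanobis ellipsoid: (4/3) * pi * sqrt(det(cov))."""
    return 4.0 / 3.0 * math.pi * math.sqrt(max(det3(cov), 0.0))
```

A window whose pixels all lie on a line in RGB space (for example, pure shading of one surface) gives a zero determinant, that is, a degenerate ellipsoid with zero volume.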
4 Color ellipsoid framework with fog
4.1 General properties
We derive in this section the constraints for color ellipsoids when fog is present. We first simplify the derivation by assuming that the surface of the radiant object within the sample window is flat with respect to the observation angle so that the transmission t i is the same within Ω i (t i = t).
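With a constant transmission, the sample mean and sample covariance of the foggy window follow directly from the dichromatic model:
$$\tilde{\mu}_i = t\,\mu_i + (1 - t)\,a, \qquad (10)$$
$$\tilde{\Sigma}_i = t^2\,\Sigma_i. \qquad (11)$$
Note that the transmission is the same within the patch because the depth is assumed to be flat.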
The RGB histogram of the surface and a foggy version of the surface should exhibit two main differences. The first is that the RGB cluster will translate along the convex set between μ i and a according to (10). Second, with 0≤t i ≤ 1, the size of the cluster will become smaller when fog is present according to (11). In this article, we present the following new lemmas.
Let β > 0 since the scene is viewed within the fog. Then, $t = e^{-\beta d} = 1$ holds if and only if d = 0. However, in real-world images, the distance to the camera is never zero (d > 0); therefore, 0 ≤ t < 1. □
If the parameters μ and are formed according to (10), and , then the centroid of is closer to the origin than the centroid of .
The volume of the color ellipsoid is larger than that of the foggy color ellipsoid.
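The two effects of fog on a cluster, the translation in (10) and the shrinkage in (11), can be checked numerically on one color channel. The sketch below is illustrative: the function names and sample values are hypothetical, and the transmission is assumed constant within the window. Scaling every sample by t scales the variance by t squared, so for three channels the ellipsoid volume scales by t cubed.

```python
from statistics import mean, variance


def fog_channel(values, a_c, t):
    """Apply t*v + (1 - t)*a_c to every sample of one color channel."""
    return [t * v + (1.0 - t) * a_c for v in values]


def ellipsoid_volume_ratio(t, channels=3):
    """Foggy-to-clear volume ratio: sqrt(det(t^2 * cov)) / sqrt(det(cov)) = t^channels."""
    return t ** channels


clear = [0.20, 0.25, 0.30, 0.35]           # hypothetical fog-free samples
foggy = fog_channel(clear, a_c=0.9, t=0.5)
shift_ok = abs(mean(foggy) - (0.5 * mean(clear) + 0.5 * 0.9)) < 1e-12
shrink_ok = abs(variance(foggy) - 0.25 * variance(clear)) < 1e-12
```

In this example the airlight is brighter than the surface, so the foggy mean (0.5875) lies farther from the origin than the clear mean (0.275), while the spread of the cluster is reduced.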
4.2 Color ellipsoid model with depth discontinuity
We have assumed in the previous section that the transmission within a sample window is constant. However, this is not always true. For example, the sample window may be centered on a depth discontinuity (e.g., edge of a building).
If depth discontinuities are not accounted for in transmission estimation, then undesired artifacts will be present in the contrast restored image. These artifacts are discussed in more detail in [9, 16, 17]. In summary, these artifacts appear to look like a halo at a depth edge.
To account for the possibility that the sample window is over a depth discontinuity, we characterize the pixels observed within Ω as a Gaussian mixture model . The sample window may cover K different types of objects. This yields K clusters in the RGB histogram.
has a shape influenced by the mixture weights.
respectively. Instead of the transmission alone influencing the position of the ellipsoid, the mixture weight also influences the sample mean. An ambiguity therefore exists because the mixture weight and the transmission appear together as the product π1t1. In order to use the sample mean to estimate the transmission value, the mixture weight must be considered.
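The ambiguity between mixture weight and transmission can be illustrated with a small sketch. It assumes that the sample mean of a window covering K surfaces is the weighted sum of the foggy means of the individual clusters; the function name and the numbers are hypothetical.

```python
def mixture_foggy_mean(weights, means, transmissions, airlight):
    """Sample mean of a window covering K surfaces:
    sum_k pi_k * (t_k * mu_k + (1 - t_k) * a)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    out = [0.0] * len(airlight)
    for pi_k, mu_k, t_k in zip(weights, means, transmissions):
        for c, a_c in enumerate(airlight):
            out[c] += pi_k * (t_k * mu_k[c] + (1.0 - t_k) * a_c)
    return out


a = [1.0, 1.0, 1.0]
near, far = [0.1, 0.2, 0.1], [0.3, 0.3, 0.3]
# The far surface is fully veiled (t = 0). Two different (weight, transmission)
# pairs for the near surface share the same product pi_1 * t_1 = 0.4.
m1 = mixture_foggy_mean([0.5, 0.5], [near, far], [0.8, 0.0], a)
m2 = mixture_foggy_mean([0.8, 0.2], [near, far], [0.5, 0.0], a)
```

Both configurations give the same sample mean, so the mean alone cannot separate the mixture weight from the transmission of the near surface.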
5 Proposed ellipsoid prior method
Part of our key message in unifying existing defogging methods is that the transmission can be estimated using parameters from . As an introduction to this unification, we will use Lemma 2 to derive a new unique dark prior.
Similar to the nomenclature in , let the centroid prior, θ C , be the dark prior using Lemma 2.
where c is the color channel.
with t0 set to a low value for numerical conditioning (t0 = 0.001) (see the work by  for the recovery method and  for additional gamma corrections). For generating the defogged image using the centroid prior, , a gamma value of 1/2 was used for the examples in this article, e.g., . The complete algorithm for the ellipsoid prior defogging method is in Algorithm 1.
Algorithm 1 The ellipsoid prior defogging algorithm.
The transmission estimate in (26) is of the same prototype form in (2). Deriving a transmission estimate based on Lemma 2 results in creating a centroid prior that is a function of the ellipsoid parameters. In Section 6, we will show that other single image defogging methods also use the prototype in (2) where a dark prior is used. We will also show that the dark prior is a function of the color ellipsoid properties.
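The estimation and recovery steps can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the prototype has the linear form t = 1 - wθ suggested by the scaling term w and the dark prior θ, it borrows the default w = 0.95 from the DCP discussion, and it recovers the scene by inverting the dichromatic model with the transmission floored at t0. The function names are hypothetical and no gamma correction is applied.

```python
def transmission_from_prior(theta, w=0.95):
    """Prototype estimate t = 1 - w * theta, clipped to [0, 1] (assumed form)."""
    return min(1.0, max(0.0, 1.0 - w * theta))


def recover_pixel(x_foggy, a, t, t0=0.001):
    """Invert x_foggy = t*x + (1 - t)*a per channel: x = (x_foggy - a) / max(t, t0) + a."""
    t_safe = max(t, t0)  # floor keeps the division well conditioned
    return [(xc - ac) / t_safe + ac for xc, ac in zip(x_foggy, a)]
```

A dark prior of zero gives a transmission of one (no fog to remove), while a prior close to one gives a transmission close to 1 - w, so some veiling is always retained.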
6 Unification of single image defogging methods
The color ellipsoid framework will now be used to analyze how four single image defogging methods (Fattal , He , Gibson , and Tarel ) estimate the transmission using properties of the color ellipsoids.
6.1 Dark channel prior
with w = 0.95 for most scenes. This DCP transmission estimate in (31) is of the same form as (2).
It was observed by He et al.  through an experiment that the DCP of non-foggy outdoor natural scenes had 90% of the pixels below a tenth of the maximum possible value, hence the dark nomenclature in DCP. The DCP is constructed in such a way that it assumes there is a pixel within the sample region centered at i that originally was black. This is a strong assumption, and there must be more to why this initial estimate works.
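A minimal sketch of the raw DCP estimate, following the commonly stated form of He et al.'s method, is given below: the dark channel is the minimum, over a local window and over the color channels, of the foggy image normalized by the airlight. The nested-list image representation, the function names, and the small default window radius are assumptions made for illustration, and the refinement step is omitted.

```python
def dark_channel(image, airlight, radius=1):
    """Min over a (2*radius+1)^2 window and over channels of pixel / airlight.

    image: H x W nested list of [r, g, b] values; windows are clipped at borders.
    """
    h, w = len(image), len(image[0])
    norm_min = [[min(px[c] / airlight[c] for c in range(3)) for px in row]
                for row in image]
    return [[min(norm_min[u][v]
                 for u in range(max(0, i - radius), min(h, i + radius + 1))
                 for v in range(max(0, j - radius), min(w, j + radius + 1)))
             for j in range(w)] for i in range(h)]


def dcp_transmission(image, airlight, radius=1, w=0.95):
    """Raw transmission estimate t = 1 - w * dark_channel."""
    return [[1.0 - w * d for d in row]
            for row in dark_channel(image, airlight, radius)]
```

If any pixel in the window has a channel equal to zero, the dark channel is zero and the estimated transmission is one, which reflects the assumption that the window contains an originally black pixel.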
with equivalence when z c ∈ Ω i since a point from the set Ω i is selected instead of the estimated shell of the ellipsoid.
The right hand side of (36) was chosen by He et al.  to regularize the matting based on the DCP and to enforce smoothing weighted by λ.
and D and I3 × 3 in (38) is influenced by the properties of the color ellipsoid (μ k and Σ k ) within the window Ω k . The ability of preserving depth discontinuity edges is afforded by the affinity matrix, W, which is effective in preserving edges and discontinuities because of its locally adaptive nature .
The DCP method estimates the transmission with the prototype in (2), just like the centroid prior. Additionally, the properties of the color ellipsoids play a key role in the DCP for initial estimation and Laplacian matting for refinement.
6.2 Fattal prior
The single image defogging method by Fattal  is a unique method that at first does not appear to be using the prototype in (2). However, we show that Fattal’s method does indeed indirectly develop a dark prior and estimates the transmission with the same prototype in (2).
Fattal developed a way to create a raw estimate of the transmission and then employed a refinement step to improve the transmission estimate. We will first investigate how the raw transmission estimate is constructed.
when the albedo r is constant.
with ||a|| = ||a⊥|| and 〈a,a⊥ 〉 = 0.
The term is the residual albedo projected onto a⊥.
we see yet another prior, the Fattal prior θ F . The Fattal prior should behave similarly to the DCP (θ D ) and centroid prior (θ C ) since it is also used to estimate the transmission. The term θ F should match the intuition that it becomes darker (close to zero) when radiant objects are closer to the camera when fog is present.
The Fattal prior utilizes Lemma 2. Note that in (4) as the transmission increases, t → 1, the foggy pixel moves farther away from the airlight vector, a, while staying on the convex set . This causes more energy to go to the residual, , and less to x a . Therefore, according to (46), the Fattal prior decreases or becomes darker, θ F → 0, as the transmission increases regardless of the value of η.
The Fattal prior also utilizes Lemma 3. To observe this, we analyze the weight factor, η, in (46), which is a measure of ambiguity. It increases as the albedo color becomes parallel with the airlight, that is, as the observation becomes more ambiguous. A high η value means that it is not known whether the pixel is covered by fog or if it is truly the same color as the airlight, but not covered by fog.
Since η is measured using a sample region Ω, we employ the color ellipsoid framework to show that the θ F is dependent on the color ellipsoid.
where is the refinement of , and are the pixels in that are good. The transmission variance σ t is discussed in detail in  and is measured based on the noise in the image. The smoothing is controlled by the variance value .
The statistical prior on the right hand side of (57) not only enforces smoothness but also that the variation in the edges in transmission matches the edges in the original image projected onto airlight. Therefore, if there is a depth discontinuity, the variation will be large in enforcing to preserve depth discontinuity edges.
6.3 Tarel prior
In this section, we will explore the single image fog removal method presented by Tarel and Hautière  and relate their intuition with the properties of the color ellipsoids for foggy images. For this analysis, we will make the same assumption that Tarel makes where the foggy image, , has been white balanced such that the airlight component is pure white, a = (1,1,1) T .
(with a s = 1) which is a linear function of the transmission. Similar to the DCP, we call this term, θ T , the Tarel prior. We show that this prior is also dependent on the color ellipsoid properties.
The intuition in using the image whiteness is similar to the first step used in He’s method to obtain the DCP (30). The set of values w i within Ω i are the minimum distances from the points in the RGB cluster to either the R-G, G-B, or R-B planes. The atmospheric veiling is estimated by measuring the local average of w, μ w , and subtracting from it the local standard deviation of w, σ w .
6.3.1 Analysis without median operator
where we assume, just as Tarel does, that the airlight is pure white with a magnitude of 1 for each color channel. If the color in the patch is pure white, μw,i becomes 1, hence the name image of whiteness. Moreover, if the color within Ω i has at least one color component that is zero, then the local mean is only dependent on the atmospheric veiling, μw,i = 1 - t.
Using the approximation with (63), it can be shown that θ T is dependent on the position and shape of the color ellipsoid. There are four different clusters in Figure 8 that exist from different sample patches, where three of the clusters have the true μw,i indicated with them. One can see that these local averages of the image whiteness for each cluster are essentially the minimum component value for the cluster centroid given that the orientation of the cluster is aligned to the gray color line. Assuming that the orientation is along the gray color line is not too strong of an assumption since the image itself has been white-balanced and the dominant orientation is also along the gray color line due to shading or airlight influence. The fourth cluster, indicated with a dashed blue ellipse, is an example where this approximation is not valid due to the position and orientation of the cluster points.
Up to this point, the Tarel prior θ T is not a function of the mixture weights within the sample patch Ω i and thus will cause undesirable halo artifacts when removing fog from the image.
6.3.2 Analysis with median operator
The sample patch Ω i is chosen to be large (41×41) so that θ T is smooth. Likewise, since the median operator works well with edge preservation , edges are taken into account, which limits halo artifacts.
with |Ω i | odd. In addition to θ T being dependent on the size and position of the color ellipsoid from the sample patch Ω i , we also show in (67) that the mixture weights are employed by Tarel to infer the atmospheric veiling.
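The median-based veil inference can be sketched as below. This follows the form usually given for Tarel and Hautière's method, as recalled here rather than taken from this article: the local median of the whiteness image stands in for the local average, the local median of the absolute deviation stands in for the local standard deviation, and the result is scaled by a factor p and clipped between zero and the whiteness. The names, the default p = 0.95, and the small default radius are assumptions; the article's 41×41 patch corresponds to radius 20.

```python
from statistics import median


def _local_median(img, i, j, radius):
    """Median of img over the square window centered at (i, j), clipped at borders."""
    h, w = len(img), len(img[0])
    return median(img[u][v]
                  for u in range(max(0, i - radius), min(h, i + radius + 1))
                  for v in range(max(0, j - radius), min(w, j + radius + 1)))


def tarel_veil(image, radius=1, p=0.95):
    """Atmospheric veil for a white-balanced image (airlight assumed to be 1)."""
    h, w = len(image), len(image[0])
    white = [[min(px) for px in row] for row in image]  # image of whiteness
    avg = [[_local_median(white, i, j, radius) for j in range(w)] for i in range(h)]
    dev = [[abs(white[i][j] - avg[i][j]) for j in range(w)] for i in range(h)]
    veil = []
    for i in range(h):
        row = []
        for j in range(w):
            b = avg[i][j] - _local_median(dev, i, j, radius)  # "mean minus spread"
            row.append(max(min(p * b, white[i][j]), 0.0))
        veil.append(row)
    return veil
```

With a pure white airlight, the transmission then follows as one minus the veil at each pixel.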
This is essentially a hybrid of both the DCP θ D and the Tarel prior θ T because of the use of the median operator. In the same fashion as the previous analysis for the DCP and Tarel priors, the MDCP is also a function of the color ellipsoid properties. It also accounts for depth discontinuities by being dependent on the mixture weights π g .
We have found that we can unify single image defogging methods. The unification is that all of these single image defogging methods use the prototype in (2) to estimate transmission using a dark prior. Additionally, each of these dark priors uses properties of the color ellipsoids with respect to Lemmas 2 and 3.
[Table: Summary of dark prior methods; recoverable entries: median operator, Gauss-Markov random field]
The development of the color ellipsoid framework is a contribution to the field of work in single image defogging because it brings a richer understanding to the problem of estimating the transmission. This article provides the tools necessary to clearly understand how transmission is estimated from a single foggy day image. We have introduced a new method that is visually more aggressive in removing fog which affords an image that is richer in color.
Future work will include the color ellipsoid framework in the development of a contrast enhancement metric. Additionally, the ambiguity problem when estimating the transmission will be addressed using the orientation of the color ellipsoid to develop a more accurate transmission mapping with respect to the depth of the scene.
We present a new way to model single image defogging methods using a color ellipsoid framework. Our discoveries are as follows:
- We have discovered how depth cues from fog can be inferred using the color ellipsoid framework.
- We unify single image defogging methods using the color ellipsoid framework.
- A Gaussian mixture model is crucial to represent depth discontinuities, which are a common issue in removing fog in natural scenes.
- We discover that the ambiguity in measuring depth from fog is associated with the color ellipsoid orientation and shape.
- A new defogging method is presented which is effective in contrast enhancement and based on the color ellipsoid properties.
This article is a contribution to the image processing community by providing strong intuition in single image defogging, particularly estimating depth from fog. This is useful in contrast enhancement, surveillance, tracking, and robotic applications.
This work is supported in part by the Space and Naval Warfare Systems Center Pacific (SSC Pacific) and by NSF under grant CCF-1065305.
- 4. Middleton WEK: Vision Through the Atmosphere. Ontario: University of Toronto Press; 1952.
- 6. McCartney EJ: Optics of the Atmosphere: Scattering by Molecules and Particles. New York: Wiley; 1976.
- 7. Tan RT: Visibility in bad weather from a single image. In IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE; 2008:1-8.
- 9. He K, Sun J, Tang X: Single image haze removal using dark channel prior. In IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE; 2009:1956-1963.
- 10. Kratz L, Nishino K: Factorizing scene albedo and depth from a single foggy image. In IEEE 12th International Conference on Computer Vision. Piscataway: IEEE; 2009:1701-1708.
- 11. Fang F, Li F, Yang X, Shen C, Zhang G: Single image dehazing and denoising with variational method. In International Conference on Image Analysis and Signal Processing (IASP). Piscataway: IEEE; 2010:219-222.
- 12. Chao L, Wang M: Removal of water scattering. IEEE Comput. Eng. Technol. (ICCET) 2010, 2: V2-35.
- 13. Yoon I, Jeon J, Lee J, Paik J: Weighted image defogging method using statistical RGB channel feature extraction. In International SoC Design Conference (ISOCC). Piscataway: IEEE; 2010:34-35.
- 17. Tarel JP, Hautière N: Fast visibility restoration from a single color or gray level image. In IEEE 12th International Conference on Computer Vision. Piscataway: IEEE; 2009:2201-2208.
- 21. Gibson KB, Nguyen TQ: Hazy image modeling using color ellipsoids. In 18th IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE; 2011:1861-1864.
- 22. Omer I, Werman M: Color lines: image specific color representation. Proc. 2004 IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit 2004, 2: II-946.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.