Aerosol Layer Height over Water via Oxygen A-Band Observations from Space: A Tutorial

Davis, Anthony B.; Kalashnikova, Olga V.

doi:10.1007/978-3-030-03445-0_4


Part of the book series: Springer Series in Light Scattering (SSLS)


Abstract

Aerosols are a highly problematic area in climate science for several reasons. On the one hand, they are at least partially anthropogenic, originating from industrial facilities spewing pollution as well as agricultural activity, seasonal biomass burning, land-use change, and even wood-stove cooking in densely populated regions. On the other hand, aerosols interact in very poorly understood ways with clouds and hence, indirectly, the climate system as a whole (Boucher and Randall 2014).


Notes

1. Davis et al. (2009, and authors cited therein) take the opposite perspective and argue that, with new lidar equations that account fully for multiply scattered laser light, lidar techniques can be extended to optically thick clouds and, for that matter, aerosol plumes.
2. These “contributed” sensors for the PACE mission are in fact imaging polarimeters, but we will only use the intensity (i.e., the first component of the Stokes vector) herein.
3. Although long accepted on a phenomenological basis (Chandrasekhar 1950), the RTE in (4)–(5) was derived rigorously from EM wave theory (i.e., Maxwell’s equations) and statistical optics only by Mishchenko (2002).
4. Because of cancellations between the middle and last factors in the 2nd line of (24), the exponentials in the middle term have to be expanded to order 3 to obtain the first surviving cross-term in $\tau _\text {p}\tau _{{\mathrm O}_{2}}(\lambda )$.
5. This is in lieu of spectrally averaging the numerator of the DOAS ratio in (24), as well as the Jacobians from (27)–(29), for every different choice of the aerosol and geometry parameters. This is tantamount to driving the monochromatic forward model in (24) with a single “effective” wavelength $\lambda _i(M)$ such that $\tau _{{\mathrm O}_{2}}(\lambda _i(M)) = \tau _{{\mathrm O}_{2}}^{(i)}(M)$ for $i = 1,2,3$.
6. Spherical (Mie) particles following a lognormal size distribution, with $r_\text {g}$ = 0.06 $\mu $m geometric mean radius, $\ln (\sigma _\text {g})$ = 0.6, and complex refractive index $1.518-0.02368\,\mathrm{i}$ at $\lambda $ = 446 nm; although this is not the ${{\mathrm O}_{2}}$ A-band wavelength, the resulting phase function is representative.
7. In all cases discussed in this article, the eigenvalues of ${\mathbf {J}}_{\mathbf {b}}{\mathbf {S}}_{\mathbf {b}}{\mathbf {J}}_{\mathbf {b}}^\mathrm{T}$ (forward modeling error) are very small compared to the diagonal terms in ${\mathbf {S}}_{\mathbf {y}}$ (measurement error).

References

Aides A, Schechner YY, Holodovsky V, Garay MJ, Davis AB (2013) Multi sky-view 3D aerosol distribution recovery. Opt Express 21(22):25820–25833
Bevington PR, Robinson DK (1992) Data reduction and error analysis for the physical sciences. WCB McGraw-Hill, New York (NY)
Boucher O, Randall D (2014) Chapter 7: clouds and aerosols. In: Cubasch U, Wuebbles D (eds) Climate change 2013: the physical science basis. Cambridge University Press, Cambridge (UK)
Chandrasekhar S (1950) Radiative transfer. Oxford University Press, Oxford (UK). [reprinted by Dover Publications, New York (NY), 1960]
Corradini S, Cervino M (2006) Aerosol extinction coefficient profile retrieval in the oxygen A-band considering multiple scattering atmosphere. Test case: SCIAMACHY nadir simulated measurements. J Quant Spectrosc Radiat Transfer 97(3):354–380
Davis AB, Kalashnikova OV, Diner DJ (2018) Aerosol layer height over water from the oxygen A-band: mono-angle spectroscopy and/or multi-angle radiometry. Remote Sens (in preparation)
Davis AB, Knyazikhin Y (2005) Chapter 3: A primer in 3D radiative transfer. In: Marshak A, Davis AB (eds) 3D radiative transfer in cloudy atmospheres. Springer, Heidelberg (Germany), pp 153–242
Davis AB, Polonsky IN, Marshak A (2009) Space-time Green functions for diffusive radiation transport, in application to active and passive cloud probing. In: Kokhanovsky AA (ed) Light scattering reviews, vol 4. Springer-Praxis, Heidelberg (Germany), pp 169–292
Desmons M, Ferlay N, Parol F, Mcharek L, Vanbauce C (2013) Improved information about the vertical location and extent of monolayer clouds from POLDER3 measurements in the oxygen A-band. Atmos Meas Tech 6(8):2221–2238
Diner DJ, Beckert JC, Reilly TH, Bruegge CJ, Conel JE, Kahn RA, Martonchik JV, Ackerman TP, Davies R, Gerstl SAW, Gordon HR, Muller J-P, Myneni RB, Sellers PJ, Pinty B, Verstraete MM (1998) Multi-angle Imaging SpectroRadiometer (MISR) instrument description and experiment overview. IEEE Trans Geosci Remote Sens 36(4):1072–1087
Diner DJ, Boland SW, Brauer M, Bruegge C, Burke KA, Chipman R, Di Girolamo L, Garay MJ, Hasheminassab S, Hyer E, Jerrett M, Jovanovic V, Kalashnikova OV, Liu Y, Lyapustin AI, Martin RV, Nastan A, Ostro BD, Ritz B, Schwartz J, Wang J, Xu F (2018) Advances in multiangle satellite remote sensing of speciated airborne particulate matter and association with adverse health effects: from MISR to MAIA. J Appl Remote Sens 12:042603
Ding S, Wang J, Xu X (2016) Polarimetric remote sensing in oxygen A and B bands: Sensitivity study and information content analysis for vertical profile of aerosols. Atmos Meas Tech 9(5):2077–2092
Dubuisson P, Frouin R, Dessailly D, Duforêt L, Léon J-F, Voss K, Antoine D (2009) Estimating the altitude of aerosol plumes over the ocean from reflectance ratio measurements in the O$_2$ A-band. Remote Sens Environ 113(9):1899–1911
Duforêt L, Frouin R, Dubuisson P (2007) Importance and estimation of aerosol vertical structure in satellite ocean-color remote sensing. Appl Opt 46(7):1107–1119
Evans KF, Marshak A (2005) Chapter 4: Numerical methods. In: Marshak A, Davis AB (eds) 3D radiative transfer in cloudy atmospheres. Springer, Heidelberg (Germany), pp 243–281
Ferlay N, Thieuleux F, Cornet C, Davis AB, Dubuisson P, Ducos F, Parol F, Riédi J, Vanbauce C (2010) Toward new inferences about cloud structures from multidirectional measurements in the oxygen A-band: middle-of-cloud pressure and cloud geometrical thickness from POLDER-3/PARASOL. J Appl Meteorol Climatol 49(12):2492–2507
Flower VJB, Kahn RA (2018) Karymsky volcano eruptive plume properties based on MISR multi-angle imagery and the volcanological implications. Atmos Chem Phys 18(6):3903–3918
Frankenberg C, Hasekamp O, O’Dell C, Sanghavi S, Butz A, Worden J (2012) Aerosol information content analysis of multi-angle high spectral resolution measurements and its benefit for high accuracy greenhouse gas retrievals. Atmos Meas Tech 5(7):1809–1821
Gabella M, Kisselev V, Perona G (1999) Retrieval of aerosol profile variations from reflected radiation in the oxygen absorption A band. Appl Opt 38(15):3190–3195
Garay MJ, Davis AB, Diner DJ (2016) Tomographic reconstruction of an aerosol plume using passive multiangle observations from the MISR satellite instrument. Geophys Res Lett 43(24):12590–12596
Hollstein A, Fischer J (2014) Retrieving aerosol height from the oxygen A band: a fast forward operator and sensitivity study concerning spectral resolution, instrumental noise, and surface inhomogeneity. Atmos Meas Tech 7(5):1429–1441
Kahn RA, Gaitley BJ, Garay MJ, Diner DJ, Eck TF, Smirnov A, Holben BN (2010) Multiangle imaging spectroradiometer global aerosol product assessment by comparison with the aerosol robotic network. J Geophys Res Atmos 115(D23):D23209
Kalashnikova OV, Garay MJ, Davis AB, Diner DJ, Martonchik JV (2011a) Sensitivity of multi-angle photo-polarimetry to vertical layering and mixing of absorbing aerosols: quantifying measurement uncertainties. J Quant Spectrosc Radiat Transfer 112(13):2149–2163
Kalashnikova OV, Garay MJ, Sokolik IN, Diner DJ, Kahn RA, Martonchik JV, Lee JN, Torres O, Yang W, Marshak A, Kassabian S, Chodas M (2011b) Capabilities and limitations of MISR aerosol products in dust-laden regions. Proc SPIE 8177:1–11
Kokhanovsky AA, Rozanov VV (2005) Cloud bottom altitude determination from a satellite. IEEE Geosci Remote Sens Lett 2(3):280–283
Kokhanovsky AA, Rozanov VV (2010) The determination of dust cloud altitudes from a satellite using hyperspectral measurements in the gaseous absorption band. Int J Remote Sens 31(10):2729–2744
Kokhanovsky AA, Davis AB, Cairns B, Dubovik O, Hasekamp OP, Sano I, Mukai S, Rozanov VV, Litvinov P, Lapyonok T, Kolomiets IS (2015) Space-based remote sensing of atmospheric aerosols: the multi-angle spectro-polarimetric frontier. Earth Sci Rev 145:85–116
Liu Y, Diner DJ (2017) Multi-angle imager for aerosols: a satellite investigation to benefit public health. Public Health Reports 132(1):14–17
Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math 11:431–441
Merlin G, Riédi J, Labonnote LC, Cornet C, Davis AB, Dubuisson P, Desmons M, Ferlay N, Parol F (2016) Cloud information content analysis of multi-angular measurements in the oxygen A-band: application to 3MI and MSPI. Atmos Meas Tech 9(10):4977–4995
Min Q, Harrison LC (2004) Retrieval of atmospheric optical depth profiles from downward-looking high-resolution O$_2$ A-band measurements: optically thin conditions. J Atmos Sci 61(20):2469–2477
Mishchenko MI (2002) Vector radiative transfer equation for arbitrarily shaped and arbitrarily oriented particles: a microphysical derivation from statistical electromagnetics. Appl Opt 41:7114–7135
Moroney C, Davies R, Muller JP (2002) Operational retrieval of cloud-top heights using MISR data. IEEE Trans Geosci Remote Sens 40(7):1532–1540
NASA (2017) PACE: Plankton, Aerosol, Cloud, ocean Ecosystem. https://pace.gsfc.nasa.gov
National Academies of Sciences, Engineering, and Medicine (2018) Thriving on Our Changing Planet, A Decadal Strategy for Earth Observation from Space. The National Academies Press, Washington (DC). https://doi.org/10.17226/24938
Nelson DL, Chen Y, Kahn RA, Diner DJ, Mazzoni D (2008) Example applications of the MISR INteractive eXplorer (MINX) software tool to wildfire smoke plume analyses. Proc SPIE 7089:1–11
Nelson DL, Garay MJ, Kahn RA, Dunst BA (2013) Stereoscopic height and wind retrievals for aerosol plumes with the MISR INteractive eXplorer (MINX). Remote Sens 5(9):4593–4628
Pelletier B, Frouin R, Dubuisson P (2008) Retrieval of the aerosol vertical distribution from atmospheric radiance. Proc SPIE 7150:7150R1–7150R9
Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1989) Numerical recipes. Cambridge University Press, Cambridge (UK)
Rodgers CD (1998) Information content and optimisation of high spectral resolution remote measurements. Adv Space Res 21(3):361–367
Rodgers CD (2000) Inverse methods for atmospheric sounding: theory and practice. World Scientific, Singapore
Rothman LS (2010) The evolution and impact of the HITRAN molecular spectroscopic database. J Quant Spectrosc Radiat Transfer 111:1565–1567
Rozanov VV, Kokhanovsky AA (2004) Semianalytical cloud retrieval algorithm as applied to the cloud top altitude and the cloud geometrical thickness determination from top-of-atmosphere reflectance measurements in the oxygen A band. J Geophys Res Atmos 109(D5):D05202
Sanghavi S, Martonchik JV, Landgraf J, Platt U (2012) Retrieval of the optical depth and vertical distribution of particulate scatterers in the atmosphere using O$_2$ A- and B-band SCIAMACHY observations over Kanpur: a case study. Atmos Meas Tech 5(5):1099–1119
Schuessler O, Rodriguez D, Loyola G, Doicu A, Spurr R (2014) Information content in the oxygen A-band for the retrieval of macrophysical cloud parameters. IEEE Trans Geosci Remote Sens 52(6):3246–3255
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423
Val Martin M, Logan JA, Kahn RA, Leung FY, Nelson DL, Diner DJ (2010) Smoke injection heights from fires in North America: Analysis of 5 years of satellite observations. Atmos Chem Phys 10(4):1491–1510
Vaughan MA, Powell KA, Winker DM, Hostetler CA, Kuehn RE, Hunt WH, Getzewich BJ, Young SA, Liu Z, McGill MJ (2009) Fully automated detection of cloud and aerosol layers in the CALIPSO lidar measurements. J Atmos Ocean Technol 26(10):2034–2050
Winker DM, Pelon J, Coakley JA Jr, Ackerman SA, Charlson RJ, Colarco PR, Flamant P, Fu Q, Hoff RM, Kittaka C, Kubar TL, Le Treut H, McCormick MP, Mégie G, Poole L, Powell K, Trepte C, Vaughan MA, Wielicki BA (2010) The CALIPSO mission: a global 3D view of aerosols and clouds. Bull Am Meteorol Soc 91(9):1211–1230
Wu L, Hasekamp O, van Diedenhoven B, Cairns B, Yorks JE, Chowdhary J (2016) Passive remote sensing of aerosol layer height using near-UV multiangle polarization measurements. Geophys Res Lett 43(16):8783–8790
Xu X, Wang J, Wang Y, Zeng J, Torres O, Yang Y, Marshak A, Reid J, Miller S (2017) Passive remote sensing of altitude and optical depth of dust plumes using the oxygen A and B bands: first results from EPIC/DSCOVR at Lagrange-1 point. Geophys Res Lett 44(14):7544–7554


Acknowledgements

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA). We acknowledge support from the NASA Plankton, Aerosol, Cloud and ocean Ecosystem (PACE) for Earth Science, managed by Dr. Paula Bontempi. We also thank Laurent C.-Labonnote, Guillaume Merlin, Oleg Dubovik, Alex Kokhanovsky, Kirk Knobelspiesse, Lorraine Remer, Dave Diner, Mike Garay, Vijay Natraj, Suniti Sanghavi, Eugene Ustinov, and Feng Xu for fruitful discussions about OE theory and passive atmospheric profiling of aerosols and clouds using the O$_2$ A-band, in general or in connection with the PACE mission, as well as with the proposed MAIA investigation.

Author information

Authors and Affiliations

Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, 91109, USA
Anthony B. Davis & Olga V. Kalashnikova


Corresponding author

Correspondence to Olga V. Kalashnikova.

Editor information

Editors and Affiliations

Vitrociset Belgium, Darmstadt, Hessen, Germany
Alexander Kokhanovsky

Appendix: IC Analysis in Optimal Estimation Theory

1.1 Least-Squares Fit of a Forward Model to Data

The standard approach (Bevington and Robinson 1992; Press et al. 1989) to fitting a generally nonlinear forward model ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ to data ${\mathbf {y}}$ is to equate them, as n-dimensional vectors in measurement space, with allowance for random error $\varvec{\epsilon }$:

$$\begin{aligned} {\mathbf {y}}= {\mathbf {F}}({\mathbf {x}};{\mathbf {b}}) + \varvec{\epsilon }, \end{aligned}$$

(31)

where ${\mathbf {x}}$ is the m-dimensional vector in a state space that contains all the parameters used to find the best fit to the data; ${\mathbf {b}}$ is another vector in the space of non- or otherwise-retrieved parameters that are imperfectly known to within a known uncertainty.

We will consider two sources of error in $\varvec{\epsilon }$: (i) instrumental error that affects ${\mathbf {y}}$, and (ii) forward model error that affects ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$. Our main concern in the latter case is uncertainty on ${\mathbf {b}}$, the parts of a larger state vector that have to be treated as given when retrieving ${\mathbf {x}}$. Instrumental error is characterized by its $n \times n$ variance/covariance matrix

$$\begin{aligned} {\mathbf {S}}_{\mathbf {y}}= \mathrm{E}[({\mathbf {y}}-\mathrm{E}[{\mathbf {y}}])({\mathbf {y}}-\mathrm{E}[{\mathbf {y}}])^\mathrm{T}] \end{aligned}$$

(32)

where $\mathrm{E}[\cdot ]$ denotes mathematical expectation. Uncertainty on the non/otherwise-retrieved parameters is defined by their variance/covariance matrix ${\mathbf {S}}_{\mathbf {b}}$, which can be converted into ${\mathbf {S}}_{\mathbf {F}}$, the measurement-space equivalent of the error matrix in (32), by using the Jacobian matrix ${\mathbf {J}}_{\mathbf {b}}= \partial {\mathbf {F}}/\partial {\mathbf {b}}$ and its transpose ${\mathbf {J}}_{\mathbf {b}}^\mathrm{T}$:

$$\begin{aligned} {\mathbf {S}}_{\mathbf {F}}= {\mathbf {J}}_{\mathbf {b}}{\mathbf {S}}_{\mathbf {b}}{\mathbf {J}}_{\mathbf {b}}^\mathrm{T}. \end{aligned}$$

(33)

Since we are confident that measurement error in ${\mathbf {y}}$ and modeling error in ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ are independent random variables, their (co)variance matrices just add to form:

$$\begin{aligned} {\mathbf {S}}_{\varvec{\epsilon }} = {\mathbf {S}}_{\mathbf {y}}+ {\mathbf {S}}_{\mathbf {F}}. \end{aligned}$$

(34)
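The error budget in (33)–(34) can be sketched numerically as follows. This is only an illustration: the three-channel measurement, the single non-retrieved parameter, all numerical values, and the helper names `forward_model_error_cov` and `total_error_cov` are assumptions, not quantities from the chapter.

```python
def forward_model_error_cov(J_b, S_b):
    """S_F = J_b S_b J_b^T, eq. (33); matrices are lists of rows."""
    n, p = len(J_b), len(S_b)
    # J_b S_b has shape n x p.
    JS = [[sum(J_b[i][k] * S_b[k][j] for k in range(p)) for j in range(p)]
          for i in range(n)]
    # (J_b S_b) J_b^T has shape n x n.
    return [[sum(JS[i][k] * J_b[j][k] for k in range(p)) for j in range(n)]
            for i in range(n)]


def total_error_cov(S_y, S_F):
    """S_eps = S_y + S_F, eq. (34), valid for independent error sources."""
    return [[a + b for a, b in zip(row_y, row_f)]
            for row_y, row_f in zip(S_y, S_F)]


# Hypothetical setup: n = 3 measurements, one non-retrieved parameter b.
S_y = [[1e-4, 0.0, 0.0], [0.0, 1e-4, 0.0], [0.0, 0.0, 1e-4]]
J_b = [[0.02], [0.01], [0.005]]   # dF_i/db, illustrative values only
S_b = [[0.25]]                    # variance of b (standard deviation 0.5)

S_F = forward_model_error_cov(J_b, S_b)
S_eps = total_error_cov(S_y, S_F)
for row in S_eps:
    print(["%.2e" % v for v in row])
```

Even with a diagonal ${\mathbf {S}}_{\mathbf {y}}$, the forward-model term generally fills in the off-diagonal elements of ${\mathbf {S}}_{\varvec{\epsilon }}$, because an error in a shared parameter perturbs all measurements coherently.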

The classic least-squares minimization method determines the estimate

$$\begin{aligned} {\mathbf {x}}_\text {bp} = \mathop {\text {argmin}}\limits _{{\mathbf {x}}}\left[ \Psi _{\mathbf {y}}({\mathbf {x}}) \right] \end{aligned}$$

(35)

for the best possible fit of the data for model ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ by finding the minimum of the cost function

$$\begin{aligned} \Psi _{\mathbf {y}}({\mathbf {x}}) = \frac{1}{2} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}};{\mathbf {b}}))^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}};{\mathbf {b}})). \end{aligned}$$

(36)

In the frequent situation where ${\mathbf {S}}_{\varvec{\epsilon }}$ is diagonal, $\Psi _{\mathbf {y}}({\mathbf {x}})$ is half the sum of squared “observed–predicted” residuals, each divided by the error variance of the corresponding measurement. Assuming, for the moment, that $\Psi _{\mathbf {y}}({\mathbf {x}})$ is convex, one can use the iterative Gauss-Newton algorithm:

$$\begin{aligned} {\mathbf {x}}_{i+1} = {\mathbf {x}}_{i} + {\mathbf {S}}_{\mathbf {x}}{\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}}_i;{\mathbf {b}})), \end{aligned}$$

(37)

where ${\mathbf {J}}= \partial {\mathbf {F}}/\partial {\mathbf {x}}$ is the usual $n \times m$ Jacobian matrix, and

$$\begin{aligned} {\mathbf {S}}_{\mathbf {x}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}\right) ^{-1}, \end{aligned}$$

(38)

using an arbitrary starting position ${\mathbf {x}}_0$; this method converges in one step if ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ is linear in ${\mathbf {x}}$. Iteration is stopped when the non-dimensional weighted sum of residuals in (36), often denoted as $\chi ^2$/2, is $\approx n$/2.
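A minimal sketch of the iteration in (37)–(38) is given below for a scalar state ($m = 1$), where the matrix inverse in (38) reduces to a reciprocal. The exponential forward model, the sampling points `t`, the noise level, and the function names are all hypothetical stand-ins chosen for illustration; they are not the forward model of the main text. The sketch stops on a small step size, which is a substitute for the $\chi ^2$ criterion because the synthetic data here are noise-free.

```python
import math


def forward(x, t):
    """Toy forward model F_k(x) = exp(-x * t_k); a stand-in only."""
    return [math.exp(-x * tk) for tk in t]


def jacobian(x, t):
    """Analytic derivative dF_k/dx = -t_k * exp(-x * t_k)."""
    return [-tk * math.exp(-x * tk) for tk in t]


def gauss_newton(y, t, sigma, x0, n_iter=20, tol=1e-12):
    """Iterate eq. (37) for a scalar state with S_eps = sigma^2 * identity."""
    x = x0
    S_x = float("nan")
    for _ in range(n_iter):
        F, J = forward(x, t), jacobian(x, t)
        # Eq. (38): S_x = (J^T S_eps^-1 J)^-1 at the current linearization point.
        S_x = 1.0 / sum(Jk * Jk / sigma**2 for Jk in J)
        # Eq. (37): step = S_x J^T S_eps^-1 (y - F(x_i)).
        step = S_x * sum(Jk * (yk - Fk) / sigma**2
                         for Jk, yk, Fk in zip(J, y, F))
        x += step
        if abs(step) < tol:
            break
    return x, S_x


t = [0.5, 1.0, 2.0, 4.0]
x_true = 0.7
y = forward(x_true, t)            # noise-free synthetic "observations"
x_hat, S_x = gauss_newton(y, t, sigma=0.01, x0=0.2)
print(x_hat, math.sqrt(S_x))      # estimate and its predicted standard deviation
```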

Note that ${\mathbf {x}}_\text {bp}$ from (35) will not be identical to the true state vector in (31) for at least three basic reasons, starting with the least serious:

First, the data ${\mathbf {y}}$ are collected with a specific realization of the random instrument noise in $\varvec{\epsilon }$, i.e., the component quantified by ${\mathbf {S}}_{\mathbf {y}}$ in (34). Averaging over different measurements of ${\mathbf {y}}$ will reduce the noise level (by the square-root of the number of independent measurements) but not eliminate it. However, the distance between the true and estimated state vectors is likely to be on the order of the square-root of the diagonal elements of ${\mathbf {S}}_{\mathbf {x}}$ in (38), i.e., the predicted retrieval error for each element of ${\mathbf {x}}$:

$$\begin{aligned} \text {StDev}[x_j] = \sqrt{S_{{\mathbf {x}}jj}} \quad (j = 1,\dots ,m). \end{aligned}$$

(39)

Moreover, most interesting problems, in remote sensing in particular, result in non-convex cost functions in (36), simply because ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ is non-linear. Hence there is no guarantee that the cost function has a single global minimum to find by iterative Gauss-Newton steps. So, even assuming that the true state is near the global minimum of $\Psi _{\mathbf {y}}({\mathbf {x}})$, depending on the initial guess ${\mathbf {x}}_0$ at the state vector, the iterative search could end in a local minimum that is generally not within the distance in (39) of the true state. A well-known mitigation strategy for the non-convexity problem is to use the Levenberg-Marquardt minimization algorithm (Marquardt 1963; Bevington and Robinson 1992) where ${\mathbf {S}}_{\mathbf {x}}^{-1} = {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}$ in (38) is replaced by a weighted sum of itself and its diagonal elements. The weight starts high on the diagonal matrix, which makes the iteration behave like a straightforward gradient descent, which can work for non-convex function minimization (away from inflection points). As the weight shifts away from the diagonal, the method reverts to Gauss-Newton steps as the correct minimum is approached in a state-space region where $\Psi _{\mathbf {y}}({\mathbf {x}})$ is locally convex.

Finally, the whole inverse problem at hand can be ill-posed, that is, ${\mathbf {S}}_{\mathbf {x}}^{-1}$ can be nearly singular, thus making its inversion in (38) highly unstable. The trajectory of iterative minimization in (37) will then be very sensitive to small perturbations of ${\mathbf {y}}$, e.g., instrumental noise. We now introduce a solution to this last issue since in remote sensing there is often much redundancy in the observations with respect to the unknowns in the retrieval.
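The Levenberg-Marquardt damping mentioned above can be sketched as follows, again for a scalar state so that adding a multiple of the diagonal simply rescales ${\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}$ by $(1+\lambda )$. The toy exponential model, the factor-of-ten schedule for the damping weight `lam`, and the accept/reject rule are common textbook choices assumed here for illustration, not details taken from the chapter.

```python
import math


def forward(x, t):
    """Toy forward model F_k(x) = exp(-x * t_k); a stand-in only."""
    return [math.exp(-x * tk) for tk in t]


def cost(x, y, t, sigma):
    """Cost function of eq. (36) with S_eps = sigma^2 * identity."""
    return 0.5 * sum(((yk - Fk) / sigma) ** 2
                     for yk, Fk in zip(y, forward(x, t)))


def levenberg_marquardt(y, t, sigma, x0, lam=1e-3, n_iter=100):
    """Damped Gauss-Newton iteration for a scalar state."""
    x, psi = x0, cost(x0, y, t, sigma)
    for _ in range(n_iter):
        F = forward(x, t)
        J = [-tk * Fk for tk, Fk in zip(t, F)]
        curvature = sum(Jk * Jk for Jk in J) / sigma**2   # J^T S_eps^-1 J
        gradient = sum(Jk * (yk - Fk)
                       for Jk, yk, Fk in zip(J, y, F)) / sigma**2
        # Damped matrix: itself plus lam times its diagonal (scalar case).
        step = gradient / (curvature * (1.0 + lam))
        trial = x + step
        psi_trial = cost(trial, y, t, sigma)
        if psi_trial < psi:
            # Accept the step and shift the weight toward Gauss-Newton.
            x, psi, lam = trial, psi_trial, lam / 10.0
        else:
            # Reject the step and shift the weight toward gradient descent.
            lam *= 10.0
    return x


t = [0.5, 1.0, 2.0, 4.0]
y = forward(0.7, t)               # noise-free synthetic "observations"
x_hat = levenberg_marquardt(y, t, sigma=0.01, x0=5.0)
print(x_hat)
```

The starting point `x0 = 5.0` is deliberately poor: there the Jacobian is nearly zero, so an undamped Gauss-Newton step would be very large, whereas the damped iteration only accepts steps that lower the cost.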

1.2 Optimal Estimation

Rodgers’ (2000) theory of optimal estimation (OE) revisits the above inverse problem of parameter estimation in the forward signal model from a Bayesian perspective, as a means for dealing with chronic ill-posedness by introducing regularization. Bayes’ theorem relates two conditional probabilities and two unconditional counterparts:

$$\begin{aligned} P({\mathbf {x}}|{\mathbf {y}}) = P({\mathbf {y}}|{\mathbf {x}}) P({\mathbf {x}}) / P({\mathbf {y}}), \end{aligned}$$

(40)

where the last one is just a normalization factor of no particular interest here. $P({\mathbf {x}})$ is the “prior” or “a priori” probability, while $P({\mathbf {x}}|{\mathbf {y}})$ is the “posterior” probability.

Letting $|\cdot |$ denote the determinant of a square matrix, and assuming gaussian distributions, we have

$$\begin{aligned} \log P({\mathbf {x}}|{\mathbf {y}}) = -\frac{1}{2}\left[ ({\mathbf {x}}-\hat{{\mathbf {x}}})^\mathrm{T}{\mathbf {S}}_{\mathbf {x}}^{-1} ({\mathbf {x}}-\hat{{\mathbf {x}}}) + \log |{\mathbf {S}}_{\mathbf {x}}| + m\log (2\pi )\right] \end{aligned}$$

(41)

for the posterior probability of atmospheric state properties ${\mathbf {x}}$, given data ${\mathbf {y}}$. Also, we note that $\hat{{\mathbf {x}}}$ is a new estimate of the prevailing state vector. Both $\hat{{\mathbf {x}}}$ and ${\mathbf {S}}_{\mathbf {x}}$ will depend on ${\mathbf {y}}$, ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$, and related quantities, as shown below. In addition, we have

$$\begin{aligned} \log P({\mathbf {x}}) = -\frac{1}{2}\left[ ({\mathbf {x}}-{\mathbf {x}}_\text {a})^\mathrm{T}{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}-{\mathbf {x}}_\text {a}) + \log |{\mathbf {S}}_\text {a}| + m\log (2\pi ) \right] \end{aligned}$$

(42)

for the a priori probability of atmospheric state, i.e., what we know about it before any observations are made. Finally, we have

$$\begin{aligned} \log P({\mathbf {y}}|{\mathbf {x}}) = -\Psi _{\mathbf {y}}({\mathbf {x}}) - \frac{1}{2} \log \left[ (2\pi )^n|{\mathbf {S}}_{\varvec{\epsilon }}|\right] \end{aligned}$$

(43)

for the probability of obtaining the specific observations ${\mathbf {y}}$, given the atmospheric state ${\mathbf {x}}$, i.e., the probability of seeing the random residuals used in (36) to evaluate the cost function $\Psi _{\mathbf {y}}({\mathbf {x}})$.

One then defines $\hat{{\mathbf {x}}}$ as the state with maximum posterior probability (the maximum a posteriori estimate), i.e., the state that maximizes $P({\mathbf {y}}|{\mathbf {x}})P({\mathbf {x}})$, a product of two gaussian PDFs. In other words,

$$\begin{aligned} \hat{{\mathbf {x}}} = \mathop {\text {argmin}}\limits _{{\mathbf {x}}}\left[ \Psi _{\mathbf {y}}({\mathbf {x}}) + \frac{1}{2}({\mathbf {x}}-{\mathbf {x}}_\text {a})^\mathrm{T}{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}-{\mathbf {x}}_\text {a}) \right] , \end{aligned}$$

(44)

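instead of (35). The Gauss-Newton algorithm in (37) still applies, provided that the prior contribution $-{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}_i-{\mathbf {x}}_\text {a})$ is added to ${\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}}_i;{\mathbf {b}}))$ before multiplication by ${\mathbf {S}}_{\mathbf {x}}$; rather than (38), we now have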

$$\begin{aligned} {\mathbf {S}}_{\mathbf {x}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1}. \end{aligned}$$

(45)

Although there may be better choices, one can always start the iterations with ${\mathbf {x}}_0 = {\mathbf {x}}_\text {a}$. As in (39), the diagonal elements $S_{{\mathbf {x}}jj}$ of ${\mathbf {S}}_{\mathbf {x}}$ are the posterior estimates of variance on the retrieved properties in $\hat{{\mathbf {x}}}$.
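As a minimal sketch of (44)–(45), consider a linear forward model ${\mathbf {F}}({\mathbf {x}}) = {\mathbf {J}}{\mathbf {x}}$ with $m = 2$ state variables, $n = 3$ measurements, and ${\mathbf {S}}_{\varvec{\epsilon }} = \sigma ^2{\mathbf {I}}_n$. All of these choices, the numerical values, and the helper names `inv2` and `oe_linear` are assumptions made for illustration. Starting from ${\mathbf {x}}_0 = {\mathbf {x}}_\text {a}$, the prior term contributes no gradient, so for a linear model a single step reaches $\hat{{\mathbf {x}}}$.

```python
def inv2(M):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def oe_linear(y, J, x_a, S_a, sigma2):
    """Regularized estimate and posterior covariance for F(x) = J x, m = 2."""
    n = len(y)
    # K = J^T S_eps^-1 J with S_eps = sigma2 * identity.
    K = [[sum(J[k][i] * J[k][j] for k in range(n)) / sigma2 for j in range(2)]
         for i in range(2)]
    S_a_inv = inv2(S_a)
    # Eq. (45): S_x = (K + S_a^-1)^-1.
    S_x = inv2([[K[i][j] + S_a_inv[i][j] for j in range(2)] for i in range(2)])
    # Residual at the starting point x_0 = x_a.
    resid = [y[k] - (J[k][0] * x_a[0] + J[k][1] * x_a[1]) for k in range(n)]
    # g = J^T S_eps^-1 (y - F(x_a)).
    g = [sum(J[k][i] * resid[k] for k in range(n)) / sigma2 for i in range(2)]
    x_hat = [x_a[i] + S_x[i][0] * g[0] + S_x[i][1] * g[1] for i in range(2)]
    return x_hat, S_x


J = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # illustrative Jacobian
x_true = [1.0, 2.0]
y = [1.0, 2.0, 3.0]                        # noise-free J x_true
x_a = [0.0, 0.0]                           # prior mean
S_a = [[1.0, 0.0], [0.0, 1.0]]             # prior covariance

x_hat, S_x = oe_linear(y, J, x_a, S_a, sigma2=0.01)
print(x_hat)
print(S_x)
```

In this example the estimate lands slightly short of `x_true`, on the side of the prior mean: the regularization trades a small bias for a well-conditioned inversion.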

1.3 Degrees of Freedom

At any rate, and contrary to (38), the matrix inversion problem in (45) is by design well-posed, thanks to the presence of ${\mathbf {S}}_\text {a}^{-1}$. However, convergence goes to $\hat{{\mathbf {x}}}$ rather than ${\mathbf {x}}_\text {bp}$ in (35).

It is therefore of interest to evaluate the $m \times m$ matrix

$$\begin{aligned} {\mathbf {A}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1} {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}, \end{aligned}$$

(46)

i.e., the left-hand product of (45) with the potentially ill-conditioned matrix to be inverted in (38). Recall that these expressions are the predicted uncertainties on the retrieved state properties with and without regularization, respectively; in the latter case, however, the potentially unstable matrix inversion is not performed. A little algebra leads to the simpler expression

$$\begin{aligned} {\mathbf {A}}= {\mathbf {I}}_m - {\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}, \end{aligned}$$

(47)

where ${\mathbf {I}}_m$ is the $m \times m$ identity matrix. Now, in OE theory, ${\mathbf {A}}$ is known as the “averaging kernel,” and Rodgers (2000) shows that it can be defined simply as $\partial \hat{{\mathbf {x}}}/\partial {\mathbf {x}}$, where ${\mathbf {x}}$ is the true state parameter.
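The “little algebra” connecting (46) to (47) can be spelled out in a few lines. The shorthand ${\mathbf {K}}= {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}$ is introduced here only for compactness and is not notation from the main text; the last step uses (45).

$$\begin{aligned} {\mathbf {A}}&= \left( {\mathbf {K}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1} {\mathbf {K}}\\ &= \left( {\mathbf {K}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1} \left[ \left( {\mathbf {K}}+ {\mathbf {S}}_\text {a}^{-1} \right) - {\mathbf {S}}_\text {a}^{-1} \right] \\ &= {\mathbf {I}}_m - \left( {\mathbf {K}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1} {\mathbf {S}}_\text {a}^{-1} \\ &= {\mathbf {I}}_m - {\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}. \end{aligned}$$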

If ${\mathbf {S}}_\text {a}$ is diagonal (no prior covariances), then the diagonal terms of ${\mathbf {A}}$ are

$$\begin{aligned} A_{jj} = 1 - \frac{S_{{\mathbf {x}}jj}}{S_{\text {a}jj}} \end{aligned}$$

(48)

for $j = 1,\dots ,m$. Since we can anticipate that $0 \le S_{{\mathbf {x}}jj} \le S_{\text {a}jj}$, we have $A_{jj} \in [0,1]$. $A_{jj}$ is known as the partial “degree of freedom” (DOF) for the retrieved property $x_j$. It is directly related to the predicted retrieval error

$$\begin{aligned} \text {StDev}[x_j] = \sqrt{S_{\text {a}jj}\times (1-A_{jj})} = \text {StDev}_\text {a}[x_j] \sqrt{1-A_{jj}}, \end{aligned}$$

(49)

recalling that ${\mathbf {S}}_\text {a}$ is taken as diagonal.

Following the methodology of Merlin et al. (2016), we use $A_{jj}$ and StDev[$x_j$] from (48)–(49) extensively in the main body of this article. The former is an intuitive non-dimensional metric of the value added by the observations projected onto a specific state variable. If $S_{{\mathbf {x}}jj} \ll S_{\text {a}jj}$, then $A_{jj}$ approaches unity, which means that the observations have vastly improved our knowledge of the state variable $x_j$. If, on the contrary, $S_{{\mathbf {x}}jj} \lesssim S_{\text {a}jj}$, then $A_{jj}$ approaches zero, which means that the observations have not helped very much.
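The behavior of $A_{jj}$ and StDev[$x_j$] in (48)–(49) can be illustrated with a single state variable observed `n_obs` times. The scalar setup, the Jacobian `k`, the noise level `sigma`, and the function name `partial_dof` are hypothetical choices for this sketch.

```python
import math


def partial_dof(n_obs, k, sigma, S_a):
    """Scalar versions of eqs. (45), (48) and (49).

    n_obs independent measurements, each with Jacobian k and noise
    standard deviation sigma; S_a is the prior variance.
    """
    S_x = 1.0 / (n_obs * k**2 / sigma**2 + 1.0 / S_a)   # eq. (45)
    A = 1.0 - S_x / S_a                                  # eq. (48)
    stdev = math.sqrt(S_a * (1.0 - A))                   # eq. (49)
    return A, stdev


for n_obs in (0, 1, 10, 100):
    A, stdev = partial_dof(n_obs, k=1.0, sigma=0.5, S_a=1.0)
    print(f"n_obs = {n_obs:3d}: A = {A:.4f}, StDev = {stdev:.4f}")
```

With no observations the posterior equals the prior and $A_{jj} = 0$; as observations accumulate, $A_{jj}$ climbs toward unity while the predicted retrieval error shrinks.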

Figure 7 illustrates for $m = 2$ a typical concentration of probability measure (IC increase) in a familiar state space when going from the prior to posterior PDFs for ${\mathbf {x}}$. In both cases, we have traced, assuming gaussian PDFs, the lines of iso-probability value at $1/\left( 2\pi \sqrt{\mathrm{e}|{\mathbf {S}}|}\right) $, equivalently, where $({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}])^\mathrm{T}{\mathbf {S}}^{-1}({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}]) = 1$ for easy visualization of the magnitudes of prior and posterior standard deviations. The area of each ellipse is $\pi \sqrt{|{\mathbf {S}}|}$. More specifically, the region inside each ellipse accounts for $1-\mathrm{e}^{-1/2} \approx 39\%$ of the prior or posterior events (the familiar 68% applies to the one-dimensional interval of $\pm 1$ standard deviation). Note that, while ${\mathbf {S}}_\text {a}$ is diagonal (no covariance), ${\mathbf {S}}_{\mathbf {x}}$ is generally not.

A better-known quantification of overall retrieval performance in OE theory is

$$\begin{aligned} \text {Tr}[{\mathbf {A}}] = \sum \limits _{j=1}^m A_{jj} \in [0,m], \end{aligned}$$

(50)

which is the (implicitly, total) number of Degrees of Freedom. It approximates the maximum number of state properties (among m) that can be retrieved from the n observations in ${\mathbf {y}}$ in view of: (i) the instrumental noise (${\mathbf {S}}_{\mathbf {y}}$); (ii) forward model error (${\mathbf {S}}_{\mathbf {F}}$), in this case, from uncertainty in non-retrieved parameters (${\mathbf {S}}_{\mathbf {b}}$); (iii) sensitivities of the forward model to the state variables to be retrieved (${\mathbf {J}}$), or not (${\mathbf {J}}_{\mathbf {b}}$); and (iv) prior information about the atmospheric state (${\mathbf {S}}_\text {a}$).
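To illustrate how Tr[${\mathbf {A}}$] counts the retrievable quantities, the sketch below evaluates (46) for two hypothetical observing designs with $m = 2$: one where each measurement responds to a single state variable, and one where both measurements respond identically to both variables. The designs, the noise variance, the unit prior covariance, and the helper names are assumptions made for illustration.

```python
def inv2(M):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def total_dof(J, sigma2, S_a):
    """Trace of the averaging kernel A from eq. (46), for m = 2."""
    # K = J^T S_eps^-1 J with S_eps = sigma2 * identity.
    K = [[sum(row[i] * row[j] for row in J) / sigma2 for j in range(2)]
         for i in range(2)]
    S_a_inv = inv2(S_a)
    # Eq. (45): regularized posterior covariance.
    S_x = inv2([[K[i][j] + S_a_inv[i][j] for j in range(2)] for i in range(2)])
    # Eq. (46): A = S_x K.
    A = [[sum(S_x[i][k] * K[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return A[0][0] + A[1][1]


S_a = [[1.0, 0.0], [0.0, 1.0]]
dof_independent = total_dof([[1.0, 0.0], [0.0, 1.0]], 0.01, S_a)
dof_redundant = total_dof([[1.0, 1.0], [1.0, 1.0]], 0.01, S_a)
print(dof_independent, dof_redundant)
```

In the redundant design ${\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}$ is exactly singular, so (38) could not be evaluated at all; with the prior in place the inversion goes through and the trace comes out close to 1, reflecting that only the sum of the two state variables is constrained by the data.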

The off-diagonal terms in ${\mathbf {A}}$ are non-dimensional quantities of the type $\mathrm{E}\left[ (x_1-\hat{x}_1)(x_2-\hat{x}_2) \right] $ divided by a diagonal element of ${\mathbf {S}}_\text {a}$, again assuming it is diagonal. These normalized cross-correlations describe how different state variables interfere (statistically) with one another, two-by-two. Keeping these numbers small results in a more regular (hence, more easily inverted) matrix ${\mathbf {S}}_{\mathbf {x}}^{-1}$. That, in turn, will help the performance of the retrieval algorithm. In the design phase of the observation system, the forward model ${\mathbf {F}}({\mathbf {x}};{\mathbf {b}})$ can thus be used to optimize the sampling of ${\mathbf {y}}$ in a way that keeps cross-variable interference as small as possible.

1.4 Entropy and Information Content

Lastly, we relate DOFs to information theoretical concepts, and thus justify our claim all along that we are quantifying Information Content per se. Following Shannon (1948), Rodgers (1998) defines the increase in information (equivalently, decrease in entropy) associated with the acquisition of observations ${\mathbf {y}}$ and their processing—by, e.g., OE methods—as

$$\begin{aligned} -\Delta H = -\frac{\log _2(|{\mathbf {S}}_{\mathbf {x}}|)-\log _2(|{\mathbf {S}}_\text {a}|)}{2} = -\frac{1}{2}\log _2(|{\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}|) = -\frac{1}{2}\log _2(|{\mathbf {I}}_m - {\mathbf {A}}|), \end{aligned}$$

(51)

when expressed in bits. Information gain $-\Delta H$ ranges from 0$^+$ ($|{\mathbf {S}}_{\mathbf {x}}| \lesssim |{\mathbf {S}}_\text {a}|$) to $\infty $ ($|{\mathbf {S}}_{\mathbf {x}}| \ll |{\mathbf {S}}_\text {a}|$). This follows directly from the expression for the entropy of a generic m-dimensional gaussian probability density function, $P_m(\mathrm{E}[{\mathbf {x}}],{\mathbf {S}})$: from, e.g., (42), we have

$$\begin{aligned} H(m,|{\mathbf {S}}|) = -\mathrm{E}[\log P_m(\mathrm{E}[{\mathbf {x}}],{\mathbf {S}})] = \frac{1}{2}\log ((\mathrm{e}2\pi )^m|{\mathbf {S}}|), \end{aligned}$$

(52)

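which is naturally independent of the mean $\mathrm{E}[{\mathbf {x}}]$. To visualize this IC increase (entropy decrease), note that the area of each ellipse in Fig. 7 is $\pi \sqrt{|{\mathbf {S}}|}$, so $-\Delta H$ in (51) is the base-2 log of the ratio of the areas of the prior and posterior ellipses, i.e., $\tfrac{1}{2}$ the base-2 log of the ratio of determinants $|{\mathbf {S}}_\text {a}|/|{\mathbf {S}}_{\mathbf {x}}|$.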
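The information gain in (51) is straightforward to evaluate once the prior and posterior covariances are known. The sketch below does so for $m = 2$; the two posterior matrices are hypothetical examples, obtained from (45) for a unit prior covariance, noise variance 0.01, and Jacobians equal to the identity and to a matrix of ones, respectively. The function names are likewise illustrative.

```python
import math


def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def information_gain_bits(S_x, S_a):
    """-Delta H = (1/2) log2(|S_a| / |S_x|), equivalent to eq. (51)."""
    return 0.5 * math.log2(det2(S_a) / det2(S_x))


S_a = [[1.0, 0.0], [0.0, 1.0]]
# Posterior for two measurements that each see one state variable.
S_x_independent = [[1.0 / 101.0, 0.0], [0.0, 1.0 / 101.0]]
# Posterior for two measurements that both see only the sum x_1 + x_2.
S_x_redundant = [[201.0 / 401.0, -200.0 / 401.0],
                 [-200.0 / 401.0, 201.0 / 401.0]]

gain_independent = information_gain_bits(S_x_independent, S_a)
gain_redundant = information_gain_bits(S_x_redundant, S_a)
print(f"{gain_independent:.3f} bits vs {gain_redundant:.3f} bits")
```

The redundant design yields fewer bits because the posterior ellipse shrinks along one direction of state space only, leaving the orthogonal direction at its prior width.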

Abbreviations & Acronyms

1D: one-dimensional
AOT: aerosol optical thickness
BC: boundary condition
BRDF: Bi-directional Reflection Distribution Function
BRF: Bi-directional Reflection Factor
DOAS: Differential Optical Absorption Spectroscopy
DOF: Degree(s) Of Freedom
FWHM: Full-Width Half-Max
GSFC: Goddard Space Flight Center
IC: Information Content
IR: infra-red
JPL: Jet Propulsion Laboratory
MAIA: Multi-Angle Imager for Aerosols
MAS: Multi-Angle Sensor
NASA: National Aeronautics and Space Administration
OE: Optimal Estimation
OCI: Ocean Color Imager
PACE: Plankton, Aerosol, Cloud, and ocean Ecosystem
RT: radiative transfer
SNR: signal-to-noise ratio
SSA: single scattering albedo
StDev: Standard Deviation
SZA: solar zenith angle
TOA: Top-of-Atmosphere
UV: ultra-violet
VNIR: visible/near-infrared
VIS: visible
VZA: viewing zenith angle


Cite this chapter

Davis, A.B., Kalashnikova, O.V. (2019). Aerosol Layer Height over Water via Oxygen A-Band Observations from Space: A Tutorial. In: Kokhanovsky, A. (eds) Springer Series in Light Scattering. Springer Series in Light Scattering. Springer, Cham. https://doi.org/10.1007/978-3-030-03445-0_4


DOI: https://doi.org/10.1007/978-3-030-03445-0_4
Published: 14 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03444-3
Online ISBN: 978-3-030-03445-0
eBook Packages: Earth and Environmental Science (R0)
