
Aerosol Layer Height over Water via Oxygen A-Band Observations from Space: A Tutorial

Springer Series in Light Scattering

Part of the book series: Springer Series in Light Scattering ((SSLS))

Abstract

Aerosols are a highly problematic area in climate science for several reasons. On the one hand, they are at least partially anthropogenic, originating from industrial facilities spewing pollution as well as agricultural activity, seasonal biomass burning, land-use change, and even wood-stove cooking in densely populated regions. On the other hand, aerosols interact in very poorly understood ways with clouds and hence, indirectly, the climate system as a whole (Boucher and Randall 2014).


Notes

  1. Davis et al. (2009, and authors cited therein) take the opposite perspective and argue that, with new lidar equations that account fully for multiply scattered laser light, lidar techniques can be extended to optically thick clouds and, for that matter, aerosol plumes.

  2. These “contributed” sensors for the PACE mission are in fact imaging polarimeters, but we will only use the intensity (i.e., the first component of the Stokes vector) herein.

  3. Although long accepted on a phenomenological basis (Chandrasekhar 1950), the RTE in (4)–(5) was derived rigorously from EM wave theory (i.e., Maxwell’s equations) and statistical optics only in 2002, by Mishchenko (2002).

  4. Because of cancellations between the middle and last factors in the 2nd line of (24), the exponentials in the middle term have to be expanded to order 3 to obtain the first surviving cross-term in \(\tau _\text {p}\tau _{{\mathrm O}_{2}}(\lambda )\).

  5. This is in lieu of spectrally averaging the numerator of the DOAS ratio in (24), as well as the Jacobians from (27)–(29), for every different choice of the aerosol and geometry parameters. This is tantamount to driving the monochromatic forward model in (24) with a single “effective” wavelength \(\lambda _i(M)\) such that \(\tau _{{\mathrm O}_{2}}(\lambda _i(M)) = \tau _{{\mathrm O}_{2}}^{(i)}(M)\) for \(i = 1,2,3\).

  6. Spherical (Mie) particles following a lognormal size distribution, with \(r_\text {g}\) = 0.06 \(\mu \)m geometric mean radius, \(\ln (\sigma _\text {g})\) = 0.6, and complex refractive index \(1.518-0.02368\,\mathrm{i}\) at \(\lambda \) = 446 nm; although this is not the \({{\mathrm O}_{2}}\) A-band wavelength, the resulting phase function is representative.

  7. In all cases discussed in this article, the eigenvalues of \({\mathbf {J}}_{\mathbf {b}}{\mathbf {S}}_{\mathbf {b}}{\mathbf {J}}_{\mathbf {b}}^\mathrm{T}\) (forward modeling error) are very small compared to the diagonal terms in \({\mathbf {S}}_{\mathbf {y}}\) (measurement error).

References

  • Aides A, Schechner YY, Holodovsky V, Garay MJ, Davis AB (2013) Multi sky-view 3D aerosol distribution recovery. Opt Express 21(22):25820–25833


  • Bevington PR, Robinson DK (1992) Data reduction and error analysis for the physical sciences. WCB McGraw-Hill, New York (NY)


  • Boucher O, Randall D (2014) Chapter 7: clouds and aerosols. In: Cubasch U, Wuebbles D (eds) Climate change 2013: the physical science basis. Cambridge University Press, Cambridge (UK)


  • Chandrasekhar S (1950) Radiative transfer. Oxford University Press, Oxford (UK) [reprinted by Dover Publications, New York (NY), 1960]

  • Corradini S, Cervino M (2006) Aerosol extinction coefficient profile retrieval in the oxygen A-band considering multiple scattering atmosphere. Test case: SCIAMACHY nadir simulated measurements. J Quant Spectrosc Radiat Transfer 97(3):354–380

  • Davis AB, Kalashnikova OV, Diner DJ (2018) Aerosol layer height over water from the oxygen A-band: mono-angle spectroscopy and/or multi-angle radiometry. Remote Sens (in preparation)

  • Davis AB, Knyazikhin Y (2005) Chapter 3: A primer in 3D radiative transfer. In: Marshak A, Davis AB (eds) 3D radiative transfer in cloudy atmospheres. Springer, Heidelberg (Germany), pp 153–242


  • Davis AB, Polonsky IN, Marshak A (2009) Space-time Green functions for diffusive radiation transport, in application to active and passive cloud probing. In: Kokhanovsky AA (ed) Light scattering reviews, vol 4. Springer-Praxis, Heidelberg (Germany), pp 169–292


  • Desmons M, Ferlay N, Parol F, Mcharek L, Vanbauce C (2013) Improved information about the vertical location and extent of monolayer clouds from POLDER3 measurements in the oxygen A-band. Atmos Meas Tech 6(8):2221–2238


  • Diner DJ, Beckert JC, Reilly TH, Bruegge CJ, Conel JE, Kahn RA, Martonchik JV, Ackerman TP, Davies R, Gerstl SAW, Gordon HR, Muller J-P, Myneni RB, Sellers PJ, Pinty B, Verstraete MM (1998) Multi-angle Imaging SpectroRadiometer (MISR) instrument description and experiment overview. IEEE Trans Geosci Remote Sens 36(4):1072–1087


  • Diner DJ, Boland SW, Brauer M, Bruegge C, Burke KA, Chipman R, Di Girolamo L, Garay MJ, Hasheminassab S, Hyer E, Jerrett M, Jovanovic V, Kalashnikova OV, Liu Y, Lyapustin AI, Martin RV, Nastan A, Ostro BD, Ritz B, Schwartz J, Wang J, Xu F (2018) Advances in multiangle satellite remote sensing of speciated airborne particulate matter and association with adverse health effects: from MISR to MAIA. J Appl Remote Sens 12:042603


  • Ding S, Wang J, Xu X (2016) Polarimetric remote sensing in oxygen A and B bands: Sensitivity study and information content analysis for vertical profile of aerosols. Atmos Meas Tech 9(5):2077–2092


  • Dubuisson P, Frouin R, Dessailly D, Duforêt L, Léon J-F, Voss K, Antoine D (2009) Estimating the altitude of aerosol plumes over the ocean from reflectance ratio measurements in the O\(_2\) A-band. Remote Sens Environ 113(9):1899–1911


  • Duforêt L, Frouin R, Dubuisson P (2007) Importance and estimation of aerosol vertical structure in satellite ocean-color remote sensing. Appl Opt 46(7):1107–1119


  • Evans KF, Marshak A (2005) Numerical methods. In: Marshak A, Davis A (eds) 3D radiative transfer in cloudy atmospheres, chapter 4. Springer, Heidelberg (Germany), pp 243–281


  • Ferlay N, Thieuleux F, Cornet C, Davis AB, Dubuisson P, Ducos F, Parol F, Riédi J, Vanbauce C (2010) Toward new inferences about cloud structures from multidirectional measurements in the oxygen A-band: middle-of-cloud pressure and cloud geometrical thickness from POLDER-3/PARASOL. J Appl Meteorol Climatol 49(12):2492–2507


  • Flower VJB, Kahn RA (2018) Karymsky volcano eruptive plume properties based on MISR multi-angle imagery and the volcanological implications. Atmos Chem Phys 18(6):3903–3918


  • Frankenberg C, Hasekamp O, O’Dell C, Sanghavi S, Butz A, Worden J (2012) Aerosol information content analysis of multi-angle high spectral resolution measurements and its benefit for high accuracy greenhouse gas retrievals. Atmos Meas Tech 5(7):1809–1821


  • Gabella M, Kisselev V, Perona G (1999) Retrieval of aerosol profile variations from reflected radiation in the oxygen absorption A band. Appl Opt 38(15):3190–3195


  • Garay MJ, Davis AB, Diner DJ (2016) Tomographic reconstruction of an aerosol plume using passive multiangle observations from the MISR satellite instrument. Geophys Res Lett 43(24):12590–12596


  • Hollstein A, Fischer J (2014) Retrieving aerosol height from the oxygen A band: a fast forward operator and sensitivity study concerning spectral resolution, instrumental noise, and surface inhomogeneity. Atmos Meas Tech 7(5):1429–1441

  • Kahn RA, Gaitley BJ, Garay MJ, Diner DJ, Eck TF, Smirnov A, Holben BN (2010) Multiangle imaging spectroradiometer global aerosol product assessment by comparison with the aerosol robotic network. J Geophys Res Atmos 115(D23):D23209


  • Kalashnikova OV, Garay MJ, Davis AB, Diner DJ, Martonchik JV (2011a) Sensitivity of multi-angle photo-polarimetry to vertical layering and mixing of absorbing aerosols: quantifying measurement uncertainties. J Quant Spectrosc Radiat Transfer 112(13):2149–2163


  • Kalashnikova OV, Garay MJ, Sokolik IN, Diner DJ, Kahn RA, Martonchik JV, Lee JN, Torres O, Yang W, Marshak A, Kassabian S, Chodas M (2011b) Capabilities and limitations of MISR aerosol products in dust-laden regions. Proc SPIE 8177:1–11


  • Kokhanovsky AA, Rozanov VV (2005) Cloud bottom altitude determination from a satellite. IEEE Geosci Remote Sens Lett 2(3):280–283


  • Kokhanovsky AA, Rozanov VV (2010) The determination of dust cloud altitudes from a satellite using hyperspectral measurements in the gaseous absorption band. Int J Remote Sens 31(10):2729–2744


  • Kokhanovsky AA, Davis AB, Cairns B, Dubovik O, Hasekamp OP, Sano I, Mukai S, Rozanov VV, Litvinov P, Lapyonok T, Kolomiets IS (2015) Space-based remote sensing of atmospheric aerosols: the multi-angle spectro-polarimetric frontier. Earth Sci Rev 145:85–116


  • Liu Y, Diner DJ (2017) Multi-angle imager for aerosols: a satellite investigation to benefit public health. Public Health Reports 132(1):14–17


  • Marquardt DW (1963) An algorithm for least-squares estimation of nonlinear parameters. J Soc Ind Appl Math 11:431–441


  • Merlin G, Riédi J, Labonnote LC, Cornet C, Davis AB, Dubuisson P, Desmons M, Ferlay N, Parol F (2016) Cloud information content analysis of multi-angular measurements in the oxygen A-band: application to 3MI and MSPI. Atmos Meas Tech 9(10):4977–4995


  • Min Q, Harrison LC (2004) Retrieval of atmospheric optical depth profiles from downward-looking high-resolution O\(_2\) A-band measurements: optically thin conditions. J Atmos Sci 61(20):2469–2477


  • Mishchenko MI (2002) Vector radiative transfer equation for arbitrarily shaped and arbitrarily oriented particles: a microphysical derivation from statistical electromagnetics. Appl Opt 41:7114–7135


  • Moroney C, Davies R, Muller JP (2002) Operational retrieval of cloud-top heights using MISR data. IEEE Trans Geosci Remote Sens 40(7):1532–1540


  • NASA (2017) PACE: Plankton, Aerosol, Cloud, ocean Ecosystem. https://pace.gsfc.nasa.gov

  • National Academies of Sciences, Engineering, and Medicine (2018) Thriving on Our Changing Planet, A Decadal Strategy for Earth Observation from Space. The National Academies Press, Washington (DC). https://doi.org/10.17226/24938

  • Nelson DL, Chen Y, Kahn RA, Diner DJ, Mazzoni D (2008) Example applications of the MISR INteractive eXplorer (MINX) software tool to wildfire smoke plume analyses. Proc SPIE 7089:1–11


  • Nelson DL, Garay MJ, Kahn RA, Dunst BA (2013) Stereoscopic height and wind retrievals for aerosol plumes with the MISR INteractive eXplorer (MINX). Remote Sens 5(9):4593–4628


  • Pelletier B, Frouin R, Dubuisson P (2008) Retrieval of the aerosol vertical distribution from atmospheric radiance. Proc SPIE 7150:7150R1–7150R9

  • Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1989) Numerical recipes. Cambridge University Press, Cambridge (UK)


  • Rodgers CD (1998) Information content and optimisation of high spectral resolution remote measurements. Adv Space Res 21(3):361–367


  • Rodgers CD (2000) Inverse methods for atmospheric sounding: theory and practice. World Scientific, Singapore


  • Rothman LS (2010) The evolution and impact of the HITRAN molecular spectroscopic database. J Quant Spectrosc Radiat Transfer 111:1565–1567


  • Rozanov VV, Kokhanovsky AA (2004) Semianalytical cloud retrieval algorithm as applied to the cloud top altitude and the cloud geometrical thickness determination from top-of-atmosphere reflectance measurements in the oxygen A band. J Geophys Res Atmos 109(D5):D05202


  • Sanghavi S, Martonchik JV, Landgraf J, Platt U (2012) Retrieval of the optical depth and vertical distribution of particulate scatterers in the atmosphere using O\(_2\) A- and B-band SCIAMACHY observations over Kanpur: a case study. Atmos Meas Tech 5(5):1099–1119

  • Schuessler O, Rodriguez D, Loyola G, Doicu A, Spurr R (2014) Information content in the oxygen A-band for the retrieval of macrophysical cloud parameters. IEEE Trans Geosci Remote Sens 52(6):3246–3255


  • Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423


  • Val Martin M, Logan JA, Kahn RA, Leung FY, Nelson DL, Diner DJ (2010) Smoke injection heights from fires in North America: Analysis of 5 years of satellite observations. Atmos Chem Phys 10(4):1491–1510


  • Vaughan MA, Powell KA, Winker DM, Hostetler CA, Kuehn RE, Hunt WH, Getzewich BJ, Young SA, Liu Z, McGill MJ (2009) Fully automated detection of cloud and aerosol layers in the CALIPSO lidar measurements. J Atmos Ocean Technol 26(10):2034–2050


  • Winker DM, Pelon J, Coakley JA Jr, Ackerman SA, Charlson RJ, Colarco PR, Flamant P, Fu Q, Hoff RM, Kittaka C, Kubar TL, Le Treut H, McCormick MP, Mégie G, Poole L, Powell K, Trepte C, Vaughan MA, Wielicki BA (2010) The CALIPSO mission: a global 3D view of aerosols and clouds. Bull Am Meteorol Soc 91(9):1211–1230


  • Wu L, Hasekamp O, van Diedenhoven B, Cairns B, Yorks JE, Chowdhary J (2016) Passive remote sensing of aerosol layer height using near-UV multiangle polarization measurements. Geophys Res Lett 43(16):8783–8790

  • Xu X, Wang J, Wang Y, Zeng J, Torres O, Yang Y, Marshak A, Reid J, Miller S (2017) Passive remote sensing of altitude and optical depth of dust plumes using the oxygen A and B bands: first results from EPIC/DSCOVR at Lagrange-1 point. Geophys Res Lett 44(14):7544–7554



Acknowledgements

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA). We acknowledge support from the NASA Plankton, Aerosol, Cloud and ocean Ecosystem (PACE) for Earth Science, managed by Dr. Paula Bontempi. We also thank Laurent C.-Labonnote, Guillaume Merlin, Oleg Dubovik, Alex Kokhanovsky, Kirk Knobelspiesse, Lorraine Remer, Dave Diner, Mike Garay, Vijay Natraj, Suniti Sanghavi, Eugene Ustinov, and Feng Xu for fruitful discussions about OE theory and passive atmospheric profiling of aerosols and clouds using the O\(_2\) A-band, in general or in connection with the PACE mission, as well as with the proposed MAIA investigation.

Author information


Corresponding author

Correspondence to Olga V. Kalashnikova.


Appendix: IC Analysis in Optimal Estimation Theory


1.1 Least-Squares Fit of a Forward Model to Data

The standard approach (Bevington and Robinson 1992; Press et al. 1989) to fitting a generally nonlinear forward model \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) to data \({\mathbf {y}}\) is to equate them, as n-dimensional vectors in measurement space, with allowance for random error \(\varvec{\epsilon }\):

$$\begin{aligned} {\mathbf {y}}= {\mathbf {F}}({\mathbf {x}};{\mathbf {b}}) + \varvec{\epsilon }, \end{aligned}$$
(31)

where \({\mathbf {x}}\) is the m-dimensional state vector containing all the parameters that are adjusted to find the best fit to the data, and \({\mathbf {b}}\) is a vector of parameters that are not retrieved here (or are retrieved by other means) and are therefore known only to within a specified uncertainty.

We will consider two sources of error in \(\varvec{\epsilon }\): (i) instrumental error that affects \({\mathbf {y}}\), and (ii) forward model error that affects \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\). Our main concern in the latter case is uncertainty on \({\mathbf {b}}\), the parts of a larger state vector that have to be treated as given when retrieving \({\mathbf {x}}\). Instrumental error is characterized by its \(n \times n\) variance/covariance matrix

$$\begin{aligned} {\mathbf {S}}_{\mathbf {y}}= \mathrm{E}[({\mathbf {y}}-\mathrm{E}[{\mathbf {y}}])({\mathbf {y}}-\mathrm{E}[{\mathbf {y}}])^\mathrm{T}] \end{aligned}$$
(32)

where \(\mathrm{E}[\cdot ]\) denotes mathematical expectation. Uncertainty on the non/otherwise-retrieved parameters is defined by their variance/covariance matrix \({\mathbf {S}}_{\mathbf {b}}\), which can be converted into \({\mathbf {S}}_{\mathbf {F}}\), the measurement-space counterpart of the measurement error matrix in (32), by using the Jacobian matrix \({\mathbf {J}}_{\mathbf {b}}= \partial {\mathbf {F}}/\partial {\mathbf {b}}\) and its transpose \({\mathbf {J}}_{\mathbf {b}}^\mathrm{T}\):

$$\begin{aligned} {\mathbf {S}}_{\mathbf {F}}= {\mathbf {J}}_{\mathbf {b}}{\mathbf {S}}_{\mathbf {b}}{\mathbf {J}}_{\mathbf {b}}^\mathrm{T}. \end{aligned}$$
(33)

Since we are confident that measurement error in \({\mathbf {y}}\) and modeling error in \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) are independent random variables, their (co)variance matrices just add to form:

$$\begin{aligned} {\mathbf {S}}_{\varvec{\epsilon }} = {\mathbf {S}}_{\mathbf {y}}+ {\mathbf {S}}_{\mathbf {F}}. \end{aligned}$$
(34)
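As an illustration of (33)–(34), the following minimal NumPy sketch assembles the total error covariance from its two parts. The function name `total_error_covariance` and all numerical values are hypothetical, chosen only to make the example concrete; they do not come from the study in the main body.

```python
import numpy as np


def total_error_covariance(S_y, J_b, S_b):
    """Return (S_eps, S_F), with S_F = J_b S_b J_b^T as in (33) and S_eps as in (34)."""
    S_F = J_b @ S_b @ J_b.T
    return S_y + S_F, S_F


# Hypothetical numbers: n = 3 measurements, 2 non-retrieved parameters in b.
S_y = np.diag([1e-4, 1e-4, 4e-4])   # instrument noise variances
S_b = np.diag([1e-4, 4e-4])         # uncertainty on b
J_b = np.array([[0.5, 0.0],
                [0.2, 0.1],
                [0.0, 0.3]])        # dF/db, shape (n, len(b))

S_eps, S_F = total_error_covariance(S_y, J_b, S_b)
print(np.sqrt(np.diag(S_eps)))      # total 1-sigma error per measurement
```

Note that \({\mathbf {S}}_{\mathbf {F}}\) generally has off-diagonal terms even when both \({\mathbf {S}}_{\mathbf {y}}\) and \({\mathbf {S}}_{\mathbf {b}}\) are diagonal, because a single uncertain parameter in \({\mathbf {b}}\) can affect several measurements at once.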

The classic least-squares minimization method determines the estimate

$$\begin{aligned} {\mathbf {x}}_\text {bp} = \mathop {\text {argmin}}\limits _{{\mathbf {x}}}\left[ \Psi _{\mathbf {y}}({\mathbf {x}}) \right] \end{aligned}$$
(35)

for the best possible fit of the data for model \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) by finding the minimum of the cost function

$$\begin{aligned} \Psi _{\mathbf {y}}({\mathbf {x}}) = \frac{1}{2} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}};{\mathbf {b}}))^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}};{\mathbf {b}})). \end{aligned}$$
(36)

In the frequent situation where \({\mathbf {S}}_{\varvec{\epsilon }}\) is diagonal, \(\Psi _{\mathbf {y}}({\mathbf {x}})\) is half the sum of squared “observed–predicted” residuals, each divided by the variance of the corresponding measurement. Assuming, for the moment, that \(\Psi _{\mathbf {y}}({\mathbf {x}})\) is convex, one can use the iterative Gauss-Newton algorithm:

$$\begin{aligned} {\mathbf {x}}_{i+1} = {\mathbf {x}}_{i} + {\mathbf {S}}_{\mathbf {x}}{\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}}_i;{\mathbf {b}})), \end{aligned}$$
(37)

where \({\mathbf {J}}= \partial {\mathbf {F}}/\partial {\mathbf {x}}\) is the usual \(n \times m\) Jacobian matrix, and

$$\begin{aligned} {\mathbf {S}}_{\mathbf {x}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}\right) ^{-1}, \end{aligned}$$
(38)

using an arbitrary starting position \({\mathbf {x}}_0\); this method converges in one step if \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) is linear in \({\mathbf {x}}\). Iteration is stopped when the non-dimensional weighted sum of residuals in (36), often denoted as \(\chi ^2\)/2, is \(\approx n\)/2.
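The iteration in (37)–(38) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the names `gauss_newton`, `F`, `jac` and the test matrix `K` are hypothetical, and a fixed iteration count is assumed here in place of the \(\chi ^2\)-based stopping rule described above. The toy forward model is linear, so the example also checks the one-step convergence property.

```python
import numpy as np


def gauss_newton(F, jac, y, S_eps, x0, n_iter=10):
    """Iterate (37) with S_x from (38); return the final state and its S_x."""
    S_eps_inv = np.linalg.inv(S_eps)
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        J = jac(x)
        S_x = np.linalg.inv(J.T @ S_eps_inv @ J)        # (38)
        x = x + S_x @ J.T @ S_eps_inv @ (y - F(x))      # (37)
    J = jac(x)
    S_x = np.linalg.inv(J.T @ S_eps_inv @ J)            # re-evaluated at the final state
    return x, S_x


# Hypothetical linear forward model F(x) = K x with n = 3, m = 2.
K = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.1]])
x_true = np.array([2.0, -1.0])
y = K @ x_true                       # noise-free synthetic data
S_eps = 0.05**2 * np.eye(3)

x_one_step, S_x = gauss_newton(lambda x: K @ x, lambda x: K, y, S_eps,
                               x0=[10.0, 10.0], n_iter=1)
print(x_one_step)                    # agrees with x_true up to round-off
print(np.sqrt(np.diag(S_x)))         # predicted retrieval errors, cf. (39)
```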

Note that \({\mathbf {x}}_\text {bp}\) from (35) will not be identical to the true state vector in (31) for at least two basic reasons, starting with the least serious:

  • First, the data \({\mathbf {y}}\) are collected with a specific realization of the random instrument noise in \(\varvec{\epsilon }\), i.e., the component quantified by \({\mathbf {S}}_{\mathbf {y}}\) in (34). Averaging over different measurements of \({\mathbf {y}}\) will reduce the noise level (by the square-root of the number of independent measurements) but not eliminate it. However, the distance between the true and estimated state vectors is likely to be on the order of the square-root of the diagonal elements of \({\mathbf {S}}_{\mathbf {x}}\) in (38), i.e., the predicted retrieval error for each element of \({\mathbf {x}}\):

    $$\begin{aligned} \text {StDev}[x_j] = \sqrt{S_{{\mathbf {x}}jj}} \quad (j = 1,\dots ,m). \end{aligned}$$
    (39)
  • Moreover, most interesting problems, in remote sensing in particular, result in non-convex cost functions in (36), simply because \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) is non-linear. Hence there is no guarantee that the cost function has a single minimum for the iterative Gauss-Newton steps to find. So, even assuming that the true state is near the global minimum of \(\Psi _{\mathbf {y}}({\mathbf {x}})\), depending on the initial guess \({\mathbf {x}}_0\) for the state vector, the iterative search could end in a local minimum that is generally not within the distance in (39) of the true state. A well-known mitigation strategy for the non-convexity problem is to use the Levenberg-Marquardt minimization algorithm (Marquardt 1963; Bevington and Robinson 1992), where \({\mathbf {S}}_{\mathbf {x}}^{-1} = {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}\) in (38) is replaced by a weighted sum of itself and its diagonal elements. The weight starts high on the diagonal matrix, which makes the iteration behave like a straightforward gradient descent that can work for non-convex function minimization (away from inflection points). As the weight shifts away from the diagonal, the method reverts to Gauss-Newton steps as the correct minimum is approached in a state-space region where \(\Psi _{\mathbf {y}}({\mathbf {x}})\) is locally convex.
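The Levenberg-Marquardt strategy in the second item can be sketched as follows. This is a simplified variant under stated assumptions, not a reference implementation: the damping weight `lam`, its initial value, the factor-of-10 update rule, and the exponential-decay toy model are all hypothetical choices made for illustration.

```python
import numpy as np


def cost(r, S_eps_inv):
    """Cost function (36) evaluated for the residual vector r = y - F(x)."""
    return 0.5 * r @ S_eps_inv @ r


def levenberg_marquardt(F, jac, y, S_eps, x0, lam=10.0, n_iter=50):
    """Damped Gauss-Newton: J^T S^-1 J is replaced by itself plus lam times its diagonal."""
    S_eps_inv = np.linalg.inv(S_eps)
    x = np.array(x0, dtype=float)
    r = y - F(x)
    costs = [cost(r, S_eps_inv)]
    for _ in range(n_iter):
        J = jac(x)
        H = J.T @ S_eps_inv @ J
        g = J.T @ S_eps_inv @ r
        step = np.linalg.solve(H + lam * np.diag(np.diag(H)), g)
        r_new = y - F(x + step)
        c_new = cost(r_new, S_eps_inv)
        if c_new < costs[-1]:        # accept the step, move toward Gauss-Newton
            x, r = x + step, r_new
            costs.append(c_new)
            lam /= 10.0
        else:                        # reject the step, move toward gradient descent
            lam *= 10.0
    return x, costs


# Hypothetical non-linear model: F(x) = x[0] * exp(-x[1] * t).
t = np.linspace(0.0, 4.0, 9)
F = lambda x: x[0] * np.exp(-x[1] * t)
jac = lambda x: np.column_stack([np.exp(-x[1] * t),
                                 -x[0] * t * np.exp(-x[1] * t)])
y = F(np.array([2.0, 0.5]))          # noise-free synthetic data
S_eps = 0.01**2 * np.eye(t.size)

x_lm, costs = levenberg_marquardt(F, jac, y, S_eps, x0=[1.0, 1.0])
print(x_lm, costs[0], costs[-1])
```

By construction, a step is kept only if it lowers the cost in (36), so the recorded cost history never increases; this says nothing, however, about whether the minimum reached is the global one.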

Finally, the whole inverse problem at hand can be ill-posed, that is, \({\mathbf {S}}_{\mathbf {x}}^{-1}\) can be nearly singular, thus making its inversion in (38) highly unstable. The trajectory of iterative minimization in (37) will then be very sensitive to small perturbations of \({\mathbf {y}}\), e.g., instrumental noise. We now introduce a solution to this last issue since in remote sensing there is often much redundancy in the observations with respect to the unknowns in the retrieval.

1.2 Optimal Estimation

Rodgers’ (2000) theory of optimal estimation (OE) revisits the above inverse problem of parameter estimation in the forward signal model from a Bayesian perspective, as a means for dealing with chronic ill-posedness by introducing regularization. Bayes’ theorem relates two conditional probabilities and two unconditional counterparts:

$$\begin{aligned} P({\mathbf {x}}|{\mathbf {y}}) = P({\mathbf {y}}|{\mathbf {x}}) P({\mathbf {x}}) / P({\mathbf {y}}), \end{aligned}$$
(40)

where the denominator \(P({\mathbf {y}})\) is just a normalization factor of no particular interest here. \(P({\mathbf {x}})\) is the “prior” or “a priori” probability, while \(P({\mathbf {x}}|{\mathbf {y}})\) is the “posterior” probability.

Letting \(|\cdot |\) denote the determinant of a square matrix, and assuming Gaussian distributions, we have

$$\begin{aligned} \log P({\mathbf {x}}|{\mathbf {y}}) = -\frac{1}{2}\left[ ({\mathbf {x}}-\hat{{\mathbf {x}}})^\mathrm{T}{\mathbf {S}}_{\mathbf {x}}^{-1} ({\mathbf {x}}-\hat{{\mathbf {x}}}) + \log |{\mathbf {S}}_{\mathbf {x}}| + m\log (2\pi )\right] \end{aligned}$$
(41)

for the posterior probability of atmospheric state properties \({\mathbf {x}}\), given data \({\mathbf {y}}\). Also, we note that \(\hat{{\mathbf {x}}}\) is a new estimate of the prevailing state vector. Both \(\hat{{\mathbf {x}}}\) and \({\mathbf {S}}_{\mathbf {x}}\) will depend on \({\mathbf {y}}\), \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\), and related quantities, as shown below. In addition, we have

$$\begin{aligned} \log P({\mathbf {x}}) = -\frac{1}{2}\left[ ({\mathbf {x}}-{\mathbf {x}}_\text {a})^\mathrm{T}{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}-{\mathbf {x}}_\text {a}) + \log |{\mathbf {S}}_\text {a}| + m\log (2\pi ) \right] \end{aligned}$$
(42)

for the a priori probability of atmospheric state, i.e., what we know about it before any observations are made. Finally, we have

$$\begin{aligned} \log P({\mathbf {y}}|{\mathbf {x}}) = -\Psi _{\mathbf {y}}({\mathbf {x}}) - \frac{1}{2} \log \left[ (2\pi )^n|{\mathbf {S}}_{\varvec{\epsilon }}|\right] \end{aligned}$$
(43)

for the probability of obtaining the specific observations \({\mathbf {y}}\), given the atmospheric state \({\mathbf {x}}\), i.e., probability of seeing the random residuals used in (36) to estimate the cost function \(\Psi _{\mathbf {y}}({\mathbf {x}})\).

One then defines \(\hat{{\mathbf {x}}}\) as the state with maximum posterior probability, i.e., the state that maximizes \(P({\mathbf {y}}|{\mathbf {x}})P({\mathbf {x}})\), a product of two Gaussian PDFs. In other words,

$$\begin{aligned} \hat{{\mathbf {x}}} = \mathop {\text {argmin}}\limits _{{\mathbf {x}}}\left[ \Psi _{\mathbf {y}}({\mathbf {x}}) + \frac{1}{2}({\mathbf {x}}-{\mathbf {x}}_\text {a})^\mathrm{T}{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}-{\mathbf {x}}_\text {a}) \right] , \end{aligned}$$
(44)

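instead of (35). The Gauss-Newton algorithm in (37) still applies, provided the term \({\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} ({\mathbf {y}}-{\mathbf {F}}({\mathbf {x}}_i;{\mathbf {b}}))\) is augmented by the prior contribution \(-{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}_i-{\mathbf {x}}_\text {a})\) (Rodgers 2000), but, rather than (38), we now have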

$$\begin{aligned} {\mathbf {S}}_{\mathbf {x}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1}. \end{aligned}$$
(45)

Although there may be better choices, one can always start the iterations with \({\mathbf {x}}_0 = {\mathbf {x}}_\text {a}\). As in (39), the diagonal elements \(S_{{\mathbf {x}}jj}\) of \({\mathbf {S}}_{\mathbf {x}}\) are the posterior estimates of variance on the retrieved properties in \(\hat{{\mathbf {x}}}\).
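A minimal sketch of the regularized retrieval is given below, assuming the standard Gauss-Newton form from Rodgers (2000), in which the update includes the prior term \(-{\mathbf {S}}_\text {a}^{-1}({\mathbf {x}}_i-{\mathbf {x}}_\text {a})\) so that the iteration converges to the minimizer of (44). The names `oe_retrieval`, `F`, `jac`, the nearly collinear test matrix `K`, and all numbers are hypothetical. For a linear model the result can be checked against the closed-form posterior mean.

```python
import numpy as np


def oe_retrieval(F, jac, y, S_eps, x_a, S_a, n_iter=10):
    """Gauss-Newton iteration for the cost in (44), with S_x from (45)."""
    S_eps_inv = np.linalg.inv(S_eps)
    S_a_inv = np.linalg.inv(S_a)
    x = np.array(x_a, dtype=float)                 # start at the prior, x_0 = x_a
    for _ in range(n_iter):
        J = jac(x)
        S_x = np.linalg.inv(J.T @ S_eps_inv @ J + S_a_inv)             # (45)
        grad = J.T @ S_eps_inv @ (y - F(x)) - S_a_inv @ (x - x_a)
        x = x + S_x @ grad
    J = jac(x)
    S_x = np.linalg.inv(J.T @ S_eps_inv @ J + S_a_inv)                 # at the final state
    return x, S_x


# Hypothetical ill-conditioned linear problem: nearly collinear columns in K.
K = np.array([[1.0, 1.000],
              [1.0, 1.001],
              [1.0, 0.999]])
y = K @ np.array([1.0, 0.5])                       # noise-free synthetic data
S_eps = 0.01**2 * np.eye(3)
x_a = np.zeros(2)
S_a = np.eye(2)

x_hat, S_x_post = oe_retrieval(lambda x: K @ x, lambda x: K, y, S_eps, x_a, S_a)

# Closed-form posterior mean for a linear forward model, for comparison.
S_eps_inv = np.linalg.inv(S_eps)
x_closed = x_a + S_x_post @ K.T @ S_eps_inv @ (y - K @ x_a)
print(x_hat, x_closed)
print(np.diag(S_x_post), np.diag(S_a))             # posterior vs. prior variances
```

In this toy case the unregularized matrix in (38) is close to singular, whereas adding \({\mathbf {S}}_\text {a}^{-1}\) keeps the inversion in (45) well-behaved, and the posterior variances cannot exceed the prior ones.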

1.3 Degrees of Freedom

At any rate, and contrary to (38), the matrix inversion problem in (45) is by design well-posed, thanks to the presence of \({\mathbf {S}}_\text {a}^{-1}\). However, convergence goes to \(\hat{{\mathbf {x}}}\) rather than \({\mathbf {x}}_\text {bp}\) in (35).

It is therefore of interest to evaluate the \(m \times m\) matrix

$$\begin{aligned} {\mathbf {A}}= \left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}+ {\mathbf {S}}_\text {a}^{-1} \right) ^{-1} {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}, \end{aligned}$$
(46)

i.e., the left-hand product of (45) with the potentially ill-conditioned matrix to be inverted in (38). Recall that these expressions are the predicted uncertainties on the retrieved state properties with and without regularization, respectively; in the latter case, however, the potentially unstable matrix inversion is not performed. A little algebra leads to the simpler expression

$$\begin{aligned} {\mathbf {A}}= {\mathbf {I}}_m - {\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}, \end{aligned}$$
(47)

where \({\mathbf {I}}_m\) is the \(m \times m\) identity matrix. Now, in OE theory, \({\mathbf {A}}\) is known as the “averaging kernel,” and Rodgers (2000) shows that it can be defined simply as \(\partial \hat{{\mathbf {x}}}/\partial {\mathbf {x}}\), where \({\mathbf {x}}\) is the true state parameter.
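The “little algebra” leading from (46) to (47) can be spelled out as follows; it uses only (45) and (46) as given. Since (45) implies \({\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}= {\mathbf {S}}_{\mathbf {x}}^{-1} - {\mathbf {S}}_\text {a}^{-1}\), substitution into (46) gives

$$\begin{aligned} {\mathbf {A}}= {\mathbf {S}}_{\mathbf {x}}\left( {\mathbf {J}}^\mathrm{T}{\mathbf {S}}_{\varvec{\epsilon }}^{-1} {\mathbf {J}}\right) = {\mathbf {S}}_{\mathbf {x}}\left( {\mathbf {S}}_{\mathbf {x}}^{-1} - {\mathbf {S}}_\text {a}^{-1} \right) = {\mathbf {I}}_m - {\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}. \end{aligned}$$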

If \({\mathbf {S}}_\text {a}\) is diagonal (no prior covariances), then the diagonal terms of \({\mathbf {A}}\) are

$$\begin{aligned} A_{jj} = 1 - \frac{S_{{\mathbf {x}}jj}}{S_{\text {a}jj}} \end{aligned}$$
(48)

for \(j = 1,\dots ,m\). Since we can anticipate that \(0 \le S_{{\mathbf {x}}jj} \le S_{\text {a}jj}\), we have \(A_{jj} \in [0,1]\). \(A_{jj}\) is known as the partial “degree of freedom” (DOF) for the retrieved property \(x_j\). It is directly related to the predicted retrieval error

$$\begin{aligned} \text {StDev}[x_j] = \sqrt{S_{\text {a}jj}\times (1-A_{jj})} = \text {StDev}_\text {a}[x_j] \sqrt{1-A_{jj}}, \end{aligned}$$
(49)

recalling that \({\mathbf {S}}_\text {a}\) is taken as diagonal.
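The relations (46)–(49) can be checked numerically with a short sketch such as the one below. The function name `averaging_kernel`, the Jacobian `J`, and the covariance values are hypothetical and serve only as an example with a diagonal prior.

```python
import numpy as np


def averaging_kernel(J, S_eps, S_a):
    """Return A computed from (46) and from (47), together with S_x from (45)."""
    H = J.T @ np.linalg.inv(S_eps) @ J
    S_a_inv = np.linalg.inv(S_a)
    S_x = np.linalg.inv(H + S_a_inv)
    A_46 = S_x @ H
    A_47 = np.eye(S_a.shape[0]) - S_x @ S_a_inv
    return A_46, A_47, S_x


# Hypothetical Jacobian (n = 3, m = 2), noise level, and diagonal prior.
J = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.1]])
S_eps = 0.05**2 * np.eye(3)
S_a = np.diag([1.0, 4.0])

A_46, A_47, S_x = averaging_kernel(J, S_eps, S_a)
partial_dof = np.diag(A_47)                                # A_jj as in (48)
stdev_49 = np.sqrt(np.diag(S_a) * (1.0 - partial_dof))     # (49)
print(partial_dof)
print(stdev_49, np.sqrt(np.diag(S_x)))                     # the two should agree
```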

Following the methodology of Merlin et al. (2016), we use \(A_{jj}\) and StDev[\(x_j\)] from (48)–(49) extensively in the main body of this article. The former is an intuitive non-dimensional metric of the value added by the observations projected onto a specific state variable. If \(S_{{\mathbf {x}}jj} \ll S_{\text {a}jj}\), then \(A_{jj}\) approaches unity, which means that the observations have vastly improved our knowledge of the state variable \(x_j\). If, on the contrary, \(S_{{\mathbf {x}}jj} \lesssim S_{\text {a}jj}\), then \(A_{jj}\) approaches zero, which means that the observations have not helped very much.

Figure 7 illustrates for \(m = 2\) a typical concentration of probability measure (IC increase) in a familiar state space when going from the prior to posterior PDFs for \({\mathbf {x}}\). In both cases, we have traced, assuming Gaussian PDFs, the lines of iso-probability value at \(1/\left( 2\pi \sqrt{\mathrm{e}|{\mathbf {S}}|}\right) \), equivalently, where \(({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}])^\mathrm{T}{\mathbf {S}}^{-1}({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}]) = 1\), for easy visualization of the magnitudes of prior and posterior standard deviations. The area of each ellipse is \(\propto \sqrt{|{\mathbf {S}}|}\). More specifically, for \(m = 2\) the regions inside the ellipses each account for \(1-\mathrm{e}^{-1/2} \approx 39\%\) of the prior and posterior events, respectively (the familiar 68% applies to the \(\pm 1\) standard deviation interval in one dimension). Note that, while \({\mathbf {S}}_\text {a}\) is diagonal (no covariance), \({\mathbf {S}}_{\mathbf {x}}\) is generally not.

Fig. 7 Schematic representation of an OE for \(m = 2\) atmospheric state parameters relevant to the investigation presented in the main body. Ellipses are iso-likelihood contours where we have \(({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}])^\mathrm{T}{\mathbf {S}}^{-1}({\mathbf {x}}-\mathrm{E}[{\mathbf {x}}]) = 1\), and these curves are traced respectively for the prior (grey) and posterior (black)

A better-known quantification of overall retrieval performance in OE theory is

$$\begin{aligned} \text {Tr}[{\mathbf {A}}] = \sum \limits _{j=1}^m A_{jj} \in [0,m], \end{aligned}$$
(50)

which is the (implicitly, total) number of Degrees of Freedom. It approximates the maximum number of state properties (among m) that can be retrieved from the n observations in \({\mathbf {y}}\) in view of: (i) the instrumental noise (\({\mathbf {S}}_{\mathbf {y}}\)); (ii) forward model error (\({\mathbf {S}}_{\mathbf {F}}\)), in this case, from uncertainty in non-retrieved parameters (\({\mathbf {S}}_{\mathbf {b}}\)); (iii) sensitivities of the forward model to the state variables to be retrieved (\({\mathbf {J}}\)), or not (\({\mathbf {J}}_{\mathbf {b}}\)); and (iv) prior information about the atmospheric state (\({\mathbf {S}}_\text {a}\)).
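To illustrate how the total DOF in (50) responds to item (i), the sketch below evaluates \(\text {Tr}[{\mathbf {A}}]\) for several noise levels. It assumes, for simplicity, \({\mathbf {S}}_{\varvec{\epsilon }} = \sigma ^2{\mathbf {I}}_n\) (no forward model error); the function name `total_dof`, the Jacobian, and the list of \(\sigma \) values are hypothetical.

```python
import numpy as np


def total_dof(J, sigma, S_a):
    """Tr[A] from (46) and (50), assuming S_eps = sigma^2 I (an assumption of this sketch)."""
    H = J.T @ J / sigma**2
    A = np.linalg.solve(H + np.linalg.inv(S_a), H)
    return np.trace(A)


# Hypothetical Jacobian (n = 3, m = 2) and unit prior variances.
J = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.1]])
S_a = np.eye(2)

sigmas = [10.0, 1.0, 0.1, 0.01]          # from very noisy to very precise
dofs = [total_dof(J, s, S_a) for s in sigmas]
for s, d in zip(sigmas, dofs):
    print(f"sigma = {s:6.2f}  ->  DOF = {d:.4f}")
```

As the noise decreases, the total DOF rises from near 0 (the prior dominates) toward \(m = 2\) (the observations dominate).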

The off-diagonal terms in \({\mathbf {A}}\) are non-dimensional quantities of the type \(\mathrm{E}\left[ (x_1-\hat{x}_1)(x_2-\hat{x}_2) \right] \) divided by a diagonal element of \({\mathbf {S}}_\text {a}\), again assuming it is diagonal. These normalized cross-correlations describe how different state variables interfere (statistically) with one another, pair by pair. Keeping these numbers small results in a better-conditioned (hence, more easily inverted) matrix \({\mathbf {S}}_{\mathbf {x}}^{-1}\). That, in turn, will help the performance of the retrieval algorithm. In the design phase of the observation system, the forward model \({\mathbf {F}}({\mathbf {x}};{\mathbf {b}})\) can thus be used to optimize the sampling of \({\mathbf {y}}\) in a way that keeps cross-variable interference as small as possible.

1.4 Entropy and Information Content

Lastly, we relate DOFs to information theoretical concepts, and thus justify our claim all along that we are quantifying Information Content per se. Following Shannon (1948), Rodgers (1998) defines the increase in information (equivalently, decrease in entropy) associated with the acquisition of observations \({\mathbf {y}}\) and their processing—by, e.g., OE methods—as

$$\begin{aligned} -\Delta H = -\frac{\log _2(|{\mathbf {S}}_{\mathbf {x}}|)-\log _2(|{\mathbf {S}}_\text {a}|)}{2} = -\frac{1}{2}\log _2(|{\mathbf {S}}_{\mathbf {x}}{\mathbf {S}}_\text {a}^{-1}|) = -\frac{1}{2}\log _2(|{\mathbf {I}}_m - {\mathbf {A}}|), \end{aligned}$$
(51)

when expressed in bits. Information gain \(-\Delta H\) ranges from 0\(^+\) (\(|{\mathbf {S}}_{\mathbf {x}}| \lesssim |{\mathbf {S}}_\text {a}|\)) to \(\infty \) (\(|{\mathbf {S}}_{\mathbf {x}}| \ll |{\mathbf {S}}_\text {a}|\)). This follows directly from the expression for the entropy of a generic m-dimensional Gaussian probability density function, \(P_m(\mathrm{E}[{\mathbf {x}}],{\mathbf {S}})\): from, e.g., (42), we have

$$\begin{aligned} H(m,|{\mathbf {S}}|) = -\mathrm{E}[\log P_m(\mathrm{E}[{\mathbf {x}}],{\mathbf {S}})] = \frac{1}{2}\log ((\mathrm{e}2\pi )^m|{\mathbf {S}}|), \end{aligned}$$
(52)

which is naturally independent of the mean \(\mathrm{E}[{\mathbf {x}}]\). To visualize this IC increase (entropy decrease), note that the area of each ellipse in Fig. 7 scales as \(\sqrt{|{\mathbf {S}}|}\), so that \(-\Delta H\) in (51) is the base-2 log of the ratio of the areas of the prior and posterior ellipses.
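The information gain in (51) can be evaluated with log-determinants, as in the sketch below. The function name `information_gain_bits` and the two covariance matrices are hypothetical; `numpy.linalg.slogdet` is used rather than the determinant itself to avoid overflow or underflow for larger matrices.

```python
import numpy as np


def information_gain_bits(S_x, S_a):
    """-Delta H from (51), in bits, from the log-determinants of S_x and S_a."""
    _, logdet_x = np.linalg.slogdet(S_x)
    _, logdet_a = np.linalg.slogdet(S_a)
    return -0.5 * (logdet_x - logdet_a) / np.log(2.0)


# Hypothetical prior and posterior covariances for m = 2.
S_a = np.diag([4.0, 4.0])
S_x = np.diag([1.0, 1.0])

gain = information_gain_bits(S_x, S_a)

# Same quantity from the averaging kernel, using A = I - S_x S_a^-1 as in (47).
A = np.eye(2) - S_x @ np.linalg.inv(S_a)
gain_from_A = -0.5 * np.log2(np.linalg.det(np.eye(2) - A))
print(gain, gain_from_A)
```

Here each standard deviation is halved by the observations, which corresponds to one bit of information per state variable.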

Abbreviations & Acronyms

 

1D: one-dimensional

AOT: aerosol optical thickness

BC: boundary condition

BRDF: Bi-directional Reflection Distribution Function

BRF: Bi-directional Reflection Factor

DOAS: Differential Optical Absorption Spectroscopy

DOF: Degree(s) Of Freedom

FWHM: Full Width at Half Maximum

GSFC: Goddard Space Flight Center

IC: Information Content

IR: infra-red

JPL: Jet Propulsion Laboratory

MAIA: Multi-Angle Imager for Aerosols

MAS: Multi-Angle Sensor

NASA: National Aeronautics and Space Administration

OE: Optimal Estimation

OCI: Ocean Color Imager

PACE: Plankton, Aerosol, Cloud, and ocean Ecosystem

RT: radiative transfer

SNR: signal-to-noise ratio

SSA: single scattering albedo

StDev: Standard Deviation

SZA: solar zenith angle

TOA: Top-of-Atmosphere

UV: ultra-violet

VNIR: visible/near-infrared

VIS: visible

VZA: viewing zenith angle

 


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Davis, A.B., Kalashnikova, O.V. (2019). Aerosol Layer Height over Water via Oxygen A-Band Observations from Space: A Tutorial. In: Kokhanovsky, A. (eds) Springer Series in Light Scattering. Springer Series in Light Scattering. Springer, Cham. https://doi.org/10.1007/978-3-030-03445-0_4
