Probabilistic reasoning about measurements of equilibrium climate sensitivity: combining disparate lines of evidence
Where policy and science intersect, there are always issues of ambiguous and conflicting lines of evidence. Combining disparate information sources is mathematically complex; common heuristics based on simple statistical models easily lead us astray. Here, we use Bayesian Nets (BNs) to illustrate the complexity in reasoning under uncertainty. Data from joint research at Resources for the Future and NASA Langley are used to populate a BN for predicting equilibrium climate sensitivity (ECS). The information sources consist of measuring the rate of decadal temperature rise (DTR) and measuring the rate of percentage change in cloud radiative forcing (CRF), with both the existing configuration of satellites and with a proposed enhanced measuring system. The goal of all measurements is to reduce uncertainty in equilibrium climate sensitivity. Subtle aspects of probabilistic reasoning with concordant and discordant measurements are illustrated. Relative to the current prior distribution on ECS, we show that after 30 years of observing with the current systems, the 2σ uncertainty band for ECS would be shrunk on average to 73% of its current value. With the enhanced systems over the same time, it would be shrunk to 32% of its current value. The actual shrinkage depends on the values actually observed. These results are based on models recommended by the Social Cost of Carbon methodology and assume a Business as Usual emissions path.
Confronted with unwelcome scientific advice, interested parties may seek out, or in some cases even generate, conflicting scientific views to neutralize the unwelcome impact (Oreskes and Conway 2010). Lacking the ability to evaluate the advice, public media striving for balance can unwittingly promote the idea that conflicting advice can simply be ignored. Behind this perspective is a lack of understanding among the general public about the role of disagreement in science. In addition, there is a defective understanding, rooted in the classical statistical methods which most scientific researchers are taught, of how multiple lines of evidence should be combined, as elaborated in Section 2.
The authors’ recent uncertainty decomposition of current and enhanced measurements for equilibrium climate sensitivity (Cooke et al. 2013, 2015, 2016) provides a basis for exploring the effects of conflicting measurements. The future measurement values invoked for this purpose are of course hypothetical but the effects are obtained by conditionalizing a vetted joint distribution for equilibrium climate sensitivity (ECS), the rate of decadal temperature rise (DTR) and rate of change of cloud radiative forcing (CRF) as measured by current and future enhanced observing systems. This analysis profits from the fact that a prior distribution over equilibrium climate sensitivity and theoretical models connecting ECS with DTR and CRF are provided by the US inter-agency memo on the social cost of carbon (IWGSCC 2009, 2013). This enables a fully Bayesian analysis of these complex interlocking measurement platforms which brings many surprising features to light. Relative to the current prior distribution on ECS, we show that after 30 years of observing with the current systems, the 2σ uncertainty band for ECS would be shrunk on average to 73% of its current value. With the enhanced systems over the same time, it would be shrunk to 32% of its current value. The actual shrinkage depends on the values actually observed. These results are conditional on the current understanding of the uncertainty of ECS (IPCC 2013; IWGSCC 2009), as well as recent scientific advances in the decomposition of cloud feedbacks, which dominate the uncertainty of ECS, into their individual observable components (Soden et al. 2008; Zhou et al. 2015). To be clear, uncertainty in emissions (including aerosols) and the effect of “slow feedbacks” outside the current SCC paradigm are not taken into account. Whereas this paper focuses on probabilistic interpretations of measurements and their overall impact on ECS uncertainty, another paper (Hanea et al. 2018) uses this model to explore counter-intuitive results more generally. The choice to use climate sensitivity for this example is based on the lack of progress in reducing uncertainty in climate sensitivity in the last 30 years of research (IPCC 2013).
The remainder of this paper is organized as follows: Section 2 reviews combining evidence from the popular point of view and from the simple classical error models. Section 3 describes the current and enhanced measurement platform forming the basis for this analysis. Section 4 illustrates how conflicting measurements can be almost as informative as concordant measurements. Section 5 treats overall uncertainty from different combinations of measurement platforms. A final section gathers conclusions. An appendix provides a mathematical background for the results in Section 4.
2 Simple intuitions on combining measurements
Suppose we have one measurement platform for ECS whose sources of error are known and are unbiased. When this platform returns a value for ECS, then the true value for ECS may be either higher or lower according to how the measurement is deflected by its noise. Unable to know the deflection, we intuitively focus on the measured value and ignore the uncertainty. Confronted with the results of two independent measurements, our intuitions are less clear. If the two measurements agree, we tend to see confirmation and feel more confident in the common result. If they strongly disagree, the effect is often to temporize and await more evidence. Such slow deliberative thinking (Kahneman 2011) is often praised as cautious, in contrast to precipitously acting on impulse. However, in cases where decisions cannot be postponed, we need probabilistic thinking. In the simple statistical error model which most practitioners have learned, the measurements would be modeled as perturbed by independent identically distributed additive error terms. The estimate minimizing mean square error is the mean of the observations and the variance of the estimate is the variance of a single error term divided by the number of observations, regardless of whether the measurements are concordant or discordant.
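The indifference of the simple error model to agreement between readings can be made concrete with a minimal sketch. The function name and the ECS readings below are hypothetical and serve only as illustration; the error standard deviation is assumed known and common to both measurements.

```python
from statistics import mean

def simple_error_estimate(observations, error_sd):
    """Estimate and its standard deviation under i.i.d. additive errors.

    The estimate is the sample mean; its variance is the single-error
    variance divided by the number of observations.
    """
    n = len(observations)
    estimate = mean(observations)
    estimate_sd = error_sd / n ** 0.5
    return estimate, estimate_sd

# Hypothetical ECS readings: one concordant pair, one discordant pair,
# both with the same average.
concordant = simple_error_estimate([2.9, 3.1], error_sd=1.0)
discordant = simple_error_estimate([1.0, 5.0], error_sd=1.0)
print(concordant, discordant)
```

Both pairs yield the same estimate and the same standard deviation of the estimate, because the spread of the readings never enters the calculation.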
The intuition that concordant measurements should confer more confidence than discordant measurements is not attested by the simple error model most practitioners know. There are many other examples illustrated in Section 4. This simple error model cannot account for “negative learning” where we become more uncertain after retrieving a measured value than we were before (Oppenheimer and O’Neill 2008; Hanea et al. 2018). Two measurements may return the same values but with different noise, resulting in different predictions. Two measurements may separately produce the same prediction, yet result in a different prediction when combined. Two strongly conflicting measurements may jointly yield a great deal of information about the unknown quantity.
It is common to attribute such divergence between intuitions and simple error models to a difference between classical and Bayesian approaches. Indeed, the features mentioned in the previous paragraph can be ascribed to the interaction between measurement error and a prior distribution on the variable of interest. Bayesian nets are used to illustrate the complexities of combining measurements. However, the appendix shows that the distinction between classical and Bayesian methods is more apparent than real in the contexts of multiple measurement platforms with well-defined error properties: The key idea is that an unknown variable of interest X can be modeled as Z + e where Z is the unknown measured value and e is the error with a known distribution. Upon measuring Z = z, X can be ascribed the distribution of z + e. Subsequent measurements can be seen as updating this “prior.” This ascription cannot be described as probabilistic conditionalization as Z does not have a distribution, but it can be described as “Renyi conditionalization” (Renyi 1970). Alternatively, we can simply compute the conditional error distributions given the observed values and arrive at the same results without ascribing a distribution to X. The two approaches are equivalent. The appendix gives details and provides a simple mathematical model which mimics the results in Section 4 on concordant and discordant measurements.
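How a prior interacts with measurement error can be sketched with a grid-based update for a single reading Z = X + e. This is a toy illustration under stated assumptions, not the model used in this paper: the right-skewed prior is a hypothetical lognormal shape standing in for the ECS prior, the noise is assumed normal with known standard deviation, and all names and values are illustrative.

```python
import math

def posterior_mean_sd(z, noise_sd, grid, prior):
    """Posterior mean and sd of X on a grid, given a reading z = X + e,
    with e ~ Normal(0, noise_sd). `prior` holds unnormalized prior weights."""
    weights = [p * math.exp(-0.5 * ((z - x) / noise_sd) ** 2)
               for x, p in zip(grid, prior)]
    total = sum(weights)
    m = sum(w * x for w, x in zip(weights, grid)) / total
    var = sum(w * (x - m) ** 2 for w, x in zip(weights, grid)) / total
    return m, math.sqrt(var)

# Grid over 0.01 .. 10.0 (hypothetical support for the unknown X).
grid = [0.01 * i for i in range(1, 1001)]
# Hypothetical right-skewed prior (lognormal shape, median 3.0).
prior = [math.exp(-0.5 * ((math.log(x) - math.log(3.0)) / 0.4) ** 2) / x
         for x in grid]

low = posterior_mean_sd(2.0, 1.0, grid, prior)   # low reading
high = posterior_mean_sd(5.0, 1.0, grid, prior)  # high reading
print(low, high)
```

With the same instrument noise, the two readings leave posteriors of different width. Under a normal prior the posterior standard deviation would not depend on the reading at all, which is why the simple error model cannot reproduce this behavior.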
Neither the simple error model nor our simple intuitions can do justice to the complexities of probabilistic inference with multiple lines of evidence. Real examples combined with graphical software tools for probabilistic inference can help to hone our intuitions.
3 Measuring equilibrium climate sensitivity
An enhanced Earth Observing System (EOS) component CLARREO (Climate Absolute Radiance and Refractivity Observatory, Wielicki et al. 2013) uses better calibration than existing systems to observe trends in the decadal rate of global surface temperature rise, and decadal percentage changes in CRF. This is compared with existing systems: For global temperature rise, these are weather satellite infrared spectrometers IASI (Infrared Atmospheric Sounder Interferometer, Hilton et al. 2012), AIRS (Atmospheric Infrared Sounder, Aumann et al. 2003), and CrIS (Cross Track Infrared Sounder, Strow et al. 2013), abbreviated as IAC. They look at about 1/3 of the Earth’s emitted infrared radiation where CO2 and H2O absorb radiation in varying levels (low absorption levels see to the surface, high absorption levels see to 20-km altitude, others see to intermediate depths). They use this to gauge temperature and water vapor vertical profiles from the surface to about 20-km altitude. For reflected shortwave CRF, the existing system is the CERES system, a broadband radiation budget instrument. It measures total reflected solar energy as a single value, and total emitted thermal infrared energy as a second value (Wielicki et al. 1996). All references to CRF in this paper are to reflected shortwave CRF. Note that cloud radiative forcing or CRF is also often referred to as cloud radiative effect or CRE.
There is no correlation between the uncertainties of the CERES and IAC instruments. There is almost no common technology, and their calibration issues are very different. The enhanced EOS CRF employs a reflected solar spectrometer with a large 2D detector array (512 by 512 detectors) that uses scans of the sun, moon, and nearby deep space to do calibration and international standards (SI) traceability. It shares no types of components with the IR spectrometer and uses a 2-axis gimbal to point the entire instrument so that the exact same optics path is used for solar, lunar, and earth viewing observations. The IR spectrometer that measures temperature change is an interferometer that uses deep cavity blackbodies (0.9998 emissivity where 1.0 is perfect), three different temperature phase change cells to calibrate temperature of the blackbody to SI standards, a blackbody emissivity monitor, and varies its blackbody temperatures for calibration from 200 to 320 K. The physics of how such instruments would change in orbit has no common element; even the electronics of these instruments are very different.
The cloud feedback uncertainty in climate models is dominated by low clouds (IPCC 2013). The effect of low clouds on the climate system is in turn dominated by cloud-driven changes in Earth’s reflected solar radiation to space which is typically measured by global mean reflected shortwave CRF (Soden et al. 2008; Zhou et al. 2015). Interannual and decadal changes in shortwave CRF have been shown by climate models to be the key measure of cloud feedback (Soden et al. 2008; Dessler 2010; Zhou et al. 2015, 2016; Zelinka et al. 2017). While we use the simpler framework of Soden et al. (2008) for this demonstration example, newer results have shown that spatial patterns of changes in shortwave CRF can be used to reduce the noise of short-term interannual variability in extracting long-term low cloud feedbacks (Zhou et al. 2016). It has also been shown that use of 500 hPa temperature change in the place of surface temperature change may also reduce the effects of natural variability or short-term climate change (Dessler et al. 2018; Dessler and Forster 2018). In the future, both the spatial pattern effects and 500-hPa temperature changes could be incorporated into the framework presented in this paper. Additional climate feedbacks could also be added. We focus on a simpler framework to provide examples of how information from multiple lines of evidence interact.
Global temperature and change in CRF are observed against the background of natural variability, whose effects on trend measurements are attenuated by longer observational times. When perturbed by natural variability (var), orbit sampling uncertainty (orbit), and instrument calibration drift (cal) over observation time t, the variance σ² of the trend estimate is derived from Leroy et al. (2008):
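Written in the symbols of this section, and assuming the trend-variance form of Leroy et al. (2008) with one term per noise source, the expression can be stated as:

```latex
\sigma^2 = \frac{12}{t^{3}}\left(\sigma^2_{var}\,\tau_{var} + \sigma^2_{cal}\,\tau_{cal} + \sigma^2_{orbit}\,\tau_{orbit}\right)
```

Each term multiplies a one-period variance (physical units squared) by a characteristic time (years) and divides by the cube of the observation time, so the result has units of physical units squared per time squared.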
The units of the one-period variance components σ²_var, σ²_cal, and σ²_orbit are the squares of the physical units being measured. The characteristic times τ_var, τ_cal, and τ_orbit are in years and reflect the serial correlation. Autocorrelation time scales for natural variability are dominated by ENSO (~ 1.5 years), satellite orbit sampling by the averaging time (1 year), and instrument calibration by instrument lifetime, here assumed to be 5 years (Leroy et al. 2008; Wielicki et al. 2013). The units of σ² are thus [physical units squared / time squared]. The effects of noise in observing trends are attenuated by longer observation times. The variance components are considered to be independent normal variables with mean zero (for a detailed discussion see Cooke et al. 2013). The statistical formulation is based on an AR(1) process that accounts for short-term climate variability such as ENSO. Longer term climate variability such as Pacific Decadal Oscillation can also be significant and could be included using an AR(2) or other statistical process in future analysis (Brown et al. 2015).
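The attenuation of trend noise with observation time can be sketched numerically. The sketch assumes the Leroy et al. (2008) form σ² = 12 t⁻³ (σ²_var τ_var + σ²_cal τ_cal + σ²_orbit τ_orbit). The characteristic times are those quoted above; the one-period variances are hypothetical placeholders, not values from this study.

```python
import math

def trend_sd(t_years, components):
    """Standard deviation of a linear trend estimate after t_years.

    `components` is a list of (one-period variance, characteristic time
    in years) pairs, one per noise source. The result is in physical
    units per year.
    """
    variance = 12.0 / t_years ** 3 * sum(var * tau for var, tau in components)
    return math.sqrt(variance)

# (variance, tau): natural variability, orbit sampling, calibration drift.
# Variances are hypothetical; taus follow the text (1.5, 1 and 5 years).
components = [(0.0072, 1.5), (0.0010, 1.0), (0.0020, 5.0)]

sd_10 = trend_sd(10, components)
sd_30 = trend_sd(30, components)
print(sd_10, sd_30, sd_10 / sd_30)
```

Because the variance scales as t⁻³, tripling the record length from 10 to 30 years reduces the trend standard deviation by a factor of 3^1.5, roughly 5.2, whatever the component variances are.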
4 Discordant and concordant measurements
Figure 2 shows the Bayesian net for current (pink) and new (yellow) measurements of the rate of decadal temperature rise (Temp) and percentage rate of change in CRF. Each measurement adds its own instrument uncertainties which, according to the above, are independent. Figure 2 depicts the situation after 30 years of observation, before any measured values of Temp or CRF have been entered. The marginal distributions and correlations are input to the BN which then determines the joint distribution by simulation. Means and standard deviations of the individual variables are shown in the boxes with histograms for each variable. The correlations shown by each arc are determined, as above, by the ratios of standard deviations in the observing and observed systems and the observational time. The joint distribution can then be conditionalized on any set of values of any of the variables. Performing a measurement corresponds to learning a unique value for one or more of the pink and yellow variables. Conditionalization propagates this knowledge through the net, thereby reflecting the changes in our uncertainty resulting from the measurement(s).
After 30 years, the trend uncertainty due to natural variability is fairly small and the correlations between ECS and the theoretical trends (green) are high. Greater accuracy with the yellow systems is reflected in higher correlations with the trending variables. The distribution for ECS is the truncated Roe Baker distribution used by IWGSCC. The sign convention for CRF decadal change is that positive change indicates increased downward solar energy into the climate system. Units in the figures for temperature trends are given in K/decade and for CRF are given in % CRF/decade.
4.1 Discordant conclusions from concordant measurement results
4.2 E pluribus unum
The two measurements in Fig. 4 are strongly conflicting when considered in isolation. Indeed, judging by the resultant expected values for ECS, the enhanced measurements are more discordant than those of the current system. Combining the conflicting results is simply a matter of conditionalizing on both pieces of information. Of course, such conflicting results are unlikely, but if they are observed, we should expect that the enhanced CRF measurement is deflected upward by the noise, while the enhanced_Temp measurement is deflected downward. In other words, given conflicting measured values, the expected errors are negatively correlated (see appendix). This effect is stronger for the enhanced than for the current systems.
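The opposite deflections can be illustrated with a toy normal model, which is an assumption made here for simplicity and not the BN of this paper. Two readings z1 = X + e1 and z2 = X + e2 of the same unknown X are combined with a normal prior. The prior mean and standard deviation are taken from the ECS prior quoted in this paper but treated as normal; the readings and noise levels are hypothetical.

```python
def combine_two_readings(z1, z2, sd1, sd2, prior_mean, prior_sd):
    """Normal-normal update on two independent readings of the same X.

    Returns the posterior mean and sd of X, plus the expected error of
    each reading given both values (reading minus posterior mean).
    """
    w0 = 1.0 / prior_sd ** 2   # prior precision
    w1 = 1.0 / sd1 ** 2        # precision of reading 1
    w2 = 1.0 / sd2 ** 2        # precision of reading 2
    post_mean = (w0 * prior_mean + w1 * z1 + w2 * z2) / (w0 + w1 + w2)
    post_sd = (w0 + w1 + w2) ** -0.5
    return post_mean, post_sd, z1 - post_mean, z2 - post_mean

# Hypothetical conflicting readings: one high, one low.
post_mean, post_sd, err_high, err_low = combine_two_readings(
    z1=5.0, z2=1.5, sd1=0.5, sd2=0.5, prior_mean=3.39, prior_sd=1.24)
print(post_mean, post_sd, err_high, err_low)
```

Given both values, the high reading is expected to have been deflected upward and the low reading downward, while the combined posterior is narrower than either instrument's noise alone. In this all-normal toy model the posterior width does not depend on the readings; the value-dependent effects described in the text require the non-normal prior.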
4.3 Ex uno plures
Similar behavior will be obtained with lower than expected measured values. If enhanced Temp returns 0.229 and enhanced CRF returns zero, the expectations and standard deviations of ECS, when updated on these values individually, are respectively 2.63 ± 0.406 and 2.63 ± 0.501. If ECS is updated on both measured values simultaneously, the result is 2.56 ± 0.325.
5 Overall results
The above results help develop our intuitions for probabilistic reasoning. The ECS estimates themselves depend strongly on the presumed measured values. A more powerful analysis computes the reduction in uncertainty averaged over all possible measurement values. More precisely, we draw a large sample from the joint distribution pictured in Fig. 1. For each value of each measuring platform, and for each vector of values for each combination of measuring platforms, the conditional expectation and conditional standard deviation of ECS are computed. This computation is possible analytically because the BN realizes the correlations using the normal copula. In other words, the marginal distributions pictured in Fig. 1 are assumed to be transformations of standard normal variables. Conditionalization of a joint normal distribution can be performed analytically and the results back-transformed to the original variables. The normal copula imposes features like tail independence and symmetric rank scatter plots. More general copulas can be used to avoid these restrictions, at much higher computational expense.
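The analytic conditionalization step can be sketched for the smallest non-trivial case: the normal score of ECS conditioned on the normal scores of two measurements. This is a minimal sketch, not the BN software used in this study. The correlations are hypothetical, and the back-transformation through the marginal distribution of ECS is omitted.

```python
import math

def condition_standard_normal(rho_x1, rho_x2, rho_12, z1, z2):
    """Condition a standard normal X on standard normal scores z1, z2.

    rho_x1, rho_x2: correlations of X with each measurement score.
    rho_12: correlation between the two measurement scores.
    Returns the conditional mean and standard deviation of X.
    """
    det = 1.0 - rho_12 ** 2
    # Regression coefficients: inverse of the 2x2 measurement
    # correlation matrix applied to (rho_x1, rho_x2).
    b1 = (rho_x1 - rho_12 * rho_x2) / det
    b2 = (rho_x2 - rho_12 * rho_x1) / det
    cond_mean = b1 * z1 + b2 * z2
    cond_var = 1.0 - (b1 * rho_x1 + b2 * rho_x2)
    return cond_mean, math.sqrt(cond_var)

# Hypothetical correlations; rho_12 = rho_x1 * rho_x2 corresponds to
# measurement errors that are independent given X.
rho_x1, rho_x2 = 0.8, 0.6
rho_12 = rho_x1 * rho_x2

both = condition_standard_normal(rho_x1, rho_x2, rho_12, z1=1.0, z2=-0.5)
sd_first_only = math.sqrt(1.0 - rho_x1 ** 2)
print(both, sd_first_only)
```

In normal-score space the conditional standard deviation is the same for every pair of observed scores. The dependence of the ECS posterior width on the observed values arises only after back-transforming through the non-normal marginal, which is why the averages in this section are taken over sampled measurement values.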
Table 1 Posterior average standard deviations of ECS with different combinations of measurement systems, in 2050 following launch in 2020
Average posterior standard deviation of ECS, 2050
Prior to measurement: μ = 3.39 °C, σ = 1.24 °C, 2σ range = 0.81–5.77 °C
IAC and enhanced Temp
CERES and enhanced CRF
IAC & CERES
Enhanced Temp and enhanced CRF
Table 1 shows that after 30 years of observing with the current systems, the 2σ range for ECS would be shrunk on average from its current value of 2.48 to 1.8. In the same time period, the enhanced systems would further shrink the 2σ range on average to 0.82. The actual shrinkage will depend on the values of the measurements.
Enhanced measurements can have economic value only if they are used. By positing a decision context in which society adopts reduced emissions scenarios when high values of equilibrium climate sensitivity are established with requisite confidence, the authors have shown that the real option value of enhanced observation systems, over and above the existing systems, runs into trillions of dollars (Cooke et al. 2015, 2016). This underscores the broader message that probabilistic thinking has economic impact, not just in selecting optimal measurement portfolios, but also in quantifying their social value. While there are many challenges to developing and implementing a more rigorous and accurate long-term climate observing system, the economic benefits suggest that this may be one of society’s best investments.
Valid probabilistic reasoning is subtle, and trusting untrained intuitions can lead to errors. Scientists, science communicators, policy makers, and the general public need to understand that disagreement in science is not dysfunctional but is essential to progress. Exercises like the one performed here are intended to cultivate valid probabilistic intuitions and thereby promote this understanding.
As a final caveat, this study follows the US Interagency Memo on the Social Cost of Carbon. A 2017 study of the National Academies of Sciences (2017) identifies many areas in which the existing methodology can and should be improved. In particular, the state-independent carbon cycle models (as they are called, otherwise known as a system of ordinary differential equations) started at equilibrium cannot reproduce features observed in the data and predicted by Earth system Models of Intermediate Complexity (EMICs).
Funding from NASA, NNX17AD55G, is gratefully acknowledged.
- Ale BJM, Bellamy LJ, Boom R, van der Cooper J, Cooke RM, Goossens LHJ, Hale AR, Kurowicka D, Morales O, Roelen ALC, Spouge J (2009) Further development of a Causal model for Air Transport Safety (CATS); Building the mathematical heart. Reliab Eng Syst Saf 94(9):1433–1441. https://doi.org/10.1016/j.ress.2009.02.024
- Brown PT, Li W, Cordero EC, Mauget SA (2015) Comparing the model-simulated global warming signal to observations using empirical estimates of unforced noise. Nat Sci Rep. https://doi.org/10.1038/srep09957
- Cooke RM, Wielicki BA, Young DF, Mlynczak MG (2013) Value of information for climate observing systems. Environ Syst Decis. https://doi.org/10.1007/s10669-013-9451-8
- Cooke RM, Golub A, Wielicki BA, Young DF, Mlynczak MG, Baize R (2016) Real option value for new measurements of cloud radiative forcing. Resources for the Future, RFFDP 19–16, March 22, 2016
- Cooke RM, Golub A, Wielicki BA, Young DF, Mlynczak MG, Baize R (2015) Integrated assessment modeling of value of information in earth observing systems. Clim Pol. ISSN 1469–3062 (print), 1752–7457 (online). http://www.tandfonline.com/loi/tcpo20
- Dessler AE, Forster PM (2018) An estimate of equilibrium climate sensitivity from interannual variability. J Geophys Res Atmos 123. https://doi.org/10.1029/2018JD028481
- Nordhaus W, Sztorc P (2013) DICE 2013R: Introduction and user’s manual (2nd ed.). http://www.econ.yale.edu/~nordhaus/homepage/documents/DICE_Manual_103113r2.pdf
- IPCC (2013) Climate Change 2013: The Physical Science Basis. In: Stocker TF, Qin D, Plattner G-K, Tignor M, Allen SK, Boschung J, Nauels A, Xia Y, Bex V, Midgley PM (eds) Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, 1535 pp. https://doi.org/10.1017/CBO9781107415324
- IWGSCC (Interagency Working Group on Social Cost of Carbon) (2009) Social Cost of Carbon for Regulatory Impact Analysis under Executive Order 12866, Appendix 15a. US Government, Washington, DC, p 53
- IWGSCC (Interagency Working Group on Social Cost of Carbon) (2013) Technical support document: technical update of the social cost of carbon for regulatory impact analysis under executive order 12866. US Government, Washington, DC May 2013, revised Nov 2013
- Leroy SS, Anderson JG, Ohring G (2008) Climate signal detection times and constraints on climate benchmark accuracy requirements. J Clim 21:841–846
- Hanea AM (2008) Algorithms for non-parametric Bayesian nets. PhD thesis, Department of Mathematics, Delft University of Technology
- Hanea AM, Morales Napoles O, Ababei D (2015) Non-parametric Bayesian networks: improving theory and reviewing applications. Reliab Eng Syst Saf. https://doi.org/10.1016/j.ress.2015.07.027
- Hanea AM, Nane GF, Wielicki BA, Cooke RM (2018) Bayesian networks for identifying incorrect probabilistic intuitions in a climate trend uncertainty quantification context. Risk Research pp 1–16. https://doi.org/10.1080/13669877.2018.1437059
- Kahneman D (2011) Thinking, fast and slow. Farrar, Straus and Giroux, New York
- Oppenheimer M, Little CM, Cooke RM (2016) Expert judgment and uncertainty quantification for climate change. Nat Clim Chang 6:445–451. https://doi.org/10.1038/NCLIMATE2959
- Oreskes N, Conway EM (2010) Merchants of doubt: how a handful of scientists obscured the truth on issues from tobacco smoke to global warming, 1st U.S. edn. Bloomsbury Press, New York
- Renyi A (1970) Theory of probability. North-Holland
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.