Quantitative methods versus qualitative methods for uncertainty assessments
In the quest for an appropriate uncertainty assessment for energy scenarios, climate science may provide a suitable starting point, as the uncertainty assessment discussion in climate science is more advanced than in energy economics.
The Intergovernmental Panel on Climate Change (IPCC) has developed guidelines over the years for a consistent treatment of uncertainties associated with climate modelling results. For the current report, the fifth assessment report, the published guidelines incorporate many of the earlier criticisms of the initial uncertainty assessments and critically analyse the uncertainty assessments of earlier reports [28]. The guidelines of the fifth assessment report (AR5) specify two metrics for the communication of the degree of certainty in key findings:
- Confidence in the validity of a finding, based on the type, amount, quality and consistency of evidence (e.g. mechanistic understanding, theory, data, models, expert judgement) and the degree of agreement. Confidence is expressed qualitatively.
- Quantified measures of uncertainty in a finding expressed probabilistically (based on statistical analysis of observations or model results, or expert judgement) [23].
By means of a confidence matrix, a likelihood scale (expressed as probabilities) and probability distribution functions, the three working groups of the AR5 are to evaluate uncertainties associated with their findings. The likelihood scale for (subjective) quantitative assessment of uncertainty is recommended to be applied only in cases with high or very high confidence [29]f.
The IPCC uncertainty assessment thus relies on both qualitative and quantitative ways of describing the reliability of findings. Apparently, if quantitative assessments are applicable, they should be preferred over qualitative assessments. Qualitative uncertainty assessment is applied in cases of deep uncertaintyg, where uncertainty cannot be quantified. Qualitative uncertainty assessment faces several challenges. Linguistic ambiguity seems to be the predominant problem when uncertainty is assessed qualitatively. In the guidance note, the level of confidence is defined using five qualifiers: very low, low, medium, high and very high [23]. It synthesises the author teams’ judgments about the validity of findings as determined through the evaluation of evidence and agreement. It is debatable whether there is a common understanding of such categories amongst individuals; hence, the question arises whether the evaluation of agreement actually depicts the uncertainty of the finding in question or rather the ambiguity in understanding of the term used. Also, there is no clear indication of how much agreement is necessary for assignment to a certain categoryh. And, finally, it is unclear in which way agreement can be associated with high confidence and, in turn, with uncertainty (judgments about the validity of findings). One interpretation could be that high confidence (inter alia based on high agreement) means low uncertainty; however, this need not hold true in cases where there is high agreement that the level of uncertainty of a finding is high (e.g. due to the stochastic nature of a process). Moreover, this reading also faces the criticism that, even with high agreement, the finding may not be certain at all: all assessors could collectively be wrong in their valuation. The other reading, that high confidence means high uncertainty, besides being counter-intuitive, does not reflect that agreement sometimes does give an indication of the truth of a finding.
Qualitative assessment methods, even if normalised to summary terms (as by the IPCC), seem to depend intrinsically not only on a subjective comprehension of the summary terms but also on the subjective opinion of the assessor. This can be advantageous or disadvantageous, depending on the expertise of the assessor and the communication of the relevant information that influenced the assessment. In any case, such assessment methods lack the important property of generating reproducible assessments. If a different group of experts assessed the results, the uncertainty assessment of a specific finding might turn out to be significantly different, even if sound reasoning underpins each assessment. As Krueger et al. point out, expert opinion in modelling will benefit from formal, systematic and transparent procedures [30]. Inter-subjective reproducibility is a necessity if a finding is to be called robust. A qualitative assessment is likely to be less effective in evaluating robustness than a quantitative, standardised assessment, given the problem of linguistic ambiguity and the subjectivity of the assessment method. A quantitative approach that uses a method that can be standardised and applied independently of the expertise of the assessor would presumably yield higher agreement.
However, qualitative uncertainty assessments have the important benefit of putting findings into the perspective of the state of the art in modelling and the present knowledge about processes and/or assumptions. If a finding is based on limited knowledge, it cannot represent a certain statement and has to be supplemented with information regarding its validity.
Quantitative assessment methods often face the critique that they are perceived as more precise than is justified [31],[32], in particular [33] in his discussion of Nowotny’s perspective. This could be the case where probability density functions (pdfs) can be produced but are themselves based on uncertain input. In such cases, communication (qualitative or quantitative) of the uncertainty related to the pdfs is necessary. An advantage of quantitative methods is an unambiguous representation. Even in its simplest form, for example a scale from one to ten with ten representing high uncertainty, a quantitative statement might give the recipient of such an assessment a clear understanding. This is, however, not unproblematic. For one, there is an intrinsic assumption that must be clarified if it does not hold, namely that the scale units are uniform in sizei or follow a logarithmic scale. A probabilistic statement appears to be even more intuitively understandable. However, regarding the perception of probabilistic uncertainty assessments, Patt et al. report that changes of equal magnitude in assessed probabilities can have different effects in decision-making experiments. For example, a change of 10 percentage points from 90% to 100% impacts the choices of test persons differently than a change from 50% to 60% [34]. Nonetheless, a probabilistic statement is in itself less susceptible to interpretational errors or misunderstandings than a qualitative statement that uses - again - words for interpretation that might be ambiguous.
Another relevant advantage of quantified assessments is the simplicity and comparability of results. The benefits of obtaining a result that can be compared to results from, say, some years ago are obvious: applying the same quantitative method makes it possible to evaluate scientific progress in modelling and scientific understanding (if uncertainty decreases) or to illustrate that the nature of a process is more complex than assumed years ago (if uncertainty increases). However, a critique formulated by Kandlikar et al. [35] is that biases can result when simple schemes that attempt to represent uncertainty in a uniform manner across many different contexts depend on how much detail is presented in the information; this effect is attributed to ignorance aversion. Indeed, quantification, or for that matter qualification to summary terms, can result in a loss of detail and reasoning. It is not totally clear how bias can be created through such a process. However, it is the very task of an uncertainty assessment to transform information of various kinds (quantitative, qualitative, narrative, implicit assumptions, etc.) into a form that can be understood without the profound expertise that is necessary to accomplish the uncertainty assessment itself.
The question of whether a quantitative assessment method is preferable to a qualitative assessment method cannot be answered purely by evaluating the respective (dis-)advantages. There are practical limitations that may render a quantitative assessment impossible. However, for the development of an uncertainty assessment in the context of energy economics, relevant differences from climate science remain that might justify a preference for quantitative methods.
Climate science and energy economics
The discussion concerning the confirmation of climate models may serve as orientation, and energy model evaluation could profit from these considerations. Lloyd [20] concludes that climate models should not be judged primarily or solely on the basis of what they are weak at. This is an important aspect to remember when evaluating energy models as well. Generally, her approach to confirmation ‘takes it as a matter of degree; models can accrue credit and trustworthiness upon being supported by empirical evidence as well as by theoretical derivation’. Lloyd illustrates the strengths of climate models in terms of confirmation as model fit, variety of evidence, independent support for aspects of the models and robustness. These concepts, bearing on the reliability of models, can be applied to energy models as well.
Model fit
Model fit refers to the ability of model results to represent data that can be observed empirically, possibly ex post. Unfortunately, energy models have a rather poor history of model fit [36],[37]. The main reasons for deviations of model results from evidence have been summarised as:
- unanticipated strong political decisions such as the closing of mines in the UK, feed-in tariffs in Germany and world climate change concerns;
- unexpected energy requirements, like transport behaviour and the rush for gas;
- the definition and availability of statistical data [36].
The main difference when comparing climate models with energy models is that energy models represent and simulate a well-understood system with mainly economic drivers. In contrast, climate modelling has its challenges in representing chaotic systems whose causal relationships and magnitudes of impact of system components are, at least in part, little understood. It would be, at least in principle, possible to know today with sufficient accuracy what the energy system will look like at a given point of time in the future. The problem is that many interests must be met and decisions tend not to be of a durable nature as political, environmental or economic circumstances change. This is one reason why energy roadmaps, energy strategies and energy programmes on a political level are important. These commitments to a specific future system state allow energy modellers to define constraints in models accordingly and consequently investigate - using models - different paths to meet the desiderata. Results of such model simulations may be cost-effective, environmentally friendly, socially accepted or other (possibly optimised) system development paths. The poor record of model fit of energy models in the past is hence not (primarily) due to a lack of understanding of the system but must rather be attributed to radical influences on the system that cannot be anticipated. Moreover, such radical impacts (e.g. political reorientation) do not lead to any improvement of energy models or of target system understanding, for their nature lies in societal decisions that cannot and should not be anticipated, thereby allowing society to evolve.
Variety of evidence
Lloyd refers in her analysis to the fact that climate models can accurately predict variables other than, for example, global mean temperature. This type of confirmation, translated to energy models, could be interpreted as correctly predicted installed capacity for electricity generation, fuel mix and the like. Variety of evidence in energy models is closely related to the constraints and assumptions built into the model. As Knutti et al. [38] state, confidence in climate models is justified because physical principles known to be true, such as conservation of mass, energy and momentum, can be applied and transferred across hierarchies of models. This physical nature of climate modelling can only partially be applied to energy models. Energy systems have physical limitations, e.g. land use, maximum solar radiation or exhaustible resources; however, the system state is mainly dependent on economic drivers, legal requirements and incentive policies. These are not physical, law-obeying mechanisms, although a kind of cause-and-effect relation can be observed. Due to this different nature of the modelled system, variety of evidence cannot be applied for confirmation in the same sense as climate models are verified by true evidence. The same holds for independent support for aspects of the models.
Robustness
Lloyd applies a robustness analysis developed by Weisberg [39] that puts forward a robust theorem of the general form: ceteris paribus, if [common causal structure] obtains, then [robust property] will obtain. The causal structure captured in the respective models seems to be the key difference between climate models and energy models. Causal structures in energy models, depending on the model in question, are, for example, inverse supply functions [2], whereas in climate modelling, for example, thermodynamic laws are appliedj [40]. It seems that climate science, partially due to ignorance of (components of) the target system, faces epistemic uncertainty. Stevens and Bony [41], for example, point out that tropical precipitation over land, and consequently vegetation dynamics, are poorly understood; as a result, the understanding of the carbon cycle is limited.
It is necessary to clarify the interpretation of implication applied to analyse Weisberg’s theorem. Let A denote the antecedent ‘common causal structure’ and B denote the consequent ‘robust property’ of the theorem. The ‘if…then’ clause can be interpreted in different ways. A strict material implication in its truth-functional sense means that A is false or B is true [42]. Another interpretation would be a logical implication, stating that B is already logically implicit in A. This interpretation means that it is a logical consequence that a common causal structure implies that a robust property obtains (ceteris paribus). Another interpretation of implication (A implies B) is that B is deducible from A by logical reasoning. To prove that it is logically deducible that a robust property obtains if a common causal structure obtains would surpass the scope of this text. But it may well be possible to do so. Weisberg departs in this question from Levins, Orzack and Sober and clarifies that robustness analysis is effective at identifying robust theorems and, whilst it is not itself a confirmation procedure, robust theorems are likely to be true [39]. It is important to note that a theorem as put forward by Weisberg does not presuppose the truth of A. In other words, the theorem does not claim to guarantee that if a common causal structure obtains, this implies that a robust property will obtain (ceteris paribus). In this sense, the theorem is much weaker than one would wish for an uncertainty analysis. If robust theorems according to Weisberg are likely to be true, the only case that is unlikely is the one where A is true and B is false, for this renders the theorem false. Hence, the unlikely case is the one in which the common causal structure obtains but the robust property does not obtain, ceteris paribus. But as indicated by the example of Stevens and Bony, the antecedent ‘common causal structure’ (A) can well be false. The use of Weisberg’s theorem does not indicate whether B (the robust property will obtain) is true or false if A is false.
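For reference, the truth table of the material conditional makes explicit that the only case in which ‘if A then B’ fails is A true and B false - the case the robustness argument deems unlikely:

```latex
\begin{array}{cc|c}
A & B & A \rightarrow B \\ \hline
\text{T} & \text{T} & \text{T} \\
\text{T} & \text{F} & \text{F} \\
\text{F} & \text{T} & \text{T} \\
\text{F} & \text{F} & \text{T}
\end{array}
```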
Hence, if the common causal structure in climate models changes due to new understanding, robustness, defined as such, does not allow any inference about the truth of the associated robust property or about uncertainty.
In the case of energy models, causal structures face less uncertainty of an epistemic nature, but rather uncertainty due to the social or political under-determination of future developments. In this case, robustness could indicate some degree of certainty. However, it is not straightforward to infer uncertainty from robust results, even in a qualitative manner.
Another challenging issue in this respect is the ceteris paribus clause. A common approach to robustness analysis is the scenario technique. The choice of which parameters are held constant (ceteris paribus) and which parameters or constraints are varied significantly influences the results of energy models. Which results appear robust is therefore a matter of choice, for in principle any result could be produced by a suitable choice of parameters (e.g. technology prices in cost optimisation models). Hence, robustness as an indicator of uncertainty in energy scenarios has limited potential for the uncertainty assessment of energy model results.
A discussion of Bayesian approaches
A probabilistic interpretation of uncertainty assessments is considered valuable, as the IPCC guidance note for the treatment of uncertainties specifies [23]. Uncertainty and risk are to be assessed to the extent possible, and if appropriate probabilistic information is available, special attention should be given to high-consequence outcomes.
Probabilistic uncertainty assessments satisfy requirements 1 (clear indication of how reliable the findings are), 5 (intuitively understandable and straightforward to communicate) and 6 (reproducible and unambiguous), provided the assessment methodology is a standardised process. In the IPCC guidance notes, as well as in the approach by Walker and in the NUSAP method, statistical knowledge is considered knowledge with little inherent uncertainty. It thus seems appropriate to consider an assessment method that is based on statistical data and produces probabilistic uncertainty assessment results.
Bayesian statistics could provide such a method. As Bernardo [43] points out, the comprehension of probability in Bayesian statistics corresponds precisely to the sense in which this word is used in everyday language. This quality corresponds to satisfying requirement 5 (intuitively understandable and straightforward to communicate): the understanding of probability as a conditional measure of uncertainty associated with the occurrence of a particular event, given the available information and accepted assumptions. Bernardo stresses that a conditional probability measure depends on two arguments, the event E whose uncertainty is to be measured and the conditions C under which the measurement is made; ‘absolute’ probabilities do not exist [43].
In typical applications, one is interested in the probability of some event E given the available data D, the set of assumptions A which one is prepared to make about the mechanism which has generated the data, and the relevant contextual knowledge K which might be available. Thus, Pr (E|D, A, K) is to be interpreted as a measure of (presumably rational) belief in the occurrence of the event E, given data D, assumptions A and any other available knowledge K, as a measure of how “likely” is the occurrence of E in these conditions [43].
In Bayesian statistics, a prior probability that represents the presumption of the statistician is combined with empirical data to derive a posterior probability by means of Bayes’ theorem.
$$p(\omega \mid D, A, K) = \frac{p(D \mid \omega)\, p(\omega \mid K)}{\int_{\Omega} p(D \mid \omega)\, p(\omega \mid K)\, \mathrm{d}\omega} \qquad (1)$$

with p(D|ω) being a formal probability model for some (unknown) value of ω, the probabilistic mechanism which has generated the observed data D; p(ω|K) being the prior probability distribution over the sample space Ω, describing the available (expert) knowledge K about the value of ω prior to the data being observed; and p(ω|D,A,K) being the posterior probability density.
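As a minimal illustration of equation (1), the following Python sketch updates a prior over a scalar ω on a discrete grid with the likelihood of some observed data D; all numbers and variable names are hypothetical and purely illustrative.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical setting: omega is an unknown mean (e.g. of a price index change),
# observed with known measurement noise (standard deviation 1.0).
omega_grid = np.linspace(-5, 5, 1001)               # discretised sample space Omega
prior = norm.pdf(omega_grid, loc=0.0, scale=2.0)    # p(omega | K): prior expert knowledge
prior /= prior.sum()                                # normalise on the grid

D = np.array([1.2, 0.8, 1.5, 1.1])                  # observed data (illustrative)

# p(D | omega): likelihood of the data for every candidate value of omega
likelihood = np.prod(norm.pdf(D[:, None], loc=omega_grid, scale=1.0), axis=0)

# Bayes' theorem on the grid: posterior proportional to likelihood times prior
posterior = likelihood * prior
posterior /= posterior.sum()                        # p(omega | D, A, K)

print("posterior mean of omega:", float(np.sum(omega_grid * posterior)))
```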
The following general description of BMA is primarily based on [44] and [45]. Suppose ω represents an input variable to a model. Its posterior distribution given data D is:
$$p(\omega \mid D) = \sum_{k=1}^{K} p(\omega \mid M_k, D)\, p(M_k \mid D) \qquad (2)$$

where M_1, …, M_K represent the models considered. This is an average of the posterior distributions under each of the models considered, weighted by their posterior model probabilities (PMPs). The posterior probability for model M_k is given by the specific form of Bayes’ theorem,

$$p(M_k \mid D) = \frac{p(D \mid M_k)\, p(M_k)}{\sum_{l=1}^{K} p(D \mid M_l)\, p(M_l)} \qquad (3)$$

with

$$p(D \mid M_k) = \int p(D \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, \mathrm{d}\theta_k \qquad (4)$$

representing the integrated likelihood of model M_k. Here θ_k is the vector of parameters of model M_k, p(θ_k|M_k) is the prior density of θ_k under model M_k, p(D|θ_k, M_k) is the likelihood and p(M_k) is the prior probability that M_k is the true model. For a regression model, θ_k = (β, σ²). All probabilities are implicitly conditional on the set of all models being considered.
Critiques that have been offered of Bayesian methods include, but are not restricted to, scepticism towards prior probabilities [46] and interpretational aspects [47],[48], with responses in [49],[50]. Some arguments are also briefly presented by Gelman [51]. It is beyond the scope of this text to discuss all of them; therefore, the focus lies on critiques related to Bayesian methods in the context of climate models and energy models. One of the main objections to the use of Bayesian methods is the arbitrariness of the prior distribution. In the context of climate science, Betz [46] argues that probability estimates of climate sensitivity obtained by Bayesian learning depend problematically on (1) the specific prior probability distribution over the initially considered hypotheses and (2) the climate model used. According to Betz, the choice of prior distribution is an arbitrary assumption and - in the context of climate modelling, with limited sample sizes - entails that the final posterior probability is a function of the initial prior (which is arbitrary). This critique of the influence of the prior distribution on posterior probabilities is a well-known rather than a new objection to Bayesian analysis, cf. [52],[53].
Thus, the arbitrariness of the prior distribution is considered to be problematic. Put in Bayesian terms, an expert elicitation result is nothing but a collection of prior probabilities, and yet this method is used for uncertainty assessment in climate modelling as well as in energy modelling. If one accepts that a Bayesian statistician is an expert, the claim can be formulated even more strongly, namely that a Bayesian approach exactly satisfies requirement 4 (incorporate qualitative and quantitative aspects). This is to say that, by means of prior distributions, not only historical data (the likelihood) but also qualitative, subjective expert judgement can be incorporated into the uncertainty assessment. This not only renders the choice of the prior distribution a relevant tool for a complete representation but also responds to another critique that is often brought up against statistical methods in general, namely that past evidence cannot provide for future developments. By means of a prior distribution, the likelihood of past events is put into perspective, and both become possible: the recognition of the world as it is (or was) and the representation of how this evidence is to be evaluated with respect to the future. It could thus be considered a distinct virtue, rather than a problem, that prior probabilities depend on expert judgement. The argument that subjective criteria can enrich a (statistical) model rather than weaken its findings due to a lack of objectivity is also put forward by Isaac. His ‘integrated subjectivism’ also characterises a Bayesian model as the simplest form of integrating subjective knowledge and objective likelihoods with the aim of ‘transforming a scientific model into a decision-theoretic one in which objective parameters (about the world) and subjective parameters (about the agent) peacefully coexist’ [54].
Requirement 4 (incorporate qualitative and quantitative aspects, i.e. complete representation) is satisfied more explicitly with a BMA evaluation that uses informative priors. However, it has been argued that even improper (non-informative) priors or weak (i.e. flat) priors contain information about the subjective certainty of the modeller, e.g. [55].
Another criticism, sharpened by Kandlikar et al. [35], focuses on two problematic assumptions:
- precision: the doctrine that uncertainty may be represented by a single probability or an unambiguously specified distribution;
- prior knowledge of the sample space: the assumption that all possible outcomes (the sample space) and alternatives are known beforehand.
Indeed, the problem of deceptive precision of probability distributions needs to be addressed when an uncertainty assessment is based on probabilities. One means to that end could be a transparent documentation of the data used and the assumptions made for the uncertainty assessment. Again, compared with the predominant assessment method, expert elicitation, such critique could apply here as well; however, ambiguity in expert elicitation results seems to be perceived as less problematic. Another means to that end could be a systematic sensitivity analysis in Bayesian terms. Here, varying the prior probabilities and examining the effect on posterior probabilities could yield important insight, possibly even in cooperation with expert elicitation to define priors that are suitablek. BMA for input variables of energy models addresses this critique by evaluating a lower bound of uncertainty. Another possible way, not investigated in this text, could be the computation of interval probabilities that specify an interval of uncertainty for an input variable. However, due to considerations of ignorance, a lower bound seems more appropriate, as it respects the fact that unknown or intentionally ignored influences might increase uncertainty by an unspecified amount.
Prior knowledge about the sample space Ω seems to pose more of a problem in climate science than in energy economics. Possible outcomes and alternatives in energy economics are likely to be more predictable than in climate science. For example, in climate science, it might be the case that a possible outcome is unknown due to interdependencies that are not well understood or effects whose orders of magnitude exceed expectations, so that the sample space does not account for that possibility. For example, if consequences of unprecedented gaseous concentrations are modelled (such as, in the past, low O3 concentrations in the stratosphere [56] or, more recently, high CO2 concentrations in the atmosphere), Ω might not be complete. This is not due to insufficient modelling techniques, but rather to science being an evolving matter that naturally develops with new insight, new measurement techniques and scientific understanding. In energy economics, some non-explicit assumptions, such as that the target system will continue to exist in a comparable way within the time horizon and geographic scope of the model, simplify the treatment and assessment of the sample space. However, the critique is certainly valid in the context of energy scenarios if key assumptions such as gross domestic product (GDP) growth, future energy prices or population growth are considered. Even if sound forecast data from statistical sources are availablel, these assumptions could be associated with deep uncertainty and, possibly, the sample space Ω is not complete. This fact might belong to the realm of recognised ignorance, as Walker et al. term it. Especially for such key assumptions, an uncertainty assessment that evaluates as many potential influences on the key assumption as possible is adequate.
One possibility for limiting such deep uncertainty in the context of energy economic models is a deliberate choice of system boundaries. In addition to the typical topological, economic or sectoral system boundaries and sub-system units, social systems can and should be detailed in energy models; see [57]. In energy models, as in climate models, one can intentionally define system boundaries to represent parts of the integrated (energy) system with simplified connections across the system boundaries. However, for climate models that are concerned with questions of global impact and consistent regional interpretation, meaningful results can only be obtained within a global system boundary. The IPCC [58] specifies that only general circulation models (GCMs) have the potential to provide consistent estimates of regional climate change, which are required in impact analysis. Energy models can be designed to depict a certain part of the global energy economic system, thereby possibly increasing uncertainty due to ignorance of effects on a larger scale, and possibly reducing uncertainty within the system boundaries as Ω becomes more complete. There thus seems to be a trade-off between chosen ignorance (due to system boundaries) and recognised ignorance (which one is aware of but cannot address). The BMA uncertainty assessment for input variables to energy models respects these uncertainties by formulating a lower bound of uncertainty.
It is worth discussing whether such uncertainties are better assessed with qualitative methods than with quantitative methods in probabilistic terms. The choice of key assumptions and their related uncertainty clearly limits the inferences that can be drawn from model results. However, the assessment of such deep uncertainties could be attempted in Bayesian terms.
The Bayesian endeavour
A Bayesian approach could potentially satisfy the requirements previously defined. This section is concerned with how an implementation of Bayesian statistics for uncertainty assessment in energy models could be achieved. Figure 4 shows a chart of the design of many quantitative energy models. The information flow starts on the left side with influences that affect different input variables to energy models, exemplified by resources, demand and infrastructure for the input variable energy prices. Input variables are specific to every energy model, so the listed input variables energy prices, GDP, population, efficiency and demand can be regarded as typical examples. The input variables are then processed by the mathematical core of the energy model. Different types of models are possible; in the chart, the examples computable general equilibrium (CGE) models, linear programming (LP) models, mixed complementarity problems (MCP) and stochastic models are mentioned. Finally, on the rightmost side, the output of the model, the energy scenario, is the result of this information flow and computational effort.
The key idea is to assess the uncertainty of the input variables on the left side of the graph in Bayesian terms and thus define a lower bound of the uncertainties associated with model results (model output). If one accepts the premise that model output cannot be less uncertain than model input, this lower bound can be defined by the uncertainty of the input variables. It is important to stress at this point that the BMA method for input variables does not replace an energy model, e.g. an LP, MCP or CGE model, to name just a few that are common practice in energy economics. The aim is rather to assess uncertainties of input variables that are specific to a given model by means of BMA. This process should make transparent that, independently of the predictive power of an energy model, the mere use of variables that are inherently uncertain leads to model outcomes that must reflect that uncertainty. It cannot and should not be the aim of an energy model to present results as more or less certain than they are, given the non-deterministic nature of the world in which the target system is embedded. The structure, nature, scope, aim and mathematical formulation of energy models are highly diverse. For a given energy economic question, many different potential energy models can be designed to provide an answer. However, any model that could be designed will have input variables that are more or less uncertain. The aim of the proposed method is to provide an estimate of these uncertainties independently of the specific (dis-)advantages a given model holds with respect to other energy models that could answer the question.
The predominant uncertainty assessment method, expert elicitation, is used as a reference. An expert elicitation process makes use of expert knowledge to assess how uncertain an assumption or a finding is. But what exactly is expert knowledge? The supposition is that expert knowledge is based on an understanding of causal relationships, a (long) record of observation or research, the inclusion and exclusion of relevant factors and an intuitive ‘feel’ for the field of expertise. At least these virtues should also be met by a Bayesian approach, together with the requirements previously defined.
The understanding of causal relationships - in the context of energy economics - refers to the ability to understand market mechanisms, micro- and macro-economic processes, social processes, etc. Consider the example of energy prices in Figure 4. If an assumption regarding the future energy price of, for example, natural gas is to be defined, it would be necessary to think of influences that impact the natural gas price, for example, resources, (global) demand, infrastructure, efficiency of devices and the like. These influences need not be assessed in qualitative terms or by the subjective opinion of an expert, for statistical data are available. If such statistical data are not readily available, it might be necessary to look for a suitable statistical representation of the influence, e.g. for consumer acceptance [59], or the methods described by [60] with respect to the food industry. A sound record of research and a long record of observation translate, in statistical terms, into sufficiently large sample sizes of the statistical data. This might pose a problem if time series are short or the record of an influence is short.
The causal relationships, i.e. how an influence bears on the input variable in question (in the example, the natural gas price), could be represented by a mathematical relation, e.g. a linear regression model. A regression model with the natural gas price as the dependent variable and the influences as explanatory variables could capture causal relationships and the magnitude of impact of an influence on the input variable. Note that non-linear models could also be applied, but for the analysis of the impact of an influence on the dependent variable (that is, the input variable in an energy model), it suffices to evaluate whether the influence increases or decreases the dependent variable and with what order of magnitude (that is, the coefficient estimate). This is straightforward standard statistical work. But this would not account for the fact that the representation with a linear model itself increases uncertainty, for one might choose the wrong explanatory variables (influences) or too few of them. This problem can be addressed by BMA.
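As a point of reference for the ‘standard statistical work’ mentioned above, a minimal regression sketch might look as follows; the data are purely synthetic and the influences and coefficients are illustrative assumptions only.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic yearly data: natural gas price regressed on two candidate influences.
rng = np.random.default_rng(0)
gdp = rng.normal(2.0, 0.5, 25)                    # GDP growth (%), illustrative
demand = rng.normal(1.0, 0.3, 25)                 # gas demand growth (%), illustrative
gas_price = 5.0 + 0.8 * gdp + 1.5 * demand + rng.normal(0.0, 0.5, 25)

X = sm.add_constant(np.column_stack([gdp, demand]))
fit = sm.OLS(gas_price, X).fit()
# Sign and order of magnitude of each influence (the coefficient estimates)
print(fit.params)
```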
BMA allows the inclusion and exclusion of potential influences by means of a Markov Chain Monte Carlo (MCMC) samplerm investigating the whole model space, i.e. the set of all possible variable combinations that can be employed to represent the dependent variable. In applying the BMA method, the uncertainty assessor first gathers any data that might - even if only in an indirect sense - be a relevant influence on the dependent variable. Let the number of candidate explanatory variables be k. The model space from which to choose the appropriate linear regression model then comprises 2^k models. Any variable can be included or excluded, depending on its explanatory value for the dependent variable. This explanatory value is assessed as the posterior inclusion probability (PIP) of individual explanatory variables, and the individual models (containing specific explanatory variables) are ranked according to their PMP. Hence, the explanatory power of each variable and of different competing linear regression models can be assessed. As the name indicates, these results of BMA are probabilities. The prior probabilities concern the assessor’s prior belief about how many explanatory variables are relevant. BMA then provides 1) the best linear regression model in terms of the highest PMP and 2) the individual relevance of influences in terms of coefficient probability estimates and the posterior inclusion probability (PIP). In Figure 5, an exemplary coefficient estimate for an explanatory variable (GDP) of the natural gas price is illustrated.
On the abscissa, the coefficient value of the variable in the linear regression model is quantified. The ordinate represents the probability density for the coefficient value (i.e. the rate of change of the conditional mean of the natural gas price conditional on a change of GDP). The double conditional standard deviation (2× cond. SD) is indicated by the red dotted line. An equivalent chart can be produced for every explanatory variable of the competing models. The PIP of this variable is 96.1%, which reflects that competing models not containing the variable were less successful in explaining the data. In other words, the PIP is the sum of the PMPs of all models in which a covariate is included. The shape of the probability density and the narrow range of the double standard deviation (approx. 0.4 to 1.4) indicate that the variation from the conditional expected value (cond. EV) is rather low.
In practice, the approach can be detailed in several steps. In step one, relevant input variables, or all input variables - depending on the size of the energy model under scrutiny - are identified, e.g. GDPo within the energy model’s system boundary. In the next step, statistical data from economics, ecology, the social sciences or other disciplines that are suspected to influence the input variable are gathered (e.g. statistical data concerning industrial production, imports and exports, taxes and subsidies, birth and death rates, education, etc.), including statistical data on the input variable itself. In the uncertainty assessment, this input variable (GDP) becomes the dependent variable explained by these influences. Note that, in contrast to other methods, there are hardly any practical limitations on the number of influences that can be considered, for BMA, by means of an MCMC sampler, investigates the model space and ranks explanatory variables (influences) according to their PIP. The next step is the definition of the form of the mathematical representation, e.g. a multivariate linear regressionp. As many potential explanatory variables are defined, the question is which variables should be included in the model. BMA estimates models for all possible combinations of explanatory variables and constructs a weighted average over all of them. Then, a suitable prior distribution is chosen, e.g. Zellner’s g-prior [61],[62]. Since the denominator of equation (3), the integrated likelihood over all models, is the same for every model, the PMP is proportional to the marginal likelihood of a specific candidate model, i.e. the probability of the data given that model, times its prior probability. The prior probability reflects how probable the expert thinks the model is before looking at the data [63]. The models thus generated with the highest PMPs can be evaluated, and the model that best represents the dependent variable (e.g. GDP) can be chosen. Finally, the uncertainty estimate for the input variable is derived from the PMP of the chosen model.
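Tying these steps together, a hedged usage sketch for the input variable GDP could reuse the illustrative bma_linear function from the earlier sketch; the influence names and data below are entirely hypothetical.

```python
import numpy as np

# Step 2: hypothetical time series (30 years) of candidate influences on GDP
names = ["industrial_production", "net_exports", "tax_rate",
         "birth_rate", "education_index"]
rng = np.random.default_rng(1)
X = rng.normal(size=(30, len(names)))             # standardised influence data
gdp = 1.0 + 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 0.4, 30)

# Steps 3-5: linear representation, g-prior, exhaustive BMA (see bma_linear above)
pmp, pip, best_vars, best_pmp = bma_linear(gdp, X, names)

# Step 6: lower bound of uncertainty for the input variable "GDP"
print("chosen model:", best_vars, "with PMP =", round(best_pmp, 3))
print("lower bound of uncertainty: at least", round(1.0 - best_pmp, 3))
for name, p in pip.items():
    print(f"PIP({name}) = {p:.3f} -> influence uncertainty {1.0 - p:.3f}")
```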
An additional feature, which is not the focus of this text, is the possibility of generating predictive distribution functions from the chosen model that represent the dependent variable, consistently with past evidence and expert judgement, for given assumptions about the explanatory variables. This could foster consistency in the choice of key assumptions.
The interpretation of BMA results as uncertainty can be straightforward if uncertainty is suitably defined. To that end, a definition based on probability is introduced.
Definition: Uncertainty equals the probability that statement S might not be true.
If, by means of BMA, a PMP is calculated for an uncertainty model (e.g. a PMP of 13%q for a model that represents the natural gas price), the uncertainty of the dependent variable would - by definition - be at least 87%. This would mean that the input variable ‘natural gas price’ to an energy model holds an uncertainty of at least 87%, even if all relevant explanatory variables are considered. Hence, the results of a model including an assumption about the natural gas price cannot be less uncertain than 87%.
In other words, the PMP reflects the probability that the input variable thus described matches the data. For a model with a PMP of 13%, the associated uncertainty would be at least 87%. A clarifying statement of the following form could accompany model results.
“In consideration of expert judgement, statistical data of influence X1, influence X2, influence X3,…, of the last 25 years, the uncertainty that the input variable can be described as such is at least 87%.”
For every influence X1, X2,…, the PIP indicates the explanatory contribution of the influence, and 1 - PIP indicates the uncertainty as to whether the influence contributes to explaining the dependent variable in a given model (typically the one with the highest PMP). In the example, the uncertainty as to whether GDP (together with the other explanatory variables) explains the natural gas price in the chosen model is 3.99% (1 - 0.9601 = 0.0399). Such an assessment clearly satisfies requirement 1, as uncertainty expressed as a probability is a clear indication of how reliable the findings are.
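In compact notation (following the definition above, with M_{k*} denoting the chosen, highest-PMP model and X_j an individual influence), the quantities used in this interpretation can be written as:

```latex
U(S) = 1 - \Pr(S \mid D, A, K), \qquad
U_{\min}(\text{input variable}) = 1 - \mathrm{PMP}(M_{k^{*}}), \qquad
U(X_j) = 1 - \mathrm{PIP}(X_j).
```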
The third virtue of expert knowledge, the inclusion and exclusion of relevant factors, could be achieved by this standardised method, thereby satisfying requirement 2 (applicable independently of the assessor’s expertise).
The approach would limit many intuitive over- or under-estimations of the impact of influences on variables that figure as input variables in energy models. It is conceivable that different experts evaluate individual influences as more or less relevant for the assumption of an input variable (e.g. a natural gas price assumption), thereby generating ambiguity and dissent. A standardised method relying on statistical data, i.e. knowledge with little associated uncertainty in and of itself, could yield a significant improvement in uncertainty assessment for energy models. However, as expert knowledge is an important part of assessment methods, it is possible to take it into account through prior probability specification.
A key quality of the BMA method for input variables is that the model uncertainty of the linear regression model itself, and thus the uncertainty of the assessment method, is quantified in probabilistic terms. This is a distinct advantage of the method as opposed to purely statistical or qualitative methods. Other methods applied in uncertainty analysis, for example standard statistical analysis or purely qualitative methods, ignore that source of uncertainty. A standard regression analysis is conditional on the assumed statistical model, and the analyst may be uncertain whether it is the best representation. If an expert Delphi [64],[65] is carried out, opinions are rarely scrutinised for their correctness or compliance with statistical evidence. However, if an expert is asked how probable she thinks her evaluation is, a prior distribution could be constructed.
Another requirement previously defined is applicability to different energy models (comparability of results), requirement 3. As indicated by Figure 4, the assessment method is concerned with input data to energy models and is hence independent of the mathematical model that subsequently processes the input. The uncertainty assessment method would be applicable to the different kinds of models common in energy economics: LPs, MCPs, CGEs, stochastic models or even qualitative models that use input variables.
Requirement 4 (inclusion of qualitative and quantitative aspects), which assures a complete representation, can be satisfied through prior probabilities and statistical data. The resulting posterior probabilities and the probabilistic interpretation of uncertainty are straightforward to communicate, as demanded in requirement 5 (intuitively understandable and straightforward to communicate).
Finally, requirement 6 demands reproducibility and unambiguousness. Provided assessors use the same set of data, the results of BMA are reproducible. However, a source of ambiguity could be the choice of prior probabilities. This lies, as previously discussed, in the very nature of expert judgement. A sensitivity analysis to evaluate such ambiguity could both increase understanding of the BMA method within this context and indicate to what extent expert elicitation has to be put in perspective against statistical data.