1 Introduction

It is well established that global anthropogenic emissions of long-lived greenhouse gases (LLGHG) such as carbon dioxide (CO2) and nitrous oxide (N2O) are causing global climate change. These emissions have therefore been governed under the United Nations Framework Convention on Climate Change (UNFCCC) since 1992. More recently, the fact that emissions of certain air pollutants also have a significant climate impact has been gaining widespread international policy recognition, for example through the publication of the UNEP synthesis report on near-term climate protection (Kuylenstierna et al. 2011; Shindell et al. 2012) and the creation of the Climate and Clean Air Coalition (CCAC) in 2012. These air pollutants are generally referred to as near-term climate forcers (NTCFs) and include sulphur dioxide (SO2), black carbon (BC), organic carbon (OC), non-methane volatile organic compounds (NMVOC), nitrogen oxides (NOx), and carbon monoxide (CO). The greenhouse gas methane (CH4) is usually included in the NTCF group due to its relatively short atmospheric perturbation time of about a decade (Aamaas et al. 2016), but also in the LLGHG group since it is long-lived enough to be well-mixed in the atmosphere. Increased control of NTCFs can reduce the rate of global temperature increase, and the peak temperature, provided that LLGHGs are stringently and perpetually controlled (Bowerman et al. 2013; Shoemaker et al. 2013).

It is complicated to estimate the climate impact of a given NTCF abatement option or regulation. Using large-scale climate models costs time and money and requires an effort that is beyond the capacity of most climate policy studies, including NTCF abatement studies (Aamaas et al. 2013). To estimate the climate impact of NTCF emission policies, researchers in several fields and policy makers therefore typically utilise climate metrics (such as global warming potential (GWP), global temperature potential (GTP) (Myhre et al. 2013b), and regional temperature potential (RTP) (Aamaas et al. 2016)) normalised against the climate impact of CO2. This practice allows for quick estimates of the climate impact, or rather the carbon dioxide equivalent (CO2eq) emissions, that LLGHG and NTCF emissions give rise to (Schmale et al. 2014).

Today, dozens of climate metrics exist, including economic (Johansson 2011) as well as physical metrics (Aamaas et al. 2013). Given the numerous metrics available—representing different aspects of climate change and/or time horizons—and that researchers and policy makers are using metrics, it is only natural that there has been a discussion about which metric to use for climate and NTCF policy.

The discussion on which metric to use when placing different climate forcers on a common scale has mainly focused on metrics for comparing CO2 equivalence of emissions that are included in the national reporting to UNFCCC (Manne and Richels 2001; Manning and Reisinger 2011; Peters et al. 2011), as well as combined LLGHG and NTCF abatement (Fesenfeld et al. 2018; Levasseur et al. 2016; Tanaka et al. 2010). Generally, these discussions indicate that the choice of metric design depends on the type of climate change impact considered, on time horizon, and on whether future climate impact is discounted, all normative choices (Tanaka et al. 2010). A recent addition to the discussion includes a consistency check regarding time horizon and discount rate (Sarofim and Giordano 2018). But such a discussion is also relevant for the comparison of cost-effective abatement of only NTCF emissions.

However, while NTCF abatement cost studies utilise different climate metrics when estimating cost-effectiveness, we have not identified any studies that analyse the impact of metric choice on the estimated cost-effectiveness of different NTCF abatement options. For example, the Norwegian Environment Agency (2014) used Global Temperature Potential with a 10-year time horizon (GTP10) when analysing a Norwegian action plan for abatement of NTCF, while UNEP (Kuylenstierna et al. 2011) used GWP100 when analysing NTCF abatement on a global scale, but neither study analyses the sensitivity of its findings with respect to metric choice.

The situation is further complicated for several reasons. Not only is there a plethora of metrics that could be applied, but the metric value for a given metric choice is uncertain since the climate impacts of NTCFs are short-lived, causing the climate effect to be heterogeneous in space and time. Due to this heterogeneity, the natural science literature on metrics contains recommendations to adapt metrics to the regional and seasonal origin of emissions when estimating CO2 equivalence of NTCF emissions (Aamaas et al. 2016; Henze et al. 2012; Shindell and Faluvegi 2009), and it has been recognised that climate metrics need to be ‘robust enough’ to allow for a ranking in terms of cost-effectiveness of abatement options when the analysis is to be used for policy purposes (Aamaas et al. 2016).

Studies on the effect of metric choice for NTCF abatement would add to the current knowledge on NTCF policy analysis since some aspects of NTCFs are unique in relation to LLGHG, primarily the current policy context. NTCF emissions are often regulated as a group separate from LLGHG emissions in international regulations such as the UNECE Air Convention and the EU National Emission Ceilings Directive.

All in all, the current situation is that (1) several governments and international bodies are interested in controlling NTCF emissions; (2) cost-effective regulations and abatement options are preferred; (3) climate metrics are often used to estimate the impact of options; (4) the existing literature has not analysed the effect of metric choice on the cost-effectiveness of NTCF-only abatement options; and (5) most regulations and policies that regulate NTCF emissions do not regulate LLGHGs. Therefore, in this paper, we analyse whether the relative cost-effectiveness of NTCF abatement options (i.e. ranking) is affected by the choice of climate metric. We apply the assessment approach to NTCF abatement options estimated to be available in Sweden.

2 Data and method

Section 2 presents the data collected and the method used during the analysis, as well as a description of the sensitivity analysis. The analysis is done in Python by using Sublime and the code is available in the supplementary material 1.

2.1 Data selection

2.1.1 Abatement options

In the literature, there is some agreement that small scale wood combustion, mobile machinery, road transport, solvent use, and agriculture are important sources of emissions of, and precursors to, NTCFs in economically developed countries, including Sweden (ACAP 2014; AMAP et al. 2008; Arctic Council 2011; Kindbom et al. 2015; Kuylenstierna et al. 2011; Stohl et al. 2015; UNEP and WMO 2011; Zaelke et al. 2012). Data on emission abatement options in Sweden are selected based on the criteria that the literature contains estimates of both abatement costs and effects on emissions, and includes the emission sources mentioned above. Given the limited literature on NTCF abatement options that includes the necessary information, we allow the estimates to vary with respect to target year for the options (between 2020 and 2030). Based on reports on Swedish abatement options for these emitting sectors, we investigate the cost-effectiveness of reducing NTCF emissions through shifting from solid wood to wood pellets in small scale wood combustion (option Pellets) (CLEO 2014) or through investment in newer stoves, boilers, and fireplaces (Mod. boiler) (Amann 2014; CLEO 2014). We also include estimates on reducing methane emissions from the agricultural sector through increased use of fermentation of manure (Man. ferm) (Hellstedt et al. 2014; Swedish Board of Agriculture 2012), through covering of liquid slurry tanks (Sl. cover) (Swedish Board of Agriculture 2012), or through acidification of liquid slurry (Sl. acid) (Swedish Board of Agriculture 2012). Furthermore, we include options to reduce NTCF emissions from non-road mobile machinery through a rejuvenation of the machine stock (Mod. NRMM), through a shift from two-stroke to four-stroke engines in snow mobiles (4S S-mob) (Fridell and Åström 2009), or through a shift to four-stroke engines and electric engines for smaller machinery (4S&El NRMM) (Fridell and Åström 2009). Finally, we include the option of increasing the shift to water-based solvents in products to reduce NMVOC emissions (W. solv) (Amann et al. 2014). The reported average annual abatement cost per option implemented at its full scale in Sweden is converted from the reported costs and currencies to €2010 values (Table 1).

Table 1 Average estimated annual costs of emission abatement for the options considered in this paper

These options are reported to have an effect on up to six pollutants with climate impact: three types of fine particulate matter with an aerodynamic diameter of < 2.5 μm (BC, OC, and remaining sub-fractions (PMres)); NOx; CH4; and NMVOC (Table 2). The options are in the source material not considered to significantly affect fossil fuel demand, which implies insignificant impact on CO2 and SO2 emissions. In Sweden, the allowed sulphur content in diesel fuel is < 10 ppm, (< 0.001%), and given the negligible impact on fuel demand, it is reasonable to assume an equally negligible impact on SO2 emissions. In this paper, we adhere to reported numbers, but follow up on them in the discussion.

Table 2 Reported average national emission reduction of air pollutants per option

2.1.2 Metric choice and climate impact uncertainty

We use climate metrics presented by the Intergovernmental Panel on Climate Change (IPCC) in their 5th Assessment Report on climate change (Myhre et al. 2013a, b) as the primary source of information on metric values, complemented with IPCC sources when necessary (Collins et al. 2013). The IPCC and Collins et al. (2013) present values for the climate metrics GWP100, GWP20, GTP100, and GTP20 for region-specific as well as for global average (or 4-region aggregates) emissions of BC, OC, NMVOC, CH4, and NOx (given in tables 8.A.1, 8.A.3, 8.A.4, 8.A.5, and 8.A.6 in Myhre et al. (2013b); table 8.SM.14 in Myhre et al. (2013a); and tables 1 and 2 in Collins et al. (2013)). Out of the available regions, ‘Europe’ is considered most representative of Sweden.

The estimates of the radiative forcing (RF), climate impacts, and consequently the CO2 equivalent emissions of the pollutants covered in this paper are uncertain. We account for this uncertainty by assigning low, mid, and high numerical values of CO2eq emissions for each pollutant and each climate metric: the low and high numerical values correspond to the mid value minus and plus one standard deviation, respectively.

The uncertainty ranges presented in the literature are not presented using similar uncertainty intervals, so some adaptations are necessary. For BC, we use low, mid, and high values from table 2 in Collins et al. (2013) for European GTP metrics; calculate European GWP metrics based on low, mid, and high values of absolute GWP (AGWP) from table 1 in Collins et al. (2013) divided by AGWP20 and AGWP100 for CO2 in Collins et al. (2013); and use low, mid, and high values from Myhre et al. (2013b: table 8.A.6 row ‘BC total, global’) for global GTP and GWP metrics. For OC, we use the same approach as for BC, but since the presentation of global metric values is incomplete in the source material, we use average values for emissions from ‘East Asia’, ‘EU + North Africa’, ‘North America’, and ‘South Asia’ as proxies for global average emissions (Myhre et al. 2013b: table 8.A.6 row ‘4 regions’). For NMVOC, we use values from Myhre et al. (2013b: table 8.A.5) for all metrics. For European metrics, we use ‘EU + North Africa’ and for global metrics, we use ‘four regions above’. For NOx, we use values from Myhre et al. (2013b: table 8.A.3): ‘EU + North Africa’ values for European metrics, and ‘four above regions’ values for global metrics. For CH4, we do not separate European and global metrics. We use values from Myhre et al. (2013b: table 8.A.1) for GTP and GWP mid values. GTP low and high values are estimated by calculating the percentage deviation from mid values in Collins et al. (2013) table 2, and applying these percentages to the mid value from Myhre et al. (2013b: table 8.A.1). GWP low and high values are taken from Myhre et al. (2013a: table 8.SM.14) and converted from a 90% confidence interval to one standard deviation (assuming a normal distribution of CH4 uncertainty).
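The conversion from a 90% confidence interval to one standard deviation can be illustrated with a minimal Python sketch. It assumes a normal distribution, as stated above, so that the 90% bounds lie about 1.645 standard deviations from the mid value. The function name and the numerical inputs are hypothetical and are not taken from the source tables or from the supplementary code.

```python
from statistics import NormalDist


def ci90_to_one_sigma(mid, low90, high90):
    """Rescale the bounds of a 90% interval to mid -/+ one standard deviation.

    Assumes normally distributed uncertainty, so each 90% bound lies
    z = inv_cdf(0.95) (about 1.645) standard deviations from the mid value.
    """
    z90 = NormalDist().inv_cdf(0.95)
    low = mid - (mid - low90) / z90
    high = mid + (high90 - mid) / z90
    return low, high


# Hypothetical illustration values, not taken from Myhre et al. (2013a).
low, high = ci90_to_one_sigma(mid=30.0, low90=20.0, high90=40.0)
print(round(low, 2), round(high, 2))
```

The one-standard-deviation range is narrower than the 90% range, which is why the low and high values move closer to the mid value.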

For PMres, we assume that the CO2 equivalence of emissions is equal to the CO2 equivalence of OC emissions, given that most PMres emissions from biomass burning are inorganic (mineral) salts with mainly light scattering properties (Chen et al. 2017). Due to the crudeness of this assumption, we test the robustness of the results by setting the CO2 equivalent emissions of PMres emissions to zero in the sensitivity analysis. Finally, following the update recommendations in Myhre et al. (2013b), we scale the 100-year values by a factor of 0.94 for GWP100 and 0.92 for GTP100. This update accommodates updated values of the absolute metric value of CO2; it is not necessary for the 20-year time horizon, where its effect is negligible. The resulting climate metric values used in this paper are found in Table 3.

Table 3 Values of CO2 equivalent emissions per pollutant and climate metric considered in the analysis (Collins et al. 2013; Myhre et al. 2013a, b), including our assumption of PMres having the same climate impact as OC

There exist up to 729 (3^6) possible combinations of high, medium, and low CO2 equivalence for the six pollutants. For each of the eight metrics, each of these 729 combinations leads to an ordered ranking of the nine abatement options. All in all, each option thereby has 5832 (729 × 8) potential CO2 equivalent emissions impacts.
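The counting above can be reproduced with a short Python sketch that enumerates every assignment of low, mid, or high values to the six pollutants. The variable names are illustrative assumptions and do not refer to the code in the supplementary material.

```python
from itertools import product

POLLUTANTS = ["BC", "OC", "PMres", "NOx", "CH4", "NMVOC"]
LEVELS = ["low", "mid", "high"]
N_METRICS = 8  # two regional scopes x two metric types x two time horizons

# One dict per combination, mapping each pollutant to its assumed value level.
combinations = [
    dict(zip(POLLUTANTS, levels))
    for levels in product(LEVELS, repeat=len(POLLUTANTS))
]

n_per_metric = len(combinations)
n_total = n_per_metric * N_METRICS
print(n_per_metric, n_total)  # 729 5832
```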

2.2 Calculation method

We calculate the cost-effectiveness of the nine options using these 729 CO2 equivalent abatement levels for each option by dividing the cost of the option (Table 1) by the sum over all pollutants of the product of emission reduction (Table 2) and the metric-specific CO2eq values (Table 3) to give abatement cost per tonne CO2eq (Eq. 1). This calculation is done separately for all eight metrics. We do not consider uncertainty in costs and emission levels since the focus of this paper is on the effect of climate metrics on cost-effectiveness.

$$ {\mathrm{Unit}\ \mathrm{cost}}_{o,m,r}=\frac{c_o}{\sum \limits_p\left({\mathrm{red}}_{o,p}\ast {\mathrm{CI}}_{p,m,r}\right)} $$
(1)

where

o: abatement option
p: pollutant
m: climate metric
r: climate metric value range
Unit cost: cost per tonne CO2eq abated [M€2010/tonne CO2eq]
c: option cost [M€2010]
red: emission abatement [tonne]
CI: metric value [tonne CO2eq/tonne emission]
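As an illustration of Eq. 1, the following Python sketch computes the unit cost for one option, one metric, and one set of metric values. The function name, the handling of non-positive abatement, and all numbers are assumptions made for this example; they are not values from Tables 1 to 3.

```python
def unit_cost(cost, reductions, metric_values):
    """Eq. 1: option cost divided by the summed CO2eq of all emission reductions.

    cost          -- option cost c_o [M EUR2010]
    reductions    -- {pollutant: red_{o,p}} in tonnes
    metric_values -- {pollutant: CI_{p,m,r}} in tonne CO2eq per tonne emission
    Returns None when the net CO2eq abatement is not positive (assumed here
    to correspond to a case without a meaningful unit cost).
    """
    co2eq_abated = sum(reductions[p] * metric_values[p] for p in reductions)
    if co2eq_abated <= 0:
        return None
    return cost / co2eq_abated


# Hypothetical option: reduces BC and NOx; NOx has a negative metric value.
example = unit_cost(
    cost=1.0,
    reductions={"BC": 10.0, "NOx": 100.0},
    metric_values={"BC": 500.0, "NOx": -10.0},
)
print(example)  # M EUR2010 per tonne CO2eq
```

In this hypothetical case the net abatement is 10 × 500 − 100 × 10 = 4000 tonne CO2eq, so the unit cost is 1/4000 M€2010 per tonne CO2eq, or about 250 €2010 per tonne.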

The resulting abatement cost per tonne CO2eq for each option for mid values of the eight metrics is shown in Table 4. For certain metrics, some combinations of metric values can result in negative CO2eq emission abatement, i.e. increasing CO2eq emissions. For example, this may happen if some emissions decrease and others increase due to the abatement options, or, as in our case, where it mainly depends on the time horizon for NOx (which is warming in the short term but cooling in the long term). These cases are indicated with N.A. in Table 4.

Table 4 Abatement cost in €2010/tonne CO2eq for the studied options, across the mid CO2eq values of all the studied climate metrics

As can be seen, some abatement options are several orders of magnitude more expensive per unit CO2eq abatement than the cheapest option for all metrics.

For each unique combination of climate metric and metric value, we rank the cost-effectiveness of the nine options from most cost-effective (rank #1) to least cost-effective (rank #9). As mentioned, some options can increase CO2eq emission levels for specific metric values. We therefore first check whether the option would increase or decrease CO2eq emission levels. If one or several abatement options lead to an increase in CO2eq emissions, those are considered least cost-effective, while the other options are simply ranked according to their relative cost-effectiveness.
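The ranking rule described above can be sketched in Python as follows. How several CO2eq-increasing options are ordered among themselves is not specified in the text; this sketch assumes that they all share the last rank. The function name and the option labels are hypothetical.

```python
def rank_options(unit_costs):
    """Rank options from most (1) to least cost-effective.

    unit_costs -- {option: cost per tonne CO2eq, or None if the option
                   increases CO2eq emissions}
    Options that increase CO2eq emissions all receive the last rank
    (an assumption made for this sketch).
    """
    abating = sorted(
        (cost, name) for name, cost in unit_costs.items() if cost is not None
    )
    ranks = {name: position for position, (_, name) in enumerate(abating, start=1)}
    last_rank = len(unit_costs)
    for name, cost in unit_costs.items():
        if cost is None:
            ranks[name] = last_rank
    return ranks


# Hypothetical unit costs in EUR2010 per tonne CO2eq; "C" increases CO2eq emissions.
costs = {"A": 120.0, "B": 15.0, "C": None, "D": 48.0}
print(rank_options(costs))  # {'B': 1, 'D': 2, 'A': 3, 'C': 4}
```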

2.3 Sensitivity analysis

We make five different sensitivity analyses. In the first sensitivity analysis, we assume that the radiative forcing strength (in absolute terms) of the particulate sub-fractions (PMres, BC, OC) would be perfectly correlated, as discussed in Aamaas et al. (2016). This implies that we make the calculations in three separate groups. In one group, all PM sub-fractions are assigned low metric values (from Table 3), in the next mid values, and in the third high values (which gives 81 possible combinations of CO2 equivalent emissions for each of the eight metrics). In the second sensitivity analysis, we disregard all climate impact from NOx emissions in the calculations (243 combinations for each metric). This sensitivity analysis is made since NOx is the only pollutant with climate metric values ranging from positive to negative depending on metric choice and time horizon, which can have a large impact on ranking. In the third sensitivity analysis, we assume that the CO2eq of PMres is zero (243 combinations), in contrast to the main analysis where we assume that the CO2eq of PMres is equal to that of OC.
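The combination counts for the first three sensitivity analyses follow from the same enumeration logic as in the main analysis. A minimal Python sketch, with function names that are assumptions for this example, is:

```python
from itertools import product

LEVELS = ("low", "mid", "high")
PM = ("BC", "OC", "PMres")
OTHERS = ("NOx", "CH4", "NMVOC")


def correlated_pm_combinations():
    """Sensitivity analysis 1: BC, OC, and PMres always share the same level."""
    combos = []
    for pm_level in LEVELS:
        for other_levels in product(LEVELS, repeat=len(OTHERS)):
            combo = {p: pm_level for p in PM}
            combo.update(zip(OTHERS, other_levels))
            combos.append(combo)
    return combos


def combinations_without(pollutant):
    """Sensitivity analyses 2 and 3: one pollutant has no climate impact."""
    remaining = [p for p in PM + OTHERS if p != pollutant]
    return [
        dict(zip(remaining, levels))
        for levels in product(LEVELS, repeat=len(remaining))
    ]


print(len(correlated_pm_combinations()))    # 81
print(len(combinations_without("NOx")))     # 243
print(len(combinations_without("PMres")))   # 243
```

With perfectly correlated PM sub-fractions, the three PM pollutants act as one three-level variable alongside NOx, CH4, and NMVOC, giving 3 × 3^3 = 81 combinations; removing one pollutant leaves 3^5 = 243.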

The fourth sensitivity analysis is made in order to verify that any potential robustness of the ranking found in the main analysis is not primarily a result of very uneven costs of the different abatement options (Table 4). In this sensitivity analysis, we construct a hypothetical marginal abatement cost function where the cost of the least cost option starts at 20 €2010/tonne CO2eq when using mid values of GWP100-G, and where the marginal abatement cost increases with an increment of 10 €2010/tonne CO2eq for the subsequent options. Hence, in this sensitivity analysis, the most cost-effective option has a cost of 20 €2010/tonne CO2eq, the second most cost-effective 30 €2010/tonne CO2eq, and so on up to 100 €2010/tonne CO2eq for the least cost-effective option. This corresponds to the costs in Table 1 being recalibrated into 6.2 M€/year (Pellets); 4.0 M€/year (Mod. boiler); 6.5 M€/year (Man. ferm); 0.84 M€/year (Sl. cover); 1.2 M€/year (Sl. acid); 2.2 M€/year (Mod. NRMM); 0.018 M€/year (4S S-mob); 0.66 M€/year (4S&El NRMM); 11 M€/year (W. solv). In the fifth sensitivity analysis, we combine sensitivity analyses #2 and #4.
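One way to construct such a hypothetical cost curve is sketched below in Python. The sketch assumes that the options keep the cost-effectiveness order they have under mid values of GWP100-G and that the recalibrated annual cost is the target unit cost multiplied by the option's CO2eq abatement; both points are our reading of the description, and all numbers are hypothetical.

```python
def recalibrate_costs(co2eq_abated, original_unit_cost, start=20.0, step=10.0):
    """Return hypothetical annual costs giving evenly spaced unit costs.

    co2eq_abated       -- {option: tonne CO2eq abated per year (mid values)}
    original_unit_cost -- {option: original cost per tonne CO2eq}, used only
                          to fix the order of the options
    start, step        -- unit cost of the cheapest option and the increment
    """
    order = sorted(original_unit_cost, key=original_unit_cost.get)
    return {
        name: (start + step * position) * co2eq_abated[name]
        for position, name in enumerate(order)
    }


# Hypothetical options with very uneven original unit costs.
abated = {"A": 1000.0, "B": 50000.0, "C": 200.0}
original = {"A": 300.0, "B": 12.0, "C": 4000.0}
print(recalibrate_costs(abated, original))
# {'B': 1000000.0, 'A': 30000.0, 'C': 8000.0}
```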

3 Results

We present cost-effectiveness ranking results for all the possible rankings of the nine options. We first show the results of the main analysis (aggregated for all metrics), followed by disaggregated comparisons of option ranking per metric. We also present the effects of region, metric choice, and time horizon, as well as results from the sensitivity analyses on PM correlation and the PMres assumption. Finally, in brief, we present the disaggregated analysis of the effects of region, metric type, and time horizon, as well as the sensitivity analyses on NOx effects and abatement costs, but leave the details to the supplementary material 2.

3.1 Results from the main analysis

Most options have a relatively stable distribution with the 2nd and 3rd quartile ranking and average values within ± 1 from the median (Fig. 1). The option with the largest variation is Pellets, which is the only option affecting all six pollutants.

Fig. 1 Distribution of relative ranking of cost-effectiveness (€2010/tonne CO2eq) for each option. The grey box shows the range of ranks for the 2nd and 3rd quartile of the results, the grey line in the boxes shows the median rank, the cross shows the average, and the error bar shows the 90th percentile range

The relative robustness of the ranking can be further confirmed when focusing on the range around the mode ranking (the most common ranking). In Table 5, it can be seen that all options have rankings occurring around the mode ranking ± 1 for more than two-thirds of the occasions. The general picture emerging from Fig. 1 and Table 5 is that the ranking is relatively robust with respect to climate metric.
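The share of rankings that fall at the mode rank ± 1 can be computed as in the following Python sketch. The function name and the list of ranks are hypothetical; when several ranks are equally common, the sketch picks the best (lowest) of them as the mode, which is an assumption.

```python
from collections import Counter


def share_near_mode(ranks, width=1):
    """Return the mode rank and the share of ranks within mode +/- width."""
    counts = Counter(ranks)
    highest_count = max(counts.values())
    # Assumption: ties between equally common ranks go to the lowest rank.
    mode = min(rank for rank, n in counts.items() if n == highest_count)
    near = sum(n for rank, n in counts.items() if abs(rank - mode) <= width)
    return mode, near / len(ranks)


# Hypothetical ranks of one option over ten metric value combinations.
ranks = [3, 3, 3, 4, 2, 5, 3, 4, 7, 3]
mode, share = share_near_mode(ranks)
print(mode, share)  # 3 0.8
```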

Table 5 The mode rank for each option and percent of all ranks that surround the mode rank for the option

3.2 Disaggregation of the main analysis

The results are disaggregated in order to analyse which of the climate metrics has the strongest impact on the variation of the ranking of the abatement options. The disaggregation of the results also allows for an estimate of which feature of the metric design could be considered to drive the variation in rank.

3.2.1 Ranking per metric

The mode ranking of cost-effectiveness can be seen to be stable across metrics, with largest variation for 4S S-mob, followed by Man. ferm and Pellets. The variation in mode rank for 4S S-mob is likely caused by the variation in metric values for CH4 and BC, which are important for the rank range of Man. ferm, Pellets, and Sl. cover (Fig. 2).

Fig. 2 The options’ relative ranking of cost-effectiveness (€2010/tonne CO2eq) for each climate metric. The coloured boxes show the mode rank of each option, and the error bar shows the 90th percentile range

Furthermore, the disaggregated mode ranks show that the stability for the aggregated results (Table 5) is seen also for the disaggregated results (Table 6).

Table 6 Percent of all option ranks that are at the Table 5 mode rank ± 1, presented per metric

Also, for the 729 CO2eq emission levels per option and metric, the difference between max and min rank as a function of metric remains within two rank steps in more than two-thirds of the outcomes for all options (Table 7). Again, the option with the largest variation is Pellets.
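The rank variation as a function of metric can be illustrated with the Python sketch below, which takes, for one option, the rank obtained under each metric for the same combination of metric value levels, and reports the share of combinations where the difference between the highest and lowest rank is at most two. This is our interpretation of the description above; the function name, metric labels, and ranks are hypothetical.

```python
def share_within_spread(ranks_by_metric, max_spread=2):
    """Share of combinations whose max-min rank across metrics is <= max_spread.

    ranks_by_metric -- {metric: [rank for combination 0, combination 1, ...]}
                       for a single abatement option; all lists share the same
                       combination order.
    """
    spreads = [
        max(ranks) - min(ranks) for ranks in zip(*ranks_by_metric.values())
    ]
    within = sum(1 for spread in spreads if spread <= max_spread)
    return within / len(spreads)


# Hypothetical ranks of one option for three combinations under three metrics.
example_ranks = {
    "metric A": [1, 2, 5],
    "metric B": [2, 2, 1],
    "metric C": [1, 4, 3],
}
print(share_within_spread(example_ranks))
```

In this hypothetical case the spreads are 1, 2, and 4 rank steps, so two of the three combinations stay within two rank steps.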

Table 7 Rank variation as a function of metric over all 729 combinations, presented per option in percentage

3.2.2 Effects on ranking of regional scope, metric type, and time horizon

In order to improve the understanding of the variation in ranking, we separate out the results into global metrics vs. European metrics; GTP vs. GWP; and 20-year vs. 100-year time horizon. Through these separations we observe: that the option ranking varies more for global metrics than European metrics (consistent with the larger variation in metric values for global metrics, especially the larger BC and NOx metric value variation in GWP20-G and GWP100-G); that GTP rankings generally are more stable than GWP rankings (consistent with the relatively large BC and NOx value variation in GWP20-G and GWP100-G); and that a 100-year time horizon has a tendency to increase variation in ranking (consistent with most pollutants having slightly larger variation in metric values for the 100-year time horizon than the 20-year time horizon). However, neither the regional scope, nor metric type, nor time horizon has a significant effect on the variation in ranking.

3.3 Results from the sensitivity analysis

We present the results from the first (correlation in RF strength of PM species) and third (neglecting RF of PMres) sensitivity analysis. The results from the second (no NOx), fourth (hypothetical even distribution of abatement costs), and fifth (combination of #2 and #4) sensitivity analyses are shown in the supplementary material 2 and only presented in brief in the main text.

3.3.1 If the radiative forcing of PM-subspecies is correlated

In the first sensitivity analysis, we test the sensitivity of the ranking by letting the low, mid, and high CO2eq values (in absolute terms) for PMres, BC, and OC in Table 3 be correlated when calculating cost-effectiveness and ranking. Overall, the relative ranking of the options is similar in the sensitivity analysis and the main analysis (Fig. 3).

Fig. 3 Distribution of relative ranking for each option if assuming that low, mid, and high CO2 equivalence of BC, OC, and PMres are correlated (648 possible ranks per option). The grey box shows the range of ranks for the 2nd and 3rd quartile of the results, the grey line in the boxes shows the median rank, the cross shows the average, and the error bar shows the 90th percentile

3.3.2 Disregarding climate impact of PMres

In the third sensitivity analysis, we test the sensitivity of the ranking by setting the CO2 equivalent emissions of PMres to zero. The exclusion of PMres has a stabilising effect on the ranking, with noticeably reduced variability of Pellets and 4S S-mob (Fig. 4).

Fig. 4 Distribution of relative ranking for each option if assuming that PMres emissions have no radiative forcing (243 CO2 equivalence combinations per metric). The grey box shows the range of ranks for the 2nd and 3rd quartile of the results, the grey line in the boxes shows the median rank, the cross shows the average, and the error bar shows the 90th percentile

3.3.3 Summary of additional sensitivity analyses

The sensitivity analyses presented in the supplementary material 2 indicate similar results to the main analysis. Regarding the hypothetical equalised cost distribution case (sensitivity analysis #4), the results show an expected decrease in the robustness of ranking compared with the main analysis. The median and average ranking of options are relatively stable and comparable with what they were in the main analysis, except for the option Pellets, which changes median ranking from 3 to 5 and average ranking from 3.3 to 5.1.

There is however another result from the sensitivity analysis based on the hypothetical even cost that is of higher interest. When studying the details, it can be observed that the increased variation in rank (noticeable for all options in this hypothetical case) seems especially large and irregular for Mod. NRMM. In contrast with the other options, Mod. NRMM has a noticeably larger variation in ranking for the global metrics than for European metrics as well as for GWP rather than for GTP (Figure SM2 13 and SM2 14). This is likely explained by the fact that the abatement option affects NOx emissions and that the metric value for NOx changes sign under some circumstances for GWP global (while not for the other metrics). In all cases but one, the metric for NOx is negative, but for the case GWP20-G-Low, it turns positive. This causes the ranking of Mod. NRMM to vary with metric choice and assumption about time horizon and forcing strength. A further indication of the validity of this explanation is that the ranking distribution of Mod. NRMM is bimodal (Table SM2 3): it most often ranks last or close to last when NOx emissions have a negative metric value, but when NOx emissions have a positive metric value, the occurrence of ranks around place three almost doubles. To check the robustness of this explanation, we carry out an additional sensitivity analysis (#5: even cost and no NOx assumption). The expected attenuation in rank variation for Mod. NRMM as measured in Figures SM2 16 to 19 is not visible, but the bimodal characteristic of the ranking of Mod. NRMM is weakened (Table SM2 4). Hence, the observed variation of Mod. NRMM rank and its bimodal characteristic in sensitivity analysis #4 are likely due to the CO2 equivalent emission of NOx changing sign for the GWP20-G-Low metric.

4 Discussion and implications

Many different climate metrics have been suggested and are found in the literature. In this paper, we analyse to what extent the estimated relative cost-effectiveness of options to reduce CO2eq emissions through air pollution abatement is affected by the choice of climate metric used in the calculations; a choice that is at least partly normative. We calculate cost-effectiveness for nine NTCF abatement options expected to be available in Sweden using eight different climate metric designs (two regional scopes, two metric types, and two time horizons), and then compare how the abatement options’ relative cost-effectiveness would be affected by the metric-related choices. Overall, our results show comparatively robust relative cost-effectiveness (ranking), with the reservation that options having a large effect on NOx might be an exception. This implies that the choice of metric will not strongly affect policy recommendations regarding the relative cost-effectiveness of abatement options affecting only emissions of NTCFs, at least when NOx is not heavily affected. However, it also implies that potential effects on NOx are important to include when analysing cost-effective NTCF abatement options. In fact, the potential impact of NOx provides an indication that care should be taken and that all relevant NTCFs and LLGHGs, including SO2, CO, and CO2, should be included in an analysis of policy measures on NTCFs.

Our results are in apparent contrast to much of the literature on climate metrics for policy analysis. As an example, Grewe and Dahlmann (2015) stress the importance of increasing the precision of the [policy] question asked, as well as finding adequate references, emission scenarios, indicators, and time horizons, since all of these factors will impact which climate metric to use in analyses. Further, Tanaka et al. (2010) state that the choice of metric will affect the balance of which options to prefer when choosing between ‘multicomponent’ options. The important difference between these studies and our analysis is that these and other studies concern comparisons of either different LLGHGs or a combination of different LLGHGs and NTCFs. Herein lies a key message from our paper: it is important to consider the different challenges faced when designing policy for LLGHG and NTCF abatement versus when designing policy for only NTCF abatement where the effect on LLGHGs is negligible. When only considering a policy for NTCF abatement, where the LLGHGs are not affected, there is likely no need to worry about the effect of metric choices on the outcome (with reservation for NOx). In other words, when analysing the cost-effectiveness of NTCF emission abatement, the existing climate metrics show relatively robust results when it comes to the ranking of the relative cost-effectiveness of various abatement options (given that the options have an insignificant effect on NOx emissions).

Even though the metric values depend strongly on metric design (as is pointed out by Grewe and Dahlmann (2015) and Tanaka et al. (2010)), the relatively robust ranking can be understood from the fact that the metric values for the pollutants are quite strongly correlated over metrics and time horizons, with the exception of NOx. CO2 is the reference gas for all metrics used in this analysis, and it has a very long climate perturbation time compared with the perturbation time of NTCFs. The variation in metric values would be less affected by metric design choices if a short-lived gas such as methane were used as reference gas (Cherubini and Tanaka 2016). For this reason, and given the current policy separation of NTCF and LLGHG abatement, it may make sense to use methane as a reference gas when analysing the cost-effectiveness of NTCF abatement options. This would not affect the relative ranking in our analysis to any significant extent, but would make related policy assessments more transparent in the sense that the metric value for each NTCF would be more stable over different metric choices and time horizons.

One limitation is that we analyse only nine NTCF abatement options in a single country, and our results should therefore be complemented with other studies made for more countries. However, the abatement options are rather generic and likely available in most countries. Further, we do not include SO2 in the analysis, since the options considered in the analysis do not have an effect on SO2 emissions. Also, the results are partly dependent on the wide variation in abatement costs of the considered options, which we controlled for in the fourth sensitivity analysis (supplementary material 2).

The results presented in this paper are relevant since there is a risk that decision makers will mentally transfer the existing general message from metric studies—the choice of metric will have an effect, potentially a large effect, on the balance of options—onto decisions regarding NTCF-only abatement that have no (or little) effect on the emissions of LLGHGs. All in all, our results suggest that there is limited need for policymakers to be concerned by the large variations of metrics used in studies of cost-effective solutions for NTCF control as long as only NTCFs (with NOx emissions exempted) and not LLGHGs are affected by the abatement options considered.