1 Introduction

Summer (June–August, JJA) 2012 was anomalously wet in northern Europe and anomalously dry in southern Europe. The season was characterised by a strong negative summer North Atlantic Oscillation (SNAO) pattern, with a southward and eastward displacement of the jet over Europe, and most storms tracking along its northern flank and across the British Isles (Dong et al. 2013). The wet northern European summer of 2012 stands in stark contrast to the preceding two years of drought in the UK, and is a rare example of groundwater recovery following drought occurring in the summer months (Kendon et al. 2013).

Summer 2012 was the wettest in the UK since 1912, and 2012 was the second wettest year overall since 1910. The UK also experienced widespread fluvial flooding (Parry et al. 2013); note that England is more vulnerable to flooding than other countries (e.g., Crichton 2005).

While northern Europe experienced a wet summer in 2012, the season was anomalously dry in southern Europe. Spain recorded its lowest summer rainfall since 1928, which, following a winter drought, exacerbated the summer drought. Such unusual events naturally raise the question of whether the extremes were influenced by climate change. Event attribution provides a number of approaches to quantify whether climate change has made the occurrence of an extreme event more likely.

Studies on event attribution are necessarily influenced by the way the event is selected and framed, and the suitability of any models used. As it is difficult to address uncertainties in all these aspects, and uncertainties are generally more problematic for variables other than temperature (NAS 2016), we present a number of different approaches to determining whether anthropogenic forcing made the extreme precipitation of summer 2012 more or less likely. By synthesising multiple approaches, we will provide a qualitative statement on the role of anthropogenic forcing in summer 2012, and an assessment of the robustness of this statement to the choice of attribution approach.

We present a description of the event, and some historical context, in Sects. 2 and 3. Approaches to event attribution are generally divided into two categories: those using the observational record to determine the change in probability or magnitude of an event, and those using model experiments to compare the likelihood of an event in worlds with and without anthropogenic climate change. In Sect. 4 we present a number of observational approaches, while model-based approaches are presented in Sect. 5. Confidence in attribution results depends on the skill of the model in simulating the event. We present a model evaluation for our specific case in Sect. 5.1.2.

2 Motivation and description of the event

Precipitation and temperature anomalies for JJA 2012 relative to 1960–2012 from CRUTS3.23 (Harris et al. 2014) are shown in Fig. 1, alongside sea level pressure anomalies from the National Centers for Environmental Prediction (NCEP) reanalysis R1 (Kalnay et al. 1996) and SST anomalies from the Hadley Centre Sea Ice and Sea Surface Temperature dataset (HadISST) (Rayner et al. 2003). A clear dipole pattern can be seen in the precipitation anomalies, with wetter than average conditions over northern Europe, particularly the UK, and drier than average conditions over southern Europe, particularly around the Adriatic Sea. It can be seen in Fig. 1b that most of continental Europe was warmer than average in summer 2012, with the large temperature anomalies over Italy and Greece matching the spatial pattern of the precipitation anomalies. Most of the UK and Scandinavia were cooler than average. This pattern was associated with a low pressure anomaly over the UK, causing more storms to track over northern Europe (Fig. 1c).

Fig. 1

a Precipitation; and b surface air temperature anomalies for 2012 relative to 1960–2012 from CRUTS3.23; c sea level pressure anomalies from the NCEP reanalysis; and d SST anomalies from HadISST. The black boxes in panels a and b show the regions used in the definition of the event

Global sea surface temperature (SST) was warmer than the 1960–2012 mean in JJA 2012 (Fig. 1). In particular, most of the North Atlantic was warmer than average: over 2.5 \(^{\circ }\)C warmer in the north west North Atlantic and the Davis Strait. The Mediterranean was also very warm, with anomalies up to 2.5 \(^{\circ }\)C in excess of the 1960–2012 mean, in line with the high land temperatures. Both of these anomalies are large, reaching almost 3 times the inter-annual standard deviation for 1960–2012.

Statements about attribution are sensitive to the way that questions are posed: choices made about the event duration; geographic area; physical variable to consider; the metric used to determine how extreme the event was; and whether the magnitude of an actual event or a percentile from the climatology is used in probabilistic analysis. Our focus will be on the seasonal mean precipitation, as this best characterises the event in both northern and southern Europe. To capture the dipole structure of the 2012 precipitation anomaly, two regions are considered in this study (Fig. 1): northern Europe (10W–25E, 50–60N) and southern Europe (10W–25E, 35–45N). Temperature can exacerbate hydrological drought by increasing surface evaporation, so that increasing temperature presents an increasing risk of hydrological drought, even if precipitation is constant (e.g. Diffenbaugh et al. 2015; Williams et al. 2015). As such, we will also consider near-surface temperature changes for southern Europe. The observed magnitude of the event, rather than a statistical threshold (such as values lying outside the 95 or 99% confidence interval), will be used for the majority of our analysis, and exceptions will be clearly stated.

Attribution is a statistical assessment of whether observed changes are unlikely to be due to internal variability, and are consistent with estimated responses to forcing (e.g. Mitchell et al. 2001). Here, we are interested in the role of anthropogenic forcing, so make an estimate of the probability of an event occurring with anthropogenic forcing (\(p_1\)), and with reduced, or no, anthropogenic forcing (\(p_0\)), so that a fraction of attributable risk (\(\text {FAR}=(p_1-p_0)/p_1\)) or probability ratio (\(\text {PR}=p_1/p_0\), often called a risk ratio) can be presented. In this work we present probability ratios, as they directly frame the result in terms of the relative probabilities, and can equally be applied to increases and decreases in the frequency of an event.
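As a minimal illustration of these two measures, the sketch below computes PR and FAR from a pair of probabilities. The function names and the numerical values are hypothetical and are not taken from the analysis in this paper.

```python
def probability_ratio(p1: float, p0: float) -> float:
    """PR = p1 / p0, where p1 is the probability with anthropogenic forcing."""
    if p0 == 0.0:
        # The event never occurs without forcing: PR is unbounded (or undefined).
        return float("inf") if p1 > 0.0 else float("nan")
    return p1 / p0


def fraction_attributable_risk(p1: float, p0: float) -> float:
    """FAR = (p1 - p0) / p1, which equals 1 - 1/PR."""
    if p1 == 0.0:
        raise ValueError("FAR is undefined when p1 is zero")
    return (p1 - p0) / p1


# Hypothetical probabilities, chosen only for illustration.
p1, p0 = 0.10, 0.05
pr = probability_ratio(p1, p0)
far = fraction_attributable_risk(p1, p0)
print(f"PR = {pr:.2f}, FAR = {far:.2f}")
```

Because FAR = 1 − 1/PR, the two measures carry the same information. When an event becomes less likely (PR < 1), FAR turns negative, which is one reason a ratio can be easier to interpret for decreases in frequency.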

3 Historical context

The extreme summer of 2012 was one in a number of consecutive wet summers in northern Europe since 2007 (Fig. 2a). The mean precipitation rate in northern Europe was 3.2 mm day\(^{-1}\) (CRUTS3.23), 1.7 standard deviations above the 1960–2012 mean, and 1.2 standard deviations above the 1999–2013 mean. In southern Europe, the mean precipitation rate of 0.54 mm day\(^{-1}\) was 2 standard deviations below the 1960–2012 mean, and 1.6 standard deviations below the 1999–2013 mean. Southern Europe was also very warm in summer 2012: the mean temperature of 23.8 \(^{\circ }\)C (CRUTS3.23) was 2 standard deviations above the 1960–2012 mean, and 1 standard deviation above the 1999–2013 mean.

Fig. 2

Time series of observed anomalies (relative to 1960–2012) in JJA a precipitation in northern Europe; b near-surface temperature in northern Europe; c precipitation in southern Europe; d near-surface temperature in southern Europe

Figure 2 shows time series of northern and southern Europe seasonal mean temperature and precipitation from CRUTS3.23, alongside E-OBS (Haylock et al. 2008) and HadCRUT4 (Morice et al. 2012). There is a positive trend in temperature in both regions: best estimates range from 0.28 to 0.31 \(^{\circ }\)C decade\(^{-1}\) [1960–2012] across the three datasets in northern Europe, and from 0.41 to 0.44 \(^{\circ }\text{C}\) decade\(^{-1}\) in southern Europe. Regional temperature is seen to be rising proportional to global temperature plus noise from weather and in some areas low-frequency variability (pattern scaling, Mitchell 2003), in this case much faster than the global mean (van Oldenborgh et al. 2009). The global temperature trend is mainly driven by anthropogenic forcing (Bindoff et al. 2013), which is non-linear in time. The precipitation time series also show natural variability. Precipitation increases in southern Europe from 1960 to the late 1970s, before decreasing. The pattern is reversed in northern Europe, where precipitation decreases from 1960 to the late 1970s before increasing. However, the lag-1 autocorrelation is slightly negative in the north and zero in the south, both zero within uncertainties, pointing to the dominance of high-frequency variability.
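The two diagnostics quoted above, a linear trend per decade and a lag-1 autocorrelation, can be sketched as follows. This is an illustrative implementation using ordinary least squares and synthetic data; the function names are hypothetical, and the exact estimators used for the values in the text are not specified here.

```python
from statistics import mean


def trend_per_decade(years, values):
    """Ordinary least-squares slope of values against years, per decade."""
    ybar, vbar = mean(years), mean(values)
    num = sum((y - ybar) * (v - vbar) for y, v in zip(years, values))
    den = sum((y - ybar) ** 2 for y in years)
    return 10.0 * num / den


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a series about its own mean."""
    vbar = mean(values)
    dev = [v - vbar for v in values]
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    den = sum(d * d for d in dev)
    return num / den


# Synthetic series for illustration: a pure 0.03 degC/yr warming, 1960-2012.
years = list(range(1960, 2013))
temps = [0.03 * (y - 1960) for y in years]
print(trend_per_decade(years, temps))
```

A lag-1 autocorrelation near zero, as reported for precipitation in both regions, indicates that one summer carries little information about the next.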

Figure 3 shows composites of seasonal precipitation anomalies (at least 1.5 standard deviations above the 1960–2012 mean in northern Europe, and 1.5 standard deviations below the mean for southern Europe, from E-OBS) and their associated 500 hPa geopotential height anomalies (from the 20th Century Reanalysis; Compo et al. 2011). The composites of extreme precipitation for the northern European box are associated with a large-scale dipole pattern in 500 hPa geopotential height, with a low-pressure anomaly centred over the British Isles and Scandinavia, and a high-pressure anomaly centred over the central North Atlantic: a pattern close to the definition of the Summer NAO (Folland et al. 2009; Bladé et al. 2012a) (Fig. 3a). Wet summers in northern Europe are typically associated with precipitation deficits along the Mediterranean coast. Dry summers in southern Europe are typically associated with wet summers in Scandinavia and the Baltics (Fig. 3b), and a positive geopotential height anomaly over Europe. In dry summers (as defined in the southern box), the largest anomalies are typically found in the Adriatic region.
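The compositing procedure can be sketched in two steps: select the years whose regional mean lies beyond 1.5 standard deviations, then average the anomaly fields over those years. The data layout (dictionaries keyed by year, fields as flat lists of grid-point values) and the function names are assumptions made for this illustration.

```python
from statistics import mean, stdev


def select_extreme_years(series, threshold=1.5, wet=True):
    """Years whose standardised anomaly is at least `threshold` above the
    mean (wet=True) or at least `threshold` below it (wet=False)."""
    values = list(series.values())
    mu, sd = mean(values), stdev(values)
    chosen = []
    for year, value in sorted(series.items()):
        z = (value - mu) / sd
        if (wet and z >= threshold) or (not wet and z <= -threshold):
            chosen.append(year)
    return chosen


def composite(fields, years):
    """Grid-point mean of the anomaly fields over the selected years."""
    n_points = len(fields[years[0]])
    return [sum(fields[y][i] for y in years) / len(years) for i in range(n_points)]


# Hypothetical regional means (mm/day) and two-point anomaly fields.
regional = {2000 + i: 1.0 for i in range(9)}
regional[2009] = 5.0
fields = {y: [0.0, 0.0] for y in regional}
fields[2009] = [2.0, -1.0]
wet_years = select_extreme_years(regional, wet=True)
print(wet_years, composite(fields, wet_years))
```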

Fig. 3

Composites of JJA mean precipitation (shading) and 500 hPa geopotential height anomalies [m] (line contours) in a: wet northern European summers; and b dry southern European summers

Comparison of these composites with the 2012 anomaly shown in Fig. 1 shows that the pattern of anomalous precipitation in 2012 is typical of wet northern and dry southern European summers, and is associated with a negative geopotential height anomaly over the UK.

Summer 2012 saw increased precipitation frequency, as well as amount, in northern Europe (Fig. 4a; Yiou and Cattiaux 2013) compared to the 1960–2012 mean. The frequency distribution of daily precipitation in northern Europe in 2012 is similar to that for the 5 wettest northern European summers since 1960. However, even compared to those years, there are fewer dry days, and more days with heavy rain (Fig. 4d). The maximum daily rain rate, averaged across northern Europe, was 7.15 mm day\(^{-1}\) in 2012, compared to 7.08 mm day\(^{-1}\) for 1960–2012. However, there were 44 days with a regional average rain rate in excess of 3 mm day\(^{-1}\) in 2012, compared to 28.7, on average, for 1960–2012. In addition to the persistent heavy rain that characterised the summer, there were exceptional flood events in the UK related to torrential downpours, with high 1- and 2-day rainfall totals, and UK rain rates in excess of 50 mm h\(^{-1}\) (Parry et al. 2013).

Fig. 4

Daily precipitation frequency in northern Europe in JJA 2012 (blue) and JJA 1960–2012 (red) for a E-OBS; and b HadGEM3-A. c JJA 2012 daily precipitation from E-OBS (blue), compared to daily precipitation from the 5 members of HadGEM3-A that best reproduce the observed JJA precipitation anomaly (red). Daily precipitation frequency for the 5 wettest summers (green) in northern Europe compared to 2012 (blue) for d E-OBS; and e HadGEM3-A

In southern Europe there were 7 days where the regional average rain rate was 0 mm day\(^{-1}\), compared to 2 days, on average, for 1960–2012 (Fig. 5a). There was less precipitation at all frequencies compared to 1960–2012, so the frequency distribution of daily precipitation was typical of dry southern European summers (Fig. 5d). Note that additional panels in Figs. 4 and 5 relate to model analysis (using HadGEM3-A), which is introduced in Sect. 5.1.
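The day counts used in this section are simple threshold counts on the regionally averaged daily rain rate. A minimal sketch, with hypothetical function names and a made-up five-day series:

```python
def wet_day_count(daily_rates, threshold=3.0):
    """Number of days with a regional mean rain rate above `threshold` (mm/day)."""
    return sum(1 for rate in daily_rates if rate > threshold)


def dry_day_count(daily_rates, threshold=0.0):
    """Number of days with a regional mean rain rate at or below `threshold`."""
    return sum(1 for rate in daily_rates if rate <= threshold)


# Made-up daily rates (mm/day), for illustration only.
sample = [0.0, 0.0, 1.0, 3.5, 4.0]
print(wet_day_count(sample), dry_day_count(sample))
```

Whether a day counts as dry depends on how the regional average is rounded, so the dry-day threshold is left as a parameter here.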

Fig. 5

Daily precipitation frequency in southern Europe in JJA 2012 (blue) and JJA 1960–2012 (red) for a E-OBS; and b HadGEM3-A. c JJA 2012 daily precipitation from E-OBS (blue), compared to daily precipitation from the 5 members of HadGEM3-A that best reproduce the observed JJA precipitation anomaly (red). Daily precipitation frequency for the 5 driest summers (green) in southern Europe compared to 2012 (blue) for d E-OBS; and e HadGEM3-A

4 Observation-based approaches to attribution

Observation-based approaches to attribution include the use of long-term data to determine the changes in likelihood of an observed event with time (van Oldenborgh et al. 2015), and the identification of analogues of the characteristics of an observed event to determine how similar types of events have changed (Vautard and Yiou 2009). We employ both of these approaches in this section.

European observations present a long record that is updated in close to real-time, making observation-based analyses possible. However, uncertainties in such analyses are considerable, particularly for precipitation, where differences between observational datasets can be as large as those between climate models (Prein and Gobiet 2017). Undercatch, undersampling, and the smoothing of fields in the creation of gridded datasets can mean that the severity of both high and low precipitation extremes are underestimated in observations.

4.1 Return times

One approach to observation-based attribution is the return time plot, which uses historical observations to characterise the distribution of a type of event that is similar to the observed event. To address an anthropogenic component, a trend or covariate related to the event, and already attributed to human influences, is identified to describe the trend. The distribution fitted to the observations is scaled or shifted with this covariate. Here, we use the 4-year smoothed global mean temperature. We make the common assumptions that the distribution shifts with the covariate for temperature, and scales with it for precipitation (for details see e.g., van der Wiel et al. 2017).

We first checked whether the seasonally-averaged temperature or precipitation (CRUTS3.24.01, June–August average) can be adequately approximated by a normal (Gaussian) distribution. This turned out to be the case for both the northern and southern European precipitation distributions. However, the ten warmest summers in southern Europe are warmer than expected on the basis of a normal distribution plus trend, i.e., the tail is fatter. This points to non-linear effects, probably of soil moisture feedbacks (Whan et al. 2015). Thus, for temperature we fitted a Generalised Pareto Distribution (GPD) to the most extreme 20% of seasons since 1901, inspired by extreme value theory (Coles 2001). We impose a normally distributed penalty with \(\sigma =0.2\) constraining the shape parameter \(\xi\) to more physical values \(|\xi | \lesssim 0.4\), which is much larger than the fitted range. This function was found to be a good fit to the data (Fig. 6d). A nonparametric bootstrap with 1000 samples is used to estimate the uncertainties. Finally, we inverted the normal distribution and GPD to obtain return times for the mean precipitation and temperature, respectively, in 1901, 1960, and 2012. The observed values from 2012 were not included in the fits.
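To make the scale and shift assumptions concrete, the sketch below evaluates return times from a normal distribution whose parameters depend on a covariate. The scaling form used here (mean and standard deviation multiplied by the same factor, so that their ratio stays fixed) is one common choice and is an assumption of this example, as are the function names and all parameter values.

```python
from math import exp
from statistics import NormalDist


def scaled_normal(mu0, sigma0, alpha, t):
    """Precipitation-style fit: mean and spread scale by a common factor."""
    factor = exp(alpha * t / mu0)
    return NormalDist(mu0 * factor, sigma0 * factor)


def shifted_normal(mu0, sigma0, alpha, t):
    """Temperature-style fit: the mean shifts with the covariate t."""
    return NormalDist(mu0 + alpha * t, sigma0)


def return_time(dist, value, upper_tail=True):
    """Return time in years of exceeding (or falling below) `value`."""
    p = 1.0 - dist.cdf(value) if upper_tail else dist.cdf(value)
    return float("inf") if p <= 0.0 else 1.0 / p


# Hypothetical parameters: mm/day, mm/day, and mm/day per K of smoothed GMST.
mu0, sigma0, alpha = 2.6, 0.35, 0.1
wet_value = 3.2
rt_past = return_time(scaled_normal(mu0, sigma0, alpha, t=0.0), wet_value)
rt_now = return_time(scaled_normal(mu0, sigma0, alpha, t=0.7), wet_value)
print(f"return time: {rt_past:.0f} yr (reference), {rt_now:.0f} yr (warmer)")
```

For a dry extreme, the lower tail is used instead (`upper_tail=False`), so that the return time refers to seasons at least as dry as the observed one.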

Figure 6 shows the return levels associated with the 1960 and 2012 fits. The observations are drawn twice, once shifted or scaled with the fitted trend to the 1960 values (blue) and once to the 2012 values (red). The likelihood of seasonal mean values of the magnitude of the 2012 event is indicated by the intersection of the pink horizontal line and the fitted curves.

Figure 6a shows that the 2012 precipitation in northern Europe was not very rare, with a return time of roughly 20 years (95% CI 10–100 years). Up to now there has been no discernible trend in summer precipitation in the northern Europe region, with a trend larger than a factor 2.5 up or down excluded (PR \(\approx\) 1.2, 95% CI 0.4–2.9). The drought in southern Europe is rarer by this measure, with a return time of about 50 years (95% CI 30–150 years), making it the second-driest event in the series after the summer of 1928. There is again no discernible trend: changes in probability of more than a factor 1.6 are excluded at the 95% level (Fig. 6b).
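Confidence intervals of this kind can be obtained with a nonparametric bootstrap. The sketch below resamples a seasonal series with replacement, refits a normal distribution each time, and takes percentiles of the resulting return times. It is a simplified illustration: it ignores the trend term, uses synthetic data, and the function name is hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def bootstrap_return_time_ci(sample, value, n_boot=1000, seed=0):
    """Percentile 95% interval for the return time of exceeding `value`."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in range(len(sample))]
        sd = stdev(resample)
        if sd == 0.0:
            continue  # degenerate resample, no distribution to fit
        p = 1.0 - NormalDist(mean(resample), sd).cdf(value)
        times.append(float("inf") if p <= 0.0 else 1.0 / p)
    if not times:
        raise ValueError("no usable bootstrap resamples")
    times.sort()
    lower = times[int(0.025 * len(times))]
    upper = times[min(len(times) - 1, int(0.975 * len(times)))]
    return lower, upper


# Synthetic seasonal means (mm/day), for illustration only.
data_rng = random.Random(1)
series = [data_rng.gauss(2.6, 0.35) for _ in range(100)]
ci_low, ci_high = bootstrap_return_time_ci(series, 3.2)
print(f"95% CI for the return time: {ci_low:.0f} to {ci_high:.0f} years")
```

Return times are a strongly non-linear function of the fitted parameters, which is why intervals such as those quoted above are typically very asymmetric.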

Not surprisingly, increasing global temperatures are related to a substantial increase in the likelihood of hot summers in southern Europe (Fig. 6c). The summer temperatures are seen to increase at about twice the rate of the global mean temperature (cf. van Oldenborgh et al. 2009). The JJA mean temperature observed in southern Europe in 2012 was about 30 (95% CI 3–1500) times more likely in 2012 than in 1960, and 90 (95% CI 4 to \(6\times 10^{6}\)) times more likely than in 1901 (Fig. 6c). The 2012 event had a return time of about 20 years (95% CI 2–400) after taking the trend into account.
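A probability ratio for a hot extreme can be illustrated with a GPD tail whose threshold shifts with the covariate. The sketch below uses the standard GPD survival function; treating the shift as acting on the threshold alone, the function names, and every parameter value are assumptions of this example rather than the fitted values behind Fig. 6c.

```python
from math import exp


def gpd_exceedance(x, u, sigma, xi, tail_fraction=0.2):
    """P(X > x) for x >= u, with a GPD fitted to the upper `tail_fraction`."""
    if x < u:
        raise ValueError("x must lie at or above the threshold u")
    z = (x - u) / sigma
    if abs(xi) < 1e-12:
        return tail_fraction * exp(-z)  # exponential limit of the GPD
    base = 1.0 + xi * z
    if base <= 0.0:
        return 0.0  # beyond the finite upper bound that exists when xi < 0
    return tail_fraction * base ** (-1.0 / xi)


def shifted_probability_ratio(x, u0, alpha, t0, t1, sigma, xi):
    """PR between covariate values t1 and t0 for a threshold shifting as u0 + alpha * t."""
    p0 = gpd_exceedance(x, u0 + alpha * t0, sigma, xi)
    p1 = gpd_exceedance(x, u0 + alpha * t1, sigma, xi)
    return float("inf") if p0 == 0.0 else p1 / p0


# Hypothetical values: degC for x, u0 and sigma; degC per K for alpha.
pr_hot = shifted_probability_ratio(
    x=24.0, u0=22.0, alpha=2.0, t0=0.0, t1=0.6, sigma=0.5, xi=-0.1
)
print(f"PR = {pr_hot:.1f}")
```

With a negative shape parameter the tail is bounded, so a modest shift of the threshold can change the exceedance probability by a large factor. This is consistent with the very wide confidence intervals on the probability ratios quoted above.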

Fig. 6

Return time plots of JJA precipitation from CRUTS3.24.01, assuming the precipitation distribution scales with global mean surface temperature for a northern Europe; b southern Europe. c Return time plot of JJA near-surface temperature from CRUTS3.24.01, assuming the temperature distribution shifts with global mean surface temperature, for southern Europe. d Time series and fit for southern Europe temperature

4.2 Circulation analogues

Circulation analogues are used to understand the influence of atmospheric flow on surface climate (Vautard and Yiou 2009). We use analogues to determine the extent to which the wet summer in northern Europe was associated with the atmospheric circulation, and examine whether that association has evolved with time. The analysis in this section requires a long time series of daily data. It is therefore restricted to the analysis of UK precipitation, taken from HadUKP (1931–present). HadUKP (Alexander and Jones 2000) divides the UK into nine regions (Table 1).

Table 1 The nine HadUKP regions, and their acronyms

Atmospheric circulation data were obtained from the Twentieth Century Reanalysis (20CR; Compo et al. 2011). All 56 ensemble members were used, considering mean daily SLP over the North Atlantic region (80W–30E; 30–70N) between 1931 and 2012. The skill and caveats of this reanalysis have been discussed by many authors (e.g. Compo et al. 2011; Ferguson and Villarini 2014).

The skill of 20CR in reproducing pressure fields plateaus after 1940 (rms \(\approx\) 1.5 hPa), and is still ramping up in 1931–1939 (rms \(\approx\) 2 hPa), although it is already better than in 1871–1930 (rms \(\approx\) 2.5 hPa). SLP over this region is well constrained by construction: it is the assimilated variable in the reanalysis, and we only consider it over the North Atlantic region where observational density is high. The use of the 56 ensemble members allows a sampling of the reanalysis uncertainty, in particular for the 1931–1970 period.

It can be seen in Fig. 1a that the positive precipitation anomaly in 2012 has a maximum over the western UK. HadUKP precipitation is consistent with this: the summer precipitation rate reached a record in 2012 in South Scotland (SSP) and North West England and Wales (NWEP) (Supplementary Figure 1).

To identify any long-term changes in JJA circulation patterns, we determine the probability distribution of observed summer accumulated precipitation for two periods: 1931–1970 and 1971–2011. We computed 20 daily analogues of summer 2012 sea level pressure (SLP) for those two periods, following the methods described by Yiou et al. (2013). The analogues are obtained by minimising the root mean square difference between SLP maps. The spatial (rank) correlation is then computed for verification. The distributions of distances and correlations are indistinguishable in the two subperiods, indicating that there is no measurable change in the atmospheric circulation between them (a Kolmogorov-Smirnov test does not allow us to reject the null hypothesis that the two distributions are the same; Supplementary Figure 2). This is in line with the findings of Bladé et al. (2012a) that there has been no trend in the summer NAO.
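The analogue search and the comparison of the two subperiods can be sketched as follows. SLP maps are represented here as flat lists of grid-point values keyed by a date label, which is an assumption of this example, as are the function names. The two-sample Kolmogorov-Smirnov statistic is computed directly from the empirical distribution functions; no p-value is derived.

```python
from bisect import bisect_right
from math import sqrt


def rms_distance(map_a, map_b):
    """Root mean square difference between two flattened SLP maps."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(map_a, map_b)) / len(map_a))


def best_analogues(target, candidates, k=20):
    """Labels of the k candidate maps closest to `target` in the RMS sense."""
    ranked = sorted(candidates, key=lambda d: rms_distance(target, candidates[d]))
    return ranked[:k]


def ks_statistic(x, y):
    """Largest gap between the empirical CDFs of two samples."""
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


# Tiny synthetic example: one target day and three candidate days.
target_day = [0.0, 0.0]
pool = {"a": [1.0, 1.0], "b": [0.0, 0.0], "c": [3.0, 4.0]}
print(best_analogues(target_day, pool, k=2))
```

Applying `ks_statistic` to the analogue distances from the two subperiods gives a single number summarising how different the two distributions are; a small value is consistent with an unchanged circulation.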

For each daily analogue, we determine the corresponding precipitation from HadUKP. In this way, we can simulate two large ensembles (10,000 members) of ‘uchronic’ summer total precipitation for JJA 2012, with analogues in 1931–1970 and 1971–2011. Uchronic simulations represent the distribution of accumulated precipitation that could have occurred if the circulation had been statistically the same as it was in 2012, but 2012 had fallen in 1931–1970 or 1971–2011. In an approach similar to the static analogue weather generator described by Yiou (2014), these simulations are performed by picking one of the 20 daily analogues at random for each day of the summer (following Jezequel et al. 2017).
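The random resampling step can be sketched as below. The input is assumed to be a list with one entry per summer day, each entry holding the precipitation observed on that day's analogues; this layout, the function name and the toy numbers are hypothetical.

```python
import random


def uchronic_totals(analogue_precip, n_members=10000, seed=0):
    """Summer totals built by drawing one analogue per day, independently."""
    rng = random.Random(seed)
    return [
        sum(rng.choice(day_values) for day_values in analogue_precip)
        for _ in range(n_members)
    ]


# Toy example: a 3-day "summer" with 2 analogues per day (mm).
toy_days = [[0.0, 2.0], [1.0, 1.0], [4.0, 6.0]]
totals = uchronic_totals(toy_days, n_members=1000)
print(min(totals), max(totals))
```

Drawing each day independently discards day-to-day persistence in the precipitation itself, while the persistence of the circulation is retained through the fixed sequence of 2012 target days.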

From the two ensembles (past and present) of precipitation simulations conditional on the JJA 2012 circulation analogues, we compute the empirical probabilities that precipitation exceeds the 95th percentile in the two subperiods. The uncertainty in the probabilities is determined by the spread over the ensemble members of 20CR. The change of probabilities provides an estimate of the thermodynamic contribution of climate change: the dynamic part is fixed to be analogous to the circulation pattern sequence of summer 2012. Hence, we estimate a conditional probability of exceeding a high threshold \(R^\text {ref}\) (the 95th quantile) by:

$$\begin{aligned} p_\text {thermo}=P(R>R^\text {ref}|C\sim C^\text {JJA2012}) \end{aligned}$$

where the change of probabilities is assumed to be due to the global temperature increase between the two subperiods.
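Empirically, this conditional probability is the fraction of uchronic ensemble members that exceed the reference threshold. A minimal sketch, assuming the threshold is taken as the 95th percentile of an unconditional climatology; the function names and the toy values are hypothetical.

```python
from statistics import quantiles


def high_threshold(climatology):
    """95th percentile of the unconditional summer precipitation."""
    return quantiles(climatology, n=20)[-1]  # last of the 19 cut points


def exceedance_probability(ensemble, threshold):
    """Fraction of ensemble members strictly above the threshold."""
    return sum(1 for r in ensemble if r > threshold) / len(ensemble)


# Toy climatology (1..100) and a small conditional ensemble.
r_ref = high_threshold([float(v) for v in range(1, 101)])
p_thermo = exceedance_probability([90.0, 96.0, 97.0, 100.0], r_ref)
print(r_ref, p_thermo)
```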

Figure 7 shows the probability distributions of precipitation for the nine UK regions. The red boxplots emphasise the role of the atmospheric circulation in driving extreme precipitation. Apart from North Scotland (NSP), a circulation analogous to that of summer 2012 leads to higher precipitation rates than normal.

Fig. 7

Box and whisker plots of summer precipitation rates for the nine UK regions, and the two sub periods. The white boxplots represent the probability distribution of summer precipitation with all types of atmospheric circulation patterns. The red boxplots represent the probability distribution of the summer precipitation conditional on the summer 2012 circulation. The thick horizontal dashed lines represent the JJA 2012 value of precipitation. The thin horizontal dashed line is the 95th quantile of 1931–2010 mean summer precipitation, estimated from the boxplots in white

The change in precipitation when the circulation is analogous to 2012 (red boxplots in Fig. 7) is interpreted as the thermodynamic contribution. We compute the conditional probability ratio for the thermodynamic contribution, \(\rho ^\text {thermo}=\frac{p_{1}}{p_{0}}\), where \(p_{0}\) and \(p_{1}\) are the thermodynamic probabilities in 1931–1970 and 1971–2011, respectively. We consider the ratios of probabilities of exceeding the 95th percentile (thin dashed grey lines in Fig. 7). \(\rho ^\text {thermo}\) for each region is shown in Fig. 8a. There is no value for Northern Scotland (NSP) because \(p_{1}\) is equal to zero (precipitation never exceeds the 95th quantile during the 1971–2011 period). Four of the UK regions show a decreased probability of a wet summer conditional on an atmospheric circulation analogous to 2012 (\(\rho ^\text {thermo}=\frac{p_{1}}{p_{0}}<1\)). Only North West England and Wales (NWEP), North East England (NEEP) and South Scotland (SSP) show an increase of precipitation when the circulation is close to JJA 2012. There is no change in Central England (CEP).
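Since the analogue selection is repeated for each reanalysis member, one ratio is obtained per member and the spread of these ratios gives the uncertainty. The sketch below assumes two equal-length lists of per-member probabilities and skips members for which the ratio is undefined; the function name and the values are hypothetical.

```python
from statistics import median


def thermodynamic_ratios(p0_members, p1_members):
    """One ratio p1/p0 per reanalysis member, skipping members with p0 == 0."""
    return [p1 / p0 for p0, p1 in zip(p0_members, p1_members) if p0 > 0.0]


# Hypothetical per-member probabilities for 1931-1970 (p0) and 1971-2011 (p1).
p0_list = [0.10, 0.20, 0.00]
p1_list = [0.05, 0.30, 0.10]
ratios = thermodynamic_ratios(p0_list, p1_list)
print(median(ratios), min(ratios), max(ratios))
```

A median ratio below 1 would indicate that, for a 2012-like circulation, exceeding the threshold became less likely in the later period.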

Fig. 8

a Boxplots indicating the probability distribution of the ratios \(\rho ^\text {thermo}\), obtained from the 56 member ensemble of the 20CR. Values lower than 1 indicate that the thermodynamic component makes the event less likely between 1931–1970 and 1971–2011. The \(\rho ^\text {thermo}\) value for North Scotland (NSP) is \(9\times 10^{-3}\) and does not appear in the figure. b Comparison between risk ratios from the return time approach, applied to the HadUKP regions, and those from the thermodynamic component indicated by the analogues approach. Vertical and horizontal dotted lines show the 95% CI in each case

Analogue analysis shows that the atmospheric circulation observed in 2012 increased the likelihood of higher precipitation rates than normal in the UK, in all regions except Northern Scotland, consistent with a southward displacement of the storm track in 2012 (Dong et al. 2013). There has been no long-term trend in this circulation pattern associated with increasing global temperatures. The analogue method therefore suggests that the occurrence of the pattern in 2012 is the result of natural variability.

Conditioning the precipitation from 1931 to 1970 and 1971 to 2011 on the 2012 circulation allows the thermodynamic contribution to be estimated. For four of the UK regions, thermodynamic changes have decreased the likelihood of a wet UK summer, while they have increased the likelihood in NWEP, NEEP, and SSP. This result is stable when changing the threshold value of high precipitation (e.g. 90th rather than 95th quantile). Yet this dichotomy, with an increasing thermodynamic contribution to high summer precipitation in some UK regions and a decreasing contribution in others, is surprising, because regions that are geographically close should yield similar climatologies. During the summer of 2012, regions in the north and west of the UK were exposed to an anomalous cyclonic circulation (Fig. 1c), which explains why they are more affected by a thermodynamic effect on the moisture transport by the atmospheric circulation.

The analogue results can be compared with fits to the extremes of precipitation, as in Sect. 4.1, for these regions, using monthly HadUKP data from 1931. This procedure finds results in qualitative agreement with the analogue analysis in all regions except ESP, where it produces an insignificant trend toward wetter summers. If the circulation patterns have not changed, as suggested by the analogue analysis (Supplementary Figure 2), these trends should agree with the thermodynamic trends of Fig. 8a, which they do within the 95% confidence interval (Fig. 8b).

5 Model-based approaches to attribution

Climate model experiments enable the comparison between an historical simulation, and an equivalent counterfactual simulation without changes in anthropogenic forcing. Model-based approaches can either be conditional, so that the attribution question is answered after constraining the state of one or more slowly varying parts of the climate system, or unconditional. Analysis using coupled models is typically considered to be unconditional attribution, which accounts for uncertainty from internal variability.

Whether or not the attribution is conditioned is an essential part of framing, as it affects the quantitative estimates of the extent of anthropogenic influence, and more closely relates the study to the factors driving the particular event. Conditional attribution often improves the signal-to-noise ratio of the anthropogenic influence. It also explicitly includes background knowledge about climate change through the choice of counterfactual conditions (e.g. counterfactual SSTs). However, there are uncertainties associated with the specification of the counterfactual mean state, which we discuss for our case later.

While conditioning can help to better isolate the effect of interest, it can complicate interpretation, because the conditioning factor might itself be affected by external forcing, even if we do not have the evidence required to quantify such a change in likelihood.

We present analysis conditional on SSTs in Sect. 5.1 using an atmosphere-only model, and unconditional analysis using an ensemble of coupled models in Sect. 5.2.

5.1 Attribution with atmosphere-only models

In this section, we explore the role of anthropogenic forcing in the events of summer 2012, conditional upon the prevailing pattern of SST anomalies, since SST structure could influence the atmospheric circulation. Another advantage is that the biases in SST-forced models are typically smaller than in coupled models. The more realistic background state, e.g., in the jet position, makes the attribution more reliable (see, e.g. Schaller et al. 2016). With this constraint, the effect of external forcing can be more clearly assessed.

Atmosphere-only models should provide a more realistic simulation of the event than their coupled counterparts. However, a choice needs to be made about the definition of counterfactual SSTs. Attribution conclusions can be sensitive to the choice of counterfactual SSTs (Hauser et al. 2017; Teufel et al. 2017), which can only be modelled, and models simulate diverse responses to forcing. The choice of counterfactual SST pattern matters, as SST gradients can affect the pattern of zonal wind change (Haarsma et al. 2013). Here, we use counterfactual SSTs from a multi-model mean to mitigate some of this uncertainty. Another approach is to consider multiple counterfactual SST fields (e.g. Schaller et al. 2016).

In our experiments, horizontal boundary conditions at the surface are specified in a series of sea surface temperature and sea ice fields. In the ALL experiment observed values are taken from HadISST. In the NAT experiments, an estimate of the change in SST and sea ice extent due to anthropogenic influence is removed from the observed fields following the approach of Christidis et al. (2013). The choice of SST pattern to be removed can matter. We use the mean from the fifth Coupled Model Intercomparison Project (CMIP5) (Taylor et al. 2012) as our estimate of anthropogenic SST change, to account for the fact that different climate models generate different patterns of SST changes in response to anthropogenic forcing.
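The construction of the counterfactual SST field can be sketched as a subtraction of a multi-model mean anthropogenic change from the observed field. Here the change is taken to be the mean difference between paired simulations with and without anthropogenic forcing, which is an assumption of this example; the details of the approach of Christidis et al. (2013), including the treatment of sea ice, are not reproduced. Fields are flat lists of grid-point values and the function names are hypothetical.

```python
def multi_model_mean_delta(all_forcing_fields, natural_fields):
    """Mean over models of (all-forcing minus natural-only) SST at each point."""
    n_models = len(all_forcing_fields)
    n_points = len(all_forcing_fields[0])
    return [
        sum(all_forcing_fields[m][i] - natural_fields[m][i] for m in range(n_models))
        / n_models
        for i in range(n_points)
    ]


def counterfactual_sst(observed, delta):
    """Observed SST with the estimated anthropogenic change removed."""
    return [obs - d for obs, d in zip(observed, delta)]


# Two hypothetical models on a two-point grid (degC).
delta_sst = multi_model_mean_delta(
    [[15.5, 20.0], [16.5, 21.0]],  # with anthropogenic forcing
    [[15.0, 19.5], [15.5, 20.0]],  # natural forcing only
)
nat_sst = counterfactual_sst([16.0, 20.5], delta_sst)
print(delta_sst, nat_sst)
```

Averaging the change over several models smooths out model-specific SST patterns, at the cost of a pattern that no single model produced.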

5.1.1 HadGEM3-A

For model-based attribution of the event we primarily use the HadGEM3-A system (Ciavarella et al. 2017). The atmosphere-only configuration uses the GA6 atmosphere, which represents a significant upgrade to the previous version of HadGEM3-A used in attribution studies (Christidis et al. 2013). GA6 uses the ENDGame dynamical core and the JULES (Joint UK Land Environment Simulator) land surface model. We use HadGEM3-A at N216 L85 resolution, which is a horizontal resolution of approximately 60 km at midlatitudes.

The HadGEM3-A attribution ensemble consists of 15 factual (ALL) and 15 counterfactual (NAT) ensemble members. Each experiment runs from 1960 to near real-time. The 15 ensemble members are generated through the simultaneous operation of two physics schemes, Random Parameters and Stochastic Kinetic Energy Backscatter II, and represent model uncertainty (Ciavarella et al. 2017). Each member represents a reasonable atmospheric state for the given forcing, and is considered to be equally likely.

Greenhouse gas concentrations are prescribed annually with values taken from the RCP Scenario Data Group. Historical values are used up to 2005, and RCP4.5 is followed thereafter. Ozone is prescribed monthly, following the same scenarios. Mineral dust and sea salt aerosol are modelled interactively. Sulphates, soot, organic carbon and biomass aerosols are specified monthly using CMIP5 recommended values, while biogenic aerosols are included via a 12-month climatology.

5.1.2 HadGEM3-A evaluation

Model-based quantifications of risk depend on the model being able to simulate reliably both the event of interest and any changes in this type of event. Unreliable simulations can lead to overstatements of risk (Bellprat and Doblas-Reyes 2016). Vautard et al. (2017) present a broad evaluation of HadGEM3-A, concluding that it is, overall, a good tool for examining European precipitation and temperature extremes in summer. Here, we present analysis specifically relevant to summer 2012. Figure 9 shows the temperature and precipitation time series from Fig. 2, overlaid on the equivalent time series from HadGEM3-A. Over northern Europe, the model slightly overestimates the positive trend in near-surface temperature [0.36 \(^{\circ }\)C decade\(^{-1}\) (1960–2012) compared to best estimates ranging from 0.28 to 0.31 \(^{\circ }\)C decade\(^{-1}\) in observations], and underestimates the positive trend over southern Europe [0.35 \(^{\circ }\)C decade\(^{-1}\) (1960–2012) compared to best estimates ranging from 0.41 to 0.44 \(^{\circ }\)C decade\(^{-1}\) in observations]. HadGEM3-A correctly shows no significant precipitation trend in the northern and southern Europe regions (Fig. 9a, c).
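To make the units of the trends quoted above concrete, the following minimal sketch computes a linear trend per decade from a seasonal-mean time series. It assumes an ordinary least-squares fit, which the text does not specify; the function name `trend_per_decade` and the synthetic series are hypothetical and are not taken from the analysis itself.

```python
def trend_per_decade(years, values):
    """Least-squares slope of values against years, scaled to units per decade."""
    n = len(years)
    year_mean = sum(years) / n
    value_mean = sum(values) / n
    # Covariance and variance terms of the ordinary least-squares slope.
    sxy = sum((y - year_mean) * (v - value_mean) for y, v in zip(years, values))
    sxx = sum((y - year_mean) ** 2 for y in years)
    return 10.0 * sxy / sxx


# Synthetic JJA temperature anomalies (illustration only), rising by 0.036 per year.
years = list(range(1960, 2013))
anomalies = [0.036 * (y - 1960) for y in years]
print(round(trend_per_decade(years, anomalies), 2))
```

Applied to each ensemble member and to each observational dataset in turn, a calculation of this kind would give comparable trend estimates in \(^{\circ }\)C decade\(^{-1}\).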

Fig. 9

Time series of observed and modelled anomalies (relative to 1960–2012) in JJA, a precipitation in northern Europe; b near-surface temperature in northern Europe; c precipitation in southern Europe; d near-surface temperature in southern Europe. The solid black line indicates the historical ensemble mean from HadGEM3-A, while the grey lines show the individual ensemble members. The dashed black line indicates the historicalNat ensemble mean, while the pale blue lines show the individual ensemble members

HadGEM3-A represents the spatial pattern of mean precipitation and trends reasonably well (Fig. 10). It shows positive precipitation trends over Scandinavia and negative trends over France and eastern Europe. However, it fails to capture the amplitude of the observed drying over Spain, and does not simulate drying over the full longitudinal extent of the Alps, as is seen in observations. HadGEM3-A captures the spatial pattern of the 2012 temperature anomaly, but underestimates the magnitude of the warm anomaly in southern Europe, and shows only a weaker warm anomaly in northern Europe, in contrast to the observed cold anomaly there (Fig. 11). The model underestimates the magnitude of temperature trends in western Europe, and overestimates them in eastern Europe.

Fig. 10

1960–2012 mean, 1960–2012 linear trend, and 2012 vs. 1960–2012 anomaly in precipitation from a–c CRUTS3.23, d–f HadGEM3-A (ALL ensemble mean), and g–i HadGEM3-A (NAT ensemble mean). Hatching indicates significant trends at the 10% level, cross-hatching indicates significance at the 5% level

Fig. 11

1960–2012 mean, 1960–2012 linear trend, and 2012 vs. 1960–2012 anomaly in near-surface temperature from a–c CRUTS3.23, d–f HadGEM3-A (ALL ensemble mean), and g–i HadGEM3-A (NAT ensemble mean). Hatching indicates significant trends at the 10% level, cross-hatching indicates significance at the 5% level

HadGEM3-A is able to capture the spatial structure and amplitude of the composites of extreme precipitation events (seasonal means either 1.5 standard deviations above or below the 1960–2012 mean JJA precipitation) from E-OBS and CRUTS3.23 (Fig. 12). Individual members of the HadGEM3-A ensemble are also able to capture the pattern and magnitude of the observed 2012 anomaly (Supplementary Figure 3). There is considerable spread in the pattern of the anomaly across ensemble members, which causes the mean anomaly to be rather uniform. The ability of individual members to capture the pattern and magnitude of the anomaly suggests that the poor representation of the event in the ensemble mean is a reflection of internal variability, rather than an indication of poor physical representation of precipitation by the model.
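The compositing described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Fig. 12: the 1.5 standard deviation threshold comes from the text, while the function names, the use of the sample standard deviation, and the representation of each seasonal field as a flat list of grid-cell values are hypothetical choices.

```python
from statistics import mean, stdev


def extreme_years(regional_mean, n_sd=1.5, wet=True):
    """Years whose regional-mean value lies at least n_sd standard deviations from the mean."""
    values = list(regional_mean.values())
    mu, sd = mean(values), stdev(values)
    if wet:
        return sorted(y for y, v in regional_mean.items() if v >= mu + n_sd * sd)
    return sorted(y for y, v in regional_mean.items() if v <= mu - n_sd * sd)


def composite(fields_by_year, years):
    """Grid-cell average of the seasonal fields for the selected years."""
    selected = [fields_by_year[y] for y in years]
    return [mean(cell) for cell in zip(*selected)]


# Synthetic regional-mean JJA precipitation (illustration only).
series = {2000 + i: 0.0 for i in range(10)}
series[2012] = 10.0
print(extreme_years(series))
```

Wet composites for northern Europe would use `wet=True` with the northern regional mean, and dry composites for southern Europe would use `wet=False` with the southern regional mean.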

Fig. 12

Composites of seasonal precipitation when southern European precipitation is at least 1.5 standard deviations below the 1960 to 2012 mean from a CRUTS3.23; b E-OBS; c HadGEM3-A ALL; d HadGEM3-A NAT, and when northern European precipitation is at least 1.5 standard deviations above the 1960 to 2012 mean from e CRUTS3.23; f E-OBS; g HadGEM3-A ALL; h HadGEM3-A NAT

HadGEM3-A overestimates the number of dry days in northern European summer, and underestimates the number of days with large precipitation amounts (Fig. 4b compared to Fig. 4a). The ensemble mean precipitation frequencies for 2012 are very similar to the 1960–2012 mean, consistent with the small seasonal anomalies produced by the model (Figs. 4b, 10). If model seasons that best match the spatial pattern of the observed 2012 anomaly are isolated, the model better captures the frequency distribution of daily precipitation (Fig. 4c). However, even with this pre-conditioning, the model still under-represents moderate to high precipitation amounts in northern Europe.

HadGEM3-A represents the frequency distribution of southern European daily precipitation in 2012 well, despite overestimating the number of dry days in the 1960–2012 mean (Fig. 5b). As was seen in E-OBS, the precipitation distribution in 2012 is similar to other dry years in the model (the characteristics of which are also well captured by the model).

Reliability analysis is sometimes used to assess the suitability of a model for use in attribution studies (Christidis et al. 2013). HadGEM3-A is able to reproduce observed probabilities (from E-OBS) in near-surface temperature for both northern and southern Europe (Supplementary Figures 4, 5). However, the assumptions underlying reliability analysis are not well suited to precipitation. For completeness, reliability analysis for near-surface temperature, sea level pressure, and precipitation is shown in the supplementary material, alongside a description of the assumptions, methodology, and a discussion of the issues with their application to precipitation in this case.

5.1.3 Attribution with HadGEM3-A

HadGEM3-A does not capture the amplitude of the 2012 precipitation dipole in the ensemble mean, but it does produce a dipole of the correct sign. However, individual members are able to capture both the amplitude and pattern, which demonstrates that the model is able to simulate precipitation events with the observed magnitude. The diversity across the ensemble members suggests a large role for natural variability in the 2012 event, which will be quantified later. The ability of individual members from the historicalNat ensemble to also reproduce the pattern and amplitude of the event supports this.

Extreme precipitation events have a similar amplitude and spatial structure in both the historical and historicalNat experiments, suggesting that anthropogenic forcing has not significantly changed the amplitude or pattern of such events. This can be seen in the composites of events where northern European precipitation is 1.5 standard deviations over the northern European mean, and where southern European precipitation is 1.5 standard deviations below the southern European mean (Fig. 12). Dry events are slightly drier in the historical ensemble, while wet events are slightly wetter in the historicalNat ensemble. Both the historical and historicalNat experiments simulate a positive precipitation trend over Scandinavia and the Baltic Sea (significant at the 5% level), suggesting a role for natural SST variability in preconditioning the event, although this may also be due to a common drift in the two ensembles.

Before using HadGEM3-A for probabilistic attribution, the mean and variance of the seasonal-mean, regional-mean time series are corrected using CRUTS3.23 (1901–2013). The distribution of seasonal mean precipitation is well approximated by a Gamma distribution in both regions. Southern European temperature is better approximated by a Gaussian distribution. Distributions are fit to the bias-corrected HadGEM3-A historical and historicalNat data for 1999–2013. The probability of the observed 2012 values occurring in these distributions is then calculated. Bootstrapping is used to find 1000 estimates of this probability for each experiment, and the 1,000,000 possible combinations of \(p_1\) and \(p_0\) are then used to find the best estimate of the probability ratio, and the corresponding 95% confidence interval.
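The procedure above can be sketched in a few lines. This is a minimal illustration and not the code used in this study. For brevity it fits a Gaussian, as used for southern European temperature, so the precipitation case would need a Gamma fit instead. It also assumes that the correction is derived from one reference sample and applied to both experiments, that the event is an exceedance of the observed value, and that the best estimate is the median of the ratios. All function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def bias_correct(model, model_ref, obs_ref):
    """Shift and scale model values so model_ref matches obs_ref in mean and variance."""
    m_mu, m_sd = mean(model_ref), stdev(model_ref)
    o_mu, o_sd = mean(obs_ref), stdev(obs_ref)
    return [(x - m_mu) / m_sd * o_sd + o_mu for x in model]


def exceedance_prob(sample, threshold):
    """P(X >= threshold) under a Gaussian fitted to the sample."""
    fitted = NormalDist(mean(sample), stdev(sample))
    return 1.0 - fitted.cdf(threshold)


def bootstrap_probs(sample, threshold, n_boot, rng):
    """Refit the distribution to n_boot resamples drawn with replacement."""
    return [
        exceedance_prob(rng.choices(sample, k=len(sample)), threshold)
        for _ in range(n_boot)
    ]


def probability_ratio(all_sample, nat_sample, threshold, n_boot=1000, seed=0):
    """Return (best estimate, (lower, upper)) of p1/p0 with a 95% interval."""
    rng = random.Random(seed)
    p1 = bootstrap_probs(all_sample, threshold, n_boot, rng)
    p0 = bootstrap_probs(nat_sample, threshold, n_boot, rng)
    # All n_boot * n_boot pairings of p1 and p0; skip p0 == 0 to avoid dividing by zero.
    ratios = sorted(a / b for a in p1 for b in p0 if b > 0.0)

    def quantile(q):
        return ratios[round(q * (len(ratios) - 1))]

    return quantile(0.5), (quantile(0.025), quantile(0.975))
```

With the default `n_boot=1000`, the function evaluates the 1,000,000 pairings of \(p_1\) and \(p_0\) described above. For a dry event, the lower-tail probability `fitted.cdf(threshold)` would replace the exceedance probability.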

Comparing the probability of 2012 precipitation occurring in the historical and historicalNat experiments in 1999–2013, the probability ratio is 0.95 (95% CI 0.57–1.97) for northern Europe, showing no discernible contribution from anthropogenic forcings.

For southern European precipitation there is a suggestion of an anthropogenic contribution to the dry summer in 2012. Comparing the probability of 2012 precipitation occurring in the historical and historicalNat experiments in 1999–2013, the probability ratio is 1.77 (95% CI 0.61–7.63). While not statistically significant at the 5% level, these values suggest that there is ‘likely’ an anthropogenic contribution, using IPCC vocabulary (Le Treut et al. 2007) (the bulk of the confidence interval is greater than 1, with the 70% CI excluding 1). This is consistent with an expectation from the modelled trend that southern Europe is becoming drier (van Oldenborgh et al. 2013).
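The distinction drawn here, between a 95% interval that includes 1 and a narrower interval that excludes it, can be illustrated with percentile intervals over a set of bootstrap probability ratios. In the sketch below the ratios are synthetic, and the function name and the nearest-rank percentile rule are assumptions rather than the method used in this study.

```python
def percentile_interval(values, level):
    """Central interval covering `level` of the values (nearest-rank percentiles)."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


# Synthetic bootstrap ratios, for illustration only: 10% below 1, 90% above 1.
ratios = [0.5] * 10 + [2.0] * 90

low95, high95 = percentile_interval(ratios, 0.95)
low70, high70 = percentile_interval(ratios, 0.70)

print("95% interval includes 1:", low95 <= 1.0 <= high95)
print("70% interval includes 1:", low70 <= 1.0 <= high70)
```

In this synthetic case, as for the southern European result, the wider interval spans 1 while the narrower one lies entirely above it.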

Southern Europe was both dry and hot in 2012, which is likely to have exacerbated the drought risk. The period 1999–2013 was significantly warmer than 1960–1974. There is a large anthropogenic contribution to this trend (Fig. 6c). Comparing historical and historicalNat temperature suggests that anthropogenic forcing made a summer as hot as 2012 more likely by a factor of 2.99 (95% CI 1.35–9.67).

5.2 Attribution with fully-coupled models

The fifth Coupled Model Intercomparison Project (CMIP5) presents an opportunity to analyse the response of a large number of models to forcing. Unlike the HadGEM3-A experiments, the CMIP5 models have fully interactive oceans. As such, these models are not expected to reproduce observed events in a given year (to the extent that SST influences the probability), and, in their ensemble mean, primarily show responses to radiative forcing.

We consider 19 models (Supplementary Table 1) for which historical (with all forcings prescribed from 1850 to 2005), historicalNat (with natural forcings prescribed from 1850 to 2005, and anthropogenic forcings fixed at their 1850 values), and PIcontrol (free-running simulations with all forcings fixed at their 1850 values) experiments are available. With such a range of experiments, there are a number of ways to define the counterfactual world. As attribution conclusions can be sensitive to this choice (Hauser et al. 2017), we explore a number of possibilities to establish whether the conclusions are robust.

Following Hauser et al. (2017), we define the present (PRES1) as a 20-year window around our event (2002–2021), extending the historical experiment with Representative Concentration Pathway (RCP) 4.5 to be consistent with the HadGEM3-A experiment (Sect. 5.1.1). Comparison with RCP8.5 showed our results to be insensitive to the choice of future pathway. We also derive two counterfactual worlds from the PIcontrol experiments and the historicalNat experiments, each based on 20 years of data. The historicalNat experiments end at 2005, so we take 1986–2005 (PRES2) as our present-day in this case. We compare this to PRES1 (2002–2021), but since 1986–2005 includes a number of large volcanic eruptions, we also compare it to the equivalent period (1986–2005) from the historical experiments. We tested two different periods sampled from the PIcontrol experiments, years 20–39 and years 220–239. We found that the best estimates of the probability ratio were insensitive to this choice, and therefore present only the results from the first period in this section.

To create the CMIP5 ensemble, we include only the first ensemble member from each model, so that each model has equal weight, and use the same 19 models in the ensemble for each experiment. Taken as a whole, the ensemble has a small bias in European precipitation, although some of the individual models have large biases (Hauser et al. 2017). Previous work has shown that coastal trends in summer precipitation are underestimated by the ensemble, but larger spatial averages are better represented (van Haren et al. 2012). A multi-model ensemble often has less bias in the mean state than individual models, but results may still be biased relative to the real world because of shared deficiencies in the models (NAS 2016).

As for HadGEM3-A, we use CRUTS3.23 to correct the mean and variance of individual model data, before it is pooled into the CMIP5 ensemble, and use bootstrapping to find the best estimates and confidence intervals of the probability ratios.
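The per-model correction followed by pooling can be sketched as below. This is an illustration under assumptions that the text does not state: the correction for each model is derived from its historical series and then applied to both of its experiments, so that the difference between experiments is preserved. The function name, the dictionary layout and the numbers are hypothetical.

```python
from statistics import mean, stdev


def correct_and_pool(models, obs):
    """Correct each model's mean and variance against obs, then pool by experiment."""
    o_mu, o_sd = mean(obs), stdev(obs)
    pooled = {"historical": [], "historicalNat": []}
    for runs in models.values():
        # Correction parameters come from this model's historical series only.
        m_mu, m_sd = mean(runs["historical"]), stdev(runs["historical"])
        for experiment in pooled:
            pooled[experiment].extend(
                (x - m_mu) / m_sd * o_sd + o_mu for x in runs[experiment]
            )
    return pooled


# Two synthetic models with one member each (illustration only).
models = {
    "model_a": {"historical": [1.0, 2.0, 3.0], "historicalNat": [0.0, 1.0, 2.0]},
    "model_b": {"historical": [10.0, 20.0, 30.0], "historicalNat": [10.0, 20.0, 30.0]},
}
pooled = correct_and_pool(models, obs=[4.0, 5.0, 6.0])
print(pooled["historical"])
print(pooled["historicalNat"])
```

Because each model contributes a single member of the same length, every model carries equal weight in the pooled sample.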

Comparisons between the historical and historicalNat experiments indicate no significant change in the probability of wet northern European summers due to anthropogenic forcing (Table 2). The comparison between historical PRES2 and PIcontrol also shows no discernible anthropogenic influence, while the comparison between historical PRES1 and PIcontrol suggests an increase in likelihood. Overall, we conclude that we cannot detect an anthropogenic influence on wet summers in northern Europe. The analysis excludes changes in probability larger than a factor of 1.5.

All comparisons in the CMIP5 ensemble suggest that reduced precipitation becomes more likely in southern Europe with anthropogenic forcing. Although there is considerable spread in the magnitude of the probability ratio (Table 2), all three comparisons suggest a ‘likely’ role for anthropogenic forcing. However, this effect is so small that it cannot yet be detected in the observations (Fig. 6). Spatially, the CMIP5 models show a drying trend in the western part of the southern Europe box, in Spain, in agreement with the observed trend.

All southern European temperature comparisons in the CMIP5 ensemble show that hot summers are made significantly more likely by anthropogenic forcing (Table 2). The large trend in historical temperature means that the magnitudes of the probability ratios are sensitive to the use of PRES1 and PRES2 in this case.

6 Synthesis

This section will briefly summarise the results from the different attribution methods and then provide a synthesis.

6.1 Observationally based return time analysis

Analysis of observed precipitation from CRUTS3.24.01 shows no discernible trend in northern or southern European precipitation. Increasing global temperatures are definitely related to a substantial increase in the likelihood of hot summers in southern Europe [28.4 (95% CI 3–1700) times more likely in 2012 than in 1960, and 90 (95% CI 4–6 \(\times 10^{6}\)) times more likely than in 1901], which may increase the risk of hydrological drought there.

6.2 Analogue analysis

Long-term observations of precipitation, with sufficient temporal resolution to perform analogue analysis, were not available for southern Europe. Analogue analysis for northern Europe using HadUKP and the twentieth century reanalysis shows that the circulation pattern present in summer 2012 increased the probability of excess precipitation. The analysis also shows that there has been no change in summer circulation patterns since 1960, suggesting that the dynamic component of the extreme precipitation in northern Europe was the result of natural variability.

Although analogue analysis shows that the 2012 atmospheric circulation increased the likelihood of a wet summer, we found no evidence of long-term trends in the circulation, and no consistent change in regional UK precipitation as a result of thermodynamic changes.

6.3 Conditional attribution analysis with HadGEM3-A

Conditional attribution with HadGEM3-A supports the observational conclusions that there is no discernible trend in wet northern European summers. Dry southern European summers may be more likely, but the trend is within the range of natural variability. There is little difference between composites of wet northern European summers, and dry southern European summers, from historical and historicalNat experiments, supporting the conclusion from analysis of HadUKP that there is no discernible anthropogenically forced trend in the summertime circulation.

6.4 Unconditional attribution analysis with CMIP5

Unconditional attribution with an ensemble of 19 CMIP5 models supports the conclusions of both the observational and conditional model analysis that there is no trend in wet northern European summers, and that southern European summers are becoming slightly more likely to be dry, and significantly more likely to be hot.

Table 2 Probability ratios (p\(_{1}\)/p\(_{0}\)) for summer 2012 from different approaches

6.5 Synthesis

The model-based methods considered here show a slightly increased risk of dry summers in southern Europe due to anthropogenic forcing, although the trend is within the range of natural variability, and is not yet detectable in observations. Analyses consistently show a significantly increased risk of hot summers in southern Europe, which may increase the risk of hydrological drought in the region (Table 2).

We did not identify robust changes in northern European precipitation in response to anthropogenic forcing across approaches. The return times in observations hinted at an increased risk, while the model results hinted at the opposite. The probability ratios are close to unity in most cases, and we have no physical expectation of a change in risk. The analogue analysis shows the sign of the thermodynamic contribution to the risk of wet summers to be regionally dependent across the UK. We conclude there is no discernible trend.

Confidence in the dynamical aspects of climate change is low compared to that in the thermodynamic aspects (Shepherd 2014). Hence, if the response of the dynamic drivers to climate change is a significant component of the anthropogenic influence, then the plausibility of that response needs to be established. However, in the case of the extreme summer of 2012, the anthropogenic influence appears to be predominantly thermodynamic. Although the atmospheric circulation contributed to the event, we found no change in the circulation over time, and therefore conclude that the 2012 circulation primarily occurred as a result of natural variability.

7 Conclusions

Anthropogenic forcing may have slightly increased the risk of dry summers, and has significantly increased the risk of hot summers in southern Europe, but has not increased the risk of wet summers in northern Europe.

In northern Europe, there is no clear or robust indication of a change in the risk of wet summers in 2012 due to anthropogenic forcing. We found that the observed atmospheric circulation in summer 2012, with a low pressure centre near the UK, increased the risk of excess precipitation in northern Europe, but there have been no long-term changes in this circulation pattern. Comparison of composites of wet northern European summers, and the 2012 precipitation anomaly, between HadGEM3-A historical and historicalNat experiments further suggests that natural variability played a predominant role in the wet northern European summer of 2012.

Results of previously published analyses attempting to quantify the anthropogenic influence on northern European precipitation vary (Bladé et al. 2012b; Sparrow et al. 2013), in common with the analysis we present here (Table 2). Yiou and Cattiaux (2013) found Scandinavian precipitation rates were influenced by large-scale circulation, which is consistent with our finding that the 2012 circulation pattern, with a low pressure centre near the UK, made a wet northern European summer more likely.

There has been a large positive trend in near-surface temperature in southern Europe since 1960, which is captured in all observational datasets and model timeseries. Our analysis consistently shows anthropogenic forcing has increased the risk of hot southern European summers. Model-based analysis also suggests that anthropogenic forcing has increased the risk of dry summers there, but this is not yet detectable in observations.

NAS (2016) concluded that there should be confidence in attribution for events like the extreme European summer of 2012. It is well simulated in the models we used, has long-term observations available, and, by our choice of variables, is purely meteorological in nature. To further increase the confidence in our conclusions we used multiple approaches, which helps to distinguish results that are robust to the approach taken and the question posed. Using multiple methods to estimate human influence on an event partially addresses the challenge of characterising the many sources of uncertainty in event attribution.