1 Introduction

During the last glacial maximum (LGM) the global mean temperature was 3–5 °C cooler than during the pre-industrial era (Jansen et al. 2007; Braconnot et al. 2012). Approximately equal contributions to this cooling can be attributed to the large ice-sheets and reduced levels of major greenhouse gases that characterised this time period (Petit et al. 1999; Peltier 2004; Braconnot et al. 2012). During the LGM, atmospheric CO2 was around 90 ppmv lower than the pre-industrial, at around 185 ppmv. The LGM has been investigated in detail as it is thought to provide a good test for climate model responses to changes in greenhouse gas-induced radiative forcing (Braconnot et al. 2012). Compilations of terrestrial and surface ocean climate reconstructions (Bartlein et al. 2011; Kucera et al. 2005, respectively) indicate a coherent picture of cooler and drier conditions globally, consistent with the records of terrestrial biome distributions (Prentice et al. 2000; Marchant et al. 2009).

Coordinated ensembles of climate model simulations of this period under standard experimental protocols have been established and analysed in some detail (Braconnot et al. 2007a, b; Izumi et al. 2013; Schmidt et al. 2014). These simulations demonstrate amplified cooling at high latitudes (particularly in the Northern Hemisphere) and over land relative to the ocean. Whilst the GHG effect is thought to be the dominant driver of temperature change at low latitudes, where cooling was of the order of 2–3 °C, the effects of the ice sheets were largest over North American and Eurasian land areas, where annual mean cooling was of a much greater magnitude and with larger seasonality (see Brady et al. 2013, for a comparison of LGM CO2 versus ice-sheet forcing).

A fundamental advantage of using palaeoclimate time periods to evaluate models is that depending on the time period of choice, the scale of change can be very large. This particularly true of the LGM (Braconnot et al. 2012) for which the global mean cooling is of a similar magnitude to the upper range of IPCC projections for the end of the twenty first century. The LGM is often regarded as a suitable time period for testing climate models because (1) a large component of the cooling is forced by greenhouse gas changes which are well constrained, (2) the LGM represents a quasi-equilibrium climate which facilitates comparisons with model simulations and (3) as mentioned, though reconstructions are associated with uncertainties, the overall magnitude of change is large. A potentially important complication for the LGM derives from changes in vegetation and related to this, atmospheric dust aerosols. Both of these have the potential to modify the global radiative forcing and regional climate (e.g. Crucifix and Hewitt 2005a; Mahowald et al. 2006).

Here we evaluate how components of the Earth System model HadGEM2ES (Collins et al. 2011) perform when forced with LGM boundary conditions. This model is of interest because it is capable of simulating a number of interactive components which are missing in many previous simulations of the LGM climate. These include aerosols, particularly mineral dust, dynamic vegetation and optionally interactive tropospheric chemistry. We evaluate the model simulations in comparison with a number of palaeoclimate datasets and additionally perform a detailed comparison with a simpler and lower resolution GCM, HadCM3M2 (Gordon et al. 2000; Cox et al. 2000). We investigate how the LGM might provide insight into model uncertainty, especially as related to Earth System model components and whether palaeo-environmental reconstructions may offer opportunities for the evaluation of complex models such as HadGEM2-ES.

2 Methods

2.1 Model description

In this study we use the HadGEM2-ES model (Collins et al. 2011; HadGEM2 Development Team 2011) mostly in an atmosphere-only configuration (HadGEM2A). This is a semi-lagrangian, non-hydrostatic, fully compressible atmospheric general circulation model (GCM), which utilises a terrain following vertical coordinate scheme (Martin et al. 2006; HadGEM2 Development Team 2011). There are 38 unequally spaced levels in the vertical direction and the horizontal resolution is 1.875° × 1.25° in longitude-latitude. The model has a 30 min time step for the atmospheric dynamics.

The model version used here is very similar to that used for CMIP5 (Jones et al. 2011). It makes use of a dynamic land surface scheme based on the TRIFFID dynamic vegetation model (Cox 2001) and an updated version of the MOSESII land surface scheme (Essery et al. 2003), which is a precursor of the JULES (Joint UK Land Environment Simulator) land surface model (Best et al. 2011; Clark et al. 2011). MOSESII uses fractional tiling of nine land surface types, including five plant functional types. In this version of the model TRIFFID uses an improved radiation-canopy scheme (Mercado et al. 2007) which incorporates multiple leaf levels.

Other components introduced in the development since the pre-cursor model HadGEM1 (Martin et al. 2006) include a sub-grid scale treatment of land surface hydrology (Gedney and Cox 2003) and the incorporation of a river-routing model, TRIP (Oki and Sud 1998) which is run on a 1° × 1° grid. HadGEM2 also includes an optional interactive tropospheric chemistry model, UKCA (e.g. O’Connor et al. 2009, 2014).

HadGEM2 includes a representation of seven aerosol species: mineral dust, sulphate, sea salt, biogenic emissions, biomass burning and fossil fuel black carbon and organic carbon (Bellouin et al. 2007, 2011). The emissions of mineral dust are calculated at each model timestep and are dependent on the wind speed, soil moisture content, the bare soil fraction of a grid-cell (as simulated by the TRIFFID model) and a source multiplier which accounts for available fine-grained material (Woodward 2001, 2011). Mineral dust is transported in six mass bins by the atmospheric model. Sea-salt aerosol numbers are also calculated interactively within HadGEM2, whereas remaining aerosol species are dependent on prescribed monthly emissions. All aerosols except biogenic and sea-salt aerosols are transported by the atmospheric GCM at each time-step within the model. When the UKCA atmospheric chemistry is activated, it solves for the oxidising capacity sulphate, which is otherwise derived from a prescribed monthly climatology. All of the modelled aerosols influence long- and shortwave radiation and have an implicit semi-indirect effect on the climate. First and second indirect effects are also computed for all species except mineral dust and black carbon aerosols (Bellouin et al. 2011).

2.2 Boundary conditions

The pre-industrial simulation follows the protocol used for CMIP5 with HadGEM2-ES (Jones et al. 2011). The LGM simulations were configured following the PMIP2 protocol (Braconnot et al. 2007a) which specifies a reduction in the atmospheric concentrations of CO2, CH4 and NO2, and imposition of the 21 kyr insolation parameters, and changes to the orography, land sea mask and land ice as reconstructed by Peltier (2004). This allowed better comparison with PMIP2-type simulations carried out with other GCMs, in particular the previous Hadley Centre Model HadCM3M2 which includes a similar versions of the MOSESII land surface scheme (Essery et al. 2003) and dynamic vegetation (Cox 2001) as used in HadGEM2.

For a number of remaining Earth System components, the prescribed aerosol emissions of sulphur dioxide, DMS, biogenic and biomass aerosols are kept at their pre-industrial levels following Jones et al. (2011). Both fossil fuel black and organic carbon are set to zero in all simulations considered. For the LGM the source multiplier for mineral dust is expanded to cover new land gridcells and is set to zero over the Laurentide and Fennoscandian ice sheets. In all HadGEM2 simulations analysed we use the Earth System dust emission model parameters, as used in CMIP5 with HadGEM2-ES (Jones et al. 2011). We set anthropogenic vegetation disturbance (which confines dynamic vegetation to grass PFTs) to zero globally in the LGM simulations. The river routing model has been modified for the LGM land-sea mask so that rivers do not terminate at land grid-cells. The topographic index field which determines the sub-grid hydrology was expanded to new land points at the LGM by deriving a logarithmic relationship between the topographic index mean and a measure of sub-grid orographic variability. The latter was calculated at a resolution of 10 arc minutes using the global ETOPO1 data set (Armante and Eakins 2009). This orographic roughness field was then regridded to the resolution of an existing topographic index field (0.83° × 0.55°: N. Gedney, personal communication) in order to calculate the logarithmic relationship parameters used to derive the topographic index over new land points. The resultant globally resolved topographic index was finally regridded to the resolution required in HadGEM2 (1.875° × 1.25°).

The monthly sea surface temperatures (SSTs) and sea-ice distributions are specified from simulations with HadCM3 (Gordon et al. 2000) or HadCM3M2 under equivalent PI and LGM boundary conditions (Singarayer and Valdes 2010) and with dynamic vegetation enabled in the core HadGEM2 experiments. Further simulations also made use of the MARGO SST reconstructions (Kucera et al. 2005) over the North Atlantic, but for the purposes of this work had very little influence on the conclusions. We also perform sensitivity experiments with HadGEM2-A in which the vegetation distribution is prescribed from HadCM3M2. This allows for detailed comparison of the two models without any difference in the surface vegetation coverage.

The HadCM3 coupled simulations have been spun up for 500 years from pre-industrial initial conditions (Singarayer and Valdes 2010). The monthly mean climatologies (averaged over 30 years) of SST and sea-ice area are used to drive HadGEM2. Similar simulations have also been performed using HadCM3M2 (Cox 2001; Essery et al. 2003) which includes similar versions of MOSESII and TRIFFID as used in HadGEM2-A (some differences are outlined by Good et al. 2013). These were identical in setup to the HadCM3 simulations but dynamic vegetation was activated in TRIFFID firstly for 200 years in equilibrium mode and then in dynamic mode for a further 300 years. For both the HadCM3 and HadCM3M2 simulations the final 30 years of the 500 years simulated were used to calculate climatologies.

2.3 Simulation design

In the initial two interactive vegetation cases HadGEM2-A was first run for 20 years in equilibrium-mode which allows the vegetation distribution to reach an equilibrium with the simulated climate. For this an implicit time-step of 100 years is used in the vegetation dynamics (see Cox 2001) for every 5 years of climate simulation, thus giving a total spin up of 400 years.

3 Results

3.1 Climate and dynamic vegetation distributions

The annual mean 2 m surface temperature anomaly for HadGEM2-A is compared with that simulated by HadCM3M2 in Fig. 1. It is clear that the surface temperature response is much stronger in HadGEM2-A. HadGEM2-A shows a global cooling of −5.1 °C. The top of the atmosphere radiation budget shows the model is in disequilibrium with a global mean of −4.4 Wm−2 meaning that further cooling would be likely were the ocean and sea-ice allowed to respond. We note that separate simulations performed to diagnose the mineral dust radiative forcing (with double-call radiation diagnostics) suggest that around 25 % of this is net imbalance is caused by radiative forcing of simulated enhanced dust loading. The major driver behind the enhanced cooling relative to HadCM3M2 is an expansion of bare soil over much of the Asian continent as shown in Fig. 2. The bare soil increases the surface albedo and allows snow-cover to persist during the spring leading to a further positive feedback on the cooling.

Fig. 1
figure 1

Annual mean 2 m temperature anomalies LGM-PI for HadCM3M2 (top), HadGEM2-A forced with the vegetation distributions simulated in HadCM3M2 (middle) and HadGEM2-A with dynamic vegetation enabled (bottom). The SSTs and sea-ice in the middle simulations are prescribed from the HadCM3M2 runs shown in the top panel, whilst in the bottom simulations they are from HadCM3 (a simulation without dynamic vegetation enabled)

Fig. 2
figure 2

Pre-industrial (left) and LGM (right) vegetation distributions. Observed and reconstructed (top row), prescribed in HadGEM2-A from HadCM3M2 (middle row), and simulated dynamically in the standard version of HadGEM2-A (bottom row). A correction was applied over the Amazon in the middle panels, as in HadCM3M2 a small area within the Amazon rainforest (approximately 12 gridcells) is incorrectly simulated as grasses in HadCM3M2. The observed modern fractional coverages (Loveland et al. 2000) have been remapped to TRIFFID types (Essery et al. 2003). A similar process was followed here for the LGM vegetation which is derived from the BIOME6000 project pollen database. In the latter case no distinction is made between C3 and C4 grasses, so that all grass biomes appear as C3 grasses

For the LGM, HadCM3M2 shows a significant contraction of the boreal forest and its replacement with a large area of C3 grasses and to some extent shrubs (similar to simulations with HadSM3 presented by Crucifix and Hewitt 2005b). The equivalent HadGEM2 simulations show realistic coverage of broadleaf trees over the Amazon, and over Eastern North America, but at the LGM the HadGEM2 simulation shows almost no vegetation coverage over a large area of mid- to high-latitude Asia. The tropical broadleaf tree coverage does not change significantly at the LGM in either model. Both models also show expansion of bare soil areas in Southern South America and Australia.

Pollen data for the LGM has been compiled in the BIOME6000 data set (Prentice et al. 2000; Marchant et al. 2009; Ni et al. 2010) and is plotted for comparison with the simulated distributions in Fig. 2. Here the mega-biomes from the data set have been converted to dominant plant functional types in TRIFFID. This conversion is not straightforward but should be seen as a way to identify broad features in the data for validation of the model outputs. The mega biomes were first mapped to the land cover types of Hansen et al. (2000). The TRIFFID types were then derived using Table 1 of Essery et al. (2003). This approach does not distinguish between C3 and C4 grasses.

Although it shows relatively sparse geographical coverage and provides point estimates of vegetation types, it does provide direct evidence for testing the realism of the TRIFFID results. For the LGM, the pollen data suggest some remnant tropical forest at the LGM in agreement with the model simulations and prior modelling (Mayle et al. 2004; Cowling et al. 2004). A number of sites show grassland-shrubland dominated environments in the mid- and high-latitudes of Asia and tundra environments Northwards of 60°N. Individual sites also suggest grassland-shrubland as far North as 70°N in central Asia and Boreal forest at approximately 50°N. Despite the considerable uncertainty in the model and data inferences for vegetation changes at the LGM, the BIOME6000 data and other model results (Braconnot et al. 2007a, b) strongly suggest that the HadGEM2 simulated distribution, which shows dominance of bare soil over a very large area, is too extreme.

3.2 Investigating the enhanced cooling in HadGEM2

In order to ascertain the reason for this enhanced cooling in HadGEM2-A we ran a modified version of HadGEM2-A in which the surface vegetation types (and SSTs and sea-ice) are prescribed from HadCM3M2 for both PI and LGM. Differences in vegetation surface type coverage are therefore removed and other factors causing any differences between the model can be more clearly assessed. These simulations were run for 30 years.

The LGM-PI temperature anomaly remains and is up to 10 °C larger in HadGEM2 than in HadCM3M2 as shown in Fig. 1. In this simulation significant snow cover is found in Asia in June in the LGM HadGEM2 simulation but not in HadCM3. This strongly suggests that HadGEM2 is more sensitive to local radiation balance feedbacks and that the temperature reduction over Asia is perhaps responsible for the vegetation collapse seen in the dynamic simulations described in the previous section. This is a point to which we return in the following section.

To quantify the forcings and feedback more robustly we employed the APRP (approximate partial radiative perturbation) method of Taylor et al. (2007). This allows the influences of clouds, surface albedo and clear-sky changes on the short-wave radiation budget to be separated. This follows the approach taken to evaluate PMIP2 simulations (Crucifix 2006; Braconnot et al. 2012). The APRP method is useful because it is less prone to non-cloud changes in the radiation budget leading to apparent changes in the classically defined cloud radiative forcing (e.g. Hewitt and Mitchell 1997). This is achieved by the use of simple representation of the energy balance through the atmosphere (see Taylor et al. 2007, for a full description of the method).

The annual mean feedback distributions (in Wm−2 K−1) for HadCM3M2 and HadGEM2-A are shown in Fig. 3 where the vegetation distributions (PI and LGM) in HadGEM2-A are prescribed from the HadCM3M2 simulations. The figure shows broad similarities between the two LGM-PI anomalies, which is perhaps unsurprising given that the simulations also share SST and sea-ice distributions. The major difference between the two models is the substantially large surface albedo term over Asia in HadGEM2, which cannot be explained by differences in vegetation cover as this is prescribed identically in the two models.

Fig. 3
figure 3

Radiative forcing/feedback distributions (Wm−2 K−1) where positive values indicate a positive feedback on cooling at the LGM relative to the pre-industiral. The results are shown for HadCM3M2 (left) and HadGEM2 (right), where the latter includes prescribed vegetation coverage from the former. Top row surface albedo term, middle row clear-sky albedo term and bottom row cloud albedo term. The largest difference over Asia is due to the surface albedo term

Other features of interest include a spring (MAM) average positive cloud feedback in Asia in HadGEM2, compared with a negative feedback in this region in HadCM3. This cloud effect is manifest in all HadGEM2-A simulations, regardless of the vegetation cover, and so appears to be be a robust feature in HadGEM2-A, that is physically-driven rather than a result of particular vegetation distribution. This feature is weakened in simulations with zero dust forcing (not shown), suggesting it is the result of the semi-direct effects of the dust aerosols in the model. This feature is also observed in the annual mean and can be seen in Table 1 as the sign of the cloud term is reversed over Asia in a pair of no dust simulations for which the radiative effects of mineral dust were disabled.

Table 1 APRP diagnostics LGM-PI over the Asian continent for different models

Quantifying the three terms (surface, cloud and non-cloud atmosphere) over Asia shows that in the annual mean the surface term dominates, as shown in Table 1. This difference is only significant in months with snow in HadGEM2 (i.e. from September to June). Other model differences due to for example differences in the seasonal evolution of clouds and the influence of the interactive mineral dust scheme can be seen in the clear-sky and cloud radiative feedback distributions, but are shown to be less important than the surface albedo change (see Table 1). Comparison of a PI/LGM pair of HadGEM2-A simulations without dust aerosol radiative effects shows that this cannot reconcile the difference with HadCM3M2 (as shown in Table 1).

It therefore appears the treatment of snow-vegetation albedo is likely to be causing the different albedo response in the two models. A factor which controls the influence of the snow is the leaf area index of underlying vegetation. In both HadCM3 and HadGEM2-A the albedo of the combined vegetation-snow surface is determined by the leaf area index (LAI: m2/m2) and two albedo parameters (Essery et al. 2001):

$$\alpha_{ds} = \left( {1 - f_{r} } \right)\alpha_{s}^{0} + f_{r} \alpha_{s}^{inf}$$
(1)

where

$$f_{r} = 1 - e^{ - \varLambda /2}$$
(2)

where Λ is the LAI and α 0 s and α inf s are the snow-free and snow-covered albedo values for each PFT. In HadGEM2 these values have been reduced by 5 % (Collins et al. 2011) compared to the standard values quoted by Best et al. (2011). Comparison of the LAI itself shows that the LAI for C3 grasses (the dominant PFT in Asia at the LGM in HadCM3) is substantially lower in HadGEM2-A in both the pre-industrial and LGM than in HadCM3. This causes the snow-covered C3 grass tile to have substantially higher albedo by around 10 %. Indeed the snow covered albedo in HadGEM2 is generally higher by around 10 % for each PFT compared with equivalent HadCM3M2 simulation.

3.3 Causes of vegetation die-back

We are now confident of the reason for the extra temperature sensitivity in HadGEM2 at the LGM. We thus performed a further simulation of HadGEM2 at the PI and LGM in which both the snow albedo parameters (α 0 s and α inf s ) are reduced by 10 % for each PFT. The APRP analysis for this simulation shows that it reduces the total SW radiative feedback over Asia from −23.7 to −16.2 Wm−2, i.e. much closer to the value of −10.6 Wm−2 in HadCM3M2 (Table 1). Comparison of the resultant seasonal evolution of surface albedo and temperature shows that the HadGEM2 (−10 % albedo) and HadCM3M2 now have a very similar climatology, as shown in Fig. 4. This modification removes much of the difference in surface albedo between HadCM3M2 and HadGEM2A with prescribed vegetation from HadCM3M2. The difference in the annual mean temperature averaged over Asia reduces from 4.2 °C in the standard HadGEM2-A simulation (with vegetation prescribed from HadCM3M2) to 2.0 °C when the albedo parameters are reduced by 10 %.

Fig. 4
figure 4

Monthly climatologies of surface temperature, surface albedo, snow cover and NPP for needle leaf trees, C3 grasses and shrubs in HadCM3 and HadGEM2 averaged over Asia (50°–70°E by 50°–60°N and 70°–130°E by 50°–70°N, i.e. excluding any ice-sheets). In this figure the HadGEM2 results are from the model version in which the snow albedo parameters have all been reduced by 10 %

Now, turning on the dynamic vegetation in HadGEM2 (and retaining the 10 % reduction in snow albedo model parameters) instead of prescribing the vegetation types from HadCM3M2 might be expected to give similar vegetation distributions in the HadGEM2-A as found in HadCM3M2. Instead, the vegetation still dies off in this parameter perturbed version of HadGEM2-A at the LGM. Plotting the timeseries of NPP for each PFT in Asia (also shown for needle leaf trees, C3 grasses and shrubs in Fig. 4) it becomes apparent why this should be the case. The NPP for each PFT, but particularly grasses is substantially lower in HadGEM2 than in HadCM3, despite the two models having very similar seasonal temperatures and comparable soil moisture levels. This low NPP means that when dynamic vegetation is turned on, the vegetation is not competitive and it dies, being replaced with bare soil.

One fundamental but not widely noted difference between HadCM3 and HadGEM2 versions of TRIFFID is the introduction of leaf mortality dependence on temperature. In the standard version of HadCM3M2 this is deactivated for grass PFTs, whilst in HadGEM2 both C3 and C4 grasses start to lose leaves at temperatures below 5 °C whilst the temperature threshold for needle-leaf trees has been reduced in HadGEM2 (see Table 2). The rate of leaf drop increases by a factor of 10 for each degree below these thresholds (because d T in Eq. 3 is equal to 9) (Cox 2001; Clark et al. 2011). The temperature limited leaf mortality is calculated according to:

$$\lambda_{lm} = \lambda_{0} \left[ { 1 + d_{T} \left( {T_{off}- T_{c} } \right)} \right]$$
(3)

where λ 0 is the unperturbed leaf mortality rate, d T is the rate of change of leaf mortality with temperature and T off is the threshold below which the above equation is applied. T c is the canopy temperature. The default parameter values relevant to this equation are given in Table 2.

Table 2 TRIFFID model parameters relevant to Eq. (3)

This leaf mortality function will therefore reduce the mean LAI explaining the lower LAI values in HadGEM2 compared with HadCM3. This reduction in LAI also explains the roughly 10 % increase in surface albedo in HadGEM2 versus HadCM3 explored in the previous section. A reduction in LAI will also reduce GPP and hence NPP, since GPP is a function of LAI in TRIFFID (Cox 2001). This then potentially explains the very low NPP values in HadGEM2 for the LGM compared with HadCM3. Now running HadGEM2 with the thresholds for leaf mortality as defined in the HadCM3M2 code (see Table 2) with dynamic vegetation substantially changes the resultant equilibrium vegetation distribution, giving a dominance of shrubs and C3 grasses rather than the massive expansion of bare soil seen in the default model LGM simulations. The PI and LGM vegetation distributions for this parameter set are shown in the upper panels of Figs. 5 and 6, respectively. It is evident that this temperature control of leaf mortality has a potentially dominant impact on the surface climate through albedo modifications where snow is present, and on the vegetation distribution through changing the NPP.

Fig. 5
figure 5

Pre-industrial dominant fractional coverage in HadGEM2-A with HadCM3M2 parameter values (top) and the new optimal model parameters (bottom)

Fig. 6
figure 6

LGM dominant fractional coverage in HadGEM2-A with HadCM3M2 parameter values (top) and the new optimal model parameters (bottom)

4 New model parameters

4.1 Choosing new parameters for the PI and LGM

Using the HadCM3M2 model parameter values for T off and d T allows a realistic simulation of vegetation distribution at the LGM in HadGEM2-A as shown in Fig. 6. However, further improvements to the simulation of dynamic vegetation within HadGEM2 may be possible by further tuning of the values of T off and d T . Due of the significant computational cost of the HadGEM2 model, it is not currently feasible to use an ensemble method to optimise these particular model parameters. Instead we have run modest number of PI and LGM simulations with a range of T off and d T values in order to explore the sensitivity to these parameters.

One target for tuning HadGEM2 is the modern distribution of vegetation types. Following the approach of Williams et al. (2013) we use the Advanced Very High Resolution Radiometer observations of surface vegetation types (Loveland et al. 2000) regridded to the resolution of HadGEM2 and converted to the five plant function types in TRIFFID (Essery et al. 2003). The leaf area index is compared against the Lai3 g data set of Zhu et al. (2013) which is derived from the third generation of GIMMS AVHRR NDVI data using an artificial neural network approach. The data set covers the period from July 1981 to December 2010. The DJF, JJA and annual mean climatologies for this time-period are compared with the HadGEM2-A model output.

We additionally include the constraints from the LGM reconstruction data. Particularly the BIOME6000 pollen data set (Prentice et al. 2000; Marchant et al. 2009) discussed earlier and shown in Fig. 2. In the latter case the correlation with the 5 TRIFFID PFTs is not straightforward. Different approaches are possible, including using the climate variables from GCM to calculate biomes from TRIFFID PFTs. Here we make the pragmatic choice that the model must reproduce the key features of the LGM pollen data: grasses or shrublands must dominate in Asia to at least 60°N and in Eastern North America, temperate forest biomes should dominate.

The pre-industrial simulations analysed in the following differ slightly from those described earlier. Here we impose a vegetation disturbance field which restricts dynamic PFTs to grasses in order to mimic the geographical distribution of crops (following the AD 1860 distribution described by Jones et al. 2011). This means that historical deforestation is accounted for to some extent and this should improve the comparison of modelled and observed vegetation distributions. LGM simulations are as in previous runs. Both the pre-industrial and LGM simulations were run for 15 years in equilibrium mode (see Cox 2001, for the definition) with a vegetation distribution update every 3 years, in order to quickly bring the vegetation distribution close to equilibrium with the modelled climate. We trialled four specific modifications to the parameter sets for the temperature dependence of leaf phenology and ran a pair of pre-industrial and LGM simulations for each case.

Firstly as already described we used the parameter values as used in HadCM3M2 and as listed in Table 2. In the PI simulation, the high-latitude areas are dominated by shrubs, whilst in Asia the LGM simulation shows a dominance of grasses and shrubs.

Next we took the default HadGEM2-A values (listed in Table 2) but halved the temperature sensitivity for all PFTs except broadleaf trees by setting d T  = 4. These simulations show a significant expansion of bare soil at the LGM. Similar results were found when d T  = 1. This implies that d T  = 0 is required in order to avoid large-scale bare soil expansion over Asia in the LGM simulations with HadGEM2-A.

Now with the leaf mortality set to zero (d T  = 0) for grasses, the temperature limit (T off ) for needle-leaf trees was reduced below the default value to −50 °C. In the PI simulation needle-leaf tree coverage in eastern Siberia increased, improving agreement with observations (Loveland et al. 2000), whilst changes in the LGM simulation are minor.

Finally, an increase in the T off of shrubs by 30 °C led to an increase in grass coverage at mid- to high-latitudes in the PI simulation, again in agreement with observations (Loveland et al. 2000). These changes to the propensity for grass coverage have impacts at fairly low latitudes, including parts of northern India and across Australia. The latter is particularly important because in HadGEM2-ES dust emissions from Australia are strongly overestimated (Bellouin et al. 2011) and this appears to have an influence on the value of the Earth System sensitivity of the model (Andrews et al. 2012a).

In changing the leaf mortality in this way it is crucial to additionally evaluate the LAI changes which result. Comparison of the model LAI against observations of Zhu et al. (2013) demonstrates that all pre-industrial model versions underestimate LAI seasonality over Asia and over the high-latitudes generally. The observations suggest that LAI changes from 0.3 to 2.2 between the DJF and JJA averages over Asia (Zhu et al. 2013). All other model versions with d T  = 0 for grasses show similar JJA LAI values in Asia of 2.2, whilst the DJF values range from 1.2 to 1.5. The model version with a DJF LAI of 1.2 is therefore the closest to the observed value. The HadGEM2 run with HadCM3M2 model parameters has DJF and JJA LAI values of 0.9–2.0 respectively. This HadGEM2-A run (with HadCM3M2 model parameters) severely underestimates the extent of forest coverage in eastern Siberia suggesting that the model version displays a more realistic seasonal change in LAI for the wrong reasons (i.e. because the model has grasses and shrubs instead of trees as observed).

Of all the trialled model versions (10 in total exploring the parameters outlined above) a final parameter set was selected based on three conditions, (1) bare soil expansion at the LGM must be limited and consistent with the palaeovegetation data, (2) this should not lead to a degraded vegetation distribution in the same region in the pre-industrial, and finally (3) the seasonality of LAI for the pre-industrial should be as close to the observed variation as possible. In practice several sets of parameters satisfied the first 2 criteria, so the model version with the largest seasonality in LAI was chosen. The new parameter set is denoted the tuned set and these parameter values are given in Table 2.

This tuned model version and the version with the HadCM3M2 parameter values were spun up for 50 years after the initial 15 year equilibrium-mode phase described above. The vegetation distributions for the PI simulations are shown in Fig. 5. It can be seen that model now captures the forest distribution over Asia as well as other major areas of forest globally. The overestimation of bare soil over India and Australia remains however. It is also compared with the vegetation distribution simulated with a version of HadGEM2 with the HadCM3M2 model parameter values for leaf mortality also spun up for 50 years. This model shows too little coverage of high latitude forest cover at the expense of an overestimation of shrubs in many regions, showing that the tuned parameter set is an improvement over HadCM3M2.

The LGM simulations for these two model parameter sets are compared in Fig. 6. Some subtle differences emerge over Western North America where the tuned set is less prone to dominance of shrubs, and over Australia where the tuned set shows less grass coverage. Predominantly though, these LGM distributions are remarkably similar, especially over Asia.

In Fig. 7 the simulated annual mean surface temperature anomalies are compared against the LGM pollen-based reconstructions of Bartlein et al. (2011) using a Taylor diagram. This shows the root mean square error (rmse), the correlation and the ratio of the normalised standardised deviations, all calculated for the region northwards of 30°N.

Fig. 7
figure 7

Taylor diagram of the annual mean temperature (red) and precipitation (blue) anomalies in different model runs compared with pollen-based terrestrial climate reconstructions of Bartlein et al. (2011). Asterisk symbols denote models with dynamic vegetation enabled. Models with negative correlation values are listed at the bottom of the figure. Here the normalised standard deviations and correlation values are the upper and lower values respectively

For the LGM-PI temperature anomaly the standard version of HadGEM2A (model 1) has significantly worse agreement with the data in terms of standardized deviations and rmse. The other model versions have similar levels of rmse fit and very similar correlations. Overall the two models with no dynamic vegetation HadCM3 and HadGEM2 (both with fixed pre-industrial vegetation: models 2 and 6) have the closest agreement with the pollen-inferred temperature anomalies. Introducing dynamic vegetation results in too much cooling over Asia in both HadGEM2 and HadCM3 in comparison with the reconstructions. This feature is also evident in some models from an analysis of LGM model simulations in the PMIP3/CMIP5 database (not shown). Overall therefore the Taylor diagram analysis highlights substantial remaining differences between the models and the reconstructions.

This last point is further supported by the Taylor diagram for mean annual precipitation anomalies shown in blue on Fig. 7. This shows significant discrepancies between the models and between the models and reconstructions. HadGEM2 shows a more realistic pattern of precipitation change than the HadCM3 models, though the agreement is much less convincing than for the temperature anomalies. The HadCM3 fields are anti-correlated with the reconstructions whilst all of the HadGEM2 models show a positive correlation. This is mostly due to the changes simulated over North America, where most of the data is centred. In HadGEM2 there is qualitative agreement with the reconstructions, whilst both HadCM3 models show a pattern of increased precipitation in the east of North America which is not supported by the data.

Although not the subject of this study, this different response in HadGEM2 is interesting given that the SST field is common to both models and so cannot be the driver, which must instead be related to differences in the two atmospheric models. The precipitation reconstruction coverage is very sparse over Asia, and so this comparison is less impacted by the changes in the land surface and temperatures over Asia. Hence the three HadGEM2 models are very close in terms of agreement with the pollen-based precipitation reconstruction.

The two versions of HadGEM2 with modified leaf mortality parameters are essentially indistinguishable in their ability to simulate the LGM climate anomalies in comparison with the pollen data. Since the tuned model derived here gives a better vegetation distribution under pre-industrial conditions we take this as the preferred model version in this work.

4.2 Implications for other Earth System model components: dust

As well a having a direct feedback on surface radiation balance, the simulated vegetation distribution also indirectly affects the climate in HadGEM2 through the mineral dust cycle. HadGEM2 includes the main processes related to the dust cycle (Woodward 2011) and so here we briefly explore the main impacts of the changes in model parameters on this component.

Figure 8 shows the LGM-PI anomaly of dust aerosol optical depth at 550 µm for the standard version of HadGEM2-A and the tuned model version of HadGEM2. This figure shows the dust cycle anomalies as LGM divided by PI. The standard version of the model shows a very strong increase in AOD over Asia caused by the bare soil expansion. The area of minimum change around North Africa is the same in both models as is the apparent increase in dust loading over the Southern Ocean, mostly as a result of changes in the dust source located in Patagonia. A strong difference between the default and tuned model versions is seen over Indonesia, but has a relatively limited geographical distribution.

Fig. 8
figure 8

Ratio of aerosol optical depth at 550 μm between LGM and PI for the standard version of HadGEM2 (left) and for the new optimal parameters (right). The largest differences between the schemes occur in the far Northern Hemisphere which is driven by large area of bare soil in LGM simulation with the standard HadGEM2 version. Note that emissions are inhibited at temperatures below 0 °C which somewhat limits the amount of dust emitted in this cold region in the model

4.3 Implications for the climate in response to increased CO2

HadGEM2-ES has the 2nd highest equilibrium climate sensitivity of any of the CMIP5 models at 4.6 °C (Andrews et al. 2012b). This value is also substantially higher than HadCM3 for which the value is closer to 3.7 °C (e.g. Raper et al. 2002). Whether this high climate sensitivity translates to enhanced continental cooling in these cold climate simulations, is uncertain. Crucifix (2006) demonstrated a weak relationship between the radiative feedback parameters for the LGM as compared to a double CO2 climate. This is likely to be even more significant here since the vegetation is influenced by the atmospheric CO2 concentration.

In order to address this question we additionally perform simulations with the full Earth System model HadGEM2-ES which now includes the ocean and tropospheric chemistry components. We ran two sets of pre-industrial control simulations and simulations with the CO2 level abruptly increased to four times pre-industrial levels (4 × CO2). The first setup uses the default version of the model as used for CMIP5 (Jones et al. 2011) and the second using an updated set of leaf mortality parameters derived here.

We first ran the standard model under pre-industrial conditions for 30 years. The modified parameter version was initialised from the same initial conditions as this simulation but was run with TRIFFID in equilibrium mode with an update every 3 years for 15 years. This allows the vegetation distribution to come to near-equilibrium with the simulated climate. This was followed by a control simulation run for 50 years to allow grass PFTs to equilibrate. Finally both the standard and the modified parameter pre-industrial controls were continued for 50 years with the atmospheric CO2 level abruptly increased by a factor of 4 from 805 ppm to 3320 ppmv. Averages were calculated over years 21–50.

Figure 9 shows the impact of the parameter change on both the pre-industrial and the 4 × CO2 simulations, as well as on the 4 × CO2 minus pre-industrial anomalies. Unsurprisingly, the parameter-induced temperature anomaly is very small in comparison with the temperature change induced by the quadrupling of atmospheric CO2 which can reach over 15 °C in places. The parameter change induces cooling at mid- to high-latitudes in the pre-industrial and enhanced cooling in the high-latitudes only in the 4 × CO2 simulation.

Fig. 9
figure 9

Difference in the HadGEM2-ES simulated temperatures between the standard and tuned model runs for the pre-industrial (left) and 4 × CO2 (right) conditions and the resulting difference between the climate change (4 × CO2-PI) anomalies (lower panel)

Changes in surface temperatures between these model versions can arise as a result of modification of the surface albedo through changes in the snow vegetation albedo, from changes in the vegetation distribution or simply because of internal model variability. There are coherent signals in decreased leaf area index, increased surface albedo and increased temperature over the far north of northern America and the farthest Eastern parts of Siberia, suggestive of the first mechanism. There are also substantial differences in the fractional coverage of grasses between the two model versions, but the magnitude of the distribution change for tree PFTs is small because of the short length of the runs. The small warming signal seen over mid-latitude North America is most likely an artefact of internal variability, since there is no coherent signal in the leaf area index or surface albedo fields in these regions and changes in grass coverage are small here.

In general the parameter modification cools the Northern Hemisphere as seen in the top left panel of Fig. 9. In the 4 × CO2 simulation this cooling reduces the sea-ice albedo feedback slightly, leading to the extra cooling over the Arctic seen in the top right plot in Fig. 9. Internal model variability is most likely responsible for regions of an opposite warming temperature signal in the Barents Sea area as interannual variability in this region is significant.

Overall it is clear that the changes in the leaf mortality parameters can have a reasonable regional impact. A Gregory-method analysis (Gregory et al. 2004) of the TOA radiation balance and surface temperature change, suggests that the tuned model version has a 5 % lower value of climate sensitivity compared to the standard version of the model. It is difficult to quantify this more robustly though, without continuing the simulations for longer than 50 years as performed here.

5 Discussion and conclusions

Using an atmosphere-only version of HadGEM2 we investigated the climatic response to glacial climate forcings. HadGEM2 responds particularly strongly to LGM boundary conditions and the version of the TRIFFID dynamic vegetation model coupled interactively within HadGEM2 simulates a massive expansion of bare soil (i.e. desert areas) at the LGM which is not supported by palaeo-vegetation compilations (Prentice et al. 2000) or other modelling studies (Crucifix and Hewitt 2005b; Harrison and Prentice 2003).

Detailed analysis of the model results allows us to diagnose the exact causes of the enhanced cooling and vegetation die off in the model. The major factor behind the cooling in the model is an increase in the strength of the albedo feedback for grass PFTs caused by an overall reduction in the LAI of grass PFTs relative to other versions of TRIFFID/MOSES. This reduction in LAI also leads to a significant reduction in NPP and the combined impact on vegetation coverage leads to expansion of bare soil areas over much of Asia in the LGM simulation. This reduction in LAI is caused by the introduction of a temperature-dependence of leaf mortality for grass PFTs. This feature is present in both HadGEM2 and JULES (Best et al. 2011; Clark et al. 2011), but was disabled in earlier versions of the Met Office model HadCM3M2.

Resetting the parameters in HadGEM2 to those in HadCM3M2 resulted in better coverage of vegetation at the LGM in Asia and over other regions in a comparison of the equivalent pre-industrial simulations with observed vegetation distributions. Further changes to the model parameters also allow other improvements to the vegetation coverage in the model, for example increasing the coverage of the needle-leaf trees PFT over Siberia where the default versions of both HadGEM2 and HadCM3 simulate too much shrub coverage, and increasing the seasonality of LAI.

These modifications to the parameter values might also impact on the climatic response in a warmer climate. To test this we also ran the modified version of the model under an abrupt quadrupling of CO2 scenario as used in CMIP5. Comparison with the equivalent simulations of the standard version of HadGEM2-ES shows regional variations in the response to increased atmospheric CO2 but no significant change in the global average.

One limitation in the current work is the use of atmosphere-only HadGEM2 simulations forced with SSTs and sea-ice from another model, in this case HadCM3. Whilst this impacts on evaluating the overall system response in HadGEM2, this approach does allow for a large number of sensitivity tests with the model and enabled us to diagnose the root cause of the enhanced cooling and vegetation die-back. One missing physical process is the impact on snow albedo from dust deposition. This may be particularly important for the glacial period (e.g. Krinner et al. 2006), but is not included in the current HadGEM2 setup.

Our work suggests that climate simulations of other time periods in Earth’s history can provide useful constraints on our understanding of the Earth System. We have shown that the constraints provided by proxies of past climate and environmental changes at the LGM can, despite relatively large uncertainties, be used to test models of the Earth System. Part of this arises from the large difference in climate between the LGM and the pre-industrial periods investigated here. However, this may also be partly the result of a tendency in complex models for the amplification of errors, so that small (but significant) discrepancies in one model component, here leaf area index, can have a larger effect on other variables such as surface temperature via the interaction with other processes, here snow cover.