1 Background

Climate change affects the productivity (e.g., Boisvenue and Running 2006) and carbon sequestration of forests (cf. Hui et al. 2017). To model potential future site-specific forest productivity, high-quality data for both forest stands and climate are required. So far, the available climate data meet the requirements of forest science only to a limited degree. Existing datasets, such as the E-OBS gridded dataset (Haylock et al. 2008), the WorldClim data (Hijmans et al. 2005), the grids of monthly aggregated climate data (e.g., DWD Climate Data Center—CDC 2016a), and the recently published CHELSA data (Karger et al. 2017), achieve a spatial resolution of 0.2° and 0.3° (lat./long.) or a regular 1-km2 grid. The high-resolution digital terrain models available for Germany from the German Federal Agency for Cartography and Geodesy (Bundesamt für Kartographie und Geodäsie BKG 2010) provide a suitable basis for a more precise representation of terrain-induced climatic effects, such as cold air flow and cold air accumulation. Long-term changes in climatic conditions are not the only cause of physiological stress in trees; weather extremes in particular can have effects within short timescales. Therefore, the climate variables influencing forest productivity are required in daily resolution, which is provided neither by WorldClim and CHELSA nor by the DWD climate datasets. Within the project “Forest Productivity–Carbon Sequestration–Climate Change” (WP-KS-KW) (Mette 2017), climate regionalization meets these spatial and temporal requirements by resolving climate information for Germany in a daily raster of 250 × 250 m for the period 1961 to 2100.
The “NFI 2012 environmental data base climate” contains aggregated time series with spatial climate information for monthly and annual values, extreme value statistics, and specific characteristics for phenological phases or specific periods in the above-mentioned timeframe for the German National Forest Inventory (NFI) and the National Forest Soil Inventory (NFSI).

2 Methods

2.1 Regionalization of historical daily station observations

To obtain spatial high-resolution climate data for the period of 1961 to 2013, the historical daily observations of the National German Weather Service (Deutscher Wetterdienst) (DWD Climate Data Center—CDC 2016b), supplemented by weather stations of the Global Surface Summary of the Day (GSOD) (Menne et al. 2012) bordering Germany, were regionalized applying geographically weighted regression (GWR) technique using different terrain parameters as predictors. Terrain parameters are derived from a nationwide digital elevation model with a horizontal resolution of 25 m (Bundesamt für Kartographie und Geodäsie BKG 2010), resampled to a 250-m grid, which provides the basis for the parameterization of terrain effects.

The regionalization of station-based data comprised the following meteorological variables: (a) global radiation (daily sums for horizontal and inclined surfaces), (b) temperature (daily mean, minimum, maximum), (c) vapor pressure deficit (daily mean), (d) wind speed (daily mean), and (e) precipitation (daily sums). In the following, we briefly introduce the GWR principle and then outline the variable-specific methodological setups.

GWR belongs to the family of multivariate regression techniques (Lloyd 2010) and serves to predict a target variable from predictor variables while allowing for spatially variable linear relationships. Basic descriptions of GWR can be found, among others, in Fotheringham et al. (1998) and Fotheringham et al. (2002). For a systematic evaluation of alternative spatialization methods, showing that GWR outperforms global spatialization approaches as well as local deterministic and geostatistical interpolation methods, see Böhner and Bechtel (2018). Further examples of GWR applications include Duan and Li (2016), who used GWR to downscale MODIS land surface temperatures in northern China; Li et al. (2010b), who applied GWR for the regionalization of urban surface temperatures; Brunsdon et al. (2001), who investigated the hypsometric variation of precipitation in Great Britain; and Sharma et al. (2011), who studied the correlation of spatial non-stationarities of precipitation and crop yields. Unlike global regression-based approaches, which provide only a single regression equation to describe the statistical relations between predictors and dependent variables, GWR integrates local regression analysis with inverse-distance weighting of input data. Interpolated to the target grid resolution, the resulting grids of regression parameters enable a non-stationary representation of the target variable, accounting for spatially varying predictor-predictand relationships even at topoclimatic scales (Brunsdon et al. 2001; Fotheringham et al. 1998; Lin and Wen 2011). Depending on the spatial distribution and density of input data, GWR moreover allows a search range and a specific bandwidth of the Gaussian weighting scheme to be defined in order to optimize the regionalization results (Böhner and Bechtel 2018). In this research, the bandwidth of the Gaussian kernel and the search range were obtained through cross-validation.
An overview of GWR configurations (bandwidth, search range, model resolution), terrain parameters used as predictors, and the approximate number of underlying weather stations is summarized in Table 1. The model resolution refers to the horizontal discretization of the initial GWR grid network, which intends to compromise between spatial detailedness and computational efficiency, elaborated through comprehensive performance tests. Unless otherwise stated, model development, performance tests, and the final implementation of the GWR were performed by the free and open source software SAGA GIS—System for Automated Geoscientific Analysis (Conrad et al. 2015).
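
The GWR principle described above can be illustrated with a minimal sketch. It is not the SAGA GIS implementation: it uses a single predictor (elevation) instead of the predictor sets listed in Table 1, and the function name, argument names, and the fallback to a weighted mean are assumptions made for illustration only.

```python
import math

def gwr_predict(stations, target_xy, target_elev, bandwidth, search_range):
    """Predict a value at one grid cell with a local, Gaussian-weighted regression.

    stations: list of (x, y, elevation, value) tuples, coordinates in metres.
    bandwidth and search_range are given in metres (illustrative names).
    """
    tx, ty = target_xy
    sw = swx = swy = swxx = swxy = 0.0
    for x, y, elev, value in stations:
        dist = math.hypot(x - tx, y - ty)
        if dist > search_range:
            continue  # station lies outside the search range
        w = math.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel weight
        sw += w
        swx += w * elev
        swy += w * value
        swxx += w * elev * elev
        swxy += w * elev * value
    if sw == 0.0:
        raise ValueError("no stations within the search range")
    denom = sw * swxx - swx ** 2
    if abs(denom) < 1e-12:
        return swy / sw  # assumption: fall back to the weighted mean
    # Weighted least-squares solution for value = intercept + slope * elevation
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * target_elev
```

Evaluating this function for every cell of the model grid yields spatially varying intercepts and slopes, which is what distinguishes GWR from a single global regression equation.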

Table 1 Attributes of GWR model setup and empirical input data

2.2 Meteorological variables

2.2.1 Solar radiation

Shortwave solar radiation is the primary climatic factor entering the energy budget and thus a major determinant of energy exchange and plant physiological processes. The observational database, however, was quite sparse with only 20 weather stations with long-term records of global radiation. Against this background, daily radiation values (global radiation and topographic radiation) were estimated using the semi-empirical method proposed by Allen et al. (1998). In this approach, the extraterrestrial solar radiation (at the top of atmosphere) is computed as a function of the geographical latitude, determining the astronomical day length and the diurnal course of the sun’s altitude and azimuth angle for each day of the year, while the strength of atmospheric extinction processes and the radiation income at the earth’s surface is estimated as an empirical function of the relative daily sunshine duration (i.e., the ratio between observed and astronomically possible sunshine hours). Relative sunshine duration was computed for 220 to 299 weather stations with daily sunshine observations, depending on station availability, interpolated to the target grid resolution using B-spline interpolation. For the delineation of topographic solar radiation from global radiation in consideration of the local terrain geometry and shadowing effects of the terrain, see Böhner and Bechtel (2018).
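
As an illustration of this estimation chain, the following sketch computes extraterrestrial radiation from latitude and day of year and scales it with the relative sunshine duration, using the standard formulas of Allen et al. (1998). The coefficients a_s = 0.25 and b_s = 0.50 are the defaults recommended there; whether the same coefficients were used for this database is an assumption, and the function names are hypothetical.

```python
import math

GSC = 0.0820  # solar constant in MJ m-2 min-1 (Allen et al. 1998)

def extraterrestrial_radiation(lat_deg, doy):
    """Return (Ra in MJ m-2 day-1, astronomical day length in hours)."""
    phi = math.radians(lat_deg)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * doy / 365.0)      # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(2.0 * math.pi * doy / 365.0 - 1.39)  # solar declination (rad)
    ws = math.acos(-math.tan(phi) * math.tan(delta))              # sunset hour angle (rad)
    ra = (24.0 * 60.0 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )
    day_length = 24.0 / math.pi * ws
    return ra, day_length

def global_radiation(lat_deg, doy, rel_sunshine, a_s=0.25, b_s=0.50):
    """Global radiation on a horizontal surface from relative sunshine duration (0..1)."""
    ra, _ = extraterrestrial_radiation(lat_deg, doy)
    return (a_s + b_s * rel_sunshine) * ra
```

For example, `global_radiation(51.0, 172, 0.6)` gives an estimate for a mid-June day at 51° N with 60% of the astronomically possible sunshine hours.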

2.2.2 Temperature

Regional temperature variations are commonly strongly correlated with altitude, reflecting the close dependence of environmental lapse rates on the circulation mode and thus on the state and stratification of the troposphere. At the topoclimatic scale, however, the differential heating of slopes as well as the nocturnal cold air formation and cold air flow alters the tropospheric stratification and leads to distinct distribution patterns of temperature in the near surface layer, particularly in case of inversions. To account for both effects, temperatures (minimum, maximum, and mean daily temperatures) were regionalized via GWR considering elevation and the terrain exposure index (TEI) as predictors. The TEI is a complex-analytic DEM-based terrain parameter, which indicates the degree, to which a particular location is sheltered from advection flows and is thus particularly suitable to indicate terrain settings frequently exposed to cold air flow and cold air accumulation (e.g., valleys, terrain sinks, mountain-rimmed basins). For a comprehensive formal description and meteorological justification of the TEI, see Böhner and Bechtel (2018) and Böhner and Antonic (2009).

2.2.3 Vapor pressure deficit

Given that the saturation vapor pressure is determined by temperature, the vapor pressure deficit was simply estimated from the saturation vapor pressure, computed for gridded daily mean temperatures using the Magnus equation (cf. Zmarsly et al. 2007), and the relative vapor pressure deficit of the weather station network, interpolated to the target grid resolution via B-spline interpolation. This rather simple approach was chosen to avoid competing surface parameterizations in the regionalization scheme, given that the regionalization of temperatures already considered the TEI and the altitude as predictors.
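
A minimal sketch of this calculation is given below. The Magnus coefficients (6.112 hPa, 17.62, 243.12 °C) are a commonly used parameterization over water and are an assumption here, since the coefficients actually used are not stated in the text; the function names are hypothetical.

```python
import math

def saturation_vapor_pressure_hpa(temp_c):
    """Saturation vapor pressure (hPa) from air temperature (deg C), Magnus form."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))

def vapor_pressure_deficit_hpa(temp_c, relative_deficit):
    """Vapor pressure deficit (hPa).

    relative_deficit is the relative vapor pressure deficit as a fraction,
    i.e. 1 - RH/100, taken from the interpolated station data.
    """
    return saturation_vapor_pressure_hpa(temp_c) * relative_deficit
```

With a daily mean temperature of 20 °C and a relative deficit of 0.4, the sketch returns a deficit of roughly 9.3 hPa.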

2.2.4 Wind speed

To avoid inconsistencies due to the strongly varying data coverage over time, the estimation of daily wind speeds considered both the available daily observations from 49 to 181 stations and the average monthly wind climatologies from Böhner (2004). The gridded 250 × 250-m resolution datasets for 10 and 2 m above ground were generated according to the statistical SWM (Statistisches Windfeldmodell, cf. Gerth and Christoffer 1994) approach and accounted for terrain and land use effects on the air flow through roughness parameterization (Böhner 2004). The empirical database comprised wind climatologies from 109 wind stations, standardized for different roughness classes and height levels according to the Wind Atlas approach (cf. Traub and Kruse 1996; Böhner 2004).

Because measuring heights differ, the time series of mean daily wind speeds were first reduced to 2 m above ground and a standard roughness length of 0.0148 m (FAO reference surface, cf. Allen et al. 1998) according to the logarithmic wind profile law (Oke 2000), using CORINE-based roughness grids from Böhner (2004) created for the lowest part of the inertial sublayer. After z-transformation of the time series, the z values were regionalized via GWR, again using the TEI as predictor. Assuming daily wind speeds to be Rayleigh distributed with a constant ratio of approximately 1.91 between mean and standard deviation (Shankar 2017), daily mean wind speeds were finally derived from the monthly gridded wind climatologies via inverse z-transformation applied to the long-term means.
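
The two numerical ingredients of this step can be sketched as follows. The sketch only covers the height reduction at a fixed roughness length, not the additional standardization between roughness classes, and the clipping of negative wind speeds to zero is an assumption added for illustration. Function names are hypothetical.

```python
import math

Z0_REF = 0.0148  # roughness length of the FAO reference surface (m)

# For a Rayleigh distribution, mean / standard deviation = sqrt(pi / (4 - pi)),
# which is approximately 1.91.
RAYLEIGH_MEAN_SD_RATIO = math.sqrt(math.pi / (4.0 - math.pi))

def reduce_to_2m(speed, height, z0=Z0_REF):
    """Reduce a wind speed measured at `height` (m) to 2 m with the log wind profile."""
    return speed * math.log(2.0 / z0) / math.log(height / z0)

def daily_wind_from_z(z_value, monthly_mean):
    """Inverse z-transformation using the Rayleigh-based standard deviation."""
    sd = monthly_mean / RAYLEIGH_MEAN_SD_RATIO
    return max(0.0, monthly_mean + z_value * sd)  # assumption: no negative speeds
```

Under these assumptions, a wind speed measured at 10 m is reduced by a factor of about 0.75 when referred to 2 m above ground.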

2.2.5 Precipitation

Precipitation is highly variable in space and time and thus requires a regionalization strategy that accounts for the large spatial variability and is robust against changing data densities over time. The daily precipitation fields were therefore estimated in a two-step procedure. First, long-term 1961–1990 monthly means from the DWD rain gauge network were regionalized using GWR with elevation as predictor. The observational database comprised consistently processed precipitation climatologies from a total of 4752 stations (DWD 2017). This dataset constitutes the densest available network of point-source precipitation observations for Germany and sufficiently represents the spatial variation pattern of precipitation. Second, based on all available daily time series, daily precipitation fields were inferred from relative daily precipitation rates (i.e., the ratios of daily totals to long-term monthly means), which were interpolated to the target grid resolution via B-spline interpolation and finally multiplied by the respective monthly means.
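
The second step can be illustrated with a small worked example. The spatial interpolation of the ratios is omitted here, and the function names and the handling of a zero monthly mean are assumptions.

```python
def relative_rate(daily_total_mm, monthly_mean_mm):
    """Ratio of a station's daily total to its long-term monthly mean."""
    if monthly_mean_mm <= 0.0:
        return 0.0  # assumption: treat stations without a valid mean as dry
    return daily_total_mm / monthly_mean_mm

def daily_precipitation(interpolated_ratio, gridded_monthly_mean_mm):
    """Daily precipitation at a grid cell from the interpolated ratio."""
    return max(0.0, interpolated_ratio) * gridded_monthly_mean_mm

# A station records 12 mm on a day in a month with a long-term mean of 60 mm,
# giving a ratio of 0.2. At a grid cell whose GWR-based monthly mean is 90 mm,
# the same ratio translates into 18 mm.
example_mm = daily_precipitation(relative_rate(12.0, 60.0), 90.0)
```

Because the ratios vary more smoothly in space than the daily totals themselves, the terrain dependence enters only through the monthly means regionalized in the first step.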

2.3 Statistical and dynamical climate projections

To cover a range of potential future climate developments, climate projections were performed for three Representative Concentration Pathway (RCP) emission scenarios (RCP 2.6, 4.5 and 8.5; cf. IPCC 2014) using MPI-ESM (Max Planck Institute for Meteorology—Earth System Model) simulations as forcings. The state-of-the-art MPI-ESM contributed to the Fifth Assessment Report (AR5) of the IPCC (2014) and consists of the atmospheric General Circulation Model ECHAM6, the land vegetation model JSBACH, the ocean GCM MPIOM, and the ocean biogeochemistry model HAMOCC. An overview of the modeling components is given in Ilyina et al. (2013), Jungclaus et al. (2013), Reick et al. (2013), Schneck et al. (2013), and Stevens et al. (2013).

MPI-ESM runs were performed within the frame of the international Coupled Model Intercomparison Project CMIP5 (Taylor et al. 2012; Hasson et al. 2016), projecting an area-averaged twenty-first century warming for Germany from approximately 1.3 (RCP 2.6) to 3.9 K (RCP 8.5) and only minor changes in precipitation of approximately +2% (RCP 2.6) and −1% (RCP 8.5), quite in line with the majority of climate change signals of the CMIP5 model ensemble (cf. IPCC 2014; Kovats et al. 2014). The grid mesh resolution of approximately 1.9° of the ECHAM6 atmospheric General Circulation Model (cf. Giorgetta et al. 2012), however, is far too coarse for climate impact analyses and thus requires a suitable strategy for spatially refining the limited-resolution model outputs, using dynamical or statistical downscaling (cf. Böhner and Bechtel 2018). In this research, both basic approaches, dynamical and statistical downscaling, were considered.

2.3.1 REMO

Climate projections based on dynamically downscaled MPI-ESM simulations were generated for the environmental database of the German NFI, using bias-corrected REMO (Regional Climate Model) simulations. The internationally renowned Limited-Area Model (LAM) REMO of the Max Planck Institute for Meteorology, Hamburg, is a hydrostatic (i.e., non-convection-permitting) mesoscale climate model, capable of resolving mesoscale processes at a typical horizontal discretization of about 50 to 10 km (Jacob et al. 2014). REMO simulations were carried out in the context of the Coordinated Regional Climate Downscaling Experiment (CORDEX) of the World Climate Research Program (WCRP). The coordinated framework enabled a systematic evaluation of different modeling initiatives, confirming the general validity of REMO simulations (Jacob et al. 2012; Kotlarski et al. 2014). REMO projections for the twenty-first century (since 2006) were performed in a nested approach, spatially refining the forcing ECHAM6 modeling results for RCP 2.6, 4.5 and 8.5 down to a horizontal grid mesh resolution of approximately 12 km (Kotlarski et al. 2014; Pfeifer et al. 2015).

In order to capture the intrinsic variability of REMO simulations, we refrained from additional downscaling and limited the further processing of REMO outputs to bias corrections. Comprehensive tests of different bias correction methods were conducted (Sachindra et al. 2014; Wetterhall et al. 2012), based on REMO 1971–2000 control run data and the corresponding regionalized climate data of the observational network. Performance tests of alternative approaches were initially carried out on a subset of approximately 2500 paired (i.e., observational vs. modeled) time series for all target variables, showing the rather simple direct method of Wetterhall et al. (2012) to be suitable for radiation and temperature corrections, particularly in view of the relatively low model biases. Correction terms for modeled radiation data were based on the quotients of long-term 1971–2000 monthly means of observational and modeled data, showing a general tendency of REMO to slightly overestimate solar radiation. For the interval-scaled temperature variables (min, mean, max), model biases were taken as the 1971–2000 mean monthly differences between observational data and the REMO control run, revealing mainly slight warm biases of REMO (i.e., a model overestimation), particularly in spring (April, May, June). The correction terms for radiation and temperature were consistently applied to the daily resolution scenario projections.
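
The direct correction described above amounts to a multiplicative factor for radiation and an additive offset for temperature, each derived per calendar month from the control period. The following sketch shows the arithmetic for a single month; the function names are hypothetical and the sample values in the usage note are invented for illustration.

```python
from statistics import mean

def radiation_factor(obs_control, mod_control):
    """Quotient of observed and modeled long-term monthly means (multiplicative term)."""
    return mean(obs_control) / mean(mod_control)

def temperature_offset(obs_control, mod_control):
    """Difference of observed and modeled long-term monthly means (additive term)."""
    return mean(obs_control) - mean(mod_control)

def apply_direct_correction(daily_projection, factor=1.0, offset=0.0):
    """Apply the monthly correction term to every daily value of that month."""
    return [value * factor + offset for value in daily_projection]
```

If, for instance, the control run is on average 1 K too warm in a given month, `temperature_offset` returns −1 and every projected daily temperature of that month is lowered by 1 K.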

To avoid competing bias corrections, the vapor pressure deficits of the projection period were estimated directly by multiplying REMO-modeled relative moisture deficits by the vapor pressure determined from bias-corrected daily mean temperatures, again using the Magnus equation (Zmarsly et al. 2007). Tests based on paired time series from the control period 1971–2000 confirmed that this simple procedure is sufficiently accurate, whereas separate bias corrections of the modeled vapor pressure deficits partly produced implausible results (e.g., negative values) when applied to the projections.

Bias corrections of wind speed and precipitation were performed using the correction method proposed by Sachindra et al. (2014). In this approach, the modeled daily time series of the control period and the projection period are first z-transformed based on the monthly means and standard deviations of the modeled control period. In a second step, the obtained z values are transformed back, this time using the monthly means and standard deviations of the observed data. As a result, the method corrects both the average bias and any bias in the standard deviation of the modeled data, yielding identical statistical parameters (mean, standard deviation) of observed and bias-corrected time series for the control period. Moreover, compared to more complex bias correction methods, such as the quantile method (Maraun 2013), which is frequently applied to precipitation data (Maraun 2013; Li et al. 2010a), the approach is computationally efficient, avoids producing implausible outliers in the projections, and was found to sufficiently capture the typically right-skewed statistical distribution of wind and precipitation series. Corrections of precipitation projections adjusted the moist bias of REMO, which was particularly distinct in February, whereas the standard deviations of the control run were generally in good agreement with the observations.
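
The two-step transformation can be written compactly. The sketch below handles a single calendar month and uses the population standard deviation; both the function name and that choice of estimator are assumptions.

```python
from statistics import mean, pstdev

def z_bias_correct(model_series, model_control, observed_control):
    """Correct modeled daily values of one calendar month.

    Step 1: z-transform with mean and standard deviation of the modeled control period.
    Step 2: inverse z-transform with mean and standard deviation of the observations.
    """
    m_mod, s_mod = mean(model_control), pstdev(model_control)
    m_obs, s_obs = mean(observed_control), pstdev(observed_control)
    return [m_obs + (value - m_mod) / s_mod * s_obs for value in model_series]
```

Applying the function to the control run itself reproduces the observed mean and standard deviation exactly, which is the property stated above; applied to the projection period, the modeled change signal relative to the control run is retained in rescaled form.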

Bias corrections were consistently applied to the REMO projection period 2006–2100, allowing modeling results to be compared directly with regionalized observations for the 8-year overlap period 2006–2013. Tables 2 and 3 show monthly and annual statistics of daily temperature and precipitation series computed as areal means over the German model domain. In comparison with observational statistics, the first three moments (mean, standard deviation, skewness) as well as the minimum and maximum values of the RCP scenarios are generally plausible. Basic variation characteristics, such as the strongly right-skewed distribution of precipitation throughout all months, the rather symmetric distribution of temperatures, and the standard deviations for both temperature and precipitation, are largely in line with the observational data. Only the simulated temperature ranges and extremes are on average slightly lower than observed. However, considering that the temperature variability in the period 2006–2013 was slightly higher than in the climate normal period 1971–2000, this finding does not generally indicate an underestimation of temperature variability in the modeling results. Keeping in mind that scenario simulations are projections rather than predictions and therefore cannot be expected to reproduce specific seasonal features of observed variations synchronously (e.g., the relatively dry and warm April conditions during 2006–2013 as compared to the 1971–2000 climate normals), the modeling results prove suitable for representing plausible future climate pathways. Examples of long-term changes in temperature and precipitation over the German model domain, depicting REMO-based modeling results for the moderate RCP 2.6 scenario and the extreme-end warming RCP 8.5 scenario, are shown in Figs. 1 and 2. Tables 4 and 5 summarize area-averaged mid-century (2041–2050) and end-century (2091–2100) temperature and precipitation projections for the three RCP scenarios.

Table 2 Monthly and annual statistics of area-averaged daily temperature series for regionalized observations (Obs.) and REMO EURO-CORDEX projections (RCPs) of the period 2006–2013. Daily mean (DM), standard deviation (SD), skewness (Skew), minimum (Min), and maximum (Max)
Table 3 Monthly and annual statistics of area-averaged daily precipitation series for regionalized observations (Obs.) and REMO EURO-CORDEX projections (RCPs) of the period 2006–2013. Daily mean (DM), standard deviation (SD), skewness (Skew), minimum (Min), and maximum (Max)
Fig. 1 Historical mean temperature (1971–2000) based on regionalized in-situ observations (left). Change of mean temperature (2071–2100 vs. 1971–2000) in the bias corrected REMO EURO-CORDEX RCP 2.6 scenario (middle) and the RCP 8.5 scenario (right)

Fig. 2 Average annual rainfall (1971–2000) based on regionalized in-situ observations (left). Relative change of annual precipitation (2071–2100 vs. 1971–2000) in the bias corrected REMO EURO-CORDEX RCP 2.6 scenario (middle) and RCP 8.5 scenario (right). Due to the low precipitation amounts in continental eastern Germany, equal absolute changes (in mm) result in higher relative changes (in %) than in rainy areas

Table 4 STARS and bias corrected REMO EURO-CORDEX projections of temperature change for Germany under Representative Concentration Pathway (RCP) scenarios. Absolute monthly and annual changes refer to area-averaged temperatures of the reference period 1971–2000
Table 5 STARS and bias corrected REMO EURO-CORDEX projections of precipitation change for Germany under Representative Concentration Pathway (RCP) scenarios. Relative monthly and annual changes refer to area-averaged precipitation sums of the reference period 1971–2000

2.3.2 STARS

In order to explore the impact of using different modeling paradigms on the outcome of scenarios, the dynamical (REMO) simulations were supplemented by purely statistically generated projections for the near-future period 2011–2050 using the Statistical Analogue Resampling Scheme (STARS) from the Potsdam Institute for Climate Impact Research (PIK). The STARS approach applies block-bootstrapping techniques with heuristic rules to recombine observed time series according to a prescribed temperature trend. The iterative procedure first rearranges annual values and subsequently 12-day blocks, taking into account the interannual variability and the rank of the time intervals, until the artificial time series fits a given temperature trend simulated by a numerical climate model. Due to its low computational demands, STARS allows large ensembles of statistical realizations to be generated, each reproducing the forcing temperature trend within a predefined deviation tolerance. To account for large-scale differences in climate projections, modeled temperature signals are assigned to a limited number of weather stations, considered as reference locations, which are representative in terms of their climatic seasonality and long-term observed temperature changes (Orlowsky et al. 2008; Lutz and Gerstengarbe 2014).
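
The basic idea of recombining observed years until a prescribed trend is met can be illustrated with a deliberately naive sketch. This is not the STARS algorithm: it covers only the first (annual) resampling level, replaces the heuristic rules by plain rejection sampling, and ignores the 12-day block level entirely. All names and the acceptance criterion are assumptions.

```python
import random

def linear_trend(values):
    """Least-squares slope per time step of an evenly spaced series."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def resample_years(annual_means, n_years, target_change, tolerance, rng,
                   max_tries=100000):
    """Draw observed years with replacement until the implied total change
    (slope times series length) is within `tolerance` of `target_change`."""
    years = list(range(len(annual_means)))
    for _ in range(max_tries):
        picked = [rng.choice(years) for _ in range(n_years)]
        change = linear_trend([annual_means[i] for i in picked]) * (n_years - 1)
        if abs(change - target_change) <= tolerance:
            return picked
    raise RuntimeError("no realization found within the tolerance")
```

The returned list of year indices would then be applied identically to all variables, which is what keeps the relations between them as observed. Plain rejection sampling becomes very inefficient once the target change exceeds the variability of the observations, which is one reason a scheme like STARS relies on heuristic rules instead.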

Following the procedure suggested by Orlowsky and Lutz (2013), we applied K-means cluster analysis to the observational database of 75 weather stations, taking into account the means and standard deviations of temperature and precipitation as well as the difference between the first and the second half of the observation period (cf. Orlowsky and Lutz 2013). As a result, the weather stations Schwerin and Fichtelberg were identified as reference locations, which subsequently steered the block-bootstrapping procedure of temperature time series of the period 1961–2013. The forcing warming signals were taken from ECHAM6 simulations of the period 2011–2050, ranging from 0.7 (Schwerin) and 0.8 K (Fichtelberg) in the moderate RCP 2.6 scenario to 1.3 (Schwerin) and 1.4 K (Fichtelberg) in the extreme-end warming RCP 8.5 scenario. Using only two reference weather stations is assumed to be sufficient in view of the quite low spatial differences of the ECHAM6 temperature projections over Germany.

To prevent a disproportionately high resampling of warm 12-day blocks, which is required at the second resampling level when enforcing a temperature increase beyond the standard deviation of the underlying observational database and which causes implausible scenario results (Wechsung and Wechsung 2015), we limited the length of the projection period to 40 years and refrained from producing an additional temporal overlap with observations. As noted in the STARS evaluation of Wechsung and Wechsung (2015), a statistical comparison between observations and model results, as was done for the REMO outcomes, is however of rather limited value, given that one evaluates statistics from recombined observational records against statistics from the same observational database.

For each of the three RCP scenarios, we computed an ensemble of 100 realizations, all fitting the prescribed temperature trend of the respective scenario at a deviation tolerance of ± 0.2 K. The STARS results for the mid-century decade 2041–2050 presented in Tables 4 and 5 refer to the ensemble medians (cf. Lutz et al. 2013). Although only these median realizations are maintained in the “NFI 2012 environmental data base climate”, additional results of ensemble simulations can be provided for particular inventory points on demand.

2.3.3 REMO vs. STARS

In general, dynamical downscaling is commonly considered advantageous in terms of physical consistency as compared to statistical approaches, given that regional climate models, such as REMO, solve the same fundamental differential equations of thermo- and hydrodynamics as the driving global model and accordingly capture the large-scale climate signal from the forcing model, though altered with respect to regional environmental features. As a result, dynamical downscaling performed with REMO provides fully distributed, spatially and temporally coherent fields of all relevant atmospheric variables for every model-internal time step over a long simulation period (Böhner and Bechtel 2018). Statistical modeling approaches, in contrast, are held to be more empirically robust than dynamical simulations. As these bottom-up approaches are explicitly designed on the basis of observational data, empirical congruence is inherent to them (Böhner and Bechtel 2018). This holds particularly true for STARS, where a once-defined temporal recombination scheme is applied identically to all variables and time series (i.e., in this research to all gridded time series of the German modeling domain), ensuring that all relations between different climatic factors (between temperature, radiation, precipitation, etc.) and their respective spatial distributions are captured as observed (Lutz et al. 2013; Orlowsky et al. 2008). Shortcomings due to the resampling of climate records from the past (i.e., from a time span with less radiative forcing than is to be expected in the future) concern the projected time slice, which should typically be limited to the near future (Orlowsky 2007), and the projection of extremes, which is limited by the observational database (Lutz et al. 2013). More fundamental constraints of STARS concern the consistency of the modeled results.
Wechsung and Wechsung (2015) demonstrated that the resampling procedure conditioned by a predefined warming signal tends to turn short-term weather co-variations of temperature and its co-variables (e.g., the coincidence of reoccurring hot days during longer-lasting dry spells and rather cool days during rainy phases in summer) into long-term climate trends diverging from the forcing climate model projections. This becomes apparent in different STARS results (cf. Lutz et al. 2013; Wechsung and Wechsung 2015) generally suggesting a future drying trend for Germany with increasing warming rates particularly in summer, whereas the ECHAM6 model results suggest only very minor changes (cf. Svoboda et al. 2015).

Against this background, time series generated with STARS are suggested for analyzing environmental impacts of limited warming signals, as is mostly the case in near-future climate projections. Moreover, STARS ensemble realizations used in environmental sensitivity studies offer the opportunity to estimate uncertainties when, e.g., modeling site-specific forest productivity, and allow particularly dry or moist STARS realizations to be selected explicitly in order to model the range of possible near-future biophysical responses to changing water availability. The twenty-first-century temperature increase in the ECHAM6 extreme-end warming RCP 8.5 scenario, in contrast, is out of scope for STARS and only sufficiently captured in the REMO simulations. Given that anthropogenic radiative forcing will affect future average conditions, spatiotemporal distributions, and extremes (IPCC 2014), centennial high-resolution REMO simulations are particularly suited to analyzing and assessing future forest response to both long-term transient climate changes and altered frequencies and magnitudes of extremes beyond the historically observed range. In view of these different application domains, STARS and REMO projections should not be seen as competing but as complementary model realizations of possible future climates.

2.4 Workflow of data processing

All steps in generating spatial high-resolution climate data were performed using the System for Automated Geoscientific Analysis (SAGA). The modular, free, and open source Geographical Information System SAGA was explicitly developed for applications in the field of regional climate and environmental modeling (Conrad et al. 2015). It provides various routines for the parameterization of topographically determined or affected topoclimatic processes based on digital terrain models and land use data (Böhner et al. 2006, 2008). For all climate variables, daily raster data were generated as SAGA grids at a spatial resolution of 250 m. For the environmental database of the German NFI, climate information was extracted for the 26,450 inventory plots using the Geospatial Data Abstraction Library (GDAL 2017). The resulting daily time series at the inventory plots were stored in sqlite3 database files. All subsequent temporal aggregations and further treatments of the daily data were done using SQL and Python, mainly with the sqlite3 command line utility and the Python pandas library.
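
A temporal aggregation of this kind can be sketched with plain SQL on an in-memory sqlite3 database. The table name `daily` and the columns `date`, `tmean`, and `precip` are hypothetical and do not reflect the actual processing schema; only `tnr` and `enr` follow the plot identifiers described in the data description.

```python
import sqlite3

def aggregate_monthly(rows):
    """Aggregate daily plot values to monthly mean temperature and precipitation sum.

    rows: iterable of (tnr, enr, date 'YYYY-MM-DD', tmean, precip).
    Returns a list of (tnr, enr, year, month, mean_tmean, sum_precip).
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE daily (tnr INTEGER, enr INTEGER, date TEXT, "
        "tmean REAL, precip REAL)"
    )
    con.executemany("INSERT INTO daily VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT tnr, enr,
               CAST(strftime('%Y', date) AS INTEGER) AS year,
               CAST(strftime('%m', date) AS INTEGER) AS month,
               AVG(tmean), SUM(precip)
        FROM daily
        GROUP BY tnr, enr, year, month
        ORDER BY tnr, enr, year, month
        """
    ).fetchall()
    con.close()
    return result
```

The same query pattern extends to seasonal or multi-year aggregates by changing the grouping expression.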

2.5 Aggregated climate parameters

The “NFI 2012 environmental data base climate” provides more than 80 aggregated climate parameters for 26,450 forest inventory plots and, correspondingly, the NFSI databases for 2470 soil inventory plots. For precipitation, air temperature (min, mean, max), vapor pressure deficit, global radiation (on horizontal and inclined surfaces), and wind speed, we aggregated monthly, seasonal, yearly, and multi-year climate parameters at the inventory points of the NFI and NFSI. Since the use of WorldClim data (Hijmans et al. 2005) is widespread in ecological modeling, the database also provides bioclimatic variables following Hijmans’ definition. Based on the daily resolved data, climatic thresholds, such as the number of frost days, were evaluated.
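
Threshold-based parameters of this kind reduce to counting days in the daily series. The sketch below assumes the common definition of a frost day as a day with a minimum temperature below 0 °C; the definition used in the database is not stated in the text, and the function name is hypothetical.

```python
def count_frost_days(tmin_series, threshold=0.0):
    """Number of days whose daily minimum temperature (deg C) is below the threshold."""
    return sum(1 for tmin in tmin_series if tmin < threshold)

# Seven daily minimum temperatures (deg C) of a hypothetical week
week = [-2.1, -0.5, 0.0, 1.3, -4.0, 2.2, 0.4]
frost_days = count_frost_days(week)
```

Other thresholds, such as ice days or hot days, follow the same pattern with a different variable and comparison.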

3 Data access and metadata description

The climate database consists of seven sqlite3 database files for the NFI and seven sqlite3 database files for the NFSI. The content of each sqlite3 database is described in Table 6. All sqlite3 files are archived in the Open Agrar Repository of the Thünen Institute. Access to the databases is provided via the URL: https://doi.org/10.3220/DATA/20180823-102429 (Dietrich et al. 2018). Associated metadata are available at https://agroenvgeo.data.inra.fr/geonetwork/srv/fre/catalog.search#/metadata/d0789030-c94e-4883-8d38-2a7332c98673. All sqlite3 databases contain the same data model and table structure (see Table 7). The tables with prefixed “z_*” are metadata tables with descriptions of the databases (z_db), the tables (z_tab), and columns (z_col). The structure of these tables is based on the official database of the German NFI. Further metadata are provided in a separate spreadsheet. The table x_bl contains the code values for the individual federal states of Germany. The coded values are used in the columns “bl” of the actual data tables (climate_*, bioclim_variables). This enables quick queries and analyses of all inventory points of an individual state. The structure of the data tables is always similar: the first two columns (tnr and enr for NFI, id for NFSI) describe the number of the inventory clusters and the respective plot. The time context of the climatic parameters is then described in different columns (year, month, year_first, year_last, month_first, month_last). Other advanced database features such as normalization and relationships are not applied. In order to keep the data volume for the download as small as possible, no indices were created. To increase the performance of SQL queries a subsequent creation of indices by the user is recommended. For reuse of the sqlite3 databases, we suggest the R or Python programming languages with the appropriate extensions (CRAN: RSQLite, Python: sqlite3, pandas).
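
The recommended index creation and a state-wise query can be sketched as follows. The example builds a small in-memory stand-in rather than opening one of the published files; the table name `climate_demo`, the column `value`, and the columns assumed for `x_bl` are hypothetical, so the actual names should be taken from the z_tab and z_col metadata tables.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory stand-in with the column layout described above."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE x_bl (bl INTEGER PRIMARY KEY, name TEXT)")
    con.execute(
        "CREATE TABLE climate_demo (tnr INTEGER, enr INTEGER, bl INTEGER, "
        "year INTEGER, month INTEGER, value REAL)"
    )
    con.executemany("INSERT INTO x_bl VALUES (?, ?)", [(1, "State A"), (2, "State B")])
    con.executemany(
        "INSERT INTO climate_demo VALUES (?, ?, ?, ?, ?, ?)",
        [(10, 1, 1, 2000, 1, 0.5), (10, 2, 1, 2000, 1, 0.7), (20, 1, 2, 2000, 1, 1.9)],
    )
    return con

def values_for_state(con, state_name, year):
    """Return (tnr, enr, value) for all plots of one federal state and year."""
    # An index on the filter columns speeds up repeated queries on large tables.
    con.execute(
        "CREATE INDEX IF NOT EXISTS idx_climate_demo_bl_year "
        "ON climate_demo (bl, year, month)"
    )
    return con.execute(
        "SELECT c.tnr, c.enr, c.value FROM climate_demo AS c "
        "JOIN x_bl AS b ON c.bl = b.bl "
        "WHERE b.name = ? AND c.year = ? ORDER BY c.tnr, c.enr",
        (state_name, year),
    ).fetchall()
```

For a downloaded file, `sqlite3.connect(":memory:")` would be replaced by the path to the respective sqlite3 database.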

Table 6 Database name and content description
Table 7 Database table name and content description

4 Technical validation

For quality assurance, all raster datasets were checked based on the residuals between observed and gridded daily and monthly data. Moreover, a leave-one-out cross-validation (LOOCV) was carried out for autochthonous as well as allochthonous weather conditions with extremely high and low temperatures and high precipitation amounts, to check model robustness under extreme synoptic conditions. For mean temperature, the mean absolute residual was 0.8 K, with a 75th percentile of 1.0 K and a 90th percentile of 1.6 K, for approximately 2000 cases at DWD weather station sites. For precipitation, the mean absolute residual was 2.0 mm, with a 75th percentile of 2.7 mm and a 90th percentile of 4.6 mm, relative to a Germany-wide averaged precipitation amount of 8.1 mm and a mean maximum of 70.3 mm. Subsequently, the number of rasters per year and climate parameter as well as their file sizes were checked for completeness and plausibility. GRASS GIS (2017) was used to export raster properties (number of pixels, number of Data-/NoData-pixels) and statistics (e.g., min, max, variance, range, mean, sum, median) to a sqlite3 database. With SQL queries, extreme values could be detected very quickly for the entire time series and checked for plausibility. In addition, the database entries were visualized with the statistical software R, e.g., in histograms. After the point extraction of the climate rasters with GDAL, we checked whether all inventory plots had been queried. The resulting sqlite3 database with climate parameters for the inventory plots was queried again for extreme values and used for graphical visualization.
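The logic of a LOOCV and of the residual summary reported above can be sketched as follows. The interpolator here is plain inverse distance weighting, chosen only to keep the example short; it is a stand-in and not the regionalization method of this study. The sketch further assumes that the reported residual statistics refer to absolute residuals. The station coordinates, values, and function names are hypothetical.

```python
import math
import statistics


def idw_predict(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at target (x, y).

    stations: list of (x, y, value) tuples. IDW is a stand-in interpolator
    for illustration only.
    """
    num = den = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def loocv_abs_residuals(stations, power=2.0):
    """Leave each station out once, predict it from the rest, and return
    the absolute differences between prediction and observation."""
    residuals = []
    for i, (x, y, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        predicted = idw_predict((x, y), others, power)
        residuals.append(abs(predicted - observed))
    return residuals


def summarize(residuals):
    """Mean, 75th, and 90th percentile of a list of absolute residuals."""
    pct = statistics.quantiles(residuals, n=100, method="inclusive")
    return {"mean": statistics.fmean(residuals), "p75": pct[74], "p90": pct[89]}


# Hypothetical stations: (x in km, y in km, daily mean temperature in degC).
stations = [
    (0.0, 0.0, 8.1),
    (10.0, 0.0, 7.4),
    (0.0, 10.0, 9.0),
    (10.0, 10.0, 6.8),
    (5.0, 5.0, 7.9),
]
summary = summarize(loocv_abs_residuals(stations))
print(summary)
```

Because each station is withheld from the interpolation when it is predicted, the residuals give an estimate of the error at locations without observations, which is the situation at the inventory plots.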

5 Reuse potential and usage limitations

The climate database offers many options for a broad range of analyses, especially in conjunction with the NFI data and the other environmental databases on soil (Benning et al. in review) and hydrological balance (Schmidt-Walter et al. in review). In fact, the database on the hydrological balance (Schmidt-Walter et al. in review) used the daily resolved climate data at each inventory plot to calculate the water budget and stress indicators due to drought or soil wetness. Within the project “forest productivity–carbon sequestration–climate change” (WP-KS-KW) itself, the climate data were applied to predict forest growth in Germany for the period 2013–2050. Three empirical climate-sensitive forest growth simulators, WEHAM, TreeGrOSS, and SILVA, used the inventory data of the third NFI in 2012 to describe forest growth, yield, and carbon sequestration over 40 years. Nothdurft et al. (2012) also modeled climate-sensitive growth based on inventory data. Furthermore, the climate parameters can be used as input variables for habitat modeling, e.g., for invasive species or bark beetle infestation (Baier et al. 2007). Climate data are also important parameters for modeling the mortality (Nothdurft 2013) and susceptibility (Mellert et al. 2016) of tree species, their vulnerability, and its economic effects (Hanewinkel et al. 2010, 2013) under changing climatic conditions.

One option for the further methodological development of the climate regionalization would be to consider not only the weather stations of the German National Weather Service (DWD) but also forest climate stations, in order to better represent the internal climate of forest stands.

In addition to daily observations, average monthly wind climatologies from Böhner (2004) were used for the regionalization of wind speed. Consequently, the resulting data do not represent the maximum speed of peak gusts and are less suitable for modeling storm damage in forests. Within the project WP-KS-KW, the regionalized wind speed data were used for modeling the evapotranspiration processes of forest stands.