1 Introduction

Computer simulation models of the physical world, such as numerical weather prediction (NWP) models, are imperfect and can only approximate the complex evolution of physical reality. Some of the errors are due to the uncertainty in the initial and boundary conditions, forcings, and model parameter values. Other errors, called structural model errors, are due to our incomplete knowledge about the true physical processes; such errors manifest themselves as missing dynamics in the model [11]. Examples of structural errors include the misrepresentation of sea-ice in the spring and fall, errors affecting the stratosphere above polar regions in winter [22], and errors due to the interactions among (approximately represented) physical processes. Data assimilation improves model forecasts by fusing information from both model outputs and observations of the physical world in a coherent statistical estimation framework [1, 15]. While traditional data assimilation reduces the uncertainty in the model state and model parameter values, no methodologies are available to date for reducing the structural model uncertainty.

In this study we consider the Weather Research and Forecasting (WRF) model [24], a mesoscale atmospheric modeling system. The WRF model includes multiple physical processes and parametrization schemes, and choosing different model options can lead to significant variability in the model predictions [4, 14]. Among different atmospheric phenomena, the prediction of precipitation is extremely challenging and is obtained by solving the atmospheric dynamic and thermodynamic equations [14]. Model forecasts of precipitation are sensitive to physics options such as the microphysics, cumulus, long-wave, and short-wave radiation [5, 9, 14].

This paper demonstrates the potential of machine learning techniques to help solve two important problems related to the structural or physical uncertainty in numerical weather prediction models: (1) estimation of systematic model errors in output quantities of interest at future times, and the use of this information to improve the model forecasts; and (2) identification of the specific physical processes that contribute most to the forecast uncertainty in the quantity of interest under specified meteorological conditions.

The application of machine learning techniques to problems in environmental science has grown considerably in recent years. In [6] a kernel-based regression method is developed as a forecasting approach with performance close to an ensemble Kalman filter (EnKF). Krasnopol et al. [8] employ an artificial neural network (ANN) technique for developing an ensemble stochastic convection parameterization for climate models.

This study focuses on the uncertainty in forecasts of cumulative precipitation caused by imperfect representations of the physics and their interaction in the WRF model. The total accumulated precipitation includes all phases of convective and non-convective precipitation. Specifically, we seek to use the discrepancies between WRF forecasts and measured precipitation levels in the past in order to estimate the WRF prediction uncertainty in advance. The model-observation differences contain valuable information about the error dynamics and the missing physics of the model. We use this information to construct two probabilistic functions. The first maps the discrepancy data and the physical parameters onto the expected forecast errors. The second maps the forecast error levels onto the set of physical parameters that are consistent with them. Both maps are constructed by supervised machine learning techniques, specifically, using ANN and Random Forests (RF) [13].

The remainder of this study is organized as follows. Section 2 covers the definition of the model errors. Section 3 describes the proposed approach of error modeling using machine learning. Section 4 reports numerical experiments with the WRF model that illustrate the capability of the new approach to answer two important questions regarding model errors. Conclusions are drawn in Sect. 5.

2 Model Errors

First-principles computer models capture our knowledge about the physical laws that govern the evolution of a real physical system. The model evolves an initial state at the initial time to states at future times. All models are imperfect, for example, atmospheric model uncertainties are associated with subgrid modeling, boundary conditions, and forcings. All these modeling uncertainties are aggregated into a component that is generically called model error [7, 17, 18]. In the past decade considerable scientific effort has been spent in incorporating model errors and estimating their impact on the best estimate in both variational and statistical approaches [1, 20,21,22].

Consider the following NWP computer model \(\mathcal {M}\) that describes the time-evolution of the state of the atmosphere:

$$\begin{aligned} \mathbf {x}_{t} = \mathcal {M}\left( \mathbf {x}_{t-1}, \Uptheta \right) , \quad t=1, \cdots , T. \end{aligned}$$
(1a)

The state vector \(\mathbf {x}_t \in \mathbb {R}^n\) contains the dynamic variables of the atmosphere such as temperature, pressure, precipitation, and tracer concentrations, at all spatial locations covered by the model and at t. All the physical parameters of the model are lumped into \(\Uptheta \in \mathbb {R}^\ell \).

Formally, the true state of the atmosphere can be described by a physical process \(\mathcal {P}\) with internal states \(\mathbf {\upsilon }_{t}\), which are unknown. The atmosphere, as an abstract physical process, evolves in time as follows:

$$\begin{aligned} \mathbf {\upsilon }_{t} = \mathcal {P}\left( \mathbf {\upsilon }_{t-1}\right) , \quad t=1, \cdots , T. \end{aligned}$$
(1b)

The model state seeks to approximate the physical state:

$$\begin{aligned} \mathbf {x}_{t} \approx \psi (\mathbf {\upsilon }_t) , \quad t=1, \cdots , T, \end{aligned}$$
(1c)

where the operator \(\psi \) maps the physical space onto the model space, for example, by sampling the continuous meteorological fields onto a finite-dimensional computational grid [11].

Assume that the model state at \(t-1\) has the ideal value obtained from the true state via (1c). The model prediction at t will differ from reality:

$$\begin{aligned} \psi (\mathbf {\upsilon }_{t}) = \mathcal {M}\bigl (\psi (\mathbf {\upsilon }_{t-1}), \Uptheta \bigr ) + {\varvec{\delta }} _{t}\bigl (\mathbf {\upsilon }_{t}\bigr ), \quad t=1, \cdots , T, \end{aligned}$$
(2)

where the discrepancy \( {\varvec{\delta }} _t \in \mathbb {R}^n\) between the model prediction and reality is the structural model error. This vector lives in the model space.

Although the global physical state \(\mathbf {\upsilon }_{t}\) is unknown, we obtain information about it by measuring a finite number of observables \(\mathbf {y}_t \in \mathbb {R}^m\) as follows:

$$\begin{aligned} \mathbf {y}_t = h(\mathbf {\upsilon }_t) + \epsilon _t, \quad \epsilon _t \sim \mathcal {N}(0, \mathbf {R}_t), \quad t=1, \cdots , T. \end{aligned}$$
(3)

Here h is the observation operator that maps the true state of atmosphere to the observation space, and the observation error \(\epsilon _t\) is assumed to be normally distributed.

To relate the model state to observations, we also consider the observation operator \(\mathcal {H}\) that maps the model state onto the observation space. The model-predicted values \(\mathbf {o}_t \in \mathbb {R}^m\) of the observations (3) are

$$\begin{aligned} \mathbf {o}_t = \mathcal {H}(\mathbf {x}_t), \quad t=1, \cdots , T. \end{aligned}$$
(4)

We note that the measurements \(\mathbf {y}_t\) and the predictions \(\mathbf {o}_t\) live in the same space and therefore can be directly compared. The difference between the observations (3) of the real system and the model predicted values of these observables (4) represent the model error in observation space:

$$\begin{aligned} \varvec{\varDelta }_{t} = \mathbf {o}_{t} - \mathbf {y}_{t} \in \mathbb {R}^m, \quad t=1, \cdots , T. \end{aligned}$$
(5)

For clarity, in what follows we make the following simplifying assumptions [11]:

  • The physical system is finite dimensional \(\mathbf {\upsilon }_t \in \mathbb {R}^n\).

  • The model state lives in the same space as reality; i.e., \(\mathbf {x}_{t} \approx \mathbf {\upsilon }_t\), and \(\psi (\cdot )\equiv id\) is the identity operator in (1c).

  • \(\mathcal {H}(\cdot ) \equiv h(\cdot )\) in (3) and (4).

These assumptions imply that the discretization errors are very small and that the main sources of error are the parameterized physical processes represented by \(\Uptheta \) and the interaction among these processes. Uncertainties from other sources, such as boundary conditions, are assumed to be negligible.

With these assumptions, the evolution equations for the physical system (1b) and the physical observations Eq. (3) become, respectively,

$$\begin{aligned} \mathbf {\upsilon }_{t}= & {} \mathcal {M}\bigl (\mathbf {\upsilon }_{t-1}, \Uptheta \bigr )+ {\varvec{\delta }} _{t}\bigl (\mathbf {\upsilon }_{t}\bigr ), \quad t=1, \cdots , T, \end{aligned}$$
(6a)
$$\begin{aligned} \mathbf {y}_t= & {} h(\mathbf {\upsilon }_t) + \epsilon _t. \end{aligned}$$
(6b)

The model errors \( {\varvec{\delta }} _t\) (2) are not fully known at any time t, since having the exact errors is akin to having a perfect model. However, the discrepancies between the modeled and measured observable quantities (5) at past times have been computed and are available at the current time t.

Our goal is to use the errors in observable quantities at past times, \(\varvec{\varDelta }_\tau \) for \(\tau =t-1,t-2,\cdots \), in order to estimate the model error \( {\varvec{\delta }} _\tau \) at future times \(\tau =t,t+1,\cdots \). This is achieved by unraveling the hidden information in the past \(\varvec{\varDelta }_\tau \) values. Good estimates of the discrepancy \( {\varvec{\delta }} _t\), when available, could improve model predictions by applying the correction (6a) to model results:

$$\begin{aligned} \mathbf {v}_{t} \approx \mathbf {x}_{t} + {\varvec{\delta }} _{t}. \end{aligned}$$
(7)

3 Approximating Model Errors Using Machine Learning

We propose a multivariate input-output learning model to predict the model errors \( {\varvec{\delta }} \), defined in (2), stemming from the uncertainty in parameters \(\Uptheta \). To this end, we define a probabilistic function \(\phi \) that maps every set of input features \(F \in \mathbb {R}^r\) to output target variables \(\Uplambda \in \mathbb {R}^o\):

$$\begin{aligned} \phi ( F ) \approx \Uplambda , \end{aligned}$$
(8)

and approximate the function \(\phi \) using machine learning.

In what follows we explain, for each of the problems we wish to address, the function mapping \(\phi \), the input features, and the target variables.

3.1 Problem One: Estimating in Advance Aspects of Interest of the Model Error

Forecasts produced by NWP models are contaminated by model errors. These model errors are highly correlated in time; hence historical information about the model errors can be used as an input to the learning model to gain insight about model errors. We are interested in the uncertainty caused by the interactions between the various components in the physics-based model; these interactions are lumped into the parameter \(\Uptheta \) that is supplied as an input to the learning model. We define the following mapping:

$$\begin{aligned} \phi ^\mathrm{error} \left( \Uptheta , \mathbf {o}_{\tau }, \varvec{\varDelta }_{\tau }, \mathbf {o}_{t} \right) \approx \varvec{\varDelta }_{t}\, \quad \tau < t. \end{aligned}$$
(9)

We use a machine learning algorithm to approximate the function \(\phi ^\mathrm{error}\). Through a supervised learning process, the learning model identifies the effect of the physical packages, the historical WRF forecasts, the historical model discrepancies, and the current WRF forecast on the available model discrepancy at the current time. After the model is trained on the historical data, it yields an approximation to the mapping \(\phi ^\mathrm{error}\), denoted by \(\widehat{\phi }^\mathrm{error}\). During the test phase the approximate mapping \(\widehat{\phi }^\mathrm{error}\) is used to estimate the model discrepancy \(\widehat{\varvec{\varDelta }}_{t+1}\) in advance. We emphasize that the model prediction (WRF forecast) \(\mathbf {o}_{t+1}\) at the time of interest \(t+1\) is available, whereas the model discrepancy \(\widehat{\varvec{\varDelta }}_{t+1}\) is unknown. Indeed, the run time of WRF is much smaller than the interval between t and \(t+1\): the interval is long enough to run the WRF model and obtain the forecast for the next time window, estimate the model errors for that window, and improve the model forecast by combining the forecast with the estimated errors. At test time we predict the future model error as follows:

$$ \widehat{\varvec{\varDelta }}_{t+1} \approx \widehat{\phi }^\mathrm{error} \left( \Uptheta , \mathbf {o}_{\tau }, \varvec{\varDelta }_{\tau }, \mathbf {o}_{t+1} \right) \,, \quad \tau <t+1. $$
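As a concrete illustration, \(\widehat{\phi }^\mathrm{error}\) can be learned with an off-the-shelf regressor. The sketch below uses synthetic stand-in data; the sample count, the feature layout (encoded \(\Uptheta \), past forecasts and discrepancies, current forecast), and the target construction are our assumptions rather than values from the WRF experiments:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training rows: each row concatenates the encoded physics
# configuration Theta, past forecasts o_tau, past discrepancies Delta_tau,
# and the current forecast o_t.
n_samples, n_features = 500, 15
X = rng.normal(size=(n_samples, n_features))
# Synthetic target: a stand-in for the observable discrepancy Delta_t.
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n_samples)

# Train the approximation phi_hat_error of the mapping (9).
phi_hat_error = RandomForestRegressor(n_estimators=10, random_state=0)
phi_hat_error.fit(X, y)

# Test phase: predict future discrepancies from new feature rows that
# include the (already available) forecast o_{t+1}.
X_new = rng.normal(size=(3, n_features))
delta_hat = phi_hat_error.predict(X_new)
print(delta_hat.shape)  # one predicted discrepancy per input row
```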

As explained in [11], the predicted error \(\widehat{\varvec{\varDelta }}_{t+1}\) in the observation space can be used to estimate the error \( {\varvec{\delta }} _{t+1}\) in the model space as follows:

$$\begin{aligned} \varvec{\varDelta }_{t+1} \approx \mathbf {H}_t \cdot {\varvec{\delta }} _{t+1}, \quad \widehat{ {\varvec{\delta }} }_{t+1} \approx \left( \mathbf {H}_t^T\, \mathbf {H}_t\right) ^{-1}\,\mathbf {H}_t^T \cdot \widehat{\varvec{\varDelta }}_{t+1}, \end{aligned}$$
(10a)

where we use the linearized observation operator at the current time, \(\mathbf {H}_t = h'(\varvec{x}_{t})\). A more complex approach is to use a Kalman update formula:

$$\begin{aligned} \widehat{ {\varvec{\delta }} }_{t+1} \approx cov (\mathbf {x}_t,\mathbf {o}_t)\, \left( cov (\mathbf {o}_t,\mathbf {o}_t) + \mathbf {R}_t\right) ^{-1}\, \widehat{\varvec{\varDelta }}_{t+1}, \end{aligned}$$
(10b)

where \(\mathbf {R}_t\) is the covariance of observation errors.
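The two reconstructions can be sketched in a few lines of numpy. All matrices below are random stand-ins, and the ensemble used to estimate the covariances in (10b) is purely illustrative (in practice these covariances would come from an ensemble of model runs):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 6  # hypothetical model-space and observation-space dimensions

H = rng.normal(size=(m, n))      # linearized observation operator H_t = h'(x_t)
delta_true = rng.normal(size=n)  # model-space error (unknown in practice)
Delta_hat = H @ delta_true       # predicted observation-space error

# (10a): least-squares reconstruction of delta from Delta ≈ H delta.
delta_ls, *_ = np.linalg.lstsq(H, Delta_hat, rcond=None)

# (10b): Kalman-style update with covariances estimated from a toy ensemble.
ens = rng.normal(size=(100, n))          # illustrative ensemble of model states
obs_ens = ens @ H.T                      # ensemble mapped to observation space
X_c = ens - ens.mean(axis=0)
O_c = obs_ens - obs_ens.mean(axis=0)
cov_xo = X_c.T @ O_c / (len(ens) - 1)    # cov(x_t, o_t), shape (n, m)
cov_oo = O_c.T @ O_c / (len(ens) - 1)    # cov(o_t, o_t), shape (m, m)
R = 0.01 * np.eye(m)                     # observation-error covariance R_t
delta_kf = cov_xo @ np.linalg.solve(cov_oo + R, Delta_hat)
```

In this noise-free toy setting the least-squares reconstruction recovers the model-space error exactly.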

3.2 Problem Two: Identifying the Physical Packages that Contribute Most to the Forecast Uncertainty

Typical NWP models incorporate an array of different physical packages to represent multiple physical phenomena that interact with each other. Each physical package contains several alternative configurations (e.g., parameterizations or numerical solvers) that affect the accuracy of the forecasts produced by the NWP model. A particular scheme in a certain physical package best captures the reality under some specific conditions (for example time of the year, representation of sea-ice). The primary focus of this study is to accurately forecast the cumulative precipitation; therefore we seek to learn the impacts of all the physical packages that affect precipitation. To this end, we define the following mapping:

$$\begin{aligned} \phi ^\mathrm{physics}\left( \varvec{\varDelta }_{t} \right) \approx \Uptheta , \end{aligned}$$
(11)

which estimates the configuration \(\Uptheta \) of the physical packages such that the WRF run generates a forecast with an error consistent with the prescribed level \(\varvec{\varDelta }_{t}\) (where \(\varvec{\varDelta }_{t}\), defined in Eq. (5), is the forecast error in observation space at time t).

We train the model to learn the effect of the physical schemes on the mismatch between WRF forecasts and reality. The input data required for the training process are obtained by running the model with various physical package configurations \(\Uptheta ^\mathrm{train}_i\) and comparing the model forecasts against the observations at all past times \(\tau \) to obtain the corresponding errors \(\varvec{\varDelta }_{\tau ,i}^\mathrm{train}\) for \(\tau \le t\) and \(i \in \{training~data~set\}\). The output data are the corresponding physical combinations \(\Uptheta \) that lead to the input error levels.

To estimate the physics configurations that contribute most to the uncertainty in predicting precipitation, we take the following approach. The dataset consisting of the observable discrepancies during the current time window \(\varvec{\varDelta }_{t}\) is split into a training part and a testing part. In the test phase we use the approximated function \(\widehat{\phi }^\mathrm{physics}\) to estimate the physical process settings \(\widehat{\Uptheta }_j^{\{1\}}\) that are consistent with the observable errors \(\varvec{\varDelta }_{t,j}^{\{1\}}\). Here we select \(\varvec{\varDelta }_{t,j}^{\{1\}} = \varvec{\varDelta }_{t,j}^\mathrm{test}\) for each \(j \in \{test~data~set\}\). Note that in this case, since we know what physics has been used to produce the test data, we can take \(\widehat{\Uptheta }_j^{\{1\}}\) to be the real parameter values \(\Uptheta _j^{\{1\}}\) used to generate it. In general, \(\varvec{\varDelta }_{t,j}^{\{1\}}\) is chosen appropriately for a given application and the corresponding parameters are estimated.

Next, we reduce the desired forecast error level to \(\varvec{\varDelta }_{t,j}^{\{2\}}=\varvec{\varDelta }_{t,j}^{\{1\}}/2\) and use the approximated function \(\widehat{\phi }^\mathrm{physics}\) to estimate the physical process setting \(\widehat{\Uptheta }_j^{\{2\}}\) that corresponds to this more accurate forecast. To identify the package setting that has the largest impact on the observable error, we monitor the variability in the predicted parameters \(\widehat{\Uptheta }^{\{2\}} - \widehat{\Uptheta }^{\{1\}}\). Specifically, the number of times the setting of a physical process in \(\widehat{\Uptheta }_j^{\{2\}}\) is different from its setting in \(\widehat{\Uptheta }_j^{\{1\}}\) is an indicator of the variability in model prediction when that package is changed. A higher variability in predicted physical packages implies a larger contribution to the model errors as estimated by the learning model.

3.3 Machine Learning Algorithms

To approximate the functions (9) and (11), we use regression machine learning methods. Choosing the right learning algorithm is challenging; it largely depends on the problem and the data available [2, 3, 10, 12]. Here we use RF and ANN as our learning algorithms [13]. Both RF and ANN algorithms can handle nonlinearity in regression and classification. Given that the physical phenomena governing precipitation are highly nonlinear and atmospheric dynamics is chaotic, we believe that RF and ANN approaches are well suited to capture the associated features. Although several more advanced machine learning algorithms could be deployed here, our aim is to demonstrate the potential of machine learning approaches, hence the use of simple learning models such as RFs and ANNs. Advanced techniques such as long short-term memory (LSTM) networks, convolutional neural networks (CNN), and gated recurrent units (GRU) are typically used for handling time series data, and we defer a study with these techniques to future research. We describe the details of the training procedure, the selection of the algorithm parameters, the generation of training data, and the validation procedure in Sect. 4.
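For reference, the two estimators can be instantiated directly in Scikit-learn. The settings mirror those reported in Sect. 4 (an RF with ten CART trees and an ANN with four hidden layers and hyperbolic tangent activations); the hidden layer widths and the toy data are our assumptions, since they are not stated in the text:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor

# RF with ten trees (Scikit-learn trees are CART-based) and an ANN with
# four tanh hidden layers; layer widths are assumed for illustration.
rf = RandomForestRegressor(n_estimators=10, random_state=0)
ann = MLPRegressor(hidden_layer_sizes=(32, 32, 32, 32), activation="tanh",
                   max_iter=2000, random_state=0)

# Fit both regressors on a small synthetic problem with 15 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))
y = X.sum(axis=1)
for model in (rf, ann):
    model.fit(X, y)
print(rf.predict(X[:2]).shape, ann.predict(X[:2]).shape)
```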

4 Numerical Experiments

We apply the proposed learning models to the WRF model [24] to (1) predict the bias in precipitation forecast caused by structural model errors, (2) predict the statistics associated with the precipitation errors, and (3) identify the specific physics packages that contribute most to precipitation forecast errors for given meteorological conditions.

4.1 WRF Model

In this study we use the nonhydrostatic WRF model version 3.3. The simulation domain, shown in Fig. 1, covers the continental United States and has dimensions of \(60 \times 73\) horizontal grid points in the west-east and south-north directions, respectively, with a horizontal grid spacing of 60 km [23]. The grid has 60 vertical levels covering the troposphere and the lower part of the stratosphere, between the surface and approximately 20 km. In all simulations, the six-hourly analyses from the National Centers for Environmental Prediction (NCEP) are used as the initial and boundary conditions of the model [16]. The Stage IV estimates are available at an hourly temporal resolution over the continental United States. For experimental purposes, we use the Stage IV NCEP analysis as a proxy for the true state of the atmosphere. The simulation window begins at \(6 \textsc {am}\) UTC (Coordinated Universal Time) on May 1, 2017.

The model configuration parameters \(\Uptheta \) represent various combinations of microphysics schemes, cumulus parameterizations, and short-wave and long-wave radiation schemes. Specifically, each process is represented by the schema values of each physical parameter it uses, as detailed in the WRF model physics options and references [25]. A total of 252 combinations of the four physical modules are used in the simulations, and the effect of each physics combination on precipitation is investigated. The NCEP analysis grid has \(428\times 614\) points, while the WRF computational grid has \(60 \times 73\) points. To obtain the discrepancy between the WRF forecast and the NCEP analysis, we linearly interpolate the analysis to transfer the physical variables onto the model grid. Figure 1 shows the NCEP analysis at \(12 \textsc {pm}\) on May 1, 2017, which is used as the "true" (verification) state. For the initial conditions, we use the NCEP analysis at \(6 \textsc {am}\) on May 1, 2017. Figure 2 shows the WRF forecast at \(12 \textsc {pm}\) on May 1, 2017, corresponding to the physics configuration microphysics: Kessler, cumulus physics: Kain-Fritsch, long-wave radiation physics: Cam, short-wave radiation physics: Dudhia. Figures 5 and 6 show contours of the discrepancies at \(12 \textsc {pm}\) \(\left( \varvec{\varDelta }_{t=12 \textsc {pm}} \right) \) defined in Eq. (5) for two physical combinations, which illustrates the effect that changing the physical schemes has on the forecast.

4.2 Experiments for Problem One: Predicting Pointwise Precipitation Forecast Errors over a Small Geographic Region

We demonstrate our learning algorithms to forecast precipitation in the state of Virginia on May 1, 2017, at \(6 \textsc {pm}\). Our goal is to use the learning algorithms to correct the bias created due to model errors and hence improve the forecast for precipitation. As described in Sect. 3.1, we learn the function \(\phi ^\mathrm{error}\) of Eq. (9) using the training data from the previous forecast window (\(6 \textsc {am}\) to \(12 \textsc {pm}\)):

$$ \phi ^\mathrm{error} \left( \Uptheta , \mathbf {o}_{\tau }, \varvec{\varDelta }_{\tau }, \mathbf {o}_{t=12 \textsc {pm}} \right) \approx \varvec{\varDelta }_{t=12 \textsc {pm}}, \quad 7 \textsc {am} \le \tau < 12 \textsc {pm}. $$

We use two learning algorithms to approximate the function \(\phi ^\mathrm{error}\). Specifically, we employ an RF with ten CART trees and an ANN with four hidden layers using the hyperbolic tangent sigmoid activation function in each layer, both implemented with Scikit-learn, a machine learning library in Python [19]. For training purposes, we use the NCEP analysis of May 1, 2017 at \(6 \textsc {am}\) as initial conditions for the WRF model. The forecast window is 6 h, and the WRF model forecast final simulation time is \(12 \textsc {pm}\). The input features are as follows:

  • The physics combinations (\(\Uptheta \)).

  • The hourly WRF forecasts projected onto observation space, \(\mathbf {o}_{\tau }\), \(7 \textsc {am} \le \tau \le 12 \textsc {pm}\). The WRF state \(\mathbf {x}_{t}\) includes all model variables such as temperature, pressure, and precipitation. The observation operator extracts the precipitation portion of the WRF state vector, \(\mathbf {o}_{t}\equiv \mathbf {x}_t^\text {precipitation}\). Accordingly, \(\varvec{\varDelta }_{t}\) is the discrepancy between the WRF precipitation forecast \(\mathbf {o}_{t}\) and the observed precipitation \(\mathbf {y}_{t}\).

  • The observed discrepancies at past times (\(\varvec{\varDelta }_{\tau }\), \(7 \textsc {am} \le \tau < 12 \textsc {pm}\)).
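Under one consistent reading of this setup, each grid point contributes a row whose 15 features are the 4 physics-package codes in \(\Uptheta \), the 6 hourly forecasts \(\mathbf {o}_{\tau }\), and the 5 past discrepancies \(\varvec{\varDelta }_{\tau }\). The assembly of the training matrix might then look like the following sketch (all field values are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(2)
n_configs, n_grid = 252, 14 * 12  # physics combinations x Virginia grid points

theta = rng.integers(0, 4, size=(n_configs, 4))      # 4 physics packages
o_tau = rng.random(size=(n_configs, n_grid, 6))      # hourly forecasts, 7am-12pm
Delta_tau = rng.random(size=(n_configs, n_grid, 5))  # past discrepancies, 7-11am
Delta_t = rng.random(size=(n_configs, n_grid))       # target discrepancy at 12pm

# One training sample per (configuration, grid point) pair.
rows, targets = [], []
for i in range(n_configs):
    for g in range(n_grid):
        rows.append(np.concatenate([theta[i], o_tau[i, g], Delta_tau[i, g]]))
        targets.append(Delta_t[i, g])
X = np.asarray(rows)
y = np.asarray(targets)
print(X.shape)  # (42336, 15)
```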

Fig. 1. NCEP analysis at \(12 \textsc {pm}\) provides a proxy for the true state of the atmosphere.

Fig. 2. WRF forecast at \(12 \textsc {pm}\).

Fig. 3. Original WRF prediction at \(6 \textsc {pm}\) on May 1, 2017. Zoom-in panels show the predictions over Virginia.

Fig. 4. NCEP analysis at \(6 \textsc {pm}\) on May 1, 2017. Zoom-in panels show the predictions over Virginia.

Fig. 5. Microphysics scheme: Kessler; cumulus physics: Kain-Fritsch; short-wave radiation: Cam; long-wave radiation: Dudhia.

Fig. 6. Microphysics scheme: Lin; cumulus physics: Kain-Fritsch; short-wave radiation: RRTM Mlawer; long-wave radiation: Cam.

The output variable is the discrepancy between the NCEP analysis and the WRF forecast at \(12 \textsc {pm}\), that is, the observable discrepancy for the current forecast window (\(\varvec{\varDelta }_{t=12 \textsc {pm}} \)). For each of the 252 different physical configurations, the WRF model forecast and the difference between the WRF forecast and the analysis are provided as input-output combinations for learning the function \(\phi ^\mathrm{error}\). The number of grid points over the state of Virginia is \(14 \times 12\). Therefore each physical combination contributes 168 grid points, and the total number of samples in the training data set is \(252 \times 168 = 42,336\), each with 15 features.

Both ANN and RF are trained with these input-output combinations to obtain an approximation of the function \(\phi ^\mathrm{error}\), denoted by \(\widehat{\phi }^\mathrm{error}\), during the training phase. In the testing phase we use the function \(\widehat{\phi }^\mathrm{error}\) to predict the future forecast error \(\widehat{\varvec{\varDelta }}_{t=6 \textsc {pm}}\), given as input features the combination of physical parameters as well as the WRF forecast at time \(6 \textsc {pm}\):

$$ \widehat{\varvec{\varDelta }}_{t=6 \textsc {pm}} \approx \widehat{\phi }^\mathrm{error} \left( \Uptheta , \mathbf {o}_{\tau }, \varvec{\varDelta }_{\tau }, \mathbf {o}_{t=6 \textsc {pm}} \right) \,, \quad 1 \textsc {pm} \le \tau < 6 \textsc {pm}. $$

To quantify the accuracy of the predicted error we calculate the root mean squared error (RMSE) between the true and predicted discrepancies at \(6\textsc {pm}\):

$$\begin{aligned} RMSE = \sqrt{\frac{1}{n} \sum _{i=1}^{n} \left( \widehat{\varvec{\varDelta }}_{t=6 \textsc {pm}}^{i}-\varvec{\varDelta }_{t=6 \textsc {pm}}^{i}\right) ^2 }, \end{aligned}$$
(12)

where \(n=168\) is the number of grid points over Virginia, \(\widehat{\varvec{\varDelta }}_{t=6\textsc {pm}}^{i}\) is the predicted discrepancy, and \(\varvec{\varDelta }_{t=6\textsc {pm}}^{i}\) is the actual discrepancy at grid point i. The actual discrepancy is obtained as the difference between the NCEP analysis and the WRF forecast at time \(t=6\textsc {pm}\). This error metric is computed for each of the 252 physics configurations. The minimum, maximum, and average RMSE over the 252 runs are reported in Table 1.
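Eq. (12) translates directly into code; the toy field below is illustrative:

```python
import numpy as np

def rmse(predicted, actual):
    """Eq. (12): root mean squared error over the n grid points."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Toy check over n = 168 grid points (the Virginia subdomain):
# identical predicted and actual fields give zero error.
field = np.linspace(0.0, 5.0, 168)
print(rmse(field, field))  # 0.0
```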

Table 1. Minimum, average, and maximum RMSE between the predicted \(\widehat{\varvec{\varDelta }}_{t\,=\,6 \textsc {pm}}\) and the true \(\varvec{\varDelta }_{t=6 \textsc {pm}}\) over 252 physics combinations.

Figure 3 shows the WRF forecast for \(6\textsc {pm}\) for the state of Virginia using the following physics packages (the physics options are given in parentheses): Microphysics (Kessler), cumulus-physics (Kain), short-wave radiation physics (Dudhia), and long-wave radiation physics (Janjic).

Table 2. Minimum and average of \(\varvec{\varDelta }_{t\,=\,6 \textsc {pm}}\) for the original WRF forecast vs the improved forecast

Figure 4 shows the NCEP analysis at time \(6\textsc {pm}\), which is our proxy for the true state of the atmosphere. The discrepancy between the NCEP analysis and the raw WRF forecast is shown in Fig. 7. Using the model error prediction, we can improve the WRF result by adding the predicted bias to the WRF forecast. The discrepancy between the corrected WRF forecast and the NCEP analysis is shown in Fig. 8. The results show a considerable reduction of model errors when compared with the uncorrected forecast of Fig. 7. Table 2 shows the minimum and average of the original model errors vs. the improved model errors.

4.3 Experiments for Problem Two: Identifying the Physical Processes that Contribute Most to the Forecast Uncertainty

The interaction of different physical processes greatly affects the precipitation forecast, and we are interested in identifying the major sources of model errors in WRF. To this end we construct the physics mapping (11) using the norm and the statistical characteristics of the model-data discrepancy (over the entire U.S.) as input features:

$$ \phi ^\mathrm{physics} \left( \bar{\varvec{\varDelta }}_{t=12 \textsc {pm}}, \Vert \varvec{\varDelta }_{t=12 \textsc {pm}}\Vert _2 \right) \approx \Uptheta . $$

Statistical characteristics include the mean, minimum, maximum, and variance of the field across all grid points over the continental United States. Note that this is slightly different from (11) where the inputs are the raw values of these discrepancies for each grid point. The output variable is the combination of physical processes \(\Uptheta \) that leads to model errors consistent with the input pattern \(\bar{\varvec{\varDelta }}_{t=12 \textsc {pm}}\) and \(\Vert \varvec{\varDelta }_{t=12 \textsc {pm}}\Vert _2\).

In order to build the dataset, the WRF model is run with each of the 252 physical configurations, and the mismatches between the WRF forecasts and the NCEP analysis at the end of the current forecast window are obtained. The discrepancy between the NCEP analysis at \(12 \textsc {pm}\) and the WRF forecast at \(12 \textsc {pm}\) forms the observable discrepancy for the current forecast window, \(\varvec{\varDelta }_{t=12 \textsc {pm}}\). For each configuration, the statistical characteristics of the WRF forecast model error \(\bar{\varvec{\varDelta }}_{t=12 \textsc {pm}}\) and the norm of the model error \(\Vert \varvec{\varDelta }_{t=12 \textsc {pm}} \Vert _2\) are used as the feature values of the function \(\phi ^\mathrm{physics}\).
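The feature extraction for \(\phi ^\mathrm{physics}\) can be sketched as follows, one feature row per physics configuration. The discrepancy fields here are random stand-ins for the WRF-vs-analysis mismatches:

```python
import numpy as np

def physics_features(Delta):
    """Input features for phi^physics: summary statistics (mean, min, max,
    variance) and the 2-norm of the observable discrepancy field over all
    grid points."""
    Delta = np.ravel(Delta)
    return np.array([Delta.mean(), Delta.min(), Delta.max(),
                     Delta.var(), np.linalg.norm(Delta)])

# One feature row per physics configuration (toy full-CONUS fields).
rng = np.random.default_rng(4)
fields = rng.normal(size=(252, 60 * 73))
X = np.vstack([physics_features(f) for f in fields])
print(X.shape)  # (252, 5)
```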

Validation of the Learned Physics Mapping. From all the collected data points, \(80\%\) (202 samples) are used for training the learning model, and the remaining \(20\%\) (50 samples) are used for testing purposes. The learning model uses the training dataset to learn the approximate mapping \(\widehat{\phi }^\mathrm{physics}\). This function is applied to each of the 50 test samples \(\varvec{\varDelta }_{t=12 \textsc {pm}}^\mathrm{test}\) to obtain the predicted physical combinations \(\widehat{\Uptheta }_1\). To evaluate these predictions, we run the WRF model again with the \(\widehat{\Uptheta }_1\) physical setting and obtain the new forecast \(\widehat{\mathbf {o}}_{t=12 \textsc {pm}}\) and the corresponding observable discrepancy \(\widehat{\varvec{\varDelta }}_{t=12 \textsc {pm}}^\mathrm{test}\). The RMSE between the norm of the actual observable discrepancies and the norm of the predicted discrepancies is shown in Table 3. The small values of the differences demonstrate the performance of the learning algorithm.

Table 3. RMSE between estimated discrepancy using predicted physical combinations \( \widehat{\varvec{\varDelta }}_{t=12 \textsc {pm}}^\mathrm{test} \) and the reference discrepancy \(\varvec{\varDelta }_{t=12 \textsc {pm}}^\mathrm{test}\).
Fig. 7. Discrepancy between the original WRF forecast and the NCEP analysis.

Fig. 8. Discrepancy between the corrected WRF forecast and the NCEP analysis.

Fig. 9. Frequency of change in the physics with respect to change in the input data from \(\varvec{\varDelta }_{t=12 \textsc {pm}}^\mathrm{test}\) to \(\varvec{\varDelta }_{t=12 \textsc {pm}}^\mathrm{test}/2\). Each data set contains 50 data points, and we report the number of changes of each package.

Analysis of Variability in Physical Settings. We repeat the test phase for each of the 50 test samples with the scaled values of observable discrepancies \(\varvec{\varDelta }_{t=12 \textsc {pm}}^\mathrm{test}/2\) as inputs and obtain the predicted physical combinations \(\widehat{\Uptheta }_2\). The large variability in the predicted physical settings \(\widehat{\Uptheta }\) indicates that the WRF forecast error is sensitive to the corresponding physical packages. We count the number of times the predicted physics \(\widehat{\Uptheta }_2\) is different from \(\widehat{\Uptheta }_1\) when the input data spans the entire test data set.
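The change-counting step itself is straightforward. In the sketch below, the predicted settings at the two error levels are random integer-coded stand-ins; in the actual experiment they would come from \(\widehat{\phi }^\mathrm{physics}\):

```python
import numpy as np

packages = ["microphysics", "cumulus", "short-wave", "long-wave"]

# Hypothetical scheme choices predicted for the 50 test samples at the
# original error level (theta_1) and at the halved level (theta_2).
rng = np.random.default_rng(5)
theta_1 = rng.integers(0, 3, size=(50, 4))
theta_2 = rng.integers(0, 3, size=(50, 4))

# For each package, count how many of the 50 samples changed their scheme.
changes = (theta_1 != theta_2).sum(axis=0)
for name, count in zip(packages, changes):
    print(f"{name}: {count}/50 samples changed")
```

Packages with high change counts are flagged as the dominant contributors to the forecast uncertainty.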

The results shown in Fig. 9 indicate that microphysics and cumulus physics are not too sensitive to the change of input data, whereas short-wave and long-wave radiation physics are quite sensitive to changes in the input data. Therefore our learning model indicates that having an accurate short-wave and long-wave radiation physics package will aid in greatly reducing the uncertainty in precipitation forecasts due to missing or incorrect physics.

5 Conclusions

This study proposes a novel use of machine learning techniques to understand, predict, and reduce the uncertainty in the WRF model precipitation forecasts. We construct probabilistic approaches to learn the relationships between the configuration of the physical processes used in the simulation and the observed model forecast errors. These relationships are then used to estimate the systematic model error in a quantity of interest at future times, and identify the physical processes that contribute most to the forecast uncertainty in a given quantity of interest under specified conditions.

Numerical experiments are performed with the WRF model using the NCEP analysis as a proxy for the real state of the atmosphere. Ensembles of model runs with different parameter configurations are used to generate the training data. Random forest and artificial neural network models are used to learn the relationships between physical processes and forecast errors. The experiments validate the new approach and illustrate its ability to estimate model errors, indicate the best model configuration, and identify the physical packages that influence WRF prediction accuracy the most.

As part of our future work, we will explore more advanced machine learning algorithms, such as recurrent neural networks (LSTM, GRU) and convolutional neural networks (CNN), which are known to capture temporal and spatial correlations well, in order to reduce the uncertainty in medium- and long-term forecasts.
