# Applications of Data Assimilation Methods on a Coupled Dual Porosity Stokes Model

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12142)

## Abstract

Coupled porous media and conduit systems are widely used in a variety of areas such as groundwater systems, petroleum extraction, and biochemical transport. A coupled dual porosity Stokes model has been proposed to simulate fluid flow in a coupled system of dual-porosity media and conduits. Data assimilation is the discipline that studies the combination of mathematical models and observations. It can improve the accuracy of mathematical models by incorporating data, but also brings challenges by increasing complexity and computational cost. In this paper, we study the application of data assimilation methods to the coupled dual porosity Stokes model. We give a brief introduction to the coupled model and examine the performance of different data assimilation methods on a finite element implementation of the coupled dual porosity Stokes system. We also study how observations on different variables of the system affect the data assimilation process.

## Keywords

Data assimilation, Dual porosity, Stokes equation, Multiphysics

## 1 Introduction

Hou et al. [6] proposed coupling dual porosity flow with free flow as a replacement for the widely used Stokes-Darcy family of models. The proposed model represents fractured porous media with large conduits better than the traditional Stokes-Darcy model. Potential applications of this model include petroleum extraction, hydrology, geothermal systems, and carbon sequestration. A finite element implementation of this model using FEniCS has been developed and studied by the authors . Data assimilation is the discipline that studies the combination of mathematical models and observations. In this paper, we apply data assimilation methods to the implementation of the coupled model to improve the accuracy of the model predictions [4, 9].

In Sect. 2, we introduce the coupled dual porosity Stokes model proposed by Hou et al. [6]. In Sects. 3 and 4, we illustrate the application of data assimilation methods to this model: we set up the data assimilation problem in Sect. 3 and present numerical results based on synthetic data in Sect. 4. In Sect. 5, we draw conclusions and discuss future work.

## 2 A Coupled Dual Porosity Stokes Model

The dual porosity Stokes model proposed by Hou et al. [6] consists of a dual porosity porous subdomain and a conduit subdomain. An example is shown in Fig. 1, where $$\varOmega _d$$ represents the porous subdomain and $$\varOmega _c$$ represents the conduit subdomain. Each subdomain has its own set of boundary conditions, represented by $$\varGamma _d$$ and $$\varGamma _c$$ respectively in the figure. The interface $$\varGamma _{cd}$$ is the only place where the two subdomains communicate with each other.

Fig. 1. A simplified coupled model in 2D.

Barenblatt et al. [2] first proposed the dual porosity model in 1960. Later, in 1963, Warren and Root [16] studied the model thoroughly. In a dual porosity medium, two subsystems are assumed. One is the matrix subsystem, which has high porosity and low permeability, and the other is the microfracture subsystem, which has low porosity and high permeability. The dual porosity equations governing the dual porosity subdomain $$\varOmega _d$$ in our coupled dual porosity Stokes model are
\begin{aligned} \phi _m C_{mt} \frac{\partial p_m}{\partial t} - \nabla \cdot \frac{k_m}{\mu }\nabla p_m&= -Q, \end{aligned}
(1)
\begin{aligned} \phi _f C_{ft} \frac{\partial p_f}{\partial t} - \nabla \cdot \frac{k_f}{\mu }\nabla p_f&= Q + q_p . \end{aligned}
(2)
The constant $$\mu$$ represents the dynamic viscosity. The constants $$k_m$$ and $$k_f$$ represent the intrinsic permeability, $$\phi _m$$ and $$\phi _f$$ the porosities, $$C_{mt}$$ and $$C_{ft}$$ the total compressibility, of the matrix and the microfracture subsystems respectively. The variables $$p_m$$ and $$p_f$$ are the flow pressure of the matrix and the microfracture subsystems respectively. The coefficient function $$q_p$$ is the sink/source term. The term $$Q$$ denotes the mass transfer rate per unit volume from the matrix subsystem to the microfracture subsystem and is defined as
\begin{aligned} Q=\frac{\sigma k_m}{\mu }(p_m-p_f), \end{aligned}
(3)
where the parameter $$\sigma$$ characterizes the fractured rock and is commonly known as the shape factor. Formulas for calculating $$\sigma$$ can be found in Warren and Root [16] and Mora and Wattenbarger [12].
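
To illustrate how the exchange term $$Q$$ couples the two subsystems, the following minimal sketch drops the diffusion and source terms from (1) and (2) and applies backward Euler steps to the remaining pair of ordinary differential equations. The function name `exchange_step` and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def exchange_step(p_m, p_f, a_m, a_f, kappa, dt):
    """One backward Euler step of the diffusion-free version of (1)-(2).

    a_m = phi_m * C_mt, a_f = phi_f * C_ft, kappa = sigma * k_m / mu,
    so that Q = kappa * (p_m - p_f) as in (3).
    """
    lhs = np.array([[a_m / dt + kappa, -kappa],
                    [-kappa, a_f / dt + kappa]])
    rhs = np.array([a_m / dt * p_m, a_f / dt * p_f])
    return np.linalg.solve(lhs, rhs)


# Illustrative values only (assumptions, not from the paper).
a_m, a_f, kappa, dt = 1.0, 0.1, 0.5, 0.01
p_m, p_f = 2.0, 1.0
storage_start = a_m * p_m + a_f * p_f

for _ in range(200):
    p_m, p_f = exchange_step(p_m, p_f, a_m, a_f, kappa, dt)

storage_end = a_m * p_m + a_f * p_f
pressure_gap = abs(p_m - p_f)
```

Because $$Q$$ enters (1) and (2) with opposite signs, the weighted sum `a_m * p_m + a_f * p_f` is unchanged by the exchange, while the pressure difference between the two subsystems decays.
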
We assume the flow in the conduit domain is Stokes flow and thus describe it using the Stokes equations
\begin{aligned} \frac{\partial {u}}{\partial t} - \nabla \cdot \mathbb {T}({u},p)&= f, \end{aligned}
(4)
\begin{aligned} \nabla \cdot {u}&= 0. \end{aligned}
(5)
Note that the model can be extended to other free flow models such as the incompressible Navier-Stokes model, as proposed in . The two variables, the flow velocity vector $${u}$$ and the flow pressure $$p$$, together describe the state of the flow. The constant $$\nu$$ represents the kinematic viscosity. The vector valued function $$f$$ is a general body force term. The operator $$\mathbb {T}({u},p)=2\nu \mathbb {D}({u})-p\pmb I$$ is the stress tensor and $$\mathbb {D}({u})=\frac{1}{2}(\nabla {u}+\nabla ^T{u})$$ is the deformation tensor, where $$\pmb I$$ is the identity matrix.
Four interface conditions are imposed:
\begin{aligned} -\frac{k_m}{\mu }\nabla p_m\cdot (-n_{cd}) = 0, \end{aligned}
(6)
\begin{aligned} u\cdot n_{cd}= -\frac{k_f}{\mu }\nabla p_f\cdot n_{cd}, \end{aligned}
(7)
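\begin{aligned} -n_{cd}^T\,\mathbb {T}({u},p)\,n_{cd} = \frac{p_f}{\rho }, \end{aligned}
(8)
\begin{aligned} -P_\tau (\mathbb {T}({u},p)\,n_{cd}) = \frac{\alpha \nu \sqrt{N}}{\sqrt{\mathrm {trace}(\varPi )}}\,P_\tau \left( {u}+\frac{k_f}{\mu }\nabla p_f\right) , \end{aligned}
(9)
where $$n_{cd}$$ is the unit normal vector of the interface $$\varGamma _{cd}$$, pointing toward $$\varOmega _d$$. Here $$\mathbb {T}({u},p)$$ denotes the stress tensor of the free flow. The function $$P_\tau$$ is the projection operator onto the local tangent plane of $$\varGamma _{cd}$$. The constant $$\alpha$$ is dimensionless and depends on the properties of the fluid and the permeable material. The constant $$\rho$$ is the fluid density. The constant $$N$$ is the space dimension. $$\varPi = k_f\pmb I$$ is the intrinsic permeability of the microfracture subsystem.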

Equation (6) represents the no mass exchange condition between the matrix subsystem in $$\varOmega _d$$ and the conduit. This assumption is based on the huge difference in permeability between the matrix and the microfracture subsystems. Equation (7) imposes conservation of mass exchange between the conduit and the microfracture subsystem on the interface. Equation (8) balances the two forces on the interface: the kinetic pressure in the microfracture subsystem and the normal component of the normal stress in the free flow. Equation (9) is the empirical Beavers-Joseph interface condition [3], which states that the tangential component of the normal stress incurred by the free flow along the interface is proportional to the difference of the tangential components of the flow velocities on the two sides of the interface.

By introducing the test function $$[\psi _m, \psi _f, v^T, q]^T$$, the coupled dual porosity Stokes PDE system defined by (1)–(9) can be written in variational form. A finite element implementation using the automated partial differential equation (PDE) solving platform FEniCS [1, 11] has been developed by the authors . The backward Euler time stepping scheme was used for time discretization.

## 3 A Data Assimilation Problem Based on the Coupled Model

In order to apply data assimilation methods to the coupled dual porosity Stokes model, we first convert the dual porosity Stokes model into a discrete dynamical system, and define the observations on it.

Following the finite element discretization with the backward Euler scheme, at time $$t$$ we solve the following linear system for the four variables in four finite-dimensional function spaces,
\begin{aligned} \mathbf {A} \begin{bmatrix} p_m^{(t)} \\ p_f^{(t)} \\ u^{(t)} \\ p^{(t)} \end{bmatrix} = \mathbf {C} \begin{bmatrix} p_m^{(t-\varDelta t)} \\ p_f^{(t-\varDelta t)} \\ u^{(t-\varDelta t)} \\ p^{(t-\varDelta t)} \end{bmatrix} +\mathbf {b}. \end{aligned}
The matrix $$\mathbf {A}$$ is assembled from the bilinear form of the time-discretized variational problem. The vector $$\mathbf{b}$$ is assembled from the linear form
\begin{aligned} L\left( \begin{bmatrix} \psi _m \\ \psi _f \\ v \\ q \end{bmatrix} \right) = \eta \int _{\varOmega _c}f\cdot v\,d\varOmega +\int _{\varOmega _d}q_p\psi _f\,d\varOmega . \end{aligned}
The matrix $$\mathbf {C}$$ is assembled from the bilinear form
\begin{aligned} c\left( \begin{bmatrix} p_m^{(t-\varDelta t)} \\ p_f^{(t-\varDelta t)} \\ {u}^{(t-\varDelta t)} \\ p^{(t-\varDelta t)} \end{bmatrix}, \begin{bmatrix} \psi _m^{(t)} \\ \psi _f^{(t)} \\ v^{(t)} \\ q^{(t)} \end{bmatrix} \right)&= \int _{\varOmega _d}\left( \phi _mC_{mt}\frac{p_m^{(t-\varDelta t)}}{\varDelta t}\psi _m + \phi _fC_{ft}\frac{p_f^{(t-\varDelta t)}}{\varDelta t}\psi _f\right) d\varOmega \\&\quad + \eta \int _{\varOmega _c}\frac{{u}^{(t-\varDelta t)}}{\varDelta t}\cdot vd\varOmega , \end{aligned}
and thus has the form
\begin{aligned} \mathbf{C} = \begin{bmatrix} \frac{\phi _mC_{mt}}{\varDelta t}\pmb I_{d_m}&{}&{}&{}\\ &{}\frac{\phi _fC_{ft}}{\varDelta t}\pmb I_{d_f}&{}&{}\\ &{}&{}\pmb I_{d_u}&{}\\ &{}&{}&{}\mathbf 0_{d_p} \end{bmatrix}, \end{aligned}
where $$d_m, d_f, d_u$$, and $$d_p$$ are the numbers of degrees of freedom of $$p_m, p_f, {u}$$, and $$p$$, respectively.
If we let the state variable be
$$v_t = \begin{bmatrix} p_m^{(t)} \\ p_f^{(t)} \\ {u}^{(t)} \\ p^{(t)} \end{bmatrix},$$
the dynamical system can be expressed as
\begin{aligned} v_{t+\varDelta t}&= \varPsi (v_t) + \xi _t,\end{aligned}
(11a)
\begin{aligned} \varPsi (v_t)&= \mathbf {A}^{-1}\mathbf {C}v_t+\mathbf{A}^{-1}\mathbf {b}, \end{aligned}
(11b)
where $$\xi _t\sim \mathcal {N}(\mathbf {0},\varSigma )$$ represents the model error. This dynamical system is linear. Note that the coefficient matrix $$\mathbf {C}$$ is singular, and so is $$\mathbf {A}^{-1}\mathbf {C}$$, the Jacobian of $$\varPsi$$ defined in (11a), (11b). Since some smoothing algorithms involve inverting the Jacobian of the dynamical system, we need to avoid this singularity.
In general, the singular value decomposition can be used to work around such singularities. In our case, we let our state variable be
$$v_t^* = \begin{bmatrix} p_m^{(t)} \\ p_f^{(t)} \\ {u}^{(t)} \end{bmatrix}.$$
The dynamical system becomes
\begin{aligned} v_{t+\varDelta t}^*&= \varPsi ^*(v_t^*) + \xi _t,\end{aligned}
(12a)
\begin{aligned} \varPsi ^*(v_t^*)&= {(\mathbf {A}^{-1})}^*\mathbf {C}^*v_t^*+(\mathbf {A}^{-1})^*{\mathbf {b}}^*, \end{aligned}
(12b)
where $$\mathbf {M}^*$$ denotes the matrix obtained by removing the last $$d_p$$ rows and columns from a matrix $$\mathbf {M}$$, and $$\mathbf {b}^*$$ is the vector obtained by removing the last $$d_p$$ components of a vector $$\mathbf {b}$$. In fact, (12a), (12b) can also be obtained by applying the singular value decomposition to $$\mathbf {A}^{-1}\mathbf {C}$$ in (11a), (11b). Note that $$p^{(t)}$$ can still be calculated from $$p_m^{(t-\varDelta t)}$$, $$p_f^{(t-\varDelta t)}$$ and $${u}^{(t-\varDelta t)}$$, which in turn can be calculated from $$p_m^{(t)}, p_f^{(t)}$$ and $${u}^{(t)}$$.
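
The effect of dropping the pressure block can be checked on a small toy system. The sketch below is an illustration under stated assumptions: the matrices `A`, `C` and the vector `b` are small stand-ins (not the assembled finite element matrices), `A` is taken to be symmetric positive definite for simplicity, and the pressure entries of `b` are set to zero.

```python
import numpy as np

d_state, d_p = 5, 2  # toy sizes; d_state plays the role of d_m + d_f + d_u
n = d_state + d_p

# Toy stand-ins (assumptions): a diagonal C with a zero pressure block,
# a symmetric positive definite A, and b with zero pressure entries.
C = np.diag(np.concatenate([np.full(d_state, 3.0), np.zeros(d_p)]))
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = C + K
b = np.concatenate([np.ones(d_state), np.zeros(d_p)])

A_inv = np.linalg.inv(A)
jac_full = A_inv @ C  # Jacobian of the full map in (11b)

# Starred quantities of (12b): drop the last d_p rows and columns.
A_inv_red = A_inv[:d_state, :d_state]
C_red = C[:d_state, :d_state]
b_red = b[:d_state]
jac_red = A_inv_red @ C_red  # Jacobian of the reduced map in (12b)


def psi_full(v):
    """Full map (11b) on the state including the pressure p."""
    return A_inv @ (C @ v + b)


def psi_red(v_red):
    """Reduced map (12b) on the state without the pressure p."""
    return A_inv_red @ (C_red @ v_red + b_red)


v = np.arange(1.0, n + 1.0)
full_next = psi_full(v)
red_next = psi_red(v[:d_state])

rank_full = np.linalg.matrix_rank(jac_full)
rank_red = np.linalg.matrix_rank(jac_red)
```

In this toy setting the full Jacobian is rank deficient by exactly $$d_p$$, the reduced Jacobian has full rank, and the reduced map reproduces the first `d_state` components of the full map.
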

Similarly, the Dirichlet boundary conditions will also cause singularities as they do not depend on previous boundary values. We remove all Dirichlet boundary values from the state variable $$v_t$$ using the same technique.

We base the dynamical model on the two dimensional dual porosity Stokes model shown in Fig. 2. Let $$\varOmega =[-0.5,0.5]\times [0,1]$$ be a shifted unit square, split into the conduit subdomain $$\varOmega _c$$ and the dual porosity subdomain $$\varOmega _d$$, which meet at the interface $$\varGamma _{cd}$$. The domain is partitioned uniformly into $$\frac{1}{16}\times \frac{1}{16}$$ squares.

Dirichlet boundary conditions on $$\varGamma _c$$ and $$\varGamma _d$$, initial conditions for all variables, and coefficients $$q_p$$ and $$f$$ are constructed such that
\begin{aligned} p_m&=\cos {\left( \pi t \right) } \cos {\left( x \left( - y + 1\right) \right) } \\ p_f&= \left( (x^2+y^2-2y+2)\cos {(\pi t)} - 10\pi \sin {(\pi t)} \right) \cos {(xy-x)} \\ {u}&= \left[ \begin{matrix}2 x \cos {\left( \pi t \right) } \\ 2 x \cos {\left( \pi t \right) } - 2 y \cos {\left( \pi t \right) }\end{matrix}\right] \\ p&= - 10 \pi \sin {\left( \pi t \right) } + \left( x^{2} + 2 x + y^{2} - 2 y + 6\right) \cos {\left( \pi t \right) } \end{aligned}
is the solution to our problem.

Fig. 2. The 2D example model with a shifted unit square domain $$\varOmega =[-0.5,0.5]\times [0,1]$$, conduit subdomain $$\varOmega _c$$ and dual porosity subdomain $$\varOmega _d$$.

Also, we let $$\varDelta t = 0.01$$, $$\xi _t\sim \mathcal {N}(\mathbf {0},5\pmb I)$$, and $$v_0^*\sim \mathcal {N}(\mathbf {0},100\pmb I)$$. The large variance of $$v_0^*$$ indicates that we have little knowledge about the initial condition.

For the observations, we assume that we directly observe every fourth component of $$v_t^*$$ at time $$t$$:
\begin{aligned} y_t&= h(v_t^*) + \eta _t,\end{aligned}
(13a)
\begin{aligned} h(v_t^*)&= \mathbf{Hv}_t^*,\end{aligned}
(13b)
\begin{aligned} \eta _t&\sim \mathcal {N}(\mathbf {0},5\pmb I), \end{aligned}
(13c)
where
$$\mathbf{H} = \begin{bmatrix} 1&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}\dots \\ 0&{}0&{}0&{}0&{}1&{}0&{}0&{}0&{}0&{}\dots \\ 0&{}0&{}0&{}0&{}0&{}0&{}0&{}0&{}1&{}\dots \\ \dots \end{bmatrix}.$$
We observe every 0.01 time unit starting at $$t=0.01$$. Equations (12a), (12b) and (13a), (13b), (13c) together define the data assimilation problem we are solving. The data are generated synthetically.
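
As a minimal sketch of the observation model (13a)–(13c), the following code builds a selection matrix that picks every fourth component of a state vector and draws one synthetic observation. The helper name `every_kth_observation_matrix`, the toy state size, and the placeholder true state are assumptions for illustration; only the observation noise variance 5 comes from (13c).

```python
import numpy as np


def every_kth_observation_matrix(n_state, k=4):
    """Selection matrix H with a single 1 per row, in columns 0, k, 2k, ..."""
    cols = np.arange(0, n_state, k)
    H = np.zeros((cols.size, n_state))
    H[np.arange(cols.size), cols] = 1.0
    return H


rng = np.random.default_rng(0)
n_state = 10                               # toy size (assumption)
H = every_kth_observation_matrix(n_state)

v_true = np.linspace(0.0, 1.0, n_state)    # placeholder true state (assumption)
obs_var = 5.0                              # noise variance from (13c)
y = H @ v_true + rng.normal(0.0, np.sqrt(obs_var), size=H.shape[0])
```

With zero-based indexing, the rows of `H` select components 0, 4, 8, and so on, which matches the pattern of ones shown in the matrix $$\mathbf {H}$$ above.
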

## 4 Numerical Results

We test the following methods on this problem: the three dimensional variational method (3DVAR), the strong constraint four dimensional variational method (s4DVAR) with a time window of length $$0.04$$, the extended Rauch-Tung-Striebel smoother (ExtRTS) [15], the extended Kalman Filter (ExtKF) [10], the ensemble Kalman Filter (EnKF) [5, 7] with 100 particles, and the ensemble Rauch-Tung-Striebel smoother (EnRTS) [13] with 100 particles. Note that since we have a linear data assimilation problem, the extended methods ExtRTS and ExtKF reduce to the Rauch-Tung-Striebel smoother (RTS) and the Kalman Filter (KF). We also use a baseline filtering method, Forward, that only uses the mathematical model $$\varPsi ^*$$ and ignores all data. It starts at $$v_0^*=\mathbf {0}$$ and then applies $$\varPsi ^*$$ repeatedly to get an approximation for $$v_t^*$$. Since the model is linear, we expect ExtKF to give the optimal solution for filtering and ExtRTS for smoothing.
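
For a linear model with Gaussian errors, one cycle of the Kalman Filter consists of a forecast step and an analysis step. The sketch below is a textbook version written with NumPy, not the DAPPER implementation used for the experiments; the function names and the small toy system are assumptions, while the variances 5 and 100 follow Sect. 3.

```python
import numpy as np


def kf_predict(mean, cov, M, c, Sigma):
    """Forecast step for the model v -> M v + c with model error covariance Sigma."""
    return M @ mean + c, M @ cov @ M.T + Sigma


def kf_update(mean, cov, y, H, Gamma):
    """Analysis step for the observation y = H v + noise with covariance Gamma."""
    S = H @ cov @ H.T + Gamma            # innovation covariance
    K = np.linalg.solve(S, H @ cov).T    # Kalman gain: cov H^T S^{-1}
    mean_a = mean + K @ (y - H @ mean)
    cov_a = (np.eye(len(mean)) - K @ H) @ cov
    return mean_a, cov_a


# Toy setup (assumed): four state components, only the first one observed.
n = 4
M, c = 0.9 * np.eye(n), np.zeros(n)
Sigma, Gamma = 5.0 * np.eye(n), 5.0 * np.eye(1)
H = np.zeros((1, n))
H[0, 0] = 1.0

mean_f, cov_f = kf_predict(np.zeros(n), 100.0 * np.eye(n), M, c, Sigma)
mean_a, cov_a = kf_update(mean_f, cov_f, np.array([91.0]), H, Gamma)
```

In this toy case the variance of the observed component drops from 86 to 86 · 5 / 91 after the analysis step, while the variances of the unobserved, uncoupled components stay at 86.
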

All numerical experiments were run with the data assimilation package DAPPER [14] on the Teton computer cluster at the Advanced Research Computing Center (ARCC) at the University of Wyoming.

The results of filtering on the model with such observations are in Table 1 and the results of smoothing are in Table 2. Since we have a linear system with Gaussian errors, the Kalman Filter and the Kalman Smoother are expected to have optimal data assimilation solutions for filtering and smoothing, respectively, which is in accordance with our numerical results. We see that the Kalman Filter, the Kalman Smoother and 3DVAR are efficient in our small linear model while ensemble methods and s4DVAR are relatively slow.
Table 1. Average root mean square error for filtering (rmse_f) and elapsed time

| | Forward | 3DVAR | ExtKF | EnKF |
|---|---|---|---|---|
| rmse_f | 0.4717 | 0.2824 | 0.2604 | 0.2651 |
| elapsed time | 5 s | 1 s | 5 s | 56 s |

Table 2. Average root mean square error for smoothing (rmse_s) and elapsed time

| | s4DVAR | ExtRTS | EnRTS |
|---|---|---|---|
| rmse_s | 0.3 | 0.1907 | 0.2033 |
| elapsed time | 121 s | 32 s | 62 s |
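
The tables report average root mean square errors. One common way to compute such a quantity, assumed here since the exact definition is not stated above, is to take the root mean square error over the state components at each time and then average over time. The function name `average_rmse` and the toy arrays are illustrative.

```python
import numpy as np


def average_rmse(estimates, truths):
    """Time average of the per-time RMSE; inputs have shape (n_times, n_state)."""
    estimates = np.asarray(estimates, dtype=float)
    truths = np.asarray(truths, dtype=float)
    rmse_per_time = np.sqrt(np.mean((estimates - truths) ** 2, axis=1))
    return rmse_per_time.mean()


# Toy data (assumption): two time steps, four state components.
truths = np.zeros((2, 4))
estimates = np.array([[3.0, 4.0, 0.0, 0.0],
                      [1.0, 1.0, 1.0, 1.0]])
value = average_rmse(estimates, truths)
```

Here the per-time errors are 2.5 and 1.0, so the average is 1.75.
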

The errors of the different data assimilation methods over time are shown in Figs. 3 and 4. Since Forward, 4DVAR, ExtKF, and EnKF all start with the initial guess $$\tilde{v}_0=\mathbf {0}$$, they all make the same prediction at $$t=0.01$$ and therefore have the same forecasting error at $$t=0.01$$ in Fig. 3. Predictions are made every 0.01 time units. ExtKF has a smaller forecasting error than all the other methods except 3DVAR. Our 3DVAR implementation uses all true states to approximate the background covariance $$B_t$$; this exposure to the true states enables it to surpass the theoretically optimal solution from the Kalman Filter. EnKF gives a result very similar to that of ExtKF. In EnKF, the mean and variance of the states are approximated using the Monte Carlo method. Since the states follow a Gaussian process, the approximations converge to the true values as the number of particles increases. We can also see in Fig. 4 that, by using all observations, the smoothing error at $$t=0.01$$ is reduced by half compared with the forecasting error in Fig. 3. Note that the Kalman Smoother ExtRTS achieves the best result at all times, and the ensemble Kalman Smoother EnRTS gives a very similar result but takes about twice the computation time (62 s versus 32 s), as shown in Table 2.
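
The gain from using all observations can be illustrated with a textbook Rauch-Tung-Striebel backward pass on top of a Kalman Filter forward pass. This is a sketch under stated assumptions, not the DAPPER code used for the experiments: the function names, the 2 × 2 toy model, and the synthetic data are made up for illustration, and only the variances 5 and 100 follow Sect. 3.

```python
import numpy as np


def kalman_filter(ys, M, c, Sigma, H, Gamma, mean0, cov0):
    """Forward pass: predicted and filtered (mean, cov) pairs for each step."""
    mean, cov = mean0, cov0
    predicted, filtered = [], []
    for y in ys:
        mean_p = M @ mean + c
        cov_p = M @ cov @ M.T + Sigma
        S = H @ cov_p @ H.T + Gamma
        K = np.linalg.solve(S, H @ cov_p).T      # gain: cov_p H^T S^{-1}
        mean = mean_p + K @ (y - H @ mean_p)
        cov = (np.eye(len(mean_p)) - K @ H) @ cov_p
        predicted.append((mean_p, cov_p))
        filtered.append((mean, cov))
    return predicted, filtered


def rts_smoother(predicted, filtered, M):
    """Backward pass: smoothed (mean, cov) pairs that use all observations."""
    smoothed = [filtered[-1]]
    for t in range(len(filtered) - 2, -1, -1):
        mean_f, cov_f = filtered[t]
        mean_p, cov_p = predicted[t + 1]
        mean_s, cov_s = smoothed[0]
        G = np.linalg.solve(cov_p, M @ cov_f).T  # gain: cov_f M^T cov_p^{-1}
        smoothed.insert(0, (mean_f + G @ (mean_s - mean_p),
                            cov_f + G @ (cov_s - cov_p) @ G.T))
    return smoothed


# Toy linear system (assumed values, except the variances 5 and 100).
rng = np.random.default_rng(0)
M = np.array([[0.9, 0.1], [0.0, 0.9]])
c = np.zeros(2)
Sigma, Gamma = 5.0 * np.eye(2), 5.0 * np.eye(1)
H = np.array([[1.0, 0.0]])

truth, ys = np.array([1.0, -1.0]), []
for _ in range(20):
    truth = M @ truth + c + rng.multivariate_normal(np.zeros(2), Sigma)
    ys.append(H @ truth + rng.multivariate_normal(np.zeros(1), Gamma))

predicted, filtered = kalman_filter(ys, M, c, Sigma, H, Gamma,
                                    np.zeros(2), 100.0 * np.eye(2))
smoothed = rts_smoother(predicted, filtered, M)
```

At the final time the smoothed and filtered estimates coincide; at earlier times the smoothed covariance is never larger than the filtered one, which is the mechanism behind the lower smoothing errors.
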

Note that the baseline method Forward also has an error that decreases with time. This is caused by the characteristics of our dynamical system. Because of the essential boundary conditions in our coupled model, solutions to the PDE system with different initial conditions all converge to each other as $$t\rightarrow \infty$$. This can also be explained in terms of the linear dynamical system. Consider a linear dynamical system with $$\varPsi (v_t) = Mv_t$$, where the spectral radius of $$M$$ is less than 1. Then $$\varPsi ^{(n)}(v_t)\rightarrow \mathbf {0}$$ as $$n\rightarrow \infty$$.

Fig. 3. Forecasting error of different filtering algorithms.

Fig. 4. Smoothing error of different smoothing algorithms.
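
The convergence of trajectories started from different initial conditions can be reproduced on a toy affine map $$v\mapsto Mv+b$$: the difference between two trajectories after $$n$$ steps is $$M^n$$ applied to the initial difference, which vanishes when the spectral radius of $$M$$ is below 1. The matrix `M`, the vector `b`, and the initial states below are assumptions chosen for illustration.

```python
import numpy as np

# Toy matrix (assumption): spectral radius 0.5, although its 2-norm exceeds 1.
M = np.array([[0.5, 2.0], [0.0, 0.5]])
b = np.array([1.0, 1.0])

spectral_radius = np.max(np.abs(np.linalg.eigvals(M)))
two_norm = np.linalg.norm(M, 2)


def forward(v, n_steps):
    """Apply the affine map v -> M v + b repeatedly, ignoring all data."""
    for _ in range(n_steps):
        v = M @ v + b
    return v


v_a = forward(np.array([10.0, -10.0]), 200)
v_b = forward(np.zeros(2), 200)
gap = np.linalg.norm(v_a - v_b)

# Both trajectories approach the fixed point of the affine map.
fixed_point = np.linalg.solve(np.eye(2) - M, b)
```

Both runs end at the same fixed point regardless of the initial state, mirroring the decreasing error of the Forward baseline.
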
The results of smoothing at $$t=0.01$$ from s4DVAR and ExtRTS are shown in Fig. 5. The results of filtering at $$t=0.02$$ from 3DVAR and ExtKF are shown in Fig. 6. We can see from Fig. 5 that, using limited observations, s4DVAR and ExtRTS are able to recover a state close to the true state. In Fig. 6, we also see that, using only the data at $$t=0.01$$, 3DVAR and the Kalman Filter are able to predict a state at $$t=0.02$$ that is much closer to the true state than the prediction of the Forward baseline method.

Fig. 5. Results of smoothers at time $$t=0.01$$ in the 2D model.

Fig. 6. Results of filters at time $$t=0.02$$ in the 2D model.
We also explore the importance of observations on different variables. With the same settings on the dynamical system, we apply the Kalman Filter (ExtKF) to observations on $$p_m$$, $$p_f$$, and $${u}$$ separately. We still observe every 0.01 time unit starting at $$t=0.01$$, but on all grid points. The results are presented in Fig. 7.

Fig. 7. Forecasting error of Kalman Filter with different observations.

We can see from Fig. 7 that the data on the flow pressure $$p_m$$ in the matrix subsystem in the dual porosity subdomain provides most of the information while the other two variables provide little improvement over the Forward baseline method, which uses no observation at all. This behavior exists in all our test models with different boundary conditions, source terms and geometries. This phenomenon needs further investigation. Here we conclude that in our limited test cases, observations on $$p_m$$ provide significant information about the true states while observations on $$p_f$$ and $${u}$$ do not.
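
Observing a single variable on all grid points corresponds to an observation matrix that selects one block of the state vector. The following sketch assumes that the state is ordered block by block as $$[p_m, p_f, {u}]$$; the helper name `block_observation_matrix` and the toy block sizes are illustrative.

```python
import numpy as np


def block_observation_matrix(block_sizes, observed):
    """Matrix that selects every component of one block of the state vector."""
    offsets = np.concatenate([[0], np.cumsum(block_sizes)])
    start, stop = offsets[observed], offsets[observed + 1]
    H = np.zeros((stop - start, offsets[-1]))
    H[np.arange(stop - start), np.arange(start, stop)] = 1.0
    return H


block_sizes = [3, 3, 4]                             # toy d_m, d_f, d_u (assumed)
H_pm = block_observation_matrix(block_sizes, 0)     # observe p_m only
H_u = block_observation_matrix(block_sizes, 2)      # observe u only

v = np.arange(10.0)                                 # placeholder state
```

Swapping `H_pm` for `H_u` in a filter changes which variable is observed while leaving the dynamical model untouched.
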

Lastly, we show the results of the Kalman Filter and the Kalman Smoother on a 3D coupled dual porosity Stokes model introduced in , with the mesh size $$h=1/8$$ and observations on $$p_m$$ only. All the other settings are the same as in the 2D model. The results in Fig. 8 indicate that the Kalman Filter and the Kalman Smoother are also applicable to our 3D model, which is closer to real world applications.

Fig. 8. Forecasting and smoothing error of ExtKF and ExtRTS on the 3D model.

## 5 Conclusions and Future Work

In this paper, we introduced the coupled dual porosity Stokes model. We set up a data assimilation problem based on the coupled model and applied different data assimilation methods to solve the problem. Due to the linearity of the coupled dual porosity Stokes model, the Kalman Filter and the Kalman Smoother achieve optimal solutions for filtering and smoothing, respectively, as expected. From our numerical experiments we have seen that observations of pressures in the matrix subsystem contain most of the useful information for data assimilation.

Future work includes exploring different data assimilation methods on the nonlinear coupled dual porosity Navier-Stokes model, applying data assimilation methods to experimental data, and investigating the reason behind the uneven distribution of information among the different variables.

## References

1. Alnæs, M.S., et al.: The FEniCS project version 1.5. Arch. Numer. Softw. 3, 9–23 (2015)
2. Barenblatt, G.I., Zheltov, I.P., Kochina, I.N.: Basic concepts in the theory of seepage of homogeneous liquids in fissured rocks [strata]. J. Appl. Math. Mech. 24, 852–864 (1960)
3. Beavers, G.S., Joseph, D.D.: Boundary conditions at a naturally permeable wall. J. Fluid Mech. 30, 197–207 (1967)
4. Douglas, C.C., Hu, X., Bai, B., He, X., Wei, M., Hou, J.: A data assimilation enabled model for coupling dual porosity flow with free flow. In: 2018 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), pp. 304–307 (2018)
5. Evensen, G.: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. 99(C5), 10143–10162 (1994)
6. Hou, J., Qiu, M., He, X., Guo, C., Wei, M., Bai, B.: A dual-porosity-Stokes model and finite element method for coupling dual-porosity flow and free flow. SIAM J. Sci. Comput. 38, B710–B739 (2016)
7. Houtekamer, P.L., Mitchell, H.L.: Data assimilation using an ensemble Kalman filter technique. Monthly Weather Rev. 126, 796–811 (1998)
8. Hu, X., Douglas, C.C.: An implementation of a coupled dual-porosity-Stokes model with FEniCS. In: Rodrigues, J.M.F., et al. (eds.) ICCS 2019. LNCS, vol. 11539, pp. 60–73. Springer, Cham (2019)
9. Hu, X., Douglas, C.C.: Performance and scalability analysis of a coupled dual porosity Stokes model implemented with FEniCS. Japan J. Indust. Appl. Math. 36, 1039–1054 (2019)
10. Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35 (1960)
11. Logg, A., Mardal, K.A., Wells, G.N.: Automated Solution of Differential Equations by the Finite Element Method. LNCSE, vol. 84. Springer, Heidelberg (2012)
12. Mora, C.A., Wattenbarger, R.A.: Analysis and verification of dual porosity and CBM shape factors. J. Can. Pet. Tech. 48, 17–21 (2009)
13. Raanes, P.N.: On the ensemble Rauch-Tung-Striebel smoother and its equivalence to the ensemble Kalman smoother. Q. J. R. Meteorol. Soc. 142, 1259–1264 (2016)
14. Raanes, P.N., et al.: Nansencenter/dapper: version 0.8, December 2018
15. Rauch, H.E., Striebel, C.T., Tung, F.: Maximum likelihood estimates of linear dynamic systems. AIAA J. 3, 1445–1450 (1965)
16. Warren, J.E., Root, P.J.: The behavior of naturally fractured reservoirs. SPE J. 3, 245–255 (1963)