1 Introduction

For reliable decision-making in the petroleum industry, reservoir characterization is implemented to estimate the distribution of reservoir parameters of interest. Conventional reservoir characterization uses static and dynamic data in consecutive order. After initial reservoir models are generated from static data, they are simulated to predict reservoir performance, which should be compared with observed data. Dynamic data are used to modify initial models to minimize the difference in reservoir performance. Here, static data such as core data and well logs have a constant value with time, while dynamic data vary with time such as 4D seismic data and well oil production rate (WOPR).

In the case of static data, they represent reservoir parameters at certain location (hard data) or are closely related to them (soft data). These types of data are applied to build prior reservoir models via geostatistical methods. On the other hand, dynamic data are assimilated by inverse algorithms because dynamic data are indirectly related to reservoir parameters. In the case of channelized reservoirs, it is hard to generate reliable reservoir models by geostatistics and inverse modeling due to the following issues (Wang and Li 2011; Hou et al.2015; Kim et al. 2016a; Jo et al. 2017; Jung et al. 2017; Kang et al. 2017; Lee et al. 2017a): First, reservoir performances are complicated by a unique pattern of sand facies. Second, a histogram of reservoir properties has a bimodal distribution, not Gaussian distribution, because there is a stark contrast in the properties between sand and background facies.

To replicate the spatial connectivity of channelized sand facies, multiple-point statistics (MPS) are more appropriate than two-point statistics (TPS) (Strebelle 2002). The two statistics need spatial information, hard data, and soft data as input data. Here, hard data mean direct information about the reservoir parameters such as core and well log data, whereas soft data are indirect information such as facies probability and vertical proportion. The main difference between the two statistics is how spatial relation can be represented. Training image (TI) and variogram are used for MPS and TPS, respectively. TI contains a geological concept from the interpretation of the depositional environment, while variogram is based on mathematical calculations of spatial correlation (Journel 2002).

One of the advantages in MPS over TPS is that conditional probability from TI is easily coupled with soft data through the tau model (Kashib and Srinivasan 2006). TI gives an approximate pattern of facies distribution, while soft data provide constraint for each grid. However, if there are no available seismic data and sufficient geological interpretation, it is difficult to determine channel geometry for TI. This is why the previous studies used several TIs to consider uncertainty in a geological concept (Jafarpour and McLaughlin 2009; Scheidt and Caers 2009a, b; Lorentzen et al. 2012; Lee et al. 2013b, 2016). Therefore, characterization of TI and soft data is crucial, since the reliability of MPS highly depends on their quality.

Recently, a new paradigm has arisen where geostatistical input parameters are obtained by dynamic data (Agbalaka and Oliver 2011; Jafarpour and Khodabakhshi 2011; Astrakova and Oliver 2014; Tavakoli et al. 2014; Sebacher et al. 2015; Chang et al. 2016; Lee et al. 2017b; Kim et al. 2017). Conventional history matching is to characterize model parameters of interest, but updated models cannot preserve static information because inverse algorithms may ignore given hard data, soft data, and geological concept (Jafarpour and Khodabakhshi 2011; Hu et al. 2013; Hou et al. 2015; Satija et al. 2017). To solve this problem, updated models from inverse modeling are used to generate pseudo-static data which are coupled with static data given to rebuild reservoir models. This procedure has the same effect as history matching because regenerated models are based on both static data given and history-matched static data.

This novel approach still depends on the results of inverse algorithms to generate pseudo-static data. In other words, this iterative static modeling can give a meaningful result only if the inversion results are reliable. However, it is difficult with channelized reservoirs to ensure the reliability of the inversion results (Kim et al. 2016b; Lee et al. 2014). Furthermore, it causes a heavy burden on simulation time during inverse algorithms and extensive iterations (Queipo et al. 2002; Kang et al. 2016; Kim et al. 2016c).

In this research, we propose a novel iterative static modeling scheme for channelized reservoirs, which have uncertainty in channel geometry. For each iteration, global facies probability from TI is managed by TI rejection and local facies probability is updated by history-matched soft data. According to TI rejection and the soft data, three strategies are tested in two channelized reservoir cases to optimize the iterative static modeling.

2 Methodology

2.1 Procedure of the proposed method

A conventional workflow of reservoir characterization is shown in Fig. 1. After initial reservoir models are built from static data by geostatistics, they are simulated by the reservoir simulator. Observed dynamic data are compared with simulated dynamic performance, and the difference is minimized by an inverse algorithm until the convergence criteria is satisfied. Finally, all updated models are simulated again to predict future reservoir performance.

Fig. 1
figure 1

The workflow of a conventional reservoir characterization. Green and blue colors stand for static modeling and reservoir simulation, respectively. A conventional reservoir characterization requires inverse modeling with forward simulation for all possible reservoir models

In the case of the proposed method, after initial models are generated, geostatistics is implemented iteratively instead of an inverse modeling (Fig. 2). During iterative static modeling, history-matched facies probability data are generated and some unsuitable TIs are discarded for further iterations. In detail, initial reservoir models are generated using given hard data and all TIs, but regenerated reservoir models are built using hard data given, chosen TI(s), and pseudo-soft data.

Fig. 2
figure 2

The flowchart of the proposed method. Green and blue colors stand for static modeling and reservoir simulation, respectively. A gray-dashed rectangle means iterative clustering and simulation. The proposed method does not use an inverse algorithm and requires few reservoir simulations. Three strategies are tested to find the best implementation for integration of the following two concepts: TI rejection and pseudo-facies probability map

To reject TI and generate pseudo-soft data, we adapt the concept of clustering and simulation procedure in Lee et al. (2017b) (Fig. 2). This concept can select facies models, which have similar production with observed data, among hundreds of facies models with the minimum number of forward simulations. Briefly, initial or regenerated facies models are grouped into similar models by a distance-based method (Fig. 3c). In this research, the Hausdorff distance, multi-dimensional scaling (MDS), and k-means clustering are used and verified the suitability in Sect. 2.4. After the clustering, a representative model for each cluster is implemented for reservoir simulation instead of all the models. The best representative model, which has the lowest root mean square (RMS) error between simulated and observed dynamic data, is determined (Fig. 3d). Finally, the facies models surrounding the best model in the metric space are selected (Fig. 3e), and the mean of them becomes pseudo-soft data or final models (Fig. 2).

Fig. 3
figure 3

The procedure for the selection of models in metric space using the distance-based clustering. a Hundreds of initial models are generated by multi-point geostatistics. b Each model can be assigned coordinates by distance (Hausdorff equation) and dimension reduction (multi-dimensional scaling). c Initial models are grouped by a clustering algorithm (k-means clustering), and models belonging to the same group appear in the same color. d Production from centroid models (gray lines) is compared with observed data (red line), and the best-fit model (black line) can be chosen. e Qualified models (red dots) are selected nearby the best centroid model

To combine facies probability data with a probability from TI, the tau model is used as follows Eq. (1) (Journel 2002):

$$P\left( {F|{\text{TI}},{\text{SD}}} \right) = \frac{1}{1 + x}$$
(1)
$$\frac{x}{a} = \bigg( {\frac{b}{a}} \bigg)^{{\tau_{1} }} \bigg( {\frac{c}{a}} \bigg)^{{\tau_{2} }} ,$$
$$x = \frac{{1 - P\left( {F|{\text{TI}},{\text{SD}}} \right)}}{{P\left( {F|{\text{TI}},{\text{SD}}} \right)}},\quad a = \frac{1 - P\left( F \right)}{P\left( F \right)},\quad b = \frac{{1 - P\left( {F|{\text{TI}}} \right)}}{{P\left( {F|{\text{TI}}} \right)}},\quad c = \frac{{1 - P\left( {F|{\text{SD}}} \right)}}{{P\left( {F|{\text{SD}}} \right)}}$$

where F represents the event of occurrence of a certain facies. TI stands for the probability from the TI for given well data, while SD means pseudo-soft data. τ is a weighting on the information from the TI and facies probability map. If τ2 is greater than τ1, the probability from TI has less influence than pseudo-soft data.

In Lee et al. (2017b), there is no uncertainty in TI (one TI) and iteration concept. In this research, various TIs are utilized to consider uncertainty in channel geometry. Whenever facies models are regenerated, TIs are rejected according to the proportion of TIs, which is used to build the selected facies models. Furthermore, a proper practice to generate pseudo-soft data is proposed in this research.

2.2 Three strategies for iterative static modeling

We test three strategies for iterative static modeling, which are distinguished by the usage of TI rejection and the form of pseudo-soft data (Table 1). The strategy 1 does not adopt a TI rejection scheme, which means that all TIs are used during iteration of geostatistics. In other words, the identical number of facies models is generated from each TI. The same approach was used by Park et al. (2013).

Table 1 Three strategies for iterative static modeling

The strategies 2 and 3 use a TI rejection scheme, which means that facies models are generated in proportion to the number of the qualified models for each TI. For example, 200 initial models are generated, which consist of 100 from TI 1 and 100 from TI 2. After the clustering and simulation procedure, 10 qualified models originate by 6 from the TI 1 and 4 from the TI 2 (Fig. 4a). When making 200 new facies models for next iteration, 120 and 80 models are generated from the TI 1 and TI2, respectively (same proportion to TI ratio in qualified models). If all qualified models are chosen from only the TI 1, 200 new facies models are generated from the TI 1 only, which means that the TI 2 is excluded.

Fig. 4
figure 4

Definition of pseudo-soft data from the qualified models for the three strategies. a Analyzing TI used to create the qualified models. (The six models in the red box and the four models in the blue box are generated by TIs 1 and 2, respectively.) b A unified soft data (mean of all qualified models) in strategies 1 and 3. c Separated soft data for each TI (mean of the six models for TI 1 and mean of the four models for TI2) in strategy 2

Strategies 2 and 3 differ in the way of generating pseudo-soft data. Strategy 2 distinguishes the qualified models based on TI and generates the separated soft data for each TI. For example, the mean of the 6 models from the TI 1 becomes pseudo-soft data for the TI 1 and the mean of the 4 models from the TI 2 is utilized as the soft data for the TI 2 (Fig. 4c). However, strategy 3 makes the unified soft data using all 10 qualified models regardless of the TIs (Fig. 4b). Strategy 1 also follows the same practice with strategy 3 for the soft data.

Iteration of static modeling ends when one of the following convergence criteria is satisfied. First, only one TI is left and all facies models are generated by the TI. Second, the RMS error of the previous and the current pseudo-soft data becomes less than a certain value, α. The RMS is calculated from the following equation [Eq. (2)]:

$${\text{RMS}}_{{P_{\text{f}} }} = \sqrt {\frac{1}{{N_{\text{grid}} }}\mathop \sum \limits_{i = 1}^{{N_{\text{grid}} }} \left\{ {P_{{{\text{f}},i}}^{\text{c}} - P_{{{\text{f}}, i}}^{\text{p}} } \right\}^{2} }$$
(2)

where \(P_{{{\text{f}}, i}}\) means facies probability at the ith grid and \(N_{\text{grid}}\) is the number of grids. The superscripts c and p represent current and previous, respectively. In this research, α of 3.5 is set empirically because RMS values in Tables 6 and 10 are about 3.5 when the strategies 2 and 3 are converged by the first convergence criterion. The convergence of the strategy 1, which does not use the TI rejection concept, can be judged based on the RMS value even though the convergence of the strategies 2 and 3 is determined by both criteria.

2.3 Synthetic channelized reservoir cases

In this research, the proposed method is applied to two synthetic cases. Examples A and B have the same geological parameters except for channel geometry for TIs (Table 2). The channelized reservoirs have a 25 × 25 × 1 grid system with an inverted nine-spot waterflooding. These settings for the synthetic cases have been similarly used in previous studies (Jafarpour and McLaughlin 2009; Scheidt and Caers 2009a; Wang and Li 2011; Lee et al. 2013a).

Table 2 Geological parameters for the static data integration by geostatistics

Both examples assume that the geological concept is not confirmed due to the lack of geological information and there is uncertainty in TI. Example A has uncertainty in the orientation of the channel pattern and Example B contains uncertainty in width and amplitude of channels as shown in Tables 3 and 4, respectively. For Example A, there are 4 TIs, which consist of vertical (TI 1), 45° (TI 2), horizontal (TI 3), and 135° (TI 4) (Fig. 5a). Fifty initial facies models are generated using each TI with well data in Table 2. At this stage, the channel pattern of each ensemble is set in the direction of Table 3. Figure 5b shows one model from each TI, and it maintains the direction of TI used. Note that facies probability map is not available for initial models.

Table 3 Uncertainty for orientation of channel geometry in Example A
Table 4 Uncertainty for width and amplitude of channel geometry in Example B
Fig. 5
figure 5

Training images and corresponding model in Example A. a Four TIs have a different channel orientation. b One model among fifty initial models from each TI (total 200 models)

In the case of Example B, 4 TIs have the same vertical channel pattern but the channel width and amplitude are set differently as shown in Table 4. The TIs 2 and 3 have larger amplitude and wider width than the TI 1, respectively (Fig. 6a). In the worst case, the TI 4 has both larger amplitude and wider width than the TI 1. Fifty initial models are built by each TI and one model from each TI follows the features of its TI (Fig. 6b). For both examples, the total number of initial and regenerated models is 200.

Fig. 6
figure 6

Training images and corresponding model in Example B. a Four TIs have a different channel orientation. b One model among fifty initial models from each TI (total 200 models)

The reference field in Fig. 7a is built by the parameters in Table 2 with the default TI in Figs. 5a and 6a. The true field has three vertical channel streams and has the connection between the production well P6 and the injection well I9. This field is assumed as the reference field for both examples. In the case of observed dynamic data, WOPRs from the eight production wells are obtained with the parameters in Table 5 by a commercial reservoir simulator, ECLIPSE 100. In the case of permeability, isotropy is assumed in the x and y directions. It is assumed that WOPR data are observed every 20 days up to the current time of 900 days (Fig. 7b). There are 360 observation data: multiplying the 8 production wells and the 45 observation time steps. The standard deviation for the observation data is set to 0.01 STB/day. The production wells P2, P6, and P7 show a sharp decrease in WOPR about 200 days because of fast water breakthrough due to the connectivity between the water injection well and the production wells. Channel characterization is very important for the prediction of reservoir performances because fluid movement is dominant in channelized sandstone rather than background shale.

Fig. 7
figure 7

The reference field and observed dynamic data for Examples A and B. a Facies distribution of the reference field with eight production wells and a single injection well at the center. b Oil production rates from the production wells until 900 days (observed data used)

Table 5 Petrophysical parameters for the reservoir simulation

For the proposed method, the three strategies in Sect. 2.2 are tested in Example A and the best strategy is examined in Example B. Also, the standard ensemble-based methods, ensemble Kalman filter (EnKF) and ensemble smoother (ES), are applied to both examples for comparison. We verify the result of the proposed method in view of facies distribution, permeability histogram, production predictions on the existing production wells and a newly drilled production well, and simulation time.

2.4 Verification of the distance-based method

The proposed technique assumes that the facies models, which are classified in the same group in Fig. 3c, are similar to each other. Figure 8 shows the validity of the Hausdorff distance and verification of the clustering method for facies models. When the distance-based clustering is applied to the 200 initial facies models in Example A, similar facies distribution is found among the nearby facies models in the metric space (Fig. 8a). Therefore, the center (representative) model can represent the models in the same group, and the closest models from the best center model can be selected for low RMS models (qualified models) with the reference model (Fig. 3e).

Fig. 8
figure 8

The results of the distance-based clustering in Example A. a Three models in the same cluster have similar facies distribution by the distance-based clustering. b Hausdorff distance has a relationship with WOPR in variogram. c Hausdorff distance has a relationship with WWCT in the variogram

The Hausdorff distance has been verified as a proper concept of a similarity measure for channelized reservoirs in the previous research (Suzuki and Caers 2008; Suzuki et al. 2008; Lee et al. 2013a, 2016, 2017b). After simulating the initial models to obtain production data such as WOPR and well water cut (WWCT), the variogram of the production data can be calculated as a function of the Hausdorff distance. If the variogram has a clear structure model rather than a pure nugget model, it stands for spatial correlation between the production data and the distance concept (Suzuki and Caers 2008). In this research, the Hausdorff distance is a proper distance concept because the variogram of WOPR and WWCT has a certain structure (Fig. 8b, c). The x- and y-axes are calculated as follows Eq. (3):

$${\text{Experimental}}\;{\text{variogram}},y{\text{-axis }} = \frac{1}{{2N_{m} }}\mathop \sum \limits_{m = 1}^{{N_{m} }} \left( {\mathop \sum \limits_{k = 1}^{{N_{d} }} \left( {q^{k} \left( {M_{i} } \right) - q^{k} \left( {M_{j} } \right)} \right)^{2} } \right)$$
(3)
$${\text{Average of the Hausdorff distance}},\;x{\text{-axis}} = \frac{1}{{N_{m} }}\mathop \sum \limits_{m = 1}^{{N_{m} }} H_{m}$$
$$\begin{aligned} {\text{The}}\;{\text{Hausdorff}}\;{\text{distance}},\;H_{m} \left( {M_{i} , M_{j} } \right) & = { \hbox{max} }\left( {d\left( {M_{i} , M_{j} } \right), d\left( {M_{j} , M_{i} } \right)} \right), \\ d\left( {A, B} \right) & = \mathop{ \hbox{max} }_{a \in A }\; \mathop{ \hbox{min} }_{b \in B} \; \| a - b\| \end{aligned}$$

where Nm means the number of pairs of facies model, which satisfy given lag distance. Hm is the Hausdorff distance of the pairs. q(Mi) stands for production data from the ith facies model Mi, and Nd indicates the number of observed production data. For example, in Fig. 8b, the second point from the left has 68.08 ft and 399,820 STB2/day as the Hausdorff distance and experimental variogram of WOPR, respectively. It is calculated from 33 pairs of facies models, which have a Hausdorff distance from 65 to 70 ft.

3 Example A

3.1 Characterization of channel connectivity

The mean of the initial log-permeability models does not have a distinct connectivity in Fig. 9b because the initial facies models are generated from the 4 TIs in Fig. 5a. In the case of the ensemble-based methods in Fig. 9c, d, the red and blue colors are clearly distinguished but the mean fields are quite different from the reference field in Fig. 9a. Above all, they lose connectivity of sand facies and have high permeability values only near the wells.

Fig. 9
figure 9

Log-permeability of the reference field and the mean of log-permeability models in Example A. a Permeability distribution in the reference field. b Mean of 200 initial models (no channel trend). c Mean of updated models by ES. d Mean of updated models by EnKF. e Mean of updated models by strategy 1 in the proposed method. f Mean of updated models by strategy 2 in the proposed method. g Mean of updated models by strategy 3 in the proposed method

When the histogram of the mean fields is examined in Figs. 10c and d, the ensemble-based methods have over- and undershooting values. Although the ES significantly reduces the number of forward simulations over the EnKF due to the global update of dynamic data, it deepens the problem. For example, the largest permeability value of the result from the ES is greater than 7.78 × 1011 mD, while the smallest value is less than 1.74 × 10−7 mD: both values are physically unrealistic. Also, the histograms follow a Gaussian distribution, not the bimodal distribution in the reference field (Fig. 10a) due to the inherent assumption in the ensemble-based methods. These results clarify the problem of conventional inverse algorithms mentioned in Sect. 1 because they do not consider the geological meaning of reservoir parameters. It has been reported that this problem can be solved by techniques such as localization in many studies (Watanabe and Datta-Gupta 2012; Luo et al. 2018; Jung et al. 2018). Since this research is not a study to improve the ensemble-based method, only the standard ensemble-based methods are used as a comparison of the proposed method.

Fig. 10
figure 10

Histogram of the mean of log-permeability models in Example A. a Bimodal distribution of permeability in the reference field. b Mean of initial models. c Mean of updated models by ES. d Mean of updated models by EnKF. e Mean of updated models by strategy 1 in the proposed method. f Mean of updated models by strategy 2 in the proposed method. g Mean of updated models by strategy 3 in the proposed method

The three strategies for the proposed method, iterative static modeling with pseudo-soft data and TI rejection, give the results in Figs. 9e–g and 10e–g. At first, Tables 6 and 7 show the termination of the iteration for the three strategies. \({\text{RMS}}_{{P_{\text{f}} }}\) value is steadily decreased for all strategies in Table 6. The \({\text{RMS}}_{{P_{\text{f}} }}\) value at the first iteration from strategies 2 and 3 is higher than the value from strategy 1 because the new models from the two strategies are quite different from the initial models. However, the value decreases sharply at the second iteration. The iteration of strategies 1 and 3 ends at the third iteration because \({\text{RMS}}_{{P_{\text{f}} }}\) is less than 3.5, the convergence criterion. In the case of the strategy 2, it stops iterative static modeling at the second iteration, although it has \({\text{RMS}}_{{P_{\text{f}} }}\) of 3.61, since TI has converged as shown in Table 7.

Table 6 RMS between the mean of the qualified models in Example A
Table 7 The number of facies models per TI for total 200 models in Example A

Regardless of the strategies in the proposed method, initial models are built from all TIs equally. As mentioned above, the strategy 1 generates 50 new models from each TI during iteration, but the other two strategies construct new facies models according to a certain ratio. The number of models from each TI is proportional to the qualified models in the previous step as explained in Sect. 2.2.

These differences result in final models in Fig. 9. The result of the strategy 1 has three vertical connectivities and shows the connection between the production well P6 and the injection well (Fig. 9e). Its histogram has a bimodal-like distribution in Fig. 10e. However, there is a huge uncertainty in the left area in the mean field, which indicates the variety of facies distribution in the final models. It results from usage of all TIs during iterative static modeling. It demonstrates the importance of a proper TI for MPS algorithms even though soft data can guide a local facies probability.

The final models from the strategies 2 and 3 have similar facies distribution with the reference field (Fig. 9f, g), and their histograms overcome the Gaussian problem in the ensemble-based methods (Fig. 10f, g). Technically, the results from the strategy 3 show better performance because the connection between the production well 6 and the injection well is clear, and the facies ratio in the histogram is similar to the reference field.

Insufficient results in the strategy 2 are caused by biased soft data for iterative static modeling in the early stage. The first-qualified models in Fig. 4a, which are chosen from the initial models, have much uncertainty because the initial models are generated with the 4 TIs and without integration of dynamic data. The reference field in Fig. 9a has mainly vertical channel streams (TI 1 in Fig. 5a) and 45° connection between the production well P6 and the injection well (TI 2 in Fig. 5a). Therefore, the first-qualified models consist of six models from the TI 1 and four models from the TI 2 (Fig. 4a). If pseudo-soft data are generated for the TIs 1 and 2 separately, the vertical TI is coupled with the vertical trend soft data and the 45° TI is used with the 45° trend soft data (Fig. 4c). This strategy 2 intensifies biased trends during iterative static modeling, and regenerated models can have improper facies distribution if the geological information is incorrect or has high uncertainty. The strategy 3 uses unified soft data in Fig. 4b, and it mitigates the robust tendency of the separated soft data in Fig. 4c.

When we make a close investigation to the final models from the strategies 2 and 3 in Fig. 11, the effect of soft data can be detected. In the case of the strategy 3, the final models have vertical connections with the 45° connection and there is a triangular background facies on the lower left (Fig. 11b). However, some of the final models from the strategy 2 give discontinued facies connections at the bottom left (Fig. 11a). The breaks are found between the production well P6 and the injection well and between well P7 and the injection well in the biased soft data (Fig. 4c). Therefore, the unified soft data can guide the local channel distribution properly rather than the separated soft data. That means the strategy 3 is a best-fit practice for the iterative static modeling.

Fig. 11
figure 11

The final models selected from the strategies 2 and 3 in Example A. Yellow and blue color mean sand and background facies, respectively. a The final updated models by strategy 2 in the proposed method. b The final updated models by strategy 3 in the proposed method

3.2 Uncertainty quantification in production forecasts

The final models in the previous section are implemented using ECLIPSE 100 to predict the reservoir productions up to 1800 days. Even though only WOPR up to 900 days is utilized for history matching, we compare both WOPR and WWCT in Fig. 12. The red line for each figure indicates the true production from the reference field. The gray lines are the prediction of each model, such as initial and final models, and the average of the gray lines is marked in the blue line. The band of the gray lines at a certain time stands for the uncertainty range. Here, the production wells P4 and P6 are investigated because there is a complex facies distribution near the wells in the reference field.

Fig. 12
figure 12

Predictions of WOPR and WWCT on the production wells P4 and P6 up to 800 days in Example A. Gray and blue lines indicate results of individual ensembles and mean of them, respectively. The red line means the true production from the reference field, and the vertical black line stands for the end of assimilation time. a Initial ensembles with wide uncertainty. b ES with filter divergence problem. c EnKF with improper uncertainty. d Strategy 1 in the proposed method. e Strategy 2 in the proposed method. f Strategy 3 in the proposed method

The predictions from the initial models have a wide uncertainty range (Fig. 12a) because they are not integrated with observed dynamic data. Also, the average (the blue line) deviates from the true production trend due to uncertainty in the channel direction of the TIs. Especially, there are wide uncertainty ranges for WWCTs. In the case of the ES and EnKF, although the uncertainty ranges are reduced by the integration of dynamic data, the predictions are quite unreliable, such as WCT for both P4 and P6 wells (Fig. 12b, c), even worse than the result of the initial models. It results from the wrong updated models in Figs. 9 and 10. Example A is a very challenging inverse problem for the ensemble-based method because of the bimodal distribution and uncertainty in the TIs, which cause initial ensemble design issues (Jafarpour and McLaughlin 2009; Lee et al. 2013b, 2016).

All strategies for the proposed method demonstrate better predictions than both the initial models and ensemble-based methods (Fig. 12d–f). In the case of WWCT, the strategies 1 and 2 still have wide uncertainty ranges, even though the average can predict water breakthrough time and overall tendency properly. However, the predictions of the final models from the strategy 3 converged to the true productions with narrow bands. Also, they form a band of predictions without ensemble collapse problem, which all models become almost same after history matching. It is a natural result because the final models of the strategy 3 in Fig. 11b look similar to the reference field.

In the case of the proposed method and initial models, WWCTs increase sharply in the early stages because the injected water prefers to flow through high-permeability channel facies. However, water breakthrough time from both the ES and EnKF is delayed due to the Gaussian distribution (Fig. 12c). They fail to describe the fluid behavior of channel reservoirs with a bimodal distribution. This is why many previous studies have used transform techniques for the ensemble-based methods for channelized reservoir characterization.

If the integration of dynamic data is successful, the updated or final models can be utilized for making a decision on future development of the reservoir. Therefore, they are tested for the prediction of a newly drilled production well P9 in Fig. 13a, which starts production after 900 days, at the end of history matching. The well P9 is set at the location of (18, 22), and the operational constraint is the lowest bottom-hole pressure (BHP) of 500 psi.

Fig. 13
figure 13

Water saturation of the reference field and prediction of the newly drilled well P9’s WWCT in Example A. a Distribution of water saturation in the reference field at 900 days and location of the new well P9 which is located between the production wells P7 and P8. b Initial ensembles with wide uncertainty, c ES with filter divergence problem. d EnKF with wide uncertainty. e Strategy 1 in the proposed method. f Strategy 2 in the proposed method. g Strategy 3 in the proposed method

Figure 13a shows the distribution of water saturation of the reference field at 900 days. The distribution is similar to the connectivity of sand facies because the injected water mainly moves through high-permeability sand facies. Therefore, understanding the channel distribution is critical to predict the performance of the new well P9. In the case of the initial models without any calibration, they have very diverse predictions and most of them start to produce water immediately at 900 days (Fig. 13b). It results from an overestimation of the channel connection between the injection well and the lower right of the field. The EnKF and ES cannot give a reliable prediction (Fig. 13c, d), which may lead to wrong decisions, because of the over- and undershooting problems and Gaussian distribution of the updated models.

The proposed method represents the movement of the injection water of the new well properly because it maintains the bimodal distribution with reasonable facies distribution. The strategy 1 makes meaningful WWCT predictions with significantly reduced uncertainty range compared to the initial models and the ensemble-based techniques (Fig. 13e). In the case of the strategies 2 and 3, water breakthrough time is predicted with a very high reliability (Fig. 13f, g). Also, its uncertainty is appropriate to provide a rational basis for decision-makings. For example, we can decide the location of an infill well based on these credible results. The outstanding success of the proposed method is proven through the facies distribution, its histogram, the preexisting production wells, and the infill well.

In addition to the reliability of static and dynamic data integration, the proposed method can reduce the number of dynamic simulations drastically compared to the ensemble-based methods (Table 8). The ES requires simulated WOPR for each initial model to compare with the true WOPR during history matching. After that, dynamic simulation for each updated model is needed to predict future productions. Therefore, the number of forward simulations from the ES is 400 in total. The EnKF demands many more forward simulations for history matching because it assimilates model parameters every dynamic data acquisition time. In this case, the EnKF requires 9000 ECLIPSE runs for history matching, which is the product of total models (200) and the number of dynamic data acquisitions (45).

Table 8 The number of forward simulations for the ensemble-based methods and the proposed method in Example A

However, the proposed method needs less than 30 forward simulations for history matching (Table 8). Only the center models are required to predict WOPRs for each iteration due to the distance-based clustering (see Fig. 2). The simulation number for history matching in the strategy 2 is less than the number in the strategies 1 and 3 because it stops iteration at the early stage due to the biased soft data (Table 7). The proposed method requires only 20 simulations in the prediction step (Table 8) as the below reasons. Instead of simulating all 200 regenerated models at the last iteration, 10 simulations for the center models are needed to find the final models. Then, future reservoir performances are predicted by only the final 10 models. Therefore, the proposed method is a very efficient methodology to integrate static and dynamic data and to estimate uncertainty ranges over the conventional methods.

4 Example B

4.1 Characterization of channel connectivity

Example B deals with the uncertainty in width and amplitude of channels (Fig. 6). The direction is fixed as vertical, which is an uncertainty parameter in Example A (Fig. 5). The mean of the initial log-permeability models in Fig. 14b has a more clear facies distribution of high permeability than the mean of the initials in Fig. 9b. It shows that the channel direction in TI has a more critical effect on the reservoir models than width and amplitude.

Fig. 14
figure 14

Log-permeability of the reference field and the mean of log-permeability models in Example B. a Permeability distribution in the reference field. b Mean of 200 initial models (vertical channel trend). c Mean of updated models by ES. d Mean of updated models by EnKF. e Mean of updated models by strategy 3 in the proposed method

In this example, the two ensemble-based methods are used again for comparison with the proposed method. In the case of the ES, the mean field has scattered facies distribution and is quite different from the true model (Fig. 14c). Especially, histogram of the result has extreme permeability values (max: 6.9211 × 1021 mD, min: 5.381 × 10−18 mD) in Fig. 15c. The mean of the updated models from the EnKF (Fig. 14d) has distinct vertical connections. It is a much better result than the result in Fig. 9d because both the uncertainty in TIs and initial ensemble design issue are relieved. Even though the ES is much faster than the EnKF (Table 8), EnKF shows much better performances than ES from the standpoint of an inverse stability. However, the result of the EnKF still cannot detect a detailed facies distribution such as the connection on the lower left area (Fig. 14d). Also, its histogram fails to preserve a bimodal distribution even if the over- and undershooting problems in the ES is settled down (Fig. 15d).

Fig. 15
figure 15

Histogram of the mean of log-permeability models in Example B. a Bimodal distribution of permeability in the reference field. b Mean of initial models. c Mean of updated models by ES (Gaussian distribution). d Mean of updated models by EnKF (Gaussian distribution). e Mean of updated models by strategy 3 in the proposed method (bimodal distribution)

In the case of the proposed method, the strategy 3 is applied to this example because it shows the best performance in the previous example. Figure 16 shows the mean of the qualified models for each iteration. When the concept of the clustering and simulation in Fig. 2 is applied to the initial models, the qualified models are listed in Fig. 16e. They consist of five models from the TI 1, two models from the TI 2, and three models from the TI 3. Therefore, 200 new models for iteration 1 are constructed with unified soft data with 100 models, 40 models, and 60 models from the TI 1, TI 2, and TI 3, respectively (Table 9). It can be seen that the amplitude has a slightly greater effect on TI of channelized reservoirs than the width. The TI 4 is rejected at the initial stage because it is the most different TI from the true TI. The TI 2 becomes excluded at the first iteration, and the TI 3 also falls off at the second iteration, sequentially (Table 9).

Fig. 16
figure 16

Changes in the qualified models and soft data from the strategy 3 in Example B. a Mean of the qualified models among all initial models. b Mean of the qualified models among the first regenerated models, c Mean of the qualified models among the second-regenerated models. d Mean of the qualified models among the third-regenerated models. e The qualified models among all initial models [used for (a)]. f The qualified models among the third-regenerated models [used for (d)]

Table 9 The number of facies models per TI for total 200 models in Example B

Finally, the TI 1 (the true TI) generates 200 new models with the soft data from the second iteration (Fig. 16c) for the third iteration. The qualified models at the third iteration in Fig. 16f originated from the TI 1, of course. As TI converges, the iteration can be stopped and the qualified models become the final models. When the final models in Fig. 16f are compared with the qualified models from the initial stage in Fig. 16e, facies distribution has converged to have a specific tendency. It is a natural result because the soft data are assimilated by dynamic data and are integrated with static data given. Therefore, the mean of the final log-permeability models is quite similar to the true field (Fig. 14e), and its histogram can preserve a bimodal distribution of the channelized reservoirs (Fig. 15e).

Figure 16a stands for the mean of the qualified models in Fig. 16e from the initial models. This unified soft data is integrated for iterative static modeling to generate 200 new models for the first iteration. The mean of the qualified models is changed from Figs. 16a–d, which becomes more and more converged and similar to the facies distribution of the reference field. \({\text{RMS}}_{{P_{\text{f}} }}\), the difference between the current and previous mean fields, is gradually decreased during the iteration (Table 10) as facies distribution is determined. Finally, \({\text{RMS}}_{{P_{\text{f}} }}\) becomes less than the threshold value, 3.5, at the third iteration.

Table 10 RMS between the mean of the qualified models in Example B

4.2 Uncertainty quantification in production forecasts

The final models in Fig. 16f are implemented through the reservoir simulator to predict WWCT up to 1800 days. WWCTs are compared with the results in Example A, which are the existing production wells P4 and P6 in Fig. 12 and the new production well P9 in Fig. 13. The y-axis in Fig. 17 is the ratio of simulated WWCT to the true WWCT at the end of the prediction time (1800 days). The black horizontal dotted line indicates the true WWCT. The x-axis consists of the initial models, the ensemble-based methods, and the three strategies in both examples. For example, the box plots of the Example A’s initial models (Ex.A_Initial) are further from the black line than the ones of the Example B’s initial models (Ex.B_Initial) due to the higher uncertainty of channel geometry (Fig. 17).

Fig. 17
figure 17

The box plots of WWCT at 1800 days for the existing wells (P4 and P6) and newly drilled well (P9) in Examples A and B. The horizontal black line stands for the true value, and production from eight cases are compared with the true production. a WWCT of production well P4. b WWCT of production well P6. c WWCT of production well P9

In terms of the existing production wells, WWCTs of the production well P6 have larger uncertainties than those of the production well P4 (Fig. 17a, b) because the connectivity between P6 and the injection well is a critical feature in the reference field. If the connection is not considered, the WWCT of P6 must be underestimated compared to the true WWCT at 1800 days. Therefore, most cases in Fig. 17b predict lower than 1, which means smaller simulated WWCTs than the true values. In the case of the ES and EnKF, the results give completely wrong WWCTs for both existing wells even though it takes much more time to integrate dynamic data than the proposed method.

For the proposed method, the importance of the TI rejection scheme can be seen from the box plots of the strategy 1. They show better performance than the ensemble-based methods but still have too wide an uncertainty because the strategy 1 uses all TIs during the iteration. However, the strategies 2 and 3 in Example A and the strategy 3 in Example B reduce the uncertainty range significantly, and they can reflect the true WWCT. Also, uncertainty ranges for the new well are wider than ones of the existing wells because the new well is not included for determining soft data. Nevertheless, the proposed method can provide a trustworthy WWCT for decision-making.

5 Conclusions

In this research, a novel idea, iterative static modeling using history-matched soft data, is proposed and successfully applied to synthetic channelized reservoirs. The three strategies are tested to optimize the iteration procedure for the following two issues: usage of a TI rejection scheme and the unified or separated soft data. The iteration can be terminated according to the convergence of the TI or soft data. The distance-based clustering, which consists of the Hausdorff distance, MDS, and k-means clustering, is utilized to reduce the number of forward simulations.

We use several TIs to reflect the uncertainty in channel geometry. Example A deals with the effect of channel direction, and Example B considers the effect of channel amplitude and width. Among the parameters for channel geometry, the facies distribution and reservoir performances are influenced by the order of direction, amplitude, and width. In Example A, the unified soft data with TI rejection, the strategy 3, shows the best performance compared to the strategies 1 and 2. The concept of TI rejection can manage global trend of channel streams such as main channel direction. The unified soft data can mitigate the effect of biased information in the separated soft data at the early iteration.

The proposed method solves the problems in the conventional inverse algorithms. The strategy 3 for both examples can make a reliable facies distribution and conserve static data given such as bimodal distribution, facies ratio, hard data, and geological concepts in TI. The standard ensemble-based methods, ES and EnKF, fail to characterize channel fields and show over- and undershooting problems. From the standpoint of simulation time, the proposed method has an advantage over the ES and EnKF. In the case of the strategy 3, it requires only 30 times of forward simulation for history matching, while the EnKF and ES need 9000 times and 200 times, respectively.

The performances of the production wells are predicted from the updated models using the ensemble-based methods and the final models using the proposed method. WWCTs of the updated models from the ES and EnKF cannot mimic a sharp increase after water breakthrough due to a Gaussian distribution. Even if the ensemble-based methods take a long time for history matching, the predictions may lead to erroneous decisions because they give even worse predictions than the ones from initial models, which use static data only. However, the final models from the proposed method can provide reliable predictions with reduced uncertainty for both the existing production wells and the newly drilled production well.

From the result of the strategy 1, the significance of TI in MPS is confirmed once again because soft data such as a facies probability map can provide information about the guideline level only. If the quality of TI is very low, setting tau1 to be less than tau2 may increase the effect of pseudo-soft data over TI on facies modeling.