# Surrogate-Based Design Optimisation Tool for Dual-Phase Fluid Driving Jet Pump Apparatus


## Abstract

A comparative study of four well-established surrogate models used to predict the non-linear entrainment performance of a dual-phase fluid driving jet pump (JP) apparatus is performed. A JP design flow configuration comprising a dual-phase (air and water) flow driving a secondary gas-air flow, for which no unique set of design solutions has yet been established, is described. For the construction of the global approximations (GA), the response surface methodology (RSM), Kriging and the radial basis function artificial neural network (RBFANN) were primarily used. The stacked/ensemble modelling methodology was integrated in this study to improve the predictive results, thus providing accurate GA that facilitate the multi-variable non-linear response design optimisation. An error analysis of all four models, along with a multiple-model accuracy analysis of each case study, was performed. The RSM, Kriging, RBFANN and stacked models formed part of the surrogate-based optimisation, with the entrainment ratio as the main objective function. Optimisation problems were solved by the interior-point algorithm, the genetic algorithm and a hybrid formulation of both algorithms. A total of 60 optimisation problems were formulated and solved with the approximation models. Results showed that the hybrid formulation with the level-2 ensemble Kriging model performed best, predicting the experimental performance results for all JP models within an error margin of less than 10 % in 90 % of the cases.

## Keywords

Dual-phase jet pump · Surrogate modelling · Global approximations · Global optimisation · Ensemble modelling · Genetic algorithm · Gaussian process · Radial basis function

## 1 Introduction

Methodologies applied to build adequate learning models are crucial in performing a model-based optimisation (MBO). With rapid advances in computer science, MBO is becoming increasingly applicable to modelling, simulation, experimental and optimisation processes. It has proved to be one of the most efficient techniques for expensive and time-demanding real-world optimisation problems [1]. Several studies which considered MBO in the context of global optimisation (GO) were performed to solve design problems closely related to the one considered in this study.

Here, MBO is used for predicting the non-linear entrainment performance of a dual-phase fluid driving jet pump (JP) apparatus, a technology well known as an artificial lift method in the oil and gas industry.

Among all pumping equipment, one of the simplest and most effective ways to revive low-pressure oil and gas wells, or to boost production from such wells, is the use of JPs. A JP, also known as an ejector, eductor, thermo-compressor or injector, is a well-established piece of equipment for pumping and mixing fluids across a wide range of applications in various engineering segments such as water, nuclear and aviation technologies [2]. In contrast to other Improved Oil Recovery (IOR) and Enhanced Oil Recovery (EOR) solutions, the surface jet pump (SJP) technology is classified as the cheapest and most effective solution, which highlights its importance in the current oil and gas situation [3, 4, 5]. However, among all SJP technological advancements to date, no unique design solution has been established for a JP which operates effectively under multiphase motive fluid sources. Although a commercialised variety of single-phase and multiphase JPs/ejectors exists, multiphase JPs (for applications in the oil and gas industry, where the motive fluid contains more than a single phase, such as liquid and gas mixtures) suffer several drawbacks. These drawbacks are mainly attributed to the degradation of entrainment and boost performance, which reduces the overall operating efficiency and limits their applicability owing to a lack of flexibility, rangeability and versatility. Individually or collectively, these drawbacks restrict onshore and remote offshore applications, and entirely preclude subsea JP applications. A feasible and practical means of dealing with this typically complex (non-linear) multi-variable design problem is presented here, in the form of a surrogate-based design optimisation.

Relevant work in the literature can be categorised into two types. The first type includes studies which consider one surrogate model, while the second type involves multiple learning algorithms (either comparing models or stacking them in ensemble modelling techniques), thus employing more than one surrogate model to perform surrogate-based optimisation. Studies comprising the build-up of models based on a single surrogate methodology include the work of Kajero et al. [6], who used the Kriging meta-model approach. These authors coupled Kriging with the expected improvement (EI) criterion to assist in the calibration of computational fluid dynamics (CFD) models. They successfully calibrated a single-model CFD parameter with experimental data and considered a case of single-phase flow in straight-pipe and convergent-divergent-type annular JPs. This study involved fixed design parameters based on the experimental data of Shimizu et al. [7], and considered HP and LP static pressures to compute the pressure coefficient. Yang et al. [8] made use of the RSM and the desirability approach to investigate and optimise a JP in a thermoacoustic Stirling heat engine (TASHE). This study considered four design parameters: position, length, diameter and tapered angle of the nozzle. Also, in an approach slightly different from global optimisation (GO), Lyu et al. [9] worked on the design of experiments (DOE) methodology in combination with CFD to structurally optimise the design of annular liquid-liquid JPs. This work considered the volumetric flow ratio, angle of suction chamber, throat length and the diffuser diverging angle as the input design parameters. For a different engineering design optimisation problem, Di Piazza et al. [10] used the Kriging estimation method to investigate partial shading in photovoltaic fields. It was shown that this learning method provided a cheaper and simpler characterisation of the photovoltaic plant output power and, as a result, allowed energy forecasting.

It is difficult to establish whether one of these surrogate modelling methods is superior to the others. A comprehensive comparison of multiple surrogate models (built with different methods) has therefore been conducted here to try to answer this question. Simpson et al. [11] compared a maximum of three models, contrasting the polynomial-based response surface with Kriging surrogates for the aerodynamic design optimisation of hypersonic spiked blunt bodies. Similarly, Shyy et al. [12] compared the relative performance of polynomial and neural network surrogate models for aerodynamics and rocket propulsion problems. Other methodologically related studies include the work of Simpson et al. [13], who emphasised the robustness of Kriging over the RSM surrogate model. The authors used Kriging for global approximation in simulation-based multidisciplinary (multiple input and multiple response) design optimisation problems, applied to a real aerospace engineering application: the design of an aerospike nozzle. The most comprehensive work found in the literature is that of Luo and Lu [14], who performed a comparison between the RSM, Kriging and RBFANN surrogate models when building surrogates of a dual-phase flow simulation model in a simplified nitrobenzene-contaminated aquifer remediation problem. Their surrogate-based optimisation methodology identified the most cost-effective remediation strategy.

The objectives of this study are to:

- Develop surrogate models using the RSM, Kriging, RBFANN and ensemble methodologies for constructing global approximations (GA) for use in a real dual-phase JP design application, in particular the design of the injection body and parts of the ejection portions of a JP apparatus.

- Estimate the accuracy of the different surrogate models.

- Perform a comparison between level-1 and level-2 (ensemble) models, and select the surrogate models showing acceptable accuracy for use in the non-linear optimisation model (considering multiple optimisation algorithms), in order to identify the design parameters which optimise entrainment performance under various motive fluid gas volume fractions (GVFs).

## 2 Background: Surrogate Models

### 2.1 Response Surface Models

The response surface methodology is the simplest and most common method applied for analysing the results of physical experiments and for generating empirically based models of response values [15, 16].

A full second-order response surface model takes the form \({\hat{y}}={\beta }_0+\sum ^k_{i=1}{\beta }_i x_i+\sum ^k_{i=1}{\beta }_{ii} x^2_i+\sum _{i<j}{\beta }_{ij} x_i x_j\), where *k* is the number of variables, while \(x_i\) and \(x_j\) are the input variables.
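For illustration, the second-order response surface fit can be sketched in a few lines; the following Python fragment is an illustrative stand-in for the study's MATLAB workflow (all function names are ours, not the paper's):

```python
import numpy as np

def rsm_design_matrix(X):
    """Second-order RSM design matrix: intercept, linear, pure
    quadratic and two-factor interaction columns."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]                      # beta_i x_i
    cols += [X[:, i] ** 2 for i in range(k)]                 # beta_ii x_i^2
    cols += [X[:, i] * X[:, j]                               # beta_ij x_i x_j
             for i in range(k) for j in range(i + 1, k)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Least-squares estimates of the beta coefficients."""
    beta, *_ = np.linalg.lstsq(rsm_design_matrix(X), y, rcond=None)
    return beta

def predict_rsm(beta, X):
    return rsm_design_matrix(X) @ beta
```

The coefficient count for *k* variables is \(1+2k+k(k-1)/2\), which grows quickly, so higher-order polynomials (as used later in this study) require correspondingly more samples.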

### 2.2 Kriging Models

The Kriging method is a probabilistic approach that involves a statistical estimation technique for the spatial interpolation of random quantities. The initial mathematical formulation of the Kriging method was developed on the basis of experiments performed by Danie Krige, who established the distribution of minerals in the subsoil by performing punctual surveys; the method was formalised in 1963. Later, Sacks et al. [22] developed it into a surrogate model, shaping it into the form mostly known nowadays. This model is also known as the design and analysis of computer experiments (DACE) [23, 24]. The ooDACE toolbox, developed by Couckuyt et al. [25], is a versatile Matlab toolbox which incorporates the popular Gaussian-process-based Kriging surrogate models.

Unlike RSM, Kriging models were purposely developed for mining, geostatistical and spatial applications, which involve spatially and temporally correlated data [26, 27]. More recently, the model has gained popularity and begun to be used for other engineering applications, such as real aerospace design problems and the build-up of meta-models to assist the calibration of computational fluid dynamics models [6, 13].

In a Kriging model the response is expressed as \({\hat{y}}\left( x\right) =\sum ^k_{i=1}{\beta }_i f_i\left( x\right) +Z\left( x\right)\), where the \(f_i\left( x\right)\) are *k* known regression functions (typically polynomial functions) providing a "global" model of the design space, and \(Z\left( x\right)\) is the realisation of a stochastic stationary process with mean zero, process variance \({\sigma }^2\) and covariance given by the covariance matrix in Eq. (5). In contrast to \(f_i\left( x\right)\), \(Z\left( x\right)\) captures "localised" deviations of the interpolation of the \(n_s\) data points.

Here \(x^k_i\) and \(x^k_j\) are the *k*th components of the sample points \({x}_{i}\) and \({x}_j\), and \(n_{dv}\) is the number of design variables. In many cases, such as in McKay et al. [31], Sacks et al. [22], and Osio and Amon [32], a single value of \(\theta\) provided good results; in this study, however, a different value of \(\theta\) is used for each design variable.

Predicted estimates at an untried point *x* for a universal Kriging model are found by Eq. (7).

Here \(\mathbf{y }\) is a column vector of length \({n}_s\) which contains the sample values of the response, and \(\mathbf{f }\) comprises a column vector of length \({n}_s\), where in the case of ordinary Kriging (not universal Kriging) \(\mathbf{f }\left( x_*\right) =1\), thus reducing to a scalar function fixed at unity. The literature comprises the work of Deutsch and Journel [33], Cassie [27], Simpson et al. [13], Emery [34] and Bayraktar and Turalioglu [35], which involves ordinary Kriging models, while in the work of Zimmerman et al. [36], Brus and Heuvelink [37] and Sampson et al. [38], universal Kriging models are applied.

If a first-order polynomial is involved, then \(f\left( x_*\right) ={\left[ 1,x_{*1},x_{*2},\, \ldots ,x_{*n}\right] }^T\), and so on. In this study the 0th-, 1st- and 2nd-order polynomials were considered. \(\mathbf{F }\) is a matrix of the form \({\mathbf{F }}={\left[ f\left( x_1\right) ,\, f\left( x_2\right) ,\, \ldots ,f\left( x_m\right) \right] }^T\) containing the regression functions evaluated at all *m* training data points.

Finally, \({r}^T\) is a vector of length \(n_s\) containing the correlations between an untried point *x* and the sampled data points \(\left\{ x_1,\, \ldots ,x_{ns}\right\}\). Such \({r}^T\) is expressed by:

The predicted response \({\hat{y}}\) is estimated by Eq. (10):
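The ordinary Kriging predictor described above can be sketched compactly. The following Python fragment is an illustrative stand-in for the MATLAB/ooDACE implementation used in the study, with fixed rather than MLE-fitted \(\theta\) values and illustrative function names:

```python
import numpy as np

def corr_matrix(X1, X2, theta):
    """Gaussian correlation: R_ij = exp(-sum_k theta_k (x_ik - x_jk)^2)."""
    d2 = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-np.tensordot(d2, theta, axes=([2], [0])))

def ordinary_kriging(X, y, theta, nugget=1e-10):
    """Fit an ordinary Kriging model (constant regression, f = 1)
    and return its prediction function."""
    n = X.shape[0]
    R = corr_matrix(X, X, theta) + nugget * np.eye(n)   # small nugget for stability
    Rinv = np.linalg.inv(R)
    ones = np.ones(n)
    beta = (ones @ Rinv @ y) / (ones @ Rinv @ ones)     # generalised least-squares mean

    def predict(Xs):
        r = corr_matrix(Xs, X, theta)                   # correlations r^T
        return beta + r @ Rinv @ (y - beta * ones)      # mean plus correlated deviation
    return predict
```

By construction the predictor interpolates the training data (up to the small nugget) and reverts to the mean \(\beta\) far away from all samples, which is the "global plus localised deviation" behaviour described above.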

### 2.3 Radial Basis Function Neural Network (RBFANN)

Consider an input vector *X*, with \(X {\varvec{\in }} {{\mathbb {R}}}^n\); the output of the neurons in the RBFANN hidden layer is then given by:

Here \(c_i\) is the centre of the *i*th neuron in the RBF hidden layer, \(i=1,\, 2,\, \ldots , N\); *N* is the number of neurons in the hidden layer; \(\left\| X- c_i\right\|\) is the norm of \(X-c_i\), which is either the Euclidean distance or the Mahalanobis distance; and \({\varPhi }\, (\cdot)\) is the radial basis function, commonly taken to be Gaussian, as given in (14), although the 'Cauchy' function or the 'Multiquadric' function may be used instead [41].

The weight \(w_{ik}\) connects the *i*th hidden-layer neuron to the *k*th output-layer neuron, and \({\theta }_k\) is the threshold value of the *k*th output-layer neuron.

This model is widely applicable to approximation, classification, prediction and system control. Park and Sandberg [42] showed that RBF networks perform well as universal approximators on compact subsets of \({{\mathbb {R}}}^n\); that is, an RBF network with enough hidden neurons can approximate a continuous function on a closed, bounded set with arbitrary precision.
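As a concrete illustration of the hidden-layer mapping above, the following Python sketch builds a Gaussian RBF network with one hidden neuron centred on each training sample (an exact-interpolation special case; the study's MATLAB models instead used a fixed number N of hidden neurons, and all names here are illustrative):

```python
import numpy as np

def gaussian_rbf(r, sigma):
    """Gaussian basis: Phi(r) = exp(-r^2 / (2 sigma^2))."""
    return np.exp(-(r ** 2) / (2.0 * sigma ** 2))

def fit_rbf_network(X, y, sigma=1.0, reg=1e-10):
    """RBF network with centres c_i placed on every training sample;
    the output weights come from a regularised linear solve."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # ||x_i - c_j||
    Phi = gaussian_rbf(r, sigma)
    w = np.linalg.solve(Phi + reg * np.eye(len(X)), y)

    def predict(Xs):
        rs = np.linalg.norm(Xs[:, None, :] - X[None, :, :], axis=2)
        return gaussian_rbf(rs, sigma) @ w
    return predict
```

The width \(\sigma\) plays the same smoothing role as the Kriging \(\theta\) parameters: too small a width produces spiky interpolation, too large a width an ill-conditioned hidden-layer matrix.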

### 2.4 Model Ensembling

Four of the most common types of ensemble methods are: (a) model averaging, (b) bagging, (c) stacking, and (d) boosting [48, 49].

#### 2.4.1 Model Averaging

Model averaging combines *N* predictions via Eq. (17). Each prediction is the output from one of the *N* trained models, which together are used to form the scalar ensemble prediction; the number of averaged models is *K*, and \({\hat{y}}_n\) is the predictive output from each learning model.

#### 2.4.2 Bagging Model/Bootstrap Aggregating

Bagging is very similar to model averaging, but comprises a slightly modified training procedure. Bagging trains each base learner on a subset of samples drawn with replacement, a procedure also known as bootstrap sampling. Thus, if the RBFANN method is used, multiple models are generated, each trained on a subset of the sample data. Once trained, the corresponding responses from the different models are averaged to obtain a scalar ensemble output.

#### 2.4.3 Stacking Model

Stacking, better known as meta-ensembling, is a model ensemble technique which combines response data from a plurality of predictive models, to generate a new and improved model. This model is therefore an extension (or a second procedure), after responses have been generated by each single predictive model [50].

Stacking requires a modified training procedure relative to other described methods. In this case, *N* trained models are used to predict output of the new sample. Thus, the outputs from the 1st level (separated-models) learning are used as inputs for another model that is ‘stacked’ upon the other models. This will lead to a layer-chain of models. Thereby, the 2nd level model is used to predict the actual output for the new samples. In most cases, it is expected that the model will outperform each of the individual models, due to its smoothing nature, and the capability of selectivity between each case model at regions where it performs best, and avoids other regions where it performs poorly. Eventually, this will make stacking the most effective when base model predictions are significantly different.

#### 2.4.4 Boosting Mode

Boosting involves a family of algorithms which transform weak learners into effective learners. This method deals with weak learner’s models such as decision trees. It functions by combining the predictions via a weighted majority vote (classification) or a weighted sum (regression), to generate the final prediction. It differs from bagging as the base learners are trained in a sequenced manner and on a weighted version of the data.

## 3 Applied Methodology: JP Device Apparatus

As described in the introduction section, a JP is a passive apparatus, thus reluctant to change in operating conditions. Investigative work from Mifsud et al. [51], clearly demonstrates that the performance of a JP apparatus is dictated by its internal geometric features, mainly the injection, entraining and mixing bodies.

These three bodies include unique geometrical features, which vary in both shape and clearance under different flow conditions; namely, type of fluid and more specifically the fluid properties such as fluid density, viscosity, compressibility and diffusivity, mutually dictating the hydrodynamic behaviour. Thus, a gas-driving-gas JP varies in design from a liquid-driving-liquid JP or a liquid-driving-gas JP. However, this work focuses specifically on a JP application having a dual-phase (water and air) HP fluid driving a relative low-pressure air. An analysis of experimental results from unpublished work showed that under dual-phase operating conditions, there exists no linear behaviour which correlates the design parameters against entrainment performance.

A total of five different nozzle bodies were used in this study. A brief description of each injection bodyis provided in Table 1 and accompanied by the schematics of four injection bodies (M-01 to M-04) shown in Fig. 3, while the design of model M-05 and M-10 cannot be shown to protect the proprietary nature of the design. Figure 3 highlights the differences attributed to the number of orifice holes and their positioning.

Details about the considered test-setup models

Model | Injection body description | Flow |
---|---|---|

M-01 | Standard converging-nozzle (single-orifice) | No swirl-induced flow |

M-02 | Converging-diverging nozzle (single-orifice) | No swirl-induced flow |

M-03 | Converging-diverging nozzle (multiple-orifice) horizontally drilled | No swirl-induced flow |

M-04 | Converging-diverging nozzle (multiple-orifice) drilled at an inclination angle of \(5^{\circ }\) | No swirl-induced flow |

M-05* | A bi-nozzle design configuration (multiple-orifices) | No swirl-induced flow |

M-06 | Standard converging-nozzle (single-orifice) | Swirl-induced flow |

M-07 | Converging-diverging nozzle (single-orifice) | Swirl-induced flow |

M-08 | Converging-diverging nozzle (multiple-orifice) horizontally drilled | Swirl-induced flow |

M-09 | Converging-diverging nozzle (multiple-orifice) drilled at an inclination angle of \(5^{\circ }\) | Swirl-induced flow |

M-10* | A bi-nozzle design configuration (multiple-orifices) | Swirl-induced flow |

### 3.1 JP Device Design Analysis

As the non-discrete variables N and S are neither continuous nor discrete, all learning models (both level-1 and level-2) were distinctively developed for each model. This led to the reduction of the number of input response variables, from five to three, as illustrated in Fig. 6.

Also, for further simplification, the bi-response methodology included in Fig. 5, was reduced to a single output. This step was justified on the basis that entrainment ratio will always tend to increase while the magnitude of pressure vibration decreases. The entrainment ratio was preferably selected over magnitude of pressure vibration for the main reason that the focus of this study considers the design parameters which have a significance on the performance of the LP pressure and/or LP/secondary flowrate.

#### 3.1.1 Approximations for the JP Device Design Problem

The sample data for all learning models was obtained from a unique experimental data-set comprising tests performed on a dual-phase (water and air) facility located in the Process Systems Engineering Lab at Cranfield University, UK. Such experimental data-set includes a total of 1440 tests-setups combinations, which comprised both 100 % liquid-water and dual-phase (\(0 \, \le \, \hbox {GVF} \, \le 50\)) motive fluid flows driving a secondary gas-air flow. All experiments were performed at high-pressures below 8 bara.

The whole data-set was divided into 10 data sub-sets. The sets of data were discretised according to the type of injector body, for the JP configurations equipped with or without spin body mechanism. This categorised the data into 10 unique models (M-01...M-10), as listed in Table 2. Note that the range of the nozzle-to-throat clearance X comprised eight divisions, (0.1, 0.25, 0.6, 0.9, 1.4, 1.8, 2.8 and 4), 3 throat-inlet angle At were considred for (\(13^{\circ }\), \(30^{\circ }\) and \(50^{\circ }\)), and the motive fluid GVFs included the values 0, 10, 20, 30, 40 and 50.

Applicable test matrix for this work unique data set

Non-discrete variables | Discrete design variables | ||||
---|---|---|---|---|---|

Injection body | Swirl flow | Nozzle to throat-inlet ratio | Throat-inlet converging angle [\(^\circ\)] | HP fluid gas volume fraction [%] | |

No | Yes | [X] | [At] | [GVF] | |

M-01 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-02 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-03 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-04 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-05 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-06 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-07 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-08 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-09 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 | |

M-10 | \({\surd}\) | 0.1–4 | 1330 & 50 | 0–50 |

Specific details on the bisection of sample and response data in each learning model are provided and discussed discussed in Sections 3.1.2 to 3.1.5. In each case study, including a single JP model M, a sub-set comprising an array of 144 samples and responses was used for processing each learning model. The 144 samples and responses were selected according to the combinations illustrated in Fig. 7.

The respective ranges of each discrete design variables were set according to specific justifications. The range of the secondary-nozzle/throat inlet angle At, includes an upper-bound limit of \(50^{\circ }\), being an optimal design angle for liquid-driving liquid JP applications. Thus, a lower bound (low as reasonable possible) and intermediate angles of \(13^{\circ }\) and \(30^{\circ }\), were then considered.

The range of the nozzle-to-throat clearance X varied between a ratio of 0.1 and 4. This cophered the well known design ratios (as applicable for gas-gas, liquid-liquid and liquid-gas JP applications) and beyond. Also, the fact that a swirl induced flow was included in half the tests setups, lower values of X were considered than used for gas-gas applications. In the latter cases, optimal values of X can even go down below 0.4. Besides, more frequent values of X were considered to avoid black spots due to high sensitivity, even for small increment of X.

Lastly, the range denoting the values of GVF, cophered a range of HP fluid compositions, including 100 % liquid motive flow and a combination of liquid dominant two-phase (water-air) motive fluid compositions. Particularly, one should consider that some practical intuition was also applied based on practical real field applications (mainly for selection the range of GVF), and limitations were considered due to difficulties to manufacturing the designed components (mainly for selecting the range of At).

#### 3.1.2 RSM for the JP Device Study

Having three variables denoting a performance parameter, makes it extremely difficult to illustrate the responses on 3-D surface plots. It appeared complex enough to illustrate the non-linear behaviour between the gas volume fraction GVF and the nozzle-to-throat clearance X. A clear example which demonstrates the complex correlation between the JP design variables when under dual-phase flow conditions, is given in Fig. 8. The three sub-figures present three sets of surface plots of the same model (M-01), but having a different throat-inlet angle [At]. Thus Fig. 8a–c denote cases having throat angle At of \(13^\circ\), \(30^\circ\) and \(55^{\circ }\) respectively. Each set comprises two plots, one for the case without swirl-induced flow and the other with swirl-induced flow. However, plotting the 3rd variable resulted in an obscure surface plot. It was concluded that such complex behaviour could only be exemplified via polynomial equations.

A pair of polynomial expressions are generated for each JP model design, a first expression including a device body without the swirl-body mechanism, and a second device body, having the same injection body as the first one, but amalgamated with the swirl-body mechanism. Eventually a total of 10 expressions covers all 5 JP models.

A reference to the resulting \({R}^2\), \(R^2_{\mathrm {adjusted}}\), and root mean square error for each response surface model given in Table 6 of “Appendix 2”, results high (ideal close to 1) \({R}^2\), \(R^2_{\mathrm {adjusted}}\) and low (ideal close to 0) RMSE values. Thus, the response surface models appear to capture a large portion of the observed variance, resulting in an acceptable good fit.

#### 3.1.3 Kriging Models for the Jet Pump Device Study

For the Kriging models, a Gaussian correlation of either 0th, 1st or 2nd order regression function was applied, while Eq. (10) was used for the local deviations. It was also noted that a single \(\theta\) parameter was insufficient to model the data accurately, thus a simulated annealing algorithm was used to determine the maximum likelihood estimates (MLEs) for the three \(\theta\) parameters needed (one for each variable) to generate the best Kriging model.

The optimal \(\theta\) parameters values for each case study are given in Table 3. This was accomplished via Eq. (11), and simulations were performed and executed via a dedicated generated script written in MATLAB. Eventually, the Kriging models were identified once all parameters for the Gaussian correlation function and the 118 sample data points were obtained.

*R*values for the Kriging models are given in Table 6 of “Appendix 2”.

Theta parameters for Kriging models for all case studies

Models 01-10 for case studies 01-10 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|

\(\theta\)* | M-01 | M-02 | M-03 | M-04 | M-05 | M-06 | M-07 | M-08 | M-09 | M-10 |

\({\sigma }_X\) | 8.75 | 3.53 | 20 | 2.77 | 3.42 | 11.49 | 10 | 20 | 20 | 6.59 |

\({\sigma }_{GVF}\) | 4.35 | 2.33 | 3.24 | 1.13 | 1.39 | 1.44 | 1.34 | 3.99 | 2.18 | 1.65 |

\({\sigma }_{At}\) | 1.25 | 1.44 | 2.10 | 0.29 | 0.31 | 0.24 | 1.25 | 1.71 | 0.95 | 1.09 |

#### 3.1.4 Neural Network Models for the Jet Pump Device Study

For the Neural Network models, a script was generated by the Neural Fitting application provided in MATLAB 2018. The input and response samples data sets were randomly selected by the syntax ‘dividerand’ and divided into three main categories: training, validation and testing. Each category was specifically assigned a portion out of the whole dataset, allowing 118 samples for training, 7 samples for validation, and 19 samples for testing. The training samples are presented to the network during training, while the network adjusts in correspondence to its error. The validation samples are used to measure network generalisation and to stop training when generalisation stops improving, while testing offers an autonomous measure of network performance during and after training.

The Bayesian Regularisation backpropagation training function was used via the syntax ‘trainbr’. This type of algorithm typically requires more time than the Levenberg-Marquardt algorithm and the Scaled Conjugate-Gradient algorithm. Though, such algorithms have the potential to result in good generalisation for difficult, small or noisy datasets. Using the Bayesian Regularization algorithm, training stops according to adaptive weight minimisation (regularization).

**j**

*X*, of the performance syntax ‘perf’ with respect to the weight and the bias variables

*X*. Each variable is adjusted according to Levenberg-Marquardt. Further details about the Bayesian regularisation can be found in MacKay [52] and Foresee and Hagan [53]. The mean squared error performance function syntax ‘mse’ was applied to the model performance function. The mean squared error is the average squared difference between outputs and targets. Lower values are better. Additionally, the number of hidden layers as denoted by the letter ‘N’ in Fig. 9, varied according to the results illustrated in the model accuracy analysis section. The first trial consisted of N = 15. Regression

*R*values are presented and discussed in section 4.2.

#### 3.1.5 Model Averaging and Stacking for the Jet Pump Device Study

For the ensemble-stacking model, a more advanced command script (more complex than for the single- model build-up) was formulated and executed in MATLAB 2018.

The applied method for the segregation of sample data, involved a different procedure than applied in the other three former discussed models. To formulate the out-of-sample predictions, data were divided similarly as done for the well-known ‘K-fold’ cross-validation method.

*N*th fold and the highlighted cells include the holdout sample.

Opting to select the cross-validation method was based on the fact that the out-of-sample predictions sustain a higher chance of capturing distinct regions where each model performs the best.

## 4 Results and Discussions for Global Approximations

### 4.1 Learning Model’s Accuracy Analysis

In this study, the absolute error (AE) was selected as the main loss function to estimate the accuracy of the learning models. Figures 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, and 37 in “Appendix 3” illustrate the boxplots of absolute error for the RSM, Kriging and RBFANN models, as applicable for the JP models; for both with and without swirl-body mechanism. Table 4 details the varied parameters in each learning model.

Table 5 includes the optimal parameter values, resulting from the presented box-plots for each JP models (M-01–M-10).

Varied model settings to estimate the accuracy of the surrogate models

Model | Varied setting | Model varied settings | |||
---|---|---|---|---|---|

Response surface | Degree of polynomial | 1st | 2nd | 3rd | – |

Kriging | Degree of regression function | 0th | 1st | 2nd | – |

Neural network | Number of hidden neurons/layers | 15 | 30 | 50 | 80 |

Optimal model settings based on absolute error (AE) as a loss function, to estimate the accuracy of each model

Model type | Optimal model settings for models M-01–M-10 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|

M-01 | M-02 | M-03 | M-04 | M-05 | M-06 | M-07 | M-08 | M-09 | M-10 | |

RSM | 3rd | 3rd | 3rd | 3rd | 3rd | 3rd | 3rd | 3rd | 3rd | 3rd |

Kriging | 2nd | 1st | 2nd | 0th | 2nd | 1st | 2nd | 2nd | 2nd | 2nd |

RBFANN | 45 | 45 | 60 | 45 | 45 | 45 | 30 | 45 | 45 | 30 |

### 4.2 Error Analysis of the RSM, Kriging and RBFANN Models

*y*(26 samples used in each model) and the predicted values \({\hat{y}}\) from either the RSM, Kriging, RBFANN or ensemble models. The accuracy of the 26 validation points for all models was estimated via numerical error analysis equations, given from Eqs. (18) to (22).

- 1.MaxAPE—Maximum Absolute Percent ErrorLower values of MaxAPE indicate lower difference, thus variance tends to decrease as \(\left[ {{MaxAPE\, {\mathop {\longrightarrow }\limits ^{yields}}}}0\right]\).$$\begin{aligned} MaxAPE=100\left| \frac{y_i-{{\hat{y}}}_i}{y_i}\right| \end{aligned}$$(18)
- 2.MAPE—Mean Absolute Percentage ErrorThe absolute value in this calculation is summed up for every predicted value and divided by the number of testing points. Note that in this work, the mean absolute error was calculated, thus Eq. (19) was divided by 100. Also, similarly to MaxAPE, MAPE indicates lower difference, thus variance reduces as \(\left[ {{MAPE\, {\mathop {\longrightarrow }\limits ^{yields}}}}0\right]\).$$\begin{aligned} MAPE=\frac{100}{n}\sum \limits ^n_{i=1}{\left| \frac{y_i-{{\hat{y}}}_i}{y_i}\right| } \end{aligned}$$(19)
- 3.
RMSE—Root Mean Square Error

The RMSE quantifies the residuals (prediction errors) between predicted and observed values. As given in Eq. (20), this estimator aggregates the individual error magnitudes into a single measure of predictive power. Lower values of RMSE indicate lower variance; ideally \(RMSE \rightarrow 0\).$$\begin{aligned} RMSE=\sqrt{\frac{\sum \nolimits ^n_{i=1}{{\left( {{\hat{y}}}_i-y_i\right) }^2}}{n}} \end{aligned}$$(20)
- 4.
R Squared (\({{R}}^{{2}}\))

This is known as the coefficient of determination (or coefficient of multiple determination here, since the models involve multiple regression). It measures the correlation between outputs and targets via Eq. (21), where \(e_i\) denotes the residual for each \({y}_i\) and \({\overline{y}}\) is the mean of the observed data.$$\begin{aligned} R^2= & {} 1-\frac{Sum\, of\, Squares\, of\, Residuals}{Total\, Sum\, of\, Squares}\nonumber \\= & {} \, 1-\frac{\sum \nolimits _i{e^2_i}}{\sum \nolimits _i {{\left( y_i-{\overline{y}}\right) }^2}} \end{aligned}$$(21)
- 5.
Adjusted R Squared (\({{R}}^{{2}}_{{adj}}\))

The adjusted \(R^2\), developed by Henri Theil (1961), includes a modification that accounts for the number of explanatory terms in a model relative to the number of data points:$$\begin{aligned} R^2_{adj}=1-\left( 1-R^2\right) \frac{n-1}{n-p-1} \end{aligned}$$(22)where *p* is the total number of variables and *n* denotes the sample size.

For both regression *R* values, a close relationship is indicated by results close to 1. The \({R}^2_{adj}\) result is always less than or equal to that of \({R}^2\).
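As a concrete illustration, the five error measures of Eqs. (18) to (22) can be computed as in the following sketch. This is a minimal Python fragment for illustration only (the study itself used MATLAB); the function name `surrogate_error_metrics` is hypothetical.

```python
import numpy as np

def surrogate_error_metrics(y, y_hat, p):
    """Error measures of Eqs. (18)-(22) over n validation points, p input variables."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = y.size
    ape = 100.0 * np.abs((y - y_hat) / y)               # absolute percentage errors
    max_ape = ape.max()                                 # Eq. (18)
    mape = ape.mean()                                   # Eq. (19)
    rmse = np.sqrt(np.mean((y_hat - y) ** 2))           # Eq. (20)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                          # Eq. (21)
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)   # Eq. (22)
    return max_ape, mape, rmse, r2, r2_adj
```

Applied to the 26 validation points of each JP model, such a routine reproduces the error columns compared in the following figures.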

Figure 12 shows a collective set of three plots: Fig. 12a includes the \(R^2\) errors, Fig. 12b the *RMSE* errors, and Fig. 12c the MaxAPE errors. The comparison of errors exhibits consistent behaviour across all three error methodologies.

### 4.3 Results Comparison of Actual Against Predicted via Scatter Plots for RSM, Kriging, RBFANN and Ensemble Models

Generally, it can be noted that in most cases the points are scattered symmetrically around the \(45^{\circ }\) diagonal line and fit within the (± 10 %) error band. However, the scattering tends to increase in cases involving a swirl-induced flow. As expected, this can be attributed to the complex hydrodynamic behaviour of dual-phase flow inside the JP, and it further emphasises the non-linearity between the design parameters and the JP performance when operated under dual-phase flow conditions. Once again, all cases showed that the RSM registered the highest scattering, while the RBFANN and ensemble models resulted in minimal scattering and were highly similar to one another. Another point of interest is the plotting behaviour of the ensemble model: its overall results drastically reduced the scattering and smoothed the predictions in areas where other models failed to perform within the (± 10 %) error band.

### 4.4 Optimisation Heuristics

Several types of optimisation algorithms exist; some can handle both constrained and unconstrained optimisation, while others are able to perform only one of the two.

In this work, constrained optimisation was applied. This comprises the process of optimising an objective function with respect to some variables in the presence of constraints on those variables. Since the objective function is to be maximised, the negative of the process function \(f\left( x\right)\) is taken in the constrained minimisation problems. Hard constraints set conditions on the variables that are required to be satisfied [54].
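The sign convention above can be sketched as follows. This is an illustrative Python fragment, not the study's MATLAB code; `maximise_by_minimising` and the toy response `f` are hypothetical names, and a crude candidate search stands in for a real solver.

```python
# Sketch: maximising a response by minimising its negative, as done for the
# entrainment-ratio objective in the constrained minimisation problems.
def maximise_by_minimising(f, candidates):
    """Return the candidate x that minimises -f(x), i.e. maximises f(x)."""
    return min(candidates, key=lambda x: -f(x))

f = lambda x: 5.0 - (x - 3.0) ** 2          # toy response with its maximum at x = 3
best = maximise_by_minimising(f, [i * 0.1 for i in range(61)])
```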

Three optimisation algorithms well suited to engineering-related problems include: (1) the multiple response desirability approach, (2) the interior-point algorithm (IPA), and (3) the augmented Lagrangian genetic algorithm (ALGA). Also, a combination of the latter two optimisation algorithms can form a hybrid function, which in most cases turns out to be more robust and accurate than single-optimisation algorithms. A brief overview of the interior-point and augmented Lagrangian genetic algorithms (the two optimisation methods applied in this study) is given hereunder.

#### 4.4.1 Interior-Point Algorithm

The interior-point algorithm comprises a variety of solvers capable of solving both linear and nonlinear convex optimisation problems with inequality constraints. The solver seeks *x*, the local minimum of a scalar function \(f\left( x\right)\), subject to the set of constraints on the allowable *x*.

At each iteration, the algorithm approximates *f* with a simpler function \(q(\cdot )\) so as to better resolve the behaviour of the function *f* in a neighbourhood around the point *x*; this neighbourhood is referred to as the trust region [55, 56]. As provided in Eq. (24), this is computed as a sub-problem in parallel to the main minimisation problem.
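The barrier idea underlying interior-point solvers can be illustrated with a minimal one-dimensional sketch. This plain-Python fragment uses hypothetical names (`barrier_minimise`) and a crude safeguarded Newton inner loop; MATLAB's 'interior-point' solver used in the study is far more sophisticated, but the core mechanism is the same: the inequality constraint enters as a logarithmic barrier that blows up at the boundary, and the barrier weight is progressively tightened.

```python
import math

def barrier_minimise(f, g, x0, mu=1.0, outer=20, shrink=0.5, newton=25):
    """Log-barrier sketch for min f(x) subject to g(x) <= 0, scalar x."""
    x = x0
    assert g(x) < 0, "start strictly inside the feasible region"
    for _ in range(outer):
        phi = lambda t: f(t) - mu * math.log(-g(t))   # barrier sub-problem
        for _ in range(newton):
            h = 0.01 * -g(x)                          # step small vs. boundary distance
            d1 = (phi(x + h) - phi(x - h)) / (2 * h)  # numerical first derivative
            d2 = (phi(x + h) - 2 * phi(x) + phi(x - h)) / h ** 2
            step = d1 / d2 if d2 > 0 else d1
            while g(x - step) >= 0:                   # damp steps leaving the interior
                step *= 0.5
            x -= step
        mu *= shrink                                  # tighten the barrier each pass
    return x
```

For example, minimising \((x-2)^2\) subject to \(x \le 1\) drives the iterate to the constrained optimum at \(x = 1\) from inside the feasible region.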

#### 4.4.2 ALGA: Augmented Langrangian Genetic Algorithm

The genetic algorithm (GA) is a method able to solve both constrained and unconstrained problems and involves a natural selection process that imitates biological evolution. It can solve problem types unsuited to 'standard' optimisation algorithms, namely stochastic, nonlinear and discontinuous problems.

The solving procedure is a continual process: the algorithm modifies the population of individual solutions by picking random individuals (via random number generators), referred to as 'parents', and using them to produce the 'children' of the next generation. This procedure is repeated until the population evolves towards the optimal solution [57].

By default, the genetic algorithm uses the Augmented Lagrangian Genetic Algorithm (ALGA) to solve nonlinear problems without integer constraints. In the ALGA formulation, *m* and *mt* describe the number of nonlinear inequality constraints and the total number of nonlinear constraints respectively, the *s* are non-negative shifts, and \(\rho\) is a positive penalty parameter.
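The selection/crossover/mutation loop described above can be sketched as follows. This is a toy real-coded GA in Python for illustration only (the study used MATLAB's ALGA, which additionally handles nonlinear constraints via the augmented Lagrangian terms); all names here are hypothetical.

```python
import random

def genetic_maximise(fitness, bounds, pop_size=40, gens=60, mut=0.1, seed=1):
    """Toy GA: tournament selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]   # initial population
    for _ in range(gens):
        def parent():                                      # tournament of size 2
            a, b = rng.choice(pop), rng.choice(pop)
            return a if fitness(a) > fitness(b) else b
        children = []
        for _ in range(pop_size):
            p1, p2 = parent(), parent()
            w = rng.random()
            child = w * p1 + (1 - w) * p2                  # blend crossover
            child += rng.gauss(0.0, mut)                   # Gaussian mutation
            children.append(min(hi, max(lo, child)))       # clamp to the bounds
        pop = children                                     # next generation
    return max(pop, key=fitness)
```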

#### 4.4.3 Hybrid (Fmincon and Genetic Algorithm)

The hybrid optimisation method comprises a combination of two or more single-optimisation algorithms. Whenever a hybrid optimisation problem is to be solved, a hybrid function first needs to be formulated; a typical function contains the order of execution of each algorithm. When the ALGA stops, the hybrid function then starts from the final point returned by the genetic algorithm.

In this work, 'fminunc' was set as the hybrid function, automatically called to initiate its execution from the optimised point found by the former method. Since 'fminunc' has its own options structure, an additional argument has to be provided when specifying the hybrid function. The hybrid function can improve the accuracy of the solution.
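The two-stage idea, a global evolutionary search whose best point seeds a local solver, can be sketched as follows. This Python fragment is a conceptual stand-in only (a crude random search replaces the GA, and a golden-section search replaces the gradient-based solver); `hybrid_maximise` and its internals are hypothetical names.

```python
import random

def hybrid_maximise(f, lo, hi, seed=0):
    """Hybrid sketch: global stage picks a start point, local stage refines it."""
    rng = random.Random(seed)
    # Stage 1: crude global search (stand-in for the genetic algorithm).
    x0 = max((rng.uniform(lo, hi) for _ in range(200)), key=f)
    # Stage 2: local refinement around x0 (stand-in for the local solver).
    a, b = max(lo, x0 - 0.5), min(hi, x0 + 0.5)
    g = (5 ** 0.5 - 1) / 2                       # golden-section ratio
    for _ in range(60):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) > f(d):                          # keep the side holding the maximum
            b = d
        else:
            a = c
    return (a + b) / 2
```

The local stage inherits the global stage's final point, mirroring how the hybrid function in MATLAB starts from the point returned by the genetic algorithm.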

### 4.5 Optimisation Results using the RSM, Kriging and RBFANN Models

As a final comparison of the accuracy of the individual and ensemble learning models, 6 pairs of optimisation problems were formulated and then solved for each JP model; the term 'pair' signifies that a total of 12 optimisation problems were formulated and solved per JP model. In all the optimisation problems, the entrainment ratio was set to be maximised at specific GVFs. This led to a total of 60 optimisation problems.

As provided in Tables 7, 8, 9, 10, and 11 in "Appendix 4", the set objective functions denote traditional, single-objective/discipline optimisation problems. Note that each of the six boxes, namely 'Prob. #1M-X & M-X', includes a common objective function for the (a) swirl and (b) no-swirl flow cases respectively. In each optimisation problem, constraints are placed on the maximum and minimum allowable values of the responses not forming part of the objective function. Each optimisation problem is formulated and solved using three optimisation algorithms, all adopted to solve the developed nonlinear optimisation models within the MATLAB platform: (1) the 'interior-point' algorithm, (2) the Augmented Lagrangian genetic algorithm (ALGA) and (3) a hybrid formulation combining the former two.

Each optimisation is solved 4 times: firstly for the RSM, secondly for Kriging, thirdly for the RBFANN and finally for the ensemble models. In each case, three different starting points (the lower, middle and upper bounds) are used for each objective function to assess the number of analysis and gradient calls necessary to obtain the optimum design. To proceed with the three types of optimisation algorithms, four separate predictive functions (one for each learning model) were created. In the case of the RSM, a dedicated script was generated to formulate a predictive function, which was later called in the respective optimisation algorithms. In the case of Kriging, the predictor function (based on the developed model, referred to as 'dmodel') was generated by the ooDACE toolbox, while for the RBFANN, a dedicated function called 'MyNeuralNetworkFunction' was set to be automatically generated during the execution of the neural network model when using the MATLAB neural network toolbox application. For the ensemble model, the same procedure as for Kriging was followed, but this time a new model, namely 'dmodel2', was generated, containing samples derived from the combination of the model-averaging and stacking procedures.
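The stacked-generalisation step, combining level-1 predictions through a level-2 model fitted on validation data, can be illustrated with a minimal sketch. The study's level-2 model was Kriging; a linear least-squares combiner is used here purely to show the data flow, and `stack_predictions` is a hypothetical name.

```python
import numpy as np

def stack_predictions(level1_val, y_val, level1_new):
    """Fit level-2 combiner weights on validation data, then combine new
    level-1 predictions. level1_val/level1_new: one sequence per level-1 model."""
    A = np.column_stack(level1_val)                  # columns: one per level-1 model
    w, *_ = np.linalg.lstsq(A, np.asarray(y_val, float), rcond=None)
    return np.column_stack(level1_new) @ w           # level-2 combined prediction
```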

During the procedures applied for solving all optimisation problems, the negative of the predictive response function was taken in order to maximise the objective function. This approach was adopted because the MATLAB toolbox software always seeks the minimum of the fitness function.

The results of all 60 optimisation problems using all discussed learning models are summarised in Tables 12, 13, 14, 15, and 16 of “Appendix 4”. Note, that each table includes results for two JP bodies, thus a single nozzle body with and without swirl.

From the numerical results (Tables 8, 9, 10, 11, and 12), it can be noticed that, in general, the optimisation requires fewer iterations and/or generations for the RSM than for the Kriging and RBFANN models. Note that iterations for the level-2 ensemble model are not included in the tables, but results were identical to the level-1 Kriging model. The variation in computational time and iterations is attributed to the complexity of the respective model: fewer iterations were needed for the RSM's simple 3rd order polynomial equations, while more iterations and generations were required for the Kriging models [comprising nonlinear equations as given in Eqs. (4) to (12)] and the RBFANN models. Nevertheless, the computational expense for all sets of approximations still lies in the order of seconds per evaluation. The optimum designs obtained from the RSM, Kriging, RBFANN and ensemble models are largely identical for each objective function. However, there are some drastic variations in both X and At when maximising the entrainment ratio Er using the interior-point algorithm based on predicted results from the RSM.

A comparison between the level-1 Kriging models and the level-2 ensemble-Kriging models in the error residual plots demonstrated that the ensemble-Kriging models predicted the optimum values with an error of less than 10 % in all cases, and less than 4 % in 90 % of the results.

### 4.6 Conclusions

This study has demonstrated the use of four learning-models for constructing global approximations to facilitate single-discipline nonlinear design optimisation.

The accuracies of each set of approximations were compared via numerical error analysis, graphical analysis and the tested ability to generate accurate solutions for 60 different optimisation problems. It was found that the response surfaces registered the highest model-fit error of all models. Neither 1st nor 2nd order polynomial models proved capable of modelling the nonlinear performance behaviour of the dual-phase-fluid driving JP. Eventually, it was found that the 3rd order response surface models could approximate the nonlinear design space within a reasonable margin of error. However, instabilities arose when considering higher order polynomials, which may have resulted from a lack of sample points to estimate all coefficients of the polynomial equation.

Kriging models in conjunction with a Gaussian correlation function (comprising either a 0th, 1st or 2nd order regression function) yielded global approximations which were slightly more accurate than the RSM.

The RBFANN models (with the number of hidden neurons varied to its optimum for each case) showed a drastic decrease in prediction error. However, this improvement was not registered throughout all prediction ranges for all JP model cases.

Ultimately, the ensemble models (combining the model-averaging and stacking methodologies) gave the best overall performance throughout all prediction ranges. These results illustrate the benefits of the stacked generalisation approach, whereby model information is combined, including information from level-1 models with poor approximation capabilities. However, the performance increase attributed to the ensemble approach is not computationally cheap: the added complexity and additional evaluations slowed down the modelling process.

Furthermore, a comparison between the three optimisation algorithms, namely: (a) the 'interior-point' algorithm, (b) the Augmented Lagrangian genetic algorithm (ALGA) and (c) a hybrid formulation, has been performed across the 60 optimisation problems. The three algorithms produced similar results for a given global approximation, but results varied across the four types of applied global approximation methods.

As expected, higher accuracy and consistency were obtained for the optimisation algorithms based on the predicted data from both RBFANN and ensemble models.

The ensemble model (having Kriging as the level-2 learning model) estimated optimised design parameters which closely matched the actual data, proving it to be a powerful model-based optimiser. Thus, in situations such as the dual-phase driving-fluid JP, where the data or the structure of the fitness function are highly nonlinear, the stacked generalisation approach might be an adequate first approach.

This comprehensive study should serve as a model-based optimiser tool to assist in the design of dual-phase surface JPs, in particular for cases involving a dual-phase fluid composition as the driving fluid. Also, owing to the nature of the input model variables, mainly the nozzle-to-throat clearance X (considered an adjustable parameter), such a model-based optimisation tool has the potential to be implemented for on-line control purposes.


## References

- 1. Jones DR (2001) A taxonomy of global optimization methods based on response surfaces. J Glob Optim 21:345–383
- 2. ESDU (1985) Ejectors and jet pumps. Design and performance for incompressible liquid flows
- 3. Mali PV, Singh R, De S, Bhatta M (1999) Downhole ESP & surface multiphase pump—cost effective lift technology for isolated and marginal offshore field development. In: SPE Asia Pacific oil and gas conference and exhibition, Society of Petroleum Engineers
- 4. Lastra R, Johnson I (2005) Feasibility study on application of multiphase pumping towards zero gas flaring in Nigeria. In: Nigeria annual international conference and exhibition, Society of Petroleum Engineers
- 5. Peeran SM, Beg DN, Sarshar S (2013) Novel examples of the use of surface jet pumps (SJPs) to enhance production & processing. Case studies and lessons learned
- 6. Kajero OT, Thorpe RB, Chen T, Wang B, Yao Y (2016) Kriging meta-model assisted calibration of computational fluid dynamics models. AIChE J 62:4308–4320
- 7. Shimizu Y, Nakamura S, Kuzuhara S, Kurata S (1987) Studies of the configuration and performance of annular type jet pumps. J Fluids Eng 109:205–212
- 8. Yang P, Chen H, Liu Y-W (2017) Application of response surface methodology and desirability approach to investigate and optimize the jet pump in a thermoacoustic stirling heat engine. Appl Therm Eng 127:1005–1014
- 9. Lyu Q, Xiao Z, Zeng Q, Xiao L, Long X (2016) Implementation of design of experiment for structural optimization of annular jet pumps. J Mech Sci Technol 30:585–592
- 10. Di Piazza A, Di Piazza MC, Vitale G (2009) A kriging-based partial shading analysis in a large photovoltaic field for energy forecast. In: International conference on renewable energies and power quality (ICREPQ'09), Valencia, Spain
- 11. Simpson T, Mistree F, Korte J, Mauery T (1998) Comparison of response surface and kriging models for multidisciplinary design optimization. In: 7th AIAA/USAF/NASA/ISSMO symposium on multidisciplinary analysis and optimization, p 4755
- 12. Shyy W, Papila N, Vaidyanathan R, Tucker K (2001) Global design optimization for aerodynamics and rocket propulsion components. Prog Aerosp Sci 37:59–118
- 13. Simpson TW, Mauery TM, Korte JJ, Mistree F (2001) Kriging models for global approximation in simulation-based multidisciplinary design optimization. AIAA J 39:2233–2241
- 14. Luo J, Lu W (2014) Comparison of surrogate models with different methods in groundwater remediation process. J Earth Syst Sci 123:1579–1589
- 15. Box GE, Draper NR (1987) Empirical model-building and response surfaces. Wiley, New York
- 16. Morris MD, Mitchell TJ (1995) Exploratory designs for computational experiments. J Stat Plan Inference 43:381–402
- 17. Myers RH, Montgomery DC, Anderson-Cook CM (2016) Response surface methodology: process and product optimization using designed experiments. Wiley
- 18. Myers RH, Montgomery DC et al (1995) Response surface methodology: process and product optimization using designed experiments, vol 3. Wiley, New York
- 19. Lin DK, Tu W (1995) Dual response surface optimization. J Qual Technol 27:34–39
- 20. Myers RH (1999) Response surface methodology—current status and future directions. J Qual Technol 31:30–44
- 21. Myers RH, Montgomery DC, Vining GG, Borror CM, Kowalski SM (2004) Response surface methodology: a retrospective and literature survey. J Qual Technol 36:53–77
- 22. Sacks J, Schiller SB, Welch WJ (1989) Designs for computer experiments. Technometrics 31:41–47
- 23. Lophaven SN, Nielsen HB, Søndergaard J (2002a) DACE: a Matlab kriging toolbox, vol 2. Citeseer, Princeton
- 24. Lophaven SN, Nielsen HB, Søndergaard J (2002b) A Matlab kriging toolbox, version 2.0. Technical University of Denmark, Kgs. Lyngby
- 25. Couckuyt I, Dhaene T, Demeester P (2012) ooDACE toolbox. Adv Eng Softw 49:1–13
- 26. Matheron G (1963) Principles of geostatistics. Econ Geol 58:1246–1266
- 27. Cressie NA (1993) Statistics for spatial data, Revised edn. Wiley, New York
- 28. Queipo NV, Haftka RT, Shyy W, Goel T, Vaidyanathan R, Tucker PK (2005) Surrogate-based analysis and optimization. Prog Aerosp Sci 41:1–28
- 29. Martin JD, Simpson TW (2005) Use of kriging models to approximate deterministic computer models. AIAA J 43:853–863
- 30. Koehler J, Owen A (1996) Computer experiments. Handb Stat 13:261–308
- 31. McKay MD, Beckman RJ, Conover WJ (1979) Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21:239–245
- 32. Osio IG, Amon CH (1996) An engineering design methodology with multistage Bayesian surrogates and optimal sampling. Res Eng Des 8:189–206
- 33. Deutsch CV, Journel AG (1992) GSLIB: geostatistical software library and user's guide. Oxford University Press, Oxford
- 34. Emery X (2005) Simple and ordinary multigaussian kriging for estimating recoverable reserves. Math Geol 37:295–319
- 35. Bayraktar H, Turalioglu FS (2005) A kriging-based approach for locating a sampling site in the assessment of air quality. Stoch Env Res Risk Assess 19:301–305
- 36. Zimmerman D, Pavlik C, Ruggles A, Armstrong MP (1999) An experimental comparison of ordinary and universal kriging and inverse distance weighting. Math Geol 31:375–390
- 37. Brus DJ, Heuvelink GB (2007) Optimization of sample patterns for universal kriging of environmental variables. Geoderma 138:86–95
- 38. Sampson PD, Richards M, Szpiro AA, Bergen S, Sheppard L, Larson TV, Kaufman JD (2013) A regionalized national universal kriging model using partial least squares regression for estimating annual PM2.5 concentrations in epidemiology. Atmos Environ 75:383–392
- 39. Myung IJ (2003) Tutorial on maximum likelihood estimation. J Math Psychol 47:90–100
- 40. Broomhead DS, Lowe D (1988) Radial basis functions, multi-variable functional interpolation and adaptive networks. Technical Report, Royal Signals and Radar Establishment, Malvern, United Kingdom
- 41. Tinós R, Júnior LOM (2009) Use of the q-Gaussian function in radial basis function networks. In: Foundations of computational intelligence, vol 5. Springer, pp 127–145
- 42. Park J, Sandberg IW (1991) Universal approximation using radial-basis-function networks. Neural Comput 3:246–257
- 43. Gneiting T, Raftery AE (2005) Weather forecasting with ensemble methods. Science 310:248–249
- 44. Giorgi F, Mearns LO (2002) Calculation of average, uncertainty range, and reliability of regional climate changes from AOGCM simulations via the "reliability ensemble averaging" (REA) method. J Clim 15:1141–1158
- 45. Wintle BA, McCarthy MA, Volinsky CT, Kavanagh RP (2003) The use of Bayesian model averaging to better represent uncertainty in ecological models. Conserv Biol 17:1579–1590
- 46. Sloughter JM, Gneiting T, Raftery AE (2010) Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. J Am Stat Assoc 105:25–35
- 47. Zerpa LE, Queipo NV, Pintos S, Salager J-L (2005) An optimization methodology of alkaline-surfactant-polymer flooding processes using field scale numerical simulation and multiple surrogates. J Petrol Sci Eng 47:197–208
- 48. Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. J Artif Intell Res 11:169–198
- 49. Caruana R, Niculescu-Mizil A, Crew G, Ksikes A (2004) Ensemble selection from libraries of models. In: Proceedings of the twenty-first international conference on machine learning. ACM, p 18
- 50. Bartz-Beielstein T (2016) Stacked generalization of surrogate models—a practical approach. Bibliothek der Technischen Hochschule Köln
- 51. Mifsud D, Cao Y, Verdin P, Lao L (2018) The hydrodynamics of two-phase flows in the injection part of a conventional ejector. Int J Multiph Flow. https://doi.org/10.1016/j.ijmultiphaseflow.2018.10.007
- 52. MacKay DJ (1992) Bayesian interpolation. Neural Comput 4:415–447
- 53. Foresee FD, Hagan MT (1997) Gauss-Newton approximation to Bayesian learning. In: Proceedings of the international conference on neural networks (ICNN'97), vol 3. IEEE, pp 1930–1935
- 54. Nocedal J, Wright SJ (1999) Numerical optimization. Springer series in operations research. Springer, New York
- 55. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
- 56. Koh K, Kim S-J, Boyd S (2007) An interior-point method for large-scale l1-regularized logistic regression. J Mach Learn Res 8:1519–1555
- 57. Conn AR, Gould NI, Toint P (1991) A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J Numer Anal 28:545–572

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.