
Comparative study of HDMRs and other popular metamodeling techniques for high dimensional problems

  • RESEARCH PAPER
Structural and Multidisciplinary Optimization

A Correction to this article was published on 24 August 2018

This article has been updated

Abstract

Over the past decades, metamodeling techniques have improved the efficiency of optimization for high dimensional problems across multidisciplinary applications. In this study, four popular metamodeling methods are compared in terms of accuracy on high dimensional problems: Kriging (KRG), radial basis function (RBF), least squares support vector regression (LSSVR) and cut-high dimensional model representation (cut-HDMR). In addition, HDMR methods with different basis functions are considered, including KRG-HDMR, RBF-HDMR and SVR-HDMR. Four factors that might influence the quality of the metamodels are examined: parameter interaction of the problem, sample size, noise level and sampling strategy. The results show that LSSVR with a Gaussian kernel, using the Latin hypercube sampling (LHS) strategy, constructs more accurate metamodels than KRG. The RBF with a Gaussian basis function performs poorly within the group. Generally, cut-HDMR methods perform much better than the other metamodeling methods on functions with weak parameter interaction, but not on functions with strong parameter interaction.




Change history

  • 24 August 2018

    The original version of this article unfortunately contains 2 mistakes. The authors wish to revise the mistaken figures (Figure 15b, Figure 17a, Figure 17b, Figure 18a, and Figure 18b) and a mistaken description in the article to improve their work; see the corrections below.


Abbreviations

R : Correlation function
x : Sample point
θ : Correlation coefficient
x : Vector of sample point
\( \widehat{y} \) : Prediction of sample point
β : Weight coefficient
ϕ : Predefined polynomial basis function
z : Realization of stochastic process
Var : Variance
ψ : Basis functions
J : Cost function
β : A vector of weights
γ : Penalty parameter
e : Error variable
y : Evaluation of sample point
D : A set of sample points
φ : Nonlinear mapping
b : Model offset
α : Lagrange multiplier
K : Kernel function
σ : Tuning parameter
R : Design domain
p : Vector consisting of linear variable terms
E, F : Coefficients
N : Standard Gaussian distribution
η : Random number sampled
c : Center points
δ : Side length
g : Unit vector
w : Best function values
j, k : Point index
n : Dimension
t : Point index
m : Number of sample points

References

  • Backlund PB, Shahan DW, Seepersad CC (2012) A comparative study of the scalability of alternative metamodelling techniques. Eng Optim 44(7):767–786

  • Bai YC, Han X, Jiang C et al (2012) Comparative study of metamodeling techniques for reliability analysis using evidence theory. Adv Eng Softw 53:61–71

  • Bratley P, Fox BL (1988) Algorithm 659: Implementing Sobol's quasirandom sequence generator. ACM Transactions on Mathematical Software (TOMS) 14(1):88–100

  • Chen L, Li E, Wang H et al (2016) Time-based reflow soldering optimization by using adaptive Kriging-HDMR method. Soldering & Surface Mount Technology 28(2):101–113

  • Clarke SM, Griebsch JH, Simpson TW (2005) Analysis of support vector regression for approximation of complex engineering analyses. J Mech Des 127(6):1077–1087

  • Díaz-Manríquez A, Toscano G, Coello CAC (2017) Comparison of metamodeling techniques in evolutionary algorithms. Soft Comput 21(19):5647–5663

  • Faure H (1992) Good permutations for extreme discrepancy. Journal of Number Theory 42(1):47–56

  • Galanti S, Jung A (1997) Low-discrepancy sequences: Monte Carlo simulation of option prices. The Journal of Derivatives 5(5):63–83

  • Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer Math 2(1):84–90

  • Hardy RL (1971) Multiquadric equations of topography and other irregular surfaces. J Geophys Res 76(8):1905–1915

  • Jin R, Chen W, Simpson TW (2001) Comparative studies of metamodelling techniques under multiple modelling criteria. Struct Multidiscip Optim 23(1):1–13

  • Jones DR (2001) Direct global optimization algorithm. In: Encyclopedia of optimization. Springer, Boston, MA, pp 431–440

  • Kalagnanam JR, Diwekar UM (1997) An efficient sampling technique for off-line quality control. Technometrics 39(3):308–319

  • Kim BS, Lee YB, Choi DH (2009) Comparison study on the accuracy of metamodeling technique for non-convex functions. J Mech Sci Technol 23(4):1175–1181

  • Koivunen AC, Kostinski AB (1998) The feasibility of data whitening to improve performance of weather radar. J Appl Meteorol 38(6):741–749

  • Kostinski AB, Koivunen AC (2000) On the condition number of Gaussian sample-covariance matrices. IEEE Transactions on Geoscience & Remote Sensing 38(1):329–332

  • Lee Y, Oh S, Choi DH (2008) Design optimization using support vector regression. J Mech Sci Technol 22(2):213

  • Li E, Wang H, Li G (2012) High dimensional model representation (HDMR) coupled intelligent sampling strategy for nonlinear problems. Comput Phys Commun 183(9):1947–1955

  • Liu H, Xu S, Wang X (2016) Sampling strategies and metamodeling techniques for engineering design: comparison and application. In: ASME Turbo Expo 2016: Turbomachinery Technical Conference and Exposition. American Society of Mechanical Engineers, V02CT45A019

  • McKay MD, Beckman RJ, Conover WJ (1979) Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239–245

  • Paiva RM, Carvalho ARD, Crawford C et al (2010) Comparison of surrogate models in a multidisciplinary optimization framework for wing design. AIAA J 48(5):995–1006

  • Rabitz H, Aliş ÖF (1999) General foundations of high-dimensional model representations. J Math Chem 25(2-3):197–233

  • Sacks J, Welch WJ, Mitchell TJ et al (1989) Design and analysis of computer experiments. Stat Sci 4:409–423

  • Shan S, Wang GG (2010a) Metamodeling for high dimensional simulation-based design problems. J Mech Des 132(5):051009

  • Shan S, Wang GG (2010b) Survey of modeling and optimization strategies to solve high-dimensional design problems with computationally-expensive black-box functions. Struct Multidiscip Optim 41(2):219–241

  • Sobol IM (1976) Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys 16(5):236–242

  • Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300

  • Toshimitsu H, Andrea S (1996) Importance measure in global sensitivity analysis of nonlinear models. Reliab Eng Syst Saf 52(1):1–17

  • Viana FAC et al (2014) Special section on multidisciplinary design optimization: Metamodeling in multidisciplinary design optimization: How far have we really come? AIAA J 52(4):670–690

  • Wang H, Tang L, Li GY (2011) Adaptive MLS-HDMR metamodeling techniques for high dimensional problems. Expert Syst Appl 38(11):14117–14126

  • Zhou XJ, Jiang T (2016) Metamodel selection based on stepwise regression. Struct Multidiscip Optim 54(3):641–657

  • Zimmermann R (2015) On the condition number anomaly of Gaussian correlation matrices. Linear Algebra Appl 466:512–526

Acknowledgments

This work has been supported by the National Key Research and Development Program of China (2017YFB0203701) and by a project of the National Natural Science Foundation of China under Grant Number 11572120.

Author information

Corresponding author

Correspondence to Hu Wang.

Additional information

Responsible Editor: Erdem Acar

The original version of this article was revised: The original version of this article unfortunately contains mistaken figures.

Appendices

Appendix 1: Test functions

Appendix 2: Validation by cross validation

Generating large numbers of testing sample points is often prohibitive for practical simulation-based engineering problems. Researchers therefore often employ leave-one-out cross validation (LOOCV), since it needs only the training sample points to estimate a metamodel's accuracy. Accordingly, the generalized mean square error (GMSE) is taken as the accuracy criterion. The time that LOOCV takes is proportional to the number of sample points and depends on the efficiency of the metamodeling method: given a large number of sample points, a low-efficiency metamodeling method requires much time to estimate the metamodel accuracy. However, considering the effort needed to obtain each sample point from time-demanding FE or CFD analysis, the GMSE criterion is relatively time-saving compared with the RMSE measure, which requires additional testing points. It is therefore necessary to check whether the GMSE measure reflects the RMSE measure. Figure 14 depicts the average values of NRMSE and normalized GMSE (NGMSE) for 12 test functions after 50 replicas. Given large sample sets, three offline metamodeling methods with spline basis function and kernel functions are used. For each function, the discrepancy between the two accuracy criteria derived from the same metamodeling method is negligible, which indicates that LOOCV can also be used to verify the performance of metamodeling methods.

Fig. 14 Average values of NRMSE and normalized GMSE for the test functions after 50 replicas using KRG-Sp, RBF-Sp and LSSVR-Sp, with the LHS strategy generating large sample sets
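The leave-one-out procedure described in this appendix can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual definition of GMSE as the mean squared leave-one-out prediction error, and the Gaussian RBF fit (`fit_gaussian_rbf`, with hypothetical `width` and `ridge` parameters) is only a stand-in for whichever metamodel is being assessed.

```python
import numpy as np

def fit_gaussian_rbf(X, y, width=0.5, ridge=1e-8):
    """Stand-in surrogate (an assumption, not the paper's model): Gaussian RBF fit."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * width ** 2))
    # A small ridge term keeps the kernel matrix reasonably well conditioned.
    weights = np.linalg.solve(kernel(X, X) + ridge * np.eye(len(X)), y)
    return lambda Xq: kernel(Xq, X) @ weights

def loocv_gmse(X, y, fit=fit_gaussian_rbf):
    """Mean over i of (y_i - prediction at x_i from a model fitted without point i)^2."""
    m = len(X)
    sq_err = np.empty(m)
    for i in range(m):
        keep = np.arange(m) != i          # drop the i-th training point
        predict = fit(X[keep], y[keep])   # refit on the remaining m - 1 points
        sq_err[i] = (y[i] - predict(X[i:i + 1])[0]) ** 2
    return float(sq_err.mean())
```

The loop refits the metamodel m times, which is why the cost of LOOCV grows with both the number of sample points and the fitting cost of the chosen method.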

Appendix 3: Variability in performance of metamodels with multiple sampling strategies

This section aims to identify the variability in performance of metamodeling methods across multiple sample sets generated by the same sampling strategy (i.e., LHS or HSS). The relative error of each accuracy criterion, defined in (3.1) and (3.2), is introduced to represent the robustness of the metamodels. The smaller the relative error, the more robust the metamodel:

$$ {\mathrm{NRMSE}}_{\mathrm{Relative\ error}}=\left({\mathrm{NRMSE}}_{\max }-{\mathrm{NRMSE}}_{\min}\right)/{\mathrm{NRMSE}}_{\mathrm{mean}} $$
(3.1)
$$ {\mathrm{NMAX}}_{\mathrm{Relative\ error}}=\left({\mathrm{NMAX}}_{\max }-{\mathrm{NMAX}}_{\min}\right)/{\mathrm{NMAX}}_{\mathrm{mean}} $$
(3.2)
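Both definitions have the same form, so a single helper covers them. The sketch below assumes the accuracy values from the replicas are already collected in a sequence; the function name `relative_error` is chosen here for illustration.

```python
import numpy as np

def relative_error(values):
    """(max - min) / mean of an accuracy criterion over replicas, as in (3.1) and (3.2)."""
    v = np.asarray(values, dtype=float)
    return float((v.max() - v.min()) / v.mean())
```

Note that the measure captures spread only, not accuracy: NRMSE values of 0.99, 1.00 and 1.01 give a relative error of 0.02 even though every replica is inaccurate.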

Figures 15 and 16 show the relative errors of NRMSE and NMAX obtained from 50 replicas with the LHS and HSS strategies, respectively, generating small sample sets. With the LHS strategy, Fig. 15a shows that, in terms of NRMSE, RBF-Ga appears to be the most robust metamodeling method, followed by LSSVR-Sp and KRG, whereas RBF-Sp and LSSVR-Ga are not robust enough because of their large relative errors. However, RBF-Ga performs much worse on every function: each estimated NRMSE value approaches 1, which makes its relative error extremely small. LSSVR-Sp is therefore the most robust method, which is supported by Fig. 15b, where LSSVR-Sp has the smallest relative error of NMAX for most of the functions. With the HSS strategy, LSSVR-Sp remains the best alternative, and the overall outcome is similar to that with the LHS strategy, as presented in Fig. 16.

Fig. 15 Relative error of accuracy obtained from 50 replicas with LHS strategy generating small sample sets. a NRMSE. b NMAX

Fig. 16 Relative error of accuracy obtained from 50 replicas with HSS strategy generating small sample sets. a NRMSE. b NMAX

Similarly, Figs. 17 and 18 show the relative errors of the two measures with the LHS and HSS strategies, respectively, generating large sample sets. LSSVR-Sp remains the best and, compared with Figs. 15 and 16, becomes more robust as the number of sample points increases.

Fig. 17 Relative error of accuracy obtained from 50 replicas with LHS strategy generating large sample sets. a NRMSE. b NMAX

Fig. 18 Relative error of accuracy obtained from 50 replicas with HSS strategy generating large sample sets. a NRMSE. b NMAX

Appendix 4: Counting of the parameter interaction

The number of parameter interactions should be measured via a series representation of the form (16) rather than the previous definition. We therefore use the analysis of variance (ANOVA) to determine the number of parameter interactions of a test function. The implementation of ANOVA follows Toshimitsu and Andrea (1996). The importance to the output \( f\left(\mathbf{x}\right) \) of parameter \( {x}_i \), and of the interaction among parameters \( {x}_{i_1} \), \( {x}_{i_2} \),…, \( {x}_{i_s} \), can be quantitatively measured by the Sobol sensitivity indices \( {S}_i \) and \( {S}_{i_1{i}_2\cdots {i}_s} \), respectively. The interaction among parameters \( {x}_{i_1} \), \( {x}_{i_2} \),…, \( {x}_{i_s} \) can be regarded as insignificant if \( {S}_{i_1{i}_2\cdots {i}_s} \) equals or approaches zero.

Given an integrable function \( f\left(\mathbf{x}\right) \) defined in \( {I}^n \), the ANOVA representation of this function can be expressed as follows:

$$ f\left(\mathbf{x}\right)={f}_0+\sum \limits_{i=1}^n{f}_i\left({x}_i\right)+\sum \limits_{1\le i<j\le n}{f}_{ij}\left({x}_i,{x}_j\right)+\cdots +{f}_{12\cdots n}\left({x}_1,{x}_2,\cdots, {x}_n\right) $$
(4.1)

Assuming the sample points are independent among parameters, the estimation of \( {S}_{i_1{i}_2\cdots {i}_s} \) is defined as follows:

$$ {S}_{i_1{i}_2\cdots {i}_s}=\frac{D_{i_1{i}_2\cdots {i}_s}}{D} $$
(4.2)

where \( {D}_{i_1{i}_2\cdots {i}_s} \) and D respectively indicate the variance of \( {f}_{i_1{i}_2\cdots {i}_s} \) and f(x):

$$ {D}_{i_1{i}_2\cdots {i}_s}={\int}_{I^s}{f}_{i_1{i}_2\cdots {i}_s}^2{dx}_{i_1}{dx}_{i_2}\cdots {dx}_{i_s} $$
(4.3)
$$ D={\int}_{I^n}{f}^2\left(\mathbf{x}\right)d\mathbf{x}-{\widehat{f}}_0^2 $$
(4.4)

where \( {\widehat{f}}_0 \) is the mean of outputs of sample points.

All \( {S}_{i_1{i}_2\cdots {i}_s} \) are nonnegative and they sum to one:

$$ \sum \limits_{s=1}^n\sum \limits_{i_1<\cdots <{i}_s}^n{S}_{i_1{i}_2\cdots {i}_s}=1 $$
(4.5)
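As an illustrative check of (4.1)–(4.5), not taken from the paper, consider \( f\left({x}_1,{x}_2\right)={x}_1{x}_2 \) on \( {I}^2={\left[0,1\right]}^2 \) with uniformly distributed inputs. Its ANOVA terms are

$$ {f}_0=\frac{1}{4},\kern1em {f}_1\left({x}_1\right)=\frac{1}{2}\left({x}_1-\frac{1}{2}\right),\kern1em {f}_2\left({x}_2\right)=\frac{1}{2}\left({x}_2-\frac{1}{2}\right),\kern1em {f}_{12}\left({x}_1,{x}_2\right)=\left({x}_1-\frac{1}{2}\right)\left({x}_2-\frac{1}{2}\right) $$

so that the variances are

$$ {D}_1={D}_2=\frac{1}{48},\kern1em {D}_{12}=\frac{1}{144},\kern1em D=\frac{1}{9}-\frac{1}{16}=\frac{7}{144} $$

and the sensitivity indices are

$$ {S}_1={S}_2=\frac{3}{7},\kern1em {S}_{12}=\frac{1}{7},\kern1em {S}_1+{S}_2+{S}_{12}=1 $$

Here the first-order indices sum to only 6/7, so the second-order term carries a non-negligible share of the variance.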

Using the Monte Carlo (MC) method, \( {D}_{i_1{i}_2\cdots {i}_s} \) is approximated as follows:

$$ {D}_{i_1{i}_2\cdots {i}_s}\approx {\widehat{D}}_{i_1{i}_2\cdots {i}_s}-\sum {\widehat{D}}_{i_1{i}_2\cdots {i}_{s-1}}+\sum {\widehat{D}}_{i_1{i}_2\cdots {i}_{s-2}}-\cdots +{\left(-1\right)}^s{\widehat{f}}_0^2 $$
(4.6)

In the above equation, \( {\widehat{D}}_{i_1{i}_2\cdots {i}_s} \) is obtained by averaging products of two function values: one evaluated at a point \( \mathbf{u} \) with all the variables sampled, and the other at a point \( \mathbf{v} \) with all the variables re-sampled except \( {x}_{i_1} \), \( {x}_{i_2} \),…, \( {x}_{i_s} \), which keep their values from \( \mathbf{u} \). Mathematically:

$$ {\widehat{D}}_{i_1{i}_2\cdots {i}_s}=\frac{1}{N}\sum \limits_{m=1}^Nf\left({\mathbf{u}}_m\right)f\left({\mathbf{v}}_m,{\mathbf{x}}_m^{\mathbf{u}}\right) $$
(4.7)

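where \( \left({\mathbf{v}}_m,{\mathbf{x}}_m^{\mathbf{u}}\right) \) denotes the m-th row of the sample matrix \( \mathbf{v} \) in which the columns corresponding to variables \( {x}_{i_1} \), \( {x}_{i_2} \),…, \( {x}_{i_s} \) are replaced by those of the sample matrix \( \mathbf{u} \).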
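The estimator in (4.6) and (4.7) can be sketched for first- and second-order indices as follows. This is a minimal illustration under stated assumptions rather than the authors' code: inputs are sampled uniformly on the unit hypercube with NumPy's pseudo-random generator, `f` is assumed to accept an (N, n) array and return N outputs, \( {\widehat{f}}_0 \) is estimated from the sample `u`, and the names `d_hat` and `sobol_indices` are hypothetical.

```python
import numpy as np
from itertools import combinations

def d_hat(f, fu, u, v, idx):
    """Eq. (4.7): mean of f(u_m) * f(w_m), where w is v with the columns in idx taken from u."""
    w = v.copy()
    w[:, idx] = u[:, idx]
    return float(np.mean(fu * f(w)))

def sobol_indices(f, n, N=100_000, seed=0):
    """Monte Carlo estimates of first- and second-order indices via (4.2), (4.4), (4.6)."""
    rng = np.random.default_rng(seed)
    u = rng.random((N, n))                 # all variables sampled
    v = rng.random((N, n))                 # all variables re-sampled
    fu = f(u)
    f0_sq = float(fu.mean()) ** 2          # squared mean of the outputs
    D = float(np.mean(fu ** 2)) - f0_sq    # total variance, Eq. (4.4)

    d1 = {i: d_hat(f, fu, u, v, [i]) for i in range(n)}
    # Eq. (4.6) with s = 1: D_i = D_hat_i - f0^2
    S1 = {i: (d1[i] - f0_sq) / D for i in range(n)}

    S2 = {}
    for i, j in combinations(range(n), 2):
        d_ij = d_hat(f, fu, u, v, [i, j])
        # Eq. (4.6) with s = 2: D_ij = D_hat_ij - D_hat_i - D_hat_j + f0^2
        S2[(i, j)] = (d_ij - d1[i] - d1[j] + f0_sq) / D
    return S1, S2
```

For example, `sobol_indices(lambda x: x[:, 0] * x[:, 1], n=3)` should return first-order estimates near 3/7 for the first two variables and near 0 for the third, with a second-order estimate near 1/7 for the pair (0, 1). The estimates carry sampling noise, so small negative values can occur for indices that are exactly zero.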

In this study, we divide the test functions into two groups: low parameter interaction and high parameter interaction. The former group consists of test functions whose parameter interactions are of order no higher than 2, with \( 0.999\le \sum \limits_i{S}_i+\sum \limits_{i<j}{S}_{ij}\le 1 \). The latter group consists of test functions with parameter interactions of order 3 or higher, with \( \sum \limits_i{S}_i+\sum \limits_{i<j}{S}_{ij}<0.999 \). A simple procedure is therefore implemented to determine the parameter interaction of a test function.

To begin with, calculate the first-order global sensitivity indices (GSI) \( {S}_i \) based on (4.2). If \( \sum \limits_i{S}_i\ge 0.999 \), the parameter interaction of the function is 1; otherwise, continue by calculating the second-order GSI \( {S}_{ij} \). If \( \sum \limits_i{S}_i+\sum \limits_{i<j}{S}_{ij}\ge 0.999 \), the parameter interaction is 2; otherwise, the parameter interaction is 3 or higher.
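The decision procedure above can be written compactly as below. This is a sketch under the assumption that the index estimates are already available as plain sequences; the function name `interaction_order` and its arguments are illustrative, and the return value 3 stands for "3 or higher".

```python
def interaction_order(first_order, second_order=(), threshold=0.999):
    """Classify a test function by the order of its parameter interaction.

    first_order  : iterable of S_i estimates
    second_order : iterable of S_ij estimates (i < j), used only if needed
    Returns 1, 2, or 3 (where 3 means "3 or higher").
    """
    s1 = sum(first_order)
    if s1 >= threshold:
        return 1
    if s1 + sum(second_order) >= threshold:
        return 2
    return 3
```

Under the grouping used in this study, results 1 and 2 correspond to low parameter interaction and result 3 to high parameter interaction.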

The parameter interactions of twelve test functions are summarized in Table 9.


About this article


Cite this article

Chen, L., Wang, H., Ye, F. et al. Comparative study of HDMRs and other popular metamodeling techniques for high dimensional problems. Struct Multidisc Optim 59, 21–42 (2019). https://doi.org/10.1007/s00158-018-2046-8


  • DOI: https://doi.org/10.1007/s00158-018-2046-8
