# Concurrent surrogate model selection (COSMOS): optimizing model type, kernel function, and hyper-parameters

## Abstract

This paper presents an automated surrogate model selection framework called Concurrent Surrogate Model Selection (COSMOS). Unlike most existing techniques, COSMOS operates coherently at three levels: 1) selecting the model type (e.g., RBF or Kriging), 2) selecting the kernel function type (e.g., cubic or multiquadric kernel in RBF), and 3) determining the optimal values of the typically user-prescribed hyper-parameters (e.g., shape parameter in RBF). Model quality is determined and compared using measures of median and maximum error given by the Predictive Estimation of Model Fidelity (PEMF) method, a robust implementation of sequential *k*-fold cross-validation. The selection process follows either a cascaded approach over the three levels or a more computationally efficient one-step approach that solves a mixed-integer nonlinear programming problem. Genetic algorithms are used to perform the optimal selection. Applied to benchmark test functions, COSMOS yielded optimal model choices that agree well with those obtained by analyzing the model errors on a large set of additional test points. Across the four analytical benchmark problems and three practical engineering applications (airfoil design, window heat transfer modeling, and building energy modeling), diverse forms of models and kernels are selected as optimal choices. These observations further establish the need for automated multi-level model selection guided by dependable measures of model fidelity.
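To make the one-step idea concrete, the following is a minimal sketch of selecting a kernel and a shape parameter together by cross-validated error. It is not the authors' MATLAB implementation and rests on several assumptions: plain *k*-fold cross-validation stands in for PEMF, exhaustive enumeration of a small candidate grid stands in for the genetic algorithm, only one model type (a 1-D RBF interpolant without a polynomial tail) is considered, and the function names (`kernel`, `solve`, `fit_rbf`, `cv_errors`, `select`) are hypothetical.

```python
import math
import statistics

def kernel(name, r, c):
    """Radial kernel value at distance r; c is the shape parameter."""
    if name == "cubic":
        return r ** 3                        # cubic kernel has no shape parameter
    if name == "multiquadric":
        return math.sqrt(r * r + c * c)
    raise ValueError(f"unknown kernel: {name}")

def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]      # raises ZeroDivisionError if singular
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rbf(xs, ys, name, c):
    """Return a 1-D RBF interpolant (assumed form, no polynomial tail) as a callable."""
    a = [[kernel(name, abs(xi - xj), c) for xj in xs] for xi in xs]
    w = solve(a, ys)
    return lambda x: sum(wi * kernel(name, abs(x - xi), c) for wi, xi in zip(w, xs))

def cv_errors(xs, ys, name, c, k=5):
    """Plain k-fold cross-validation (a stand-in for PEMF): (median, max) absolute error."""
    errs = []
    for fold in range(k):
        test = set(range(fold, len(xs), k))  # interleaved folds
        tr_x = [x for i, x in enumerate(xs) if i not in test]
        tr_y = [y for i, y in enumerate(ys) if i not in test]
        try:
            model = fit_rbf(tr_x, tr_y, name, c)
        except ZeroDivisionError:            # singular system: reject this candidate
            return math.inf, math.inf
        errs += [abs(model(xs[i]) - ys[i]) for i in test]
    return statistics.median(errs), max(errs)

def select(xs, ys, shapes):
    """Score every (kernel, shape) candidate; lowest median error wins, max error breaks ties."""
    candidates = [("cubic", None)] + [("multiquadric", c) for c in shapes]
    scored = [(cv_errors(xs, ys, name, c), (name, c)) for name, c in candidates]
    return min(scored, key=lambda s: s[0])

if __name__ == "__main__":
    xs = [i / 19 for i in range(20)]
    ys = [math.sin(2 * math.pi * x) + x for x in xs]   # illustrative test function
    (median_err, max_err), (name, c) = select(xs, ys, [0.1, 0.3, 1.0])
    print(name, c, median_err, max_err)
```

The kernel choice is a discrete variable and the shape parameter a continuous one, which is why the one-step formulation is a mixed-integer problem. A small grid keeps this sketch short; a larger search space would call for an optimizer such as the genetic algorithm named in the abstract.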

## Keywords

Automated surrogate model selection; Hyper-parameter optimization; Kriging; Mixed-integer non-linear programming (MINLP); Predictive estimation of model fidelity (PEMF); Radial basis functions (RBF); Support vector regression (SVR)

## Notes

### Acknowledgements

Support from the National Science Foundation (NSF) Awards CMMI-1642340 and CNS-1524628 is gratefully acknowledged. Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the NSF.

### Author Contributions

The core concepts underlying PEMF and COSMOS were conceived, implemented, and tested (in MATLAB) by Ali Mehmani and Souma Chowdhury, with important conceptual contributions from Achille Messac regarding the surrogate modeling paradigm. The airfoil design and building peak cooling models in this paper were developed and implemented by Ali Mehmani, with support from Christoph Meinrenken on the latter.
