Abstract
The application scenario investigated in this paper is bank credit scoring based on a Gradient Boosting classifier. It is shown how one may exploit hyperparameter optimization based on the Bayesian Optimization paradigm. All the evaluated methods use a Gaussian Process surrogate model but differ in their kernel and acquisition functions. The main purpose of the research presented herein is to confirm experimentally that it is reasonable to tune both the kernel function and the acquisition function when Bayesian Optimization is applied to Gradient Boosting hyperparameters. Moreover, the paper provides results indicating that, at least in the investigated application scenario, the superiority of some of the evaluated Bayesian Optimization methods over others strongly depends on the available optimization budget.
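The procedure outlined in the abstract — a Gaussian Process surrogate whose kernel and acquisition function are both configurable choices — can be sketched in a few dozen lines. The following is a minimal illustrative implementation, not the paper's code: the RBF kernel, the Expected Improvement acquisition function, the 1-D toy objective (a stand-in for cross-validated classifier error as a function of the learning rate), and all constants are assumptions made for the sketch.

```python
import numpy as np
from math import erf

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential (RBF) covariance between two sets of 1-D points."""
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y_train
    cov = K_ss - K_s.T @ K_inv @ K_s
    return mu, np.sqrt(np.maximum(np.diag(cov), 1e-12))

def expected_improvement(mu, sigma, y_best, xi=0.01):
    """EI acquisition for minimization: expected gain over the incumbent."""
    z = (y_best - mu - xi) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (y_best - mu - xi) * cdf + sigma * pdf

def objective(lr):
    """Hypothetical CV error as a function of the learning rate (to minimize)."""
    return (lr - 0.3) ** 2 + 0.05 * np.sin(15.0 * lr)

grid = np.linspace(0.01, 1.0, 200)       # candidate learning rates
x_obs = np.array([0.05, 0.5, 0.95])      # initial design points
y_obs = objective(x_obs)

for _ in range(10):                      # small optimization budget
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    ei = expected_improvement(mu, sigma, y_obs.min())
    x_next = grid[np.argmax(ei)]         # most promising candidate under EI
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

best_lr = x_obs[np.argmin(y_obs)]
print(f"best learning rate found: {best_lr:.3f}")
```

Swapping `rbf_kernel` for a Matérn covariance, or `expected_improvement` for Probability of Improvement or GP-UCB, changes only the two functions above — which is exactly the design space the paper's experiments explore, together with the size of the budget in the loop.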
Acknowledgments
This work was supported by the Polish National Science Centre, grant DEC-2011/01/D/ST6/06788, and by Poznan University of Technology under grant 04/45/DSPB/0163.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Szwabe, A. (2018). Kernel and Acquisition Function Setup for Bayesian Optimization of Gradient Boosting Hyperparameters. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science(), vol 10751. Springer, Cham. https://doi.org/10.1007/978-3-319-75417-8_28
Print ISBN: 978-3-319-75416-1
Online ISBN: 978-3-319-75417-8