Abstract
Hyper-parameter tuning is one of the crucial steps in the successful application of machine learning algorithms to real data. The tuning process is generally modeled as an optimization problem, for which several methods have been proposed. For complex algorithms, evaluating a single hyper-parameter configuration is expensive, so the runtime of the tuning process is often reduced through data sampling. In this paper, the effect of sample size on the results of hyper-parameter tuning is investigated. The hyper-parameters of Support Vector Machines are tuned on samples of different sizes generated from a dataset, and the Hausdorff distance is proposed for measuring the difference between the tuning results obtained on two samples of different sizes. Experiments on 100 real-world datasets with two tuning methods (Random Search and Particle Swarm Optimization) reveal interesting relations between sample size and tuning results, which open promising directions for future investigation.
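The comparison idea in the abstract can be sketched in code: treat the best configurations found by two tuning runs as point sets and compute the symmetric Hausdorff distance between them. This is a minimal illustration, not the paper's implementation; the example point sets and the (log2 C, log2 gamma) coordinate convention are assumptions for illustration only.

```python
import math

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets A and B."""
    def euclid(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    def directed(X, Y):
        # For each point in X, find the distance to its nearest neighbour
        # in Y, then take the worst (largest) of those distances.
        return max(min(euclid(x, y) for y in Y) for x in X)

    return max(directed(A, B), directed(B, A))

# Hypothetical sets of best (log2 C, log2 gamma) SVM configurations,
# e.g. one found on a small sample and one on the full dataset.
small = [(1.0, -3.0), (2.0, -4.0)]
full = [(1.0, -3.0), (5.0, -4.0)]
print(hausdorff(small, full))  # prints 3.0
```

A small Hausdorff distance would indicate that tuning on the sample and on the full dataset recovered similar regions of the hyper-parameter space.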
Tomáš Horváth is also a member of the Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University in Košice, Slovakia.
Notes
- 2. Particle swarm optimization has been successfully applied to partially irregular or noisy optimization problems and often finds good solutions because it makes no assumptions about the search landscape.
- 5. Referring to the parameter K of Algorithm 1, which was set to 30 in the experiments.
- 6. Supported by the Brazilian funding agencies CAPES, CNPq, and the São Paulo Research Foundation FAPESP (CeMEAI-FAPESP process 13/07375-0 and grant #2012/23114-9), and by the Slovak project VEGA 1/0475/14.
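The landscape-agnostic behaviour described in note 2 can be sketched with a minimal particle swarm optimizer: particles only ever evaluate the objective at candidate points and blend their own best position with the swarm's best, so no gradient or smoothness assumption is needed. This is an illustrative sketch, not the `pso` R package used in the paper; the objective, bounds, and all parameter values are assumptions.

```python
import random

def pso(f, bounds, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f over a box given as [(lo, hi), ...] per dimension."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # each particle's best position
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm's best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus attraction toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Example: minimise a simple 2-D quadratic with minimum at (1, -2).
best, val = pso(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2,
                [(-5, 5), (-5, 5)])
```

In the hyper-parameter tuning setting, `f` would be the cross-validated error of an SVM trained with the configuration encoded by `p`.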
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Horváth, T., Mantovani, R.G., de Carvalho, A.C.P.L.F. (2017). Effects of Random Sampling on SVM Hyper-parameter Tuning. In: Madureira, A., Abraham, A., Gamboa, D., Novais, P. (eds) Intelligent Systems Design and Applications. ISDA 2016. Advances in Intelligent Systems and Computing, vol 557. Springer, Cham. https://doi.org/10.1007/978-3-319-53480-0_27
Print ISBN: 978-3-319-53479-4
Online ISBN: 978-3-319-53480-0