Abstract
Deep Learning architectures have been extensively studied in recent years, mainly due to their discriminative power in Computer Vision. However, one problem with such models concerns their number of parameters and hyperparameters, which can easily reach the hundreds of thousands. Additional drawbacks are their need for extensive training datasets and their proneness to overfitting. Recently, the simple idea of randomly disconnecting neurons from a network during training, known as Dropout, has proven to be a promising solution, although it requires an adequately set hyperparameter. Therefore, this work addresses the problem of finding suitable Dropout ratios through meta-heuristic optimization in the task of image reconstruction. Energy-based Deep Learning architectures, such as Restricted Boltzmann Machines and Deep Belief Networks, and several meta-heuristic techniques, such as Particle Swarm Optimization, the Bat Algorithm, the Firefly Algorithm, and Cuckoo Search, were employed in this context. The experimental results demonstrate the feasibility of using meta-heuristic optimization to find suitable Dropout parameters on three datasets from the literature and reinforce bio-inspired optimization as an alternative to empirically chosen regularization hyperparameters.
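To make the procedure concrete, the following is a minimal, self-contained sketch of the idea, assuming a toy Bernoulli-Bernoulli RBM with Dropout applied to its hidden units and a bare-bones Particle Swarm Optimization over the single dropout ratio. Everything here (data, layer sizes, training schedule, PSO coefficients) is an illustrative assumption, not the authors' implementation; their actual source code is linked in the Notes below.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_reconstruction_error(p_drop, data, n_hidden=32, epochs=10, lr=0.1):
    """Fitness function: train a toy Bernoulli RBM with hidden-unit Dropout
    via one-step Contrastive Divergence and return the reconstruction MSE."""
    n_visible = data.shape[1]
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    b = np.zeros(n_visible)  # visible bias
    c = np.zeros(n_hidden)   # hidden bias
    for _ in range(epochs):
        # Dropout mask over hidden units, redrawn each epoch.
        mask = (rng.random(n_hidden) >= p_drop).astype(float)
        h_prob = sigmoid(data @ W + c) * mask
        h_state = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_state @ W.T + b)
        h_recon = sigmoid(v_recon @ W + c) * mask
        # CD-1 gradient approximation.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        b += lr * (data - v_recon).mean(axis=0)
        c += lr * (h_prob - h_recon).mean(axis=0)
    v_final = sigmoid(sigmoid(data @ W + c) @ W.T + b)
    return np.mean((data - v_final) ** 2)

# Toy binary "images" standing in for MNIST-like data.
data = (rng.random((64, 100)) < 0.3).astype(float)

# Plain PSO over the single decision variable p in [0.0001, 1].
n_particles, n_iters = 10, 15
pos = rng.uniform(1e-4, 1.0, n_particles)
vel = np.zeros(n_particles)
pbest = pos.copy()
pbest_fit = np.array([rbm_reconstruction_error(p, data) for p in pos])
gbest = pbest[np.argmin(pbest_fit)]
w_in, c1, c2 = 0.7, 1.4, 1.4  # inertia and acceleration coefficients
for _ in range(n_iters):
    r1, r2 = rng.random(n_particles), rng.random(n_particles)
    vel = w_in * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 1e-4, 1.0)
    fit = np.array([rbm_reconstruction_error(p, data) for p in pos])
    better = fit < pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[np.argmin(pbest_fit)]

print(f"best dropout ratio found: {gbest:.4f}")
```

In principle, the same loop accommodates the other meta-heuristics mentioned above by swapping the position-update rule, and a DBN variant would presumably optimize one ratio per stacked layer.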
Acknowledgments
The authors would like to thank the São Paulo Research Foundation (FAPESP), grants #2013/07375-0, #2014/12236-1, #2017/25908-6, #2019/02205-5, #2019/07825-1, and #2019/07665-4, and the National Council for Scientific and Technological Development (CNPq), grants #307066/2017-7 and #427968/2018-6.
Notes
1. Regarding the source code, we used the implementation available on GitHub: https://github.com/gugarosa/dropout_rbm.
2. A ratio constrained to the interval [0.0001, 1] with a step size of 0.0001 generates 10,000 possibilities (see the worked count after these notes).
3. Note that these values were empirically chosen according to their authors' definitions to mitigate additional influences over the Dropout regularization.
4. Note that an x-layer architecture uses x sequential values from \(a\)-\(b\)-\(c\), with \(x \in [1,3]\); for instance, a 2-layer model uses \(a\) and \(b\) hidden neurons. Regarding RBMs, we opted to use 400 and 2,000 hidden neurons.
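As a quick sanity check on Note 2 (our arithmetic, not taken from the paper), the grid count works out as

\[
  \frac{1 - 0.0001}{0.0001} + 1 = 9{,}999 + 1 = 10{,}000
\]

candidate Dropout ratios in the interval \([0.0001, 1]\).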
Cite this paper
de Rosa, G.H., Roder, M., Papa, J.P. (2021). Fine-Tuning Dropout Regularization in Energy-Based Deep Learning. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds.) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol. 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_10