
Fine-Tuning Dropout Regularization in Energy-Based Deep Learning

  • Conference paper
  • In: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (CIARP 2021)

Abstract

Deep Learning architectures have been extensively studied in recent years, mainly due to their discriminative power in Computer Vision. However, one problem with such models is their number of parameters and hyperparameters, which can easily reach hundreds of thousands. Additional drawbacks are their need for extensive training datasets and their high susceptibility to overfitting. Recently, the naïve idea of disconnecting neurons from a network, known as Dropout, has been shown to be a promising solution, although it requires an adequate hyperparameter setting. Therefore, this work addresses the problem of finding suitable Dropout ratios through meta-heuristic optimization in the task of image reconstruction. Several energy-based Deep Learning architectures, such as Restricted Boltzmann Machines and Deep Belief Networks, and several meta-heuristic techniques, such as Particle Swarm Optimization, the Bat Algorithm, the Firefly Algorithm, and Cuckoo Search, were employed in this context. The experimental results demonstrate the feasibility of using meta-heuristic optimization to find suitable Dropout parameters on three literature datasets and reinforce bio-inspired optimization as an alternative to empirically choosing regularization-based hyperparameters.
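To make the search procedure concrete, the minimal sketch below shows how a swarm-based optimizer in the spirit of Particle Swarm Optimization could tune a single Dropout ratio. It is not the authors' implementation (see the GitHub repository cited in the Notes); `reconstruction_error` is a hypothetical placeholder for training an energy-based model with a given ratio and measuring its reconstruction error on a validation set, and all hyperparameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def reconstruction_error(p: float) -> float:
    # Placeholder objective: in the paper's setting, this would train an
    # RBM/DBN with Dropout ratio `p` and return its reconstruction error.
    return (p - 0.5) ** 2 + 0.01 * rng.standard_normal()


def pso_dropout(n_particles=10, n_iters=30, bounds=(0.0001, 1.0),
                w=0.7, c1=1.5, c2=1.5):
    lo, hi = bounds
    x = rng.uniform(lo, hi, n_particles)   # particle positions (Dropout ratios)
    v = np.zeros(n_particles)              # particle velocities
    pbest_x = x.copy()                     # per-particle best positions
    pbest_f = np.array([reconstruction_error(p) for p in x])
    gbest_x, gbest_f = pbest_x[pbest_f.argmin()], pbest_f.min()

    for _ in range(n_iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        # Standard PSO velocity and position updates, clipped to the bounds.
        v = w * v + c1 * r1 * (pbest_x - x) + c2 * r2 * (gbest_x - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([reconstruction_error(p) for p in x])
        better = f < pbest_f
        pbest_x[better], pbest_f[better] = x[better], f[better]
        if f.min() < gbest_f:
            gbest_x, gbest_f = x[f.argmin()], f.min()

    return gbest_x, gbest_f


if __name__ == "__main__":
    best_p, best_err = pso_dropout()
    print(f"best Dropout ratio: {best_p:.4f} (objective {best_err:.4f})")
```

In the paper's setting, the same loop would be driven by any of the evaluated meta-heuristics (Particle Swarm Optimization, Bat Algorithm, Firefly Algorithm, or Cuckoo Search), differing only in how candidate ratios are updated.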

The authors would like to thank São Paulo Research Foundation (FAPESP) grants #2013/07375-0, #2014/12236-1, #2017/25908-6, #2019/02205-5, #2019/07825-1, and #2019/07665-4, and National Council for Scientific and Technological Development (CNPq) grants #307066/2017-7 and #427968/2018-6.


Notes

  1. Regarding the source code, we used the implementation available on GitHub: https://github.com/gugarosa/dropout_rbm.

  2. A ratio constrained to the interval [0.0001, 1] with a step size of 0.0001 yields 10,000 possibilities (see the snippet after these notes).

  3. Note that these values were chosen empirically, following their authors' definitions, to mitigate additional influences over the Dropout regularization.

  4. Note that an x-layer architecture uses x sequential values from \(a\)-\(b\)-\(c\), with \(x \in [1, 3]\) (illustrated in the snippet after these notes). Regarding RBMs, we opted to use 400 and 2,000 hidden neurons.
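The two footnotes above can be illustrated with a short, hypothetical snippet (not part of the authors' code): the first lines enumerate the discretized Dropout search space from note 2, and `hidden_layer_sizes` shows how an x-layer architecture would take the first x values of the a-b-c sequence from note 4, using placeholder values for a, b, and c.

```python
import numpy as np

# Note 2: Dropout ratios from 0.0001 to 1.0 in steps of 0.0001.
grid = np.round(np.arange(1, 10001) * 0.0001, 4)
print(grid.size)          # 10000 candidate ratios
print(grid[0], grid[-1])  # 0.0001 1.0

# Note 4: an x-layer architecture uses the first x values of the a-b-c
# sequence as hidden-layer sizes. The values below are placeholders,
# not the paper's settings.
a, b, c = 500, 500, 2000

def hidden_layer_sizes(x):
    assert 1 <= x <= 3, "the paper considers 1- to 3-layer architectures"
    return [a, b, c][:x]

print(hidden_layer_sizes(1))  # [500]
print(hidden_layer_sizes(3))  # [500, 500, 2000]
```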


Author information


Corresponding author

Correspondence to Gustavo H. de Rosa.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

de Rosa, G.H., Roder, M., Papa, J.P. (2021). Fine-Tuning Dropout Regularization in Energy-Based Deep Learning. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol. 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_10


  • DOI: https://doi.org/10.1007/978-3-030-93420-0_10


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93419-4

  • Online ISBN: 978-3-030-93420-0
