Abstract
In Chap. 1, our empirical analysis was based on neural networks with a single hidden layer. These networks, called shallow, are in theory universal approximators of any continuous function. Deep neural networks instead use a cascade of multiple hidden layers, each successive layer taking the output of the previous layer as input. As with shallow networks, many issues can arise with naively trained deep networks; two common ones are overfitting of the training dataset and increased computation time. We present techniques for limiting the computation time and regularization methods for avoiding overfitting. We then explain why deep neural networks outperform shallow networks for approximating hierarchical binary functions. The chapter concludes with a numerical illustration.
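The cascade structure described above, together with a ridge-type penalty on the deviance, can be sketched as follows. This is a minimal illustration in Python with NumPy, not the chapter's own code; the layer sizes, toy data, and function names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layers(sizes, rng):
    """Random (weight, bias) pairs for a cascade of fully connected layers."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Each successive layer uses the previous layer's output as input."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:      # nonlinear activation on hidden layers only
            x = np.tanh(x)
    return x

def penalized_loss(x, y, layers, lam):
    """Squared-error deviance plus an L2 (ridge) penalty lam * sum of w^2."""
    err = forward(x, layers) - y
    penalty = sum(np.sum(W ** 2) for W, _ in layers)
    return float(np.mean(err ** 2) + lam * penalty)

# Toy setup: 2 inputs -> two hidden layers (3 and 4 neurons) -> 1 output.
X = rng.normal(size=(10, 2))
y = rng.normal(size=(10, 1))
layers = init_layers([2, 3, 4, 1], rng)

loss_unpenalized = penalized_loss(X, y, layers, lam=0.0)
loss_penalized = penalized_loss(X, y, layers, lam=0.1)
```

With `lam > 0` the penalty shrinks the coefficients during training, which is one of the regularization devices used to limit overfitting.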
Notes
Notice that the discrepancy between deviances for λ = 0 is explained by the random initialization of the network coefficients: different starting values lead to different deviances after 1500 iterations.
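The effect described in this note can be reproduced with a toy one-hidden-layer fit, sketched here in Python; the model, data, learning rate, and 200-iteration budget are illustrative assumptions, not the chapter's actual experiment.

```python
import numpy as np

def deviance(W, X, y):
    """Mean squared error of a tiny one-hidden-layer network (sum of tanh units)."""
    return float(np.mean((np.tanh(X @ W).sum(axis=1) - y) ** 2))

# Toy regression problem.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = np.sin(3.0 * X).ravel()

losses = []
for seed in (1, 2):                       # two different random initializations
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0, (1, 5))      # random initial coefficients
    for _ in range(200):                  # plain gradient descent
        g = np.zeros_like(W)
        base = deviance(W, X, y)
        for idx in np.ndindex(W.shape):   # crude finite-difference gradient
            Wp = W.copy()
            Wp[idx] += 1e-6
            g[idx] = (deviance(Wp, X, y) - base) / 1e-6
        W -= 0.1 * g
    losses.append(deviance(W, X, y))
```

Because the objective is non-convex, the two runs end at different points after the same number of iterations, so the final deviances differ even though the fitting procedure is identical.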
© 2019 Springer Nature Switzerland AG
Cite this chapter
Denuit, M., Hainaut, D., Trufin, J. (2019). Deep Neural Networks. In: Effective Statistical Learning Methods for Actuaries III. Springer Actuarial. Springer, Cham. https://doi.org/10.1007/978-3-030-25827-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25826-9
Online ISBN: 978-3-030-25827-6
eBook Packages: Mathematics and Statistics (R0)