
Part of the book series: Springer Actuarial (SPACLN)

Abstract

In Chap. 1, our empirical analysis was based on neural networks with a single hidden layer. These networks, called shallow, are in theory universal approximators of any continuous function. Deep neural networks instead use a cascade of multiple hidden layers, each successive layer taking the output of the previous layer as its input. As with shallow networks, many issues can arise when deep networks are trained naively; two common ones are overfitting of the training dataset and increased computation time. We present techniques for limiting computation time and regularization methods for avoiding overfitting. We then explain why deep neural networks outperform shallow networks for approximating hierarchical binary functions. The chapter concludes with a numerical illustration.
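As a rough illustration of the kind of deep architecture the abstract describes, the sketch below stacks several hidden layers with the R interface to Keras (see Note 1) and adds two standard regularizers. The layer sizes, the tanh activation, the dropout rate, the L2 penalty and the Poisson deviance loss are assumptions chosen for illustration, not the authors' exact specification.

```r
# Minimal sketch of a deep feed-forward network in R/Keras
# (assumed layer sizes, activations, penalty and loss).
library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 20, activation = "tanh", input_shape = c(10),
              kernel_regularizer = regularizer_l2(0.001)) %>%  # L2 penalty against overfitting
  layer_dropout(rate = 0.1) %>%                                # dropout, another regularizer
  layer_dense(units = 15, activation = "tanh",
              kernel_regularizer = regularizer_l2(0.001)) %>%
  layer_dense(units = 10, activation = "tanh") %>%
  layer_dense(units = 1, activation = "exponential")           # positive output, e.g. a claim frequency

model %>% compile(
  loss = "poisson",    # Poisson deviance loss, a common actuarial choice
  optimizer = "adam"   # adaptive stochastic gradient method, limits computation time
)

# x_train / y_train are a hypothetical design matrix and response:
# history <- model %>% fit(x_train, y_train, epochs = 100,
#                          batch_size = 256, validation_split = 0.2)
```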


Notes

  1. https://keras.rstudio.com/index.html.

  2. https://www.tensorflow.org/.

  3. Notice that the discrepancy between deviances for λ = 0 is explained by the random initialization of the network coefficients: this randomization leads to different deviances after 1500 iterations (see the sketch below).
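The point in Note 3 can be reproduced with a short experiment: fitting the same unpenalized network (λ = 0) twice, with different random weight initializations, yields slightly different in-sample deviances. The sketch below is a toy example under stated assumptions (simulated Poisson data, arbitrary architecture and epoch count), not the chapter's numerical illustration.

```r
# Toy experiment (assumed simulated data, architecture and epochs).
library(keras)

set.seed(1)
x <- matrix(runif(5000 * 3), ncol = 3)        # 3 hypothetical features
y <- rpois(5000, lambda = exp(-1 + x[, 1]))   # simulated Poisson response

fit_once <- function(lambda) {
  model <- keras_model_sequential() %>%
    layer_dense(units = 10, activation = "tanh", input_shape = c(3),
                kernel_regularizer = regularizer_l2(lambda)) %>%
    layer_dense(units = 1, activation = "exponential")
  model %>% compile(loss = "poisson", optimizer = "adam")
  model %>% fit(x, y, epochs = 50, batch_size = 256, verbose = 0)
  mu <- as.numeric(predict(model, x))
  # In-sample Poisson deviance (observations with y = 0 contribute -2 * (y - mu))
  2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
}

# Two runs with lambda = 0: the weights are initialized at random,
# so the two deviances differ even though the model is identical.
fit_once(0)
fit_once(0)
```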




About this chapter


Cite this chapter

Denuit, M., Hainaut, D., Trufin, J. (2019). Deep Neural Networks. In: Effective Statistical Learning Methods for Actuaries III. Springer Actuarial. Springer, Cham. https://doi.org/10.1007/978-3-030-25827-6_3
