A random forest algorithm to improve the Lee–Carter mortality forecasting: impact on q-forward

  • Susanna Levantesi
  • Andrea Nigri


Increased life expectancy in developed countries has led researchers to pay more attention to mortality projection in order to anticipate changes in mortality rates. Following the scheme proposed in Deprez et al. (Eur Actuar J 7(2):337–352, 2017) and extended by Levantesi and Pizzorusso (Risks 7(1):26, 2019), we propose a novel approach that combines random forests with two-dimensional P-splines to forecast mortality accurately. The approach first diagnoses the limits of the Lee–Carter mortality model by applying the random forest estimator to the ratio between the observed deaths and the deaths estimated by a given model; two-dimensional P-splines are then used to smooth and project the random forest estimator in the forecasting phase. Further considerations are devoted to assessing the demographic consistency of the results. The accuracy of the model is evaluated by an out-of-sample test. Finally, we analyze the impact of our model on the pricing of q-forward contracts. All analyses are carried out on several countries, using data from the Human Mortality Database and taking the Lee–Carter model as the baseline.
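The diagnostic step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names (`death_ratio`, `fit_forest`, `predict_forest`), and the deliberately tiny pure-Python forest (bagged regression trees with one randomly drawn feature per split) are all assumptions made for illustration, and the P-spline smoothing and projection step is omitted. The sketch only shows the idea of fitting a random forest to the ratio between observed deaths and the deaths expected under a baseline model, so that a fitted ratio far from 1 flags ages or years where the baseline fits poorly.

```python
import math
import random


def death_ratio(deaths, exposure, model_rate):
    """Observed deaths divided by the deaths expected under the baseline model."""
    return deaths / (exposure * model_rate)


def sse(rows):
    """Sum of squared deviations of the target (third column) from its mean."""
    ys = [r[2] for r in rows]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def build_tree(rows, depth, rng, min_leaf=2):
    """Grow a regression tree on rows of (age, year, ratio).

    At each node one feature (0 = age, 1 = year) is drawn at random and
    the threshold with the smallest squared error is used for the split.
    """
    mean = sum(r[2] for r in rows) / len(rows)
    if depth == 0 or len(rows) < 2 * min_leaf:
        return mean  # leaf: average ratio in this region
    feat = rng.randrange(2)
    best = None
    for thr in sorted({r[feat] for r in rows})[1:]:
        left = [r for r in rows if r[feat] < thr]
        right = [r for r in rows if r[feat] >= thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, thr, left, right)
    if best is None:
        return mean
    _, thr, left, right = best
    return (feat, thr,
            build_tree(left, depth - 1, rng, min_leaf),
            build_tree(right, depth - 1, rng, min_leaf))


def predict_tree(node, age, year):
    """Follow the splits until a leaf (a plain float) is reached."""
    while isinstance(node, tuple):
        feat, thr, left, right = node
        node = left if (age, year)[feat] < thr else right
    return node


def fit_forest(rows, n_trees=50, depth=3, seed=0):
    """Fit each tree on a bootstrap resample of the (age, year, ratio) rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(build_tree(sample, depth, rng))
    return forest


def predict_forest(forest, age, year):
    """Average the tree predictions."""
    return sum(predict_tree(t, age, year) for t in forest) / len(forest)


# Hypothetical toy grid: the baseline model is assumed to understate deaths
# at ages 65+ (true ratio 1.2) and overstate them below 65 (true ratio 0.9).
rows = []
for age in range(60, 70):
    for year in range(2000, 2010):
        exposure = 10_000.0
        model_rate = 0.01 * math.exp(0.09 * (age - 60) - 0.01 * (year - 2000))
        true_ratio = 1.2 if age >= 65 else 0.9
        deaths = true_ratio * exposure * model_rate  # noise-free for clarity
        rows.append((age, year, death_ratio(deaths, exposure, model_rate)))

forest = fit_forest(rows)
psi_young = predict_forest(forest, 61, 2005)
psi_old = predict_forest(forest, 68, 2005)
print(round(psi_young, 3), round(psi_old, 3))
```

In this toy setting the fitted ratio is higher at age 68 than at age 61, reflecting the misfit built into the synthetic data. A real application would use observed deaths and exposures, a fitted Lee–Carter model, and an established random forest library instead of this hand-written one.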


Mortality, Machine learning, Two-dimensional P-spline, q-Forward


Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.


  1. Barrieu PM, Veraart LAM (2016) Pricing q-forward contracts: an evaluation of estimation window and pricing method under different mortality models. Scand Actuar J 2:146–166
  2. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
  3. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
  4. Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Chapman and Hall/CRC
  5. Brouhns N, Denuit M, Vermunt JK (2002) A Poisson log-bilinear regression approach to the construction of projected lifetables. Insur Math Econ 31(3):373–393
  6. Cairns AJG, Blake D, Dowd K (2008) Modelling and management of mortality risk: a review. Scand Actuar J 2–3:79–113. Pensions Institute Discussion Paper No. PI-0814
  7. Camarda CG (2012) MortalitySmooth: an R package for smoothing Poisson counts with P-splines. J Stat Softw 50(1):1–24
  8. Currie ID, Durban M (2002) Flexible smoothing with P-splines: a unified approach. Stat Model 2:333–49
  9. Currie ID, Durban M, Eilers PHC (2004) Smoothing and forecasting mortality rates. Stat Model 4:279–298
  10. Currie ID, Durban M, Eilers PHC (2006) Generalized linear array models with applications to multidimensional smoothing. J R Stat Soc B 68:259–280
  11. D’Amato V, Piscopo G, Russolillo M (2011) The mortality of the Italian population: smoothing techniques on the Lee–Carter model. Ann Appl Stat 5(2A):705–724
  12. De Boor C (1978) A practical guide to splines. Springer, New York
  13. Deprez P, Shevchenko PV, Wüthrich M (2017) Machine learning techniques for mortality modeling. Eur Actuar J 7(2):337–352
  14. Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–102
  15. Eilers PHC, Marx BD (2002) Multivariate calibration with temperature interaction using two-dimensional penalized signal regression. Chemom Intell Lab Syst 66:159–174
  16. Eilers PHC, Currie ID, Durban M (2006) Fast and compact smoothing on large multidimensional grids. Comput Stat Data Anal 50:61–76
  17. Eilers PHC, Marx BD (2010) Splines, knots, and penalties. Wiley Interdiscip Rev Comput Stat 2:637–653
  18. Girosi F, King G (2008) Demographic forecasting. Princeton University Press, Princeton
  19. Hainaut D (2018) A neural-network analyzer for mortality forecast. Astin Bull 48(2):481–508
  20. James G, Witten D, Hastie T, Tibshirani R (2017) An introduction to statistical learning: with applications in R. Springer texts in statistics. Springer, Berlin. ISBN-10: 1461471370
  21. Lee RD, Carter RL (1992) Modeling and forecasting US mortality. J Am Stat Assoc 87(419):659–671
  22. Lee R, Miller T (2001) Evaluating the performance of the Lee–Carter method for forecasting mortality. Demography 38(4):537–549
  23. Levantesi S, Menzietti M (2017) Maximum market price of longevity risk under solvency regimes: the case of solvency II. Risks 5(2):29
  24. Levantesi S, Pizzorusso V (2019) Application of machine learning to mortality modeling and forecasting. Risks 7(1):26
  25. Li N, Lee R, Gerland P (2013) Extending the Lee–Carter method to model the rotation of age patterns of mortality decline for long-term projections. Demography 50(6):2037–2051
  26. Loeys J, Panigirtzoglou N, Ribeiro R (2007) Longevity: a market in the making. J.P. Morgan’s Global Market Strategy, London
  27. Loh WY (2011) Classification and regression trees. Wiley Interdiscip Rev Data Min Knowl Discov 1:14–23
  28. Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58:415–434
  29. Nigri A, Levantesi S, Marino M, Scognamiglio S, Perla F (2019) A deep learning integrated Lee–Carter model. Risks 7(1):33
  30. O’Sullivan F (1986) A statistical perspective on ill-posed inverse problems (with discussion). Stat Sci 1:505–527
  31. O’Sullivan F (1988) Fast computation of fully automated log-density and log-hazard estimators. SIAM J Sci Stat Comput 9:363–379
  32. Piscopo G (2017) Dynamic evolving neuro-fuzzy inference system for mortality prediction. Int J Eng Res Appl 7(3):26–29
  33. Piscopo G (2018a) AR dynamic evolving neuro-fuzzy inference system for mortality data. In: Skiadas CH, Skiadas C (eds) Demography and health issues. Population aging, mortality and data analysis. Springer, Berlin
  34. Piscopo G (2018b) A comparative analysis of neuro fuzzy inference systems for mortality prediction. In: Corazza M, Durbán M, Grané A, Perna C, Sibillo M (eds) Mathematical and statistical methods for actuarial sciences and finance. Springer, Berlin
  35. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
  36. Richman R, Wüthrich M (2018) A neural network extension of the Lee–Carter model to multiple populations. SSRN manuscript, ID 3270877
  37. Ruppert D (2002) Selecting the number of knots for penalized splines. J Comput Graph Stat 11:735–57
  38. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
  39. The Life and Longevity Markets Association (2010) Technical note: q-forward
  40. Villegas AM, Kaishev VK, Millossovich P (2015) StMoMo: an R package for stochastic mortality modelling. J Stat Softw 84(3)
  41. Zeddouk F, Devolder P (2019) Pricing of longevity derivatives and cost of capital. Risks 7:41

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Statistics, Sapienza University of Rome, Rome, Italy
