A random forest algorithm to improve the Lee–Carter mortality forecasting: impact on q-forward


Increased life expectancy in developed countries has led researchers to pay closer attention to mortality projection in order to anticipate changes in mortality rates. Following the scheme proposed in Deprez et al. (Eur Actuar J 7(2):337–352, 2017) and extended by Levantesi and Pizzorusso (Risks 7(1):26, 2019), we propose a novel approach that combines random forest with two-dimensional P-splines to forecast mortality accurately. The approach first diagnoses the limits of the Lee–Carter mortality model by applying the random forest estimator to the ratio between the observed deaths and their estimated values given by a certain model; two-dimensional P-splines are then used to smooth and project the random forest estimator in the forecasting phase. We also assess the demographic consistency of the results and evaluate the model accuracy with an out-of-sample test. Finally, we analyze the impact of our model on the pricing of q-forward contracts. All analyses are carried out on several countries, using data from the Human Mortality Database and considering the Lee–Carter model.
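To make the central quantity concrete, the following minimal Python sketch computes the ratio between observed deaths and the deaths expected under a baseline model, then rescales the baseline rate by that ratio. It is an illustration under stated assumptions, not the authors' implementation: the function names `death_ratio` and `adjusted_rate` are hypothetical, expected deaths are assumed to be exposure times the model's central death rate, and all numbers are invented.

```python
# Toy illustration only: every number below is invented, not taken from the paper.

def death_ratio(observed_deaths, exposure, model_rate):
    """Ratio of observed deaths to the deaths expected under a baseline model.

    Assumption: expected deaths = exposure * central death rate of the model.
    """
    expected_deaths = exposure * model_rate
    return observed_deaths / expected_deaths


def adjusted_rate(model_rate, ratio_estimate):
    """Baseline death rate rescaled by an estimate of the ratio."""
    return model_rate * ratio_estimate


# Hypothetical (age, year) cells:
# (age, year, observed deaths, exposure, baseline central death rate)
cells = [
    (65, 2000, 1200.0, 100000.0, 0.0100),
    (75, 2000, 2700.0, 90000.0, 0.0320),
]

for age, year, deaths, exposure, baseline_rate in cells:
    psi = death_ratio(deaths, exposure, baseline_rate)
    print(age, year, round(psi, 4), round(adjusted_rate(baseline_rate, psi), 5))
```

A ratio above 1 means the baseline model underestimates deaths in that cell, and a ratio below 1 means it overestimates them. In the approach described in the abstract, the raw ratio is not used directly: a random forest estimates it, and two-dimensional P-splines smooth and project that estimate.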


Fig. 1 (source: James et al. (2017))

Fig. 2 (source: James et al. (2017))

Figs. 3, 4, 5 and 6


  1. Barrieu PM, Veraart LAM (2016) Pricing q-forward contracts: an evaluation of estimation window and pricing method under different mortality models. Scand Actuar J 2:146–166
  2. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
  3. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
  4. Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Chapman and Hall/CRC
  5. Brouhns N, Denuit M, Vermunt JK (2002) A Poisson log-bilinear regression approach to the construction of projected lifetables. Insur Math Econ 31(3):373–393
  6. Cairns AJG, Blake D, Dowd K (2008) Modelling and management of mortality risk: a review. Scand Actuar J 2–3:79–113. Pensions Institute Discussion Paper No. PI-0814
  7. Camarda CG (2012) MortalitySmooth: an R package for smoothing Poisson counts with P-splines. J Stat Softw 50(1):1–24. http://cran.r-project.org/package=MortalitySmooth
  8. Currie ID, Durban M (2002) Flexible smoothing with P-splines: a unified approach. Stat Model 2:333–349
  9. Currie ID, Durban M, Eilers PHC (2004) Smoothing and forecasting mortality rates. Stat Model 4:279–298
  10. Currie ID, Durban M, Eilers PHC (2006) Generalized linear array models with applications to multidimensional smoothing. J R Stat Soc B 68:259–280
  11. D’Amato V, Piscopo G, Russolillo M (2011) The mortality of the Italian population: smoothing techniques on the Lee–Carter model. Ann Appl Stat 5(2A):705–724
  12. De Boor C (1978) A practical guide to splines. Springer, New York
  13. Deprez P, Shevchenko PV, Wüthrich M (2017) Machine learning techniques for mortality modeling. Eur Actuar J 7(2):337–352
  14. Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–102
  15. Eilers PHC, Marx BD (2002) Multivariate calibration with temperature interaction using two-dimensional penalized signal regression. Chemom Intell Lab Syst 66:159–174
  16. Eilers PHC, Currie ID, Durban M (2006) Fast and compact smoothing on large multidimensional grids. Comput Stat Data Anal 50:61–76
  17. Eilers PHC, Marx BD (2010) Splines, knots, and penalties. Wiley Interdiscip Rev Comput Stat 2:637–653
  18. Girosi F, King G (2008) Demographic forecasting. Princeton University Press, Princeton
  19. Hainaut D (2018) A neural-network analyzer for mortality forecast. Astin Bull 48(2):481–508
  20. James G, Witten D, Hastie T, Tibshirani R (2017) An introduction to statistical learning: with applications in R. Springer texts in statistics. Springer, Berlin. ISBN-10: 1461471370
  21. Lee RD, Carter LR (1992) Modeling and forecasting US mortality. J Am Stat Assoc 87(419):659–671
  22. Lee R, Miller T (2001) Evaluating the performance of the Lee–Carter method for forecasting mortality. Demography 38(4):537–549
  23. Levantesi S, Menzietti M (2017) Maximum market price of longevity risk under solvency regimes: the case of Solvency II. Risks 5(2):29
  24. Levantesi S, Pizzorusso V (2019) Application of machine learning to mortality modeling and forecasting. Risks 7(1):26
  25. Li N, Lee R, Gerland P (2013) Extending the Lee–Carter method to model the rotation of age patterns of mortality decline for long-term projections. Demography 50(6):2037–2051
  26. Liaw A (2018) Package randomForest. https://cran.r-project.org/web/packages/randomForest/randomForest.pdf
  27. Loeys J, Panigirtzoglou N, Ribeiro R (2007) Longevity: a market in the making. J.P. Morgan’s Global Market Strategy, London
  28. Loh WY (2011) Classification and regression trees. Wiley Interdiscip Rev Data Min Knowl Discov 1:14–23
  29. Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58:415–434
  30. Nigri A, Levantesi S, Marino M, Scognamiglio S, Perla F (2019) A deep learning integrated Lee–Carter model. Risks 7(1):33
  31. O’Sullivan F (1986) A statistical perspective on ill-posed inverse problems (with discussion). Stat Sci 1:505–527
  32. O’Sullivan F (1988) Fast computation of fully automated log-density and log-hazard estimators. SIAM J Sci Stat Comput 9:363–379
  33. Piscopo G (2017) Dynamic evolving neuro-fuzzy inference system for mortality prediction. Int J Eng Res Appl 7(3):26–29
  34. Piscopo G (2018a) AR dynamic evolving neuro-fuzzy inference system for mortality data. In: Skiadas CH, Skiadas C (eds) Demography and health issues. Population aging, mortality and data analysis. Springer, Berlin
  35. Piscopo G (2018b) A comparative analysis of neuro fuzzy inference systems for mortality prediction. In: Corazza M, Durbán M, Grané A, Perna C, Sibillo M (eds) Mathematical and statistical methods for actuarial sciences and finance. Springer, Berlin
  36. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
  37. Richman R, Wüthrich M (2018) A neural network extension of the Lee–Carter model to multiple populations. SSRN manuscript, ID 3270877
  38. Ruppert D (2002) Selecting the number of knots for penalized splines. J Comput Graph Stat 11:735–757
  39. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
  40. The Life and Longevity Markets Association (2010) Technical note: q-forward. http://www.llma.org
  41. Villegas AM, Kaishev VK, Millossovich P (2015) StMoMo: an R package for stochastic mortality modelling. J Stat Softw 84(3). https://cran.r-project.org/web/packages/StMoMo/vignettes/StMoMoVignette.pdf
  42. Zeddouk F, Devolder P (2019) Pricing of longevity derivatives and cost of capital. Risks 7:41


Author information



Corresponding author

Correspondence to Susanna Levantesi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by Philippe de Peretti.

A Appendix

See Figs. 7, 8, 9, 10 and 11.

Fig. 7

Variable importance. \(\%IncNodePurity\). Ages 20–90 and years 1947–2014

Fig. 8

\(\hat{\psi }_{\mathbf {s}}\) smoothed values (1947–2000) and extrapolated values (2000–2014). Results from MortalitySmooth package. Male population. Age 40–100

Fig. 9

\(\hat{\psi }_{\mathbf {s}}\) smoothed values (1947–2000) and extrapolated values (2000–2014). Results from MortalitySmooth package. Female population. Age 40–100

Fig. 10

\(\hat{\psi }_{\mathbf {s}}\) smoothed values (1947–2000) and extrapolated values (2000–2014). Results from MortalitySmooth package. Male population. Age 60–100

Fig. 11

\(\hat{\psi }_{\mathbf {s}}\) smoothed values (1947–2000) and extrapolated values (2000–2014). Results from MortalitySmooth package. Female population. Age 60–100

Cite this article

Levantesi, S., Nigri, A. A random forest algorithm to improve the Lee–Carter mortality forecasting: impact on q-forward. Soft Comput 24, 8553–8567 (2020). https://doi.org/10.1007/s00500-019-04427-z


Keywords

  • Mortality
  • Machine learning
  • Two-dimensional P-spline
  • q-Forward