Penalized Splines

  • Jaroslaw Harezlak
  • David Ruppert
  • Matt P. Wand
Part of the Use R! book series (USE R)


In this chapter, we study nonparametric regression with a single continuous predictor. This problem is often called scatterplot smoothing. Our emphasis is on the use of penalized splines. We also show that a penalized spline model can be represented as a linear mixed model, which allows us to fit penalized splines using linear mixed model software.
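The mixed model representation mentioned above can be sketched as follows. This is an illustrative sketch only: it assumes a truncated-line basis with knots $\kappa_1,\dots,\kappa_K$ and a ridge-type penalty on the knot coefficients, and the chapter itself may use a different basis or notation. The symbols $\beta$, $u$, $\lambda$, $\sigma_\varepsilon^2$ and $\sigma_u^2$ are introduced here for illustration.

```latex
% Scatterplot smoothing model with a truncated-line spline basis (assumed basis)
y_i = f(x_i) + \varepsilon_i, \qquad
f(x) = \beta_0 + \beta_1 x + \sum_{k=1}^{K} u_k (x - \kappa_k)_+ ,
\qquad i = 1, \dots, n.

% Penalized least squares: X has rows (1, x_i), Z has entries (x_i - \kappa_k)_+
(\hat\beta, \hat u) = \arg\min_{\beta,\, u}
  \left\{ \| y - X\beta - Zu \|^2 + \lambda \| u \|^2 \right\},
  \qquad \lambda \ge 0.

% Linear mixed model with the same fitted values
y = X\beta + Zu + \varepsilon, \qquad
u \sim N(0, \sigma_u^2 I_K), \qquad
\varepsilon \sim N(0, \sigma_\varepsilon^2 I_n).

% The best linear unbiased predictor of (\beta, u) minimizes
% \|y - X\beta - Zu\|^2 / \sigma_\varepsilon^2 + \|u\|^2 / \sigma_u^2,
% so the two criteria agree when
\lambda = \sigma_\varepsilon^2 / \sigma_u^2 .
```

Under these assumptions, estimating the variance components $\sigma_\varepsilon^2$ and $\sigma_u^2$ with mixed model software (for example by REML) amounts to choosing the smoothing parameter $\lambda$ automatically.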



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Jaroslaw Harezlak: School of Public Health, Indiana University Bloomington, Bloomington, USA
  • David Ruppert: Department of Statistical Science, Cornell University, Ithaca, USA
  • Matt P. Wand: School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, Australia
