Imputation and Inference with Multivariate Adaptive Regression Splines

  • Chapter
Modern Mathematical Tools and Techniques in Capturing Complexity

Summary

The problem of missing data is often addressed with imputation. Traditional single imputation methods, such as ratio imputation, multiple regression imputation, nearest neighbor imputation, respondent mean imputation and hot deck imputation, have been widely used to compensate for non-response. Nonparametric regression methods have recently been applied to the estimation of the regression function in a wide range of settings and areas of research. This work focuses on replacing missing observations of a variable of interest with imputed values obtained from a new algorithm based on Multivariate Adaptive Regression Splines. Some imputation methods can lead to serious underestimation of measures of the population distribution. This bias can be reduced by adding to each imputed value a small disturbance drawn from a given distribution. Two different methods of adding the random disturbance are also described. Numerical examples are presented to illustrate the theoretical results and to analyze the precision of the proposed method.
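
The idea of regression imputation with an added disturbance can be illustrated with a minimal sketch. It is not the chapter's algorithm: the summary does not specify the fitting procedure or the two disturbance methods, so everything below is an assumption made for illustration. In place of the full adaptive forward and backward MARS search, the sketch fits ordinary least squares on hinge functions max(0, x - t) at fixed, hand-picked knots. The two disturbance options shown (a normal draw scaled by the residual standard deviation, and a resampled observed residual) are likewise assumed, and the names `hinge_basis`, `solve_least_squares` and `impute` are hypothetical.

```python
import random
import statistics


def hinge_basis(x, knots):
    """Basis row: intercept, x, and one hinge max(0, x - t) per knot t."""
    return [1.0, x] + [max(0.0, x - t) for t in knots]


def solve_least_squares(rows, y):
    """Solve the normal equations (B'B) beta = B'y by Gaussian elimination."""
    p = len(rows[0])
    # Augmented matrix [B'B | B'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (a[i][p] - tail) / a[i][i]
    return beta


def impute(x, y, knots, method="none", rng=None):
    """Fill entries of y that are None using a hinge-basis fit on respondents.

    method="none"     deterministic prediction only
    method="normal"   prediction + N(0, sigma^2) draw (assumed option)
    method="residual" prediction + resampled observed residual (assumed option)
    """
    rng = rng or random.Random(0)
    respondents = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    rows = [hinge_basis(xi, knots) for xi, _ in respondents]
    beta = solve_least_squares(rows, [yi for _, yi in respondents])

    def predict(xi):
        return sum(b * v for b, v in zip(beta, hinge_basis(xi, knots)))

    residuals = [yi - predict(xi) for xi, yi in respondents]
    sigma = statistics.pstdev(residuals)

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)  # observed values are never altered
        elif method == "normal":
            completed.append(predict(xi) + rng.gauss(0.0, sigma))
        elif method == "residual":
            completed.append(predict(xi) + rng.choice(residuals))
        else:
            completed.append(predict(xi))
    return completed


if __name__ == "__main__":
    rng = random.Random(42)
    x = [rng.uniform(0.0, 10.0) for _ in range(200)]
    truth = [2.0 + 3.0 * v + 1.5 * max(0.0, v - 5.0) + rng.gauss(0.0, 2.0) for v in x]
    # Roughly 30% of responses set to missing, completely at random.
    y = [t if rng.random() > 0.3 else None for t in truth]
    for method in ("none", "normal", "residual"):
        filled = impute(x, y, knots=[2.5, 5.0, 7.5], method=method, rng=rng)
        print(method, round(statistics.pvariance(filled), 3))
```

Running the script prints the variance of the completed data under each option. Deterministic imputation places every imputed value exactly on the fitted curve, which is the mechanism behind the underestimation of spread described above. The two disturbance options put residual-scale noise back into the imputed values.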

The authors would like to thank the editors for the opportunity to contribute to this volume in honour of María Luisa Menéndez.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sánchez-Borrego, I., del Mar Rueda, M., Muñoz, J.F. (2011). Imputation and Inference with Multivariate Adaptive Regression Splines. In: Pardo, L., Balakrishnan, N., Gil, M.Á. (eds) Modern Mathematical Tools and Techniques in Capturing Complexity. Understanding Complex Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20853-9_24

  • DOI: https://doi.org/10.1007/978-3-642-20853-9_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20852-2

  • Online ISBN: 978-3-642-20853-9

  • eBook Packages: Engineering, Engineering (R0)
