Bootstrap and Nonparametric Predictors to Impute Missing Data

  • Conference paper
  • In: Classification and Multivariate Analysis for Complex Data Structures

Abstract

We propose a new nonparametric technique for imputing missing data that yields a completed data matrix together with a degree of reliability for the imputations. Without relying on strong assumptions, we generate multiple imputations using the bootstrap and nonparametric predictors. We show that this approach gives better imputations than other known methods and therefore a more reliable completed data matrix. Two simulations show that the proposed technique can be generalized to non-monotone patterns of missing data, with interesting results.
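
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (bootstrap resampling combined with a nonparametric predictor), not the author's method. It assumes a single column with missing values whose other columns are fully observed, uses a plain k-nearest-neighbour average as the nonparametric predictor, and takes the standard deviation across bootstrap replicates as a rough reliability indicator. The names `knn_predict` and `bootstrap_impute`, the parameters `n_boot` and `k`, and the synthetic data are all hypothetical.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k=5):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    # Pairwise squared distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = min(k, len(X_train))
    nearest = np.argsort(d2, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def bootstrap_impute(data, target_col, n_boot=50, k=5, seed=0):
    """Impute NaNs in one column; return (completed matrix, per-value spread).

    Assumption: every other column is fully observed.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data[:, target_col])
    predictors = np.delete(data, target_col, axis=1)
    X_obs, y_obs = predictors[~missing], data[~missing, target_col]
    X_mis = predictors[missing]

    # One row of predictions per bootstrap replicate of the complete cases.
    draws = np.empty((n_boot, int(missing.sum())))
    for b in range(n_boot):
        idx = rng.integers(0, len(X_obs), size=len(X_obs))
        draws[b] = knn_predict(X_obs[idx], y_obs[idx], X_mis, k=k)

    completed = data.copy()
    completed[missing, target_col] = draws.mean(axis=0)  # pooled imputation
    return completed, draws.std(axis=0)  # spread across replicates


# Synthetic illustration: the third column depends on the first two.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2))
y = x[:, 0] + 0.5 * x[:, 1] + rng.normal(scale=0.1, size=200)
data = np.column_stack([x, y])
data[:20, 2] = np.nan  # delete the first 20 values of the target column

completed, spread = bootstrap_impute(data, target_col=2)
```

A larger value in `spread` marks an imputed entry whose prediction varies more across bootstrap replicates, which is one simple way to attach a reliability indication to each imputed value. Handling non-monotone patterns, as the abstract mentions, would need an extension that this sketch does not attempt.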

Author information

Corresponding author

Correspondence to Agostino Di Ciaccio.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Di Ciaccio, A. (2011). Bootstrap and Nonparametric Predictors to Impute Missing Data. In: Fichet, B., Piccolo, D., Verde, R., Vichi, M. (eds) Classification and Multivariate Analysis for Complex Data Structures. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13312-1_20
