Reconstruction of Cross-Sectional Missing Data Using Neural Networks

  • Conference paper
Engineering Applications of Neural Networks (EANN 2009)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 43)

Abstract

Treating incomplete data is an important step in data pre-processing. We propose a non-parametric multiple imputation algorithm (GMI) for the reconstruction of missing data, based on Generalized Regression Neural Networks (GRNN). We compare GMI with three popular multiple imputation (MI) algorithms for missing data: Expectation Maximization (EM) MI, Markov Chain Monte Carlo (MCMC) MI, and hot deck MI. A separate GRNN classifier is trained and tested on the dataset produced by each imputation algorithm, and the algorithms are evaluated by the accuracy of that classifier after imputation. We show the effectiveness of our proposed algorithm on twenty-six real datasets.
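To make the idea of GRNN-based reconstruction concrete, the sketch below fills missing entries with a generic GRNN, i.e. a Gaussian-kernel-weighted average of the values seen in complete rows. It is not the authors' GMI algorithm, whose details are not given in the abstract: it produces a single imputation rather than multiple imputations, and the function names `grnn_predict` and `grnn_impute`, the smoothing parameter `sigma`, and the use of complete rows as training patterns are all assumptions made for illustration. It also assumes the dataset contains at least one complete row.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """Generic GRNN output: Gaussian-kernel-weighted mean of training targets.

    `sigma` is an assumed smoothing parameter, not a value from the paper.
    """
    # Squared Euclidean distance between every query and every training pattern.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    denom = w.sum(axis=1)
    # If every weight underflows to zero, fall back to the plain target mean.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, (w @ y_train) / safe, y_train.mean())

def grnn_impute(X, sigma=0.5):
    """Single imputation (illustrative only): fill NaNs using complete rows."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    complete = ~np.isnan(X).any(axis=1)
    Xc = X[complete]  # training patterns: rows with no missing values
    for i in np.where(~complete)[0]:
        miss = np.isnan(X[i])
        obs = ~miss
        if not obs.any():
            # Nothing observed in this row: use column means of complete rows.
            filled[i, miss] = Xc[:, miss].mean(axis=0)
            continue
        query = X[i, obs][None, :]
        for j in np.where(miss)[0]:
            # Predict the missing feature j from the observed features.
            filled[i, j] = grnn_predict(Xc[:, obs], Xc[:, j], query, sigma)[0]
    return filled

if __name__ == "__main__":
    data = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [1.0, np.nan]])
    print(grnn_impute(data))
```

A multiple imputation scheme would repeat a step like this several times with some source of variation (for example resampled training rows) and pool the results; how GMI does this is not stated in the abstract, so it is left out here.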

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gheyas, I.A., Smith, L.S. (2009). Reconstruction of Cross-Sectional Missing Data Using Neural Networks. In: Palmer-Brown, D., Draganova, C., Pimenidis, E., Mouratidis, H. (eds) Engineering Applications of Neural Networks. EANN 2009. Communications in Computer and Information Science, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03969-0_3

  • DOI: https://doi.org/10.1007/978-3-642-03969-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03968-3

  • Online ISBN: 978-3-642-03969-0

  • eBook Packages: Computer Science, Computer Science (R0)
