Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3066)

Abstract

In this paper, we present a missing data imputation method based on clustering, one of the most popular techniques in Knowledge Discovery in Databases (KDD). We combine clustering with soft computing, which tends to be more tolerant of imprecision and uncertainty, and apply a fuzzy clustering algorithm to incomplete data. Our experiments show that the fuzzy imputation algorithm performs better than the basic clustering algorithm.
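As a rough illustration of the idea summarized above, the sketch below fills missing entries using fuzzy K-means (fuzzy c-means) memberships. It is not the authors' implementation: the function name `fuzzy_kmeans_impute`, the fuzzifier `m`, the column-mean initial fill, and the rule of replacing each missing entry with the membership-weighted mix of centroid values are all assumptions made for this example.

```python
import numpy as np

def fuzzy_kmeans_impute(X, k=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative fuzzy K-means imputation (hypothetical helper).

    Missing entries are marked with NaN. Assumes every column has at
    least one observed value and that m > 1.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Assumption: start by filling gaps with the observed column means.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    # Assumption: initial centroids are k distinct rows picked at random.
    rng = np.random.default_rng(seed)
    centroids = filled[rng.choice(len(filled), size=k, replace=False)]

    for _ in range(n_iter):
        # Euclidean distance from every row to every centroid, shape (n, k).
        d = np.linalg.norm(filled[:, None, :] - centroids[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero

        # Standard fuzzy c-means membership: rows of U sum to 1.
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)

        # Centroids are means weighted by U**m.
        W = U ** m
        new_centroids = (W.T @ filled) / W.sum(axis=0)[:, None]

        # Re-estimate only the missing entries as a membership-weighted
        # combination of centroid values; observed entries stay untouched.
        estimate = U @ new_centroids
        filled = np.where(missing, estimate, X)

        shift = np.abs(new_centroids - centroids).max()
        centroids = new_centroids
        if shift < tol:
            break

    return filled, U, centroids

# Small demo: two loose groups, one missing value in the third row.
X_demo = np.array([
    [1.0, 1.1],
    [0.9, 1.0],
    [1.1, np.nan],
    [5.0, 5.1],
    [5.1, 4.9],
    [4.9, 5.0],
])
X_filled, memberships, centers = fuzzy_kmeans_impute(X_demo, k=2)
print(X_filled)
```

A basic (crisp) K-means variant would instead assign each row to a single nearest cluster and copy that cluster's centroid value; the fuzzy version lets every cluster contribute in proportion to its membership degree.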

This work was supported, in part, by a grant from NSF (EIA-0091530), a cooperative agreement with USDA FCIC/RMA (2IE08310228), and an NSF EPSCOR Grant (EPS-0091900).

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, D., Deogun, J., Spaulding, W., Shuart, B. (2004). Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds) Rough Sets and Current Trends in Computing. RSCTC 2004. Lecture Notes in Computer Science, vol 3066. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25929-9_70

  • DOI: https://doi.org/10.1007/978-3-540-25929-9_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22117-3

  • Online ISBN: 978-3-540-25929-9

  • eBook Packages: Springer Book Archive
