Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method

Li, Dan; Deogun, Jitender; Spaulding, William; Shuart, Bill

doi:10.1007/978-3-540-25929-9_70


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3066)

Abstract

In this paper, we present a missing data imputation method based on clustering, one of the most popular techniques in Knowledge Discovery in Databases (KDD). We combine clustering with soft computing, which tends to be more tolerant of imprecision and uncertainty, and apply a fuzzy clustering algorithm to incomplete data. Our experiments show that the fuzzy imputation algorithm performs better than the basic clustering algorithm.
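
The preview does not show the paper's algorithm, so the following is only a minimal sketch of how imputation with fuzzy K-means is commonly set up, under stated assumptions: textbook fuzzy c-means membership and centroid updates, a distance computed over observed attributes only, and a missing value filled with the membership-weighted average of the cluster centroids. The function name `fuzzy_kmeans_impute`, its parameters, the seeding rule, and the sample data are all hypothetical and are not taken from the paper.

```python
import math


def fuzzy_kmeans_impute(rows, k=2, m=2.0, iters=50):
    """Fill None entries with membership-weighted centroid values.

    Illustrative sketch only; not the authors' implementation.
    rows: list of equal-length lists of floats, with None marking a gap.
    k: number of clusters; m: fuzzifier (> 1); iters: update rounds.
    """
    n_attrs = len(rows[0])
    complete = [r for r in rows if None not in r]
    if len(complete) < k:
        raise ValueError("need at least k complete rows to seed centroids")
    # Assumption: seed centroids with evenly spaced complete rows.
    step = (len(complete) - 1) / max(k - 1, 1)
    centroids = [list(complete[round(i * step)]) for i in range(k)]

    def dist(row, centroid):
        # Assumption: root mean squared difference over observed attributes.
        sq = [(x - c) ** 2 for x, c in zip(row, centroid) if x is not None]
        return math.sqrt(sum(sq) / len(sq)) if sq else 0.0

    def memberships(row):
        d = [dist(row, c) for c in centroids]
        if any(x == 0.0 for x in d):
            # Row sits on one or more centroids: share membership among them.
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            return [h / sum(hits) for h in hits]
        p = 2.0 / (m - 1.0)
        return [1.0 / sum((di / dj) ** p for dj in d) for di in d]

    for _ in range(iters):
        u = [memberships(r) for r in rows]
        for c in range(k):
            for j in range(n_attrs):
                num = den = 0.0
                for i, r in enumerate(rows):
                    if r[j] is not None:  # skip gaps in the centroid update
                        w = u[i][c] ** m
                        num += w * r[j]
                        den += w
                if den > 0.0:
                    centroids[c][j] = num / den

    u = [memberships(r) for r in rows]
    filled = []
    for i, r in enumerate(rows):
        filled.append([
            x if x is not None
            else sum(u[i][c] * centroids[c][j] for c in range(k))
            for j, x in enumerate(r)
        ])
    return filled


# Hypothetical data: two well-separated groups, two rows with a gap.
sample = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
    [1.0, None], [5.0, None],
]
for row in fuzzy_kmeans_impute(sample, k=2):
    print(row)
```

Because every record belongs to every cluster to some degree, an imputed value blends all centroids rather than copying the nearest one. That is one way to read the abstract's remark about tolerance of imprecision, though the paper itself may weight or normalize differently.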

This work was supported, in part, by a grant from NSF (EIA-0091530), a cooperative agreement with USDA FCIC/RMA (2IE08310228), and an NSF EPSCoR Grant (EPS-0091900).



Author information

Authors and Affiliations

Department of Computer Science & Engineering, University of Nebraska-Lincoln, Lincoln, NE, 68588-0115, USA
Dan Li & Jitender Deogun
Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE, 68588-0308, USA
William Spaulding & Bill Shuart


Editor information

Editors and Affiliations

Shimane University, 89-1 Enya-cho Izumo, 6938501, Shimane, Japan
Shusaku Tsumoto
Systems Research Institute, Polish Academy of Sciences, 01-447, Warsaw, Poland
Roman Słowiński
The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden
Jan Komorowski
Institute of Computer Science, Polish Academy of Sciences, 01–237, Warsaw, Poland
Jerzy W. Grzymała-Busse


About this paper

Cite this paper

Li, D., Deogun, J., Spaulding, W., Shuart, B. (2004). Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds) Rough Sets and Current Trends in Computing. RSCTC 2004. Lecture Notes in Computer Science, vol 3066. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25929-9_70


DOI: https://doi.org/10.1007/978-3-540-25929-9_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22117-3
Online ISBN: 978-3-540-25929-9
eBook Packages: Springer Book Archive
