Abstract
In this paper, we present a missing-data imputation method based on clustering, one of the most popular techniques in Knowledge Discovery in Databases (KDD). We combine clustering with soft computing, which tends to be more tolerant of imprecision and uncertainty, and apply a fuzzy clustering algorithm to handle incomplete data. Our experiments show that the fuzzy imputation algorithm performs better than the basic clustering algorithm.
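To make the idea in the abstract concrete, the following is a minimal sketch of imputation with fuzzy K-means. It is not taken from the paper. It assumes a common formulation: distances are computed over observed attributes only, memberships follow the standard fuzzy K-means rule with fuzzifier `m`, and each missing value is replaced by the membership-weighted average of the cluster centroids. The function name `fuzzy_kmeans_impute`, its parameters, the seeding scheme, and the toy data are all hypothetical.

```python
import math

def fuzzy_kmeans_impute(rows, k=2, m=2.0, iters=50):
    """Fill None entries using fuzzy memberships to k centroids (sketch).

    Assumes m > 1 and that every row and every column has at least one
    observed value.
    """
    n, d = len(rows), len(rows[0])
    # Column means over observed values, used only to seed the centroids.
    col_mean = []
    for j in range(d):
        seen = [r[j] for r in rows if r[j] is not None]
        col_mean.append(sum(seen) / len(seen))
    # Assumed seeding: k evenly spaced rows, mean-filled where missing.
    step = max(1, n // k)
    cents = []
    for c in range(k):
        seed = rows[(c * step) % n]
        cents.append([seed[j] if seed[j] is not None else col_mean[j]
                      for j in range(d)])

    def dist(row, cent):
        # Partial distance: observed attributes only, rescaled to d dims.
        obs = [j for j in range(d) if row[j] is not None]
        sq = sum((row[j] - cent[j]) ** 2 for j in obs)
        return math.sqrt(sq * d / len(obs))

    def memberships(row):
        ds = [dist(row, cent) for cent in cents]
        if min(ds) == 0.0:
            # Row coincides with one or more centroids: split weight there.
            hits = [1.0 if x == 0.0 else 0.0 for x in ds]
            return [h / sum(hits) for h in hits]
        inv = [x ** (-2.0 / (m - 1.0)) for x in ds]
        total = sum(inv)
        return [v / total for v in inv]

    for _ in range(iters):
        u = [memberships(r) for r in rows]
        # Update each centroid coordinate from rows that observe it.
        for c in range(k):
            for j in range(d):
                num = den = 0.0
                for i in range(n):
                    if rows[i][j] is not None:
                        w = u[i][c] ** m
                        num += w * rows[i][j]
                        den += w
                if den > 0:
                    cents[c][j] = num / den

    u = [memberships(r) for r in rows]
    # Impute each missing value as a membership-weighted centroid average.
    return [[r[j] if r[j] is not None
             else sum(u[i][c] * cents[c][j] for c in range(k))
             for j in range(d)]
            for i, r in enumerate(rows)]

# Toy data with two well-separated groups and two missing entries.
data = [
    [1.0, 1.0], [1.2, 0.8], [0.9, None],
    [5.0, 5.0], [5.2, 4.8], [None, 5.1],
]
filled = fuzzy_kmeans_impute(data, k=2)
print(filled[2], filled[5])
```

Because every object belongs to every cluster to some degree, an imputed value is a blend of all centroids rather than a copy of the single nearest one, which is the main difference from imputation with crisp K-means.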
This work was supported, in part, by a grant from NSF (EIA-0091530), a cooperative agreement with USDA FCIC/RMA (2IE08310228), and an NSF EPSCoR Grant (EPS-0091900).
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, D., Deogun, J., Spaulding, W., Shuart, B. (2004). Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds) Rough Sets and Current Trends in Computing. RSCTC 2004. Lecture Notes in Computer Science, vol 3066. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25929-9_70
DOI: https://doi.org/10.1007/978-3-540-25929-9_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22117-3
Online ISBN: 978-3-540-25929-9
eBook Packages: Springer Book Archive