Abstract
Clustering is an important technique for identifying groups of similar data objects within a data set. Because problems during data collection and preprocessing often lead to missing values, there is a need for clustering methods that can handle such incomplete data. Approaches proposed in the literature for adapting the fuzzy c-means algorithm to incomplete data work well on data sets with equally sized and shaped clusters. In this paper, we present an approach for adapting the fuzzy c-means algorithm to incomplete data that uses the dimension-wise fuzzy variances of clusters to impute missing values. In experiments on incomplete real and synthetic data sets with differently sized and shaped clusters, we demonstrate the benefit of this approach over the basic approach in terms of the assignment of data objects to clusters and the computation of cluster prototypes.
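The abstract names the dimension-wise fuzzy variances of clusters but does not state a formula. As a rough illustration only, the sketch below computes one plausible version of that quantity: for each cluster and each dimension, the membership-weighted mean squared deviation from the cluster prototype, taken over the observed values only. The function name, the argument layout, the use of NaN to mark missing values, and the restriction to observed values are all assumptions of this sketch and are not taken from the paper. How the variances then enter the imputation step is not shown.

```python
import math


def dimension_wise_fuzzy_variances(data, memberships, prototypes, m=2.0):
    """Membership-weighted variance of each cluster along each dimension.

    data:        n rows of d floats; math.nan marks a missing value (assumption)
    memberships: c rows of n membership degrees in [0, 1]
    prototypes:  c rows of d cluster-centre coordinates
    m:           fuzzifier, m > 1
    Returns c rows of d variances; an entry is NaN when no observed
    value with non-zero weight exists for that cluster and dimension.
    """
    n_dims = len(prototypes[0])
    variances = []
    for u_i, v_i in zip(memberships, prototypes):
        row = []
        for k in range(n_dims):
            weighted_sq_dev = 0.0
            weight_sum = 0.0
            for u_ij, x_j in zip(u_i, data):
                if math.isnan(x_j[k]):
                    continue  # skip missing values in this dimension
                w = u_ij ** m
                weighted_sq_dev += w * (x_j[k] - v_i[k]) ** 2
                weight_sum += w
            row.append(weighted_sq_dev / weight_sum if weight_sum > 0 else math.nan)
        variances.append(row)
    return variances


# Tiny hypothetical data set: three objects, two dimensions, one missing value.
data = [[0.0, 0.0], [2.0, math.nan], [10.0, 4.0]]
memberships = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
prototypes = [[1.0, 0.0], [10.0, 4.0]]
variances = dimension_wise_fuzzy_variances(data, memberships, prototypes)
```

Keeping one variance per dimension, rather than a single spread value per cluster, lets a cluster that is wide along one feature and narrow along another be described separately in each direction, which matches the abstract's focus on differently sized and shaped clusters.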
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Himmelspach, L., Conrad, S. (2016). Fuzzy c-Means Clustering of Incomplete Data Using Dimension-Wise Fuzzy Variances of Clusters. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 610. Springer, Cham. https://doi.org/10.1007/978-3-319-40596-4_58
DOI: https://doi.org/10.1007/978-3-319-40596-4_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40595-7
Online ISBN: 978-3-319-40596-4
eBook Packages: Computer Science (R0)