Fuzzy c-Means Clustering of Incomplete Data Using Dimension-Wise Fuzzy Variances of Clusters

Himmelspach, Ludmila; Conrad, Stefan

doi:10.1007/978-3-319-40596-4_58

Part of the book series: Communications in Computer and Information Science (CCIS, volume 610)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems

Abstract

Clustering is an important technique for identifying groups of similar data objects within a data set. Since problems during the data collection and data preprocessing steps often lead to missing values in the data sets, there is a need for clustering methods that can deal with such imperfect data. Approaches proposed in the literature for adapting the fuzzy c-means algorithm to incomplete data work well on data sets with equally sized and shaped clusters. In this paper we present an approach for adapting the fuzzy c-means algorithm to incomplete data that uses the dimension-wise fuzzy variances of clusters for imputation of missing values. In experiments on incomplete real and synthetic data sets with differently sized and shaped clusters, we demonstrate the benefit over the basic approach in terms of the assignment of data objects to clusters and the cluster prototype computation.
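
The abstract names dimension-wise fuzzy variances of clusters as the quantity that drives the imputation, but it does not give the formulas. The following is only a minimal sketch of how such a variance could be computed from available values, assuming the usual membership-weighted form with fuzzifier `m`. The function name, the use of `None` for missing values, and the argument layout are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value (assumed encoding)


def dimensionwise_fuzzy_variances(
    data: Sequence[Row],
    memberships: Sequence[Sequence[float]],  # memberships[i][k]: degree of object k in cluster i
    prototypes: Sequence[Sequence[float]],   # prototypes[i][j]: centre of cluster i in dimension j
    m: float = 2.0,                          # fuzzifier, m > 1
) -> List[List[float]]:
    """Return s2[i][j], the membership-weighted variance of cluster i in
    dimension j, using only objects whose value in dimension j is available."""
    n_dims = len(prototypes[0])
    variances: List[List[float]] = []
    for u_i, v_i in zip(memberships, prototypes):
        row: List[float] = []
        for j in range(n_dims):
            num = 0.0
            den = 0.0
            for u_ik, x_k in zip(u_i, data):
                if x_k[j] is None:
                    continue  # skip objects with a missing value in this dimension
                w = u_ik ** m
                num += w * (x_k[j] - v_i[j]) ** 2
                den += w
            # Hypothetical fallback: report 0.0 when no value is available.
            row.append(num / den if den > 0.0 else 0.0)
        variances.append(row)
    return variances


if __name__ == "__main__":
    data = [[0.0, 1.0], [2.0, None], [4.0, 3.0]]
    memberships = [[1.0, 1.0, 1.0]]  # one cluster, full membership
    prototypes = [[2.0, 2.0]]
    print(dimensionwise_fuzzy_variances(data, memberships, prototypes))
```

In this toy example the second object is ignored in the second dimension, so the variance there is averaged over two objects rather than three. How these variances then enter the imputation of missing values is described in the paper itself and is not reproduced here.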

Author information

Authors and Affiliations

Institute of Computer Science, Heinrich-Heine-Universität Düsseldorf, 40225, Düsseldorf, Germany
Ludmila Himmelspach & Stefan Conrad

Corresponding author

Correspondence to Ludmila Himmelspach.

Editor information

Editors and Affiliations

INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
Joao Paulo Carvalho
LIP6, Université Pierre et Marie Curie, Paris, France
Marie-Jeanne Lesot
School of Industrial Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Uzay Kaymak
IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
Susana Vieira
LIP6, Université Pierre et Marie Curie, CNRS, Paris, France
Bernadette Bouchon-Meunier
Machine Intelligence Institute, Iona College, New Rochelle, New York, USA
Ronald R. Yager

About this paper

Cite this paper

Himmelspach, L., Conrad, S. (2016). Fuzzy c-Means Clustering of Incomplete Data Using Dimension-Wise Fuzzy Variances of Clusters. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 610. Springer, Cham. https://doi.org/10.1007/978-3-319-40596-4_58

DOI: https://doi.org/10.1007/978-3-319-40596-4_58
Published: 11 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40595-7
Online ISBN: 978-3-319-40596-4
eBook Packages: Computer Science, Computer Science (R0)
