MaxMin Linear Initialization for Fuzzy C-Means

Öztürk, Aybüke; Lallich, Stéphane; Darmont, Jérôme; Waksman, Sylvie Yona

doi:10.1007/978-3-319-96136-1_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10934)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

Clustering is an extensive research area in data science. Its aim is to discover groups and identify interesting patterns in datasets. Crisp (hard) clustering assumes that each data point belongs to one and only one cluster. This assumption is inadequate when data points may belong to several clusters, as is the case in text categorization, so more flexible clustering is needed. Fuzzy clustering methods, in which each data point can belong to several clusters, are an interesting alternative. Yet seeding iterative fuzzy algorithms so that they achieve high-quality clustering remains an issue. In this paper, we propose a new linear and efficient initialization algorithm, MaxMin Linear, to deal with this problem. We then validate our theoretical results through extensive experiments on a variety of numerical real-world and artificial datasets. We also test several validity indices, including a new validity index that we propose, Transformed Standardized Fuzzy Difference (TSFD).
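
To make the notions of seeding and fuzzy membership concrete, the following is a minimal sketch under stated assumptions. It is not the MaxMin Linear algorithm proposed in the paper, which the abstract does not specify. The function `maxmin_seeds` implements the classic maxmin (farthest-first) heuristic, started arbitrarily from the first point, and `fuzzy_memberships` applies the standard fuzzy c-means membership update with fuzzifier `m`. Both function names, the toy data, and the choice `m = 2` are illustrative assumptions.

```python
import math

def maxmin_seeds(points, c):
    """Classic maxmin (farthest-first) seeding; NOT the paper's MaxMin Linear."""
    seeds = [points[0]]  # assumption: arbitrarily start from the first point
    # Distance from every point to its nearest seed chosen so far.
    nearest = [math.dist(p, seeds[0]) for p in points]
    while len(seeds) < c:
        # Pick the point whose nearest seed is farthest away.
        idx = max(range(len(points)), key=nearest.__getitem__)
        seeds.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx]))
                   for p, d in zip(points, nearest)]
    return seeds

def fuzzy_memberships(points, centers, m=2.0):
    """Standard FCM update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, v) for v in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers: share full
            # membership among them and give zero to the others.
            zeros = dists.count(0.0)
            row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships

# Toy data (hypothetical): two tight groups and one point in between.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (2.5, 2.5)]
seeds = maxmin_seeds(points, c=2)
U = fuzzy_memberships(points, seeds)
```

Each row of `U` sums to 1. The in-between point `(2.5, 2.5)` receives a membership close to 0.5 in both clusters, which a crisp partition cannot express.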

Acknowledgements

This project is supported by the Rhône Alpes Region’s ARC 5: “Cultures, Sciences, Sociétés et Médiations” through A. Öztürk’s Ph.D. grant.

Author information

Authors and Affiliations

ERIC EA 3083, Université de Lyon, Lyon 2, 5 avenue Pierre Mendès France, 69676, Bron Cedex, France
Aybüke Öztürk, Stéphane Lallich & Jérôme Darmont
ArAr UMR 5138, Université de Lyon, Lyon 2, 7 rue Raulin, 69365, Lyon Cedex 7, France
Aybüke Öztürk & Sylvie Yona Waksman

Corresponding author

Correspondence to Aybüke Öztürk.

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Öztürk, A., Lallich, S., Darmont, J., Waksman, S.Y. (2018). MaxMin Linear Initialization for Fuzzy C-Means. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science, vol 10934. Springer, Cham. https://doi.org/10.1007/978-3-319-96136-1_1

DOI: https://doi.org/10.1007/978-3-319-96136-1_1
Published: 08 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96135-4
Online ISBN: 978-3-319-96136-1
eBook Packages: Computer Science, Computer Science (R0)
