Abstract
This chapter is devoted to the most popular heuristic partitional clustering algorithms: k-means, k-medians, and k-medoids. In addition, we give an overview of some clustering algorithms based on mixture models, self-organizing maps, and fuzzy clustering. Descriptions of these algorithms, together with their flowcharts, are presented. Convergence results for the k-means and k-medians algorithms, obtained using nonsmooth optimization techniques, are discussed.
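To make the family of algorithms concrete, the following is a minimal sketch of the classical k-means heuristic (Lloyd's algorithm), alternating an assignment step and a centroid-update step until the centers stop moving. The function name, parameters, and the synthetic two-blob data are illustrative choices, not taken from the chapter.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """A minimal Lloyd's-algorithm sketch: alternate assignment and update."""
    rng = np.random.default_rng(seed)
    # Initialize centers as k distinct points drawn from the data.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center (Euclidean).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # converged: no center moved
        centers = new_centers
    return centers, labels

# Two well-separated blobs; k-means should recover them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
               rng.normal(5.0, 0.1, (20, 2))])
centers, labels = kmeans(X, 2)
```

Replacing the mean with the coordinate-wise median (for the L1 norm) or with the best medoid in each cluster yields the k-medians and k-medoids variants discussed in the chapter.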
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Bagirov, A.M., Karmitsa, N., Taheri, S. (2020). Heuristic Clustering Algorithms. In: Partitional Clustering via Nonsmooth Optimization. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-37826-4_5
DOI: https://doi.org/10.1007/978-3-030-37826-4_5
Print ISBN: 978-3-030-37825-7
Online ISBN: 978-3-030-37826-4