On Initializations for the Minkowski Weighted K-Means

de Amorim, Renato Cordeiro; Komisarczuk, Peter

doi:10.1007/978-3-642-34156-4_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7619)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Minkowski Weighted K-Means is a variant of K-Means set in the Minkowski space that automatically computes feature weights for each cluster. As a variant of K-Means, its accuracy depends heavily on the initial centroids it is given. In this paper we discuss experiments comparing six initializations, random initialization and five others set in the Minkowski space, in terms of accuracy, processing time, and recovery of the Minkowski exponent p.

We found that the Ward method in the Minkowski space tends to outperform the other initializations, except on low-dimensional Gaussian models with noise features, where a modified version of intelligent K-Means excels.
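
To make the idea of cluster-specific feature weights concrete, the following is a minimal sketch of the two quantities such a method relies on: a weighted Minkowski distance and a per-cluster weight update. The abstract does not state these formulas; they follow the formulation commonly given for Minkowski Weighted K-Means and should be read as an assumption. The function names `mw_distance` and `update_weights`, the `eps` smoothing term, and the toy data are hypothetical and chosen only for illustration.

```python
def mw_distance(x, centroid, weights, p):
    """Weighted Minkowski distance (p-th power, no root) from x to a centroid.

    Assumed form: sum over features v of w_v^p * |x_v - c_v|^p.
    """
    return sum((w ** p) * abs(a - c) ** p
               for a, c, w in zip(x, centroid, weights))


def update_weights(cluster, centroid, p, eps=1e-9):
    """Feature weights for one cluster from its within-cluster dispersions.

    Assumed form (requires p > 1):
        w_v = 1 / sum_u (D_v / D_u)^(1 / (p - 1)),
    where D_v = sum over points in the cluster of |x_v - c_v|^p.
    eps is a hypothetical smoothing constant that avoids division by zero.
    """
    n_features = len(centroid)
    disp = [sum(abs(x[v] - centroid[v]) ** p for x in cluster) + eps
            for v in range(n_features)]
    exponent = 1.0 / (p - 1.0)
    return [1.0 / sum((disp[v] / disp[u]) ** exponent
                      for u in range(n_features))
            for v in range(n_features)]


# Toy cluster: feature 0 is tight around the centroid, feature 1 is spread out.
cluster = [[1.0, 10.0], [1.2, 14.0], [0.8, 6.0]]
centroid = [1.0, 10.0]
weights = update_weights(cluster, centroid, p=2.0)
```

Under these assumptions the weights within a cluster sum to one, and a feature with low dispersion in that cluster (feature 0 here) receives the larger weight, so noisy features contribute less to the distance. The initial centroids, which are the subject of the paper, determine the first assignment of points to clusters and therefore the first set of weights.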

Author information

Authors and Affiliations

Department of Computer Science and Information Systems, Birkbeck University of London, Malet Street, WC1E 7HX, UK
Renato Cordeiro de Amorim
School of Computing and Technology, University of West London, St Mary’s Road, W5 5RF, UK
Renato Cordeiro de Amorim & Peter Komisarczuk

Editor information

Editors and Affiliations

Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, 00076, Aalto, Finland
Jaakko Hollmén
Department of Computer Science, Ostfalia University of Applied Sciences, Salzdahlumer Straße 46/48, 38302, Wolfenbüttel, Germany
Frank Klawonn
School of Information Systems, Computing and Mathematics, Brunel University, UB8 3PH, Uxbridge, Middlesex, UK
Allan Tucker

Cite this paper

de Amorim, R.C., Komisarczuk, P. (2012). On Initializations for the Minkowski Weighted K-Means. In: Hollmén, J., Klawonn, F., Tucker, A. (eds) Advances in Intelligent Data Analysis XI. IDA 2012. Lecture Notes in Computer Science, vol 7619. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34156-4_6

DOI: https://doi.org/10.1007/978-3-642-34156-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34155-7
Online ISBN: 978-3-642-34156-4
eBook Packages: Computer Science, Computer Science (R0)
