Abstract
Minkowski Weighted K-Means is a variant of K-Means set in the Minkowski space that automatically computes feature weights for each cluster. Like K-Means itself, its accuracy depends heavily on the initial centroids it is given. In this paper we report experiments comparing six initializations, random initialization and five others set in the Minkowski space, in terms of accuracy, processing time, and recovery of the Minkowski exponent p.
We have found that the Ward method in the Minkowski space tends to outperform other initializations, with the exception of low-dimensional Gaussian Models with noise features. In these, a modified version of intelligent K-Means excels.
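To make the phrase "computing weights for features at each cluster" concrete, the following is a minimal sketch. It assumes the weight update commonly given for Minkowski Weighted K-Means, in which the weight of a feature in a cluster is inversely related to that feature's within-cluster dispersion. The abstract itself states no formulas, so the function names, the `eps` smoothing constant, and the exact form of the update are illustrative assumptions, not a reproduction of the authors' code.

```python
def minkowski_weighted_distance(x, centroid, weights, p):
    """Weighted Minkowski distance raised to the power p (no p-th root taken).

    Assumed form: sum over features v of w_v**p * |x_v - c_v|**p.
    """
    return sum((w ** p) * abs(a - c) ** p
               for a, c, w in zip(x, centroid, weights))


def cluster_feature_weights(points, centroid, p, eps=1e-9):
    """Feature weights for a single cluster, assuming p > 1.

    D_v is the dispersion of feature v within the cluster. The assumed
    update is w_v = 1 / sum_u (D_v / D_u) ** (1 / (p - 1)), so weights
    sum to 1 and low-dispersion features receive larger weights.
    eps is a hypothetical smoothing term that avoids division by zero.
    """
    n_features = len(centroid)
    dispersions = [
        sum(abs(pt[v] - centroid[v]) ** p for pt in points) + eps
        for v in range(n_features)
    ]
    exponent = 1.0 / (p - 1.0)
    return [
        1.0 / sum((dispersions[v] / dispersions[u]) ** exponent
                  for u in range(n_features))
        for v in range(n_features)
    ]


# Toy cluster: feature 0 is tight, feature 1 is widely spread.
points = [(0.0, 0.0), (2.0, 10.0), (0.0, 10.0), (2.0, 0.0)]
centroid = (1.0, 5.0)  # supplied by the caller; for p != 2 this is not the mean in general
weights = cluster_feature_weights(points, centroid, p=2.0)
```

In this toy cluster the tightly grouped feature receives most of the weight. The centroid is passed in rather than computed, because the Minkowski centre of a cluster coincides with the mean only when p = 2.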
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Amorim, R.C., Komisarczuk, P. (2012). On Initializations for the Minkowski Weighted K-Means. In: Hollmén, J., Klawonn, F., Tucker, A. (eds) Advances in Intelligent Data Analysis XI. IDA 2012. Lecture Notes in Computer Science, vol 7619. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34156-4_6
DOI: https://doi.org/10.1007/978-3-642-34156-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34155-7
Online ISBN: 978-3-642-34156-4
eBook Packages: Computer Science (R0)