Abstract
This paper presents an asymmetric k-means clustering algorithm. The asymmetric version is derived using asymmetric coefficients, which convey the information carried by the asymmetry in the analyzed data sets. The formulation is motivated by the observation that, when a data set is asymmetric in nature, a data analysis algorithm should adjust to that asymmetry. The traditional k-means approach, which relies on symmetric dissimilarities, does not handle this kind of phenomenon correctly. We propose a k-means algorithm that uses the asymmetric coefficients and can therefore reflect the asymmetric relationships between objects in the analyzed data sets. The results of our experimental study on real data show that the asymmetric k-means approach outperforms its symmetric counterpart.
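The abstract does not define the asymmetric coefficients, so the following is only a minimal sketch of the general idea of running k-means with a dissimilarity that is not symmetric. As a stand-in, it assumes the Kullback-Leibler divergence as the asymmetric dissimilarity; this is not the coefficient scheme of the paper. The names `kl_dissimilarity`, `mean`, and `asymmetric_kmeans` are hypothetical, and the toy data are made up for illustration.

```python
import math


def kl_dissimilarity(p, q):
    """KL(p || q), an asymmetric dissimilarity: KL(p || q) != KL(q || p) in general.

    Stand-in for illustration only, NOT the paper's asymmetric coefficients.
    Assumes p and q are strictly positive vectors that each sum to 1.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def mean(points):
    """Component-wise arithmetic mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def asymmetric_kmeans(data, init_centroids, dissim, max_iter=100):
    """k-means loop in which dissim(x, centroid) need not be symmetric."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: the argument order (object first, centroid second)
        # matters because the dissimilarity is asymmetric.
        new_labels = [
            min(range(k), key=lambda j: dissim(x, centroids[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        # Update step: for KL(x || c) the arithmetic mean of the members
        # minimizes the summed dissimilarity; other measures may need a
        # different update rule.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels, centroids


# Made-up toy data: two groups of strictly positive probability vectors.
data = [
    [0.80, 0.10, 0.10], [0.70, 0.20, 0.10], [0.75, 0.15, 0.10],
    [0.10, 0.10, 0.80], [0.10, 0.20, 0.70], [0.15, 0.10, 0.75],
]
labels, centroids = asymmetric_kmeans(data, [data[0], data[3]], kl_dissimilarity)
print(labels)
```

The initial centroids are passed in explicitly so that the run is deterministic. Swapping the arguments of `dissim` in the assignment step would define a different clustering problem, which is the kind of directional information a symmetric distance cannot express.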
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Olszewski, D. (2012). k-Means Clustering of Asymmetric Data. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_22
DOI: https://doi.org/10.1007/978-3-642-28942-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science, Computer Science (R0)