Abstract
We have developed and evaluated two parallelization schemes for a tree-based k-means clustering method on shared-memory machines. The first scheme partitions the pattern space across processors. We find that spatial decomposition of the patterns outperforms random decomposition, even though random decomposition has almost no load imbalance. The second scheme traverses the search tree in parallel. This approach solves the load imbalance problem and performs slightly better than spatial decomposition, but its efficiency is reduced by thread synchronization. In both cases, parallel tree-based k-means clustering is significantly faster than direct parallel k-means.
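One way to see why spatial decomposition can beat random decomposition despite its load imbalance is to look at how many candidate centroids survive pruning for each block of patterns. The sketch below is illustrative only and is not the authors' implementation: it replaces the search tree with a single bounding box per block, uses a standard min/max-distance filter, and approximates "spatial" decomposition by sorting on the first coordinate. All function names (`prune`, `assign_block`, `spatial_blocks`, `random_blocks`, `parallel_assign`) are hypothetical. Threads are used only to mirror the shared-memory setting; CPython's global interpreter lock means this is not a performance demonstration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def box_of(points):
    """Axis-aligned bounding box (lo, hi) of a non-empty list of points."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    return lo, hi


def min_max_sq_dist(c, lo, hi):
    """Smallest and largest squared distance from centroid c to the box."""
    dmin = dmax = 0.0
    for x, l, h in zip(c, lo, hi):
        nearest = min(max(x, l), h)                    # clamp x into [l, h]
        farthest = l if abs(x - l) > abs(x - h) else h
        dmin += (x - nearest) ** 2
        dmax += (x - farthest) ** 2
    return dmin, dmax


def prune(centroids, lo, hi):
    """Indices of centroids that could be nearest to some point in the box.

    A centroid is dropped when even its closest approach to the box is
    farther than the worst case of some other centroid.
    """
    bounds = [min_max_sq_dist(c, lo, hi) for c in centroids]
    cutoff = min(dmax for _, dmax in bounds)
    return [i for i, (dmin, _) in enumerate(bounds) if dmin <= cutoff]


def assign_block(block, centroids):
    """Label every pattern in one block, searching only surviving centroids."""
    lo, hi = box_of(block)
    cand = prune(centroids, lo, hi)
    labels = [min(cand, key=lambda i: sq_dist(p, centroids[i])) for p in block]
    return labels, len(cand)


def chunks(points, n):
    """Split a list into at most n contiguous pieces of near-equal size."""
    size = (len(points) + n - 1) // n
    return [points[i:i + size] for i in range(0, len(points), size)]


def spatial_blocks(points, n):
    """Crude spatial decomposition: strips along the first coordinate."""
    return chunks(sorted(points), n)


def random_blocks(points, n, seed=0):
    """Random decomposition: shuffle, then split evenly."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return chunks(shuffled, n)


def parallel_assign(blocks, centroids, workers=4):
    """Run the assignment step with one task per block."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: assign_block(b, centroids), blocks))
```

Calling `parallel_assign(spatial_blocks(points, 4), centroids)` and `parallel_assign(random_blocks(points, 4), centroids)` on the same data returns, for each block, its labels and the number of centroids left after pruning. A randomly chosen block tends to span the whole data set, so its bounding box excludes few centroids, whereas a spatially compact block can discard distant ones. The balanced workload of random decomposition is therefore paid for with more distance computations per pattern.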
© 2001 Springer-Verlag Berlin Heidelberg
Gürsoy, A., Cengiz, İ. (2001). Parallel Pruning for K-Means Clustering on Shared Memory Architectures. In: Sakellariou, R., Gurd, J., Freeman, L., Keane, J. (eds) Euro-Par 2001 Parallel Processing. Euro-Par 2001. Lecture Notes in Computer Science, vol 2150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44681-8_45
DOI: https://doi.org/10.1007/3-540-44681-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42495-6
Online ISBN: 978-3-540-44681-1
eBook Packages: Springer Book Archive