A Comparison of Knee Strategies for Hierarchical Spatial Clustering
This paper presents a comparative study of knee detection approaches for the hierarchical clustering of 2D spatial data. Knee detection is usually performed on the dendrogram generated during clustering. For many problems, the knee is a natural indicator of the optimal number of clusters. This research compares the performance of various knee strategies on different spatial datasets. Two hierarchical clustering algorithms, single linkage and group average, are considered. Besides determining knees using conventional cluster distances, we also explore alternative metrics such as average global medoid and centroid distances, and F-score metrics. Results show that knee determination is difficult and problem dependent.
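As a rough illustration of what knee detection on a dendrogram's merge distances can look like, the following Python sketch runs a naive single-linkage agglomeration on a handful of 2D points and picks the cluster count just before the largest jump in merge distance. The largest-jump rule is only one simple heuristic, assumed here for illustration; it is not claimed to be any of the strategies evaluated in the paper, and the function names are hypothetical.

```python
import math


def single_linkage_merge_distances(points):
    """Naive single-linkage agglomeration on 2D points.

    Returns the distance at which each successive merge happens,
    i.e. the heights of the dendrogram from bottom to top.
    Cubic-time or worse; intended only for small illustrative inputs.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
        merges.append(d)
    return merges


def knee_by_largest_jump(merge_distances):
    """Suggest a cluster count from the largest jump between successive merges.

    Assumption: the knee is taken to be the point just before the biggest
    increase in merge distance. This is a simple heuristic, not a
    definitive knee detector.
    """
    if len(merge_distances) < 2:
        raise ValueError("need at least two merges (three points)")
    n_points = len(merge_distances) + 1
    jumps = [
        merge_distances[k + 1] - merge_distances[k]
        for k in range(len(merge_distances) - 1)
    ]
    k = max(range(len(jumps)), key=jumps.__getitem__)
    # After merges 0..k have been applied, n_points - (k + 1) clusters remain.
    return n_points - (k + 1)


# Two well-separated groups of three points each (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
heights = single_linkage_merge_distances(points)
suggested_k = knee_by_largest_jump(heights)
```

On clean, well-separated data such as this toy set, the heuristic recovers the two groups; on noisy or unevenly dense data the largest jump can be misleading, which is consistent with the abstract's observation that knee determination is problem dependent.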
Keywords: Knee, Hierarchical clustering, Spatial clustering
This research was supported by NSERC Discovery Grant RGPIN-2016-03653.