Abstract
Clustering is a fundamental task in data mining. Affinity propagation clustering (APC) has proven effective in various domains. APC iteratively propagates information between samples, updates the responsibility and availability matrices, and uses these matrices to choose the exemplar (cluster centroid) of each cluster. However, because APC chooses exemplars from among the sample points, an exemplar may not coincide with the true centroid of its cluster, so there may be some deviation between the two. As a result, a sample near the decision boundary may have a relatively large similarity to the exemplar of a cluster it does not belong to, and is therefore easily misclustered. To mitigate this problem, we propose an improved APC based on centroid-deviation-distance similarity (APC-CDD). APC-CDD first runs k-means on all samples to find a more realistic centroid for each cluster, and then calculates the approximate centroid deviation distance of each cluster. It then adjusts the similarity between each pair of samples by subtracting the centroid deviation distance of the clusters they belong to. Finally, it applies APC with this centroid-deviation-distance similarity to group the samples. Our empirical study on synthetic and UCI datasets shows that APC-CDD performs better than the original APC and other related approaches.
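The similarity adjustment described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the base similarity is the negative Euclidean distance, that a cluster's centroid deviation distance is the distance from its k-means centroid to the nearest member sample, and that the deviations of both samples' clusters are subtracted. The function names `kmeans`, `deviation_distances`, and `adjusted_similarity` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def assign():
        # Each point goes to its nearest centroid (Euclidean distance).
        return [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return centroids, assign()


def deviation_distances(points, centroids, labels):
    """ASSUMPTION: deviation = distance from a centroid to its nearest member."""
    dev = []
    for c, centre in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == c]
        dev.append(min(math.dist(p, centre) for p in members) if members else 0.0)
    return dev


def adjusted_similarity(points, labels, dev):
    """ASSUMPTION: s'(i, j) = -dist(i, j) - dev[cluster(i)] - dev[cluster(j)].

    The diagonal (the APC preferences) is left at 0.0 for the caller to set.
    """
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                base = -math.dist(points[i], points[j])
                sim[i][j] = base - dev[labels[i]] - dev[labels[j]]
    return sim


# Two small, well-separated groups of 2-D samples.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, labels = kmeans(points, k=2)
dev = deviation_distances(points, centroids, labels)
sim = adjusted_similarity(points, labels, dev)
```

The resulting matrix `sim` could then be passed, with suitable diagonal preferences, to any affinity propagation routine that accepts a precomputed similarity matrix.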
Supported by NSFC (61872300 and 61873214), Fundamental Research Funds for the Central Universities (XDJK2019B024), NSF of CQ CSTC (cstc2018jcyjAX0228).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xie, Y., Wang, X., Zhang, L., Yu, G. (2019). Affinity Propagation Clustering Using Centroid-Deviation-Distance Based Similarity. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_21
DOI: https://doi.org/10.1007/978-981-15-1398-5_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1397-8
Online ISBN: 978-981-15-1398-5
eBook Packages: Computer Science, Computer Science (R0)