Affinity Propagation Clustering Using Centroid-Deviation-Distance Based Similarity

  • Conference paper
Human Brain and Artificial Intelligence (HBAI 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1072)

Abstract

Clustering is a fundamental task in data mining. Affinity propagation clustering (APC) has proved effective in various domains. APC iteratively propagates information between samples, updates the responsibility matrix and the availability matrix, and uses these matrices to choose the centroid (or exemplar) of each cluster. However, because APC chooses sample points as exemplars, these exemplars may not be the realistic centroids of the clusters they belong to, and there may be some deviation between an exemplar and the realistic cluster centroid. As a result, samples near the decision boundary may have a relatively large similarity with exemplars of clusters they do not belong to, and they are easily clustered incorrectly. To mitigate this problem, we propose an improved APC based on centroid-deviation-distance similarity (APC-CDD). APC-CDD first runs k-means on all samples to find more realistic cluster centroids, and then calculates the approximate centroid deviation distance of each cluster. After that, it adjusts the similarity between each pair of samples by subtracting the centroid deviation distance of the clusters they belong to. Finally, it applies APC with the centroid-deviation-distance based similarity to group the samples. Our empirical study on synthetic and UCI datasets shows that the proposed APC-CDD performs better than the original APC and other related approaches.
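The abstract does not give the exact formulas, so the following is only a minimal sketch of the similarity adjustment under stated assumptions, not the authors' implementation. It assumes that the base similarity is the negative squared Euclidean distance (the usual choice for affinity propagation), that the deviation distance of a cluster is the distance from its k-means centroid to the nearest sample in that cluster, and that the penalty for a pair is the sum of the deviation distances over the distinct clusters the two samples fall in. The function names `kmeans`, `centroid_deviation` and `cdd_similarity` are hypothetical, and the diagonal (the exemplar preference) is set to the median off-diagonal similarity, a common default.

```python
import math
from statistics import median


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_idx, n_iter=20):
    """Plain Lloyd's k-means; init_idx selects the starting centroids."""
    centroids = [list(points[i]) for i in init_idx]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every sample.
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each non-empty cluster.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def centroid_deviation(points, labels, centroids):
    """ASSUMPTION: deviation of a cluster = distance from its k-means centroid
    to the closest sample in that cluster (the best exemplar a sample-based
    method could pick)."""
    dev = []
    for c, centre in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == c]
        dev.append(min(math.sqrt(sq_dist(p, centre)) for p in members) if members else 0.0)
    return dev


def cdd_similarity(points, labels, dev):
    """ASSUMPTION: s'(i, j) = -||x_i - x_j||^2 minus the deviation distances of
    the distinct clusters that i and j belong to."""
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                penalty = sum(dev[c] for c in {labels[i], labels[j]})
                S[i][j] = -sq_dist(points[i], points[j]) - penalty
    # Preference (diagonal): median of the off-diagonal similarities.
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        S[k][k] = pref
    return S


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels, centroids = kmeans(points, init_idx=(0, 2))
dev = centroid_deviation(points, labels, centroids)
S = cdd_similarity(points, labels, dev)
```

The resulting matrix `S` could then be passed to any affinity propagation implementation that accepts a precomputed similarity matrix. Whether same-cluster pairs should also be penalised, and how the deviation distance is approximated, are details that the full paper, not this sketch, would settle.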

Supported by NSFC (61872300 and 61873214), Fundamental Research Funds for the Central Universities (XDJK2019B024), NSF of CQ CSTC (cstc2018jcyjAX0228).

Corresponding author

Correspondence to Guoxian Yu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xie, Y., Wang, X., Zhang, L., Yu, G. (2019). Affinity Propagation Clustering Using Centroid-Deviation-Distance Based Similarity. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_21

  • DOI: https://doi.org/10.1007/978-981-15-1398-5_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1397-8

  • Online ISBN: 978-981-15-1398-5

  • eBook Packages: Computer Science, Computer Science (R0)
