Adaptive Affinity Propagation Clustering in MapReduce Environment

Hung, Wei-Chih; Liu, Yuan-Cheng; Wu, Yi-Leh; Tang, Cheng-Yuan; Hor, Maw-Kae

doi:10.1007/978-3-319-13987-6_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8916)

Included in the following conference series:

International Conference on Technologies and Applications of Artificial Intelligence

Abstract

Affinity Propagation (AP) is a clustering algorithm based on the concept of “message passing” between data points. Unlike most clustering algorithms, such as k-means, AP does not require the number of clusters to be determined or estimated before the algorithm is run. An implementation of AP on Hadoop, a distributed cloud environment, exists and is called Map/Reduce Affinity Propagation (MRAP). However, MRAP has a limitation: it is hard to know which value of the “preference” parameter yields an optimal clustering solution. The Adaptive Affinity Propagation Clustering (AAP) algorithm was proposed to overcome this limitation by deciding the preference value in AP. In this study, we propose combining these two methods as Adaptive Map/Reduce Affinity Propagation (AMRAP), which divides the clustering task among multiple mappers and one reducer in Hadoop and decides a suitable preference value individually for each mapper. In the experiments, we compare the clustering results of the proposed AMRAP with those of the original MRAP method. The experimental results indicate that the proposed AMRAP method outperforms the original MRAP method in terms of accuracy.
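The idea described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation and it does not use Hadoop. The names `affinity_propagation`, `median_preference`, `mapper`, `reducer`, and `amrap_sketch` are hypothetical. Two assumptions are made because the abstract gives no details: the median of the pairwise similarities (a common default for AP) stands in for the adaptive preference search of AAP, and the reducer merely gathers the exemplars that the mappers emit.

```python
from statistics import median


def similarity(p, q):
    """Negative squared Euclidean distance, the usual AP similarity."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def median_preference(points):
    """ASSUMPTION: median pairwise similarity as a stand-in for AAP's search."""
    n = len(points)
    return median(similarity(points[i], points[j])
                  for i in range(n) for j in range(i + 1, n))


def affinity_propagation(points, preference, iters=200, damping=0.5):
    """Plain AP for small inputs; returns (exemplar_indices, labels)."""
    n = len(points)
    if n < 2:
        return list(range(n)), list(range(n))
    S = [[similarity(points[i], points[k]) for k in range(n)] for i in range(n)]
    for k in range(n):
        S[k][k] = preference  # self-similarity is the "preference"
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            col = sum(max(0.0, R[i][k]) for i in range(n) if i != k)
            for i in range(n):
                if i == k:
                    new = col
                else:
                    new = min(0.0, R[k][k] + col - max(0.0, R[i][k]))
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [A[k][k] + R[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:  # fallback so that every point can receive a label
        exemplars = [max(range(n), key=lambda k: score[k])]
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


def mapper(chunk):
    """Cluster one chunk with its own preference; emit its exemplar points."""
    if len(chunk) < 2:
        return list(chunk)
    exemplars, _ = affinity_propagation(chunk, median_preference(chunk))
    return [chunk[k] for k in exemplars]


def reducer(exemplar_lists):
    """ASSUMPTION: the reducer only gathers the exemplars from all mappers."""
    return [p for exemplars in exemplar_lists for p in exemplars]


def amrap_sketch(chunks):
    """Run every mapper on its chunk, then a single reducer."""
    return reducer([mapper(chunk) for chunk in chunks])
```

Calling `amrap_sketch` with a list of chunks, each a list of coordinate tuples, returns the exemplar points gathered from all chunks. The responsibility update scans every rival candidate for each pair, so the sketch is only practical for small chunks; on a real cluster each mapper would run as a separate Hadoop task.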

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
Wei-Chih Hung, Yuan-Cheng Liu & Yi-Leh Wu
Department of Information Management, Huafan University, New Taipei City, Taiwan
Cheng-Yuan Tang
Kainan University, Taoyuan, Taiwan
Maw-Kae Hor

Editor information

Editors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, No. 43, Sec. 4, Keelung Rd., Da’an Dist., 106, Taipei City, Taiwan
Shin-Ming Cheng
Department of Information Management, Tamkang University, No. 151, Yingzhuan Rd., Danshui Dist., 25137, New Taipei City, Taiwan
Min-Yuh Day

About this paper

Cite this paper

Hung, WC., Liu, YC., Wu, YL., Tang, CY., Hor, MK. (2014). Adaptive Affinity Propagation Clustering in MapReduce Environment. In: Cheng, SM., Day, MY. (eds) Technologies and Applications of Artificial Intelligence. TAAI 2014. Lecture Notes in Computer Science, vol 8916. Springer, Cham. https://doi.org/10.1007/978-3-319-13987-6_20

DOI: https://doi.org/10.1007/978-3-319-13987-6_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13986-9
Online ISBN: 978-3-319-13987-6
eBook Packages: Computer Science, Computer Science (R0)
