Parallel Simrank Computing on Large Scale Dataset on Mapreduce

  • Lina Li
  • Cuiping Li
  • Hong Chen
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 387)

Abstract

Many applications, such as recommendation systems and search engines, need to compute the similarity between objects. SimRank is a simple and intuitive similarity measure that is firmly grounded in random walk theory. Three iterative methods exist for computing SimRank, but all of them share one problem: they are time-consuming. Moreover, with data on the Internet growing rapidly, a new parallel method is needed to compute SimRank on large-scale datasets. Hadoop is a popular distributed platform. This paper exploits the features of Hadoop to compute SimRank in parallel using different methods and compares their performance.
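
For orientation, the kind of iterative computation the abstract refers to can be sketched with the basic single-machine SimRank iteration of Jeh and Widom. This is a minimal illustration and not the parallel Hadoop method of the paper: the function name `simrank`, the decay factor `c=0.8`, the iteration count, and the toy graph are all assumptions chosen for the example. The sketch recomputes every node pair in every round, which is the cost that motivates a parallel approach.

```python
from itertools import product


def simrank(in_neighbors, c=0.8, iterations=5):
    """Naive iterative SimRank (illustrative sketch, not the paper's method).

    in_neighbors maps every node to the list of nodes that point to it.
    Returns a dict mapping (a, b) to the similarity score of a and b.
    """
    nodes = list(in_neighbors)
    # Iteration 0: a node is fully similar to itself and to nothing else.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors has similarity 0 to other nodes.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, damped by c.
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = c * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: node "c" points to both "a" and "b".
graph = {"a": ["c"], "b": ["c"], "c": []}
scores = simrank(graph)
print(scores[("a", "b")])
```

In this toy graph, "a" and "b" share their only in-neighbor, so their score equals the decay factor, while any pair involving "c" (other than "c" with itself) scores zero because "c" has no in-neighbors.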

Keywords

Simrank Parallel Mapreduce Hadoop 

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Renmin University of China, Beijing, China