Abstract
A well-partitioned graph can significantly accelerate parallel graph algorithms, yet few large-scale graph computing systems adopt high-quality partitioning algorithms. The main obstacle is their high time complexity, which occasionally exceeds that of the algorithm ultimately run on the partitioned graph. Existing graph partitioning algorithms are mostly based on the multilevel k-way scheme or on iterative label propagation. Most of them yield high-quality results, but their high time and space complexity limits their use on big data. In this paper, we propose a locality-sensitive hashing (LSH) based graph partitioning algorithm whose time and space complexity is O(n), where n is the number of vertices in the graph. For hyperscale graphs of all kinds, it runs at approximately the speed of random partitioning. Compared with the latest mainstream graph partitioning algorithms, the new algorithm has a simple processing pipeline and avoids the irregular memory accesses generated by graph traversals. The experimental results show that the new algorithm is 10x faster than Metis and 2x faster than the label propagation algorithm, at the cost of a reasonable loss of precision.
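The abstract does not spell out the hashing scheme, so the following is only a minimal illustrative sketch of how an LSH-style partitioner could work, not the authors' algorithm. It assumes min-hash signatures over each vertex's closed neighbourhood (the vertex plus its neighbours) as the locality-sensitive hash, and it assigns balanced parts by sorting on the signature. The names `minhash_signature` and `lsh_partition`, the parameter `num_hashes`, and the adjacency-dictionary input are all hypothetical. Because of the sort and the per-edge hashing, this sketch is not O(n); it only shows the single-pass, traversal-free shape of the idea.

```python
import random
from typing import Dict, List, Sequence, Tuple

PRIME = 2_147_483_647  # modulus for the linear hash functions (assumption)


def minhash_signature(vertices: Sequence[int],
                      hashes: Sequence[Tuple[int, int]]) -> Tuple[int, ...]:
    """Return one min-hash value per (a, b) hash function over a vertex set."""
    return tuple(min((a * v + b) % PRIME for v in vertices) for a, b in hashes)


def lsh_partition(adj: Dict[int, List[int]], k: int,
                  num_hashes: int = 2, seed: int = 0) -> Dict[int, int]:
    """Assign each vertex to one of k parts using min-hash signatures.

    Vertices with similar neighbourhoods tend to receive similar signatures,
    so ordering by signature places them next to each other before the
    ordering is cut into k nearly equal chunks.
    """
    rng = random.Random(seed)
    hashes = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    # Hash the closed neighbourhood so that adjacent vertices share elements.
    sigs = {v: minhash_signature([v] + list(nbrs), hashes)
            for v, nbrs in adj.items()}
    # Sorting is O(n log n); it is used here only to keep the sketch short.
    order = sorted(adj, key=lambda v: sigs[v])
    chunk = -(-len(order) // k)  # ceiling division gives the part size
    return {v: i // chunk for i, v in enumerate(order)}


# Toy graph: two disjoint 4-cliques, {0..3} and {4..7}.
clique_a = {v: [u for u in range(0, 4) if u != v] for v in range(0, 4)}
clique_b = {v: [u for u in range(4, 8) if u != v] for v in range(4, 8)}
parts = lsh_partition({**clique_a, **clique_b}, k=2)
```

On the toy graph, every vertex in a clique has the same closed neighbourhood and therefore the same signature, so each clique lands in a single part. Real graphs are far less regular, and partition quality would depend on the hash family and the number of hash functions.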
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, W., Zhang, M. (2018). LSH-Based Graph Partitioning Algorithm. In: Zhou, ZH., Yang, Q., Gao, Y., Zheng, Y. (eds) Artificial Intelligence. ICAI 2018. Communications in Computer and Information Science, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-13-2122-1_5
DOI: https://doi.org/10.1007/978-981-13-2122-1_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2121-4
Online ISBN: 978-981-13-2122-1
eBook Packages: Computer Science (R0)