Abstract
Community detection is a fundamental graph analytic task. However, because of their high computational complexity, many community detection algorithms cannot handle large graphs. In this chapter, we investigate a special community detection problem, namely cohesive subgraph detection. The target cohesive subgraph is the k-truss, which is motivated by a natural observation of social cohesion. We propose a novel parallel and efficient truss detection algorithm, called PeTa. PeTa produces a triangle complete subgraph (TC-subgraph) for every computing node. Based on the TC-subgraphs, it can detect the local k-truss in parallel within a few iterations. We prove that, within this new paradigm, the communication cost of PeTa is bounded by three times the number of triangles, its total computation complexity is of the same order as that of the best known serial algorithm, and the number of iterations for a given partition scheme is minimized. Furthermore, we present a subgraph-oriented model to express PeTa efficiently in parallel graph computing systems. Comprehensive experiments on various real-world graphs demonstrate that, compared with existing solutions, PeTa reduces communication cost by 2× to 19×, reduces the number of iterations by 80% to 95%, and improves overall performance by 80%.
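To make the target structure concrete, the following is a minimal serial sketch of k-truss detection by edge peeling. It is not PeTa and not the chapter's implementation. It assumes the standard definition of a k-truss (the largest subgraph in which every edge lies in at least k − 2 triangles of that subgraph), which the abstract does not spell out, and the function name `k_truss` and the integer node labels are illustrative choices.

```python
from collections import defaultdict


def k_truss(edges, k):
    """Return the sorted edge list of the k-truss of an undirected graph.

    Assumption: a k-truss is the largest subgraph in which every edge is
    contained in at least k - 2 triangles of that subgraph.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # The support of an edge (u, v) is the number of common neighbours of u and v.
    # Start with every edge whose support is already too low.
    queue = [
        (u, v)
        for u in adj
        for v in adj[u]
        if u < v and len(adj[u] & adj[v]) < k - 2
    ]

    while queue:
        u, v = queue.pop()
        if v not in adj[u]:
            continue  # this edge was removed earlier
        common = adj[u] & adj[v]
        adj[u].discard(v)
        adj[v].discard(u)
        # Removing (u, v) destroys one triangle per common neighbour w,
        # so the edges (u, w) and (v, w) may now fall below the threshold.
        for w in common:
            for a, b in ((u, w), (v, w)):
                if len(adj[a] & adj[b]) < k - 2:
                    queue.append((a, b))

    return sorted((u, v) for u in adj for v in adj[u] if u < v)


if __name__ == "__main__":
    # A 4-clique on nodes 0..3 with one pendant edge (3, 4).
    clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(k_truss(clique + [(3, 4)], 4))  # the pendant edge is peeled away
```

Recomputing the support by set intersection on every check keeps the sketch short; it is not meant to match the complexity of the best known serial algorithm mentioned in the abstract.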
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Shao, Y., Cui, B., Chen, L. (2020). Efficient Parallel Cohesive Subgraph Detection. In: Large-scale Graph Analysis: System, Algorithm and Optimization. Big Data Management. Springer, Singapore. https://doi.org/10.1007/978-981-15-3928-2_6
DOI: https://doi.org/10.1007/978-981-15-3928-2_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3927-5
Online ISBN: 978-981-15-3928-2
eBook Packages: Computer Science, Computer Science (R0)