Efficient maximum clique computation and enumeration over large sparse graphs

Abstract

This paper studies the problem of maximum clique computation (MCC) over sparse graphs, as large real-world graphs are usually sparse. In the literature, MCC over sparse graphs has been studied separately from, and less extensively than, MCC over dense graphs, and the advanced algorithmic techniques developed for dense graphs have not been utilized in the existing MCC solvers for sparse graphs. In this paper, we design an algorithm \(\mathsf {MC\text {-}BRB}\) for sparse graphs which transforms an instance of MCC over a large sparse graph G into instances of k-clique finding (KCF) over dense subgraphs of G, each of which can be solved by the existing MCC solvers for dense graphs. To further improve efficiency, we then develop a new branch-reduce-&-bound framework for KCF over dense graphs by proposing light-weight reducing techniques and leveraging the advanced branching and bounding techniques used in the existing MCC solvers for dense graphs. In addition, we design an ego-centric algorithm \(\mathsf {MC\text {-}EGO}\) for heuristically computing a near-maximum clique in near-linear time, and we extend \(\mathsf {MC\text {-}BRB}\) to enumerate all maximum cliques. Finally, we parallelize our algorithms to exploit multiple CPU cores. Extensive empirical studies on large real graphs demonstrate the efficiency and effectiveness of our techniques.
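The reduction from MCC over a sparse graph to k-clique finding over small dense subgraphs can be illustrated with a minimal sketch. This is not the \(\mathsf {MC\text {-}BRB}\) implementation: it assumes a plain degeneracy (minimum-degree peeling) ordering and a naive branch-and-bound search, and the function names `degeneracy_order`, `extend`, and `maximum_clique` are hypothetical. For each vertex u, only the neighbors ranked after u are considered, and the search asks whether that neighborhood contains a clique one larger than the best found so far.

```python
from collections import defaultdict


def degeneracy_order(adj):
    """Peel a minimum-degree vertex repeatedly (simple quadratic version)."""
    deg = {v: len(ns) for v, ns in adj.items()}
    removed = set()
    order = []
    while len(order) < len(adj):
        v = min((u for u in adj if u not in removed), key=lambda u: deg[u])
        order.append(v)
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                deg[w] -= 1
    return order


def extend(adj, clique, cand, target):
    """Return a clique of size >= target that extends `clique` with
    vertices from `cand`, or None if no such clique exists."""
    if len(clique) >= target:
        return list(clique)
    for i, v in enumerate(cand):
        # Bound: stop when the remaining candidates cannot reach the target.
        if len(clique) + len(cand) - i < target:
            break
        rest = [w for w in cand[i + 1:] if w in adj[v]]
        found = extend(adj, clique + [v], rest, target)
        if found is not None:
            return found
    return None


def maximum_clique(edges):
    """Exact maximum clique via per-vertex k-clique finding (illustrative)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    order = degeneracy_order(adj)
    rank = {v: i for i, v in enumerate(order)}
    best = []
    for u in order:
        # Any clique whose lowest-ranked vertex is u lies inside this set.
        later = [w for w in adj[u] if rank[w] > rank[u]]
        while True:
            found = extend(adj, [u], later, len(best) + 1)
            if found is None:
                break
            best = found
    return best


if __name__ == "__main__":
    # A 4-clique {1, 2, 3, 4} with a short tail 4-5-6.
    demo = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
    print(sorted(maximum_clique(demo)))
```

Because each candidate set is restricted to higher-ranked neighbors, its size is bounded by the degeneracy of the graph, which is typically small for sparse real-world graphs; this is what makes handing each subproblem to a dense-graph solver plausible.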

Notes

  1. When processing large sparse graphs, although the adjacency matrix can be replaced by a hash set for memory efficiency, the quadratic (i.e., \(|V|^2\)) memory consumption is still inevitable; note that the state-of-the-art MCC-Dense solver \(\mathsf {MoMC}\) [34] also materializes the complement graph of the input graph, i.e., explicitly stores the non-neighbors of each vertex, for efficient processing.
  2. The source code of \(\mathsf {MC\text {-}BRB}\) is open sourced at https://github.com/LijunChang/MC-BRB.
  3. http://snap.stanford.edu/.
  4. http://law.di.unimi.it/datasets.php.
  5. http://networkrepository.com/.
  6. http://man7.org/linux/man-pages/man1/time.1.html.
  7. The source code of \(\mathsf {MC\text {-}BRB}\) is open sourced at https://github.com/LijunChang/MC-BRB.
  8. The source code of \(\mathsf {PMC}\) is downloaded from https://github.com/ryanrossi/pmc.
  9. The binary code of \(\mathsf {BBMCSP}\) is downloaded from http://venus.elai.upm.es/logs/results_sparse/bin/bbmcsp_linux_release.
  10. The source code of \(\mathsf {RMC}\) is obtained from the authors of [35].
  11. The adjacency lists are represented by compressed bit strings in \(\mathsf {BBMCSP}\), and represented by arrays (specifically, C++ vectors) in \(\mathsf {PMC}\) and \(\mathsf {RMC}\).
  12. The source code of \(\mathsf {MoMC}\) is downloaded from https://home.mis.u-picardie.fr/~cli/MoMC2016.c.
  13. https://turing.cs.hbg.psu.edu/txn131/clique.html.
  14. https://github.com/sparsehash/sparsehash.

References

  1. Akiba, T., Iwata, Y.: Branch-and-reduce exponential/FPT algorithms in practice: a case study of vertex cover. Theor. Comput. Sci. 609, 211–225 (2016)
  2. Andrade, D.V., Resende, M.G.C., Werneck, R.F.: Fast local search for the maximum independent set problem. J. Heuristics 18(4), 525–547 (2012)
  3. Batagelj, V., Zaversnik, M.: An O(m) algorithm for cores decomposition of networks. CoRR cs.DS/0310049 (2003)
  4. Berman, P., Fujito, T.: On approximation properties of the independent set problem for low degree graphs. Theory Comput. Syst. 32(2), 115–132 (1999)
  5. Berry, N., Ko, T., Moy, T., Smrcka, J., Turnley, J., Wu, B.: Emergent clique formation in terrorist recruitment. In: Workshop on Agent Organizations: Theory and Practice (2004)
  6. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U.: Complex networks: structure and dynamics. Phys. Rep. 424(4–5), 175–308 (2006)
  7. Boginski, V., Butenko, S., Pardalos, P.M.: Statistical analysis of financial networks. Comput. Stat. Data Anal. 48(2), 431–443 (2005)
  8. Bron, C., Kerbosch, J.: Finding all cliques of an undirected graph (algorithm 457). Commun. ACM 16(9), 575–576 (1973)
  9. Carraghan, R., Pardalos, P.M.: An exact algorithm for the maximum clique problem. Oper. Res. Lett. 9(6), 375–382 (1990)
  10. Chang, L.: Efficient maximum clique computation over large sparse graphs. In: Proceedings of SIGKDD’19 (2019)
  11. Chang, L., Li, W., Zhang, W.: Computing a near-maximum independent set in linear time by reducing-peeling. In: Proceedings of SIGMOD’17 (2017)
  12. Chang, L., Qin, L.: Cohesive Subgraph Computation Over Large Sparse Graphs. Springer Series in the Data Sciences. Springer, Berlin (2018)
  13. Chang, L., Yu, J.X., Qin, L.: Fast maximal cliques enumeration in sparse graphs. Algorithmica 66, 173 (2012)
  14. Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: Proceedings of SIGMOD’13 (2013)
  15. Cheng, J., Ke, Y., Fu, A.W.-C., Yu, J.X., Zhu, L.: Finding maximal cliques in massive networks. ACM Trans. Database Syst. 36(4), 21:1–21:34 (2011)
  16. Chiba, N., Nishizeki, T.: Arboricity and subgraph listing algorithms. SIAM J. Comput. 14(1), 210–223 (1985)
  17. Cohen, J.: Trusses: cohesive subgraphs for social network analysis. National Security Agency Technical Report (2008)
  18. Danisch, M., Balalau, O.D., Sozio, M.: Listing k-cliques in sparse real-world graphs. In: Proceedings of WWW’18, pp. 589–598 (2018)
  19. Deveci, M., Boman, E.G., Devine, K.D., Rajamanickam, S.: Parallel graph coloring for manycore architectures. In: Proceedings of IPDPS’16, pp. 892–901 (2016)
  20. Dhulipala, L., Blelloch, G.E., Shun, J.: Julienne: a framework for parallel graph algorithms using work-efficient bucketing. In: Proceedings of SPAA’17, pp. 293–304 (2017)
  21. Eppstein, D., Löffler, M., Strash, D.: Listing all maximal cliques in large sparse real-world graphs. ACM J. Exp. Algorithm. 12, 18 (2013)
  22. Fomin, F.V., Grandoni, F., Kratsch, D.: A measure & conquer approach for the analysis of exact algorithms. J. ACM 56(5), 12 (2009)
  23. Funabiki, N., Takefuji, Y., Lee, K.C.: A neural network model for finding a near-maximum clique. J. Parallel Distrib. Comput. 14(3), 340–344 (1992)
  24. Goldberg, A.V.: Finding a maximum density subgraph. Technical report, Berkeley, CA, USA (1984)
  25. Halldórsson, M.M., Radhakrishnan, J.: Greed is good: approximating independent sets in sparse and bounded-degree graphs. Algorithmica 18(1), 145–163 (1997)
  26. Håstad, J.: Clique is hard to approximate within \(n^{1-\epsilon}\). In: Proceedings of FOCS’96, pp. 627–636 (1996)
  27. Hespe, D., Lamm, S., Schulz, C., Strash, D.: WeGotYouCovered: the winning solver from the PACE 2019 implementation challenge, vertex cover track. CoRR abs/1908.06795 (2019)
  28. Jerrum, M.: Large cliques elude the Metropolis process. Random Struct. Algorithms 3(4), 347–360 (1992)
  29. Karp, R.M.: Reducibility among combinatorial problems. In: Proceedings of CCC’72, pp. 85–103 (1972)
  30. Kim, H., Lee, J., Bhowmick, S.S., Han, W.-S., Lee, J.-H., Ko, S., Jarrah, M.H.A.: DUALSIM: parallel subgraph enumeration in a massive graph on a single machine. In: Proceedings of SIGMOD’16 (2016)
  31. Lai, L., Qin, L., Lin, X., Zhang, Y., Chang, L.: Scalable distributed subgraph enumeration. PVLDB 10(3), 217–228 (2016)
  32. Lamm, S., Sanders, P., Schulz, C., Strash, D., Werneck, R.F.: Finding near-optimal independent sets at scale. In: Proceedings of ALENEX’16, pp. 138–150 (2016)
  33. Li, C.-M., Fang, Z., Xu, K.: Combining MaxSAT reasoning and incremental upper bound for the maximum clique problem. In: Proceedings of ICTAI’13 (2013)
  34. Li, C.-M., Jiang, H., Manyà, F.: On minimization of the number of branches in branch-and-bound algorithms for the maximum clique problem. Comput. Oper. Res. 84, 1–15 (2017)
  35. Lu, C., Yu, J.X., Wei, H., Zhang, Y.: Finding the maximum clique in massive graphs. PVLDB 10(11), 1538–1549 (2017)
  36. Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M.: Clique-based data mining for related genes in a biomedical database. BMC Bioinform. 10, 44 (2009)
  37. Matula, D.W., Beck, L.L.: Smallest-last ordering and clustering and graph coloring algorithms. J. ACM 30(3), 417–427 (1983)
  38. Pardalos, P.M., Xue, J.: The maximum clique problem. J. Glob. Optim. 4(3), 301–328 (1994)
  39. Pattabiraman, B., Patwary, M.M.A., Gebremedhin, A.H., Liao, W., Choudhary, A.N.: Fast algorithms for the maximum clique problem on massive graphs with applications to overlapping community detection. Internet Math. 11(4–5), 421–448 (2015)
  40. Pullan, W., Mascia, F., Brunato, M.: Cooperating local search for the maximum clique problem. J. Heuristics 17(2), 181–199 (2011)
  41. Rokos, G., Gorman, G., Kelly, P.H.J.: A fast and scalable graph coloring algorithm for multi-core and many-core architectures. In: Proceedings of Euro-Par’15, pp. 414–425 (2015)
  42. Rossi, R.A., Gleich, D.F., Gebremedhin, A.H.: Parallel maximum clique algorithms with applications to network analysis. SIAM J. Sci. Comput. 37(5), 13 (2015)
  43. Rossi, R.A., Zhou, R.: Graphzip: a clique-based sparse graph compression method. J. Big Data 5, 10 (2018)
  44. Sariyüce, A.E., Seshadhri, C., Pinar, A.: Local algorithms for hierarchical dense subgraph discovery. PVLDB 12(1), 43–56 (2018)
  45. Segundo, P.S., Lopez, A., Pardalos, P.M.: A new exact maximum clique algorithm for large and massive sparse graphs. Comput. Oper. Res. 66, 81–94 (2016)
  46. Seidman, S.B.: Network structure and minimum degree. Soc. Netw. 5(3), 269–287 (1983)
  47. Serafini, M., De Francisci Morales, G., Siganos, G.: Qfrag: distributed graph search via subgraph isomorphism. In: Proceedings of SoCC’17 (2017)
  48. Tomita, E.: Efficient algorithms for finding maximum and maximal cliques and their applications. In: Proceedings of WALCOM’17, pp. 3–15 (2017)
  49. Tomita, E., Sutani, Y., Higashi, T., Takahashi, S., Wakatsuki, M.: A simple and faster branch-and-bound algorithm for finding a maximum clique. In: Proceedings of WALCOM’10, pp. 191–203 (2010)
  50. Tomita, E., Tanaka, A., Takahashi, H.: The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363(1), 28–42 (2006)
  51. Tomita, E., Yoshida, K., Hatta, T., Nagao, A., Ito, H., Wakatsuki, M.: A much faster branch-and-bound algorithm for finding a maximum clique. In: Proceedings of FAW’16, pp. 215–226 (2016)
  52. Wen, D., Qin, L., Zhang, Y., Lin, X., Yu, J.X.: I/O efficient core graph decomposition: application to degeneracy ordering. IEEE Trans. Knowl. Data Eng. 31(1), 75–90 (2019)
  53. Xiang, J., Guo, C., Aboulnaga, A.: Scalable maximum clique computation using MapReduce. In: Proceedings of ICDE’13, pp. 74–85 (2013)
  54. Zheng, X., Liu, T., Yang, Z., Wang, J.: Large cliques in Arabidopsis gene coexpression network and motif discovery. J. Plant Physiol. 168(6), 611–618 (2011)

Acknowledgements

The author is supported by ARC DP160101513 and FT180100256.

Author information

Correspondence to Lijun Chang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, L. Efficient maximum clique computation and enumeration over large sparse graphs. The VLDB Journal (2020). https://doi.org/10.1007/s00778-020-00602-z

Keywords

  • Maximum clique computation
  • Maximum clique enumeration
  • Large sparse graph
  • Branch-bound-and-reduce
  • Reducing techniques