Efficient maximum clique computation and enumeration over large sparse graphs

Abstract

This paper studies the problem of maximum clique computation (MCC) over sparse graphs, as large real-world graphs are usually sparse. In the literature, the problem of MCC over sparse graphs has been studied separately and less extensively than its dense counterpart—MCC over dense graphs—and advanced algorithmic techniques that are developed for MCC over dense graphs have not been utilized in the existing MCC solvers for sparse graphs. In this paper, we design an algorithm \(\mathsf {MC\text {-}BRB}\) for sparse graphs which transforms an instance of MCC over a large sparse graph G to instances of k-clique finding (KCF) over dense subgraphs of G, each of which can be computed by the existing MCC solvers for dense graphs. To further improve the efficiency, we then develop a new branch-reduce-&-bound framework for KCF over dense graphs by proposing light-weight reducing techniques and leveraging the advanced branching and bounding techniques that are used in the existing MCC solvers for dense graphs. In addition, we also design an ego-centric algorithm \(\mathsf {MC\text {-}EGO}\) for heuristically computing a near-maximum clique in near-linear time, and we extend our \(\mathsf {MC\text {-}BRB}\) algorithm to enumerate all maximum cliques. Finally, we parallelize our algorithms to exploit multiple CPU cores. We conduct extensive empirical studies on large real graphs and demonstrate the efficiency and effectiveness of our techniques.
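The reduction from MCC to KCF described above can be illustrated with a minimal sketch. It assumes a degeneracy (smallest-last) vertex ordering and plain backtracking for the k-clique search. These choices, and all function names below, are illustrative assumptions rather than the paper's implementation; in particular, the reducing, branching, and bounding techniques of \(\mathsf {MC\text {-}BRB}\) are not reproduced here.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; a clique never uses them
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degeneracy_ordering(adj):
    """Repeatedly peel a minimum-degree vertex (smallest-last order).

    Quadratic for readability; bucket-based variants run in O(m) time.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    order = []
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        remaining.remove(v)
        order.append(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return order


def find_k_clique(adj, candidates, k):
    """Return a list of k mutually adjacent vertices from candidates, or None."""
    if k == 0:
        return []
    pool = set(candidates)
    for v in list(pool):
        if len(pool) < k:
            return None  # too few vertices left to form a k-clique
        rest = find_k_clique(adj, pool & adj[v], k - 1)
        if rest is not None:
            return [v] + rest
        pool.discard(v)  # no k-clique in pool contains v, so drop it
    return None


def maximum_clique(adj):
    """Solve MCC as a sequence of KCF instances over small subgraphs.

    For each vertex v, only its neighbours that come later in the ordering
    are considered, so each KCF instance is small when the graph is sparse.
    """
    order = degeneracy_ordering(adj)
    rank = {v: i for i, v in enumerate(order)}
    best = []
    for v in order:
        later = {w for w in adj[v] if rank[w] > rank[v]}
        # Ask for a clique that beats the incumbent: |best| vertices plus v.
        while len(later) >= len(best):
            found = find_k_clique(adj, later, len(best))
            if found is None:
                break
            best = [v] + found
    return best


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]
    print(sorted(maximum_clique(build_adjacency(edges))))
```

In the example graph, the only clique of size 4 is {0, 1, 2, 3}, so that is the set the sketch reports. Each KCF call only asks whether the current best can be beaten by one, which is what lets a decision-style k-clique solver stand in for a full maximum clique solver.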

Notes

  1. When processing large sparse graphs, although the adjacency matrix can be replaced by a hash set for memory efficiency, the quadratic (i.e., \(|V|^2\)) memory consumption is still inevitable; note that the state-of-the-art MCC-Dense solver \(\mathsf {MoMC}\) [34] also materializes the complement graph of the input graph, i.e., explicitly stores the non-neighbors of each vertex, for efficient processing.

  2. The source code of \(\mathsf {MC\text {-}BRB}\) is open sourced at https://github.com/LijunChang/MC-BRB.

  3. http://snap.stanford.edu/.

  4. http://law.di.unimi.it/datasets.php.

  5. http://networkrepository.com/.

  6. http://man7.org/linux/man-pages/man1/time.1.html.

  7. The source code of \(\mathsf {MC\text {-}BRB}\) is open sourced at https://github.com/LijunChang/MC-BRB.

  8. The source code of \(\mathsf {PMC}\) is downloaded from https://github.com/ryanrossi/pmc.

  9. The binary code of \(\mathsf {BBMCSP}\) is downloaded from http://venus.elai.upm.es/logs/results_sparse/bin/bbmcsp_linux_release.

  10. The source code of \(\mathsf {RMC}\) is obtained from the authors of [35].

  11. The adjacency lists are represented by compressed bit strings in \(\mathsf {BBMCSP}\), and represented by arrays (specifically, C++ vectors) in \(\mathsf {PMC}\) and \(\mathsf {RMC}\).

  12. The source code of \(\mathsf {MoMC}\) is downloaded from https://home.mis.u-picardie.fr/~cli/MoMC2016.c.

  13. https://turing.cs.hbg.psu.edu/txn131/clique.html.

  14. https://github.com/sparsehash/sparsehash.

References

  1. Akiba, T., Iwata, Y.: Branch-and-reduce exponential/fpt algorithms in practice: a case study of vertex cover. Theor. Comput. Sci. 609, 211–225 (2016)

  2. Andrade, D.V., Resende, M.G.C., Werneck, R.F.: Fast local search for the maximum independent set problem. J. Heuristics 18(4), 525–547 (2012)

  3. Batagelj, V., Zaversnik, M.: An O(m) algorithm for cores decomposition of networks. CoRR, cs.DS/0310049 (2003)

  4. Berman, P., Fujito, T.: On approximation properties of the independent set problem for low degree graphs. Theor. Comput. Sys. 32(2), 115–132 (1999)

  5. Berry, N., Ko, T., Moy, T., Smrcka, J., Turnley, J., Ben, W.: Emergent clique formation in terrorist recruitment. In: Workshop on Agent Organizations: Theory and Practice (2004)

  6. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U.: Complex networks: structure and dynamics. Phys. Rep. 424(4–5), 175–308 (2006)

  7. Boginski, V., Butenko, S., Pardalos, P.M.: Statistical analysis of financial networks. Comput. Stat. Data Anal. 48(2), 431–443 (2005)

  8. Bron, C., Kerbosch, J.: Finding all cliques of an undirected graph (algorithm 457). Commun. ACM 16(9), 575–576 (1973)

  9. Carraghan, R., Pardalos, P.M.: An exact algorithm for the maximum clique problem. Oper. Res. Lett. 9(6), 375–382 (1990)

  10. Chang, L.: Efficient maximum clique computation over large sparse graphs. In: Proceedings of SIGKDD’19 (2019)

  11. Chang, L., Li, W., Zhang, W.: Computing a near-maximum independent set in linear time by reducing-peeling. In: Proceedings of SIGMOD’17 (2017)

  12. Chang, L., Qin, L.: Cohesive Subgraph Computation Over Large Sparse Graphs. Springer Series in the Data Sciences. Springer, Berlin (2018)

  13. Chang, L., Yu, J.X., Qin, L.: Fast maximal cliques enumeration in sparse graphs. Algorithmica 66, 173 (2012)

  14. Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: Proceedings of SIGMOD’13 (2013)

  15. Cheng, J., Ke, Y., Fu, A.W.-C., Yu, J.X., Zhu, L.: Finding maximal cliques in massive networks. ACM Trans. Database Syst. 36(4), 21:1–21:34 (2011)

  16. Chiba, N., Nishizeki, T.: Arboricity and subgraph listing algorithms. SIAM J. Comput. 14(1), 210–223 (1985)

  17. Cohen, J.: Trusses: cohesive subgraphs for social network analysis. National Security Agency Technical Report (2008)

  18. Danisch, M., Balalau, O.D., Sozio, M.: Listing k-cliques in sparse real-world graphs. In: Proceedings of WWW’18, pp. 589–598 (2018)

  19. Deveci, M., Boman, E.G., Devine, K.D., Rajamanickam, S.: Parallel graph coloring for manycore architectures. In: Proceedings of IPDPS’16, pp. 892–901 (2016)

  20. Dhulipala, L., Blelloch, G.E., Shun, J.: Julienne: a framework for parallel graph algorithms using work-efficient bucketing. In: Proceedings of SPAA’17, pp. 293–304 (2017)

  21. Eppstein, D., Löffler, M., Strash, D.: Listing all maximal cliques in large sparse real-world graphs. ACM J. Exp. Algorithm. 12, 18 (2013)

  22. Fomin, F.V., Grandoni, F., Kratsch, D.: A measure & conquer approach for the analysis of exact algorithms. J. ACM 56(5), 12 (2009)

  23. Funabiki, N., Takefuji, Y., Lee, K.C.: A neural network model for finding a near-maximum clique. J. Parallel Distrib. Comput. 14(3), 340–344 (1992)

  24. Goldberg, A.V.: Finding a maximum density subgraph. Technical report, Berkeley, CA, USA (1984)

  25. Halldórsson, M.M., Radhakrishnan, J.: Greed is good: approximating independent sets in sparse and bounded-degree graphs. Algorithmica 18(1), 145–163 (1997)

  26. Håstad, J.: Clique is hard to approximate within \(n^{1-\epsilon}\). In: Proceedings of FOCS’96, pp. 627–636 (1996)

  27. Hespe, D., Lamm, S., Schulz, C., Strash, D.: WeGotYouCovered: the winning solver from the PACE 2019 implementation challenge, vertex cover track. CoRR abs/1908.06795 (2019)

  28. Jerrum, M.: Large cliques elude the metropolis process. Random Struct. Algorithms 3(4), 347–360 (1992)

  29. Karp, R.M.: Reducibility among combinatorial problems. In: Proceedings of CCC’72, pp. 85–103 (1972)

  30. Kim, H., Lee, J., Bhowmick, S.S., Han, W.-S., Lee, J.-H., Ko, S., Jarrah, M.H.A.: DUALSIM: parallel subgraph enumeration in a massive graph on a single machine. In: Proceedings of SIGMOD’16 (2016)

  31. Lai, L., Qin, L., Lin, X., Zhang, Y., Chang, L.: Scalable distributed subgraph enumeration. PVLDB 10(3), 217–228 (2016)

  32. Lamm, S., Sanders, P., Schulz, C., Strash, D., Werneck, R.F.: Finding near-optimal independent sets at scale. In: Proceedings of ALENEX’16, pp. 138–150 (2016)

  33. Li, C.-M., Fang, Z., Xu, K.: Combining maxsat reasoning and incremental upper bound for the maximum clique problem. In: Proceedings of ICTAI’13 (2013)

  34. Li, C.-M., Jiang, H., Manyà, F.: On minimization of the number of branches in branch-and-bound algorithms for the maximum clique problem. Comput. OR 84, 1–15 (2017)

  35. Lu, C., Yu, J.X., Wei, H., Zhang, Y.: Finding the maximum clique in massive graphs. PVLDB 10(11), 1538–1549 (2017)

  36. Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M.: Clique-based data mining for related genes in a biomedical database. BMC Bioinform. 10, 44 (2009)

  37. Matula, D.W., Beck, L.L.: Smallest-last ordering and clustering and graph coloring algorithms. J. ACM 30(3), 417–427 (1983)

  38. Pardalos, P.M., Xue, J.: The maximum clique problem. J. Glob. Optim. 4(3), 301–328 (1994)

  39. Pattabiraman, B., Patwary, M.M.A., Gebremedhin, A.H., Liao, W., Choudhary, A.N.: Fast algorithms for the maximum clique problem on massive graphs with applications to overlapping community detection. Internet Math. 11(4–5), 421–448 (2015)

  40. Pullan, W., Mascia, F., Brunato, M.: Cooperating local search for the maximum clique problem. J. Heuristics 17(2), 181–199 (2011)

  41. Rokos, G., Gorman, G., Kelly, P.H.J.: A fast and scalable graph coloring algorithm for multi-core and many-core architectures. In: Proceedings of Euro-Par’15, pp. 414–425 (2015)

  42. Rossi, R.A., Gleich, D.F., Gebremedhin, A.H.: Parallel maximum clique algorithms with applications to network analysis. SIAM J. Sci. Comput. 37(5), 13 (2015)

  43. Rossi, R.A., Zhou, R.: Graphzip: a clique-based sparse graph compression method. J. Big Data 5, 10 (2018)

  44. Sariyüce, A.E., Seshadhri, C., Pinar, A.: Local algorithms for hierarchical dense subgraph discovery. PVLDB 12(1), 43–56 (2018)

  45. Segundo, P.S., Lopez, A., Pardalos, P.M.: A new exact maximum clique algorithm for large and massive sparse graphs. Comput. Oper. Res. 66, 81–94 (2016)

  46. Seidman, S.B.: Network structure and minimum degree. Soc. Netw. 5(3), 269–287 (1983)

  47. Serafini, M., De Francisci Morales, G., Siganos, G.: Qfrag: distributed graph search via subgraph isomorphism. In: Proceedings of SoCC’17 (2017)

  48. Tomita, E.: Efficient algorithms for finding maximum and maximal cliques and their applications. In: Proceedings of WALCOM’17, pp. 3–15 (2017)

  49. Tomita, E., Sutani, Y., Higashi, T., Takahashi, S., Wakatsuki, M.: A simple and faster branch-and-bound algorithm for finding a maximum clique. In: Proceedings of WALCOM’10, pp. 191–203 (2010)

  50. Tomita, E., Tanaka, A., Takahashi, H.: The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363(1), 28–42 (2006)

  51. Tomita, E., Yoshida, K., Hatta, T., Nagao, A., Ito, H., Wakatsuki, M.: A much faster branch-and-bound algorithm for finding a maximum clique. In: Proceedings of FAW’16, pp. 215–226 (2016)

  52. Wen, D., Qin, L., Zhang, Y., Lin, X., Yu, J.X.: I/O efficient core graph decomposition: application to degeneracy ordering. IEEE Trans. Knowl. Data Eng. 31(1), 75–90 (2019)

  53. Xiang, J., Guo, C., Aboulnaga, A.: Scalable maximum clique computation using mapreduce. In: Proceedings of ICDE’13, pp. 74–85 (2013)

  54. Zheng, X., Liu, T., Yang, Z., Wang, J.: Large cliques in arabidopsis gene coexpression network and motif discovery. J. Plant Physiol. 168(6), 611–618 (2011)

Acknowledgements

The author is supported by ARC DP160101513 and FT180100256.

Author information

Corresponding author

Correspondence to Lijun Chang.

About this article

Cite this article

Chang, L. Efficient maximum clique computation and enumeration over large sparse graphs. The VLDB Journal 29, 999–1022 (2020). https://doi.org/10.1007/s00778-020-00602-z

Keywords

  • Maximum clique computation
  • Maximum clique enumeration
  • Large sparse graph
  • Branch-bound-and-reduce
  • Reducing techniques