Think Before You Discard: Accurate Triangle Counting in Graph Streams with Deletions

  • Kijung Shin
  • Jisu Kim
  • Bryan Hooi
  • Christos Faloutsos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11052)

Abstract

Given a stream of edge additions and deletions, how can we estimate the count of triangles in it? If we can store only a subset of the edges, how can we obtain unbiased estimates with small variances?

Counting triangles (i.e., cliques of size three) in a graph is a classical problem with applications in a wide range of research areas, including social network analysis, data mining, and databases. Recently, streaming algorithms for triangle counting have been extensively studied since they can naturally be used for large dynamic graphs. However, existing algorithms cannot handle edge deletions or suffer from low accuracy.

Can we handle edge deletions while achieving high accuracy? We propose ThinkD, which accurately estimates the counts of global triangles (i.e., all triangles) and local triangles associated with each node in a fully dynamic graph stream with edge additions and deletions. Compared to its best competitors, ThinkD is (a) Accurate: up to 4.3× more accurate within the same memory budget, (b) Fast: up to 2.2× faster for the same accuracy requirements, and (c) Theoretically sound: always maintaining unbiased estimates with small variances. Code related to this paper is available at: https://github.com/kijungs/thinkd.
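The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way to keep unbiased global and local triangle estimates under edge additions and deletions. It assumes independent (Bernoulli) edge sampling with a fixed probability p, a well-formed stream (an edge is added only when absent and deleted only when present), and the idea suggested by the title: every arriving edge updates the estimates before the decision to store or discard it is made. The class name `StreamingTriangleEstimator` and its methods are hypothetical and are not taken from the authors' code.

```python
import random
from collections import defaultdict


class StreamingTriangleEstimator:
    """Illustrative sketch only; not the authors' implementation.

    Assumption: each added edge is kept in the sample independently
    with probability p, and the stream is well-formed.
    """

    def __init__(self, p, seed=None):
        if not 0.0 < p <= 1.0:
            raise ValueError("p must be in (0, 1]")
        self.p = p
        self.rng = random.Random(seed)
        self.adj = defaultdict(set)              # adjacency of sampled edges
        self.global_estimate = 0.0               # estimated number of triangles
        self.local_estimate = defaultdict(float) # estimated triangles per node

    def process(self, u, v, is_addition):
        """Handle one stream element: addition or deletion of edge (u, v)."""
        # Step 1: update the estimates first, even if the edge is not stored.
        # A triangle {u, v, w} is observed only if both (u, w) and (v, w) are
        # in the sample, which happens with probability p * p, so each
        # observed triangle is weighted by 1 / p^2 to keep the estimate unbiased.
        common = self.adj.get(u, set()) & self.adj.get(v, set())
        sign = 1.0 if is_addition else -1.0
        weight = sign / (self.p * self.p)
        self.global_estimate += weight * len(common)
        self.local_estimate[u] += weight * len(common)
        self.local_estimate[v] += weight * len(common)
        for w in common:
            self.local_estimate[w] += weight

        # Step 2: only now decide whether the edge is stored or discarded.
        if is_addition:
            if self.rng.random() < self.p:
                self.adj[u].add(v)
                self.adj[v].add(u)
        else:
            # A deleted edge is removed from the sample if it was stored.
            self.adj[u].discard(v)
            self.adj[v].discard(u)
```

With p = 1.0 every edge is stored and the sketch reduces to exact incremental counting, which is a convenient sanity check. With p < 1 the memory used is proportional to p times the number of live edges in expectation; this sketch does not enforce a fixed memory budget.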

Keywords

Triangle counting; Local triangles; Streaming algorithms; Fully dynamic graph streams; Edge deletions

Notes

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grants No. CNS-1314632 and IIS-1408924. Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053. Shin was supported by the KFAS Scholarship, and Kim was supported by the Samsung Scholarship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, or other funding parties. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation here on.

Supplementary material

Supplementary material 1: 478890_1_En_9_MOESM1_ESM.pdf (PDF, 1239 KB)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Kijung Shin (1), corresponding author
  • Jisu Kim (1)
  • Bryan Hooi (1)
  • Christos Faloutsos (1)

  1. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA