On Counting Triangles Through Edge Sampling in Large Dynamic Graphs

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Traditional frameworks for dynamic graphs have relied on processing only the stream of edges added into or deleted from an evolving graph, but not any additional related information such as the degrees or neighbor lists of nodes incident to the edges. In this chapter, we propose a new edge sampling framework for big-graph analytics in dynamic graphs which enhances the traditional model by enabling the use of additional related information. To demonstrate the advantages of this framework, we present a new sampling algorithm, called Edge Sample and Discard (ESD). It generates an unbiased estimate of the total number of triangles, which can be continuously updated in response to both edge additions and deletions. We provide a comparative analysis of the accuracy and computational complexity of ESD under the new framework against two current state-of-the-art algorithms operating under the traditional framework. The results of the experiments performed on real graphs show that, with the help of the neighborhood information of the sampled edges, the accuracy achieved by our algorithm is substantially better. We also characterize the impact of properties of the graph on the performance of our algorithm by testing on several Barabási–Albert graphs.
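
The abstract gives no pseudocode, so the following is only a minimal Python sketch of the general idea it describes: sample each edge event with some probability, use the neighbor lists of its endpoints to count the triangles that the edge creates or destroys, and scale by the inverse of the sampling probability. It should not be read as the chapter's Edge Sample and Discard algorithm. The class name `SampledTriangleCounter`, the parameter `p`, and the choice to keep the full adjacency structure in memory (standing in for whatever store supplies neighbor lists) are all assumptions made for illustration.

```python
import random


class SampledTriangleCounter:
    """Illustrative sketch (hypothetical, not the chapter's algorithm).

    Keeps a running estimate of the triangle count of an undirected
    simple graph under edge additions and deletions.
    """

    def __init__(self, p, seed=None):
        if not 0.0 < p <= 1.0:
            raise ValueError("p must be in (0, 1]")
        self.p = p                      # assumed per-event sampling probability
        self.rng = random.Random(seed)
        self.adj = {}                   # node -> set of neighbors
        self.estimate = 0.0

    def _common_neighbors(self, u, v):
        # Number of triangles that contain the edge (u, v).
        return len(self.adj.get(u, set()) & self.adj.get(v, set()))

    def add_edge(self, u, v):
        if u == v or v in self.adj.get(u, set()):
            return  # ignore self-loops and duplicate edges
        # With probability p, inspect the neighborhood of the new edge and
        # add the inverse-probability-weighted number of triangles it closes.
        if self.rng.random() < self.p:
            self.estimate += self._common_neighbors(u, v) / self.p
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def remove_edge(self, u, v):
        if v not in self.adj.get(u, set()):
            return  # ignore deletions of absent edges
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # With probability p, subtract the weighted number of triangles
        # that the deleted edge destroys.
        if self.rng.random() < self.p:
            self.estimate -= self._common_neighbors(u, v) / self.p


# Usage: with p = 1.0 every event is inspected, so the estimate is exact.
counter = SampledTriangleCounter(p=1.0, seed=0)
for a, b in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]:
    counter.add_edge(a, b)
print(counter.estimate)
```

In this sketch each triangle is counted exactly once, at the moment its last edge arrives, and each sampled count is divided by `p`, so the expected value of `estimate` equals the true triangle count. A smaller `p` inspects fewer neighborhoods at the cost of higher variance.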

Acknowledgements

This work was partially supported by the National Science Foundation Award #1250786.

Corresponding author

Correspondence to Harish Sethu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Han, G., Sethu, H. (2019). On Counting Triangles Through Edge Sampling in Large Dynamic Graphs. In: Karampelas, P., Kawash, J., Özyer, T. (eds) From Security to Community Detection in Social Networking Platforms. ASONAM 2017. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-11286-8_6

  • DOI: https://doi.org/10.1007/978-3-030-11286-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11285-1

  • Online ISBN: 978-3-030-11286-8

  • eBook Packages: Computer Science (R0)
