Efficient Triangle Counting in Large Graphs via Degree-Based Vertex Partitioning

  • Mihail N. Kolountzakis
  • Gary L. Miller
  • Richard Peng
  • Charalampos E. Tsourakakis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6516)

Abstract

In this paper we present an efficient triangle counting algorithm that can be adapted to the semistreaming model [12]. The key idea is to combine the sampling algorithm of [31,32] with the partitioning of the vertex set into a high-degree and a low-degree subset, as in [1], and to treat each subset appropriately. We obtain a running time of \(O \left( m + \frac{m^{3/2} \Delta \log{n} }{t \epsilon^2} \right)\) and an ε-approximation (multiplicative error), where n is the number of vertices, m the number of edges, t the number of triangles, and Δ the maximum number of triangles in which any single edge is contained. Furthermore, we show how this algorithm can be adapted to the semistreaming model with space usage \(O\left(m^{1/2}\log{n} + \frac{m^{3/2} \Delta \log{n}}{t \epsilon^2} \right)\) and a constant number of passes (three) over the graph stream. We apply our methods to various networks with several million edges and obtain excellent results. Finally, we propose a random-projection-based method for triangle counting and provide a sufficient condition for obtaining an estimate with low variance.
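The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' exact procedure: it assumes uniform edge sampling with probability p and a 1/p³ rescaling in the spirit of [31], and a degree threshold of √m for the high/low split in the spirit of [1]. The function names and the threshold choice are illustrative assumptions, and vertices are assumed to be comparable values such as integers.

```python
import random
from itertools import combinations
from math import sqrt


def count_triangles_partitioned(edges):
    """Exact triangle count using a high/low degree split (illustrative)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    threshold = sqrt(m)  # assumed cutoff between low and high degree
    low = {v for v in adj if len(adj[v]) <= threshold}
    high = set(adj) - low

    count = 0
    # Triangles with at least one low-degree vertex: scan neighbor pairs
    # of each low-degree vertex, and charge the triangle to its smallest
    # low-degree vertex so that it is counted exactly once.
    for v in low:
        for a, b in combinations(adj[v], 2):
            if b in adj[a]:
                if not any(w in low and w < v for w in (a, b)):
                    count += 1
    # Triangles whose three vertices are all high-degree: there are at
    # most 2*sqrt(m) such vertices, so triples can be checked directly.
    for a, b, c in combinations(sorted(high), 3):
        if b in adj[a] and c in adj[a] and c in adj[b]:
            count += 1
    return count


def estimate_triangles(edges, p, seed=None):
    """Keep each edge with probability p, count, then rescale by 1/p^3."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    # A triangle survives only if all three of its edges are kept,
    # which happens with probability p^3.
    return count_triangles_partitioned(kept) / p ** 3
```

With p = 1 every edge is kept and the estimate equals the exact count; smaller p trades accuracy for less work on the sampled graph. The concentration guarantees stated in the abstract depend on the paper's analysis and are not reproduced by this sketch.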

Keywords

Online Social Network, Sampling Algorithm, Large Graph, Random Projection, Exponential Random Graph Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Alon, N., Yuster, R., Zwick, U.: Finding and Counting Given Length Cycles. Algorithmica 17(3), 209–223 (1997)
  2. Avron, H.: Counting triangles in large graphs using randomized matrix trace estimation. In: Proceedings of KDD-LDMTA 2010 (2010)
  3. Bar-Yossef, Z., Kumar, R., Sivakumar, D.: Reductions in streaming algorithms, with an application to counting triangles in graphs. In: SODA (2002)
  4. Becchetti, L., Boldi, P., Castillo, C., Gionis, A.: Efficient Semi-Streaming Algorithms for Local Triangle Counting in Massive Graphs. In: KDD (2008)
  5. Buriol, L., Frahling, G., Leonardi, S., Marchetti-Spaccamela, A., Sohler, C.: Counting Triangles in Data Streams. In: PODS (2006)
  6. Chernoff, H.: A Note on an Inequality Involving the Normal Distribution. Annals of Probability 9(3), 533–535 (1981)
  7. Chung, F., Lu, L.: Complex Graphs and Networks, vol. 107. American Mathematical Society, Providence (2006)
  8. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: STOC (1987)
  9. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. In: OSDI (2004)
  10. Eckmann, J.-P., Moses, E.: Curvature of co-links uncovers hidden thematic layers in the World Wide Web. In: PNAS (2002)
  11. Frank, O., Strauss, D.: Markov Graphs. Journal of the American Statistical Association 81(395), 832–842 (1986)
  12. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On graph problems in a semi-streaming model. Journal of Theoretical Computer Science 348(2), 207–216 (2005)
  13. Hajnal, A., Szemerédi, E.: Proof of a Conjecture of Erdős. In: Combinatorial Theory and Its Applications, vol. 2, pp. 601–623. North-Holland, Amsterdam (1970)
  14. Itai, A., Rodeh, M.: Finding a minimum circuit in a graph. In: STOC (1977)
  15. Johnson, W., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics 26, 189–206 (1984)
  16. Jowhari, H., Ghodsi, M.: New Streaming Algorithms for Counting Triangles in Graphs. In: Wang, L. (ed.) COCOON 2005. LNCS, vol. 3595, pp. 710–716. Springer, Heidelberg (2005)
  17. Kang, U., Tsourakakis, C., Appel, A.P., Faloutsos, C., Leskovec, J.: Radius Plots for Mining Tera-byte Scale Graphs: Algorithms, Patterns, and Observations. In: SIAM Data Mining, SDM 2010 (2010)
  18. Kang, U., Tsourakakis, C., Faloutsos, C.: PEGASUS: A Peta-Scale Graph Mining System. In: IEEE Data Mining, ICDM 2009 (2009)
  19. Kim, J.H., Vu, V.H.: Concentration of multivariate polynomials and its applications. Combinatorica 20(3), 417–434 (2000)
  20. Knuth, D.: Seminumerical Algorithms, 3rd edn. Addison-Wesley Professional, Reading (1997)
  21. Kolountzakis, M., Miller, G.L., Peng, R., Tsourakakis, C.E.: Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning, http://arxiv.org/abs/1011.0468
  22. Latapy, M.: Main-memory triangle computations for very large (sparse (power-law)) graphs. Theor. Comput. Sci. 407, 458–473 (2008)
  23. Magen, A., Zouzias, A.: Near Optimal Dimensionality Reductions That Preserve Volumes. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX and RANDOM 2008. LNCS, vol. 5171, pp. 523–534. Springer, Heidelberg (2008)
  24. Mislove, A., Marcon, M., Gummadi, K., Druschel, P., Bhattacharjee, B.: Measurement and Analysis of Online Social Networks. In: IMC (2007)
  25. Newman, M.: The structure and function of complex networks (2003)
  26. Papadimitriou, C., Yannakakis, M.: The clique problem for planar graphs. Information Processing Letters 13, 131–133 (1981)
  27. Schank, T., Wagner, D.: Finding, Counting and Listing all Triangles in Large Graphs, An Experimental Study. In: Nikoletseas, S.E. (ed.) WEA 2005. LNCS, vol. 3503, pp. 606–609. Springer, Heidelberg (2005)
  28. Schank, T., Wagner, D.: Approximating Clustering Coefficient and Transitivity. Journal of Graph Algorithms and Applications 9, 265–275 (2005)
  29. Tsourakakis, C.E.: Fast Counting of Triangles in Large Real Networks, without counting: Algorithms and Laws. In: ICDM (2008)
  30. Tsourakakis, C.E.: Counting Triangles Using Projections. KAIS Journal (2010)
  31. Tsourakakis, C.E., Kang, U., Miller, G.L., Faloutsos, C.: Doulion: Counting Triangles in Massive Graphs with a Coin. In: KDD (2009)
  32. Tsourakakis, C.E., Kolountzakis, M., Miller, G.L.: Approximate Triangle Counting (Preprint), http://arxiv.org/abs/0904.3761
  33. Tsourakakis, C.E., Drineas, P., Michelakis, E., Koutis, I., Faloutsos, C.: Spectral Counting of Triangles via Element-Wise Sparsification and Triangle-Based Link Recommendation. In: ASONAM (2010)
  34. Vu, V.H.: On the concentration of multivariate polynomials with small expectation. Random Structures and Algorithms 16(4), 344–363 (2000)
  35. Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press, Cambridge (1994)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Mihail N. Kolountzakis (1)
  • Gary L. Miller (2)
  • Richard Peng (2)
  • Charalampos E. Tsourakakis (3)
  1. Department of Mathematics, University of Crete, Greece
  2. School of Computer Science, Carnegie Mellon University, USA
  3. Department of Mathematical Sciences, Carnegie Mellon University, USA
