Applied Intelligence, Volume 48, Issue 10, pp 3557–3576

An efficient multilevel scheme for coarsening large scale social networks

  • Delel Rhouma
  • Lotfi Ben Romdhane

Abstract

The explosive growth of data generated by social networks makes them difficult for researchers to analyse effectively. Is it possible, then, to rapidly “zoom out” from such a huge network while preserving its overall structure? This technique is known as “graph reduction” and is a significant task in social network analysis. Several methods have been developed to extract a smaller, succinct version of a graph. Some belong to the category of “graph sampling” and risk losing key characteristics of communities. Others follow a “coarsening strategy” and are designed to cope with the problem of community discovery, which is our purpose here. In this paper, we propose a multi-level coarsening algorithm called MCCA (Multi-level Coarsening Compact Areas). The main strategy of this algorithm is to merge well-connected zones at every level, updating edge and vertex weights, until a stopping criterion is met. Using real-world social networks, we evaluate the quality and scalability of MCCA and compare it with eight known proposals. We also show how our method can be used as a preliminary step for community detection without much loss of information.
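
The merge-and-reweight loop described in the abstract can be illustrated with a generic multilevel coarsening sketch. The code below is not MCCA: it substitutes greedy heavy-edge matching for the paper's detection of well-connected zones, and it assumes a target vertex count as the stopping criterion. The function names, the edge-dictionary format, and the `target_size` parameter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def coarsen_once(edges, vertex_weight):
    """Contract one level of the graph.

    edges: dict mapping (u, v) tuples (u != v, one entry per undirected edge)
           to an edge weight.
    vertex_weight: dict mapping each vertex to its weight, e.g. the number
           of original vertices it represents.
    Assumption: vertices are mutually comparable (ints or strings).
    """
    # Build a symmetric weighted adjacency structure.
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w

    # Greedy heavy-edge matching: each unmatched vertex absorbs its
    # heaviest unmatched neighbour (a stand-in for "well-connected zones").
    merged_into = {}
    for u in sorted(vertex_weight):
        if u in merged_into:
            continue
        candidates = [(w, v) for v, w in adj[u].items() if v not in merged_into]
        if candidates:
            _, v = max(candidates)
            merged_into[v] = u
        merged_into[u] = u

    # Vertex weight update: a super-vertex carries the sum of its members.
    new_vertex_weight = defaultdict(int)
    for v, w in vertex_weight.items():
        new_vertex_weight[merged_into[v]] += w

    # Edge weight update: parallel edges between super-vertices are summed,
    # and edges inside a super-vertex are dropped.
    new_edges = defaultdict(int)
    for (u, v), w in edges.items():
        a, b = merged_into[u], merged_into[v]
        if a != b:
            new_edges[(min(a, b), max(a, b))] += w
    return dict(new_edges), dict(new_vertex_weight)


def coarsen(edges, vertex_weight, target_size):
    """Apply coarsening levels until the (assumed) stopping criterion holds."""
    levels = 0
    while len(vertex_weight) > target_size:
        new_edges, new_vw = coarsen_once(edges, vertex_weight)
        if len(new_vw) == len(vertex_weight):  # nothing left to merge
            break
        edges, vertex_weight = new_edges, new_vw
        levels += 1
    return edges, vertex_weight, levels
```

Because vertex weights are summed on every merge, the total vertex weight is conserved across levels, so each super-vertex records how many original vertices it stands for. Summing parallel edge weights likewise preserves the total weight of edges that cross between super-vertices.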

Keywords

Graph mining; Social networks; Coarsening; Multilevel paradigm

Notes

Acknowledgements

We would like to thank Dominique LaSalle, Naoto Ohsaka, Roland Glantz and Alireza Chakeri for fruitful discussions about their proposed models and for providing the source code used in our simulations (respectively Nerstrand, MaxInf, TREE and SPER). We also thank the anonymous reviewers for their valuable remarks, which led to a substantial improvement in the quality of the paper.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. MARS Research Lab LR17ES05, Higher Institute of Computer Science and Telecom (ISITCom), University of Sousse, Sousse, Tunisia
