Is Community Detection Fully Unsupervised? The Case of Weighted Graphs

  • Victor Connes
  • Nicolas Dugué
  • Adrien Guille
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 812)

Abstract

In NLP, word embeddings have recently attracted a lot of attention. A textual corpus is represented as a sparse word co-occurrence matrix, which can then be factorized, for example using SVD, to obtain a smaller matrix of dense, continuous vectors. To help SVD, the PMI measure is applied to the initial co-occurrence matrix: it assigns a relevant weight to each co-occurrence by normalizing it by the frequencies of both words involved. In this paper, we follow this idea to study whether weighted networks can benefit from a pre-processing step that helps community detection. We first design a benchmark using LFR networks. Then, we consider PMI and another NLP-inspired measure as pre-processing of the link weights, and show that PMI worsens the results while the other measure improves them. By separating links inside communities and links between communities into two classes, we show that this is due to the weight distributions of these links: links between communities are on average larger, leading to larger PMI values. Building on this analysis, we design another set of experiments showing that links can be classified efficiently into these two classes using a small set of features. Finally, we introduce the Supervised Label Propagation (SLP) algorithm, which takes the classification results into account during propagation. This algorithm clearly improves the results, raising a major question: is community detection on weighted networks a fully unsupervised task? We conclude with our thoughts on this topic.
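
As an illustration of the PMI pre-processing mentioned in the abstract, the following minimal sketch reweights a small weighted adjacency matrix with the textbook definition PMI(i, j) = log(p(i, j) / (p(i) p(j))). The function name `pmi_reweight`, the list-of-lists matrix representation, and the toy graph are assumptions made for this example; the abstract does not say which PMI variant (for instance positive or shifted PMI) the paper actually uses.

```python
import math


def pmi_reweight(weights):
    """Return a PMI-reweighted copy of a square weighted adjacency matrix.

    Hypothetical helper: `weights` is a list of lists of non-negative numbers.
    Absent links (weight 0) are left at 0 rather than mapped to -infinity.
    """
    n = len(weights)
    row_sums = [sum(row) for row in weights]
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_sums)

    reweighted = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if w > 0:
                # p(i, j) = w / total, p(i) = row_sums[i] / total,
                # p(j) = col_sums[j] / total, so the ratio simplifies to:
                reweighted[i][j] = math.log(w * total / (row_sums[i] * col_sums[j]))
    return reweighted


# Toy undirected graph on three nodes (assumed data, not from the paper).
toy_weights = [
    [0, 2, 1],
    [2, 0, 1],
    [1, 1, 0],
]
toy_pmi = pmi_reweight(toy_weights)
```

A link's PMI grows when its weight is large relative to the total weight carried by its two endpoints, which is why the weight distributions of intra- and inter-community links matter for this kind of pre-processing.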

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Le Mans Université, LIUM, EA 4023, Laboratoire d’Informatique de l’Université du Mans, Le Mans, France
  2. University of Lyon, Bron Cedex, France
