Minimum Entropy Stochastic Block Models Neglect Edge Distribution Heterogeneity
The statistical inference of stochastic block models has emerged as a mathematically principled method for identifying communities inside networks. Its objective is to find the node partition and the block-to-block adjacency matrix of maximum likelihood, i.e. the ones most likely to have generated the observed network. In practice, in the so-called microcanonical ensemble, it is frequently assumed that when comparing two models which have the same number and sizes of communities, the best one is the one of minimum entropy, i.e. the one which can generate the fewest different networks. In this paper, we show that there are situations in which the minimum entropy model does not identify the most significant communities in terms of edge distribution, even though it generates the observed graph with a higher probability.
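To make the minimum entropy criterion concrete, the sketch below counts how many graphs a microcanonical model can generate and takes the logarithm of that count. It is an illustration under stated assumptions, not the paper's own code: it assumes simple undirected graphs without self-loops and uses the standard counting argument in which each pair of blocks contributes a binomial coefficient. The function name `microcanonical_entropy` and the toy graph are hypothetical.

```python
from math import comb, log


def microcanonical_entropy(edges, partition):
    """Log of the number of simple undirected graphs (no self-loops) that share
    the block sizes and block-to-block edge counts induced by `partition`."""
    blocks = sorted(set(partition.values()))

    # Number of nodes in each block.
    sizes = {b: 0 for b in blocks}
    for block in partition.values():
        sizes[block] += 1

    # Number of edges between each unordered pair of blocks.
    counts = {}
    for u, v in edges:
        key = tuple(sorted((partition[u], partition[v])))
        counts[key] = counts.get(key, 0) + 1

    entropy = 0.0
    for i, r in enumerate(blocks):
        for s in blocks[i:]:
            if r == s:
                pairs = sizes[r] * (sizes[r] - 1) // 2  # node pairs inside block r
            else:
                pairs = sizes[r] * sizes[s]  # node pairs between blocks r and s
            # Ways to place the observed number of edges among the available pairs.
            entropy += log(comb(pairs, counts.get((r, s), 0)))
    return entropy


# Toy graph (assumed for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Two partitions with the same number and sizes of communities.
partition_a = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # one triangle per block
partition_b = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}  # triangles split across blocks

entropy_a = microcanonical_entropy(edges, partition_a)
entropy_b = microcanonical_entropy(edges, partition_b)
print(f"entropy A = {entropy_a:.3f}, entropy B = {entropy_b:.3f}")
```

On this toy graph, partition A can generate 9 graphs and partition B can generate 1134, so the criterion prefers A. The example only shows how the comparison is computed; it does not reproduce the situations studied in the paper, where the minimum entropy partition is not the most significant one.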
Keywords: Network, Community detection, Stochastic block model, Statistical inference, Entropy
This work was supported by the ACADEMICS grant of the IDEXLYON project of the Université de Lyon (PIA, operated by ANR-16-IDEX-0005) and by the project ANR-18-CE23-0004 (BITUNAM) of the French National Research Agency (ANR).