Abstract
A precondition for a No Free Lunch theorem is evaluation with a loss function which does not assume a priori superiority of some outputs over others. A previous result for community detection by [12] relies on a mismatch between the loss function and the problem domain. The loss function computes an expectation over only a subset of the universe of possible outputs; thus, it is only asymptotically appropriate with respect to the problem size. By using the correct random model for the problem domain, we provide a stronger, exact No Free Lunch theorem for community detection. The claim generalizes to other set-partitioning tasks including core–periphery separation, \(k\)-clustering, and graph partitioning. Finally, we review the literature of proposed evaluation functions and identify functions which (perhaps with slight modifications) are compatible with an exact No Free Lunch theorem.
Notes
- 1. Throughout this work, we assume that we evaluate against a known ground truth, as opposed to some intrinsic measure of partition quality such as modularity [10].
- 2. That is, the objective is to find \(f = g^{-1}\); in general, this inverse does not exist.
- 3. To take the example of [12], \(L^2\) loss (squared Euclidean distance) imposes a geometric structure: in the task of guessing points in the unit circle, guessing the center garners a lower expected loss (i.e., a higher reward) than any other point.
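The geometric structure in this note can be checked with a small Monte Carlo sketch (ours, not from the paper): under squared Euclidean loss against points drawn uniformly from the unit disk, the center is the unique best fixed guess, and any offset guess pays an extra penalty equal to its squared distance from the center.

```python
import random

def mean_sq_loss(guess, trials=100_000, seed=0):
    """Average squared distance from `guess` to points drawn
    uniformly from the unit disk (rejection sampling)."""
    rng = random.Random(seed)
    total, accepted = 0.0, 0
    while accepted < trials:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:  # keep only points inside the disk
            total += (x - guess[0]) ** 2 + (y - guess[1]) ** 2
            accepted += 1
    return total / trials

center = mean_sq_loss((0.0, 0.0))  # approx. 0.5 = E[r^2] over the disk
offset = mean_sq_loss((0.5, 0.0))  # approx. 0.75 = 0.5 + |guess|^2
```

The gap between the two estimates is exactly what breaks the No Free Lunch precondition: the loss itself privileges some outputs a priori.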
- 4. \(\mathcal {B}_N\) is the \(N\)-th Bell number, i.e., the number of partitions of a set of \(N\) nodes.
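For concreteness, the Bell numbers grow very quickly; they can be computed with the standard Bell-triangle recurrence, sketched here (our illustration, not code from the paper):

```python
def bell_numbers(n_max):
    """Return the Bell numbers B_0 .. B_{n_max} via the Bell triangle:
    each row starts with the last entry of the previous row, and each
    subsequent entry is its left neighbor plus the entry above it."""
    row = [1]
    bells = [1]  # B_0 = 1
    for _ in range(n_max):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

print(bell_numbers(5))  # → [1, 1, 2, 5, 15, 52]
```

Already \(\mathcal {B}_{20} > 5 \times 10^{13}\), which is why summing over all partitions of the node set, rather than a convenient subset, is the crux of the exact result.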
- 5.
- 6. Why do we assume uniformity over \(\varOmega \)? Because the uniform distribution has maximum entropy; it is the least informed choice, placing the fewest assumptions on the distribution.
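The maximum-entropy claim in this note is easy to verify numerically (a minimal sketch of ours, with an arbitrary skewed distribution for comparison): the uniform distribution over \(n\) outcomes attains the entropy ceiling \(\log n\), and any non-uniform distribution falls below it.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

uniform = [0.2] * 5                    # uniform over 5 outcomes
skewed = [0.5, 0.2, 0.1, 0.1, 0.1]     # any non-uniform alternative

h_uniform = entropy(uniform)  # equals log(5), the maximum possible
h_skewed = entropy(skewed)    # strictly smaller
```

The same reasoning, applied to the set of all partitions \(\varOmega \), makes the uniform prior the least informed choice.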
References
1. Chen, Z., Li, L., Bruna, J.: Supervised community detection with line graph neural networks. In: International Conference on Learning Representations (2019)
2. Gates, A.J., Ahn, Y.Y.: The impact of random models on clustering similarity. J. Mach. Learn. Res. 18(87), 1–28 (2017)
3. Hauer, B., Kondrak, G.: Decoding anagrammed texts written in an unknown language and script. Trans. Assoc. Comput. Linguist. 4, 75–86 (2016)
4. Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)
5. Kvalseth, T.O.: Entropy and correlation: some comments. IEEE Trans. Syst. Man Cybern. 17(3), 517–519 (1987)
6. Lai, D., Nardini, C.: A corrected normalized mutual information for performance evaluation of community detection. J. Stat. Mech: Theory Exp. 2016(9), 093403 (2016)
7. Liu, X., Cheng, H.M., Zhang, Z.Y.: Evaluation of community structures using kappa index and F-score instead of normalized mutual information. arXiv e-prints (2018)
8. McCarthy, A.D., Matula, D.W.: Normalized mutual information exaggerates community detection performance. In: SIAM Workshop on Network Science, SIAM NS 2018, pp. 78–79. SIAM, Portland (2018)
9. McCarthy, A.D., Rudinger, R., Chen, T., Matula, D.W.: Metrics matter in community detection. In: Proceedings of the 8th International Conference on Complex Networks and Their Applications, Lisbon, Portugal (2019)
10. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
11. Peel, L.: Estimating network parameters for selecting community detection algorithms. J. Adv. Inform. Fusion 6, 119–130 (2011)
12. Peel, L., Larremore, D.B., Clauset, A.: The ground truth about metadata and community detection in networks. Sci. Adv. 3(5), e1602548 (2017)
13. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proc. Natl. Acad. Sci. 101(9), 2658–2663 (2004)
14. Romano, S., Bailey, J., Nguyen, V., Verspoor, K.: Standardized mutual information for clustering comparisons: one step further in adjustment for chance. In: International Conference on Machine Learning, pp. 1143–1151 (2014)
15. Romano, S., Vinh, N.X., Bailey, J., Verspoor, K.: Adjusting for chance clustering comparison measures. J. Mach. Learn. Res. 17(1), 4635–4666 (2016)
16. Schumacher, C., Vose, M.D., Whitley, L.D.: The no free lunch and problem description length. In: Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation, GECCO 2001, pp. 565–570. Morgan Kaufmann Publishers Inc., San Francisco (2001)
17. Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, pp. 1073–1080. ACM, New York (2009)
18. Wolpert, D.H.: The lack of a priori distinctions between learning algorithms. Neural Comput. 8(7), 1341–1390 (1996)
19. Yang, Z., Algesheimer, R., Tessone, C.J.: A comparative analysis of community detection algorithms on artificial networks. Sci. Rep. 6, 30750 (2016)
20. Zhang, J., Chen, T., Hu, J.: On the relationship between Gaussian stochastic blockmodels and label propagation algorithms. J. Stat. Mech: Theory Exp. 2015(3), P03009 (2015)
21. Zhang, P.: Evaluating accuracy of community detection using the relative normalized mutual information. J. Stat. Mech: Theory Exp. 2015(11), P11006 (2015)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
McCarthy, A.D., Chen, T., Ebner, S. (2020). An Exact No Free Lunch Theorem for Community Detection. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_15
Print ISBN: 978-3-030-36686-5
Online ISBN: 978-3-030-36687-2
eBook Packages: Intelligent Technologies and Robotics