Searching for Network Modules
Abstract
When analyzing complex networks, a key goal is to uncover their modular structure, which means searching for a family of node subsets, each spanning an exceptionally dense subnetwork. Objective function-based graph clustering procedures such as modularity maximization output a partition of nodes, i.e. a family of pairwise disjoint subsets, although single nodes are likely to belong to multiple, overlapping modules. Fuzzy clustering approaches address this by letting each node be included in different modules with different memberships ranging in [0, 1]. This work proposes a novel type of objective function for graph clustering, in the form of a multilinear polynomial extension whose coefficients are determined by network topology. It may be seen as a potential taking values on fuzzy clusterings, i.e. families of fuzzy subsets of nodes over which every node distributes a unit membership. If suitably parameterized, this potential attains its maximum when every node concentrates its entire unit membership on some module. Maximizers thus remain partitions, while the original discrete optimization problem is turned into a continuous version, which allows alternative search strategies to be conceived. The instance of the problem is a pseudo-Boolean function assigning real-valued cluster scores to node subsets, and modularity maximization is employed to exemplify a so-called quadratic form: the scores of singletons and pairs fully determine the scores of larger clusters, and the resulting multilinear polynomial potential function has degree 2. After considering further quadratic instances, different from modularity and obtained by interpreting network topology in alternative ways, a greedy local-search strategy for the continuous framework is analytically compared with an existing greedy agglomerative procedure for the discrete case. Overlapping is finally discussed in terms of multiple runs, i.e. several local searches with different initializations.
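The quadratic case can be illustrated with a minimal sketch, which is not the paper's own formulation. It assumes the standard Newman modularity as the cluster score and an undirected simple graph given as a 0/1 adjacency matrix. The names `modularity_matrix`, `cluster_score` and `potential`, and the example graph (two triangles joined by one edge), are hypothetical choices made for illustration. The sketch shows a degree-2 multilinear polynomial that equals modularity on partitions and is linear in each single node's membership.

```python
from itertools import combinations


def modularity_matrix(adj):
    """Return B with B[i][j] = (A_ij - k_i * k_j / (2m)) / (2m).

    Assumes an undirected simple graph given as a symmetric 0/1 matrix.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    return [[(adj[i][j] - deg[i] * deg[j] / two_m) / two_m for j in range(n)]
            for i in range(n)]


def cluster_score(B, q):
    """Degree-2 multilinear polynomial in the memberships q (one per node).

    On a 0/1 vector q it equals the sum of B[i][j] over all ordered pairs
    (i, j) of cluster members, i.e. that cluster's modularity contribution.
    Diagonal terms stay linear in q[i], which keeps the polynomial multilinear.
    """
    n = len(q)
    linear = sum(B[i][i] * q[i] for i in range(n))
    pairwise = sum(2 * B[i][j] * q[i] * q[j]
                   for i, j in combinations(range(n), 2))
    return linear + pairwise


def potential(B, U):
    """Sum the cluster scores over the columns of the membership matrix U.

    U[i][k] is the membership of node i in module k; each row sums to 1.
    """
    n_modules = len(U[0])
    return sum(cluster_score(B, [row[k] for row in U])
               for k in range(n_modules))


# Hypothetical example: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1
B = modularity_matrix(adj)

# Hard partition (one-hot rows): the potential is the usual modularity.
hard = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
# Fuzzy clustering: node 2 splits its unit membership between both modules.
fuzzy = [[1, 0], [1, 0], [0.5, 0.5], [0, 1], [0, 1], [0, 1]]

print(potential(B, hard), potential(B, fuzzy))
```

Because the polynomial is linear in each node's membership taken alone, moving one node's membership between two modules changes the potential linearly. In this sketch a single node therefore never gains by splitting its membership, which is consistent with the statement that maximizers remain partitions.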
Keywords
Modularity, Fuzzy clustering, Pseudo-Boolean function