Abstract
Network community detection is an important area of research. In this work, we propose a novel nonparametric probabilistic model for this task: we conduct random walks on the network and apply the Hierarchical Dirichlet Process (HDP) topic model to the random walk data to explore the network's community structure. Our work is among the very few efforts in nonparametric probabilistic modeling of complex networks. Because the model is nonparametric, it is highly flexible and detects the number of communities automatically, without prior knowledge. In several experiments, it shows significant improvements over other models.
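As an illustration of the first step described in the abstract, the sketch below generates random walks on a toy graph and turns each walk into a bag of node tokens, the "document" form a topic model consumes. It is a minimal sketch, not the authors' procedure: the abstract does not specify the walk scheme, so uniform neighbor sampling, the walk length, the number of walks per node, and the helper names `random_walks` and `to_bag_of_nodes` are all assumptions. Fitting the HDP topic model to the resulting corpus is not shown.

```python
import random
from collections import Counter


def random_walks(adj, walks_per_node=10, walk_length=20, seed=0):
    """Generate uniform random walks (assumed scheme, not taken from the paper).

    adj maps each node to a list of its neighbors.
    Returns a list of walks; each walk is a list of node ids.
    """
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def to_bag_of_nodes(walk):
    """Treat a walk as a document: count how often each node (word) occurs."""
    return Counter(walk)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

walks = random_walks(adj, walks_per_node=5, walk_length=8, seed=42)
corpus = [to_bag_of_nodes(w) for w in walks]
```

Each entry of `corpus` plays the role of one document whose words are node ids. Under this reading, a topic inferred by the HDP model corresponds to a group of nodes that tend to co-occur in walks, which is how the approach can surface communities without fixing their number in advance.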
Notes
1. Other constructions of the HDP topic model and the details of the Stochastic Variational Inference for the model can be found in [23].
References
Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P.: Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9(Sep), 1981–2014 (2008)
Ball, B., Karrer, B., Newman, M.E.: Efficient and principled method for detecting communities in networks. Phys. Rev. E 84(3), 036103 (2011)
Barnes, E.R.: An algorithm for partitioning the nodes of a graph. SIAM J. Algebraic Discr. Methods 3(4), 541–550 (1982)
Bezdek, J.C.: Objective function clustering. In: Pattern recognition with fuzzy objective function algorithms, pp. 43–93. Springer (1981)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993–1022 (2003)
Blundell, C., Teh, Y.W.: Bayesian hierarchical community discovery. In: Advances in Neural Information Processing Systems, pp. 1601–1609 (2013)
Bojchevski, A., Shchur, O., Zügner, D., Günnemann, S.: NetGAN: Generating graphs via random walks. arXiv preprint arXiv:1803.00816 (2018)
Chen, D.T., Nasir, A., Culhane, A., Venkataramu, C., Fulp, W., Rubio, R., Wang, T., Agrawal, D., McCarthy, S.M., Gruidl, M., et al.: Proliferative genes dominate malignancy-risk gene signature in histologically-normal breast tissue. Breast Cancer Res. Treatment 119(2), 335 (2010)
Chen, Y., Xu, D.: Understanding protein dispensability through machine-learning analysis of high-throughput data. Bioinformatics 21(5), 575–581 (2004)
Clauset, A., Newman, M.E., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 066111 (2004)
Donath, W.E., Hoffman, A.J.: Lower bounds for the partitioning of graphs. In: Selected Papers of Alan J Hoffman: With Commentary, pp. 437–442. World Scientific (2003)
Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters (1973)
Everett, M.G., Borgatti, S.P.: Analyzing clique overlap. Connections 21(1), 49–61 (1998)
Fiedler, M.: Algebraic connectivity of graphs. Czechoslovak Math. J. 23(2), 298–305 (1973)
Fortunato, S.: Community detection in graphs. Phys. Reports 486(3–5), 75–174 (2010)
Friedman, J., Hastie, T., Tibshirani, R.: The Elements of Statistical Learning, vol. 1. Springer series in statistics New York, NY, USA (2001)
Gerlach, M., Peixoto, T.P., Altmann, E.G.: A network approach to topic models. Sci. Advanc. 4(7), eaaq1360 (2018)
Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99(12), 7821–7826 (2002)
Grover, A., Leskovec, J.: Node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864. ACM (2016)
Guimera, R., Amaral, L.A.N.: Functional cartography of complex metabolic networks. Nature 433(7028), 895 (2005)
Guo, J., Wilson, A.G., Nordman, D.J.: Bayesian nonparametric models for community detection. Technometrics 55(4), 390–402 (2013)
Hjort, N.L., Holmes, C., Müller, P., Walker, S.G.: Bayesian Nonparametrics, vol. 28. Cambridge University Press (2010)
Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic variational inference. J. Mach. Learn. Res. 14(1), 1303–1347 (2013)
Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Netw. 5(2), 109–137 (1983)
Karrer, B., Newman, M.E.: Stochastic block models and community structure in networks. Phys. Rev. E 83(1), 016107 (2011)
Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(2), 291–307 (1970)
Khan, B.S., Niazi, M.A.: Network Community Detection: A Review and Visual Survey. arXiv preprint arXiv:1708.00977 (2017)
Kim, D.I., Gopalan, P.K., Blei, D., Sudderth, E.: Efficient online inference for Bayesian nonparametric relational models. In: Advances in Neural Information Processing Systems, pp. 962–970 (2013)
Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Disc. Data (TKDD) 1(1), 2 (2007)
Leskovec, J., Mcauley, J.J.: Learning to discover social circles in ego networks. In: Advances in Neural Information Processing Systems, pp. 539–547 (2012)
MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. Oakland, CA, USA (1967)
Mørup, M., Schmidt, M.N.: Bayesian community detection. Neural Computat. 24(9), 2434–2456 (2012)
Newman, M.E.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69(6), 066133 (2004)
Newman, M.E.: Modularity and community structure in networks. Proc. Nat. Acad. Sci. 103(23), 8577–8582 (2006)
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814 (2005)
Peel, L., Larremore, D.B., Clauset, A.: The ground truth about metadata and community detection in networks. Sci. Advanc. 3(5), e1602548 (2017)
Peixoto, T.P.: Hierarchical block structures and high-resolution model selection in large networks. Phys. Rev. X 4(1), 011047 (2014)
Peixoto, T.P., Rosvall, M.: Modelling sequences and temporal networks with dynamic community structures. Nature Commun. 8(1), 582 (2017)
Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 701–710. ACM (2014)
Pons, P., Latapy, M.: Computing communities in large networks using random walks. In: International Symposium on Computer and Information Sciences, pp. 284–293. Springer (2005)
Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proc. Nat. Acad. Sci. USA 101(9), 2658–2663 (2004)
Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Nat. Acad. Sci. USA 105(4), 1118–1123 (2008)
Schmidt, M.N., Mørup, M.: Nonparametric Bayesian modeling of complex networks: an introduction. IEEE Signal Process. Mag. 30(3), 110–128 (2013)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
Sudderth, E.B., Torralba, A., Freeman, W.T., Willsky, A.S.: Learning hierarchical models of scenes, objects, and parts. In: Tenth IEEE International Conference on Computer Vision, 2005. ICCV 2005, vol. 2, pp. 1331–1338. IEEE (2005)
Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Sharing clusters among related groups: Hierarchical Dirichlet processes. In: Advances in Neural Information Processing Systems, pp. 1385–1392 (2005)
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440 (1998)
Wei, X., Croft, W.B.: LDA-based document models for ad-hoc retrieval. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 178–185. ACM (2006)
Yu, H., Braun, P., Yıldırım, M.A., Lemmens, I., Venkatesan, K., Sahalie, J., Hirozane-Kishikawa, T., Gebreab, F., Li, N., Simonis, N., et al.: High-quality binary protein interaction map of the yeast interactome network. Science 322(5898), 104–110 (2008)
Zhang, H., Qiu, B., Giles, C.L., Foley, H.C., Yen, J.: An LDA-based community structure discovery approach for large-scale social networks. In: Intelligence and Security Informatics, 2007 IEEE, pp. 200–207. IEEE (2007)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhu, R., Jiang, W. (2019). Bayesian Complex Network Community Detection Using Nonparametric Topic Model. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 812. Springer, Cham. https://doi.org/10.1007/978-3-030-05411-3_23
DOI: https://doi.org/10.1007/978-3-030-05411-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05410-6
Online ISBN: 978-3-030-05411-3
eBook Packages: Intelligent Technologies and Robotics (R0)