Bayesian Complex Network Community Detection Using Nonparametric Topic Model

  • Ruimin Zhu
  • Wenxin Jiang
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 812)

Abstract

Network community detection is an important area of research. In this work, we propose a novel nonparametric probabilistic model for this task. We conduct random walks on the network and apply the Hierarchical Dirichlet Process (HDP) topic model to the resulting random-walk data to uncover the network's community structure. This work is among the few applications of nonparametric probabilistic modeling to complex networks. The proposed model is flexible: because it is nonparametric, it infers the number of communities automatically, without prior knowledge. It also performs well, showing significant improvements over other models in several experiments.
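The first step described above, turning a network into random-walk data that a topic model can consume, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the walk length, the number of walks per node, or how walks are grouped into documents, so `walks_per_node`, `walk_length`, the helper name `random_walk_corpus`, and the toy graph are all assumptions. Each walk is treated as one "document" whose "words" are node labels.

```python
import random


def random_walk_corpus(adjacency, walks_per_node=10, walk_length=20, seed=0):
    """Return a list of walks; each walk is a list of node labels (one 'document').

    walks_per_node and walk_length are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    corpus = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # isolated node or dead end: stop this walk early
                    break
                # Uniform step to a neighbor (simple unbiased random walk).
                walk.append(rng.choice(neighbors))
            corpus.append([str(node) for node in walk])
    return corpus


# Hypothetical toy graph: two triangles joined by a single bridge edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

corpus = random_walk_corpus(adjacency, walks_per_node=5, walk_length=8)
```

The resulting `corpus` has the same shape as a tokenized text corpus, so it could be passed to any HDP topic model implementation; the topics inferred over node labels would then play the role of communities. That fitting step is omitted here because it depends on the chosen library.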

Keywords

Community detection, Bayesian modeling, Nonparametric topic model

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Northwestern University, Evanston, USA
