MEI: Mutual Enhanced Infinite Generative Model for Simultaneous Community and Topic Detection
Community and topic are two widely studied patterns in social network analysis. Most existing studies, however, either use textual content to improve community detection or use link structure to guide topic modeling. Some recent studies take both link-emphasized community and text-emphasized topic into account, but they model community and topic with the same latent variable. Because the two differ in practice, it is more reasonable to model them with different variables. To discover communities, topics, and their relations simultaneously, a mutual enhanced infinite generative model (MEI) is proposed. The model distinguishes community from topic and relates them via community-topic distributions, so that communities and topics can be detected simultaneously and can enhance each other during learning. To determine the appropriate number of communities and topics automatically, the Hierarchical/Dirichlet Process Mixture model (H/DPM) is employed. A Gibbs sampling based approach is adopted to learn the model parameters. Experiments are conducted on a co-author network extracted from DBLP, in which each author is associated with his or her published papers. The experimental results show that the proposed model outperforms several baseline models in terms of perplexity and link prediction performance.
Keywords: social network analysis, community detection, topic modeling, mutual enhanced infinite generative model, Dirichlet process, Gibbs sampling
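The abstract states that a Dirichlet process mixture lets the number of communities and topics be determined automatically rather than fixed in advance. As a minimal illustration of that idea only, and not the MEI model or the authors' sampler, the sketch below draws cluster assignments from a Chinese restaurant process prior, the standard sequential view of a Dirichlet process. The function name `crp_assignments` and the concentration parameter `alpha` are hypothetical names chosen for this example.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process prior.

    Illustrative only: alpha (> 0) is an assumed concentration parameter;
    larger values tend to open more clusters.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_items=200, alpha=1.0)
print(len(counts), "clusters; sizes:", counts)
```

The number of clusters is not an input here: it emerges from the draws, which is the property that motivates a nonparametric prior when the number of communities and topics is unknown. A full Gibbs sampler would additionally reweight each choice by the likelihood of the observed links and words, which this sketch omits.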