A Community-Based Pseudolikelihood Approach for Relationship Labeling in Social Networks

  • Huaiyu Wan
  • Youfang Lin
  • Zhihao Wu
  • Houkuan Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6913)


A social network consists of people (or other social entities) connected by a set of social relationships. Knowing the relationship types helps us understand the structure and characteristics of the network. Traditional classifiers are not accurate enough for relationship labeling because they assume that all labels are independent and identically distributed. Relational Markov networks (RMNs), a relational probabilistic model, have been introduced for labeling relationships, but their inefficient parameter estimation makes them difficult to deploy in large-scale social networks. In this paper, we propose a community-based pseudolikelihood (CBPL) approach for relationship labeling. The community structure of a social network is used to assist in constructing the conditional random field, which makes our approach reasonable and accurate. In addition, the computational simplicity of pseudolikelihood effectively resolves the time complexity problem from which RMNs suffer. We apply our approach to two real-world social networks: a terrorist relation network and a phone call network that we collected from encrypted call detail records. In our experiments, to avoid losing links when splitting a closely connected social network into separate training and test subsets, we split the datasets by links rather than by individuals. The experimental results show that our approach performs well in terms of accuracy and efficiency.
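
For orientation, the pseudolikelihood idea that the abstract relies on can be written in its generic form, following Besag's definition. The notation below is an assumption made for illustration and is not taken from the paper: $y_i$ is the label of relationship $i$, $\mathbf{x}$ denotes the observed attributes, $N(i)$ is the set of neighbors of $i$ in the conditional random field, $f_k$ are feature functions, and $\theta$ are the parameters.

```latex
% Generic pseudolikelihood for a conditional random field
% (illustrative notation; the paper's own formulation may differ).
\begin{align}
  \mathrm{PL}(\theta)
    &= \prod_{i=1}^{n} P\bigl(y_i \mid y_{N(i)}, \mathbf{x}; \theta\bigr), \\
  P\bigl(y_i \mid y_{N(i)}, \mathbf{x}; \theta\bigr)
    &= \frac{\exp\Bigl(\sum_{k} \theta_k \, f_k\bigl(y_i, y_{N(i)}, \mathbf{x}\bigr)\Bigr)}
            {\sum_{y'} \exp\Bigl(\sum_{k} \theta_k \, f_k\bigl(y', y_{N(i)}, \mathbf{x}\bigr)\Bigr)}.
\end{align}
```

Each factor is normalized only over the candidate labels $y'$ of a single relationship, so no global partition function over all joint labelings has to be computed. This is the usual reason pseudolikelihood estimation is cheaper than full likelihood estimation in Markov networks.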


Social networks · Relationship labeling · Community structure · Pseudolikelihood · Conditional random fields



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Huaiyu Wan (1)
  • Youfang Lin (1)
  • Zhihao Wu (1)
  • Houkuan Huang (1)
  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
