Microgroup Mining on TSina via Network Structure and User Attribute

  • Conference paper
Advanced Data Mining and Applications (ADMA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7121)

Abstract

In this paper, we focus on the problem of community detection on TSina, the most popular microblogging network in China. By characterizing the structure and content of microgroups (communities) on TSina in detail, we show that, unlike in ordinary social networks, the degree assortativity coefficient is negative for most microgroups. In addition, we find that users from the same microgroup tend to exhibit similar attributes (e.g., sharing many followers, tags, and topics). Inspired by these findings, we propose a unified method for microgroup detection that preserves the information in both link structure and user attributes. First, link direction is converted to a weight, with higher values given to more surprising links, while attribute similarity between two users is measured by the Jaccard coefficient of common features such as followers, tags, and topics. Then, these two factors are uniformly converted to the edge weight of a newly generated network. Finally, any of the commonly used community detection algorithms that support weighted networks can be applied. Extensive experiments on real social networks show that link structure and user attributes play almost equally important roles in microgroup detection on TSina. The proposed method significantly outperforms traditional methods, improving average accuracy by 25% and reducing the number of unrecognized users by about 75%.
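The attribute side of the weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formula for the "surprise" weight of a link or for how the two factors are merged, so the sketch takes the structural weight as an input and assumes a simple linear mix controlled by a hypothetical parameter `alpha`. Averaging the per-feature Jaccard coefficients, and the function names `jaccard`, `attribute_similarity`, and `combined_weight`, are likewise assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard coefficient: size of intersection over size of union.

    Returns 0.0 when both sets are empty (a convention chosen here).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def attribute_similarity(
    u: Dict[str, Set[str]],
    v: Dict[str, Set[str]],
    features: Iterable[str] = ("followers", "tags", "topics"),
) -> float:
    """Mean Jaccard coefficient over the listed feature sets.

    Taking the plain mean is an assumption; the abstract only says that
    similarity is measured by the Jaccard coefficient of common features.
    """
    features = tuple(features)
    total = sum(jaccard(u.get(f, set()), v.get(f, set())) for f in features)
    return total / len(features)


def combined_weight(structure_weight: float, attr_sim: float, alpha: float = 0.5) -> float:
    """Hypothetical linear mix of the structural weight and attribute similarity.

    alpha = 0.5 treats both factors equally; this value and the linear
    form are illustrative assumptions, not taken from the paper.
    """
    return alpha * structure_weight + (1.0 - alpha) * attr_sim


# Toy users with made-up follower IDs, tags, and topics.
alice = {"followers": {"u1", "u2", "u3"}, "tags": {"music", "travel"}, "topics": {"worldcup"}}
bob = {"followers": {"u2", "u3", "u4"}, "tags": {"music"}, "topics": {"worldcup"}}

sim = attribute_similarity(alice, bob)
weight = combined_weight(structure_weight=0.8, attr_sim=sim)
```

The resulting `weight` would be attached to the edge between the two users in the newly generated network, which can then be passed to any community detection algorithm that accepts edge weights.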

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xiong, X., Niu, X., Zhou, G., Xu, K., Huang, Y. (2011). Microgroup Mining on TSina via Network Structure and User Attribute. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25856-5_11

  • DOI: https://doi.org/10.1007/978-3-642-25856-5_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25855-8

  • Online ISBN: 978-3-642-25856-5

  • eBook Packages: Computer Science, Computer Science (R0)
