Integrating Social Media Data for Community Detection

  • Conference paper
Modeling and Mining Ubiquitous Social Media (MUSE 2011, MSM 2011)

Abstract

Community detection is an unsupervised learning task that discovers groups whose members share more similarities, or interact more frequently, with one another than with people outside the group. In social media, link information can reveal heterogeneous relationships of various strengths, but it is often noisy. Different sources of social media data provide complementary information: for example, bookmarking and tagging data indicate user interests, and the frequency of commenting suggests the strength of ties. We therefore propose to integrate social media data of multiple types to improve the performance of community detection, and we present a joint optimization framework for doing so. Empirical evaluation on both synthetic and real-world social media data shows significant performance improvement for the proposed approach. This work elaborates the need for, and the challenges of, multi-source integration of heterogeneous data types, and provides a principled approach to multi-source community detection.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tang, J., Wang, X., Liu, H. (2012). Integrating Social Media Data for Community Detection. In: Atzmueller, M., Chin, A., Helic, D., Hotho, A. (eds) Modeling and Mining Ubiquitous Social Media. MUSE 2011, MSM 2011. Lecture Notes in Computer Science, vol 7472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33684-3_1

  • DOI: https://doi.org/10.1007/978-3-642-33684-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33683-6

  • Online ISBN: 978-3-642-33684-3

  • eBook Packages: Computer Science, Computer Science (R0)
