Predicting Friendship Links in Social Networks Using a Topic Modeling Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6635)

Abstract

In recent years, the number of social network users has increased dramatically. The resulting volume of user data has created great opportunities for data mining. One data mining problem of interest for social networks is friendship link prediction. Intuitively, a friendship link between two users can be predicted from their common friends and interests. However, using user interests directly can be challenging, given the large number of possible interests. Earlier approaches tackled this problem with an explicit user interest ontology, but constructing the ontology proved computationally expensive and the resulting ontology was not very useful. As an alternative, we propose a topic modeling approach to predicting new friendships based on interests and existing friendships. Specifically, we use Latent Dirichlet Allocation (LDA) to model user interests, thereby creating an implicit interest ontology. We construct features for the link prediction problem from the resulting topic distributions. Experimental results on several LiveJournal data sets of varying sizes show the usefulness of the LDA features for predicting friendships.
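To make the feature-construction idea concrete, here is a minimal sketch of how per-user topic distributions and existing friendships might be turned into features for a candidate pair. It is an illustration under stated assumptions, not the authors' method: the abstract does not say which features are used, so the cosine similarity and the common-friend count are assumed choices, and the user names, topic distributions, and friend lists are invented. In practice the topic distributions would come from a fitted LDA model rather than being written by hand.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an all-zero vector carries no interest information
    return dot / (norm_p * norm_q)


def pair_features(u, v, topics, friends):
    """Build an (assumed) feature dictionary for the candidate pair (u, v).

    topics  maps user -> topic distribution, e.g. as inferred by LDA
    friends maps user -> set of that user's existing friends
    """
    common = friends.get(u, set()) & friends.get(v, set())
    return {
        "topic_cosine": cosine_similarity(topics[u], topics[v]),
        "common_friends": len(common),
    }


# Hypothetical data: three users, three topics (each row sums to 1).
topics = {
    "alice": [0.7, 0.2, 0.1],
    "bob": [0.6, 0.3, 0.1],
    "carol": [0.1, 0.1, 0.8],
}
friends = {
    "alice": {"dave", "erin"},
    "bob": {"dave", "erin", "frank"},
    "carol": {"frank"},
}

ab = pair_features("alice", "bob", topics, friends)
ac = pair_features("alice", "carol", topics, friends)
```

Here the pair (alice, bob) shares two friends and has similar topic distributions, while (alice, carol) shares no friends and has dissimilar interests. A classifier trained on such feature vectors, with known friendships as labels, would then score candidate links.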

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Parimi, R., Caragea, D. (2011). Predicting Friendship Links in Social Networks Using a Topic Modeling Approach. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20847-8_7

  • DOI: https://doi.org/10.1007/978-3-642-20847-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20846-1

  • Online ISBN: 978-3-642-20847-8

  • eBook Packages: Computer Science, Computer Science (R0)
