Topic Extraction of Events on Social Media Using Reinforced Knowledge

  • Xuefei Zhang
  • Ruifang He
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11062)


Conventional topic models are insufficient for extracting topics of events on social media because microblog posts are sparse and noisy. Existing research uses word embeddings as prior knowledge to guide modeling, or integrates conversation structures to enrich context. However, it ignores the context shared across a large number of events, which can serve as prior knowledge to reinforce coherent topic generation for each event. We therefore propose a Reinforced Knowledge LDA for discovering the topics of each event. It consists of three steps: (1) running a topic model based on word embeddings and conversation structures to extract prior topics of each event; (2) automatically mining reinforced knowledge sets from the prior topics of all events; (3) using the reinforced knowledge sets to generate the final topics of every event. Experimental results on three real-world datasets, each containing 50 events, demonstrate the effectiveness of the proposed model and of the reinforced knowledge.
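Step (2) can be illustrated with a small sketch. The abstract does not specify how the reinforced knowledge sets are mined, so the following assumes, purely for illustration, that a knowledge set is a pair of words that co-occurs among the top words of at least `min_support` prior topics pooled over all events. The function name `mine_knowledge_sets`, the `min_support` parameter, and the toy topics are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_knowledge_sets(prior_topics, min_support=2):
    """Return word pairs that co-occur in at least `min_support` prior topics.

    prior_topics: list of topics, each given as a list of its top words,
    pooled over all events. This is an assumed, simplified stand-in for
    the paper's mining step, not the authors' procedure.
    """
    pair_counts = Counter()
    for topic in prior_topics:
        # Sort the deduplicated words so each unordered pair has one
        # canonical form and is counted once per topic.
        for pair in combinations(sorted(set(topic)), 2):
            pair_counts[pair] += 1
    return {pair for pair, count in pair_counts.items() if count >= min_support}


# Toy prior topics, invented for illustration only.
prior_topics = [
    ["earthquake", "rescue", "victims", "aid"],
    ["flood", "rescue", "victims", "water"],
    ["election", "vote", "candidate", "poll"],
    ["earthquake", "aid", "donation", "victims"],
]
knowledge_sets = mine_knowledge_sets(prior_topics, min_support=2)
```

Under this assumption, pairs such as ("rescue", "victims") survive because they recur across the prior topics of different events, which is the kind of shared context the abstract says can reinforce topic generation in step (3).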


Topic extraction, Social media, Reinforced knowledge, Word embedding, Conversation structure



This work is supported by the National Science Foundation of China (No. 61472277). We would like to thank the anonymous reviewers for their detailed and helpful comments and suggestions.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin, China
  2. School of Computer Science and Technology, Tianjin University, Tianjin, China
