Research on Hot Topic Discovery Technology of Micro-blog Based on Biterm Topic Model

  • Jun Feng
  • Yu Fang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 699)

Abstract

To overcome the data sparsity and expression diversity of short texts and to improve clustering quality, this paper proposes a text feature enhancement method based on the biterm topic model (BTM). First, we apply BTM to the corpus to obtain a high-frequency word matrix of the underlying topics, and then use this matrix to selectively strengthen the traditional vector space model (VSM), reducing vector dimensionality and highlighting the main features. We also propose a heat calculation equation that combines the propagation characteristics and time effect of micro-blogs, so that the evolution of a topic can be better demonstrated and analyzed. Experiments show that our method improves clustering quality and that the heat calculation equation benefits the discovery of hot topics and the analysis of their evolution.
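
For orientation, BTM learns topics from biterms, that is, unordered pairs of words that co-occur in the same short text, which is how it sidesteps the sparsity of per-document word counts. The following is a minimal sketch of biterm extraction only, assuming the text has already been tokenized; the function name `extract_biterms` is hypothetical, and this is not the authors' implementation of the feature enhancement or heat calculation steps.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    # Sort each pair so ("b", "a") and ("a", "b") are treated as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical three-word micro-blog post, already tokenized.
biterms = extract_biterms(["apple", "phone", "launch"])
```

A text of n tokens yields n(n-1)/2 biterms, so even very short posts contribute several co-occurrence observations to the corpus-level model.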

Keywords

Biterm topic model · Feature enhancement · Topic discovery · Hot topic evolution

Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Electronic and Information Engineering, Tongji University, Shanghai, China
