Comparison of Two-Pass Algorithms for Dynamic Topic Modeling Based on Matrix Decompositions

  • Gabriella Skitalinskaya
  • Mikhail Alexandrov
  • John Cardiff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10633)

Abstract

In this paper we present a two-pass algorithm based on different matrix decompositions, such as LSI, PCA, ICA and NMF, which allows tracking of the evolution of topics over time. The proposed dynamic topic models produce an easily interpreted overview of the topics found in a sequentially organized set of documents, and this output requires no further processing. Each topic is represented by a user-specified number of top terms. Applied to, for example, a news article data set, such an approach to topic modeling can be convenient and useful for economists, sociologists, and political scientists. The proposed approach achieves results comparable to those obtained using complex probabilistic models, such as LDA.
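
The abstract does not spell out the two passes, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes NMF as the decomposition, fitted with standard multiplicative updates; it assumes the first pass factors each time window separately and the second pass factors the stacked window topics. The helper names (`nmf`, `top_terms`, `two_pass_topics`), the toy vocabulary, and the counts are all hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative v (n x m) as w (n x k) times h (k x m)
    with multiplicative updates for the squared-error loss."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def top_terms(h, vocab, n_top):
    """Return the n_top highest-weighted terms for every topic row of h."""
    result = []
    for row in h:
        order = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        result.append([vocab[j] for j in order[:n_top]])
    return result

def two_pass_topics(windows, vocab, k_window, k_dynamic, n_top):
    """windows: one document-term count matrix per time window (assumed layout)."""
    # Pass 1: factor each time window on its own and collect its topic rows.
    window_topics = []
    for t, docs in enumerate(windows):
        _, h = nmf(docs, k_window, seed=t)
        window_topics.extend(h)
    # Pass 2: factor the stacked window topics to obtain dynamic topics.
    _, h_dyn = nmf(window_topics, k_dynamic, seed=len(windows))
    return top_terms(h_dyn, vocab, n_top)

# Toy data (hypothetical): two time windows, four documents each.
vocab = ["oil", "gas", "price", "vote", "party", "election"]
windows = [
    [[3, 2, 1, 0, 0, 0], [2, 3, 2, 0, 0, 0], [0, 0, 0, 2, 3, 1], [0, 0, 1, 3, 2, 2]],
    [[4, 1, 2, 0, 0, 0], [1, 2, 3, 0, 0, 0], [0, 0, 0, 1, 2, 3], [0, 0, 0, 2, 2, 4]],
]
topics = two_pass_topics(windows, vocab, k_window=2, k_dynamic=2, n_top=3)
for i, terms in enumerate(topics):
    print(f"dynamic topic {i}: {', '.join(terms)}")
```

The `n_top` argument plays the role of the user-specified number of top terms. Swapping `nmf` for another decomposition (LSI, PCA, ICA) would require handling negative weights when ranking terms, which this sketch does not attempt.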

Keywords

Dynamic topic modeling, Matrix decomposition, Latent Dirichlet Allocation

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Gabriella Skitalinskaya (1, 2, 4)
  • Mikhail Alexandrov (3, 4)
  • John Cardiff (1)
  1. Institute of Technology, Tallaght, Dublin, Ireland
  2. Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
  3. Autonomous University of Barcelona, Barcelona, Spain
  4. Russian Presidential Academy of National Economy and Public Administration, Moscow, Russia