Probabilistic Topic Models

  • Murugan Anandarajan
  • Chelsey Hill
  • Thomas Nolan
Part of the Advances in Analytics and Data Science book series (AADS, volume 2)


Abstract

This chapter introduces topic models, a family of unsupervised, probabilistic models for text analysis. A topic model decomposes the full term-document matrix (TDM) or document-term matrix (DTM) into two major components: the distribution of each topic over terms and the distribution of each document over topics. The topic models covered in this chapter are latent Dirichlet allocation, dynamic topic models, correlated topic models, supervised latent Dirichlet allocation, and structural topic models. The chapter closes with a discussion of decision-making and topic model validation.
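
The two-component decomposition described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the vocabulary, the two toy topics, the probability values, and the names `theta`, `beta`, and `term_probabilities` are all assumptions chosen for illustration. The sketch only shows how a document distribution over topics and a topic distribution over terms combine into per-document term probabilities. It does not estimate either component from data, which is the job of a method such as latent Dirichlet allocation.

```python
# Toy illustration only: all values below are invented for this sketch.
terms = ["game", "team", "vote", "election"]

# Topic distribution over terms: one row per topic, each row sums to 1.
beta = [
    [0.50, 0.40, 0.05, 0.05],  # hypothetical "sports" topic
    [0.05, 0.05, 0.40, 0.50],  # hypothetical "politics" topic
]

# Document distribution over topics: one row per document, each row sums to 1.
theta = [
    [0.9, 0.1],  # document 0 is mostly about the first topic
    [0.2, 0.8],  # document 1 is mostly about the second topic
]


def term_probabilities(theta, beta):
    """Combine the two components into P(term | document).

    For each document d and term v this computes
    sum_k P(topic k | d) * P(term v | topic k),
    which is the matrix product of theta and beta.
    """
    n_topics = len(beta)
    n_terms = len(beta[0])
    return [
        [sum(doc[k] * beta[k][v] for k in range(n_topics)) for v in range(n_terms)]
        for doc in theta
    ]


doc_term = term_probabilities(theta, beta)
for d, row in enumerate(doc_term):
    print(f"document {d}:", dict(zip(terms, (round(p, 3) for p in row))))
```

Because every row of `theta` and `beta` is a probability distribution, every row of the product is one as well, so each document ends up with a valid distribution over the vocabulary.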


Keywords

Topic models; Probabilistic topic models; Latent Dirichlet allocation; Dynamic topic models; Correlated topic models; Structural topic models; Supervised latent Dirichlet allocation


References

  1. Arun, R., Suresh, V., Madhavan, C. V., & Murthy, M. N. (2010, June). On finding the natural number of topics with latent Dirichlet allocation: Some observations. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 391–402). Berlin/Heidelberg: Springer.
  2. Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.
  3. Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning (pp. 113–120). ACM.
  4. Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1(1), 17–35.
  5. Blei, D. M., & Lafferty, J. D. (2009). Topic models. In A. Srivastava & M. Sahami (Eds.), Text mining: Classification, clustering, and applications. London: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series.
  6. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2002). Latent Dirichlet allocation. In Advances in neural information processing systems (pp. 601–608). Cambridge, MA: MIT Press.
  7. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
  8. Blei, D., Carin, L., & Dunson, D. (2010). Probabilistic topic models. IEEE Signal Processing Magazine, 27(6), 55–65.
  9. Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1(1), 17–35.
  10. Cao, J., Xia, T., Li, J., Zhang, Y., & Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing (16th European Symposium on Artificial Neural Networks 2008), 72(7–9), 1775–1781.
  11. Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems (pp. 288–296). Cambridge, MA: MIT Press.
  12. Deveaud, R., SanJuan, E., & Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document numérique, 17(1), 61–84.
  13. Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl 1), 5228–5235.
  14. Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2), 211–244.
  15. Hofmann, T. (1999, July). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (pp. 289–296).
  16. McAuliffe, J. D., & Blei, D. M. (2008). Supervised topic models. In Advances in neural information processing systems (pp. 121–128). Cambridge, MA: MIT Press.
  17. Mimno, D., Wallach, H. M., Talley, E., Leenders, M., & McCallum, A. (2011, July). Optimizing semantic coherence in topic models. In Proceedings of the conference on empirical methods in natural language processing (pp. 262–272). Association for Computational Linguistics.
  18. Roberts, M. E., Stewart, B. M., Tingley, D., & Airoldi, E. M. (2013, January). The structural topic model and applied social science. In Advances in neural information processing systems workshop on topic models: computation, application, and evaluation (pp. 1–20).
  19. Roberts, M., et al. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58, 1064–1082.
  20. Yi, X., & Allan, J. (2009, April). A comparative study of utilizing topic models for information retrieval. In European conference on information retrieval (pp. 29–41). Berlin/Heidelberg: Springer.

Further Reading

  1. To learn more about topic models, see Blei (2012), Blei and Lafferty (2009), and Griffiths et al. (2007).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Murugan Anandarajan (1)
  • Chelsey Hill (2)
  • Thomas Nolan (3)

  1. LeBow College of Business, Drexel University, Philadelphia, USA
  2. Feliciano School of Business, Montclair State University, Montclair, USA
  3. Mercury Data Science, Houston, USA
