Probabilistic Topic Models

  • Chapter in Practical Text Analytics

Part of the book series: Advances in Analytics and Data Science (AADS, volume 2)

Abstract

This chapter introduces topic models, a family of unsupervised, probabilistic models for text analysis. A topic model breaks the full term-document matrix (TDM) or document-term matrix (DTM) down into two major components: the distribution of each topic over terms and the distribution of each document over topics. The topic models covered in this chapter are latent Dirichlet allocation, dynamic topic models, correlated topic models, supervised latent Dirichlet allocation, and structural topic models. The chapter closes with decision-making and topic model validation.
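
The two components named in the abstract can be illustrated with a small sketch. It is not taken from the chapter: the vocabulary, the two topics, and all probabilities are hypothetical toy values, and the helper name `expected_term_probs` is invented for illustration. It only shows how a document's distribution over topics and each topic's distribution over terms combine into a per-document distribution over terms, which is the quantity a fitted topic model uses to approximate the normalized rows of the DTM.

```python
# Hypothetical toy values; none of these numbers come from the chapter.
terms = ["price", "market", "goal", "team"]

# Topic distribution over terms: one row per topic, each row sums to 1.
topic_term = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0 (finance-like)
    [0.05, 0.05, 0.40, 0.50],  # topic 1 (sports-like)
]

# Document distribution over topics: one row per document, each row sums to 1.
doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]


def expected_term_probs(doc_topic, topic_term):
    """Return p(term | doc) = sum over topics k of p(k | doc) * p(term | k)."""
    n_terms = len(topic_term[0])
    return [
        [
            sum(theta_k * beta_k[v] for theta_k, beta_k in zip(theta, topic_term))
            for v in range(n_terms)
        ]
        for theta in doc_topic
    ]


doc_term = expected_term_probs(doc_topic, topic_term)

# Print one row per document: the implied probability of each term.
for d, row in enumerate(doc_term):
    print(d, {t: round(p, 3) for t, p in zip(terms, row)})
```

Because every row of both inputs is a probability distribution, every row of the result is one as well. Fitting a model such as latent Dirichlet allocation runs in the opposite direction: it starts from observed term counts and estimates the two matrices.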

References

  • Arun, R., Suresh, V., Madhavan, C. V., & Murthy, M. N. (2010, June). On finding the natural number of topics with latent Dirichlet allocation: Some observations. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 391–402). Berlin/Heidelberg: Springer.

  • Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.

  • Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning (pp. 113–120). ACM.

  • Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1(1), 17–35.

  • Blei, D. M., & Lafferty, J. D. (2009). Topic models. In A. Srivastava & M. Sahami (Eds.), Text mining: Classification, clustering, and applications. London: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series.

  • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2002). Latent Dirichlet allocation. In Advances in neural information processing systems (pp. 601–608). Cambridge, MA: MIT Press.

  • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

  • Blei, D., Carin, L., & Dunson, D. (2010). Probabilistic topic models. IEEE Signal Processing Magazine, 27(6), 55–65.

  • Cao, J., Xia, T., Li, J., Zhang, Y., & Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing, 72(7–9), 1775–1781.

  • Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems (pp. 288–296). Cambridge, MA: MIT Press.

  • Deveaud, R., SanJuan, E., & Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document numérique, 17(1), 61–84.

  • Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl 1), 5228–5235.

  • Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2), 211–244.

  • Hofmann, T. (1999, July). Probabilistic latent semantic analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (pp. 289–296).

  • McAuliffe, J. D., & Blei, D. M. (2008). Supervised topic models. In Advances in neural information processing systems (pp. 121–128). Cambridge, MA: MIT Press.

  • Mimno, D., Wallach, H. M., Talley, E., Leenders, M., & McCallum, A. (2011, July). Optimizing semantic coherence in topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 262–272). Association for Computational Linguistics.

  • Roberts, M. E., Stewart, B. M., Tingley, D., & Airoldi, E. M. (2013, January). The structural topic model and applied social science. In Advances in neural information processing systems workshop on topic models: computation, application, and evaluation (pp. 1–20).

  • Roberts, M., et al. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58, 1064–1082.

  • Yi, X., & Allan, J. (2009, April). A comparative study of utilizing topic models for information retrieval. In European Conference on Information Retrieval (pp. 29–41). Berlin/Heidelberg: Springer.

Further Reading

  • To learn more about topic models, see Blei (2012), Blei and Lafferty (2009), and Griffiths et al. (2007).

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Anandarajan, M., Hill, C., Nolan, T. (2019). Probabilistic Topic Models. In: Practical Text Analytics. Advances in Analytics and Data Science, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-95663-3_8
