Bayesian Joint Optimization for Topic Model and Clustering

  • Conference paper
Artificial Neural Networks – ICANN 2010 (ICANN 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6352)

Abstract

Statistical clustering divides the given samples into groups according to assumed distributions. In high-dimensional problems, such as document or image clustering, applying it directly suffers from over-fitting and the curse of dimensionality. In many cases, the dimensionality is reduced first and a clustering algorithm is applied afterwards. However, such two-stage methods neglect the interaction between the two processes. In this report, we propose a hierarchical joint distribution of Latent Dirichlet Allocation and a Polya mixture, and give a parameter estimation algorithm based on Gibbs sampling. Benchmark experiments show the effectiveness of the proposed method.
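
To make the clustering side of the abstract concrete, the following is a minimal sketch of collapsed Gibbs sampling for a mixture whose per-cluster predictive distribution is a Polya (Dirichlet-multinomial) distribution over word counts. It is not the paper's joint model: the Latent Dirichlet Allocation layer is omitted entirely, and the function names `polya_log_predictive` and `gibbs_cluster`, the symmetric hyperparameters `alpha` and `beta`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def polya_log_predictive(x, pseudo):
    """Log Polya (Dirichlet-multinomial) probability of count vector x.

    `pseudo` holds the Dirichlet pseudo-counts. The multinomial coefficient
    is dropped because it is the same for every cluster.
    """
    total = sum(pseudo)
    out = math.lgamma(total) - math.lgamma(total + sum(x))
    for xv, av in zip(x, pseudo):
        out += math.lgamma(av + xv) - math.lgamma(av)
    return out


def gibbs_cluster(docs, num_clusters, alpha=1.0, beta=0.1, sweeps=50, seed=0):
    """Assign each count vector in `docs` to a cluster (illustrative only)."""
    rng = random.Random(seed)
    vocab = len(docs[0])
    labels = [rng.randrange(num_clusters) for _ in docs]
    sizes = [0] * num_clusters                       # documents per cluster
    counts = [[0] * vocab for _ in range(num_clusters)]  # word counts per cluster
    for x, k in zip(docs, labels):
        sizes[k] += 1
        for v, xv in enumerate(x):
            counts[k][v] += xv

    for _ in range(sweeps):
        for d, x in enumerate(docs):
            # Remove document d from its current cluster.
            k_old = labels[d]
            sizes[k_old] -= 1
            for v, xv in enumerate(x):
                counts[k_old][v] -= xv
            # Conditional log-weight: cluster size prior times Polya predictive.
            logw = [
                math.log(sizes[k] + alpha)
                + polya_log_predictive(x, [beta + c for c in counts[k]])
                for k in range(num_clusters)
            ]
            top = max(logw)  # subtract the maximum for numerical stability
            weights = [math.exp(w - top) for w in logw]
            k_new = rng.choices(range(num_clusters), weights=weights)[0]
            # Add document d to the sampled cluster.
            labels[d] = k_new
            sizes[k_new] += 1
            for v, xv in enumerate(x):
                counts[k_new][v] += xv
    return labels


# Toy data (assumed): two groups of documents with disjoint vocabularies.
docs = [[10, 10, 0, 0]] * 3 + [[0, 0, 10, 10]] * 3
labels = gibbs_cluster(docs, num_clusters=2)
```

On data this cleanly separated, the sampler is expected to place the first three documents in one cluster and the last three in the other. A joint model of the kind the abstract describes would additionally sample topic assignments, so that the clustering operates on a learned low-dimensional representation rather than on raw counts.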

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hosino, T. (2010). Bayesian Joint Optimization for Topic Model and Clustering. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15819-3_11

  • DOI: https://doi.org/10.1007/978-3-642-15819-3_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15818-6

  • Online ISBN: 978-3-642-15819-3

  • eBook Packages: Computer Science, Computer Science (R0)
