
Two Alternative Criteria for a Split-Merge MCMC on Dirichlet Process Mixture Models

Conference paper: Artificial Neural Networks and Machine Learning – ICANN 2017 (ICANN 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10614)

Abstract

The free energy and the generalization error are two major model selection criteria; however, in general they are not equivalent. Previous studies of the split-merge algorithm on conjugate Dirichlet process mixture models mainly used the complete free energy. In this work, we propose a new criterion, the complete leave-one-out cross-validation, which is based on an approximation of the generalization error. In numerical experiments, our proposal outperforms the previous methods in test-set perplexity. Finally, we discuss the appropriate usage of these two criteria in light of the experimental results.
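
The abstract does not reproduce the paper's definitions, so the following LaTeX block is only a plausible sketch of the two criteria under the usual conjugate Dirichlet process mixture setup; the notation (data x_1,...,x_n, cluster assignments z, concentration parameter alpha, conjugate base measure G_0, held-out test points) is assumed here rather than taken from the paper:

% A sketch under assumed conjugate-DPMM notation; not the paper's exact definitions.
\begin{align*}
% Complete free energy of an assignment z, with mixture parameters integrated out:
F(z) &= -\log p(x_{1:n}, z \mid \alpha, G_0), \\
% Complete leave-one-out cross-validation, approximating the generalization error:
\mathrm{CLOO}(z) &= -\frac{1}{n} \sum_{i=1}^{n} \log p\bigl(x_i \mid x_{-i}, z, G_0\bigr), \\
% Test-set perplexity on held-out points \tilde{x}_{1:m} (lower is better):
\mathrm{PPL} &= \exp\Bigl(-\frac{1}{m} \sum_{j=1}^{m} \log p(\tilde{x}_j \mid x_{1:n})\Bigr).
\end{align*}

Under conjugacy, both the collapsed joint likelihood and the per-point posterior predictive densities have closed forms, which is what makes either quantity cheap enough to score split and merge proposals.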


Notes

  1. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.


Author information

Correspondence to Tikara Hosino.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hosino, T. (2017). Two Alternative Criteria for a Split-Merge MCMC on Dirichlet Process Mixture Models. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science, vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_76

  • DOI: https://doi.org/10.1007/978-3-319-68612-7_76

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68611-0

  • Online ISBN: 978-3-319-68612-7

