Abstract

This paper develops a novel online algorithm, moving average stochastic variational inference (MASVI), which reuses the results obtained in previous iterations to smooth out noisy natural gradients. We analyze the convergence properties of the proposed algorithm and conduct experiments on two large-scale collections containing millions of documents. Experimental results indicate that, compared with the stochastic variational inference (SVI) and SGRLD algorithms, our algorithm achieves a faster convergence rate and better performance.
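The core idea stated above is to average the noisy natural-gradient estimates from recent iterations before applying an update, so that the variance of each stochastic step is reduced. The snippet below is only a minimal sketch of such a moving-average update under simplified assumptions (a generic parameter vector, a fixed averaging window, and placeholder names such as smoothed_update and window); it is not the paper's exact MASVI update for LDA, which operates on variational parameters of the topic model.

```python
# Illustrative sketch only: a generic moving-average update for noisy
# stochastic (natural) gradient estimates. All names here are placeholders,
# not the paper's notation; the actual MASVI update for LDA is more involved.
from collections import deque

import numpy as np


def smoothed_update(param, noisy_grad, history, window, step_size):
    """Average the current noisy gradient with the most recent ones,
    then step along the smoothed direction."""
    history.append(noisy_grad)
    if len(history) > window:
        history.popleft()                      # keep only the last `window` gradients
    smoothed = np.mean(history, axis=0)        # moving average over the window
    return param + step_size * smoothed, history


# Toy usage: the true gradient direction is all ones, observed with heavy noise.
rng = np.random.default_rng(0)
param = np.zeros(5)
hist = deque()
for t in range(1, 101):
    g = np.ones(5) + rng.normal(scale=2.0, size=5)   # noisy gradient estimate
    param, hist = smoothed_update(param, g, hist, window=10, step_size=1.0 / t)
```

Averaging over a window trades a small amount of bias (stale gradients) for a large reduction in variance, which is why this kind of smoothing can converge faster than updates based on a single noisy gradient.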


References

  • Amari, S., 1998. Natural gradient works efficiently in learning. Neur. Comput., 10(2):251–276. [doi:10.1162/089976698300017746]

  • Andrieu, C., de Freitas, N., Doucet, A., et al., 2003. An introduction to MCMC for machine learning. Mach. Learn., 50(1–2):5–43. [doi:10.1023/A:1020281327116]

  • Blatt, D., Hero, A.O., Gauchman, H., 2007. A convergent incremental gradient method with a constant step size. SIAM J. Optim., 18(1):29–51. [doi:10.1137/040615961]

  • Blei, D.M., 2012. Probabilistic topic models. Commun. ACM, 55(4):77–84. [doi:10.1145/2133806.2133826]

  • Blei, D.M., Ng, A.Y., Jordan, M.I., 2003. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022.

  • Canini, K.R., Shi, L., Griffiths, T.L., 2009. Online inference of topics with latent Dirichlet allocation. J. Mach. Learn. Res., 5(2):65–72.

  • Griffiths, T.L., Steyvers, M., 2004. Finding scientific topics. PNAS, 101(suppl 1):5228–5235. [doi:10.1073/pnas.0307752101]

  • Hoffman, M., Bach, F.R., Blei, D.M., 2010. Online learning for latent Dirichlet allocation. Advances in Neural Information Processing Systems, p.856–864.

  • Hoffman, M., Blei, D.M., Wang, C., et al., 2013. Stochastic variational inference. J. Mach. Learn. Res., 14(1):1303–1347.

  • Liu, Z., Zhang, Y., Chang, E.Y., et al., 2011. PLDA+: parallel latent Dirichlet allocation with data placement and pipeline processing. ACM Trans. Intell. Syst. Technol., 2(3), Article 26.

  • Newman, D., Asuncion, A., Smyth, P., et al., 2009. Distributed algorithms for topic models. J. Mach. Learn. Res., 10:1801–1828.

  • Ouyang, J., Lu, Y., Li, X., 2014. Momentum online LDA for large-scale datasets. Proc. 21st European Conf. on Artificial Intelligence, p.1075–1076.

  • Patterson, S., Teh, Y.W., 2013. Stochastic gradient Riemannian Langevin dynamics on the probability simplex. Advances in Neural Information Processing Systems, p.3102–3110.

  • Ranganath, R., Wang, C., Blei, D.M., et al., 2013. An adaptive learning rate for stochastic variational inference. J. Mach. Learn. Res., 28(2):298–306.

  • Schaul, T., Zhang, S., LeCun, Y., 2013. No more pesky learning rates. arXiv preprint, arXiv:1206.1106v2.

  • Song, X., Lin, C.Y., Tseng, B.L., et al., 2005. Modeling and predicting personal information dissemination behavior. Proc. 11th ACM SIGKDD Int. Conf. on Knowledge Discovery in Data Mining, p.479–488. [doi:10.1145/1081870.1081925]

  • Tadić, V.B., 2009. Convergence rate of stochastic gradient search in the case of multiple and non-isolated minima. arXiv preprint, arXiv:0904.4229v2.

  • Teh, Y.W., Newman, D., Welling, M., 2007. A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. Advances in Neural Information Processing Systems, p.1353–1360.

  • Wang, C., Chen, X., Smola, A.J., et al., 2013. Variance reduction for stochastic gradient optimization. Advances in Neural Information Processing Systems, p.181–189.

  • Wang, Y., Bai, H., Stanton, M., et al., 2009. PLDA: parallel latent Dirichlet allocation for large-scale applications. Proc. 5th Int. Conf. on Algorithmic Aspects in Information and Management, p.301–314. [doi:10.1007/978-3-642-02158-9_26]

  • Yan, F., Xu, N., Qi, Y., 2009. Parallel inference for latent Dirichlet allocation on graphics processing units. Advances in Neural Information Processing Systems, p.2134–2142.

  • Ye, Y., Gong, S., Liu, C., et al., 2013. Online belief propagation algorithm for probabilistic latent semantic analysis. Front. Comput. Sci., 7(5):526–535. [doi:10.1007/s11704-013-2360-7]

Author information

Corresponding author

Correspondence to Ji-hong Ouyang.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61170092, 61133011, and 61103091)

ORCID: Xi-ming LI, http://orcid.org/0000-0001-8190-5087


About this article

Cite this article

Li, Xm., Ouyang, Jh. & Lu, Y. Topic modeling for large-scale text data. Frontiers Inf Technol Electronic Eng 16, 457–465 (2015). https://doi.org/10.1631/FITEE.1400352
