Multiplicative Updates for Learning with Stochastic Matrices

  • Zhanxing Zhu
  • Zhirong Yang
  • Erkki Oja
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7944)


Stochastic matrices are arrays whose elements are discrete probabilities. They are widely used in techniques such as Markov chains and probabilistic latent semantic analysis. In such learning problems, the learned matrices, being stochastic, are nonnegative, and all or part of their elements sum to one. Conventional multiplicative updates, which have been widely used for nonnegative learning, cannot accommodate the stochasticity constraint. Simply normalizing the nonnegative matrix at each learning step may adversely affect the convergence of the optimization algorithm. Here we discuss and compare two alternative ways of developing multiplicative update rules for stochastic matrices. One reparameterizes the matrices before applying the multiplicative update principle; the other employs relaxation with Lagrangian multipliers, so that the updates jointly optimize the objective and steer the estimate towards the constraint manifold. We compare the new methods against the conventional normalization approach on two applications: parameter estimation of the Hidden Markov Chain Model and Information-Theoretic Clustering. Empirical studies on both synthetic and real-world datasets demonstrate that the algorithms using the new methods perform more stably and efficiently than the conventional ones.
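The constraint problem described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a toy least-squares objective ||X - WH||² with H held fixed, uses the standard Lee-Seung multiplicative step for W, and shows that the step keeps W nonnegative but lets its row sums drift away from one, so that a separate normalization is needed. The function names, the toy data, and the row-wise map W_ij = U_ij / Σ_k U_ik (one natural way to parameterize a row-stochastic matrix) are all assumptions made for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def row_normalize(u):
    """Map a nonnegative matrix with positive row sums to a row-stochastic one.

    Assumed illustrative form: W_ij = U_ij / sum_k U_ik.
    """
    return [[v / sum(row) for v in row] for row in u]


def multiplicative_step(w, x, h, eps=1e-12):
    """One Lee-Seung style step for min ||X - W H||^2 over W >= 0, H fixed.

    W <- W * (X H^T) / (W H H^T), applied element-wise.
    """
    ht = transpose(h)
    numer = matmul(x, ht)
    denom = matmul(matmul(w, h), ht)
    return [
        [w_ij * n / (d + eps) for w_ij, n, d in zip(w_row, n_row, d_row)]
        for w_row, n_row, d_row in zip(w, numer, denom)
    ]


# Hypothetical toy problem: X is generated from a row-stochastic W.
h = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
w_true = [[0.8, 0.2], [0.3, 0.7]]
x = matmul(w_true, h)

w0 = [[0.9, 0.1], [0.1, 0.9]]            # row-stochastic starting point
w_plain = multiplicative_step(w0, x, h)  # nonnegative, rows no longer sum to 1
w_norm = row_normalize(w_plain)          # conventional fix: renormalize each step

plain_sums = [sum(row) for row in w_plain]
norm_sums = [sum(row) for row in w_norm]
print("row sums after plain step:   ", plain_sums)
print("row sums after normalization:", norm_sums)
```

The renormalization in the last step is the conventional approach that the abstract says may harm convergence, since it changes the iterate after the multiplicative step has been taken. The two alternatives discussed in the paper instead build the constraint into the update itself; their exact update rules are not given in the abstract and are not reproduced here.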


Keywords: nonnegative learning, stochastic matrix, multiplicative update



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Zhanxing Zhu (1)
  • Zhirong Yang (1)
  • Erkki Oja (1)
  1. Department of Information and Computer Science, Aalto University, Aalto, Finland
