
Probabilistic and Bayesian Networks

  • Ke-Lin Du
  • M. N. S. Swamy
Chapter

Abstract

This chapter introduces several important probabilistic models. The Bayesian network is a well-known probabilistic model in machine learning, and the hidden Markov model is a special case of the Bayesian network for dynamic systems. Important probabilistic methods, including sampling methods, the expectation–maximization method, the variational Bayesian method, and mixture models, are introduced. Some Bayesian and probabilistic approaches to machine learning are also mentioned.
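
As a concrete illustration of one of the methods surveyed here, below is a minimal sketch of the expectation–maximization (EM) algorithm fitting a two-component one-dimensional Gaussian mixture. It is not taken from the chapter; the data, variable names, and initial values are purely illustrative assumptions.

    # Minimal EM sketch for a two-component 1-D Gaussian mixture.
    # Illustrative only: synthetic data and hypothetical parameter choices.
    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic sample drawn from two Gaussians.
    x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 700)])

    # Initial guesses for mixing weights, means, and variances.
    pi = np.array([0.5, 0.5])
    mu = np.array([-1.0, 1.0])
    var = np.array([1.0, 1.0])

    def gaussian_pdf(x, mu, var):
        return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

    for _ in range(100):
        # E-step: posterior responsibility of each component for each point.
        dens = np.stack([pi[k] * gaussian_pdf(x, mu[k], var[k]) for k in range(2)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from the responsibilities.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

    print("weights:", pi, "means:", mu, "variances:", var)

The sketch alternates the two steps until the parameter estimates stabilize, which is the basic pattern shared by the EM variants discussed in the chapter.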

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
  2. Xonlink Inc., Hangzhou, China
