Abstract
We present a probabilistic model for robust factor analysis and principal component analysis in which the observation noise is modeled by Student-t distributions in order to reduce the negative effect of outliers. The Student-t distributions are modeled independently for each data dimension, in contrast to previous work that uses multivariate Student-t distributions. We compare methods using the proposed noise distribution, the multivariate Student-t distribution, and the Laplace distribution. Because the posterior probability density is intractable to evaluate, we use variational Bayesian approximation methods. We demonstrate that the assumed noise model can yield accurate reconstructions because the corrupted elements of a poor-quality sample can be reconstructed using the other elements of the same data vector. Experiments on an artificial dataset and a weather dataset show that the independence across dimensions and the flexibility of the proposed Student-t noise model can make it superior in some applications.
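The difference between the two Student-t noise models can be illustrated with a minimal sketch. It is not the variational Bayesian algorithm of the paper. It only uses the standard representation of a Student-t distribution as a Gaussian scale mixture and computes the expected scale variable given a residual, as in EM estimation of the t distribution. The function names, the toy residual matrix, and the values of `sigma2` and `nu` are assumptions made for illustration.

```python
import numpy as np


def elementwise_weights(residuals, sigma2, nu):
    """Expected scale variable for independent Student-t noise per element.

    Each element r_ij has its own scale variable, so a large residual
    lowers the weight of that single element only.
    """
    return (nu + 1.0) / (nu + residuals**2 / sigma2)


def samplewise_weights(residuals, sigma2, nu):
    """Expected scale variable for an isotropic multivariate Student-t.

    All elements of a sample (row) share one scale variable, so a single
    large residual lowers the weight of the whole data vector.
    """
    d = residuals.shape[1]
    squared_distance = np.sum(residuals**2, axis=1) / sigma2
    return (nu + d) / (nu + squared_distance)


# Hypothetical residuals: 3 samples (rows) x 4 dimensions (columns).
# Only element [1, 1] is corrupted.
residuals = np.array([
    [0.1, -0.2, 0.0, 0.1],
    [0.0, 8.0, -0.1, 0.2],
    [-0.1, 0.1, 0.2, 0.0],
])

w_elem = elementwise_weights(residuals, sigma2=1.0, nu=4.0)
w_samp = samplewise_weights(residuals, sigma2=1.0, nu=4.0)
```

In this toy case `w_elem` is small only at the corrupted element, while the clean elements of the same row keep a weight above one and remain available for reconstructing the corrupted one. Under the sample-wise model the entire second row receives a single small weight, so its clean elements are discounted as well.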
Cite this article
Luttinen, J., Ilin, A. & Karhunen, J. Bayesian Robust PCA of Incomplete Data. Neural Process Lett 36, 189–202 (2012). https://doi.org/10.1007/s11063-012-9230-4
DOI: https://doi.org/10.1007/s11063-012-9230-4