Data Mining and Knowledge Discovery, Volume 25, Issue 2, pp 298–324

Tensor factorization using auxiliary information

  • Atsuhiro Narita
  • Kohei Hayashi
  • Ryota Tomioka
  • Hisashi Kashima
Open Access


Most existing analysis methods for tensors (or multi-way arrays) assume only that the tensors to be completed are of low rank. However, when such methods are applied to tensor completion problems, their prediction accuracy tends to be significantly worse when only a limited number of entries are observed. In this paper, we propose to use relationships among data as auxiliary information, in addition to the low-rank assumption, to improve the quality of tensor decomposition. We introduce two regularization approaches using graph Laplacians induced from the relationships, one for moderately sparse cases and the other for extremely sparse cases. We also present two kinds of iterative algorithms for approximate solutions: one based on an EM-like algorithm, which is stable but not very scalable, and the other based on gradient-based optimization, which is applicable to large-scale datasets. Numerical experiments on tensor completion using synthetic and benchmark datasets show that the use of auxiliary information improves completion accuracy over the existing methods based only on the low-rank assumption, especially when observations are sparse.
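As a rough illustration of what a graph-Laplacian regularizer can look like, the following minimal sketch builds a Laplacian L = D - A from a similarity graph over the objects in one tensor mode and evaluates the penalty tr(UᵀLU) on a factor matrix U. This is a generic construction, not the authors' implementation; the function names, the toy adjacency matrix, and the toy factor matrix are all hypothetical, and how this penalty relates to either of the two regularization approaches in the paper is an assumption.

```python
def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative adjacency matrix."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [[(degrees[i] if i == j else 0.0) - adjacency[i][j] for j in range(n)]
            for i in range(n)]


def laplacian_penalty(factor, laplacian):
    """Return tr(U^T L U) for a factor matrix U given as a list of rows."""
    n = len(factor)
    rank = len(factor[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(factor[i][r] * factor[j][r] for r in range(rank))
            total += laplacian[i][j] * dot
    return total


def pairwise_penalty(factor, adjacency):
    """Return 0.5 * sum_ij A_ij * ||u_i - u_j||^2, which equals tr(U^T L U)."""
    n = len(factor)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(factor[i], factor[j]))
            total += 0.5 * adjacency[i][j] * sq
    return total


# Hypothetical similarity graph over three objects in one mode:
# objects 0 and 1 are strongly related, objects 1 and 2 weakly related.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]]

# Hypothetical rank-2 factor matrix: rows 0 and 1 agree, row 2 differs.
U = [[1.0, 2.0],
     [1.0, 2.0],
     [3.0, 0.0]]

L = graph_laplacian(A)
penalty = laplacian_penalty(U, L)
```

In this toy case the strongly related objects 0 and 1 have identical factor rows and contribute nothing, so the whole penalty comes from the weaker edge between objects 1 and 2. Adding such a term to a completion objective pulls the factors of related objects toward each other, which is the intuition behind using relationships as auxiliary information when few entries are observed.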


Tensors, Multi-way arrays, CP-decomposition, Tucker decomposition, Side information



The authors would like to thank the anonymous reviewers of ECML PKDD 2011 for their valuable comments and suggestions to improve the quality of the paper.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.



Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Atsuhiro Narita (1)
  • Kohei Hayashi (1)
  • Ryota Tomioka (1)
  • Hisashi Kashima (1, 2)

  1. Department of Mathematical Informatics, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  2. Basic Research Programs PRESTO, Synthesis of Knowledge for Information Oriented Society, Tokyo, Japan
