
Learning and Utilizing a Pool of Features in Non-negative Matrix Factorization

  • Tetsuya Yoshida
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8210)

Abstract

Learning and utilizing a pool of features for given data is important for achieving better performance in data analysis. Since much real-world data can be represented as a non-negative data matrix, Non-negative Matrix Factorization (NMF) has become a popular approach for analyzing data under the non-negativity constraint. However, as the number of features increases, the constraint imposed on the features can hinder effective use of the learned representation. We conduct extensive experiments to investigate the effectiveness of several state-of-the-art NMF algorithms for learning and utilizing a pool of features on document datasets. The results reveal that coping with the non-orthogonality of the learned features is crucial for achieving stable performance when a large number of features is exploited in NMF.
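As a minimal illustration of the setting (not the experimental setup or algorithms evaluated in the paper), NMF approximates a non-negative matrix X by W H with W, H ≥ 0, typically by minimizing ||X − W H||²_F. The sketch below, which assumes scikit-learn and uses a random matrix as a stand-in for a document-term matrix, fits an NMF model and then measures how far the learned feature vectors are from orthogonal via their normalized Gram matrix; the rank, initialization, and library choice are assumptions for illustration only.

```python
# Hypothetical sketch: NMF on a stand-in document-term matrix, plus a simple
# check of how non-orthogonal the learned feature vectors are.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((100, 500))           # 100 documents x 500 terms, non-negative

k = 20                               # size of the feature pool (factorization rank)
model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)           # per-document coefficients (100 x k)
H = model.components_                # k feature (basis) vectors over terms (k x 500)

# For perfectly orthogonal features the normalized Gram matrix H H^T would be
# the identity; large off-diagonal entries indicate non-orthogonal features.
H_unit = H / np.linalg.norm(H, axis=1, keepdims=True)
gram = H_unit @ H_unit.T
off_diag = np.abs(gram - np.eye(k))
print("max cosine similarity between distinct features:", off_diag.max())
```

With a larger rank k, the off-diagonal cosine similarities typically grow, which is the kind of non-orthogonality the abstract identifies as the obstacle to exploiting a large feature pool.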

Keywords

Normalized Mutual Information, Document Clustering, Cluster Assignment, Imbalanced Data, Imbalanced Dataset



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Tetsuya Yoshida
  1. Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
