Bi-clustering via MDL-Based Matrix Factorization

  • Ignacio Ramírez
  • Mariano Tepper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8258)

Abstract

Bi-clustering, or co-clustering, is the task of finding sub-matrices (each indexed by a group of rows and a group of columns) within a matrix such that the elements of each sub-matrix are related in some way, for example by being similar under some metric. As in traditional clustering, a crucial parameter in bi-clustering methods is the number of groups one expects to find in the data, which is not always known or easy to guess. This paper proposes a novel bi-clustering method based on low-rank sparse non-negative matrix factorization (S-NMF), with the additional benefit that the optimum rank k is chosen automatically by a minimum description length (MDL) selection procedure, which favors models that represent the data with fewer bits. The MDL procedure is tested in combination with three different S-NMF algorithms, two of which are novel, on a simulated example to assess its validity.
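
As a rough illustration of the idea in the abstract, the sketch below pairs one simple S-NMF variant (multiplicative updates with an L1 penalty on the coefficient matrix) with a crude two-part code length, and picks the rank that needs the fewest bits. It is not the authors' algorithm or their MDL code length: the function names (`sparse_nmf`, `description_length`, `select_rank`, `planted_blocks`), the penalty `lam`, the quantization step `delta`, and the 16-bit cost per nonzero coefficient are all assumptions made for this example.

```python
import math
import numpy as np


def sparse_nmf(X, k, lam=0.1, iters=300, seed=0):
    """Illustrative S-NMF: multiplicative updates with an L1 penalty on H.

    Approximates X (non-negative, m x n) by W @ H with W (m x k), H (k x n).
    This is a generic variant chosen for brevity, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + 1e-3
    H = rng.random((k, n)) + 1e-3
    eps = 1e-12
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + lam + eps)   # lam shrinks H toward 0
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Fix the scale ambiguity: unit-norm columns in W, scale moved into H.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


def log2_binom(n, r):
    """log2 of the binomial coefficient C(n, r), via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(r + 1)
            - math.lgamma(n - r + 1)) / math.log(2)


def description_length(X, W, H, delta=1e-3, value_bits=16, tol=1e-8):
    """Crude two-part code length in bits (assumed model, for illustration).

    Residual: Gaussian code for values quantized to step delta.
    Factors: support of the nonzeros plus value_bits per nonzero entry.
    """
    resid = X - W @ H
    sigma2 = max(float(np.mean(resid ** 2)), delta ** 2)
    bits_resid = 0.5 * X.size * math.log2(
        2 * math.pi * math.e * sigma2 / delta ** 2)
    bits_model = 0.0
    for A in (W, H):
        nnz = int(np.count_nonzero(A > tol))
        bits_model += log2_binom(A.size, nnz) + nnz * value_bits
    return bits_resid + bits_model


def select_rank(X, ranks, **nmf_args):
    """Fit each candidate rank and return the one with the shortest code."""
    lengths = {}
    for k in ranks:
        W, H = sparse_nmf(X, k, **nmf_args)
        lengths[k] = description_length(X, W, H)
    best = min(lengths, key=lengths.get)
    return best, lengths


def planted_blocks(n_blocks=3, size=20, noise=0.01, seed=1):
    """Toy non-negative matrix with n_blocks diagonal blocks plus small noise."""
    rng = np.random.default_rng(seed)
    n = n_blocks * size
    X = noise * rng.random((n, n))
    for b in range(n_blocks):
        s = slice(b * size, (b + 1) * size)
        X[s, s] += 1.0
    return X


if __name__ == "__main__":
    X = planted_blocks()
    best, lengths = select_rank(X, range(1, 6))
    for k, bits in lengths.items():
        print(f"k={k}: {bits:.0f} bits")
    print("selected rank:", best)
```

Running the script prints the code length for each candidate rank on a toy matrix with three planted blocks. The trade-off is the point of the exercise: a larger k lowers the cost of coding the residual but spends more bits on the factors, so the total need not keep decreasing as k grows.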

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ignacio Ramírez (1)
  • Mariano Tepper (2)
  1. Universidad de la República, Uruguay
  2. Duke University, USA
