Data Clustering with Semi-binary Nonnegative Matrix Factorization

  • Conference paper
Artificial Intelligence and Soft Computing – ICAISC 2008 (ICAISC 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5097)

Abstract

Recently, interest in using Nonnegative Matrix Factorization (NMF) for pattern classification and data clustering has grown considerably. For nonnegative data (observations, data items, feature vectors), many partitional clustering problems can be modeled as a matrix factorization into two groups of vectors: the nonnegative centroid vectors and the binary vectors of cluster indicators. Hence our partitional data clustering problem boils down to a semi-binary NMF problem. Usually, NMF problems are solved by alternating minimization of a given cost function with multiplicative algorithms. Since our NMF problem has particular characteristics, we update the estimated factors with a different algorithm from those commonly used, namely a binary update steered by simulated annealing. As a result, our algorithm outperforms some well-known algorithms for partitional clustering.
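The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a squared Euclidean cost, takes each centroid as the mean of its cluster (which stays nonnegative when the data are nonnegative), and uses a Metropolis-style acceptance rule with geometric cooling for the binary cluster indicators. The names `semi_binary_nmf`, `centroids`, `sq_dist` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def sq_dist(x, a):
    """Squared Euclidean distance between a data point and a centroid."""
    return sum((xi - ai) ** 2 for xi, ai in zip(x, a))


def centroids(X, labels, k):
    """Nonnegative factor: each centroid is the mean of its cluster's points."""
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, c in zip(X, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += x[d]
    # Assumption: an empty cluster gets a zero centroid (arbitrary choice).
    return [[s / n if n else 0.0 for s in row] for row, n in zip(sums, counts)]


def semi_binary_nmf(X, k, sweeps=50, t0=1.0, cooling=0.8, seed=0):
    """Alternate centroid updates with annealed binary label updates."""
    rng = random.Random(seed)
    # labels[i] = c encodes a binary indicator column with a single 1 in row c.
    labels = [rng.randrange(k) for _ in X]
    temp = t0
    for _ in range(sweeps):
        A = centroids(X, labels, k)
        for i, x in enumerate(X):
            cand = rng.randrange(k)  # propose one candidate cluster per point
            # Cost change with the centroids held fixed (a simplification).
            delta = sq_dist(x, A[cand]) - sq_dist(x, A[labels[i]])
            # Metropolis rule: always accept downhill, sometimes accept uphill.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                labels[i] = cand
        temp *= cooling  # geometric cooling schedule (assumed)
    # Zero-temperature polish: greedy reassignment until the labels settle.
    for _ in range(100):
        A = centroids(X, labels, k)
        new = [min(range(k), key=lambda c: sq_dist(x, A[c])) for x in X]
        if new == labels:
            break
        labels = new
    return centroids(X, labels, k), labels


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
    A, labels = semi_binary_nmf(X, k=2)
    print(A, labels)
```

The binary factor can be recovered from `labels` as a k-by-n matrix with a single 1 per column. Early sweeps at higher temperature occasionally accept cost-increasing reassignments, which is what lets an annealed update escape poor starting partitions that a purely greedy binary update could get stuck in.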

Editor information

Leszek Rutkowski Ryszard Tadeusiewicz Lotfi A. Zadeh Jacek M. Zurada

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zdunek, R. (2008). Data Clustering with Semi-binary Nonnegative Matrix Factorization. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. ICAISC 2008. Lecture Notes in Computer Science, vol 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_68

  • DOI: https://doi.org/10.1007/978-3-540-69731-2_68

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69572-1

  • Online ISBN: 978-3-540-69731-2

  • eBook Packages: Computer Science (R0)
