Document Clustering Games in Static and Dynamic Scenarios

  • Rocco Tripodi
  • Marcello Pelillo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10163)

Abstract

In this work we propose a game-theoretic model for document clustering. Each document to be clustered is represented as a player and each cluster as a strategy. Players receive a reward from interacting with other players and try to maximize it by choosing their best strategies. The geometry of the data is modeled with a weighted graph that encodes the pairwise similarity among documents, so that similar players are constrained to choose similar strategies, updating their strategy preferences at each iteration of the games. We used different approaches to find the prototypical elements of the clusters and, with this information, divided the players into two disjoint sets: one collecting players with a definite strategy, and the other collecting players that try to learn the correct strategy to play from the others. The latter set of players can be considered as new data points that have to be clustered according to previous information. This representation is useful in scenarios in which the data are streamed continuously. The system was evaluated on 13 document datasets using different settings. The results show that the proposed method performs well compared with other document clustering algorithms.
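
The player/strategy framing above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the update rule is taken to be the discrete-time replicator dynamics, the payoff between two players is their similarity when they pick the same cluster and zero otherwise, and the function name `cluster_game`, its parameters, and the toy similarity matrix are hypothetical. It is meant as an illustration of the idea, not as the authors' implementation.

```python
def cluster_game(similarity, labeled, n_clusters, iterations=100, tol=1e-9):
    """Sketch of a clustering game (assumed update rule: replicator dynamics).

    similarity: n x n list of lists with non-negative pairwise similarities.
    labeled:    dict mapping a player index to its fixed cluster index
                (the players with a definite strategy).
    Returns (labels, strategies), where strategies[i][k] is the preference
    of player i for cluster k.
    """
    n = len(similarity)
    # Labeled players hold a one-hot strategy; the others start uniform.
    x = []
    for i in range(n):
        if i in labeled:
            x.append([1.0 if k == labeled[i] else 0.0 for k in range(n_clusters)])
        else:
            x.append([1.0 / n_clusters] * n_clusters)

    for _ in range(iterations):
        new_x, change = [], 0.0
        for i in range(n):
            if i in labeled:
                new_x.append(x[i])  # definite strategies never change
                continue
            # Payoff of cluster k for player i: similarity-weighted support
            # that the other players currently give to k.
            payoff = [
                sum(similarity[i][j] * x[j][k] for j in range(n) if j != i)
                for k in range(n_clusters)
            ]
            average = sum(p * u for p, u in zip(x[i], payoff))
            if average == 0.0:
                new_x.append(x[i])  # isolated player: nothing to learn from
                continue
            # Replicator step: strategies paying above average gain weight.
            updated = [p * u / average for p, u in zip(x[i], payoff)]
            change += sum(abs(a - b) for a, b in zip(updated, x[i]))
            new_x.append(updated)
        x = new_x
        if change < tol:
            break

    labels = [max(range(n_clusters), key=lambda k: x[i][k]) for i in range(n)]
    return labels, x


# Toy data (made up): documents 0-2 are mutually similar, as are 3-4.
W = [
    [0.00, 0.90, 0.80, 0.10, 0.10],
    [0.90, 0.00, 0.85, 0.10, 0.05],
    [0.80, 0.85, 0.00, 0.05, 0.10],
    [0.10, 0.10, 0.05, 0.00, 0.90],
    [0.10, 0.05, 0.10, 0.90, 0.00],
]
# Documents 0 and 3 act as the players with a definite strategy.
labels, strategies = cluster_game(W, labeled={0: 0, 3: 1}, n_clusters=2)
print(labels)
```

In this sketch the labeled players play the role of the prototypical elements, and the remaining players can be read as newly arrived documents that pick up a cluster from their most similar neighbours.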

Notes

Acknowledgements

This work was partly supported by the Samsung Global Research Outreach Program.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. ECLT, Ca’ Foscari University, Ca’ Minich, Venice, Italy
  2. DAIS, Ca’ Foscari University, Via Torino, Venice, Italy