Language Resources and Evaluation, Volume 52, Issue 3, pp 733–770

A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework

  • Flavio Massimiliano Cecchini
  • Martin Riedl
  • Elisabetta Fersini
  • Chris Biemann
Original Paper
Abstract

This article presents a comparison of different word sense induction (WSI) clustering algorithms on two novel pseudoword data sets of semantic-similarity-based and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts, used to simulate a polysemous word. The evaluation compares each algorithm’s output on a pseudoword’s ego word graph (i.e., a graph that represents the pseudoword’s context in the corpus) against the known subdivision into the components corresponding to the monosemous source words that form the pseudoword. The main contribution of this article is a self-sufficient pseudoword-based evaluation framework for graph-based WSI clustering algorithms, which defines a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale, systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
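The pseudoword setup can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the stopword list, and the toy corpus are all invented for illustration, and connected components stand in for a real graph clustering algorithm. The sketch conflates two monosemous words into one pseudoword (keeping the source word as the gold "sense"), builds the pseudoword's co-occurrence ego graph with the ego node removed, and then checks whether the two largest clusters line up one-to-one with the two source words, which is the intuition behind a top2-style measure.

```python
from collections import defaultdict
from itertools import combinations

STOP = {"i", "a", "the", "she", "he", "its", "on"}  # toy stopword list

def make_pseudoword_corpus(sentences, w1, w2):
    """Replace every occurrence of two monosemous source words with a single
    pseudoword; the replaced source word is kept as the gold 'sense' label."""
    pseudo = f"{w1}_{w2}"
    tagged = []
    for sent in sentences:
        toks = [t for t in sent.lower().split() if t not in STOP]
        if w1 in toks or w2 in toks:
            gold = w1 if w1 in toks else w2
            tagged.append(([pseudo if t in (w1, w2) else t for t in toks], gold))
    return pseudo, tagged

def ego_graph(tagged, pseudo):
    """Co-occurrence graph over the pseudoword's context words; the ego node
    itself is excluded, so its 'senses' can fall apart into components."""
    adj = defaultdict(set)
    for toks, _ in tagged:
        ctx = sorted(set(toks) - {pseudo})
        for a, b in combinations(ctx, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def components(adj):
    """Connected components (largest first), standing in for graph clustering."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                comp.add(v)
                stack.extend(adj[v] - seen)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)

sentences = [
    "i peeled a yellow banana", "the monkey ate a ripe banana",
    "a yellow ripe banana tastes sweet", "she closed the wooden door",
    "the door creaked on its hinges", "he painted the wooden door",
]
pseudo, tagged = make_pseudoword_corpus(sentences, "banana", "door")
clusters = components(ego_graph(tagged, pseudo))

# top2-style check: do the two largest clusters match the two source words?
for cluster in clusters[:2]:
    golds = {g for toks, g in tagged if set(toks) & cluster}
    print(sorted(cluster), "->", golds)
```

On this toy corpus the two largest components correspond exactly to the "banana" and "door" contexts; a third, smaller component ("creaked"/"hinges") falls outside the top two, which is why an evaluation measure restricted to the two largest clusters is a natural fit for pseudowords built from exactly two sources.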


Keywords: Word sense induction · Graph clustering · Pseudowords · Evaluation



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. DISCo, Università degli Studi di Milano-Bicocca, Milan, Italy
  2. Informatikum, Universität Hamburg, Hamburg, Germany
