Semantic Clustering of Russian Web Search Results: Possibilities and Problems

  • Andrey Kutuzov
Part of the Communications in Computer and Information Science book series (CCIS, volume 505)


This paper deals with word sense induction from lexical co-occurrence graphs. We construct such graphs from large Russian corpora and then use them to cluster web search results according to the senses of the query. We compare different clustering methods and different source corpora, and describe models for applying distributional semantics to big linguistic data.
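
As a rough illustration of the general idea of inducing senses from a co-occurrence graph and using them to group search results, a minimal Python sketch follows. It is not the method evaluated in the paper: taking connected components as senses is a deliberately simple stand-in for a real graph clustering algorithm, and the function names, the `min_count` parameter, and the toy contexts for the ambiguous word "лук" (onion / bow) are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(contexts, target, min_count=1):
    """Link words that share a context; the target word itself is left out."""
    counts = defaultdict(int)
    for tokens in contexts:
        words = sorted(set(tokens) - {target})
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    graph = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_count:  # hypothetical frequency threshold
            graph[a].add(b)
            graph[b].add(a)
    return graph


def induce_senses(graph):
    """Treat each connected component as one induced sense (a simplification)."""
    seen, senses = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        senses.append(component)
    return senses


def cluster_results(snippets, senses):
    """Label each snippet with the sense it shares most words with (-1 if none)."""
    labels = []
    for tokens in snippets:
        overlaps = [len(set(tokens) & sense) for sense in senses]
        best = max(range(len(senses)), key=overlaps.__getitem__, default=-1)
        labels.append(best if best >= 0 and overlaps[best] > 0 else -1)
    return labels


# Toy corpus contexts for the ambiguous word "лук" (invented for illustration).
contexts = [
    ["лук", "стрела", "тетива"],
    ["лук", "стрела", "охота"],
    ["лук", "суп", "морковь"],
    ["лук", "морковь", "грядка"],
]
senses = induce_senses(build_cooccurrence_graph(contexts, "лук"))

# Toy search snippets, already tokenized and lemmatized.
snippets = [["купить", "лук", "морковь"], ["лук", "стрела", "спорт"], ["погода"]]
labels = cluster_results(snippets, senses)
```

On real corpora the graph would rarely split into clean components, so a weighted graph and a proper clustering step would be needed; the sketch only shows how induced senses can serve as cluster labels for search snippets.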



Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Group, National Research University Higher School of Economics, Moscow, Russia
