Abstract
This paper presents experimental results on automatic word sense disambiguation (WSD). The empirical basis of the study is a collection of contexts for polysemous and/or homonymous Russian nouns denoting physical objects, extracted from the Russian National Corpus (RNC). Machine learning software for WSD was developed within the project. The WSD tool used in the experiments performs statistical processing and classification of noun contexts. Disambiguation takes into account lexical markers of word meanings in contexts as well as the semantic annotation of the contexts. The experiments made it possible to determine optimal conditions for WSD in Russian texts.
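As an illustration of what statistical classification of noun contexts by lexical markers can look like, the sketch below trains a small Naive Bayes classifier with add-one smoothing on bag-of-words contexts. The abstract does not name the algorithm used in the paper, so Naive Bayes is an assumption made here for illustration only. The names `train`, `classify`, and `TRAIN` are hypothetical, and the transliterated contexts for the homonym "luk" (onion vs. bow) are invented toy data, not RNC material.

```python
import math
from collections import Counter, defaultdict

def train(labelled_contexts):
    """Count senses and context words from (sense, words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in labelled_contexts:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def classify(model, words):
    """Return the sense with the highest smoothed log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense in sorted(sense_counts):
        # Log prior of the sense.
        score = math.log(sense_counts[sense] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            if w in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical toy contexts for the homonym "luk" (onion vs. bow).
TRAIN = [
    ("onion", ["rezat", "zharit", "sup"]),
    ("onion", ["zelenyj", "salat", "rezat"]),
    ("bow", ["strela", "tetiva", "streljat"]),
    ("bow", ["ohotnik", "strela", "natjanut"]),
]

model = train(TRAIN)
print(classify(model, ["strela", "ohotnik"]))
print(classify(model, ["salat", "rezat"]))
```

In this sketch the context words play the role of lexical markers; semantic annotation of contexts could be modelled in the same way by adding semantic tags to each word list as extra features.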
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mitrofanova, O., Lashevskaya, O., Panicheva, P. (2008). Statistical Word Sense Disambiguation in Contexts for Russian Nouns Denoting Physical Objects. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_21
DOI: https://doi.org/10.1007/978-3-540-87391-4_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science (R0)