Abstract
Two methods for evaluating the semantic similarity/dissimilarity of English nouns are proposed, based on their modifier sets taken from the Oxford Collocations Dictionary for Students of English. The first method measures similarity by the portion of modifiers applicable to both nouns under evaluation. The second method measures dissimilarity by the change in the mean cohesion between a noun and a set of modifiers when its own modifiers are replaced by those of the contrasted noun. Cohesion between words is measured by the Stable Connection Index (SCI), based on raw Web statistics for occurrences and co-occurrences of words. It is shown that the two proposed measures are in approximately inverse monotonic dependency, while the Web evaluations confer a higher resolution.
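The two measures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the shared portion of modifiers is normalised, so division by the geometric mean of the two set sizes is an assumption, and the SCI formula is not given here, so cohesion is passed in as a caller-supplied function. The names `modifier_overlap_similarity`, `cohesion_drop`, and `toy_cohesion`, as well as the modifier sets and scores, are hypothetical.

```python
from math import sqrt
from statistics import mean
from typing import Callable, Set


def modifier_overlap_similarity(mods_a: Set[str], mods_b: Set[str]) -> float:
    """Portion of modifiers applicable to both nouns.

    Assumption: the overlap is normalised by the geometric mean of the
    two set sizes; the abstract only speaks of a "portion".
    """
    if not mods_a or not mods_b:
        return 0.0
    return len(mods_a & mods_b) / sqrt(len(mods_a) * len(mods_b))


def cohesion_drop(
    noun_a: str,
    mods_a: Set[str],
    noun_b: str,
    mods_b: Set[str],
    cohesion: Callable[[str, str], float],
) -> float:
    """Change in mean cohesion when a noun's own modifiers are swapped
    for those of the contrasted noun, averaged over both directions.

    Both modifier sets must be non-empty. `cohesion` stands in for a
    Web-based score such as SCI, which is not computed here.
    """
    own_a = mean(cohesion(noun_a, m) for m in mods_a)
    cross_a = mean(cohesion(noun_a, m) for m in mods_b)
    own_b = mean(cohesion(noun_b, m) for m in mods_b)
    cross_b = mean(cohesion(noun_b, m) for m in mods_a)
    return ((own_a - cross_a) + (own_b - cross_b)) / 2


# Hypothetical modifier sets, not taken from the dictionary.
MODIFIERS = {
    "tea": {"strong", "hot", "green", "herbal"},
    "coffee": {"strong", "hot", "black", "instant"},
}


def toy_cohesion(noun: str, modifier: str) -> float:
    """Made-up score: high for a noun's listed modifiers, low otherwise."""
    return 8.0 if modifier in MODIFIERS[noun] else 2.0


similarity = modifier_overlap_similarity(MODIFIERS["tea"], MODIFIERS["coffee"])
dissimilarity = cohesion_drop(
    "tea", MODIFIERS["tea"], "coffee", MODIFIERS["coffee"], toy_cohesion
)
print(similarity, dissimilarity)
```

In this toy setting, a larger overlap of modifier sets raises the first value and lowers the second, which matches the inverse dependency the abstract reports.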
Work done under partial support of the Mexican Government (CONACyT, SNI, SIP-IPN).
References
Bolshakov, I.A., Bolshakova, E.I.: Measurements of Lexico-Syntactic Cohesion by means of Internet. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds.) MICAI 2005: Advances in Artificial Intelligence. LNCS (LNAI), vol. 3789, pp. 790–799. Springer, Heidelberg (2005)
Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Hirst, G., Budanitsky, A.: Correcting Real-Word Spelling Errors by Restoring Lexical Cohesion. Natural Language Engineering 11(1), 87–111 (2005)
Keller, F., Lapata, M.: Using the Web to Obtain Frequencies for Unseen Bigrams. Computational Linguistics 29(3), 459–484 (2003)
Ledo-Mezquita, Y., Sidorov, G.: Combinación de los métodos de Lesk original y simplificado para desambiguación de sentidos de palabras. In: International Workshop on Natural Language Understanding and Intelligent Access to Textual Information, in conjunction with MICAI-2005, Mexico, pp. 41–47 (2005)
Lin, D.: Automatic Retrieval and Clustering of Similar Words. In: COLING-ACL 1998 (1998)
Manning, C.D., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)
McCarthy, D., Koeling, R., Weeds, J., Carroll, J.: Finding Predominant Word Senses in Untagged Text. In: ACL 2004 (2004)
Oxford Collocations Dictionary for Students of English. Oxford University Press, Oxford (2003)
Patwardhan, S., Banerjee, S., Pedersen, T.: Using Measures of Semantic Relatedness for Word Sense Disambiguation. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. LNCS, vol. 2588, Springer, Heidelberg (2003)
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bolshakov, I.A., Gelbukh, A. (2007). Two Methods of Evaluation of Semantic Similarity of Nouns Based on Their Modifier Sets. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_39
DOI: https://doi.org/10.1007/978-3-540-73351-5_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73350-8
Online ISBN: 978-3-540-73351-5
eBook Packages: Computer Science, Computer Science (R0)