Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms

Muller, Philippe; Langlais, Philippe

doi:10.1007/978-3-642-21043-3_40


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6657)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Automated thesaurus construction by collecting relations between lexical items (synonyms, antonyms, etc.) has a long tradition in natural language processing. This has been done by exploiting dictionary structures or distributional context regularities (co-occurrence, syntactic associations, or translation equivalents) in order to define measures of lexical similarity or relatedness. Dyvik proposed using aligned multilingual corpora and defined similar terms as terms that often share their translations. We evaluate the usefulness of this similarity for the extraction of synonyms, compared to the more widespread distributional approach.
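
As a rough illustration of the idea that similar terms are those that often share their translations, the following minimal sketch scores two words by the cosine of their translation-count vectors. The toy alignment counts, the function names, and the choice of cosine as the overlap measure are all assumptions made for this example; they are not taken from the paper or from Dyvik's work.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: how often each English word was aligned with each
# French word in a parallel corpus. The counts are invented for illustration.
ALIGNMENTS = {
    "big":   Counter({"grand": 8, "gros": 5, "important": 1}),
    "large": Counter({"grand": 6, "gros": 3, "vaste": 2}),
    "small": Counter({"petit": 9, "faible": 2}),
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def translation_similarity(w1, w2, alignments=ALIGNMENTS):
    """Score two words by how much their translation distributions overlap."""
    return cosine(alignments.get(w1, Counter()), alignments.get(w2, Counter()))


def rank_candidates(word, alignments=ALIGNMENTS):
    """Rank all other words as synonym candidates, most similar first."""
    scores = [(other, translation_similarity(word, other, alignments))
              for other in alignments if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for candidate, score in rank_candidates("big"):
        print(f"{candidate}\t{score:.3f}")
```

A distributional variant of the same sketch would keep `cosine` unchanged and replace the translation counts with counts of monolingual context features such as co-occurring words or syntactic dependencies.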


References

Edmonds, P., Hirst, G.: Near-Synonymy and lexical choice. Computational Linguistics 28(2), 105–144 (2002)
Michiels, A., Noel, J.: Approaches to thesaurus production. In: Proceedings of Coling 1982 (1982)
Kozima, H., Furugori, T.: Similarity between words computed by spreading activation on an English dictionary. In: Proceedings of the Conference of the European Chapter of the ACL, pp. 232–239 (1993)
Niwa, Y., Nitta, Y.: Co-occurrence vectors from corpora vs. distance vectors from dictionaries. In: Proceedings of Coling 1994 (1994)
Lin, D.: Automatic retrieval and clustering of similar words. In: Proceedings of Coling 1998, Montreal, vol. 2, pp. 768–774 (1998)
Freitag, D., Blume, M., Byrnes, J., Chow, E., Kapadia, S., Rohwer, R., Wang, Z.: New experiments in distributional representations of synonymy. In: Proceedings of CoNLL, pp. 25–32 (2005)
van der Plas, L., Tiedemann, J.: Finding synonyms using automatic word alignment and measures of distributional similarity. In: Proceedings of the COLING/ACL Poster Sessions, pp. 866–873 (2006)
Wu, H., Zhou, M.: Optimizing synonyms extraction with mono and bilingual resources. In: Proceedings of the Second International Workshop on Paraphrasing. Association for Computational Linguistics, Sapporo (2003)
Dyvik, H.: Translations as semantic mirrors: From parallel corpus to wordnet. In: The Theory and Use of English Language Corpora, ICAME 2002 (2002)
Bannard, C., Callison-Burch, C.: Paraphrasing with bilingual parallel corpora. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 597–604 (2005)
Zhitomirsky-Geffet, M., Dagan, I.: Bootstrapping distributional feature vector quality. Computational Linguistics 35(3), 435–461 (2009)
Weeds, J.E.: Measures and Applications of Lexical Distributional Similarity. PhD thesis, University of Sussex (2003)
Barzilay, R., McKeown, K.R.: Extracting paraphrases from a parallel corpus. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (2001)
Lin, D., Zhao, S., Qin, L., Zhou, M.: Identifying synonyms among distributionally similar words. In: Proceedings of IJCAI 2003, pp. 1492–1493 (2003)
Curran, J.R., Moens, M.: Improvements in automatic thesaurus extraction. In: Proceedings of the ACL 2002 Workshop on Unsupervised Lexical Acquisition, pp. 59–66 (2002)
van der Plas, L., Tiedemann, J., Manguin, J.: Automatic acquisition of synonyms for French using parallel corpora. In: Proceedings of the 4th International Workshop on Distributed Agent-Based Retrieval Tools (2010)
Hagiwara, M., Ogawa, Y., Toyama, K.: Supervised synonym acquisition using distributional features and syntactic patterns. Journal of Natural Language Processing 16(2), 59–83 (2009)
Ferret, O.: Testing semantic similarity measures for extracting synonyms from a corpus. In: Proceedings of LREC (2010)
Turney, P.: A uniform approach to analogies, synonyms, antonyms, and associations. In: Proceedings of Coling 2008, pp. 905–912 (2008)
Miller, G., Charles, W.: Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1), 1–28 (1991)


Author information

Authors and Affiliations

IRIT, Univ. Toulouse & Alpage, INRIA, France
Philippe Muller
DIRO, Univ. Montréal, Canada
Philippe Langlais


Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, 3737 Wascana Parkway, Regina, S4S 0A2, Saskatchewan, Canada
Cory Butz
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras

About this paper

Cite this paper

Muller, P., Langlais, P. (2011). Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_40


DOI: https://doi.org/10.1007/978-3-642-21043-3_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science, Computer Science (R0)
