Definitions and Characteristics

  • Carlos Ramisch
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

In this chapter, we discuss definitions and properties of multiword expressions (MWEs) and briefly introduce the research field of automatic MWE treatment. Although we include pointers to linguistic and psycholinguistic studies, most of the related work cited in this chapter has a strong computational background.

Keywords

Noun Phrase; Lexical Item; Nominal Compound; Lexical Unit; Idiomatic Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Carlos Ramisch (1)
  1. Aix Marseille University, Marseille, France
