Application 1: Lexicography

  • Carlos Ramisch
Part of the Theory and Applications of Natural Language Processing book series (NLP)


This chapter presents the results of evaluating the mwetoolkit methodology for the creation of MWE dictionaries. First, we explore the creation of a dictionary of Greek nominal expressions (Sect. 6.1). Second, we present the creation of two lexical resources for Brazilian Portuguese. Both contain complex predicates (verbal expressions) and each targets a real application: semantic role labelling and sentiment analysis, respectively (Sect. 6.2). These two languages were chosen because (a) they are poorly resourced in terms of MWE lexicons, and (b) there was a real need to build MWE lexicons for a given application.


Keywords: Sentiment analysis; Candidate list; Lexical resource; Complex predicate; Argument taker



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Carlos Ramisch, Aix Marseille University, Marseille, France
