Abstract
A method for recognising the compositionality of Multi-Word Expressions (MWEs) is proposed. First, we study associations between MWEs and the structure of wordnet lexico-semantic relations. We discuss a simple method of splitting plWordNet’s MWEs into compositional and non-compositional ones on the basis of the hypernymy structure. Our main goal, however, is to build a classifier that recognises compositional MWEs, assuming prior MWE detection. We performed several experiments with different classification algorithms: a Naive Bayes classifier, a multinomial logistic regression model with a ridge estimator, and a Decision Table classifier. The heterogeneous feature set is based on the t-score measure for word co-occurrences, a Measure of Semantic Relatedness, and the lexico-syntactic structure of MWEs. MWE compositionality classification is analysed as a knowledge source for automated wordnet expansion.
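As an illustration of one of the features named above, the following is a minimal sketch of the textbook t-score for a two-word co-occurrence, t = (O - E) / sqrt(O), where O is the observed pair count and E is the count expected if the two words were independent. The abstract does not specify the exact variant or the counting scheme used in the paper, so the function name `t_score`, its parameters, and the sample counts are assumptions made for illustration only.

```python
import math


def t_score(pair_count, w1_count, w2_count, total):
    """Textbook t-score for a word pair (hypothetical helper, not the paper's code).

    pair_count: observed co-occurrences of the two words
    w1_count, w2_count: corpus frequencies of each word
    total: corpus size in tokens
    """
    if pair_count == 0:
        # No co-occurrence observed; return 0.0 to avoid dividing by zero.
        return 0.0
    # Co-occurrence count expected under independence of the two words.
    expected = w1_count * w2_count / total
    # The variance is approximated by the observed count, as is usual for this measure.
    return (pair_count - expected) / math.sqrt(pair_count)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(round(t_score(20, 100, 50, 10_000), 4))
```

A higher score suggests that the pair co-occurs more often than chance would predict; on its own it says nothing about compositionality, which is why the abstract combines it with other features.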
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kędzia, P., Piasecki, M., Maziarz, M., Marcińczuk, M. (2013). Recognising Compositionality of Multi-Word Expressions in the Wordnet Oriented Perspective. In: Castro, F., Gelbukh, A., González, M. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2013. Lecture Notes in Computer Science, vol 8265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45114-0_19
DOI: https://doi.org/10.1007/978-3-642-45114-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45113-3
Online ISBN: 978-3-642-45114-0
eBook Packages: Computer Science, Computer Science (R0)