Abstract
In this paper, we examine the most frequent errors that occurred during the implementation of a rule-based module for semantic relation extraction, which has been integrated into STRING, a hybrid statistical and rule-based Natural Language Processing chain for Portuguese. We focus on whole-part relations (meronymy), that is, the semantic relation that holds between an entity and another entity of which it is perceived as a constituent part, or between a member and the set it belongs to. Specifically, we target the type of meronymy involving human entities and body-part nouns. We describe in some detail the decisions made to overcome the errors produced by the system and the solutions adopted to improve its performance.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Markov, I., Mamede, N., Baptista, J. (2014). Whole-Part Relations Rule-Based Automatic Identification: Issues from Fine-Grained Error Analysis. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_5
DOI: https://doi.org/10.1007/978-3-319-13647-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science (R0)