Whole-Part Relations Rule-Based Automatic Identification: Issues from Fine-Grained Error Analysis

Markov, Ilia; Mamede, Nuno; Baptista, Jorge

doi:10.1007/978-3-319-13647-9_5

Ilia Markov, Nuno Mamede & Jorge Baptista

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8856)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

In this paper, we focus on the most frequent errors that occurred during the implementation of a rule-based module for semantic relation extraction, which has been integrated into STRING, a hybrid statistical and rule-based Natural Language Processing chain for Portuguese. We address whole-part relations (meronymy), that is, the semantic relation between an entity and another entity of which it is perceived to be a constituent part, or between a member and the set it belongs to. Specifically, we target the type of meronymy that involves human entities and body-part nouns. We describe in some detail the decisions made to overcome the errors produced by the system and the solutions adopted to improve its performance.

Author information

Authors and Affiliations

Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), México D.F., Mexico
Ilia Markov
Universidade do Algarve/FCHS and CECL, Faro, Portugal
Nuno Mamede
Spoken Language Lab, INESC-ID Lisboa/L2F, Lisboa, Portugal
Nuno Mamede & Jorge Baptista
Universidade de Lisboa/IST, Lisboa, Portugal
Jorge Baptista

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan de Dios Bátiz s/n, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Alexander Gelbukh
Área Académica de Computación y Electrónica, Carretera Pachuca-Tulancingo, Universidad Autónoma del Estado de Hidalgo, Km. 4.5, Col. Carboneras, Mineral de la Reforma, 42180, Hidalgo, Mexico
Félix Castro Espinoza
Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad Universitaria, México DF, Mexico
Sofía N. Galicia-Haro

About this paper

Cite this paper

Markov, I., Mamede, N., Baptista, J. (2014). Whole-Part Relations Rule-Based Automatic Identification: Issues from Fine-Grained Error Analysis. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Human-Inspired Computing and Its Applications. MICAI 2014. Lecture Notes in Computer Science, vol 8856. Springer, Cham. https://doi.org/10.1007/978-3-319-13647-9_5

DOI: https://doi.org/10.1007/978-3-319-13647-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13646-2
Online ISBN: 978-3-319-13647-9
eBook Packages: Computer Science, Computer Science (R0)
