Abstract
Several lexical resources are available for the computational processing of Portuguese. They are organised differently and were created by different people, with different approaches and limitations. This paper presents the first experiments towards exploiting seven of those resources in the automatic creation of a large wordnet, where numerical scores are assigned to the inclusion of words in synsets and to the connection of synsets by semantic relations. The experiments confirm that a large wordnet can indeed be created and that, to some extent, the computed scores can be used as a confidence measure. This will enable users to select only a portion of the resource, depending on the quantity and quality of lexical-semantic knowledge that their application needs.
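To illustrate how membership scores might serve as a confidence cut-off, the following minimal sketch filters toy fuzzy synsets by a threshold. The data layout, the example words and scores, and the function name `cut_synsets` are assumptions made for illustration; they are not the format, values, or procedure of the resource described in the paper.

```python
from typing import Dict, List

# Hypothetical representation: a fuzzy synset maps each word to a
# membership score in [0, 1]. Words and scores below are invented.
FuzzySynset = Dict[str, float]


def cut_synsets(synsets: List[FuzzySynset], threshold: float) -> List[FuzzySynset]:
    """Keep only words whose score is at least `threshold`.

    Synsets left with no words after the cut are dropped.
    """
    result = []
    for synset in synsets:
        kept = {word: score for word, score in synset.items() if score >= threshold}
        if kept:
            result.append(kept)
    return result


toy = [
    {"carro": 0.9, "automóvel": 0.8, "viatura": 0.4},
    {"banco": 0.7, "assento": 0.2},
]

# A higher threshold yields a smaller, presumably more reliable, selection.
high_confidence = cut_synsets(toy, 0.5)
```

Raising the threshold trades coverage for confidence, which is the kind of selection the abstract says the scores are meant to enable.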
Notes
The same contributor was not allowed to label more than two sets of pairs.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Gonçalo Oliveira, H. (2016). CONTO.PT: Groundwork for the Automatic Creation of a Fuzzy Portuguese Wordnet. In: Silva, J., Ribeiro, R., Quaresma, P., Adami, A., Branco, A. (eds) Computational Processing of the Portuguese Language. PROPOR 2016. Lecture Notes in Computer Science, vol 9727. Springer, Cham. https://doi.org/10.1007/978-3-319-41552-9_29
DOI: https://doi.org/10.1007/978-3-319-41552-9_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41551-2
Online ISBN: 978-3-319-41552-9
eBook Packages: Computer Science, Computer Science (R0)