Abstract
A wordnet is an important tool for developing natural language processing applications for a language, but relying on manual creation of such a resource limits its development. This dissertation studied the automatic construction of Onto.PT, a large Portuguese wordnet, aiming to minimise the main limitations of existing Portuguese wordnets. In this context, we propose ECO, an approach for creating wordnets automatically from text in three steps: relation instances are extracted, synonymy clusters (synsets) are discovered, and the remaining relations are attached to suitable synsets. This document also reports on the contents of Onto.PT, its comparison to other wordnets, and its evaluation.
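The three ECO steps named in the abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the implementation used for Onto.PT: the extraction step is assumed to have already produced word-level triples, synsets are approximated by connected components of the synonymy graph (a stand-in for whatever clustering ECO actually applies), and attachment is a plain lookup that ignores word-sense ambiguity. The function names, relation labels, and sample words are all hypothetical.

```python
from collections import defaultdict


def discover_synsets(synonym_pairs):
    """Approximate synsets as connected components of the synonymy graph.

    Assumption: a stand-in for the clustering step, chosen for brevity.
    """
    graph = defaultdict(set)
    for a, b in synonym_pairs:
        graph[a].add(b)
        graph[b].add(a)

    synsets, seen = [], set()
    for word in sorted(graph):
        if word in seen:
            continue
        # Depth-first traversal collects every word reachable via synonymy.
        component, stack = set(), [word]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        synsets.append(frozenset(component))
    return synsets


def attach_relations(triples, synsets):
    """Lift word-level triples to synset-level triples by simple lookup.

    Assumption: each word belongs to at most one synset, so no
    disambiguation is needed; unknown words become singleton synsets.
    """
    index = {word: synset for synset in synsets for word in synset}
    attached = set()
    for source, relation, target in triples:
        source_synset = index.get(source, frozenset({source}))
        target_synset = index.get(target, frozenset({target}))
        attached.add((source_synset, relation, target_synset))
    return attached


# Hypothetical output of the extraction step (step 1).
triples = [
    ("carro", "synonym-of", "automóvel"),
    ("automóvel", "synonym-of", "viatura"),
    ("carro", "hyponym-of", "veículo"),
    ("roda", "part-of", "automóvel"),
]

# Step 2: discover synsets from the synonymy instances only.
synonym_pairs = [(a, b) for a, rel, b in triples if rel == "synonym-of"]
synsets = discover_synsets(synonym_pairs)

# Step 3: attach the remaining relations to the discovered synsets.
other_triples = [t for t in triples if t[1] != "synonym-of"]
relations = attach_relations(other_triples, synsets)
```

In this sketch the three synonymous words end up in a single synset, and the hyponymy and part-of instances are re-expressed between synsets rather than between words. A realistic pipeline would have to handle ambiguous words, which a single lookup table cannot do.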
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gonçalo Oliveira, H. (2014). The Creation of Onto.PT: A Wordnet-Like Lexical Ontology for Portuguese. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_16
DOI: https://doi.org/10.1007/978-3-319-09761-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09760-2
Online ISBN: 978-3-319-09761-9
eBook Packages: Computer Science