Bootstrapping a Portuguese WordNet from Galician, Spanish and English Wordnets

  • Conference paper
Advances in Speech and Language Technologies for Iberian Languages

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8854)

Abstract

In this article we explore the possibility of bootstrapping a European Portuguese WordNet from the English, Spanish and Galician wordnets using Probabilistic Translation Dictionaries automatically created from parallel corpora.
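To make the idea concrete, the following is a minimal sketch of how Portuguese candidate variants might be proposed for a synset from several source wordnets. It is not the authors' algorithm: the data layout, the function name `propose_variants`, the toy probabilities and the scoring rule (best translation probability per source language, averaged over the languages that cover the synset) are all assumptions made for illustration.

```python
from collections import defaultdict

def propose_variants(synsets, ptds):
    """Score Portuguese candidates for each synset (illustrative only).

    synsets: {synset_id: {lang: [variants in that language]}}
    ptds:    {lang: {source_word: {pt_word: translation_probability}}}
    Scoring rule (an assumption, not taken from the paper): for every
    source language keep the best probability seen for each Portuguese
    word, then average over the languages that cover the synset.
    """
    proposals = {}
    for synset_id, variants_by_lang in synsets.items():
        totals = defaultdict(float)
        for lang, words in variants_by_lang.items():
            best = defaultdict(float)  # best probability per pt_word for this language
            for word in words:
                for pt_word, prob in ptds.get(lang, {}).get(word, {}).items():
                    best[pt_word] = max(best[pt_word], prob)
            for pt_word, prob in best.items():
                totals[pt_word] += prob
        n_langs = len(variants_by_lang)
        proposals[synset_id] = {w: s / n_langs for w, s in totals.items()}
    return proposals

# Hypothetical toy data: one synset and three tiny dictionaries.
toy_synsets = {"syn-001": {"en": ["dog"], "es": ["perro"], "gl": ["can"]}}
toy_ptds = {
    "en": {"dog": {"cão": 0.8, "cachorro": 0.1}},
    "es": {"perro": {"cão": 0.7, "perro": 0.05}},
    "gl": {"can": {"cão": 0.9}},
}
toy_proposals = propose_variants(toy_synsets, toy_ptds)
best_candidate = max(toy_proposals["syn-001"], key=toy_proposals["syn-001"].get)
```

Under this rule a candidate supported by all three source languages outranks one proposed by a single dictionary, which is one plausible way to combine evidence from several pivot wordnets.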

The process generated a total of 56 770 synsets and 97 058 variants. An evaluation against the Brazilian OpenWordNet-PT, used as a gold standard, yielded a precision ranging from 53% to 75%, depending on the cut-off threshold. The results were satisfactory and comparable to those of similar experiments using the WN-Toolkit.
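The dependence of precision on the cut-off can be illustrated with a short sketch. The candidate scores, the gold pairs and the function name `precision_at_cutoff` below are hypothetical; the sketch only assumes that each proposed variant carries a confidence score and that a variant counts as correct when the gold standard lists it for the same synset.

```python
def precision_at_cutoff(candidates, gold, cutoff):
    """Precision of the candidates whose score reaches the cut-off.

    candidates: list of (synset_id, variant, score) triples
    gold:       set of (synset_id, variant) pairs from the gold standard
    Returns None when no candidate survives the cut-off.
    """
    kept = [(synset_id, variant)
            for synset_id, variant, score in candidates
            if score >= cutoff]
    if not kept:
        return None
    correct = sum(1 for pair in kept if pair in gold)
    return correct / len(kept)

# Hypothetical toy data, not figures from the paper.
toy_candidates = [
    ("s1", "cão", 0.80),
    ("s1", "cachorro", 0.30),
    ("s1", "perro", 0.02),
    ("s2", "gato", 0.90),
    ("s2", "gata", 0.40),
]
toy_gold = {("s1", "cão"), ("s1", "cachorro"), ("s2", "gato")}

precision_by_cutoff = {c: precision_at_cutoff(toy_candidates, toy_gold, c)
                       for c in (0.0, 0.35, 0.5)}
```

Raising the cut-off discards low-scoring candidates, so precision tends to rise while fewer variants are kept. A variant missing from the gold standard is counted as wrong here even though it may be a valid translation, so figures obtained this way are best read as a lower bound.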

This research has been carried out thanks to the Project SKATeR (TIN2012-38584-C06-01 and TIN2012-38584-C06-04) supported by the Ministry of Economy and Competitiveness of the Spanish Government.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Simões, A., Guinovart, X.G. (2014). Bootstrapping a Portuguese WordNet from Galician, Spanish and English Wordnets. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_25

  • DOI: https://doi.org/10.1007/978-3-319-13623-3_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13622-6

  • Online ISBN: 978-3-319-13623-3

  • eBook Packages: Computer Science, Computer Science (R0)
