Diagnostic Tools in plWordNet Development Process

  • Maciej Piasecki
  • Łukasz Burdka
  • Marek Maziarz
  • Michał Kaliński
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9561)

Abstract

As a wordnet grows, it becomes increasingly difficult to avoid, identify and eliminate errors in it, especially when a group of editors works in parallel. This is the case for plWordNet, the Polish wordnet. We therefore need elaborate tools both for error prevention during editing and for error detection after the work has been completed. In this paper, we first present the error prevention mechanisms built into the plWordNet editor application and the system supporting the group work of the linguistic team. Next, we discuss diagnostic tests and diagnostic tools dedicated to plWordNet. plWordNet has been in steady development for almost ten years and has reached the size of 193 k synsets and 255 k lexical meanings. We propose a typology of diagnostic levels, describe formal, structural and semantic rules for seeking errors within plWordNet, and present a new method for the automated induction of diagnostic rules. Finally, we discuss the results and benefits of the approach.
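
To make the distinction between diagnostic levels more concrete, the following minimal sketch shows what a formal check and a structural check might look like on a toy hypernymy graph. It is only an illustration under stated assumptions: the data layout (a dictionary from synset identifiers to their direct hypernyms), the function names, and the two specific checks are hypothetical and are not taken from the plWordNet tools described in the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy layout: synset id -> ids of its direct hypernyms.
Hypernyms = Dict[str, List[str]]


def dangling_links(graph: Hypernyms) -> List[Tuple[str, str]]:
    """Formal-level check (assumed): links whose target synset is undefined."""
    return sorted(
        (src, dst)
        for src, targets in graph.items()
        for dst in targets
        if dst not in graph
    )


def synsets_on_cycles(graph: Hypernyms) -> Set[str]:
    """Structural-level check (assumed): synsets reachable from themselves."""
    on_cycle: Set[str] = set()
    for start in graph:
        stack = list(graph.get(start, []))
        seen: Set[str] = set()
        while stack:
            node = stack.pop()
            if node == start:
                # Following hypernymy links led back to the starting synset.
                on_cycle.add(start)
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, []))
    return on_cycle


# Toy data with two deliberately planted errors.
toy = {
    "dog": ["canine"],
    "canine": ["animal"],
    "animal": ["dog"],   # wrong on purpose: closes a hypernymy cycle
    "cat": ["feline"],   # wrong on purpose: "feline" is never defined
}

print(dangling_links(toy))
print(sorted(synsets_on_cycles(toy)))
```

Semantic-level rules are harder to express this way, since they depend on the meaning of the linked units rather than on the shape of the graph, so they are left out of the sketch.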

Keywords

Knowledge Source, Semantic Domain, Relation Definition, Semantic Error, Relation Link
These keywords were added by machine and not by the authors.

Notes

Acknowledgments

Work financed by the Polish Ministry of Science and Higher Education under a program supporting scientific units involved in the development of a European research infrastructure for the humanities and social sciences, within the scope of the CLARIN ERIC and ESS-ERIC consortia, 2015–2016.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Maciej Piasecki (1)
  • Łukasz Burdka (1)
  • Marek Maziarz (1)
  • Michał Kaliński (1)
  1. G4.19 Research Group, Computational Intelligence Department, Wrocław University of Technology, Wrocław, Poland