Skip to main content

Preliminary Study on Automatic Recognition of Spatial Expressions in Polish Texts

  • Conference paper
  • First Online:
Book cover Text, Speech, and Dialogue (TSD 2016)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9924))

Included in the following conference series:

Abstract

In the paper we cover the problem of spatial expression recognition in text for Polish language. A spatial expression is a text fragment which describes a relative location of two or more physical objects to each other. The first part of the paper treats about a Polish corpus annotated with spatial expressions and annotators agreement. In the second part we analyse the feasibility of spatial expression recognition by overviewing relevant tools and resources for text processing for Polish. Then we present a knowledge-based approach which utilizes the existing tools and resources for Polish, including: a morpho-syntactic tagger, shallow parsers, a dependency parser, a named entity recognizer, a general ontology, a wordnet and a wordnet to ontology mapping. We also present a dedicated set of manually created syntactic and semantic patterns for generating and filtering candidates of spatial expressions. In the last part we discuss the results obtained on the reference corpus with the proposed method and present detailed error analysis.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Kolomiyets, O., Kordjamshidi, P., Bethard, S., Moens, M.: SemEval-2013 task 3: spatial role labeling. In: Second Joint Conference on Lexical and Computational Semantics (SEM). Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, USA. ACL, East Stroudsburg (2013)

    Google Scholar 

  2. LDC: ACE (Automatic Content Extraction) English Annotation Guidelines for Relations. Argument (2008)

    Google Scholar 

  3. Broda, B., Marcińczuk, M., Maziarz, M., Radziszewski, A., Wardyński, A.: KPWr: towards a free corpus of Polish. In: Calzolari, N., Choukri, K., Declerck, T., Doğan, M.U., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S. (eds.) Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey. European Language Resources Association (ELRA), May 2012

    Google Scholar 

  4. Radziszewski, A.: A tiered CRF tagger for Polish. In: Bembenik, R., Skonieczny, Ł., Rybiński, H., Kryszkiewicz, M., Niezgódka, M. (eds.) Intelligent Tools for Building a Scientific Information. SCI, vol. 467, pp. 215–230. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  5. Waszczuk, J.: Harnessing the CRF complexity with domain-specific constraints. The case of morphosyntactic tagging of a highly inflected language. In: Proceedings of COLING 2012, no. December 2012, pp. 2789–2804 (2012)

    Google Scholar 

  6. Acedański, S.: A morphosyntactic Brill tagger for inflectional languages. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds.) IceTAL 2010. LNCS, vol. 6233, pp. 3–14. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  7. Kaczmarek, A., Marcińczuk, M.: Heuristic algorithm for zero subject detection in Polish. In: Král, P., Matoušek, V. (eds.) TSD 2015. LNCS, vol. 9302, pp. 378–386. Springer, Heidelberg (2015). doi:10.1007/978-3-319-24033-6_43

    Chapter  Google Scholar 

  8. Przepiórkowski, A.: Powierzchniowe przetwarzanie języka polskiego. Problemy współczesnej nauki, teoria i zastosowania: Inżynieria lingwistyczna. Akademicka Oficyna Wydawnicza “Exit” (2008)

    Google Scholar 

  9. Głowińska, K.: Anotacja składniowa NKJP. In: Przepiórkowski, A., Bańko, M., Górski, R.L., Lewandowska-Tomaszczyk, B. (eds.) Narodowy Korpus Języka Polskiego, pp. 107–127. Wydawnictwo Naukowe PWN, Warsaw (2012)

    Google Scholar 

  10. Radziszewski, A., Pawlaczek, A.: Large-scale experiments with NP chunking of Polish. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds.) TSD 2012. LNCS, vol. 7499, pp. 143–149. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  11. Radziszewski, A.: Metody znakowania morfosyntaktycznego i automatycznej płytkiej analizy składniowej języka polski. Ph.D. thesis, Politechnika Wrocławska, Wrocław (2012)

    Google Scholar 

  12. Maziarz, M., Piasecki, M., Szpakowicz, S.: Approaching plWordNet 2.0. In: Proceedings of the 6th Global Wordnet Conference, Matsue, Japan, January 2012

    Google Scholar 

  13. Pease, A., Niles, I., Li, J.: The suggested upper merged ontology: a large ontology for the semantic web and its applications. In: Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web (2002)

    Google Scholar 

  14. Marcińczuk, M., Kocoń, J., Janicki, M.: Liner2 — a customizable framework for proper names recognition for Polish. In: Bembenik, R., Skonieczny, Ł., Rybiński, H., Kryszkiewicz, M., Niezgódka, M. (eds.) Intelligent Tools for Building a Scientific Information. SCI, vol. 467, pp. 231–254. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  15. Wróblewska, A., Woliński, M.: Preliminary experiments in Polish dependency parsing. In: Bouvry, P., Kłopotek, M.A., Leprévost, F., Marciniak, M., Mykowiecka, A., Rybiński, H. (eds.) SIIS 2011. LNCS, vol. 7053, pp. 279–292. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  16. Kordjamshidi, P., Van Otterlo, M., Moens, M.F.: Spatial role labeling: towards extraction of spatial relations from natural language. ACM Trans. Speech Lang. Process. 8(3), 1–36 (2011)

    Article  Google Scholar 

  17. Przybylska, R.: Polisemia przyimków polskich w świetle semantyki kognitywnej. Universitas, Kraków (2002)

    Google Scholar 

Download references

Acknowledgements

Work financed as part of the investment in the CLARIN-PL research infrastructure funded by the Polish Ministry of Science and Higher Education.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michał Marcińczuk .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Marcińczuk, M., Oleksy, M., Wieczorek, J. (2016). Preliminary Study on Automatic Recognition of Spatial Expressions in Polish Texts. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science(), vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_18

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-45510-5_18

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45509-9

  • Online ISBN: 978-3-319-45510-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics