Incorporating Head Recognition into a CRF Chunker

Radziszewski, Adam; Pawlaczek, Adam

doi:10.1007/978-3-642-38634-3_3


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7912)

Included in the following conference series:

Intelligent Information Systems Symposium


Abstract

While rule-based shallow parsers usually recognise the syntactic heads of phrases, statistical syntactic chunkers typically do not. Finding heads within already recognised chunks is not trivial for languages with relatively free word order, such as German or Polish, yet this information can be very useful.

We propose a simple solution that allows head recognition to be incorporated into existing chunkers by extending the standard IOB2 representation with information on head location. To evaluate this approach, we introduced the new representation into a CRF chunker for Polish. Although the idea is very simple, the results are surprisingly good.
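The abstract does not spell out the exact tag inventory, so the following Python sketch is only one plausible reading of "IOB2 extended with head location": each chunk token keeps its usual B-/I- tag, and the head token additionally receives a suffix. The `-H` suffix, the `Chunk` type, the `encode`/`decode` helpers, the chunk labels and the example sentence are all assumptions made for illustration, not the authors' actual scheme.

```python
from typing import List, NamedTuple, Optional

HEAD_SUFFIX = "-H"  # hypothetical head marker; the paper's real tag names may differ


class Chunk(NamedTuple):
    start: int            # index of the first token (inclusive)
    end: int              # index one past the last token (exclusive)
    label: str            # chunk type, e.g. "NP" (illustrative label)
    head: Optional[int]   # index of the head token, or None if unknown


def encode(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Turn chunk spans with heads into one extended IOB2 tag per token."""
    tags = ["O"] * n_tokens
    for c in chunks:
        if not (0 <= c.start < c.end <= n_tokens):
            raise ValueError(f"chunk span out of range: {c}")
        if c.head is not None and not (c.start <= c.head < c.end):
            raise ValueError(f"head outside chunk: {c}")
        for i in range(c.start, c.end):
            prefix = "B" if i == c.start else "I"
            suffix = HEAD_SUFFIX if i == c.head else ""
            tags[i] = f"{prefix}-{c.label}{suffix}"
    return tags


def decode(tags: List[str]) -> List[Chunk]:
    """Recover chunk spans and head positions from extended IOB2 tags."""
    chunks: List[Chunk] = []
    start: Optional[int] = None
    label: Optional[str] = None
    head: Optional[int] = None

    def flush(end: int) -> None:
        # Close the currently open chunk, if any, at position `end`.
        nonlocal start, label, head
        if start is not None:
            chunks.append(Chunk(start, end, label, head))
        start = label = head = None

    for i, tag in enumerate(tags):
        if tag == "O":
            flush(i)
            continue
        is_head = tag.endswith(HEAD_SUFFIX)
        core = tag[: -len(HEAD_SUFFIX)] if is_head else tag
        prefix, _, lab = core.partition("-")
        # A B- tag, a stray I- tag, or a label change opens a new chunk.
        if prefix == "B" or start is None or lab != label:
            flush(i)
            start, label = i, lab
        if is_head and head is None:
            head = i  # keep the first head-marked token of the chunk
    flush(len(tags))
    return chunks


if __name__ == "__main__":
    # Invented example sentence: "Stary dom stoi na wzgórzu"
    tokens = ["Stary", "dom", "stoi", "na", "wzgórzu"]
    gold = [Chunk(0, 2, "NP", 1), Chunk(2, 3, "VP", 2)]
    for token, tag in zip(tokens, encode(len(tokens), gold)):
        print(token, tag)
```

Because the head information lives entirely in the tag set, a sequence labeller trained on such tags would need no architectural change; only the label alphabet grows. In this sketch a chunk whose tags carry no head marker decodes with `head=None`, leaving any fallback policy to the caller.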

This work was financed by the National Centre for Research and Development (NCBiR) project SP/I/1/77065/10 (“SyNaT”).



Author information

Authors and Affiliations

Institute of Informatics, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, Wrocław, Poland
Adam Radziszewski & Adam Pawlaczek


Editor information

Editors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248, Warsaw, Poland
Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak & Agnieszka Mykowiecka
Institute of Computer Science, Polish Academy of Sciences, ul. Brzegi 55, 80-045, Gdańsk, Poland
Sławomir T. Wierzchoń


About this paper

Cite this paper

Radziszewski, A., Pawlaczek, A. (2013). Incorporating Head Recognition into a CRF Chunker. In: Kłopotek, M.A., Koronacki, J., Marciniak, M., Mykowiecka, A., Wierzchoń, S.T. (eds) Language Processing and Intelligent Information Systems. IIS 2013. Lecture Notes in Computer Science, vol 7912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38634-3_3


DOI: https://doi.org/10.1007/978-3-642-38634-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38633-6
Online ISBN: 978-3-642-38634-3
eBook Packages: Computer Science, Computer Science (R0)
