Experiments in PCFG-like Disambiguation of Constituency Parse Forests for Polish
The work presented here is the first attempt at creating a probabilistic constituency parser for Polish. The described algorithm disambiguates parse forests obtained from the Świgra parser in a manner close to Probabilistic Context-Free Grammars (PCFGs). The experiment was carried out and evaluated on the Składnica treebank. The idea behind the experiment was to check what can be achieved with this well-known method. The results are promising: the approach achieves up to \(94.1\,\%\) PARSEVAL F-measure and \(92.1\,\%\) ULAS. These figures can be compared with those of an existing Polish dependency parser, which achieves \(92.2\,\%\) ULAS.
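The abstract does not spell out the model, so the following is only a minimal sketch of textbook PCFG disambiguation over a packed parse forest, not the authors' algorithm. The forest encoding, the node names such as `VP[0,4]`, the English toy sentence, and all rule probabilities are hypothetical illustrations; they do not reflect the Świgra output format or values estimated from Składnica.

```python
import math
from functools import lru_cache

# Hypothetical packed forest: node id -> (label, alternative child sequences).
# Anything that is not a key of FOREST is treated as a terminal (a word).
FOREST = {
    "VP[0,4]": ("VP", [("V[0,1]", "NP[1,2]", "PP[2,4]"),   # PP attached to the verb
                       ("V[0,1]", "NP[1,4]")]),            # PP attached to the noun
    "NP[1,4]": ("NP", [("NP[1,2]", "PP[2,4]")]),
    "NP[1,2]": ("NP", [("man",)]),
    "PP[2,4]": ("PP", [("P[2,3]", "NP[3,4]")]),
    "P[2,3]":  ("P",  [("with",)]),
    "NP[3,4]": ("NP", [("telescope",)]),
    "V[0,1]":  ("V",  [("saw",)]),
}

# Hypothetical rule probabilities P(rhs | lhs); in practice these would be
# relative frequencies counted in a treebank.
RULE_PROB = {
    ("VP", ("V", "NP", "PP")): 0.3,
    ("VP", ("V", "NP")): 0.7,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("man",)): 0.4,
    ("NP", ("telescope",)): 0.4,
    ("PP", ("P", "NP")): 1.0,
    ("P", ("with",)): 1.0,
    ("V", ("saw",)): 1.0,
}


def label_of(node):
    """Grammar symbol of a forest node; terminals stand for themselves."""
    return FOREST[node][0] if node in FOREST else node


@lru_cache(maxsize=None)
def best(node):
    """Return (log probability, tree) of the most probable subtree at node."""
    if node not in FOREST:              # terminal: contributes probability 1
        return 0.0, node
    label, alternatives = FOREST[node]
    candidates = []
    for children in alternatives:
        rhs = tuple(label_of(c) for c in children)
        logp = math.log(RULE_PROB[(label, rhs)])
        subtrees = []
        for child in children:          # shared subforests are scored once (memoised)
            child_logp, child_tree = best(child)
            logp += child_logp
            subtrees.append(child_tree)
        candidates.append((logp, (label, *subtrees)))
    return max(candidates, key=lambda c: c[0])


log_p, tree = best("VP[0,4]")
print(tree, math.exp(log_p))
```

With these toy numbers the verb-attachment reading scores 0.3 × 0.4 × 0.4 = 0.048, against 0.7 × 0.2 × 0.4 × 0.4 = 0.0224 for noun attachment, so the first alternative is selected. Memoising on forest nodes keeps the search linear in the size of the packed forest rather than in the number of trees it encodes.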