Abstract
This paper describes a new pruning method for a rule-based parser that relies on separating the underlying grammar rules into several mutually competing levels. The method has been developed and used for Czech in the syntactic parser Synt to reduce the number of possible output derivation trees. The underlying algorithm operates on a so-called packed forest of trees, a compressed data structure used for the internal representation of parallel analyses, and is therefore very efficient. Its contribution has been evaluated on the Brno Phrasal Treebank, showing that the algorithm significantly prunes the resulting tree space while preserving promising parses.
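The idea of pruning a packed forest by rule level can be illustrated with a minimal sketch. It is not the Synt implementation: the `Node` and `Derivation` classes, the integer `level` field (lower meaning preferred), and the policy of keeping only the best-level alternatives at each node are all assumptions made for illustration. The sketch shows why working on the packed structure is cheap: each shared node is visited once, yet the number of trees it represents shrinks multiplicatively.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Derivation:
    """One way of building a node: a rule plus its child nodes."""
    rule: str                      # hypothetical rule name
    level: int                     # hypothetical rule level; lower = preferred
    children: List["Node"] = field(default_factory=list)


@dataclass
class Node:
    """Packed forest node; alternative analyses share this one node."""
    label: str
    derivations: List[Derivation] = field(default_factory=list)


def count_trees(node: Node, memo: Optional[Dict[int, int]] = None) -> int:
    """Count the derivation trees represented below `node`."""
    memo = {} if memo is None else memo
    if id(node) in memo:
        return memo[id(node)]
    if not node.derivations:       # leaf (input word)
        total = 1
    else:
        total = 0
        for d in node.derivations:
            product = 1
            for child in d.children:
                product *= count_trees(child, memo)
            total += product
    memo[id(node)] = total
    return total


def prune_by_level(node: Node, seen: Optional[Set[int]] = None) -> None:
    """Assumed policy: at each node keep only the best-level derivations."""
    seen = set() if seen is None else seen
    if id(node) in seen:           # shared nodes are processed only once
        return
    seen.add(id(node))
    if node.derivations:
        best = min(d.level for d in node.derivations)
        node.derivations = [d for d in node.derivations if d.level == best]
    for d in node.derivations:
        for child in d.children:
            prune_by_level(child, seen)


def example_forest() -> Node:
    """Tiny hand-built forest: a shared NP with two competing analyses."""
    a, b, c = Node("a"), Node("b"), Node("c")
    np = Node("NP", [
        Derivation("np_preferred", 0, [a, b]),
        Derivation("np_fallback", 1, [a, b]),
    ])
    return Node("S", [
        Derivation("s_variant_1", 0, [np, c]),
        Derivation("s_variant_2", 0, [np, c]),
    ])
```

In this toy forest the shared `NP` node is pruned once, which halves the number of trees under both `S` alternatives. How Synt actually assigns levels and decides which rules compete is described in the paper itself, not here.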
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jakubíček, M. (2011). Effective Parsing Using Competing CFG Rules. In: Habernal, I., Matoušek, V. (eds) Text, Speech and Dialogue. TSD 2011. Lecture Notes in Computer Science, vol 6836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23538-2_15
DOI: https://doi.org/10.1007/978-3-642-23538-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23537-5
Online ISBN: 978-3-642-23538-2
eBook Packages: Computer Science (R0)