Abstract
This chapter presents a discriminative modeling technique that corrects the errors made by an automatic parser. The model is similar to reranking; however, it does not require the generation of k-best lists as in McDonald et al. (2005), McDonald and Pereira (2006), Charniak and Johnson (2005), and Hall (2007). The corrective strategy employed by our technique is to explore a set of candidate parses constructed by making structurally local perturbations to an automatically generated parse tree. We train a model which makes local, corrective decisions in order to optimize parsing performance. The technique is independent of the parser generating the first set of parses. We show in this chapter that the only requirement for this technique is the ability to define a local neighborhood in which a large number of the errors occur.
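The corrective strategy described above can be sketched in a few lines. The following is a hypothetical illustration, not the authors' implementation: for each word it considers reattaching the word to heads within a small structural neighborhood of the parser's proposed head, and a scoring function (here an assumed black box) selects the best attachment.

```python
# Hypothetical sketch of corrective parsing: for each word, consider
# reattaching it to heads in a local neighborhood of the parser's
# proposed head, and keep the highest-scoring attachment.
# A parse is a dict mapping each word index to its head index (0 = root).

def neighborhood(parse, node):
    """Candidate heads for `node`: its current head, plus the parent
    and the other children of that head (a structurally local set)."""
    head = parse[node]
    cands = {head}
    if head != 0:                       # parent of the current head
        cands.add(parse[head])
    cands.update(m for m, h in parse.items()  # siblings under the head
                 if h == head and m != node)
    cands.discard(node)                 # a word may not head itself
    return cands

def correct(parse, score):
    """Replace each word's head with the best-scoring candidate head
    drawn from its local neighborhood in the original parse."""
    corrected = dict(parse)
    for node in parse:
        corrected[node] = max(neighborhood(parse, node),
                              key=lambda h: score(node, h))
    return corrected
```

In a real system the `score` function would be the trained discriminative model; the neighborhood definition is the only parser-specific assumption.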
Notes
- 1.
In order to correctly capture the dependency structure, co-indexed movement traces are used in a form similar to Government and Binding theory, GPSG, etc.
- 2.
Exhaustive parsing assumes that the optimal parse under the model has been chosen; this is in contrast to greedy techniques, where the parse may not be optimal under the model.
- 3.
The imaginary root node simplifies notation.
- 4.
The dependency structures here are very similar to those described by Mel’čuk (1988); however, the nodes of the dependency trees discussed in this chapter are limited to the words of the sentence and are always ordered according to the surface word-order.
- 5.
Node \(w_a\) is said to transitively govern node \(w_b\) if \(w_b\) is a descendant of \(w_a\) in the dependency tree.
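This definition can be checked by following head pointers upward from \(w_b\); a minimal sketch, assuming a hypothetical representation in which heads are stored in a dict:

```python
# Minimal sketch: w_a transitively governs w_b iff following head
# pointers upward from w_b eventually reaches w_a.
# Heads are stored as a dict mapping each word index to its head (0 = root).

def governs(heads, a, b):
    """True if node a transitively governs node b."""
    while b != 0:          # stop at the artificial root
        b = heads[b]       # move to b's head
        if b == a:
            return True
    return False
```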
- 6.
Bilexical dependencies are components of both the Collins and Charniak parsers and model the types of syntactic subordination that we encode in a dependency tree. (Bilexical models were also proposed by Eisner (1996)). In the absence of lexicalization, both parsers have dependency features that are encoded as head-constituent to sibling features.
- 7.
This information was provided by Eugene Charniak in a personal communication.
- 8.
A cousin is a descendant of an ancestor that is not itself an ancestor; this definition subsumes siblings.
- 9.
These statistics are for the complete PDT 1.0 dataset.
- 10.
- 11.
The CoNLL07 shared-task data is a subset of the PDT 2.0 data.
- 12.
Jack-knife cross-validation is the process of splitting the data into m sets, training on \(m-1\) of these, and applying the trained model to the remaining set. We do this m times, resulting in predictions for the entire training set while never using a model trained on the data for which we are making predictions.
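The splitting scheme in this note can be sketched as follows (hypothetical code; the real experiments train a parser rather than the toy `train`/`predict` callables assumed here):

```python
# Sketch of m-way jack-knife prediction: each item is predicted by a
# model trained on the other m-1 folds, so no model ever sees its own
# test items during training.

def jackknife_predict(data, m, train, predict):
    folds = [data[i::m] for i in range(m)]           # m disjoint splits
    predictions = []
    for i, held_out in enumerate(folds):
        train_set = [x for j, fold in enumerate(folds)
                     if j != i for x in fold]        # the other m-1 folds
        model = train(train_set)
        predictions.extend(predict(model, x) for x in held_out)
    return predictions                               # covers all of data
```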
- 13.
Using held-out development data, we determined that a Gaussian prior parameter of 4 worked best. The optimal number of training iterations was chosen on held-out data for each experiment; this was generally on the order of a couple hundred iterations of L-BFGS. The MaxEnt modeling implementation can be found at http://homepages.inf.ed.ac.uk/s0450736/maxent_toolkit.html
- 14.
The MaltEval (http://w3.msi.vxu.se/jni/malteval/) tool was used for evaluation of the dependency-based parsers.
References
Attardi, G. and M. Ciaramita (2007). Tree revision learning for dependency parsing. In Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, NY.
Berger, A.L., S.A.D. Pietra, and V.J.D. Pietra (1996). A maximum entropy approach to natural language processing. Computational Linguistics 22(1), 39–71.
Böhmová, A., J. Hajič, E. Hajičová, and B.V. Hladká (2002). The Prague Dependency Treebank: three-level annotation scenario. In A. Abeille (Ed.), Treebanks: Building and Using Syntactically Annotated Corpora. Dordrecht: Kluwer Academic Publishers.
Brill, E. (1995, December). Transformation-based error-driven learning and natural language processing: a case study in part of speech tagging. Computational Linguistics 21(4), 543–565.
Caraballo, S. and E. Charniak (1998, June). New figures of merit for best-first probabilistic chart parsing. Computational Linguistics 24(2), 275–298.
Charniak, E. (2000). A maximum-entropy-inspired parser. In Proceedings of the 2000 Conference of the North American Chapter of the Association for Computational Linguistics, ACL, New Brunswick, NJ.
Charniak, E. (2001). Immediate-head parsing for language models. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France.
Charniak, E. and M. Johnson (2005). Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan.
Collins, M. (2000). Discriminative reranking for natural language parsing. In Proceedings of the 17th International Conference on Machine Learning 2000, Stanford, CA.
Collins, M. (2003). Head-driven statistical models for natural language processing. Computational Linguistics 29(4), 589–637.
Collins, M., L. Ramshaw, J. Hajič, and C. Tillmann (1999). A statistical parser for Czech. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, College Park, MD, pp. 505–512.
Dubey, A. and F. Keller (2003). Probabilistic parsing for German using sister-head dependencies. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, pp. 96–103.
Eisner, J. (1996). Three new probabilistic models for dependency parsing: an exploration. In Proceedings of the 16th International Conference on Computational Linguistics (COLING), Copenhagen, Denmark, pp. 340–345.
Hajič, J. (1998). Building a syntactically annotated corpus: The Prague Dependency Treebank. In Issues of Valency and Meaning. Praha: Karolinum, pp. 106–132.
Hajičová, E., J. Havelka, P. Sgall, K. Veselá, and D. Zeman (2004). Issues of projectivity in the Prague Dependency Treebank. Prague Bulletin of Mathematical Linguistics 81, 5–22.
Hall, K. (2007). k-best spanning tree parsing. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic.
Hall, K. and V. Novák (2005). Corrective modeling for non-projective dependency parsing. In Proceedings of the 9th International Workshop on Parsing Technologies, Vancouver, BC Canada.
Harrison, P., S. Abney, D. Flickinger, C. Gdaniec, R. Grishman, D. Hindle, B. Ingria, M. Marcus, B. Santorini, and T. Strzalkowski (1991). Evaluating syntax performance of parser/grammars of English. In Proceedings of the Workshop on Evaluating Natural Language Processing Systems, ACL, Berkeley, CA.
Klein, D. and C.D. Manning (2003). Factored A* search for models over sequences and trees. In Proceedings of IJCAI 2003, Acapulco, Mexico.
Levy, R. and C. Manning (2004). Deep dependencies from context-free statistical parsers: correcting the surface dependency approximation. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, pp. 327–334.
Manning, C.D. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
McDonald, R., K. Crammer, and F. Pereira (2005). Online large-margin training of dependency parsers. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, MI.
McDonald, R., K. Lerman, and F. Pereira (2006). Multilingual dependency parsing with a two-stage discriminative parser. In Conference on Natural Language Learning, New York, NY.
McDonald, R. and F. Pereira (2006). Online learning of approximate dependency parsing algorithms. In Proceedings of the Annual Meeting of the European Association for Computational Linguistics, Trento, Italy.
McDonald, R., F. Pereira, K. Ribarov, and J. Hajič (2005, October). Non-projective dependency parsing using spanning tree algorithms. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, BC, Canada, pp. 523–530.
Mel’čuk, I. (1988). Dependency Syntax: Theory and Practice. Albany, NY: SUNY Press.
Nivre, J. (2006). Inductive Dependency Parsing, Text, Speech and Language Technology vol. 34. New York, NY: Springer.
Nivre, J., J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, and D. Yuret (2007). The CoNLL 2007 shared task on dependency parsing. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic.
Nivre, J. and J. Nilsson (2005). Pseudo-projective dependency parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, MI, pp. 99–106.
Roark, B. and M. Collins (2004). Incremental parsing with the perceptron algorithm. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona.
Sgall, P., E. Hajičová, and J. Panevová (1986). The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Boston, MA: Kluwer Academic.
Smith, N.A. and J. Eisner (2005). Contrastive estimation: Training log-linear models on unlabeled data. In Proceedings of the Association for Computational Linguistics (ACL 2005), Ann Arbor, MI.
Tarjan, R. (1977). Finding optimal branchings. Networks 7, 25–35.
Acknowledgements
This work was partially supported by U.S. NSF grants IIS–9982329 and OISE–0530118, by the Czech Ministry of Education grant LC536 and Czech Academy of Sciences grant 1ET201120505.
© 2010 Springer Science+Business Media B.V.
Cite this chapter
Hall, K., Novák, V. (2010). Corrective Dependency Parsing. In: Bunt, H., Merlo, P., Nivre, J. (eds) Trends in Parsing Technology. Text, Speech and Language Technology, vol 43. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9352-3_9
Print ISBN: 978-90-481-9351-6
Online ISBN: 978-90-481-9352-3