Abstract
We propose a hybrid dependency parsing pipeline that combines transition-based and graph-based parsing. In the front end, transition-based parsers are trained as subparsers on segmented treebanks; their outputs are then reparsed with a constrained version of Eisner’s algorithm. We use the pipeline to investigate how different segmentations of the training data affect parsing accuracy, and to find a convenient way to obtain a parsing reliability score while achieving state-of-the-art parsing accuracy. Our results show that the pipeline trained on segmented data can improve accuracy through reparsing while also providing a parsing reliability score.
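The reparsing step can be illustrated with a minimal sketch. The abstract does not specify the constraints added to Eisner’s algorithm or how the reliability score is computed, so the code below makes two assumptions: arcs are scored by how many subparsers proposed them, and reliability is the fraction of subparsers that agree with each final arc. The decoder is the standard first-order projective Eisner algorithm, not the constrained variant, and the names `vote_scores`, `eisner`, and `arc_reliability` are hypothetical.

```python
def vote_scores(predictions):
    """Arc score = number of subparsers proposing head h for token m.

    predictions: list of head lists, one per subparser; heads[m-1] is the
    head of token m, with 0 denoting the artificial root (assumed encoding).
    """
    n = len(predictions[0])
    scores = [[0.0] * (n + 1) for _ in range(n + 1)]
    for heads in predictions:
        for m, h in enumerate(heads, start=1):
            scores[h][m] += 1.0
    return scores


def eisner(scores):
    """Standard first-order projective Eisner decoding over scores[h][m]."""
    N = len(scores)  # number of tokens including the root at index 0
    NEG = float("-inf")
    # Direction 0: head is the right end of the span; 1: head is the left end.
    C = [[[0.0, 0.0] for _ in range(N)] for _ in range(N)]   # complete spans
    I = [[[NEG, NEG] for _ in range(N)] for _ in range(N)]   # incomplete spans
    bC = [[[0, 0] for _ in range(N)] for _ in range(N)]      # backpointers
    bI = [[0] * N for _ in range(N)]
    for k in range(1, N):
        for s in range(N - k):
            t = s + k
            best, r = max((C[s][q][1] + C[q + 1][t][0], q) for q in range(s, t))
            bI[s][t] = r
            # The root (index 0) may never be a modifier.
            I[s][t][0] = best + scores[t][s] if s > 0 else NEG
            I[s][t][1] = best + scores[s][t]
            C[s][t][0], bC[s][t][0] = max(
                (C[s][q][0] + I[q][t][0], q) for q in range(s, t))
            C[s][t][1], bC[s][t][1] = max(
                (I[s][q][1] + C[q][t][1], q) for q in range(s + 1, t + 1))

    heads = [0] * N

    def back(s, t, d, complete):
        if s == t:
            return
        if complete:
            r = bC[s][t][d]
            if d == 0:
                back(s, r, 0, True)
                back(r, t, 0, False)
            else:
                back(s, r, 1, False)
                back(r, t, 1, True)
        else:
            if d == 0:
                heads[s] = t
            else:
                heads[t] = s
            r = bI[s][t]
            back(s, r, 1, True)
            back(r + 1, t, 0, True)

    back(0, N - 1, 1, True)
    return heads[1:]


def arc_reliability(heads, predictions):
    """Assumed reliability: share of subparsers agreeing with each final arc."""
    k = len(predictions)
    return [sum(p[i] == h for p in predictions) / k for i, h in enumerate(heads)]


# Three hypothetical subparser outputs for a three-token sentence.
preds = [[2, 0, 2], [2, 0, 2], [2, 0, 1]]
tree = eisner(vote_scores(preds))
print(tree, arc_reliability(tree, preds))
```

In this toy input the third subparser disagrees on the head of the last token, so the reparsed tree follows the majority and that arc receives a lower agreement ratio than the other two.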
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, F., Zhou, F. (2015). Hybrid Dependency Parser with Segmented Treebanks and Reparsing. In: Deng, Z., Li, H. (eds) Proceedings of the 2015 Chinese Intelligent Automation Conference. Lecture Notes in Electrical Engineering, vol 336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46469-4_6
DOI: https://doi.org/10.1007/978-3-662-46469-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46468-7
Online ISBN: 978-3-662-46469-4
eBook Packages: Engineering, Engineering (R0)