Abstract
This paper describes a mixed model for joint POS tagging and chunking of Kazakh, in which partial optimal solutions provide feature information for the joint model. An improved beam-search algorithm uses a dynamic beam instead of a uniform beam to obtain a small but high-quality search space during both the training and decoding phases of the model. Moreover, chunk information can be statistically induced to disambiguate multi-category words; experiments show that using chunk information improves precision from 81.6% to 87.7%.
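The abstract does not say how the dynamic beam is computed, so the following is only a minimal sketch of one common reading: at each token, keep the hypotheses whose score lies within a margin of the current best, capped at a maximum width. The function `dynamic_beam_search`, the parameters `max_width` and `margin`, the joint `POS|chunk` label format, the placeholder tokens, and the hand-set weights in `toy_score` are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (accumulated score, joint labels assigned so far).
Hypothesis = Tuple[float, Tuple[str, ...]]


def dynamic_beam_search(
    tokens: Sequence[str],
    labels: Sequence[str],
    score: Callable[[Sequence[str], int, Tuple[str, ...], str], float],
    max_width: int = 8,
    margin: float = 2.0,
) -> Tuple[str, ...]:
    """Left-to-right joint tagging with a beam whose size varies per step.

    Assumed pruning rule (not from the paper): keep at most `max_width`
    hypotheses, and only those scoring within `margin` of the best one.
    `labels` is assumed to be non-empty.
    """
    beam: List[Hypothesis] = [(0.0, ())]
    for i in range(len(tokens)):
        # Extend every surviving hypothesis with every joint label.
        candidates = [
            (s + score(tokens, i, hist, lab), hist + (lab,))
            for s, hist in beam
            for lab in labels
        ]
        candidates.sort(key=lambda h: h[0], reverse=True)
        best = candidates[0][0]
        # Dynamic pruning: the beam shrinks when one hypothesis clearly leads.
        beam = [h for h in candidates[:max_width] if best - h[0] <= margin]
    return beam[0][1]


# Hypothetical joint labels of the form "POS|chunk" and toy weights.
LABELS = ["N|B-NP", "N|I-NP", "V|B-VP"]
EMIT = {
    ("w1", "N|B-NP"): 2.0,
    ("w2", "N|I-NP"): 1.0,
    ("w2", "V|B-VP"): 0.5,
    ("w3", "V|B-VP"): 2.0,
}


def toy_score(tokens, i, history, label):
    # Lookup weight for (token, label); unseen pairs get a small penalty.
    s = EMIT.get((tokens[i], label), -1.0)
    # Penalise an I-NP label that does not continue a noun-phrase chunk.
    if label.endswith("I-NP") and (not history or not history[-1].endswith("-NP")):
        s -= 5.0
    return s


result = dynamic_beam_search(["w1", "w2", "w3"], LABELS, toy_score)
print(result)
```

With these toy weights the beam holds a single hypothesis after the first token and three after the second, which is the behaviour a margin-based beam is meant to show: the search space narrows where the model is confident and widens where it is not.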
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wu, H., Altenbek, G. (2016). Improved Joint Kazakh POS Tagging and Chunking. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_10
DOI: https://doi.org/10.1007/978-3-319-47674-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science, Computer Science (R0)