Improved Joint Kazakh POS Tagging and Chunking

Wu, Hao; Altenbek, Gulila

doi:10.1007/978-3-319-47674-2_10

Conference paper
First Online: 10 October 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10035)

Abstract

This paper describes a mixed model of joint POS tagging and chunking for Kazakh, in which partial optimal solutions provide feature information for the joint model. An improved beam-search algorithm uses a dynamic beam instead of a uniform beam to obtain a small but high-quality search space during both the training and decoding phases of the model. Moreover, chunk information can be induced statistically to disambiguate multi-category words, and experiments show that chunk information improves precision from 81.6 % to 87.7 %.
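The abstract does not say how the dynamic beam is chosen, so the following Python sketch is only an illustration under stated assumptions. It assumes the beam keeps every hypothesis whose score lies within a margin of the best one, up to a maximum width. The names `prune_dynamic`, `beam_search`, `toy_score`, `max_width`, and `margin`, the joint labels of the form `POS|chunk`, and the hand-set weights are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (cumulative score, joint labels assigned so far).
Hypothesis = Tuple[float, Tuple[str, ...]]
ScoreFn = Callable[[Sequence[str], int, Tuple[str, ...], str], float]


def prune_dynamic(cands: List[Hypothesis], max_width: int, margin: float) -> List[Hypothesis]:
    """Keep hypotheses within `margin` of the best score, at most `max_width`.

    Assumption: this margin rule stands in for the paper's dynamic beam;
    a uniform beam would simply return the top `max_width` hypotheses.
    """
    if not cands:
        return []
    ranked = sorted(cands, key=lambda h: h[0], reverse=True)
    best = ranked[0][0]
    return [h for h in ranked[:max_width] if best - h[0] <= margin]


def beam_search(words: Sequence[str], labels: Sequence[str], score: ScoreFn,
                max_width: int = 8, margin: float = 2.0) -> List[Hypothesis]:
    """Label `words` left to right, pruning the beam after every word."""
    beam: List[Hypothesis] = [(0.0, ())]
    for i in range(len(words)):
        cands = [
            (total + score(words, i, history, label), history + (label,))
            for total, history in beam
            for label in labels
        ]
        beam = prune_dynamic(cands, max_width, margin)
    return beam


# Hypothetical joint labels: a POS tag and a chunk tag predicted together.
LABELS = ["N|B-NP", "N|I-NP", "V|B-VP"]


def toy_score(words: Sequence[str], i: int, history: Tuple[str, ...], label: str) -> float:
    """Hand-set weights for illustration only; a real model would learn them."""
    weights = {("w1", "N|B-NP"): 2.0, ("w2", "N|I-NP"): 1.5, ("w2", "V|B-VP"): 1.0}
    return weights.get((words[i], label), -1.0)


if __name__ == "__main__":
    for total, labels in beam_search(["w1", "w2"], LABELS, toy_score, max_width=4, margin=1.0):
        print(total, labels)
```

With a margin rule of this kind, the beam shrinks when one hypothesis clearly dominates and widens when several are close, which is one plausible reading of a "small but high-quality" search space.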

Author information

Authors and Affiliations

College of Information Science and Engineering, Xinjiang University, Urumqi, China
Hao Wu & Gulila Altenbek
The Base of Kazakh and Kirghiz Language of National Language Resource Monitoring and Research Centre Minority Languages, Urumqi, China
Hao Wu & Gulila Altenbek

Corresponding authors

Correspondence to Hao Wu or Gulila Altenbek.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Maosong Sun
Fudan University, Shanghai, China
Xuanjing Huang
Dalian University of Technology, Dalian, China
Hongfei Lin
Tsinghua University, Beijing, China
Zhiyuan Liu
Tsinghua University, Beijing, China
Yang Liu

About this paper

Cite this paper

Wu, H., Altenbek, G. (2016). Improved Joint Kazakh POS Tagging and Chunking. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science (LNAI), vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_10

DOI: https://doi.org/10.1007/978-3-319-47674-2_10
Published: 10 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science; Computer Science (R0)