Improved Joint Kazakh POS Tagging and Chunking

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10035)

Abstract

This paper describes a mixed model for joint POS tagging and chunking of Kazakh, in which partial optimal solutions provide feature information for the joint model. An improved beam-search algorithm uses a dynamic beam instead of a uniform beam to obtain a small but high-quality search space during both the training and decoding phases of the model. Moreover, chunk information gathered by statistical induction is used to disambiguate multi-category words, and experiments show that this chunk information improves precision from 81.6% to 87.7%.
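
The abstract does not say how the dynamic beam is sized. As a minimal sketch, assuming the beam is pruned by a score margin relative to the best hypothesis (one common way to vary beam width, not necessarily the authors' rule), joint decoding over (POS, chunk) label pairs might look like the following. The names dynamic_beam_search, toy_score, margin and max_width, as well as the toy lexicon and weights, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Label = Tuple[str, str]                 # (POS tag, chunk tag), e.g. ("N", "B-NP")
Hypothesis = Tuple[float, List[Label]]  # (accumulated score, labels so far)


def dynamic_beam_search(
    words: Sequence[str],
    labels: Sequence[Label],
    score: Callable[[Sequence[str], int, List[Label], Label], float],
    margin: float = 2.0,
    max_width: int = 8,
) -> List[Label]:
    """Return the best joint (POS, chunk) sequence found by the search."""
    beam: List[Hypothesis] = [(0.0, [])]
    for i in range(len(words)):
        # Extend every surviving hypothesis with every joint label.
        expanded: List[Hypothesis] = [
            (total + score(words, i, history, label), history + [label])
            for total, history in beam
            for label in labels
        ]
        expanded.sort(key=lambda h: h[0], reverse=True)
        best = expanded[0][0]
        # Dynamic beam (assumed rule): keep hypotheses within `margin` of the
        # best score, and never more than `max_width` of them. The beam is
        # narrow when one hypothesis dominates and wider when scores are close.
        beam = [h for h in expanded[:max_width] if best - h[0] <= margin]
    return beam[0][1]


# Hypothetical toy scorer standing in for a trained model.
TOY_LEXICON = {("school", "N"), ("book", "N"), ("fell", "V")}
LABELS: List[Label] = [("N", "B-NP"), ("N", "I-NP"), ("V", "B-VP")]


def toy_score(words, i, history, label):
    pos, chunk = label
    s = 1.0 if (words[i], pos) in TOY_LEXICON else 0.0
    if chunk == "I-NP":
        # Reward continuing an open NP chunk, penalise starting with I-NP.
        inside_np = bool(history) and history[-1][1] in ("B-NP", "I-NP")
        s += 0.5 if inside_np else -1.0
    return s


result = dynamic_beam_search(["school", "book", "fell"], LABELS, toy_score)
```

Setting margin to 0.0 and max_width to 1 reduces the search to greedy decoding, while a large margin with a fixed max_width recovers an ordinary fixed-width beam.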

Author information

Corresponding authors

Correspondence to Hao Wu or Gulila Altenbek.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Wu, H., Altenbek, G. (2016). Improved Joint Kazakh POS Tagging and Chunking. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_10

  • DOI: https://doi.org/10.1007/978-3-319-47674-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47673-5

  • Online ISBN: 978-3-319-47674-2

  • eBook Packages: Computer Science, Computer Science (R0)
