Deep Learning for Multilingual POS Tagging

  • Conference paper
Advances in Computational Collective Intelligence (ICCCI 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1287)

Abstract

Various neural networks for sequence labeling tasks have been studied extensively in recent years. The main research focus has ranged from feed-forward neural networks to long short-term memory (LSTM) networks with a CRF layer. This paper summarizes the existing neural architectures, develops the four most representative neural networks for part-of-speech tagging, and applies them to several typologically different languages. Experimental results show that the LSTM-type networks outperform the feed-forward network in most cases, and that the character-level networks can learn lexical features from the characters within words, which lets these models achieve better results than those without character-level input.
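As a rough illustration of why character-level input can help, the sketch below compares a word-level lookup with a character-based view of the same words. It is not the paper's model: the paper's networks learn character features, whereas this example uses fixed character trigrams as a stand-in. The function names, the toy vocabulary, and the example words are all assumptions made for illustration.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers."""
    padded = "<" + word + ">"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def word_level_id(word, vocab):
    """Word-level lookup: every unseen word collapses to the same <unk> id."""
    return vocab.get(word, vocab["<unk>"])


# Toy vocabulary (hypothetical): "jumping" was never seen in training.
vocab = {"<unk>": 0, "walking": 1, "talked": 2}

unseen_id = word_level_id("jumping", vocab)

# The character view still exposes the shared "-ing" ending.
shared = char_ngrams("walking") & char_ngrams("jumping")
```

Here the word-level lookup gives the unseen word no information beyond "unknown", while the character view shows that it shares its ending with a known word, which is the kind of lexical cue a character-level network can pick up.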

Notes

  1. For all LSTM-type models, we did not use a CRF layer, since the LSTM can capture sentence-level information.
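One common way to decode without a CRF layer is to choose the highest-scoring tag for each token independently. The sketch below assumes that setup; the note does not state how the paper decodes, and the function name, tag set, and scores are made up for illustration.

```python
def greedy_decode(scores, tags):
    """Pick the highest-scoring tag for each token independently.

    scores: one list of per-tag scores for each token in the sentence.
    tags:   tag names, in the same order as the score columns.
    """
    decoded = []
    for token_scores in scores:
        best_index = max(range(len(tags)), key=lambda i: token_scores[i])
        decoded.append(tags[best_index])
    return decoded


# Hypothetical per-token scores for a three-token sentence.
tags = ["DET", "NOUN", "VERB"]
scores = [
    [2.0, 0.1, 0.3],
    [0.2, 1.5, 0.4],
    [0.1, 0.3, 2.2],
]
predicted = greedy_decode(scores, tags)
```

Unlike CRF decoding, this step applies no transition scores between neighbouring tags, so any sentence-level context has to come from the network that produced the scores.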

Acknowledgments

This research has been conducted within the framework of the grant num. BR05236839 “Development of information technologies and systems for stimulation of personality’s sustainable development as one of the bases of development of digital Kazakhstan”.

Author information

Corresponding author

Correspondence to Alymzhan Toleu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Toleu, A., Tolegen, G., Mussabayev, R. (2020). Deep Learning for Multilingual POS Tagging. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_2

  • DOI: https://doi.org/10.1007/978-3-030-63119-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63118-5

  • Online ISBN: 978-3-030-63119-2

  • eBook Packages: Computer Science (R0)
