Deep Learning for Multilingual POS Tagging

Toleu, Alymzhan; Tolegen, Gulmira; Mussabayev, Rustam

doi:10.1007/978-3-030-63119-2_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1287)

Included in the following conference series:

International Conference on Computational Collective Intelligence

Abstract

Various neural networks for sequence labeling tasks have been studied extensively in recent years. The architectures studied for these tasks range from the feed-forward neural network to the long short-term memory (LSTM) network with a conditional random field (CRF) layer. This paper summarizes the existing neural architectures, develops four of the most representative neural networks for part-of-speech tagging, and applies them to several typologically different languages. Experimental results show that LSTM-type networks outperform the feed-forward network in most cases, and that character-level networks can learn lexical features from the characters within words, which lets these models achieve better results than their counterparts without character-level input.
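
One commonly cited reason why character-level input helps is that a word-only lookup table collapses every out-of-vocabulary word onto a single unknown-word vector, whereas a representation built from characters still distinguishes such words. The sketch below illustrates only this point. It is not the chapter's architecture: the mean of character embeddings stands in for a character-level LSTM or CNN, and `toy_embedding`, `DIM`, and the `<UNK>` symbol are hypothetical names chosen for the illustration.

```python
from typing import List, Set

DIM = 4  # toy embedding size (assumption, not taken from the chapter)


def toy_embedding(symbol: str) -> List[float]:
    """Deterministic stand-in for one row of a learned embedding table."""
    code = sum(ord(ch) for ch in symbol)
    return [((code * k) % 97) / 97.0 for k in (3, 5, 7, 11)]


def word_vector(word: str, vocab: Set[str]) -> List[float]:
    """Word-level lookup: every unknown word shares the <UNK> vector."""
    key = word if word in vocab else "<UNK>"
    return toy_embedding(key)


def char_vector(word: str) -> List[float]:
    """Mean of character embeddings (simplified stand-in for a char LSTM/CNN)."""
    if not word:
        return [0.0] * DIM
    vectors = [toy_embedding(ch) for ch in word]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def represent(word: str, vocab: Set[str], use_chars: bool) -> List[float]:
    """Word vector, optionally concatenated with the character-based vector."""
    vec = word_vector(word, vocab)
    return vec + char_vector(word) if use_chars else vec


vocab = {"the", "cat"}
# Both words are out of vocabulary, so the word-only inputs are identical.
same_without_chars = (
    represent("tagging", vocab, use_chars=False)
    == represent("parsing", vocab, use_chars=False)
)
# With the character component, the two inputs differ.
same_with_chars = (
    represent("tagging", vocab, use_chars=True)
    == represent("parsing", vocab, use_chars=True)
)
print(same_without_chars, same_with_chars)
```

In a trained tagger the character component would be learned jointly with the rest of the network; the fixed arithmetic here only keeps the example self-contained.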

Notes

1. For all of the LSTM-type models, we did not use a CRF layer, since an LSTM can capture sentence-level information.
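
To make the distinction in this note concrete, the sketch below contrasts the two decoding strategies in general terms. Without a CRF layer, each token's tag can be chosen independently from the per-token scores; with a CRF-style layer, tag-to-tag transition scores are added and the best sequence is found with the Viterbi algorithm. The page does not describe the authors' decoding procedure, so the function names, the three-tag inventory, and all scores below are assumptions made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def greedy_decode(emissions: Matrix) -> List[int]:
    """No CRF layer: pick the highest-scoring tag independently per token."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]


def viterbi_decode(emissions: Matrix, transitions: Matrix) -> List[int]:
    """CRF-style decoding: best sequence under emission + transition scores.

    transitions[p][c] is the score of moving from tag p to tag c.
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for row in emissions[1:]:
        new_score, pointers = [], []
        for cur in range(n_tags):
            best_prev = max(
                range(n_tags), key=lambda p: score[p] + transitions[p][cur]
            )
            new_score.append(score[best_prev] + transitions[best_prev][cur] + row[cur])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    best = max(range(n_tags), key=score.__getitem__)
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]


TAGS = ["DET", "NOUN", "VERB"]  # hypothetical tag inventory
emissions = [
    [2.0, 0.5, 0.1],  # token 1: clearly DET
    [0.1, 1.0, 1.2],  # token 2: VERB narrowly ahead of NOUN
    [0.1, 1.5, 0.3],  # token 3: NOUN
]
transitions = [
    [0.0, 1.0, -1.0],  # after DET: reward NOUN, penalize VERB
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

print([TAGS[i] for i in greedy_decode(emissions)])
print([TAGS[i] for i in viterbi_decode(emissions, transitions)])
```

With these made-up scores the transition term changes the tag chosen for the second token. The note's argument is that an LSTM already conditions each token's scores on the rest of the sentence, which the authors take as sufficient reason to omit the CRF layer.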

Acknowledgments

This research was conducted within the framework of grant no. BR05236839, “Development of information technologies and systems for stimulation of personality’s sustainable development as one of the bases of development of digital Kazakhstan”.

Author information

Authors and Affiliations

Institute of Information and Computational Technologies, Almaty, Kazakhstan
Alymzhan Toleu, Gulmira Tolegen & Rustam Mussabayev

Corresponding author

Correspondence to Alymzhan Toleu.

Editor information

Editors and Affiliations

Wroclaw University of Economics and Business, Wrocław, Poland
Marcin Hernes
Wrocław University of Science and Technology, Wrocław, Poland
Krystian Wojtkiewicz
University of Newcastle, Newcastle, Australia
Edward Szczerbicki

About this paper

Cite this paper

Toleu, A., Tolegen, G., Mussabayev, R. (2020). Deep Learning for Multilingual POS Tagging. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_2

DOI: https://doi.org/10.1007/978-3-030-63119-2_2
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63118-5
Online ISBN: 978-3-030-63119-2
eBook Packages: Computer Science, Computer Science (R0)
