Abstract
Neural networks for sequence labeling tasks have been studied extensively in recent years. The architectures proposed for these tasks range from the feed-forward neural network to the long short-term memory (LSTM) network with a CRF layer. This paper summarizes the existing neural architectures, develops the four most representative networks for part-of-speech tagging, and applies them to several typologically different languages. Experimental results show that the LSTM-type networks outperform the feed-forward network in most cases, and that the character-level networks can learn lexical features from the characters within words, which yields better results than the models without character-level input.
Notes
- 1. For all of the LSTM-type models, we did not use a CRF layer, since the LSTM can capture sentence-level information.
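As a rough illustration of what tagging without a CRF layer can look like, the sketch below assumes the network emits one score per tag for every token and that each token's tag is then chosen independently by taking the highest score. The tag set, the scores, and the function name `greedy_decode` are hypothetical and are not taken from the paper; a CRF layer would instead add tag-to-tag transition scores and decode the whole sequence jointly.

```python
# Hypothetical tag set, for illustration only (not the paper's tag inventory).
TAGS = ["DET", "NOUN", "VERB"]


def greedy_decode(scores, tags=TAGS):
    """Pick the highest-scoring tag for each token independently.

    scores: one list of per-tag scores for each token in the sentence.
    No tag-to-tag transition scores are used, unlike CRF decoding.
    """
    decoded = []
    for token_scores in scores:
        # Index of the largest score for this token.
        best = max(range(len(tags)), key=lambda i: token_scores[i])
        decoded.append(tags[best])
    return decoded


# Made-up scores for a three-token sentence such as "the dog barks".
example_scores = [
    [2.1, 0.3, -1.0],
    [0.2, 1.7, 0.9],
    [-0.5, 0.4, 1.2],
]
predicted = greedy_decode(example_scores)
```

In this setup, any sentence-level context has to come from the scores themselves, that is, from the recurrent layers that produced them, which is the assumption the note relies on.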
Acknowledgments
This research was conducted within the framework of grant no. BR05236839, “Development of information technologies and systems for stimulation of personality’s sustainable development as one of the bases of development of digital Kazakhstan”.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Toleu, A., Tolegen, G., Mussabayev, R. (2020). Deep Learning for Multilingual POS Tagging. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_2
DOI: https://doi.org/10.1007/978-3-030-63119-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63118-5
Online ISBN: 978-3-030-63119-2
eBook Packages: Computer Science, Computer Science (R0)