Abstract
Named Entity Recognition is one of the most popular tasks in natural language processing. Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for natural language processing tasks. In most cases, however, the recurrent network that operates on word-level representations to produce context-sensitive representations is trained on relatively little labeled data. Processing Russian poses additional difficulties. In this paper, we present a semi-supervised approach for adding deep contextualized word representations that model both complex characteristics of word usage (e.g., syntax and semantics) and how these usages vary across linguistic contexts (i.e., polysemy). Here, word vectors are learned functions of the internal states of a deep bidirectional language model that is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and combined with other word representation features. We evaluate our model on the FactRuEval-2016 dataset for named entity recognition in Russian and achieve state-of-the-art results.
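The core idea in the abstract (contextual word vectors as learned functions of the internal states of a pre-trained bidirectional language model, combined with existing word-representation features) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; the layer count, dimensions, and function names (`elmo_style_mix`) are assumptions for the example, and random arrays stand in for real biLM states and static embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 3-layer biLM over a 5-token sentence,
# each layer yielding a 4-dimensional state per token.
n_layers, n_tokens, dim = 3, 5, 4
layer_states = [rng.standard_normal((n_tokens, dim)) for _ in range(n_layers)]

def elmo_style_mix(layer_states, scalars, gamma=1.0):
    """Task-specific weighted sum of biLM layer states.

    The per-layer scalars are softmax-normalized so the result is a
    convex combination of the layers, then scaled by gamma; both are
    learned jointly with the downstream (e.g., NER) model.
    """
    w = np.exp(scalars - np.max(scalars))
    w = w / w.sum()
    return gamma * sum(wi * h for wi, h in zip(w, layer_states))

scalars = np.zeros(n_layers)                 # before training: uniform weights
contextual = elmo_style_mix(layer_states, scalars)

# Combining with pre-trained static embeddings by concatenation, as the
# contextual vectors are meant to augment existing representation features.
static = rng.standard_normal((n_tokens, 8))  # e.g., word2vec/GloVe vectors
tagger_input = np.concatenate([static, contextual], axis=1)
print(tagger_input.shape)                    # (5, 12)
```

With zero-initialized scalars the mix is just the mean of the layer states; training lets the tagger learn which biLM layers (more syntactic lower layers vs. more semantic higher layers) matter for the task.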
Acknowledgements
The authors would like to thank Ivan Smetannikov for helpful conversations and the anonymous AINL reviewers for their useful comments.
The research was supported by the Government of the Russian Federation (Grant 08-08).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Konoplich, G., Putin, E., Filchenkov, A., Rybka, R. (2018). Named Entity Recognition in Russian with Word Representation Learned by a Bidirectional Language Model. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01203-8
Online ISBN: 978-3-030-01204-5
eBook Packages: Computer Science (R0)