Abstract
The rapid growth of the Internet has produced an equally rapid growth in textual data, and people mine this data for the information they need to solve problems. Text can carry latent information such as public opinion, opinions about products, or market-relevant signals, but a basic problem must be solved first: how to extract features from text. A model that extracts text features with a neural network is called a neural network language model (NNLM). Its features build on the n-gram concept, namely the co-occurrence relationships between words. Word vectors are fundamental because sentence vectors and document vectors must still capture the relationships between words; accordingly, this study focuses on word vectors. The study assumes that each word carries both a "meaning within the sentence" and a "grammatical position," and it builds a language model from a recurrent neural network with an attention mechanism. Evaluated on the Penn Treebank, WikiText-2, and NLPCC2017 text datasets, the proposed models achieve better perplexity.
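The article page does not include an implementation, but the general idea described above, an RNN language model whose prediction at each step attends over earlier hidden states and is evaluated by perplexity, can be sketched. The PyTorch module below is a minimal illustration only: the class name AttentiveRNNLM, the additive attention layer, and all dimensions are assumptions for the sketch, not the authors' architecture.

```python
# Minimal sketch (not the authors' code): an LSTM language model whose output
# at each step attends over all previous hidden states, evaluated by
# perplexity = exp(mean next-token cross-entropy). Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveRNNLM(nn.Module):  # hypothetical name for this sketch
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, hidden_dim)   # assumed attention form
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, tokens):                          # tokens: (batch, seq)
        h, _ = self.lstm(self.embed(tokens))            # h: (batch, seq, hidden)
        # Score every query step t against every key step s.
        scores = torch.matmul(torch.tanh(self.attn(h)), h.transpose(1, 2))
        # Causal mask: step t may only attend to steps s <= t.
        seq = tokens.size(1)
        mask = torch.tril(torch.ones(seq, seq, dtype=torch.bool,
                                     device=tokens.device))
        scores = scores.masked_fill(~mask, float('-inf'))
        context = torch.matmul(F.softmax(scores, dim=-1), h)
        # Combine the recurrent state with the attended context per step.
        return self.out(torch.cat([h, context], dim=-1))  # (batch, seq, vocab)

def perplexity(model, tokens):
    """Perplexity = exp(average cross-entropy of predicting the next token)."""
    with torch.no_grad():
        logits = model(tokens[:, :-1])                  # predict token t+1
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               tokens[:, 1:].reshape(-1))
    return loss.exp().item()
```

Usage would look like model = AttentiveRNNLM(vocab_size=10000) followed by perplexity(model, batch) on a LongTensor of token ids; lower perplexity means the model assigns higher probability to the held-out text.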
Abbreviations
- Long short-term memory
- Neural network language model
- Pointer sentinel mixture model
- Recurrent highway network
- Recurrent neural network
- Recurrent neural network language model
- Variational recurrent highway network
- Variational recurrent neural network
Cite this article
Chen, M., Chiang, H., Sangaiah, A.K. et al. Recurrent neural network with attention mechanism for language model. Neural Comput & Applic 32, 7915–7923 (2020). https://doi.org/10.1007/s00521-019-04301-x
Keywords
- Language model
- Recurrent neural network
- Artificial intelligence
- Attention mechanism