Abstract
Deep learning has played a major role in artificial intelligence, solving some of its toughest problems. In this paper, a training approach for deep learning models based on LSTMs, GRUs, and similar variants is proposed. The effectiveness of this approach is evaluated on two models: the first based on LSTMs and the second on GRUs. To keep the comparison fair, a few parameters were held constant and different tests were carried out by varying the dropout rate. Both the LSTM and the GRU model trained with the proposed approach quickly reduced their training loss without underfitting or overfitting the data, and converged much faster than with the traditional approach. A comparative study on a text generation task shows the differences in the quality of data generated by the proposed model relative to the traditional model.
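The abstract does not specify the exact architectures or dropout placement used in the experiments, so the following is only an illustrative sketch of the kind of recurrent unit being compared: a minimal NumPy GRU cell with inverted dropout applied to the recurrent state. All names, sizes, and the dropout rate here are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with update (z) and reset (r) gates."""

    def __init__(self, input_size, hidden_size):
        s = 1.0 / np.sqrt(hidden_size)
        # Three gate weight matrices for the input (W) and recurrent state (U)
        self.W = rng.uniform(-s, s, (3, hidden_size, input_size))
        self.U = rng.uniform(-s, s, (3, hidden_size, hidden_size))

    def step(self, x, h, dropout=0.0, train=True):
        # Inverted dropout on the recurrent state: scale the survivors
        # by 1/(1-p) at train time so no rescaling is needed at test time.
        if train and dropout > 0.0:
            mask = (rng.random(h.shape) >= dropout) / (1.0 - dropout)
            h = h * mask
        z = sigmoid(self.W[0] @ x + self.U[0] @ h)        # update gate
        r = sigmoid(self.W[1] @ x + self.U[1] @ h)        # reset gate
        h_tilde = np.tanh(self.W[2] @ x + self.U[2] @ (r * h))  # candidate
        return (1.0 - z) * h + z * h_tilde                # blended new state

# Run a short sequence through the cell with a 0.2 dropout rate.
cell = GRUCell(input_size=8, hidden_size=16)
h = np.zeros(16)
for t in range(5):
    h = cell.step(rng.standard_normal(8), h, dropout=0.2)
print(h.shape)  # (16,)
```

An LSTM-based counterpart would differ only in the cell equations (separate input, forget, and output gates plus a cell state); holding the hidden size, sequence length, and dropout rate fixed across both cells mirrors the paper's stated strategy of keeping a few parameters constant while varying dropout.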
© 2019 Springer Nature Singapore Pte Ltd.
Kumar, R. (2019). A New Approach to Train LSTMs, GRUs, and Other Similar Networks for Data Generation. In: Prateek, M., Sharma, D., Tiwari, R., Sharma, R., Kumar, K., Kumar, N. (eds) Next Generation Computing Technologies on Computational Intelligence. NGCT 2018. Communications in Computer and Information Science, vol 922. Springer, Singapore. https://doi.org/10.1007/978-981-15-1718-1_14
DOI: https://doi.org/10.1007/978-981-15-1718-1_14
Print ISBN: 978-981-15-1717-4
Online ISBN: 978-981-15-1718-1