
A New Approach to Train LSTMs, GRUs, and Other Similar Networks for Data Generation

  • Conference paper
Next Generation Computing Technologies on Computational Intelligence (NGCT 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 922)


Abstract

Deep Learning has played a major role in the field of Artificial Intelligence by solving some of its toughest problems. In this paper, a training approach is proposed for deep learning models based on LSTMs, GRUs, and other similar variants. The effectiveness of this approach is evaluated on two models: the first based on LSTMs and the second based on GRUs. To keep the comparison fair, a few parameters were held constant and different tests were carried out by varying the dropout. It was found that both the LSTM and GRU models trained with the proposed approach reduced their training loss quickly, without underfitting or overfitting the data, and converged much faster than with the traditional approach. A comparative study on a text generation task shows the difference in quality between the data generated by the proposed model and that generated by the traditional model.
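The abstract does not spell out the full training procedure, but the comparison setup it describes can be sketched: two text-generation models that are identical except for the recurrent cell, with dropout as the only varied hyperparameter. The sketch below is a minimal illustration assuming Keras; the layer sizes, sequence length, vocabulary size, and dropout values are assumptions for illustration, not taken from the paper.

from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN = 40   # length of each input character window (assumed)
VOCAB = 60     # number of distinct characters in the corpus (assumed)
HIDDEN = 128   # recurrent units, shared by both models for a fair comparison

def build_model(cell, dropout_rate):
    # `cell` is layers.LSTM or layers.GRU; everything else is identical.
    return keras.Sequential([
        layers.Input(shape=(SEQ_LEN, VOCAB)),       # one-hot encoded characters
        cell(HIDDEN),                               # the only differing component
        layers.Dropout(dropout_rate),               # the parameter being varied
        layers.Dense(VOCAB, activation="softmax"),  # next-character distribution
    ])

for cell in (layers.LSTM, layers.GRU):
    for dropout_rate in (0.1, 0.2, 0.3):            # hypothetical sweep values
        model = build_model(cell, dropout_rate)
        model.compile(optimizer="adam", loss="categorical_crossentropy")
        # model.fit(x_train, y_train, ...) would follow; training-loss curves
        # are then compared across cell types and dropout settings.

Holding everything except the cell type and the dropout rate fixed is what makes the resulting loss curves directly comparable, mirroring the controlled comparison described above.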



Author information


Correspondence to Ravin Kumar.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Kumar, R. (2019). A New Approach to Train LSTMs, GRUs, and Other Similar Networks for Data Generation. In: Prateek, M., Sharma, D., Tiwari, R., Sharma, R., Kumar, K., Kumar, N. (eds) Next Generation Computing Technologies on Computational Intelligence. NGCT 2018. Communications in Computer and Information Science, vol 922. Springer, Singapore. https://doi.org/10.1007/978-981-15-1718-1_14


  • DOI: https://doi.org/10.1007/978-981-15-1718-1_14


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1717-4

  • Online ISBN: 978-981-15-1718-1

  • eBook Packages: Computer Science, Computer Science (R0)
