
Continuous Distributed Representations of Words as Input of LSTM Network Language Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8655)

Abstract

The continuous skip-gram model is an efficient algorithm for learning high-quality distributed vector representations that capture a large number of syntactic and semantic word relationships. Artificial neural networks have become the state of the art in language modelling, and Long Short-Term Memory (LSTM) networks have proved to be an efficient architecture for this task.
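
As an illustration of the skip-gram component (not the authors' exact setup), such word vectors can be trained with the gensim library; the toy corpus, vector dimensionality and window size below are placeholder values.

```python
# Minimal sketch: training continuous skip-gram word vectors with gensim.
# The corpus, vector size and window are illustrative values, not the
# configuration used in the paper.
from gensim.models import Word2Vec

# Toy corpus: one tokenised sentence per list entry.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
]

model = Word2Vec(
    sentences=sentences,
    vector_size=100,   # dimensionality of the word vectors
    window=5,          # context window size
    min_count=1,       # keep every word in this toy example
    sg=1,              # sg=1 selects the skip-gram architecture
)

# Continuous distributed representation of a word.
vector = model.wv["cat"]
print(vector.shape)    # (100,)
```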

In this paper, we carry out experiments with a combination of these two powerful models: continuous distributed representations of words are trained with the skip-gram method on a large corpus and are then used as the input of an LSTM language model in place of the traditional 1-of-N coding. The potential of this approach is demonstrated in perplexity experiments on the Wikipedia and Penn Treebank corpora.
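
The general idea can be sketched as follows (a minimal PyTorch illustration, not the authors' implementation): the input layer is initialised from pretrained skip-gram vectors, so the LSTM consumes dense word vectors rather than 1-of-N codes. The layer sizes and the choice to keep the embeddings fixed are assumptions for the sketch.

```python
# Minimal sketch (assumed layer sizes, not the paper's configuration):
# an LSTM language model whose input layer is initialised from pretrained
# skip-gram vectors, so the network sees dense word vectors instead of
# 1-of-N (one-hot) codes.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, pretrained_vectors: torch.Tensor, hidden_size: int = 256):
        super().__init__()
        vocab_size, embed_dim = pretrained_vectors.shape
        # Input layer initialised with skip-gram vectors; freeze=True keeps
        # the pretrained representations fixed while the LM is trained.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, vocab_size)

    def forward(self, word_ids: torch.Tensor) -> torch.Tensor:
        # word_ids: (batch, sequence_length) word indices
        vectors = self.embedding(word_ids)      # dense skip-gram inputs
        hidden_states, _ = self.lstm(vectors)   # (batch, seq_len, hidden_size)
        return self.output(hidden_states)       # logits over the vocabulary

# Usage with random stand-in vectors: 10,000-word vocabulary, 100-dim embeddings.
pretrained = torch.randn(10_000, 100)
model = LSTMLanguageModel(pretrained)
logits = model(torch.randint(0, 10_000, (2, 20)))
print(logits.shape)  # torch.Size([2, 20, 10000])
```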





Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Soutner, D., Müller, L. (2014). Continuous Distributed Representations of Words as Input of LSTM Network Language Model. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_19


  • DOI: https://doi.org/10.1007/978-3-319-10816-2_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10815-5

  • Online ISBN: 978-3-319-10816-2

