Abstract
Significant correlations between words can be observed over long distances, but contemporary language models such as n-grams, skip models, and recurrent neural network language models (RNNLMs) require a large number of parameters to capture these dependencies, if they can capture them at all. In this paper, we propose the Custom Decay Language Model (CDLM), which captures long-range correlations while keeping the growth in parameters sub-linear in vocabulary size. The model has a robust and stable training procedure (unlike RNNLMs), a more powerful modeling scheme than skip models, and a customizable representation. In perplexity experiments, CDLMs outperform skip models while using fewer parameters. A CDLM also nominally outperformed a similar-sized RNNLM, indicating that it learned as much as the RNNLM without recurrence.
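The paper's exact formulation is not given in the abstract, so the following is only a minimal sketch of the general idea of combining skip models at several distances with a decaying weight per distance. It uses plain linear interpolation with a fixed geometric decay and add-one smoothing; the actual CDLM's interpolation scheme, decay parameterization, and smoothing are assumptions not taken from this text.

```python
from collections import Counter, defaultdict

def train_skip_bigrams(corpus, max_dist):
    """Count skip-bigram pairs (w_{t-k}, w_t) for each distance k = 1..max_dist."""
    pair = {k: defaultdict(Counter) for k in range(1, max_dist + 1)}
    ctx = {k: Counter() for k in range(1, max_dist + 1)}
    for t in range(len(corpus)):
        for k in range(1, max_dist + 1):
            if t - k < 0:
                break
            h, w = corpus[t - k], corpus[t]
            pair[k][h][w] += 1  # joint count of (context-at-distance-k, word)
            ctx[k][h] += 1      # marginal count of the context word
    return pair, ctx

def interp_prob(w, history, pair, ctx, vocab, decay=0.5):
    """P(w | history) as a weighted mix of per-distance skip-bigram estimates.

    The weight for distance k decays geometrically (decay ** (k - 1)) and the
    weights are renormalized; each component uses add-one smoothing over vocab.
    """
    max_dist = len(pair)
    ks = [k for k in range(1, max_dist + 1) if k <= len(history)]
    weights = {k: decay ** (k - 1) for k in ks}
    z = sum(weights.values())
    p = 0.0
    for k in ks:
        h = history[-k]  # the word k positions back
        p_k = (pair[k][h][w] + 1) / (ctx[k][h] + len(vocab))
        p += (weights[k] / z) * p_k
    return p
```

Because each smoothed component is a proper distribution over the vocabulary and the decay weights are normalized, the mixture also sums to one; only the decay parameter(s) and the shared smoothing need tuning, so the parameter count stays small relative to the vocabulary.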
D. Klakow—The work was supported by the Cluster of Excellence for Multimodal Computing and Interaction, the German Research Foundation (DFG) as part of SFB 1102 and the EU FP7 Metalogue project (grant agreement number: 611073).
© 2016 Springer International Publishing Switzerland
Cite this paper
Singh, M., Greenberg, C., Klakow, D. (2016). The Custom Decay Language Model for Long Range Dependencies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science(), vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_39
Print ISBN: 978-3-319-45509-9
Online ISBN: 978-3-319-45510-5