Abstract
In statistical language modelling, the classic model is the n-gram. This model cannot, however, capture long-term dependencies, i.e. dependencies spanning more than n words. An alternative is the probabilistic automaton. Unfortunately, preliminary experiments show that this model is not yet competitive for language modelling, partly because it tries to model dependencies that are too long. We propose to improve the use of this model by restricting the dependency span to a more reasonable value. Experiments show a 45% reduction in perplexity on the Wall Street Journal language modeling task.
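To make the evaluation metric concrete, the sketch below trains a minimal bigram (n = 2) model with add-one smoothing and computes its perplexity, the quantity the abstract reports a 45% reduction in. This is an illustrative toy, not the authors' model: the corpus, the smoothing choice, and the function names are assumptions for the example.

```python
import math
from collections import Counter

def train_bigram(tokens, vocab):
    """Return an add-one (Laplace) smoothed bigram probability function."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    V = len(vocab)
    def prob(w_prev, w):
        # Laplace smoothing gives unseen bigrams non-zero probability
        return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + V)
    return prob

def perplexity(prob, tokens):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum = sum(math.log(prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_sum / (len(tokens) - 1))

# Hypothetical toy corpus; real evaluations use held-out WSJ text.
corpus = "the cat sat on the mat the cat ate".split()
prob = train_bigram(corpus, set(corpus))
pp = perplexity(prob, corpus)
print(pp)  # lower perplexity means the model predicts the text better
```

Note that the bigram model conditions only on the immediately preceding word; the paper's point is that probabilistic automata lift this fixed-n restriction, but benefit from bounding the dependency span rather than leaving it unlimited.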
This work was supported by the BINGO2 project (ANR-07-MDCO 014-02).
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Zdziobeck, A., Thollard, F. (2008). Position Models and Language Modeling. In: da Vitoria Lobo, N., et al. Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2008. Lecture Notes in Computer Science, vol 5342. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89689-0_12
Print ISBN: 978-3-540-89688-3
Online ISBN: 978-3-540-89689-0