Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models

Lafferty, John D.; Suhm, Bernhard

doi:10.1007/978-94-011-5430-7_23

Part of the book series: Fundamental Theories of Physics (FTPH, volume 79)

Abstract

The maximum entropy method has recently been successfully introduced to a variety of natural language applications. In each of these applications, however, the power of the maximum entropy method is achieved at the cost of a considerable increase in computational requirements. In this paper we present a technique, closely related to the classical cluster expansion from statistical mechanics, for reducing the computational demands necessary to calculate conditional maximum entropy language models.

Research supported in part by NSF and ARPA under grant IRI-9314969 and the ATR Interpreting Telecommunications Research Laboratories.

Author information

Authors and Affiliations

School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15217, USA
John D. Lafferty & Bernhard Suhm

Editor information

Editors and Affiliations

Dynamic Experimentation Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Kenneth M. Hanson
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
Richard N. Silver

Cite this paper

Lafferty, J.D., Suhm, B. (1996). Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models. In: Hanson, K.M., Silver, R.N. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 79. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5430-7_23

DOI: https://doi.org/10.1007/978-94-011-5430-7_23
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-6284-8
Online ISBN: 978-94-011-5430-7
eBook Packages: Springer Book Archive