Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models

  • Conference paper
Maximum Entropy and Bayesian Methods

Part of the book series: Fundamental Theories of Physics (FTPH, volume 79)

Abstract

The maximum entropy method has recently been successfully introduced to a variety of natural language applications. In each of these applications, however, the power of the maximum entropy method is achieved at the cost of a considerable increase in computational requirements. In this paper we present a technique, closely related to the classical cluster expansion from statistical mechanics, for reducing the computational demands necessary to calculate conditional maximum entropy language models.

Research supported in part by NSF and ARPA under grant IRI-9314969 and the ATR Interpreting Telecommunications Research Laboratories.

Copyright information

© 1996 Springer Science+Business Media Dordrecht

About this paper

Cite this paper

Lafferty, J.D., Suhm, B. (1996). Cluster Expansions and Iterative Scaling for Maximum Entropy Language Models. In: Hanson, K.M., Silver, R.N. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 79. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5430-7_23

  • DOI: https://doi.org/10.1007/978-94-011-5430-7_23

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6284-8

  • Online ISBN: 978-94-011-5430-7

  • eBook Packages: Springer Book Archive
