
Maxent, Mathematics, and Information Theory

Conference paper
Maximum Entropy and Bayesian Methods

Part of the book series: Fundamental Theories of Physics (FTPH, volume 79)

Abstract

This is a mathematically oriented survey of the method of maximum entropy, or minimum I-divergence, with a critical treatment of its various justifications and of its relation to Bayesian statistics. Information-theoretic ideas receive substantial attention, including "information geometry". The axiomatic approach is regarded as the best justification of maxent, and also of alternative methods that minimize some Bregman distance or f-divergence other than I-divergence. The possible interpretation of such alternative methods within the original maxent paradigm is also considered.

This work was supported by the Hungarian National Foundation for Scientific Research, Grant T016386.
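
The equivalence between maximizing entropy and minimizing I-divergence mentioned in the abstract can be illustrated concretely. The sketch below is not from the paper; the four-point support and the mean constraint are illustrative choices. It computes the maximum-entropy distribution subject to a mean constraint — which takes the Gibbs form p_i ∝ exp(λ x_i), with λ found by bisection — and verifies the identity D(p‖uniform) = log n − H(p), so that maximizing entropy is the same as minimizing I-divergence from the uniform prior.

```python
import numpy as np

def i_divergence(p, q):
    """I-divergence (Kullback-Leibler divergence) D(p||q) = sum_i p_i log(p_i/q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0  # terms with p_i = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def maxent_with_mean(x, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum-entropy distribution on support x with E[X] = target_mean.

    The maximizer has exponential-family (Gibbs) form p_i ∝ exp(lam * x_i);
    since E_lam[X] is strictly increasing in lam, solve for lam by bisection."""
    x = np.asarray(x, float)
    def mean_for(lam):
        w = np.exp(lam * x)
        return float(np.dot(x, w) / w.sum())
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    w = np.exp(0.5 * (lo + hi) * x)
    return w / w.sum()

# Illustrative instance: support {0,1,2,3}, constraint E[X] = 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
p = maxent_with_mean(x, 1.0)
uniform = np.full(4, 0.25)
# Maximizing H(p) subject to the constraint equals minimizing the
# I-divergence from the uniform prior: D(p||uniform) = log(4) - H(p).
```

Any other distribution q on the same support satisfying the mean constraint (for instance q = (0.5, 0.1, 0.3, 0.1)) has strictly larger I-divergence from the uniform distribution, which is the finite-alphabet case of the minimum I-divergence principle surveyed in the paper.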






Copyright information

© 1996 Springer Science+Business Media Dordrecht

About this paper

Cite this paper

Csiszár, I. (1996). Maxent, Mathematics, and Information Theory. In: Hanson, K.M., Silver, R.N. (eds) Maximum Entropy and Bayesian Methods. Fundamental Theories of Physics, vol 79. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5430-7_5


  • DOI: https://doi.org/10.1007/978-94-011-5430-7_5

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6284-8

  • Online ISBN: 978-94-011-5430-7

