Probability and Information Theory

Part of the Applied and Numerical Harmonic Analysis book series (ANHA)


This chapter serves as an introduction to concepts from elementary probability theory and information theory in the concrete context of the real line and multi-dimensional Euclidean space. The probabilistic concepts of mean, variance, expected value, marginalization, conditioning, and conditional expectation are reviewed. This part of the presentation overlaps somewhat with the previous chapter, which has some pedagogical benefit. There will be no mention of Borel measurability, σ-algebras, filtrations, or martingales, as these are treated in numerous other books on probability theory and stochastic processes such as [1, 14, 15, 27, 32, 48]. The presentation here, while drawing from these excellent works, is restricted to those topics that are required either in the mathematical and computational modeling of stochastic physical systems, or in determining properties of solutions to the equations in these models. Basic concepts of information theory are also addressed, such as measures of distance, or “divergence,” between probability density functions, and the properties of “information” and entropy. All pdfs treated here are differentiable functions on ℝ^n; therefore, the entropy and information measures addressed in this chapter are those referred to in the literature as the “differential” or “continuous” versions.
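
For orientation, the standard continuous-domain definitions of these quantities, stated here as a brief reminder for a differentiable pdf f on ℝ^n rather than as excerpts from the chapter, are the differential (continuous) entropy

  S(f) = -\int_{\mathbb{R}^n} f(\mathbf{x}) \, \log f(\mathbf{x}) \, d\mathbf{x},

the Kullback–Leibler divergence between pdfs f and g,

  D_{KL}(f \,\|\, g) = \int_{\mathbb{R}^n} f(\mathbf{x}) \, \log \frac{f(\mathbf{x})}{g(\mathbf{x})} \, d\mathbf{x} \;\ge\; 0,

with equality if and only if f = g almost everywhere, and, for a parametric family f(\mathbf{x}; \boldsymbol{\theta}), the Fisher information matrix with entries

  F_{ij}(\boldsymbol{\theta}) = \int_{\mathbb{R}^n} f(\mathbf{x}; \boldsymbol{\theta}) \, \frac{\partial \log f(\mathbf{x}; \boldsymbol{\theta})}{\partial \theta_i} \, \frac{\partial \log f(\mathbf{x}; \boldsymbol{\theta})}{\partial \theta_j} \, d\mathbf{x}.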


Keywords: Probability Density Function · Central Limit Theorem · Conditional Expectation · Fisher Information · Fisher Information Matrix




References

1. Applebaum, D., Probability and Information: An Integrated Approach, 2nd ed., Cambridge University Press, London, 2008.
2. Barron, A.R., “Entropy and the central limit theorem,” Ann. Prob., 14, pp. 336–342, 1986.
3. Bertsekas, D., Convex Analysis and Optimization, Athena Scientific, 2003.
4. Bhattacharya, R., Patrangenaru, V., “Nonparametric estimation of location and dispersion on Riemannian manifolds,” J. Stat. Plann. Inference, 108, pp. 23–36, 2002.
5. Blachman, N.M., “The convolution inequality for entropy powers,” IEEE Trans. Inf. Theory, 11, pp. 267–271, 1965.
6. Braunstein, S.L., Caves, C.M., “Statistical distance and the geometry of quantum states,” Phys. Rev. Lett., 72, pp. 3439–3443, 1994.
7. Brown, L.D., “A proof of the Central Limit Theorem motivated by the Cramér-Rao inequality,” in G. Kallianpur, P.R. Krishnaiah, and J.K. Ghosh, eds., Statistics and Probability: Essays in Honour of C.R. Rao, pp. 141–148, North-Holland, New York, 1982.
8. Chirikjian, G.S., Kyatkin, A.B., Engineering Applications of Noncommutative Harmonic Analysis, CRC Press, Boca Raton, FL, 2001.
9. Costa, M.H., “A new entropy power inequality,” IEEE Trans. Inf. Theory, 31, pp. 751–760, 1985.
10. Cover, T.M., Thomas, J.A., Elements of Information Theory, 2nd ed., Wiley-Interscience, Hoboken, NJ, 2006.
11. Cramér, H., Mathematical Methods of Statistics, Princeton University Press, Princeton, NJ, 1946.
12. Crassidis, J.L., Junkins, J.L., Optimal Estimation of Dynamic Systems, Chapman & Hall/CRC, London, 2004.
13. Dembo, A., Cover, T.M., Thomas, J.A., “Information theoretic inequalities,” IEEE Trans. Inf. Theory, 37, pp. 1501–1518, 1991.
14. Doob, J.L., Stochastic Processes, Wiley, New York, 1953.
15. Feller, W., Introduction to Probability Theory and its Applications, John Wiley & Sons, New York, 1971.
16. Fisher, R.A., “On the mathematical foundations of theoretical statistics,” Philos. Trans. R. Soc. London Ser. A, 222, pp. 309–368, 1922.
17. Fisher, R.A., “Theory of statistical estimation,” Proc. Cambridge Philos. Soc., 22, pp. 700–725, 1925.
18. Frieden, B.R., Science from Fisher Information, Cambridge University Press, New York, 2004.
19. Gardner, R.J., “The Brunn-Minkowski inequality,” Bull. Amer. Math. Soc., 39, pp. 355–405, 2002.
20. Gnedenko, B.V., Kolmogorov, A.N., Limit Distributions for Sums of Independent Random Variables, Addison-Wesley, Reading, MA, 1954 (and 1968).
21. Grenander, U., Probabilities on Algebraic Structures, John Wiley & Sons, New York, 1963 (reprinted by Dover, 2008).
22. Hardy, G.H., Littlewood, J.E., Pólya, G., Inequalities, 2nd ed., Cambridge University Press, London, 1952.
23. Hendricks, H., “A Cramér-Rao type lower bound for estimators with values in a manifold,” J. Multivariate Anal., 38, pp. 245–261, 1991.
24. Itoh, Y., “An application of the convolution inequality for the Fisher information,” Ann. Inst. Stat. Math., 41, pp. 9–12, 1989.
25. Jaynes, E.T., Probability Theory: The Logic of Science, Cambridge University Press, London, 2003.
26. Jensen, J.L.W.V., “Sur les fonctions convexes et les inégalités entre les valeurs moyennes,” Acta Math., 30, pp. 175–193, 1906.
27. Johnson, O., Information Theory and the Central Limit Theorem, Imperial College Press, London, 2004.
28. Johnson, O.T., “Entropy inequalities and the Central Limit Theorem,” Stochastic Process. Appl., 88, pp. 291–304, 2000.
29. Johnson, O., “A conditional entropy power inequality for dependent variables,” IEEE Trans. Inf. Theory, 50, pp. 1581–1583, 2004.
30. Johnson, O., Barron, A., “Fisher information inequalities and the central limit theorem,” Probab. Theory Related Fields, 129, pp. 391–409, 2004.
31. Kullback, S., Information Theory and Statistics, Dover, New York, 1997 (originally published in 1958).
32. Lawler, G.F., Introduction to Stochastic Processes, 2nd ed., CRC Press, Boca Raton, FL, 2006.
33. Linnik, Y.V., “An information-theoretic proof of the Central Limit Theorem with the Lindeberg condition,” Theory Probab. Its Appl., 4, pp. 288–299, 1959.
34. Madiman, M., Barron, A., “Generalized entropy power inequalities and monotonicity properties of information,” IEEE Trans. Inf. Theory, 53, pp. 2317–2329, 2007.
35. Nikolov, B., Frieden, B.R., “Limitation on entropy increase imposed by Fisher information,” Phys. Rev. E, 49, pp. 4815–4820, 1994.
36. Pennec, X., “Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements,” J. Math. Imaging Vision, 25, pp. 127–154, 2006.
37. Pennec, X., “Probabilities and statistics on Riemannian manifolds: Basic tools for geometric measurements,” IEEE Workshop on Nonlinear Signal and Image Processing, 1999.
38. Rao, C.R., “Information and the accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc., 37, pp. 81–89, 1945.
39. Rockafellar, R.T., Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
40. Samorodnitsky, G., Taqqu, M.S., Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York, 1994.
41. Scharf, L.L., McWhorter, L., “Geometry of the Cramér-Rao bound,” Signal Process., 31, pp. 301–311, 1993.
42. Schervish, M.J., Theory of Statistics, Springer, New York, 1995.
43. Shannon, C.E., Weaver, W., The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949.
44. Smith, S.T., “Covariance, subspace, and intrinsic Cramér-Rao bounds in signal processing,” IEEE Trans. Acoustics Speech Signal Process., 53, pp. 1610–1630, 2005.
45. Stam, A.J., “Some inequalities satisfied by the quantities of information of Fisher and Shannon,” Inf. Control, 2, pp. 101–112, 1959.
46. Verdú, S., Guo, D., “A simple proof of the entropy-power inequality,” IEEE Trans. Inf. Theory, 52, pp. 2165–2166, 2006.
47. Villani, C., “Entropy production and convergence to equilibrium,” in Entropy Methods for the Boltzmann Equation, Lecture Notes in Mathematics, Vol. 1916, pp. 1–70, Springer, Berlin, 2008.
48. Williams, D., Probability with Martingales, Cambridge University Press, London, 1991.
49. Xavier, J., Barroso, V., “Intrinsic variance lower bound (IVLB): an extension of the Cramér-Rao bound to Riemannian manifolds,” IEEE International Conference on Acoustics, Speech, and Signal Processing 2005 Proceedings (ICASSP ’05), Vol. 5, pp. 1033–1036, March 18–23, 2005.
50. Zamir, R., Feder, M., “A generalization of the entropy power inequality with applications,” IEEE Trans. Inf. Theory, 39, pp. 1723–1728, 1993.
51. Zolotarev, V.M., One-Dimensional Stable Distributions, Translations of Mathematical Monographs, Vol. 65, American Mathematical Society, Providence, RI, 1986.

Copyright information

© Birkhäuser Boston 2009

Authors and Affiliations

Department of Mechanical Engineering, The Johns Hopkins University, Baltimore, USA
