Abstract
The ultimate goal of doing experiments and making observations is to learn about the way nature behaves and, eventually, to unveil the mathematical laws governing the Universe and predict yet-unobserved phenomena. In less pedantic words, to get information about the natural world. Information plays a relevant role in a large number of disciplines (physics, mathematics, biology, image processing, ...) and, in particular, it is an important concept in Bayesian Inference. It is useful, for instance, to quantify the similarities or differences between distributions and to evaluate the different ways we have to analyse the observed data because, in principle, not all of them provide the same amount of information on the same questions. The first step will be to quantify the amount of information that we get from a particular observation.
Sir, the reason is very plain; knowledge is of two kinds. We know the subject ourselves, or we know where we can find information upon it
S. Johnson
Notes
- 1.
The non-negativity of this and the following expressions of Information follows easily from Jensen’s inequality for convex functions: given the probability space \((\mathcal{R},\mathcal{B},\mu )\), a \(\mu \)-integrable function X and a convex function \(\phi \) over the range of X, then \(\phi (\int _\mathcal{R}X\,d\mu ) \le \int _\mathcal{R}\phi (X)\,d\mu \) provided the last integral exists; that is, \(\phi (E[X]) \le E[\phi (X)]\). Observe that if \(\phi \) is a concave function, then \(-\phi \) is convex, so the inequality sign is reversed, and that if \(\phi \) is twice continuously differentiable on [a, b], it is convex on that interval iff \(\phi ''(x) \ge 0\) for all \(x \in [a,b]\). Frequent and useful convex functions are \(\phi (x)=\exp (x)\) and \(\phi (x)=-\log x\).
- 2.
Despite that, the “Differential Entropy” \( h(p)= - \int _{{\Omega }_X}\,p(x)\,\log p(x)\,dx\) is a useful quantity in a different context. It is left as an exercise to show that, among all continuous distributions with support on [a, b], the Uniform distribution Un(x|a, b) is the one that maximizes the Differential Entropy; among those with support on \([0,\infty )\) and specified first order moment, it is the Exponential \(Ex(x|\mu )\); and, for support on the whole real line with the second order moment also constrained, we get the Normal density \(N(x|\mu ,\sigma )\).
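The two notes can be checked numerically. The sketch below is only an illustration under stated assumptions, not part of the chapter: the helper names `jensen_gap` and `differential_entropy`, the competitor densities (triangular, Gamma, Laplace), the sample values and the truncated integration ranges are all choices made here. It evaluates the gap \(E[\phi (X)]-\phi (E[X])\) for the two convex functions of Note 1, and compares the Differential Entropy of the Uniform, Exponential and Normal densities with that of one alternative density obeying the same constraints (the Normal case is taken on the whole real line).

```python
import math


def jensen_gap(phi, values, probs):
    """E[phi(X)] - phi(E[X]) for a discrete X; non-negative when phi is convex."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * phi(x) for x, p in zip(values, probs)) - phi(mean)


def differential_entropy(pdf, lo, hi, n=100_000):
    """Midpoint-rule estimate of h(p) = -integral of p(x) log p(x) over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        p = pdf(lo + (i + 0.5) * step)
        if p > 0.0:  # p log p -> 0 as p -> 0, so such points contribute nothing
            total -= p * math.log(p) * step
    return total


# Note 1: an arbitrary discrete distribution (illustrative values only).
values, probs = [0.5, 1.0, 4.0], [0.2, 0.5, 0.3]
gap_exp = jensen_gap(math.exp, values, probs)
gap_neg_log = jensen_gap(lambda x: -math.log(x), values, probs)

# Note 2, support [0, 1]: Uniform versus a symmetric triangular density.
h_uniform = differential_entropy(lambda x: 1.0, 0.0, 1.0)
h_triangular = differential_entropy(
    lambda x: 4.0 * x if x < 0.5 else 4.0 * (1.0 - x), 0.0, 1.0
)

# Note 2, support [0, inf) truncated at 50, mean 1:
# Exponential versus Gamma with shape 2 and scale 1/2.
h_exponential = differential_entropy(lambda x: math.exp(-x), 0.0, 50.0)
h_gamma = differential_entropy(lambda x: 4.0 * x * math.exp(-2.0 * x), 0.0, 50.0)

# Note 2, real line truncated to [-40, 40], mean 0 and variance 1:
# Normal versus Laplace with scale 1/sqrt(2).
h_normal = differential_entropy(
    lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi), -40.0, 40.0
)
h_laplace = differential_entropy(
    lambda x: math.exp(-math.sqrt(2.0) * abs(x)) / math.sqrt(2.0), -40.0, 40.0
)

print("Jensen gaps (exp, -log):", gap_exp, gap_neg_log)
print("[0,1]      uniform vs triangular:", h_uniform, h_triangular)
print("[0,inf)    exponential vs gamma :", h_exponential, h_gamma)
print("real line  normal vs laplace    :", h_normal, h_laplace)
```

A single competitor per case does not prove the maximum-entropy property, which is the exercise posed in the note; it only shows that each alternative chosen here has a smaller Differential Entropy than the claimed maximizer.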
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Maña, C. (2017). Information Theory. In: Probability and Statistics for Particle Physics. UNITEXT for Physics. Springer, Cham. https://doi.org/10.1007/978-3-319-55738-0_4
DOI: https://doi.org/10.1007/978-3-319-55738-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55737-3
Online ISBN: 978-3-319-55738-0
eBook Packages: Physics and Astronomy (R0)