Abstract
The basis of all of our development up to this point has been the cluster ensemble, a discrete ensemble that generates every possible distribution of integers i with fixed zeroth- and first-order moments. Thermodynamics arises naturally in this ensemble when M and N become very large. In this chapter we reformulate the theory on a mathematical basis that is more abstract and also more general. The key idea is as follows. If we obtain a sample from a given distribution \(h_0\), the distribution of the sample may be, in principle, any distribution h defined on the same domain. This sampling process defines a phase space of distributions h generated by the sampling distribution \(h_0\). We introduce a sampling bias via a selection functional W to define a probability measure on this space and obtain its most probable distribution. When the generating distribution \(h_0\) is chosen to be exponential, the most probable distribution obeys thermodynamics. Along the way we make contact with Information Theory, Bayesian Inference, and of course Statistical Mechanics.
Keywords
- Generalized thermodynamics
- Sampling
- Random sampling
- Biased sampling
- Canonical sampling
- Microcanonical sampling
- Most probable distribution
- Kullback-Leibler divergence
- Entropy
- Relative entropy
- Probability functional
- Canonical probability functional
- Microcanonical probability functional
- Selection functional
- Bias
- Improper prior
- Linearized probability functional
- Inverse problem
- Calculus of variations
- Euler’s theorem
- Euler equation
- Maximum entropy
- Bayesian inference
Notes
- 1.
From now on the integration limits will be assumed to be over the domain of h and will not be written explicitly.
- 2.
- 3.
- 4.
In writing this probability we have anticipated the fact that the log of the normalization constant is homogeneous in N with degree 1.
- 5.
With \(w = h_0/f\), Eq. (7.24) gives \(f = h_0/r\), and since both f and \(h_0\) are normalized, we must have r = 1.
- 6.
Both \(n_i\) and N increase in inverse proportion to Δ.
- 7.
Functional \(\log W_f[h]\) is linear with respect to all h with fixed \({\bar x}\); it is not a linear functional because \(\log w(x;{\bar x})\) is not the same for all h (it depends on \({\bar x}\)). For this reason we call \(\log W_f\) linearized but not linear.
- 8.
While \(\log W_f[h]\) is linear with respect to all h with fixed \({\bar x}\), functional \(\log \varrho _f=S[h] + \log W_f [h]-\log \omega \) is not, because S[h] is not a linear functional.
- 9.
The complete functional to be maximized is
$$\displaystyle \begin{aligned} S[f] + \log W[f] + a_0 \left(1-\int f(x)d x\right) + a_1 \left({\bar x}-\int x f(x)d x\right) \end{aligned}$$but this is equivalent to
$$\displaystyle \begin{aligned} S[f] + \log W[f] - a_0 - a_1 \int x f(x)d x, \end{aligned}$$which is the same as the functional in Eq. (7.101).
- 10.
It prevents the experimentalist with access to f from inferring W on the basis of f alone.
- 11.
A nice historical account of the development of thermodynamics is given by Müller (2007).
- 12.
While the entropy functional may be applied to any distribution, what we call thermodynamic entropy (i.e., the quantity measured as reversible heat over temperature) refers specifically to the application of the entropy functional to the probability distribution of microstates. The distinction is not always made clear in the literature, as Jaynes had to point out (Jaynes 1965).
- 13.
Kapur (1989) gives several examples.
- 14.
A fourth functional that belongs with the other three,
$$\displaystyle \begin{aligned} -\int h(x)\log \frac{h(x)}{w(x;h)} d x, \end{aligned}$$does not appear in Jaynes’s treatment.
- 15.
The MEM literature is not very clear on exactly how to handle the invariant measure.
- 16.
We may call it a physical assumption, if we are not concerned about the philosophical distinction between physical reality and models about this reality.
- 17.
Functional is a very general term for any mapping from a function to a scalar. Here are some examples that do not conform to Eq. (7.129):
$$\displaystyle \begin{aligned} J[h] = h(x_0);\quad J[h] = \max_x{h(x)};\quad J[h] = \exp\left(\int h(x)d x\right) . \end{aligned}$$
- 18.
For example, the functional
$$\displaystyle \begin{aligned} J[h] = \int\big( a_0(x) h(x) + a_1(x) h'(x) + a_2(x) h''(x) + \cdots\big) d x, \end{aligned}$$where h′ is the first derivative of h, h″ is the second derivative, and so on, is also linear in h. This form is not of any relevance to our work.
- 19.
Linearity requires J[λh] = λJ[h] for all h, and if J is of the form in Eq. (7.129), then
$$\displaystyle \begin{aligned} \int F(x,\lambda h) d x = \lambda \int F(x,h) d x, \end{aligned}$$which implies that F is homogeneous in h with degree 1. The associated Gibbs-Duhem equation is
$$\displaystyle \begin{aligned} h\,\frac{\partial^2 F}{\partial h^2} =0, \end{aligned}$$which requires \(\partial F/\partial h\) to be independent of h, or F(x, y) = a(x) y.
- 20.
The notion of vicinity implies that we have some measure to determine the distance between two functions. There are various ways to define such measures, but we will not go into these details here. Interested readers are referred to Gelfand and Fomin (2000).
- 21.
The theorem extends to any degree of homogeneity.
- 22.
If \(f(x_1, x_2, \cdots)\) is homogeneous in the \(x_i\) with degree 1, then
$$\displaystyle \begin{aligned} f = x_1 f_1 + x_2 f_2 + \cdots \end{aligned}$$where \(f_i\) is the derivative of f with respect to \(x_i\).
References
T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. (Wiley, Hoboken, 2006)
I.M. Gelfand, S.V. Fomin, Calculus of Variations (Dover, Mineola, NY, 2000). (Reprint of the 1963 edition)
E. Jaynes, Prior probabilities. IEEE Trans. Syst. Sci. Cybern. 4(3), 227–241 (1968). ISSN 0536-1567. https://doi.org/10.1109/TSSC.1968.300117
E.T. Jaynes, Gibbs vs Boltzmann entropies. Am. J. Phys. 33(5), 391–398 (1965). https://doi.org/10.1119/1.1971557. URL http://link.aip.org/link/?AJP/33/391/1
E.T. Jaynes, Papers on Probability, Statistics and Statistical Physics (Kluwer Academic Publishers, 1983)
J.N. Kapur, Maximum Entropy Methods in Science and Engineering (Wiley Eastern Limited, Brisbane, 1989)
I. Müller, A History of Thermodynamics: The Doctrine of Energy and Entropy (Springer, Berlin, Heidelberg, 2007). https://doi.org/10.1007/978-3-540-46227-9
H. Touchette, The large deviation approach to statistical mechanics. Phys. Rep. 478(1), 1–69 (2009). ISSN 0370-1573. https://doi.org/10.1016/j.physrep.2009.05.002. URL http://www.sciencedirect.com/science/article/pii/S0370157309001410
Appendix: Calculus of Variations
We give here a brief review of some tools from the calculus of variations that are useful in handling the functionals that appear in the continuous domain. The review is based on Gelfand and Fomin (2000), a recommended reference that provides a more detailed presentation at a level accessible to most readers with a basic background in calculus. Variational calculus is the study of continuous functionals and the conditions that define their extrema. One of the most basic types of functionals is one that can be expressed in the formFootnote 17
$$\displaystyle \begin{aligned} J[h] = \int F\big(x, h(x)\big)\, d x , \end{aligned} $$(7.129)where h = h(x) is a function that we treat as a variable in J and F(x, y) is some function (not functional) of its arguments. The functionals we encounter in the cluster ensemble are either of this form, or can be expressed in terms of such functionals. Here are some examples:
$$\displaystyle \begin{aligned} J_0[h] = \int h(x)\, d x ;\quad J_1[h] = \int x\, h(x)\, d x ;\quad S[h] = -\int h(x) \log h(x)\, d x . \end{aligned}$$
7.1.1 Variation of Functional
A functional J[h] is linear in h if it satisfies the conditions,
$$\displaystyle \begin{aligned} J[\lambda h] = \lambda J[h], \quad J[h_1 + h_2] = J[h_1] + J[h_2], \end{aligned}$$for any scalar λ and any h, \(h_1\), \(h_2\) in the domain of admissible functions. An example of a linear functional is
$$\displaystyle \begin{aligned} J[h] = \int a(x)\, h(x)\, d x , \end{aligned} $$(7.130)
where a(x) is some function of x. Other forms of linear functionals are possible.Footnote 18 However, if J[h] is of the form in Eq. (7.129) and it is linear, then it must be of the form in Eq. (7.130).Footnote 19
The variation δJ of functional J is the change in its value when function h changes by δh and is analogous to the differential of regular functions (Fig. 7.2). If we change h by δh to h + δh, the corresponding change in J is
$$\displaystyle \begin{aligned} \delta J = J[h + \delta h] - J[h] . \end{aligned}$$If the functional is linear, then
$$\displaystyle \begin{aligned} \delta J = J[\delta h] = \int a(x)\, \delta h(x)\, d x . \end{aligned}$$
We may interpret a(x) as the derivative of the linear functional with respect to h. We extend this to general functionals. A functional J is differentiable if δJ in the limit δh → 0 becomes a linear functional in δh. If we indicate this functional by J′[h], in the vicinity of h we have
$$\displaystyle \begin{aligned} \delta J = J'[h]\, \delta h . \end{aligned}$$We interpret J′[h] as the derivative of the functional with respect to h. We may express this linear relationship in the form of Eq. (7.130),
$$\displaystyle \begin{aligned} \delta J = \int a(x; h)\big(y(x) - h(x)\big)\, d x , \end{aligned}$$where y = h + δh is a function in the vicinity of h.Footnote 20 If we extend this functional to all functions y, we obtain a new functional,
$$\displaystyle \begin{aligned} \varPhi[y] = J[h] + \int a(x; h)\big(y(x) - h(x)\big)\, d x , \end{aligned} $$(7.135)that is linear in y and has the same value and the same derivative at y = h as the original functional J[h]. The functionals Φ and J are generally different from each other unless J is linear. Equation (7.135) represents a linear extrapolation of J from y = h.
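The linear extrapolation in Eq. (7.135) can be checked on a discretized grid. In the sketch below the functional \(J[g]=\int g^2\,dx\), the grid, and the perturbation are illustrative choices (not from the text); the functional derivative of this J is \(a(x;g)=2g(x)\), and the extrapolation should agree with the exact value to first order in y − h.

```python
import math

# Discretized sketch of Eq. (7.135): linearize the illustrative nonlinear
# functional J[g] = ∫ g(x)^2 dx around a reference h; its derivative is 2h(x).
N = 1000
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]
h = [1.0 + x for x in xs]                                      # reference function
y = [hi + 0.01 * math.sin(6.28 * x) for hi, x in zip(h, xs)]   # nearby function

def J(g):
    return sum(gi * gi for gi in g) * dx   # J[g] = ∫ g^2 dx on the grid

# Phi[y] = J[h] + ∫ 2h (y - h) dx, the linear extrapolation of J from h.
Phi = J(h) + sum(2 * hi * (yi - hi) for hi, yi in zip(h, y)) * dx
# The gap J[y] - Phi is second order in (y - h), here ∫ (y-h)^2 dx ≈ 5e-5.
print(J(y) - Phi)
```

The residual is exactly the quadratic remainder \(\int (y-h)^2\,dx\), which shrinks as the perturbation shrinks.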
Relevance to Generalized Thermodynamics
The fundamental functional in ensemble theory is \(\log W\). In general, \(\log W[h]\) is a nonlinear functional of distribution h, but since the cluster ensemble converges to the most probable distribution f, only distributions in the vicinity of f are relevant and for distributions in this narrow region we treat \(\log W\) as linear.
7.1.2 Functional Derivative
For functionals of the form in (7.129), the functional derivative is
$$\displaystyle \begin{aligned} \frac{\delta J}{\delta h} = \frac{\partial F(x, h)}{\partial h} . \end{aligned}$$This derivative is calculated as follows: treat the integrand of Eq. (7.129) as a regular function of h, and h as a regular variable. The derivative of the integrand with respect to h is the variational derivative. For example, the functional
$$\displaystyle \begin{aligned} J[h] = \int x^k\, h(x)\, d x \end{aligned}$$is of the form in Eq. (7.129) with F(x, z) = x k z. The functional derivative is
$$\displaystyle \begin{aligned} \frac{\delta J}{\delta h} = x^k . \end{aligned} $$(7.137)
In this case the derivative is independent of h because the functional is linear. As a second example we consider the intensive entropy functional
$$\displaystyle \begin{aligned} S[h] = -\int h(x) \log h(x)\, d x . \end{aligned}$$In this case \(F(x,z)= - z \log z\) and the functional derivative is
$$\displaystyle \begin{aligned} \frac{\delta S}{\delta h} = -\log h(x) - 1 . \end{aligned} $$(7.138)
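The recipe "differentiate the integrand with respect to h" can be verified by a finite-difference probe on a discretized grid: bump h at one grid point and compare the response of the functional with \(-\log h - 1\). The grid and test function below are illustrative choices, not from the text.

```python
import math

# Finite-difference check that δS/δh = -log h - 1 for S[h] = -∫ h log h dx.
N = 200
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]
h = [2.0 * x + 0.5 for x in xs]        # arbitrary positive test function

def S(g):
    return -sum(gi * math.log(gi) for gi in g) * dx

i = 50                                  # probe the derivative at grid point x_i
eps = 1e-6
bumped = h[:]
bumped[i] += eps
numeric = (S(bumped) - S(h)) / (eps * dx)   # discrete estimate of δS/δh at x_i
analytic = -math.log(h[i]) - 1.0            # Eq. (7.138)
print(numeric, analytic)
```

The division by eps·dx reflects that a bump of height eps at one grid point changes the function by area eps·dx; the two numbers agree to the accuracy of the finite difference.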
Relevance to Generalized Thermodynamics
The functional derivative is a function of x that depends on h. As we see in Eq. (7.138), the right-hand side is a function of x whose functional form depends on h. Our notation w(x; h) expresses this dependence on both x and h. If the functional is linear, its derivative is a pure function of x, the same function for all h, as we see in Eq. (7.137). We use the notation w(x) for the derivative of a linear functional.
7.1.3 Homogeneity
Euler’s theorem for homogeneous functions extends to homogeneous functionals. Let J[h] be homogeneous in h with degree 1, i.e.,Footnote 21
$$\displaystyle \begin{aligned} J[\lambda h] = \lambda J[h] . \end{aligned}$$We discretize the x axis into a set of points \((x_1, x_2, \cdots)\) at which h receives the corresponding values \(h_1, h_2, \cdots\). In this discretized space J[h] becomes \(J(h_1, h_2, \cdots)\), which may now be treated as a regular function of the \(h_i\). Euler’s theorem gives,Footnote 22
$$\displaystyle \begin{aligned} J = \sum_i h_i \frac{\partial J}{\partial h_i} , \end{aligned}$$where the partial derivatives become, in the continuous limit, the functional derivative δJ[h]∕δh, namely, the change in J when h changes by δh. Passing from the discrete to the continuous limit,
$$\displaystyle \begin{aligned} J[h] = \int h(x)\, a(x; h)\, d x , \end{aligned} $$(7.140)where a(x;h) is the functional derivative of J. This expresses Euler’s theorem for a functional that is homogeneous in h with degree 1.
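Euler’s theorem for functionals can be checked numerically on a grid. The functional \(J[h]=\sqrt{J_0 J_1}\) below, with \(J_0=\int h\,dx\) and \(J_1=\int x h\,dx\), is an illustrative degree-1 homogeneous but nonlinear choice (not one from the text); its functional derivative is \(a(x;h) = (J_1 + J_0 x)/(2\sqrt{J_0 J_1})\), and \(\int h\,a\,dx\) should reproduce J itself.

```python
import math

# Check Euler's theorem J[h] = ∫ h(x) a(x;h) dx for the illustrative
# homogeneous functional J[h] = sqrt(J0 * J1).
N = 1000
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]
h = [math.exp(-x) for x in xs]          # arbitrary positive distribution

J0 = sum(h) * dx                         # zeroth moment ∫ h dx
J1 = sum(hi * x for hi, x in zip(h, xs)) * dx   # first moment ∫ x h dx
J = math.sqrt(J0 * J1)

# Functional derivative of sqrt(J0*J1) with respect to h(x).
a = [(J1 + J0 * x) / (2 * math.sqrt(J0 * J1)) for x in xs]
euler = sum(hi * ai for hi, ai in zip(h, a)) * dx   # ∫ h a dx
print(J, euler)   # the two agree
```

The agreement is exact (up to rounding) because the identity holds term by term in the discretized sums, exactly as in the discrete form of Euler’s theorem above.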
7.1.4 Gibbs-Duhem Equation
Let us calculate the variation of J in Eq. (7.140) upon a small change δh:
$$\displaystyle \begin{aligned} \delta J = \int \delta h(x)\, a(x; h)\, d x + \int h(x)\, \delta a(x; h)\, d x . \end{aligned}$$For small δh the variation δJ is given by the linear functional
$$\displaystyle \begin{aligned} \delta J = \int \delta h(x)\, a(x; h)\, d x . \end{aligned}$$Then we must have
$$\displaystyle \begin{aligned} \int h(x)\, \delta a(x; h)\, d x = 0 . \end{aligned} $$(7.143)This expresses the Gibbs-Duhem equation that is associated with the Euler Equation (7.140). Here is how to understand this result. The functional derivative is a function of x that depends on h. If h is changed by δh, a will also change. The total change of a, integrated over h, is not free to take any value; it must be zero. This relationship is imposed by the homogeneity condition.
The Gibbs-Duhem equation is satisfied for all variations in h. We may consider variations along some specific path by varying h(x) via some parameter t. For example, we could take h to be the distribution \(e^{-x/{\bar x}}/{\bar x}\) and use \({\bar x}\), or any function \(t=t({\bar x})\), as a parameter to vary h. Along this path a changes in response to changes in t. If we divide Eq. (7.143) by dt we obtain
$$\displaystyle \begin{aligned} \int h(x)\, \frac{\partial a(x; h)}{\partial t}\, d x = 0 , \end{aligned} $$(7.144)where we have interpreted δa∕dt as the derivative of a with respect to t, since the observed change in a is entirely due to dt. This can be expressed more simply as
$$\displaystyle \begin{aligned} \overline{\left(\frac{\partial a}{\partial t}\right)} = 0 , \end{aligned}$$where the bar indicates the mean operator over distribution h. This condition is a property of the homogeneous functional of which a is a derivative, not a property of h; it applies to any h along any path.
Relevance to Generalized Thermodynamics
The logarithm of the selection bias is homogeneous in h with degree 1. According to Eq. (7.140) we have,
$$\displaystyle \begin{aligned} \log W[h] = \int h(x)\frac{\delta\log W}{\delta h}\, d x = \int h(x) \log w(x;h)\, d x . \end{aligned} $$(7.146)The derivative of \(\log W\) with respect to h is the cluster function w(x;h). If \(\log W\) is linear, then \(\log w\) is a pure function of x, i.e., w = w(x). If \(\log W\) is not linear, w(x;h) is a function of x and a functional of h.
Along the quasistatic path, h = f and f is a parametric function of \({\bar x}\), i.e., \(f = f(x,{\bar x})\). Applying Eq. (7.144) with h = f, \(J=\log W\), and \(t={\bar x}\) we have
$$\displaystyle \begin{aligned} \int f(x,{\bar x})\, \frac{\partial \log w(x;{\bar x})}{\partial{\bar x}}\, d x = 0 . \end{aligned}$$This result was used to obtain the relationship between \(\log q\), β, and \({\bar x}\) in Eq. (7.31).
7.1.5 Functional Derivative of Extensive Entropy Functional
An important homogeneous functional in ensemble theory is the entropy functional, which we define as
$$\displaystyle \begin{aligned} S[h] = -\int h(x) \log\frac{h(x)}{J_0[h]}\, d x , \end{aligned}$$with
$$\displaystyle \begin{aligned} J_0[h] = \int h(x)\, d x . \end{aligned}$$This defines the entropy functional of extensive distribution h and involves a second functional, J 0, that represents the area under the distribution. We will calculate the functional derivative of this functional by allowing arbitrary variations δh, without requiring the area under h to be constant. We refer to this as the unconstrained derivative of entropy to distinguish it from the derivative obtained when the normalization constraint is imposed. First we write the functional in the form
$$\displaystyle \begin{aligned} S[h] = -\int h(x) \log h(x)\, d x + J_0 \log J_0 . \end{aligned}$$
We will calculate the derivative of each term separately. The first term is of the form in Eq. (7.129) with \(F(x,z) = -z \log z\) and its derivative is
$$\displaystyle \begin{aligned} \frac{\delta}{\delta h}\left(-\int h \log h\, d x\right) = -\log h(x) - 1 . \end{aligned}$$For the second term we have
$$\displaystyle \begin{aligned} \frac{\delta \left(J_0 \log J_0\right)}{\delta h} = \big(\log J_0 + 1\big)\,\frac{\delta J_0}{\delta h} . \end{aligned}$$J 0 is of the form in Eq. (7.129) with F(x, z) = z and its derivative is
$$\displaystyle \begin{aligned} \frac{\delta J_0}{\delta h} = 1 . \end{aligned}$$Combining these results we obtain the functional derivative of entropy:
$$\displaystyle \begin{aligned} \frac{\delta S}{\delta h} = \log\frac{J_0}{h(x)} . \end{aligned} $$(7.153)Using this result the entropy functional can be expressed as
$$\displaystyle \begin{aligned} S[h] = \int h(x) \log\frac{J_0}{h(x)}\, d x , \end{aligned}$$which is a statement of Euler’s theorem and demonstrates the applicability of the theorem to functionals.
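Since Euler’s theorem applies, the extensive entropy must be homogeneous in h with degree 1, and this is easy to confirm numerically. The grid and test distribution below are illustrative choices:

```python
import math

# Check that S[h] = -∫ h log(h/J0) dx, with J0 = ∫ h dx, satisfies S[λh] = λ S[h].
N = 400
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]
h = [math.exp(-x) for x in xs]          # arbitrary positive distribution

def S(g):
    J0 = sum(g) * dx                     # area under the distribution
    return -sum(gi * math.log(gi / J0) for gi in g) * dx

lam = 3.7
lhs = S([lam * gi for gi in h])          # S[λh]
rhs = lam * S(h)                         # λ S[h]
print(lhs, rhs)   # equal: degree-1 homogeneity
```

The scale factor λ cancels inside the logarithm because both h and \(J_0\) scale by λ, leaving only the overall factor of λ in front of the integral.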
7.1.5.1 The Gibbs-Duhem Equation for Entropy
We demonstrate the Gibbs-Duhem equation applied to entropy with an example. We take h to be the exponential distribution,
$$\displaystyle \begin{aligned} h(x) = \frac{e^{-x/{\bar x}}}{{\bar x}} , \end{aligned}$$and use \({\bar x}\) as a parameter, such that by varying \({\bar x}\) we allow h to trace a path in the phase space of distributions. The functional derivative of entropy for this choice of h is obtained by applying Eq. (7.153) to the exponential function (recall that in this case h is normalized to unit area, \(J_0 = 1\)),
$$\displaystyle \begin{aligned} a(x; h) = -\log h = \frac{x}{{\bar x}} + \log {\bar x} , \end{aligned}$$and
$$\displaystyle \begin{aligned} \frac{\partial a}{\partial {\bar x}} = \frac{1}{{\bar x}} - \frac{x}{{\bar x}^2} . \end{aligned}$$We now calculate the integral
$$\displaystyle \begin{aligned} \int h(x)\, \frac{\partial a}{\partial {\bar x}}\, d x = \int \frac{e^{-x/{\bar x}}}{{\bar x}}\left(\frac{1}{{\bar x}} - \frac{x}{{\bar x}^2}\right) d x = \frac{1}{{\bar x}} - \frac{{\bar x}}{{\bar x}^2} = 0 . \end{aligned}$$The result is zero, in agreement with the Gibbs-Duhem equation given in Eq. (7.143). If we choose \(t=t({\bar x})\), where t is any function of \({\bar x}\), we have
$$\displaystyle \begin{aligned} \int h(x)\, \frac{\partial a}{\partial t}\, d x = \frac{d {\bar x}}{d t}\int h(x)\, \frac{\partial a}{\partial {\bar x}}\, d x , \end{aligned}$$which again is zero. We may try this with any other distribution: the Gibbs-Duhem equation is an identity by virtue of homogeneity, independently of the details of the distribution or the path.
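To illustrate that the identity does not depend on the exponential form, the sketch below repeats the calculation numerically for a gamma-type path \(h(x;t) = (x/t^2)e^{-x/t}\) (an illustrative choice, not from the text), estimating ∂a∕∂t by central differences:

```python
import math

# Gibbs-Duhem check ∫ h (∂a/∂t) dx = 0 along a parametric path, using a
# gamma-type distribution instead of the exponential of the worked example.
N = 2000
L = 60.0                     # truncation of the half-line; tail is negligible
dx = L / N
xs = [(i + 0.5) * dx for i in range(N)]

def dist(t):
    return [(x / t**2) * math.exp(-x / t) for x in xs]

def a(g):
    # functional derivative of extensive entropy: a(x; h) = log(J0/h)
    J0 = sum(g) * dx
    return [math.log(J0 / gi) for gi in g]

t, eps = 2.0, 1e-5
h = dist(t)
# central-difference estimate of ∂a/∂t along the path
da_dt = [(ai - bi) / (2 * eps)
         for ai, bi in zip(a(dist(t + eps)), a(dist(t - eps)))]
gibbs_duhem = sum(hi * d for hi, d in zip(h, da_dt)) * dx
print(gibbs_duhem)   # ≈ 0
```

The result vanishes to finite-difference accuracy, as homogeneity requires, for this or any other positive distribution one substitutes.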
7.1.6 Maximization
If J[h] has an extremum (maximum or minimum) for some h = h ∗, then its variation at that function is zero,
$$\displaystyle \begin{aligned} \delta J[h^*] = 0 , \end{aligned}$$by analogy to the condition dy = 0 for regular functions. Whether this extremum is a maximum or a minimum is determined by the sign of the second variation; we will not get into the details of the second variation here and will assume instead that we know that the extremum is a maximum. For the functional of the form in Eq. (7.129) this condition is equivalent to the Euler equation,
$$\displaystyle \begin{aligned} \frac{\partial F(x, h)}{\partial h} = 0 . \end{aligned}$$This is easily extended to constrained maximization. Suppose we want the maximum of J[h] with respect to h under the constraints,
$$\displaystyle \begin{aligned} \int h(x)\, d x = 1, \quad \int x\, h(x)\, d x = {\bar x} . \end{aligned}$$Using Lagrange multipliers, the equivalent unconstrained problem is the maximization of the functional
$$\displaystyle \begin{aligned} J[h] + \lambda_1\left(1 - \int h(x)\, d x\right) + \lambda_2\left({\bar x} - \int x\, h(x)\, d x\right) , \end{aligned}$$where λ 1 and λ 2 are Lagrange multipliers. This functional has the same maximum with respect to h as the one below,
$$\displaystyle \begin{aligned} J[h] - \lambda_1 \int h(x)\, d x - \lambda_2 \int x\, h(x)\, d x . \end{aligned}$$This is of the form in Eq. (7.129) and its Euler equation is
$$\displaystyle \begin{aligned} \frac{\partial F(x, h)}{\partial h} - \lambda_1 - \lambda_2 x = 0 . \end{aligned} $$(7.158)The constrained maximization of a continuous functional, then, is not different from that in the discrete space, if the functional is of the form in Eq. (7.129).
Relevance to Generalized Thermodynamics
The MPD maximizes the functional
$$\displaystyle \begin{aligned} -\int f\log \frac{f}{w} dx, \end{aligned}$$which is of the form in Eq. (7.129) with
$$\displaystyle \begin{aligned} F(x,f) = -f\log f + f \log w , \end{aligned}$$and its derivative is
$$\displaystyle \begin{aligned} \frac{\partial F}{\partial f} = -\log f - 1 + \log w. \end{aligned}$$The Euler equation is obtained by combining this with Eq. (7.158),
$$\displaystyle \begin{aligned} -\log f - 1 + \log w -\lambda_1 - x\lambda_2 = 0, \end{aligned}$$and its solution is
$$\displaystyle \begin{aligned} f = w e^{-\lambda_2 x - \lambda_1 - 1} . \end{aligned}$$With λ 2 = β, \(e^{\lambda _1 + 1} = q\), we obtain the canonical form of the MPD.
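As a numerical sanity check of this constrained maximization, we can verify on a discretized domain that the canonical form \(f = w\,e^{-\beta x}/q\) yields a larger value of \(-\int f \log(f/w)\,dx\) than any nearby distribution with the same area and mean. The grid, the cluster function w, and the perturbation below are illustrative choices, not from the text:

```python
import math

N = 1500
L = 30.0
dx = L / N
xs = [(i + 0.5) * dx for i in range(N)]
w = [1.0 + 0.2 * math.sin(x) for x in xs]    # hypothetical cluster function
beta = 1.0

# Canonical MPD: f = w exp(-beta*x)/q, with q the normalizing integral.
q = sum(wi * math.exp(-beta * x) for wi, x in zip(w, xs)) * dx
f = [wi * math.exp(-beta * x) / q for wi, x in zip(w, xs)]

def S(g):
    # functional being maximized: -∫ g log(g/w) dx on the grid
    return -sum(gi * math.log(gi / wi) for gi, wi in zip(g, w)) * dx

def moments(g):
    return sum(g) * dx, sum(gi * x for gi, x in zip(g, xs)) * dx

# Build a perturbation delta with zero area and zero first moment, so that
# g = f + t*delta satisfies the same two constraints as f.
d0 = [fi * math.sin(3 * x) for fi, x in zip(f, xs)]
m0, m1 = moments(d0)
n0, n1 = moments(f)
_, n2 = moments([fi * x for fi, x in zip(f, xs)])
det = n0 * n2 - n1 * n1
aa = (m0 * n2 - n1 * m1) / det
bb = (n0 * m1 - m0 * n1) / det
delta = [d - aa * fi - bb * x * fi for d, fi, x in zip(d0, f, xs)]

g = [fi + 0.05 * di for fi, di in zip(f, delta)]  # same area and mean as f
print(S(f) > S(g))   # True: the canonical f is the maximizer
```

Any admissible perturbation that respects both constraints lowers the functional, consistent with f being the constrained maximum.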
Matsoukas, T. (2018). Generalized Thermodynamics. In: Generalized Statistical Thermodynamics. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-04149-6_7