Generalized Thermodynamics

A chapter in Generalized Statistical Thermodynamics
Abstract

The basis of all of our development up to this point has been the cluster ensemble, a discrete ensemble that generates every possible distribution of integers i with fixed zeroth- and first-order moments. Thermodynamics arises naturally in this ensemble when M and N become very large. In this chapter we will reformulate the theory on a mathematical basis that is more abstract and also more general. The key idea is as follows. If we obtain a sample from a given distribution \(h_0\), the distribution of the sample may be, in principle, any distribution h that is defined in the same domain. This sampling process defines a phase space of distributions h generated by sampling distribution \(h_0\). We will introduce a sampling bias via a selection functional W to define a probability measure on this space and obtain its most probable distribution. When the generating distribution \(h_0\) is chosen to be exponential, the most probable distribution obeys thermodynamics. Along the way we will make contact with Information Theory, Bayesian Inference, and of course Statistical Mechanics.


Notes

1. From now on the integration limits will be assumed to be over the domain of h and will not be written explicitly.

2. Alternatively, we may formally maximize the functional in Eq. (7.9) with respect to h under fixed N and under the normalization constraint in Eq. (7.1).

3. See for example Cover and Thomas (2006) and Touchette (2009).

4. In writing this probability we have anticipated the fact that the log of the normalization constant is homogeneous in N with degree 1.

5. With \(w = h_0 f\), Eq. (7.24) gives \(f = h_0 r\), and since both f and \(h_0\) are normalized, we must have r = 1.

6. Both \(n_i\) and N increase in inverse proportion to Δ.

7. Functional \(\log W_f[h]\) is linear with respect to all h with fixed \({\bar x}\); it is not a linear functional because \(\log w(x;{\bar x})\) is not the same for all h (it depends on \({\bar x}\)). For this reason we call \(\log W_f\) linearized but not linear.

8. While \(\log W[h]\) is linear with respect to all h with fixed \({\bar x}\), functional \(\log \varrho _f=S[h] + \log W_f [h]-\log \omega \) is not, because S[h] is not a linear functional.

9. The complete functional to be maximized is

    $$\displaystyle \begin{aligned} S[f] + \log W[f] + a_0 \left(1-\int f(x)d x\right) + a_1 \left({\bar x}-\int x f(x)d x\right) \end{aligned}$$

    but this is equivalent to

    $$\displaystyle \begin{aligned} S[f] + \log W[f] - a_0 - a_1 \int x f(x)d x, \end{aligned}$$

    which is the same as the functional in Eq. (7.101).

10. It prevents the experimentalist with access to f from inferring W on the basis of f alone.

11. A nice historical account of the development of thermodynamics is given by Müller (2007).

12. While the entropy functional may be applied to any distribution, what we call thermodynamic entropy (i.e., the quantity measured as reversible heat over temperature) refers specifically to the application of the entropy functional to the probability distribution of microstates. The distinction is not always made clear in the literature, as Jaynes had to point out (Jaynes 1965).

13. Kapur (1989) gives several examples.

14. A fourth functional that belongs with the other three,

    $$\displaystyle \begin{aligned} -\int h(x)\log \frac{h(x)}{w(x;h)} d x, \end{aligned}$$

    does not appear in Jaynes’s treatment.

15. The MEM literature is not very clear on exactly how to handle the invariant measure.

16. We may call it a physical assumption, if we are not concerned about the philosophical distinction between physical reality and models about this reality.

17. Functional is a very general term for any mapping between a function and a scalar. Here are some examples that do not conform to Eq. (7.129):

    $$\displaystyle \begin{aligned} J[h] = h(x_0);\quad J[h] = \max_x{h(x)};\quad J[h] = \exp\left(\int h(x)d x\right) . \end{aligned}$$
18. For example, the functional

    $$\displaystyle \begin{aligned} J[h] = \int\big( a_0(x) h(x) + a_1(x) h'(x) + a_2(x) h''(x) + \cdots\big) d x, \end{aligned}$$

    where h′ is the first derivative of h, h″ is the second derivative, and so on, is also linear in h. This form is not of any relevance to our work.

19. Linearity requires J[λh] = λJ[h] for all h, and if J is of the form in Eq. (7.129), then

$$\displaystyle \begin{aligned} \int F(x,\lambda h)\, d x = \lambda \int F(x,h)\, d x, \end{aligned}$$

which implies that F is homogeneous in h with degree 1. The associated Gibbs-Duhem equation,

$$\displaystyle \begin{aligned} h\,\frac{\partial^2 F(x,h)}{\partial h^2} = 0, \end{aligned}$$

requires \(\partial F/\partial h\) to be independent of h; combined with homogeneity, this gives F(x, h) = a(x)h.

20. The notion of vicinity implies that we have some measure to determine the distance between two functions. There are various ways to define such measures, but we will not go into these details here. Interested readers are referred to Gelfand and Fomin (2000).

21. The theorem extends to any degree of homogeneity.

22. If \(f(x_1, x_2\cdots )\) is homogeneous in the \(x_i\) with degree 1, then

$$\displaystyle \begin{aligned} f = x_1 f_1 + x_2 f_2 + \cdots \end{aligned}$$

where \(f_i\) is the derivative of f with respect to \(x_i\).


Appendix: Calculus of Variations

We give here a brief review of some tools from the calculus of variations that are useful in handling the functionals that appear in the continuous domain. The review is based on Gelfand and Fomin (2000), a recommended reference that provides a more detailed presentation at a level accessible to most readers with a basic background in calculus. Variational calculus is the study of continuous functionals and the conditions that define their extrema. One of the most basic types of functionals is one that can be expressed in the form (Note 17)

$$\displaystyle \begin{aligned} J[h] = \int F(x,h) d x, \end{aligned} $$
(7.129)

where h = h(x) is a function that we treat as a variable in J, and F(x, y) is some function (not functional) of its arguments. The functionals we encounter in the cluster ensemble are either of this form, or can be expressed in terms of such functionals. Here are some examples: the area under the distribution, its mean, and the intensive entropy functional,

$$\displaystyle \begin{aligned} \int h(x)\, d x;\quad \int x\, h(x)\, d x;\quad -\int h(x) \log h(x)\, d x. \end{aligned}$$
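As a concrete illustration (ours, not the book's), such functionals can be evaluated numerically by discretizing x and replacing the integrals with sums. A minimal Python sketch, assuming an exponential test distribution with mean \({\bar x}=2\):

```python
import numpy as np

# Discretize the domain; integrals of the form of Eq. (7.129) become sums.
x = np.linspace(1e-6, 50.0, 20001)
dx = x[1] - x[0]

xbar = 2.0                           # assumed mean of the test distribution
h = np.exp(-x / xbar) / xbar         # exponential distribution, unit area

J0 = h.sum() * dx                    # area under h: ~1
J1 = (x * h).sum() * dx              # mean: ~xbar
S = -(h * np.log(h)).sum() * dx      # entropy: 1 + log(xbar) for the exponential

print(J0, J1, S, 1 + np.log(xbar))
```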

7.1.1 Variation of Functional

A functional J[h] is linear in h if it satisfies the conditions,

$$\displaystyle \begin{aligned} J[\lambda h] = \lambda J[h], \quad J[h_1+h_2] = J[h_1] + J[h_2] , \end{aligned} $$

for any scalar λ and any h, h 1, h 2 in the domain of admissible functions. An example of a linear functional is

$$\displaystyle \begin{aligned} J[h] = \int a(x) h(x)\, d x, \end{aligned} $$
(7.130)

where a(x) is some function of x. Other forms of linear functionals are possible (Note 18). However, if J[h] is of the form in Eq. (7.129) and is linear, then it must be of the form in Eq. (7.130) (Note 19).
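Both linearity conditions are straightforward to verify numerically for a functional of the form of Eq. (7.130). A small sketch (our illustration, with an arbitrarily chosen a(x)):

```python
import numpy as np

x = np.linspace(1e-6, 40.0, 20001)
dx = x[1] - x[0]
a = np.sin(x) * np.exp(-0.1 * x)      # an arbitrary a(x)

def J(h):
    # Linear functional of the form of Eq. (7.130)
    return (a * h).sum() * dx

h1 = np.exp(-x / 2.0) / 2.0
h2 = np.exp(-x / 3.0) / 3.0

print(J(2.5 * h1), 2.5 * J(h1))       # J[lam*h] = lam*J[h]
print(J(h1 + h2), J(h1) + J(h2))      # J[h1+h2] = J[h1] + J[h2]
```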

The variation δJ of a functional J is the change in its value when the function h changes by δh, and is analogous to the differential of regular functions (Fig. 7.2). If we change h by δh to h + δh, the corresponding change in J is

$$\displaystyle \begin{aligned} \delta J[h] = J[h+\delta h] - J[h] . \end{aligned} $$
(7.131)

If the functional is linear, then

$$\displaystyle \begin{aligned} \delta J[h] = \int a(x)\delta h(x) d x, \end{aligned} $$
(7.132)
Fig. 7.2 Schematic representation of δh

We may interpret a(x) as the derivative of the linear functional with respect to h. We now extend this idea to general functionals. A functional J is differentiable if δJ, in the limit δh → 0, becomes a linear functional in δh. If we indicate this linear functional by J′[x; h], then in the vicinity of h we have

$$\displaystyle \begin{aligned} \delta J[h] = \int J'[x;h]\, \delta h(x)\, d x . \end{aligned} $$
(7.133)

We interpret J′[x; h] as the derivative of the functional with respect to h. We may express this linear relationship in the form of Eq. (7.130),

$$\displaystyle \begin{aligned} J[y] = \int J'[x;h] y(x) d x , \end{aligned} $$
(7.134)

where y = h + δh is a function in the vicinity of h (Note 20). If we extend this functional to all functions y, we obtain a new functional,

$$\displaystyle \begin{aligned} \Phi[y;h] = \int J'[x;h] y(x) d x, \end{aligned} $$
(7.135)

that is linear in y and has the same value and the same derivative at y = h as the original functional J[h]. The functionals Φ and J are generally different from each other unless J is linear. Equation (7.135) represents a linear extrapolation of J from y = h.

Relevance to Generalized Thermodynamics

The fundamental functional in ensemble theory is \(\log W\). In general, \(\log W[h]\) is a nonlinear functional of distribution h, but since the cluster ensemble converges to the most probable distribution f, only distributions in the vicinity of f are relevant and for distributions in this narrow region we treat \(\log W\) as linear.
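A numerical illustration of such a linearization (our sketch, not from the text): take J to be the extensive entropy of Eq. (7.147) below, which is homogeneous in h with degree 1, so that the linear functional built from its derivative reproduces J exactly at y = h and to first order nearby:

```python
import numpy as np

x = np.linspace(1e-6, 60.0, 30001)
dx = x[1] - x[0]

def S(h):
    # Extensive entropy, Eq. (7.147); homogeneous in h with degree 1
    J0 = h.sum() * dx
    return -(h * np.log(h / J0)).sum() * dx

h = np.exp(-x / 2.0) / 2.0                # reference distribution
a = -np.log(h / (h.sum() * dx))           # functional derivative at h, Eq. (7.153)

def Phi(y):
    # Linear extrapolation of S about y = h, Eq. (7.135)
    return (a * y).sum() * dx

phi = x * np.exp(-x)                      # arbitrary perturbation direction
for eps in (0.1, 0.01, 0.001):
    y = h + eps * phi
    print(eps, S(y), Phi(y))              # agreement improves as eps -> 0
```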

7.1.2 Functional Derivative

For functionals of the form in Eq. (7.129), the functional derivative is

$$\displaystyle \begin{aligned} \frac{\delta J[h]}{\delta h} = \left(\displaystyle\frac{\partial F(x,z)}{\partial z} \right)_{z=h(x)}. \end{aligned} $$
(7.136)

This derivative is calculated as follows: treat the integrand of Eq. (7.129) as a regular function of h, and h as a regular variable. The derivative of the integrand with respect to h is the variational derivative. For example, the functional

$$\displaystyle \begin{aligned} J[h] = \int x^k h(x)d x, \end{aligned}$$

is of the form in Eq. (7.129) with \(F(x,z) = x^k z\). The functional derivative is

$$\displaystyle \begin{aligned} \frac{\delta J[h]}{\delta h} = \left(\displaystyle\frac{\partial (x^k z)}{\partial z} \right)_{z=h(x)} = x^k\Big|{}_{z=h(x)} = x^k. \end{aligned}$$

In this case the derivative is independent of h because the functional is linear. As a second example we consider the intensive entropy functional

$$\displaystyle \begin{aligned} J[h] = -\int h(x) \log h(x) d x. \end{aligned} $$
(7.137)

In this case \(F(x,z)= - z \log z\) and the functional derivative is

$$\displaystyle \begin{aligned} \frac{\delta J}{\delta h} = -\log h - 1. \end{aligned} $$
(7.138)

Relevance to Generalized Thermodynamics

The functional derivative is a function of x that depends on h. As we see in Eq. (7.138), the right-hand side is a function of x whose functional form depends on h. Our notation w(x; h) expresses this dependence on both x and h. If the functional is linear, its derivative is a pure function of x, the same function for all h, as in the example with \(F(x,z)=x^k z\) above. We use the notation w(x) to indicate linear functionals.
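This recipe can be checked numerically: perturb h in an arbitrary direction φ and compare the finite-difference change of the functional against the integral of the functional derivative times φ. A minimal sketch (ours), using the intensive entropy of Eq. (7.137):

```python
import numpy as np

x = np.linspace(1e-6, 60.0, 30001)
dx = x[1] - x[0]

def J(h):
    # Intensive entropy functional, Eq. (7.137)
    return -(h * np.log(h)).sum() * dx

h = np.exp(-x / 2.0) / 2.0
phi = np.exp(-x) * np.sin(x)                 # arbitrary perturbation direction

eps = 1e-6
fd = (J(h + eps * phi) - J(h)) / eps         # finite-difference variation
exact = ((-np.log(h) - 1) * phi).sum() * dx  # Eq. (7.138) against phi
print(fd, exact)                             # the two agree closely
```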

7.1.3 Homogeneity

Euler’s theorem for homogeneous functions extends to homogeneous functionals. Let J[h] be homogeneous in h with degree 1 (Note 21), i.e.,

$$\displaystyle \begin{aligned} J[\lambda h] = \lambda J[h] . \end{aligned}$$

We discretize the x axis into a set of points \((x_1, x_2\cdots )\) at which h takes the corresponding values \(h_1, h_2\cdots \). In this discretized space J[h] becomes \(J(h_1, h_2\cdots )\), which may now be treated as a regular function of the \(h_i\). Euler’s theorem gives (Note 22)

$$\displaystyle \begin{aligned} J[h] = \sum h_i\frac{\partial J(h_1,h_2\cdots)}{\partial h_i} , \end{aligned} $$
(7.139)

where \(\partial J/\partial h_i\) is the partial derivative of J with respect to \(h_i\). Passing from the discrete to the continuous limit, the sum becomes an integral and the partial derivative becomes the functional derivative δJ[h]∕δh, namely, the change in J when h changes by δh:

$$\displaystyle \begin{aligned} J[h] = \int h(x) \frac{\delta J[h]}{\delta h} d x = \int h(x) a(x;h) d x, \end{aligned} $$
(7.140)

where a(x; h) is the functional derivative of J. This expresses Euler’s theorem for a functional that is homogeneous in h with degree 1.

7.1.4 Gibbs-Duhem Equation

Let us calculate the variation of J in Eq. (7.140) upon a small change δh:

$$\displaystyle \begin{aligned} \delta J[h] = \int a(x;h) \delta h(x) d x + \int h(x) \delta a(x;h)d x . \end{aligned} $$
(7.141)

For small δh the variation δJ is given by the linear functional

$$\displaystyle \begin{aligned} \delta J[h] = \int a(x;h) \delta h(x) d x . \end{aligned} $$
(7.142)

Comparing Eq. (7.142) with Eq. (7.141), we must have

$$\displaystyle \begin{aligned} \int h(x) \delta a(x;h)d x = 0. \end{aligned} $$
(7.143)

This is the Gibbs-Duhem equation associated with the Euler equation (7.140). Here is how to understand this result. The functional derivative a is a function of x that depends on h; if h is changed by δh, a will also change. The change in a, weighted by h and integrated over the domain, is not free to take any value: it must be zero. This restriction is imposed by the homogeneity condition.

The Gibbs-Duhem equation is satisfied for all variations in h. We may consider variations along some specific path by varying h(x) via some parameter t. For example, we could take h to be the distribution \(e^{-x/{\bar x}}/{\bar x}\) and use \({\bar x}\), or any function \(t=t({\bar x})\), as a parameter to vary h. Along this path a changes in response to changes in t. If we divide Eq. (7.143) by dt we obtain

$$\displaystyle \begin{aligned} \int h(x)\frac{\partial a(x;h)}{\partial t}d x = 0 , \end{aligned} $$
(7.144)

where we have interpreted δa∕dt as the derivative of a with respect to t, since the observed change in a is entirely due to the change in t. This can be expressed more simply as

$$\displaystyle \begin{aligned} \overline{ \frac{\partial a(x;h)}{\partial t} } = 0, \end{aligned} $$
(7.145)

where the bar indicates the mean operator over distribution h. This condition is a property of the homogeneous functional of which a is a derivative, not a property of h; it applies to any h along any path.

Relevance to Generalized Thermodynamics

The logarithm of the selection bias is homogeneous in h with degree 1. According to Eq. (7.140) we have,

$$\displaystyle \begin{aligned} \log W[h] = \int h(x)\frac{\delta\log W}{\delta h}\, d x = \int h(x) \log w(x;h)\, d x. \end{aligned} $$
(7.146)

The derivative of \(\log W\) with respect to h is the logarithm of the cluster function, \(\log w(x;h)\). If \(\log W\) is linear, then \(\log w\) is a pure function of x, i.e., w = w(x). If \(\log W\) is not linear, w(x;h) is a function of x and a functional of h.

Along the quasistatic path, h = f and f is a parametric function of \({\bar x}\), i.e., \(f = f(x,{\bar x})\). Applying Eq. (7.144) with h = f, \(J=\log W\), and \(t={\bar x}\) we have

$$\displaystyle \begin{aligned} \int f(x,{\bar x})\, \frac{\partial \log w(x;{\bar x})}{\partial{\bar x}}\, d x = 0. \end{aligned}$$

This result was used to obtain the relationship between \(\log q\), β, and \({\bar x}\) in Eq. (7.31).

7.1.5 Functional Derivative of Extensive Entropy Functional

An important homogeneous functional in ensemble theory is the entropy functional, which we define as

$$\displaystyle \begin{aligned} S[h] = -\int h(x)\log \frac{h(x)}{J_0[h]} d x , \end{aligned} $$
(7.147)

with

$$\displaystyle \begin{aligned} J_0[h] = \int h(x) d x . \end{aligned} $$
(7.148)

This defines the entropy functional of the extensive distribution h and involves a second functional, \(J_0\), that represents the area under the distribution. We will calculate the functional derivative of the entropy by allowing arbitrary variations δh, without requiring the area under h to remain constant. We refer to this as the unconstrained derivative of the entropy, to distinguish it from the derivative obtained when the normalization constraint is imposed. First we write the functional in the form

$$\displaystyle \begin{aligned} S[h] = -\int h(x) \log h(x) d x + J_0[h]\log J_0[h] . \end{aligned} $$
(7.149)

We will calculate the derivative of each term separately. The first term is of the form in Eq. (7.129) with \(F(x,z) = -z \log z\) and its derivative is

$$\displaystyle \begin{aligned} \frac{\delta}{\delta h}\left(-\int h(x) \log h(x)\, d x\right) = \left(\frac{\partial F}{\partial z} \right)_{z=h} = -\log h(x) - 1 . \end{aligned} $$
(7.150)

For the second term we have

$$\displaystyle \begin{aligned} \frac{\delta (J_0 \log J_0)}{\delta h} = \left(\frac{\delta J_0}{\delta h}\right)\log J_0 + J_0\, \frac{\delta \log J_0}{\delta h} = \left(\frac{\delta J_0}{\delta h}\right)\left(\log J_0+1\right). \end{aligned} $$
(7.151)

\(J_0\) is of the form in Eq. (7.129) with F(x, z) = z and its derivative is

$$\displaystyle \begin{aligned} \frac{\delta J_0}{\delta h} = \left(\displaystyle\frac{\partial F}{\partial z} \right)_{z=h} = 1. \end{aligned} $$
(7.152)

Combining these results we obtain the functional derivative of entropy:

$$\displaystyle \begin{aligned} \frac{\delta S[h]}{\delta h} = - \log \frac{h(x)}{J_0[h]} . \end{aligned} $$
(7.153)

Using this result the entropy functional can be expressed as

$$\displaystyle \begin{aligned} S[h] = \int h(x) \frac{\delta S[h]}{\delta h} d x, \end{aligned} $$
(7.154)

which is a statement of Euler’s theorem and demonstrates the applicability of the theorem to functionals.
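Both the homogeneity of this functional and Euler's theorem, Eq. (7.154), are easy to confirm numerically. A short sketch (ours), using a distribution whose area is deliberately not 1 so that the role of \(J_0\) is visible:

```python
import numpy as np

x = np.linspace(1e-6, 80.0, 40001)
dx = x[1] - x[0]

def S(h):
    # Extensive entropy, Eq. (7.147)
    J0 = h.sum() * dx
    return -(h * np.log(h / J0)).sum() * dx

h = 5.0 * np.exp(-x / 3.0)               # any positive h; area is not 1
lam = 2.7
print(S(lam * h), lam * S(h))            # homogeneity with degree 1

a = -np.log(h / (h.sum() * dx))          # functional derivative, Eq. (7.153)
print(S(h), (h * a).sum() * dx)          # Euler's theorem, Eq. (7.154)
```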

7.1.5.1 The Gibbs-Duhem Equation for Entropy

We demonstrate the Gibbs-Duhem equation applied to entropy with an example. We take h to be the exponential distribution,

$$\displaystyle \begin{aligned} h(x) = \frac{e^{-x/{\bar x}}}{{\bar x}}, \end{aligned} $$
(7.155)

and use \({\bar x}\) as a parameter, such that by varying \({\bar x}\) we allow h to trace a path in the phase space of distributions. The functional derivative of entropy for this choice of h is obtained by applying Eq. (7.153) to the exponential distribution (recall that in this case h is normalized to unit area, so that \(J_0[h]=1\)),

$$\displaystyle \begin{aligned} a(x;h) = -\log h(x) = \frac{x}{{\bar x}}+\log{\bar x}, \end{aligned}$$

and

$$\displaystyle \begin{aligned} \frac{\partial a}{\partial{\bar x}} = \frac{1}{{\bar x}}-\frac{x}{{\bar x}^2} . \end{aligned}$$

We now calculate the integral

$$\displaystyle \begin{aligned} \int h(x) \frac{\partial a}{\partial{\bar x}}\, d x = \int \left( \frac{1}{{\bar x}}-\frac{x}{{\bar x}^2} \right) \frac{e^{-x/{\bar x}}}{{\bar x}}\, d x = \frac{1}{{\bar x}}-\frac{{\bar x}}{{\bar x}^2} = 0 . \end{aligned}$$

The result is zero, in agreement with the Gibbs-Duhem equation given in Eq. (7.143). If we choose \(t=t({\bar x})\), where t is any function of \({\bar x}\), we have

$$\displaystyle \begin{aligned} \int h(x) \frac{\partial a}{\partial t}\, d x = \frac{d{\bar x}}{dt}\int h(x) \frac{\partial a}{\partial {\bar x}}\, d x = 0, \end{aligned}$$

which again is zero. We may try this with any other distribution: the Gibbs-Duhem equation is an identity by virtue of homogeneity, independently of the details of the distribution or the path.
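The same identity can be confirmed without analytic work by differencing a(x; h) numerically along the path parameterized by \({\bar x}\). A small sketch (our illustration):

```python
import numpy as np

x = np.linspace(1e-8, 80.0, 40001)
dx = x[1] - x[0]

def a(xbar):
    # Derivative of entropy at the exponential distribution:
    # a(x; h) = -log h = x/xbar + log(xbar)
    return x / xbar + np.log(xbar)

xbar, eps = 2.0, 1e-5
h = np.exp(-x / xbar) / xbar
da = (a(xbar + eps) - a(xbar - eps)) / (2 * eps)  # central difference in xbar
print((h * da).sum() * dx)                        # ~0, per Eq. (7.143)
```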

7.1.6 Maximization

If J[h] has an extremum (maximum or minimum) at some \(h = h^*\), then its variation at that function is zero,

$$\displaystyle \begin{aligned} \delta J[h^*] = 0, \end{aligned}$$

by analogy to the condition dy = 0 for regular functions. Whether this extremum is a maximum or a minimum is determined by the sign of the second variation; we will not go into the details of the second variation here, and will instead assume that the extremum is known to be a maximum. For a functional of the form in Eq. (7.129) this condition is equivalent to the Euler equation,

$$\displaystyle \begin{aligned} \frac{\partial F(x,y)}{\partial y}\Big|{}_{y=h^*} = 0. \end{aligned}$$

This is easily extended to constrained maximization. Suppose we want the maximum of J[h] with respect to h under the constraints,

$$\displaystyle \begin{aligned} \int h(x)d x = A;\quad \int x h(x) d x = B . \end{aligned} $$
(7.156)

Using Lagrange multipliers, the equivalent unconstrained problem is the maximization of the functional

$$\displaystyle \begin{aligned} \max_h\left\{ J[h] + \lambda_1 \left(A-\int h(x)d x\right) + \lambda_2 \left(B-\int x h(x)d x\right) \right\} , \end{aligned} $$
(7.157)

where λ 1 and λ 2 are Lagrange multipliers. This functional has the same maximum with respect to h as the one below,

$$\displaystyle \begin{aligned} \mathcal J[h] = \int \big( F(x,h) -\lambda_1 h -\lambda_2 x h \big)\, d x . \end{aligned}$$

This is of the form in Eq. (7.129) and its Euler equation is

$$\displaystyle \begin{aligned} \frac{\partial F(x,h)}{\partial h} -\lambda_1 - x\lambda_2 = 0. \end{aligned} $$
(7.158)

The constrained maximization of a continuous functional, then, is no different from that in the discrete space, provided the functional is of the form in Eq. (7.129).

Relevance to Generalized Thermodynamics

The MPD maximizes the functional

$$\displaystyle \begin{aligned} -\int f\log \frac{f}{w} dx, \end{aligned}$$

which is of the form in Eq. (7.129) with

$$\displaystyle \begin{aligned} F(x,f) = -f\log f + f \log w , \end{aligned}$$

and its derivative is

$$\displaystyle \begin{aligned} \frac{\partial F}{\partial f} = -\log f - 1 + \log w. \end{aligned}$$

The Euler equation is obtained by combining this with Eq. (7.158),

$$\displaystyle \begin{aligned} -\log f - 1 + \log w -\lambda_1 - x\lambda_2 = 0, \end{aligned}$$

and its solution is

$$\displaystyle \begin{aligned} f = w\, e^{-\lambda_2 x - \lambda_1 - 1} . \end{aligned}$$

With \(\lambda _2 = \beta\) and \(e^{\lambda _1 + 1} = q\), we obtain the canonical form of the MPD, \(f = w\, e^{-\beta x}/q\).
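The same result can be recovered numerically: discretize x, maximize \(-\int f\log (f/w)\,dx\) under the two constraints, and check that f∕w is exponential in x. A sketch under assumed inputs (the selection function w(x) = 1 + x∕2 and \({\bar x} = 2\) are our illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

x = np.linspace(0.05, 15.0, 150)
dx = x[1] - x[0]
w = 1.0 + 0.5 * x                     # assumed selection function w(x)
xbar = 2.0                            # assumed mean

def negS(f):
    # Minimize int f log(f/w) dx, i.e., maximize -int f log(f/w) dx
    return (f * np.log(f / w)).sum() * dx

cons = ({'type': 'eq', 'fun': lambda f: f.sum() * dx - 1.0},
        {'type': 'eq', 'fun': lambda f: (x * f).sum() * dx - xbar})

f0 = np.exp(-x / xbar) / xbar         # feasible starting guess
res = minimize(negS, f0, method='SLSQP', constraints=cons,
               bounds=[(1e-10, None)] * x.size)
f = res.x

# Canonical MPD: f = w exp(-beta x)/q, so -log(f/w) should be linear in x.
beta, logq = np.polyfit(x, -np.log(f / w), 1)
print(res.success, beta, logq)
```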

Copyright information

© 2018 Springer Nature Switzerland AG

Matsoukas, T. (2018). Generalized Thermodynamics. In: Generalized Statistical Thermodynamics. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-04149-6_7