1 The Second Law Puzzle

Let me begin my talk (Footnote 1) by recalling one version of the second law of thermodynamics:

The entropy of the universe begins low and increases monotonically.

There are long-established and well-known arguments (see the discussion of ‘branch systems’ in [1], as also reviewed e.g. in [2]) that other statements of the second law, in terms of what can and cannot happen with heat engines, refrigerators etc., follow from the above statement. As also explained in these references, the above statement leads to an explanation of time asymmetry; i.e. why, for example, it is commonplace to observe wine-glasses fall off tables and smash into pieces, but we never see lots of smashed pieces assemble themselves into wine-glasses and jump onto tables (Fig. 1).

But how do we define the entropy of a closed system? And why does it increase?

A standard way of answering this (essentially due to Boltzmann around 1870) might be to consider for example what will happen if one starts with a system of N gas molecules in the left half of a box (see Fig. 2) and removes a partition, allowing the particles to diffuse into the right half of the box.

In a classical discussion, one describes the states of this system with some given energy in terms of a \(6N-1\) dimensional phase space, the points of which are called ‘microstates’ and (see Fig. 3) one imagines this phase space to be divided up into cells—called ‘macrostates’—with the property that we cannot in practice distinguish between any pair of microstates in any single macrostate. One then defines the (‘coarse-grained’) entropy, S, of a microstate by

$$\begin{aligned} S = k\log W \end{aligned}$$
(1)

where k is Boltzmann’s constant and W is the volume of the macrostate containing that given microstate.

Fig. 1. Schematic diagram of the universe showing how its radius increases with time

Fig. 2. A box of gas molecules, initially confined to the left half

The standard argument then is that (see Fig. 3) the macrostate corresponding to “all the particles are in the left half of the box” will have a vastly smaller volume in phase space than the large macrostate which corresponds to “the molecules fill the box with roughly uniform density”. Hence, as time goes on and the state of the system wanders around the phase space accordingly, it is highly likely that the entropy, as defined by (1), will get bigger and stay bigger.
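The size difference between the two macrostates can be illustrated with a deliberately crude toy model, in which each molecule is only labelled ‘left’ or ‘right’ and W is replaced by the number of ways of choosing which molecules are on the left. This is a sketch under that assumption, not the full phase-space volume; the function name `log_multiplicity` and the value N = 1000 are chosen purely for illustration.

```python
import math

def log_multiplicity(n_total: int, n_left: int) -> float:
    """Natural log of the binomial coefficient C(n_total, n_left).

    Toy stand-in for log W: the number of ways to choose which n_left
    of the n_total molecules sit in the left half of the box.
    """
    return (math.lgamma(n_total + 1)
            - math.lgamma(n_left + 1)
            - math.lgamma(n_total - n_left + 1))

N = 1000  # illustrative number of molecules (assumption)

s_all_left = log_multiplicity(N, N)    # "all the particles are in the left half"
s_even = log_multiplicity(N, N // 2)   # "roughly uniform density"

# Entropies in units of Boltzmann's constant k, as in S = k log W.
print(f"S/k, all on the left: {s_all_left:.2f}")
print(f"S/k, evenly spread  : {s_even:.2f}")
```

Even for this small N the evenly spread macrostate has an entropy of several hundred k, while the ‘all on the left’ macrostate has essentially zero entropy in this toy counting.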

Fig. 3. The phase space for the gas in the box, indicating some possible macrostates

However, this definition of entropy and this argument for its increase depend, unsatisfactorily, on the need to make judgments about what we can distinguish. For example, if (see again Fig. 3), after previously ignoring such fine distinctions, we were to take the view that we can distinguish a state where, say, 48% of the particles are in the left half of the box and 52% in the right half from a state with roughly equal proportions (Footnote 2), then, at times for which the system’s microstate lies in the accordingly-defined new macrostate (obviously a subregion of the previously discussed large macrostate), Eq. (1) would ascribe a different value to the entropy.

Moreover, this unsatisfactory arbitrariness and vagueness in the definition of entropy is even more of a problem if we want to account for the version of the second law with which we began. For we are not even present to make any distinctions in the early universe!

Turning to the quantum setting, von Neumann gave us long ago a quantum translation of Boltzmann’s equation (1). Given a description of our system in terms of a density operator, \(\rho \) acting on the system’s Hilbert space \(\mathcal {H}\), one defines its von Neumann entropy, \(S^{\mathrm {vN}}(\rho )\), by

$$\begin{aligned} S^{\mathrm {vN}}(\rho )=-k\mathrm{tr}(\rho \log \rho ). \end{aligned}$$
(2)

But if we were to equate the physical entropy, \(S^{\mathrm {physical}}\), with \(S^{\mathrm {vN}}(\rho )\) and if \(\rho \) satisfies the usual unitary time evolution rule

$$\begin{aligned} \rho (t)=U(t)\rho (0)U(t)^{-1} \end{aligned}$$

then we would conclude that

$$\begin{aligned} S^{\mathrm {physical}}(\rho (t))= {\mathrm {constant}} \end{aligned}$$

in contradiction with the second law. We shall call this the second law puzzle. One can overcome this difficulty by defining quantum counterparts to the above classical coarse-graining, but of course one then would have the same unsatisfactory vagueness and subjectivity as we discussed above in the classical case.
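The constancy of the von Neumann entropy under unitary evolution can be checked numerically in a small example. The following is a minimal sketch, assuming a 6-dimensional Hilbert space, a randomly generated mixed state and a random unitary standing in for U(t); none of these choices comes from the talk.

```python
import numpy as np

def von_neumann_entropy(rho: np.ndarray) -> float:
    """S/k = -tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # treat 0 log 0 as 0
    return float(-np.sum(evals * np.log(evals)))

rng = np.random.default_rng(0)
d = 6  # illustrative Hilbert-space dimension (assumption)

# A random mixed state: rho(0) = G G^dagger / tr(G G^dagger).
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho0 = g @ g.conj().T
rho0 = rho0 / np.trace(rho0).real

# A random unitary from the QR decomposition of a complex Gaussian matrix.
u, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

# rho(t) = U rho(0) U^{-1}, with U^{-1} = U^dagger for a unitary U.
rho_t = u @ rho0 @ u.conj().T

s0 = von_neumann_entropy(rho0)
st = von_neumann_entropy(rho_t)
print(f"S/k before: {s0:.10f}")
print(f"S/k after : {st:.10f}")
```

The two values agree to rounding error because conjugation by a unitary leaves the eigenvalues of the density operator unchanged.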

More interestingly, one can seek to exploit a feature of quantum mechanics which has no classical counterpart: If we have a pure state, described by a density operator, \(\rho =|\varPsi \rangle \langle \varPsi |\), which is a projector onto a vector, \(\varPsi \), in a Hilbert space, \(\mathcal {H}_{\mathrm {total}}\), which arises as the tensor product,

$$\begin{aligned} \mathcal {H}_{\mathrm {total}}=\mathcal {H}_\mathrm {A}\otimes \mathcal {H}_\mathrm {B} \end{aligned}$$

of two Hilbert spaces, \(\mathcal {H}_\mathrm {A}\) and \(\mathcal {H}_\mathrm {B}\), then the reduced density operator, \(\rho _\mathrm {A}\) on \(\mathcal {H}_\mathrm {A}\), defined as the partial trace, \(\mathrm{tr}_{\mathcal {H}_\mathrm {B}}(\rho )\), of \(\rho \) over \(\mathcal {H}_\mathrm {B}\), will typically have \(S^{\mathrm {vN}}(\rho _\mathrm {A}) \!>\! 0\).

We remark that

  • This partial trace is characterized by the property that, if O is a (self-adjoint) operator on \(\mathcal {H}_\mathrm {A}\), then

    $$\begin{aligned} \mathrm{tr}_{\mathcal {H}_\mathrm {A}}(\rho _\mathrm {A} O)= \langle \varPsi |(O\otimes I)\varPsi \rangle _{\mathcal {H}_{\mathrm {total}}}. \end{aligned}$$
  • Both reduced density operators have equal von Neumann entropies:

    $$\begin{aligned} S^{\mathrm {vN}}(\rho _\mathrm {A})=S^{\mathrm {vN}}(\rho _\mathrm {B}) \end{aligned}$$
    (3)

    and this common value is often known as the A–B entanglement entropy of the total state-vector \(\varPsi \).
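Both remarks can be checked numerically for a randomly chosen pure state. The sketch below assumes small illustrative dimensions (3 for A and 5 for B) and stores the state vector as a matrix of coefficients, so that the two partial traces become matrix products; the variable names are chosen for this example only.

```python
import numpy as np

def von_neumann_entropy(rho: np.ndarray) -> float:
    """S/k = -tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # treat 0 log 0 as 0
    return float(-np.sum(evals * np.log(evals)))

rng = np.random.default_rng(1)
dim_a, dim_b = 3, 5  # illustrative dimensions of H_A and H_B (assumption)

# Psi = sum_{a,b} psi[a, b] |a>|b>, normalized to unit length.
psi = rng.normal(size=(dim_a, dim_b)) + 1j * rng.normal(size=(dim_a, dim_b))
psi = psi / np.linalg.norm(psi)

rho_a = psi @ psi.conj().T   # partial trace of |Psi><Psi| over H_B
rho_b = psi.T @ psi.conj()   # partial trace of |Psi><Psi| over H_A

s_a = von_neumann_entropy(rho_a)
s_b = von_neumann_entropy(rho_b)

# Check tr(rho_A O) = <Psi|(O x I) Psi> for a random self-adjoint O on H_A.
m = rng.normal(size=(dim_a, dim_a)) + 1j * rng.normal(size=(dim_a, dim_a))
o = (m + m.conj().T) / 2
lhs = np.trace(rho_a @ o)
rhs = np.vdot(psi, o @ psi)  # (O x I) acts on the first index of psi

print(f"S(rho_A)/k = {s_a:.10f}")
print(f"S(rho_B)/k = {s_b:.10f}")
print(f"|tr(rho_A O) - <Psi|(O x I)Psi>| = {abs(lhs - rhs):.2e}")
```

The equality of the two entropies holds even though the two reduced density operators act on spaces of different dimension; their nonzero eigenvalues coincide.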

In a variant of the ‘environment paradigm for decoherence’ or, from another point of view, a variant of a possible approach to quantum statistical mechanics, this formalism is often applied in the case that A is interpreted as standing for some ‘system’ and B for the system’s ‘environment’ or ‘energy bath’ and \(S^{\mathrm {vN}}(\rho _\mathrm {A})\) is then interpreted as the entropy of the system due to entanglement with the environment.

So the environment paradigm gives us an objective notion of entropy. However, there remain problems:

  • It only offers a notion of entropy for open systems.

  • There are lots of ways of decomposing a given \(\mathcal H\) as \(\mathcal {H}_\mathrm {A}\otimes \mathcal {H}_\mathrm {B}\). How we choose to decompose it depends on subjective choices and, again, we are not around in the early universe to make those choices.

What I’d like to point out is that one can envisage an alternative physical use of this mathematical fact: Suppose there’s some decomposition that’s physically natural, then maybe we could define the entropy of a total closed system by

$$\begin{aligned} S^\mathrm {total}= S^{\mathrm {vN}}(\rho _\mathrm {A}) \quad (= S^{\mathrm {vN}}(\rho _\mathrm {B})) \quad \hbox {(= A--B entanglement entropy)} \end{aligned}$$
(4)

rather than interpreting this mathematical quantity as the entropy of the A-subsystem!

We propose that the identification:

$$\begin{aligned} \hbox {A}=\textit{matter}; \quad \hbox {B}=\textit{gravity}, \end{aligned}$$

is the right choice. This is our matter-gravity entanglement hypothesis. (See [3,4,5] for early papers, and [6] and the remainder of the present article for recent partial overviews and further references.)

In support of this, we note that the decomposition has to be meaningful throughout the entire history of the universe: E.g. we could not identify A with photons and B with nuclei + electrons because these notions are not even meaningful until the photon epoch. We content ourselves, though, with going back to just after the Planck epoch; we assume that a low-energy quantum gravity theory holds there and throughout the entire subsequent history of the universe and that this is a conventional (unitary) quantum theory with \(\mathcal {H}=\mathcal {H}_\mathrm {matter}\otimes \mathcal {H}_\mathrm {gravity}\). We will also assume that the initial degree of matter-gravity entanglement is low. (We leave it for a future theory of the pre-Planck era to explain that.)

These assumptions then appear to be capable of offering an explanation of the second law in the form stated at the outset since one can argue that an initial state with a low degree of matter-gravity entanglement will, because of matter-gravity interaction, get more entangled, plausibly monotonically, as time increases. At least the question of whether the second law holds becomes a question which, in principle, can be answered mathematically once we specify the (low-energy) quantum gravity Hamiltonian (i.e. the generator of the unitary time-evolution) and the initial state. What we have called the second law puzzle would then be resolved because once we define entropy as matter-gravity entanglement entropy (rather than as the von Neumann entropy of the total state) there is no conflict between its increase and a unitary time-evolution.

2 The Information Loss Puzzle (Hawking 1976)

The celebrated result of Hawking [7] is that a black hole formed by the dynamical collapse of a star will emit thermal radiation at the Hawking temperature, given, in the case of a spherically symmetric electrically neutral black hole (Fig. 4) by

$$\begin{aligned} kT_\mathrm {Hawking}={1\over {8\pi GM}} \end{aligned}$$
(5)

where M is the black hole mass (and we take \(c=\hbar =1\)).

Fig. 4. A schematic picture of the spacetime of a star which collapses to a black hole and then Hawking-evaporates. The thick brown lines represent the boundary of the surface of a collapsing star, the green lines the horizon, the blue wiggly line the future spacetime singularity. The thin yellow wiggles indicate the Hawking radiation predicted in [7] (Color figure online)

As Hawking explained in that work, one expects that such a radiating black hole will lose mass, increasing further its temperature, and eventually evaporate.
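To get a feeling for the magnitudes involved, formula (5) can be evaluated in SI units. Restoring the factors of c and \(\hbar\) that the text sets to 1 gives the standard form \(T = \hbar c^3/(8\pi G M k)\). The sketch below uses rounded values of the constants and an approximate solar mass, all of which are inputs chosen for this illustration rather than figures from the talk.

```python
import math

# Rounded SI values (assumptions for this illustration).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m / s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23       # Boltzmann's constant, J / K
M_SUN = 1.989e30         # approximate solar mass, kg

def hawking_temperature_kelvin(mass_kg: float) -> float:
    """Eq. (5) with c and hbar restored: T = hbar c^3 / (8 pi G M k)."""
    return HBAR * C**3 / (8 * math.pi * G * mass_kg * K_B)

t_sun = hawking_temperature_kelvin(M_SUN)
t_half = hawking_temperature_kelvin(M_SUN / 2)

print(f"T for one solar mass : {t_sun:.3e} K")
print(f"T for half that mass : {t_half:.3e} K")
```

Halving the mass doubles the temperature, which is the runaway behaviour described above: as the hole radiates and loses mass, it gets hotter.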

During this whole process of collapse to a black hole and subsequent evaporation, one expects the entropy of the total system to increase monotonically (Footnote 3).

The version of the information loss puzzle [8] that I shall adopt here is the puzzle as to how this entropy increase can be reconciled with an assumption of unitary time evolution.

Stated in this way, I think it is clear that the information loss puzzle is nothing but a special case of our Second Law Puzzle; we recall here that this is the puzzle that, if one equates \(S^\mathrm {physical}\) with \(S^{\mathrm {vN}}(\rho _\mathrm {total})\), then \(S^\mathrm {physical}\) must be constant.

I suggested in [3, 4] that the resolution to the information loss puzzle is simply the special case of the above proposed resolution to the second law puzzle. Namely, \(S^\mathrm {physical}\) is not \(S^{\mathrm {vN}}(\rho _\mathrm {total})\). Rather \(S^\mathrm {physical}\) is the total state’s matter-gravity entanglement entropy. As I already said in the more general context in Sect. 1, this is not a unitary invariant and, it is reasonable to assume, would increase, thus offering to resolve the puzzle. That it also offers this resolution to the information loss puzzle is, in my view, further evidence that our matter-gravity entanglement hypothesis is on the right track.

3 The Thermal Atmosphere Puzzle

A black hole in a box in equilibrium with its thermal atmosphere (see Fig. 5) is traditionally taken to be in a total Gibbs state (in particular a total mixed state) at the Hawking temperature.

Fig. 5. A schematic picture of a black hole in equilibrium with its thermal atmosphere in a box

Everyone agrees that the entropy of this system has (at least up to small corrections) the value

$$\begin{aligned} S^\mathrm {Hawking}=4\pi kGM^2 = kA/4G \end{aligned}$$
(6)

where A is the surface area of the event horizon (\(=16\pi G^2M^2\)). The thermal atmosphere puzzle [9, 10] is that one can give seemingly convincing arguments for each of the following three, at first sight seemingly mutually contradictory, statements about the nature and origin of this entropy:

  • It is the entropy of the gravitational field (so mostly ‘residing’ in the black hole).

  • It is the entropy of the thermal atmosphere (so apart from the graviton component, consisting mainly of matter).

  • It is the sum of the above two entropies.

Our proposed resolution of the puzzle begins by postulating that it is not actually the case that the total state is a Gibbs state; rather, we propose, the total state is pure, but entangled between gravity (\(\simeq \) the black hole) and matter (\(\simeq \) its atmosphere) in such a way that the reduced state of each is approximately a Gibbs state (at the Hawking temperature).

We further suggest, in line with our matter-gravity entanglement hypothesis, that \(S^\mathrm {Hawking}\) is really this state’s matter-gravity entanglement entropy. This offers to resolve the puzzle in the following way: The first entropy can be regarded, according to the environment paradigm, as the entropy of the open system consisting of the gravitational field due to its matter environment; the second the entropy of the open system consisting of the matter due to its gravity environment. But, by (3), these are actually equal and so, in this environment-paradigm sense, both statements are therefore true, without contradiction. On the other hand, there is no reason why the third statement should be true in any sense and in fact, on our hypothesis it is clearly not true—the total entropy being, by (4) not the sum of the first two, but rather, equal to each of them.

The fact that it seems capable of providing this resolution to the thermal atmosphere puzzle provides further support for the validity of our matter-gravity entanglement hypothesis.

4 The Weak String-Coupling Limit of Black-Hole Equilibrium States and Black Hole Entropy

Some of the most interesting work towards computing (in certain cases) or, at least, gaining a better understanding of, black hole entropy has been within string theory. Here I shall briefly recall the basic idea due to Susskind [11] and one particular line of development by Horowitz and Polchinski [12, 13] which leads to an explanation of how the entropy of spherically symmetric black holes scales with \(M^2\) (the square of the black-hole mass), albeit the argument is semi-qualitative and does not tell us the constant of proportionality (so does not explain the factor of 1/4 in (6)).

First I will outline the Susskind–Horowitz–Polchinski (SHP) argument. Then I will criticize it. Then I will propose a modification of the SHP argument which is free from the criticisms I raise and is consistent with the understanding of black-hole equilibrium states on the matter-gravity entanglement hypothesis that I outlined in Sect. 3.

Fig. 6. The weak string-coupling limit of a black hole is a long string

The SHP argument [12, 13] is in two steps (Footnote 4). First (see Fig. 6), one argues that, as one scales the string coupling-constant, g, down and the string length, \(\ell _s\), up, keeping Newton’s constant \(G=g^2\ell _s^2\) fixed, a black hole goes over to a long string. This will have a density of states (i.e. number of states per unit energy, where we use \(\epsilon \) to denote energy) \(\sigma _\mathrm {long string}(\epsilon )\) approximately of the form of a constant times \(e^{\ell _s\epsilon }\).

Secondly, one equates the entropy, \(S_\mathrm {black hole}\), with “\(k\log (\sigma _\mathrm {long string}(\epsilon ))\)” \(=k\ell _s\epsilon \) at \(\epsilon =\) constant times M when \(\ell _s=\) constant times GM, whereupon \(S_\mathrm {black hole} =\) constant times \(kGM^2\).

Our criticism of this is that it is not correct to equate an entropy with the logarithm of a density of states. (Nor indeed, in other string theory work, with the logarithm of a degeneracy—see [6, 15].) Indeed it only ever makes sense in physics to take the logarithm of a dimensionless quantity but a density of states has of course the dimensions of inverse energy!
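The dimensional point can be made concrete with a change of energy units. As a short worked illustration (the rescaling factor \(\lambda\) is introduced here only for this purpose), suppose energies are re-expressed as \(\epsilon = \lambda \epsilon '\). The number of states in a given energy interval cannot depend on the units, so \(\sigma '(\epsilon ')\,d\epsilon ' = \sigma (\epsilon )\,d\epsilon \), and hence

$$\begin{aligned} \sigma '(\epsilon ') = \lambda \,\sigma (\lambda \epsilon '), \qquad k\log \sigma '(\epsilon ') = k\log \sigma (\epsilon ) + k\log \lambda . \end{aligned}$$

The quantity “\(k\log \sigma \)” therefore shifts by the arbitrary additive amount \(k\log \lambda \) whenever the unit of energy is changed, so it cannot by itself be the value of a physical entropy.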

Our proposed modification of the SHP scenario [14, 15] is to consider, in place of the limit

$$\begin{aligned} {\textit{black hole}} \rightarrow {\textit{long string}}, \end{aligned}$$

the limit

$$\begin{aligned}&{\textit{black hole in equilibrium with thermal atmosphere in a box}} \quad \rightarrow \\&{\textit{long string in equilibrium with atmosphere of small strings in a suitably rescaled box}}. \end{aligned}$$

Fig. 7. The weak string-coupling limit of a black hole in equilibrium with its atmosphere in a suitable box is a long string in equilibrium with its stringy atmosphere in another box

The key fact [12, 13] about a string equilibrium state of this latter type is that (in a certain approximation where we ignore certain power-law prefactors—see Footnote 4) the long string and its stringy atmosphere will have densities of states of the exponential form:

$$\begin{aligned} \sigma _\mathrm {long string}(\epsilon ) \sim c e^{\ell _s\epsilon }, \quad \sigma _\mathrm {stringy atmosphere}(\epsilon ) \sim c' e^{\ell _s\epsilon } \end{aligned}$$
(7)

where the constants c and \(c'\) may be different, but, importantly the exponents are the same.

I have demonstrated (see Sect. 5 for a discussion of the proof) that:

Theorem 1

For any pair of weakly coupled systems (to be called here ‘system’ and ‘bath’) with densities of states as in (7) a randomly chosen pure equilibrium state with total energy E will, with very high probability, have a system-bath entanglement entropy approximately equal to \(k\ell _s E/4\). It will also be such that the reduced states of system and bath separately each have energy E/2 and are each approximately thermal at temperature \(T=1/k\ell _s\).

We apply this theorem, reading ‘long string’ for ‘system’ and ‘stringy atmosphere’ for ‘bath’ (or vice versa), equating the black hole mass, M, with a constant times E, and equating the entanglement entropy of the theorem with the matter-gravity entanglement entropy of the black hole equilibrium state at \(\ell _s=\) constant times GM (as in the unmodified argument). The latter entropy will thus be a constant times \(kGM^2\). Thus we achieve a corrected string explanation of this formula for the black hole entropy which is not subject to the criticism we made of the original SHP approach. Moreover, making the same substitution, \(\ell _s=\) constant times GM, the temperature formula for the reduced states of the long string and of its stringy atmosphere goes over to the temperature formula T = a constant times 1/kGM, which agrees with the Hawking temperature formula (5) (up to a constant) (Footnote 5).
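The two substitutions can be written out explicitly. In the following worked lines, a and b are hypothetical dimensionless constants introduced only to name the unspecified ‘constants’ of the argument, i.e. we write \(E=aM\) and \(\ell _s=bGM\):

$$\begin{aligned} S \approx {k\ell _s E\over 4} = {ab\over 4}\,kGM^2, \qquad T = {1\over k\ell _s} = {1\over b}\,{1\over kGM}. \end{aligned}$$

Both results have the same dependence on M as (6) and (5) respectively; the argument as presented here does not determine a or b.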

That ends my discussion of my matter-gravity entanglement hypothesis and of how it offers a resolution to the three puzzles: the second law puzzle, the black hole information loss puzzle, and the thermal atmosphere puzzle. It also ends my discussion, in this section, of how the hypothesis enables a modification of the SHP string approach to black hole entropy which is free from the criticism (Footnote 6) which I made of the original SHP approach.

In the remainder of the talk I would like to supply some of the details about how I proved the above theorem.

5 Explanations of Thermality: Traditional and Modern

Theorem 1 in fact relies on a general theorem, stated below as Theorem 2, which I obtained [16] in a general setting where one has a total system (in [16] I abbreviate this with the term ‘totem’ and I shall follow that terminology here) consisting of a (quantum) system weakly coupled to an energy bath.

Such a totem will have a Hamiltonian of form

$$\begin{aligned} H=H_\mathrm {system} + H_\mathrm {bath} + H_\mathrm {interaction} \end{aligned}$$

on

$$\begin{aligned} \mathcal {H}_\mathrm {system}\otimes \mathcal {H}_\mathrm {bath} \end{aligned}$$

where \(H_\mathrm {interaction}\) is assumed to be sufficiently weak that it can be ignored for the purposes of counting energy levels, and \(H_\mathrm {system}\) and \(H_\mathrm {bath}\) each have positively supported, locally finite, discrete spectrum with monotonically increasing densities of states,

$$\begin{aligned} \sigma _\mathrm {system}(\epsilon ) \quad \hbox {and}\quad \sigma _\mathrm {bath}(\epsilon ). \end{aligned}$$

Theorem 2 may be considered to generalize a result of Goldstein, Lebowitz, Tumulka and Zanghi (GLTZ) [17] (see also [18]) which explains why it is that a small system in contact with a large energy bath will typically be in an (approximate) thermal equilibrium state. So I will first briefly recall that result.

5.1 Thermality in the Case the System is Small

The GLTZ explanation is itself a modern replacement for the earlier traditional explanation of the thermality of a small system in contact with a heat bath, so let me recall that first (Fig. 8).

The traditional explanation is based on a mathematical theorem which tells us that if the totem is in a microcanonical ensemble with energy in a narrow band around some total energy E, then the small system will be approximately in a thermal equilibrium state with temperature, \(T_\mathrm {system}\) given by the formula (note that the dimensionful argument of the logarithm is innocuous here because the logarithm is differentiated):

$$\begin{aligned} {1 \over kT_\mathrm {system}}={d\over d\epsilon } \log \sigma _\mathrm {bath}(\epsilon )|_{\epsilon =E}. \end{aligned}$$
(8)

The modern explanation (Fig. 9) [17] is based on a mathematical theorem (proven in [17]) that if the totem state is a pure state, randomly chosen from the set of all pure states with totem energy in a narrow band around E (where the random choice is with respect to a natural measure on the set of all these pure states) then the small system will very probably be very close to the same thermal equilibrium state with a temperature given by the same formula (8).

The advantage of the “modern” over the “traditional” point of view is that it bases a theory of how systems get themselves into (approximate) Gibbs states on the same foundational assumption that we usually make for the foundations of quantum mechanics—namely that the total state of a full closed system is a pure (vector) state.
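Formula (8) is easy to evaluate numerically for the two kinds of bath density of states that appear in this talk. The sketch below uses a central finite difference and illustrative parameter values (total energy 50, power-law exponent 10, exponential rate 0.3, in arbitrary units); these numbers and the helper name `inverse_kT` are assumptions made for the example. Multiplicative constants in the density of states are omitted because they drop out of the derivative.

```python
import math

def inverse_kT(log_sigma, energy: float, h: float = 1e-6) -> float:
    """Central-difference estimate of d/d(eps) log sigma(eps) at eps = energy,
    i.e. the right-hand side of formula (8)."""
    return (log_sigma(energy + h) - log_sigma(energy - h)) / (2 * h)

E = 50.0      # total energy, arbitrary units (assumption)
N_BATH = 10   # power-law exponent (assumption)
ELL_S = 0.3   # exponential growth rate, playing the role of ell_s (assumption)

def log_power_law(eps: float) -> float:
    """log of eps^N; a prefactor A would only add the constant log A."""
    return N_BATH * math.log(eps)

def log_exponential(eps: float) -> float:
    """log of exp(ell_s * eps); a prefactor c would only add the constant log c."""
    return ELL_S * eps

beta_power = inverse_kT(log_power_law, E)
beta_exp = inverse_kT(log_exponential, E)

print(f"power law  : 1/kT = {beta_power:.6f}   (N/E   = {N_BATH / E:.6f})")
print(f"exponential: 1/kT = {beta_exp:.6f}   (ell_s = {ELL_S:.6f})")
```

For a power-law bath the temperature depends on the total energy, whereas for an exponential density of states formula (8) gives \(1/kT=\ell _s\) whatever the energy, which is the temperature quoted in Theorem 1.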

5.2 What Happens When System and Energy Bath are of Comparable Size?

One might think that one could apply the GLTZ result directly to the case where our totem is the string equilibrium state illustrated in Fig. 7, identifying, say, the long string with our ‘system’ and the stringy atmosphere with our ‘energy bath’. However, neither of these can be regarded as small with respect to the other. Here we should clarify that ‘small’ in this context would mean having much more widely spaced energy levels, i.e. having a much lower density of states. Instead both densities of states are (ignoring the power-law prefactors I mentioned earlier) of the exponentially increasing form (7).

It turns out in general, that when the system and the energy bath are of comparable size, then—on both the traditional assumption of a totem microcanonical ensemble and the modern assumption of a random total pure state with energy in a small band—it is no longer necessarily the case that either system or energy bath will probably be in a thermal equilibrium state. However, I have shown [16] with regard to the modern approach:

Theorem 2

There is a special density operator (see the Appendix for details)

$$\begin{aligned} \rho ^\mathrm {modapprox}_\mathrm {system} \ \ \hbox {on} \ \ \mathcal {H}_\mathrm {system} \end{aligned}$$
(9)

such that, given a random vector, \(\varPsi \in \mathcal {H}_\mathrm {system}\otimes \mathcal {H}_\mathrm {bath}\), with energy in a narrow band around E, then the partial trace of \(|\varPsi \rangle \langle \varPsi |\) over \(\mathcal {H}_\mathrm {bath}\) is very probably very close to \(\rho ^\mathrm {modapprox}_\mathrm {system}\).

Fig. 8. The traditional explanation of the thermality of a small system

Fig. 9. The modern explanation of the thermality of a small system

(And similarly with system \(\leftrightarrow \) energy bath).

But it is important to realize that when system and energy bath are of comparable size, \(\rho ^\mathrm {modapprox}_\mathrm {system}\) is not always thermal. (And neither, by the way, is the reduced state of the system thermal when the total state is in a traditional microcanonical ensemble.)

E.g. if \(\sigma _\mathrm {system}(\epsilon )\) and \(\sigma _\mathrm {bath}(\epsilon )\) take, respectively, the power law forms \(\sigma _\mathrm {system}(\epsilon )=A_S\epsilon ^{N_S}\), \(\sigma _\mathrm {bath}(\epsilon )=A_B\epsilon ^{N_B}\) (the typical behaviour of ordinary matter when \(N_S\) and \(N_B\) are comparable in size to Avogadro’s number) then the system ‘energy probability density’, \(P_\mathrm {system}(\epsilon )\) [16] will be a Gaussian (in fact the same Gaussian on both traditional and modern assumptions) rather than the Gibbsian distribution characteristic of a thermal state. See Figs. 10 and 11.

Fig. 10. Plot of the energy probability density, \(P_\mathrm {system}(\epsilon )\), when system and energy bath have the same power-law density of states \(\sigma (\epsilon )=A\epsilon ^{N}\) for the (‘unusually’ small) value \(N=10\)

Fig. 11. Plot of the energy probability density, \(P^{\mathrm {Gibbs}}_\mathrm {system,\beta }(\epsilon )\) for the thermal state at inverse temperature, \(\beta \), on our system with density of states \(\sigma (\epsilon )=A\epsilon ^N\), for the same (‘unusually’ small) value \(N=10\) and for \(\beta =22/E\) (i.e. the value of \(\beta \) for which the mean energy is E/2)
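The contrast between the two densities plotted in Figs. 10 and 11 can be reproduced numerically. The sketch below assumes the standard microcanonical form, \(P_\mathrm {system}(\epsilon )\propto \sigma _\mathrm {system}(\epsilon )\sigma _\mathrm {bath}(E-\epsilon )\), for the first density (the text states that traditional and modern assumptions give the same result here) and the form \(\sigma (\epsilon )e^{-\beta \epsilon }\) for the Gibbs density; the grid sizes and the choice E = 1 are arbitrary.

```python
import numpy as np

N = 10    # power-law exponent, as in Figs. 10 and 11
E = 1.0   # total energy in arbitrary units (assumption)

def mean_and_std(x: np.ndarray, p: np.ndarray):
    """Normalize p on the uniform grid x and return (mean, standard deviation)."""
    dx = x[1] - x[0]
    p = p / (p.sum() * dx)
    mean = float((x * p).sum() * dx)
    std = float(np.sqrt(((x - mean) ** 2 * p).sum() * dx))
    return mean, std

# Assumed form: P_system(eps) proportional to eps^N * (E - eps)^N on [0, E].
eps = np.linspace(0.0, E, 20001)
mean_totem, std_totem = mean_and_std(eps, eps**N * (E - eps) ** N)

# Gibbs density proportional to eps^N * exp(-beta * eps), with beta = 22/E.
# It has no upper cutoff at E, so a wider grid is used.
beta = 2 * (N + 1) / E
eps_wide = np.linspace(0.0, 5.0 * E, 100001)
mean_gibbs, std_gibbs = mean_and_std(
    eps_wide, eps_wide**N * np.exp(-beta * eps_wide)
)

print(f"comparable-size case: mean = {mean_totem:.4f} E, std = {std_totem:.4f} E")
print(f"Gibbs, beta = 22/E  : mean = {mean_gibbs:.4f} E, std = {std_gibbs:.4f} E")
```

Both densities have mean energy E/2, but the first is symmetric about E/2 and narrower, while the Gibbs density is skewed and broader, so matching the mean energy does not make the reduced state thermal.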

5.3 The Special Nature of Exponential Densities of States

However, it is shown in [16], regarding the modern approach (Footnote 7):

Theorem 3

When system and energy-bath densities of states both take the exponential form of Eq. (7):

  • \(\rho ^\mathrm {modapprox}_\mathrm {system}\) and \(\rho ^\mathrm {modapprox}_\mathrm {bath}\) are (close to, see Footnote 8) thermal at temperature \(T=1/k\ell _s\). (And each has mean energy E/2.)

  • Also, the system-energy bath entanglement entropy, S, \((=S^\mathrm {vN}(\rho ^\mathrm {modapprox}_\mathrm {system}) =S^\mathrm {vN}(\rho ^\mathrm {modapprox}_\mathrm {bath}))\) is approximately \(k\ell _s E/4\) (Footnote 9).

Theorem 1 of Sect. 4 clearly follows immediately from Theorems 2 and 3.
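For orientation, here is one rough way to see how the values E/2 and \(k\ell _s E/4\) could arise. This is a heuristic dimension-counting estimate under stated assumptions, not the argument of [16]. With the densities of states (7), the product \(\sigma _\mathrm {system}(\epsilon )\sigma _\mathrm {bath}(E-\epsilon )=cc'e^{\ell _s E}\) does not depend on \(\epsilon \), so if the system energy probability density is proportional to this product it is uniform on \([0,E]\), with mean E/2. If one further assumes that, in each energy sector, a random pure state has entanglement entropy close to k times the logarithm of the smaller of the two numbers of available states, and neglects logarithmic corrections, then

$$\begin{aligned} S \approx {k\over E}\int _0^E \ell _s\min (\epsilon , E-\epsilon )\,d\epsilon = {k\ell _s\over E}\cdot {E^2\over 4} = {k\ell _s E\over 4}. \end{aligned}$$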