Abstract
This paper presents a minimal formulation of nonrelativistic quantum mechanics, by which is meant a formulation which describes the theory in a succinct, self-contained, clear, unambiguous and of course correct manner. The bulk of the presentation is the so-called “microscopic theory”, applicable to any closed system S of arbitrary size N, using concepts referring to S alone, without resort to external apparatus or external agents. An example of a similar minimal microscopic theory is the standard formulation of classical mechanics, which serves as the template for a minimal quantum theory. The only substantive assumption required is the replacement of the classical Euclidean phase space by Hilbert space in the quantum case, with the attendant all-important phenomenon of quantum incompatibility. Two fundamental theorems of Hilbert space, the Kochen–Specker–Bell theorem and Gleason’s theorem, then lead inevitably to the well-known Born probability rule. For both classical and quantum mechanics, questions of physical implementation and experimental verification of the predictions of the theories are the domain of the macroscopic theory, which is argued to be a special case or application of the more general microscopic theory.
References
Friedberg, R., Hohenberg, P.: Compatible quantum theory. Rep. Prog. Phys. 77, 092001 (2014)
Kochen, S.: A reconstruction of quantum mechanics. Found. Phys. 45, 557 (2015)
Birkhoff, G., von Neumann, J.: The logic of quantum mechanics. Ann. Math. 37, 823 (1936)
Griffiths, R.B.: Consistent Quantum Theory. Cambridge University Press, Cambridge (2002)
Bell, J.S.: On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys. 38, 447 (1966)
Kochen, S., Specker, E.: The problem of hidden variables in quantum mechanics. J. Math. Mech. 17, 59–87 (1967)
Mermin, N.D.: Simple unified form for the major no-hidden-variables theorems. Phys. Rev. Lett. 65, 3373 (1990)
Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Math. Mech. 6, 885 (1957)
von Neumann, J.: Mathematical Foundations of Quantum Mechanics, vol. 2. Princeton University Press, Princeton (1996)
Preskill, J.: Lecture Notes for Physics 219: Quantum Information and Computation (2015). http://www.theory.caltech.edu/people/preskill/index.html
Bub, J., Pitowsky, I.: In: S. Saunders (ed.) Many Worlds?: Everett, Quantum Theory, and Reality, pp. 433–459. Oxford University Press, Oxford (2010)
Schrödinger, E.: In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 31, pp. 555–563. Cambridge University Press (1935)
Schrödinger, E.: In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 32, pp. 446–452. Cambridge University Press, Cambridge (1936)
Hughston, L.P., Jozsa, R., Wootters, W.K.: A complete classification of quantum ensembles having a given density matrix. Phys. Lett. A 183(1), 14 (1993)
Wootters, W.K., Zurek, W.H.: The no-cloning theorem. Phys. Today 62(2), 76 (2009)
Bub, J.: Quantum probabilities as degrees of belief. Stud. Hist. Philos. Sci. Part B 38(2), 232 (2007)
Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics, vol. 3. Addison-Wesley, Reading, MA (2006)
Zeh, H.D.: On the interpretation of measurements in quantum theory. Found. Phys. 1, 69 (1970)
Zurek, W.H.: Decoherence, einselection, and the quantum origins of the classical. Rev. Mod. Phys. 75, 715 (2003)
Gell-Mann, M., Hartle, J.B.: Classical equations for quantum systems. Phys. Rev. D 47, 3345 (1993)
Landau, L., Lifshitz, E.: Course of Theoretical Physics, vol. 3: Quantum Mechanics: Non-relativistic Theory. Pergamon Press, Oxford (1965)
Peres, A.: Quantum Theory: Concepts and Methods, vol. 72. Springer, Berlin (1995)
Bell, J.: Against ‘measurement’. Phys. World 3(8), 33 (1990)
Brukner, Č., Zeilinger, A.: Operationally invariant information in quantum measurements. Phys. Rev. Lett. 83(17), 3354 (1999)
Fuchs, C.A.: arXiv:1003.5209 (2010)
Fuchs, C.A., Mermin, N.D., Schack, R.: An introduction to QBism with an application to the locality of quantum mechanics. Am. J. Phys. 82(8), 749 (2014)
Dirac, P.A.M.: The Principles of Quantum Mechanics. Oxford University Press, Oxford (1981)
Everett, H., III: Relative state formulation of quantum mechanics. Rev. Mod. Phys. 29, 454 (1957)
Griffiths, R.B.: Consistent histories and the interpretation of quantum mechanics. J. Stat. Phys. 36, 219 (1984)
Omnès, R.: Consistent interpretations of quantum mechanics. Rev. Mod. Phys. 64(2), 339 (1992)
de Broglie, L.: The principles of the new undulatory mechanics. J. Phys. Rad. 7, 321 (1926)
Bohm, D.: A suggested interpretation of the quantum theory in terms of “hidden” variables. I. Phys. Rev. 85, 180 (1952)
Ghirardi, G.C., Rimini, A., Weber, T.: Unified dynamics for microscopic and macroscopic systems. Phys. Rev. D 34(2), 470 (1986)
Hartle, J.B.: arXiv preprint arXiv:gr-qc/0508001 (2005)
Appendices
Appendix A: Set Theory, Classical Logic and Probability Theory
In this appendix we provide a brief summary of set theory, classical (Aristotelian) logic and classical probability theory, and we show how the three are formally related.
1.1 Set Theory
For simplicity we consider a discrete set \(\varOmega \) of N elements \(x\in \varOmega \). The subsets \(A,B,\ldots \) of \(\varOmega \) form a set \(\mathcal {B}(\varOmega )\) of sets (in set theory, a field of sets) for which the operations of union \(\cup \), intersection \(\cap \) and complement \(\sim \) obey the axioms of set theory:
where \(\emptyset \) is the empty subset of \(\varOmega \).
1.2 Classical Logic
The subsets \(A,B,\ldots \) can also be considered as logical propositions, in which case the operations of set theory become logical operations
Under the replacements Eqs. A.2, Eqs. A.1 become the usual axioms of propositional calculus
In particular Eq. A.1e becomes the distributive law Eq. A.3e. The set \({\mathcal {B}(\varOmega )}\) of \(2^N\) propositions forms a Boolean algebra under the logical operations. On this algebra we can define truth functions \(\mathcal {T}(A)\) with values 1 (True) and 0 (False). Such truth functions must agree with the standard truth tables for the logical functions, which imply the algebraic relations
\[\mathcal {T}(\lnot A) = 1 - \mathcal {T}(A), \tag{A.4a}\]
\[\mathcal {T}(A \wedge B) = \mathcal {T}(A)\,\mathcal {T}(B), \tag{A.4b}\]
\[\mathcal {T}(A \vee B) = \mathcal {T}(A) + \mathcal {T}(B) - \mathcal {T}(A \wedge B), \tag{A.4c}\]
and must also satisfy \(\mathcal {T}(\varOmega )=1\), \(\mathcal {T}(\emptyset )=0\). We shall refer to these equations as ‘truth table relations’.
Let us consider in particular those subsets of \(\varOmega \) containing only one member, that is sets of the form \(\{x\}\) where \(x\in \varOmega \). We may call them atomic sets, and the corresponding logical propositions atomic propositions. Then by applying Eq. A.4 we find that any truth function must assign the value 1 to some atomic proposition and 0 to all the others. We shall denote the truth function that assigns 1 to a particular \(\{x\}\) by \(\mathcal{T}_x\). Then for \(x, y \in \varOmega \)
\[\mathcal{T}_x(\{y\}) = \delta _{xy}. \tag{A.5}\]
Again applying Eq. A.4, we see that for any \(A\in \mathcal {B}(\varOmega )\),
\[\mathcal{T}_x(A) = \begin{cases} 1 & \text{if } x\in A,\\ 0 & \text{otherwise.} \end{cases} \tag{A.6}\]
We may say that x is the “source of truth” for the atomic truth function \(\mathcal{T}_x\).
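The role of the atomic truth functions can be checked by brute force on a small sample space. The following is a minimal sketch, assuming the truth-table relations take their standard forms \(\mathcal {T}(\lnot A) = 1 - \mathcal {T}(A)\), \(\mathcal {T}(A\wedge B) = \mathcal {T}(A)\mathcal {T}(B)\) and \(\mathcal {T}(A\vee B) = \mathcal {T}(A)+\mathcal {T}(B)-\mathcal {T}(A\wedge B)\); the three-element set and all function names are hypothetical choices made for illustration.

```python
from itertools import combinations

def powerset(omega):
    """All subsets of omega as frozensets: the field of sets B(Omega)."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def atomic_truth(x):
    """Atomic truth function T_x: T_x(A) = 1 if x is in A, else 0."""
    return lambda A: 1 if x in A else 0

def satisfies_truth_tables(T, omega):
    """Check the assumed truth-table relations on every pair of propositions."""
    omega = frozenset(omega)
    subsets = powerset(omega)
    if T(omega) != 1 or T(frozenset()) != 0:
        return False
    for A in subsets:
        if T(omega - A) != 1 - T(A):                # negation = complement
            return False
        for B in subsets:
            if T(A & B) != T(A) * T(B):             # conjunction = intersection
                return False
            if T(A | B) != T(A) + T(B) - T(A & B):  # disjunction = union
                return False
    return True

OMEGA = {"a", "b", "c"}  # hypothetical sample space with N = 3

# Every atomic truth function passes the check.
all_atomic_ok = all(satisfies_truth_tables(atomic_truth(x), OMEGA)
                    for x in OMEGA)

# A candidate with two "sources of truth" fails: it sets T({a}) = T({b}) = 1
# although {a} AND {b} is the empty (false) proposition.
two_sources = lambda A: 1 if ("a" in A or "b" in A) else 0
two_sources_ok = satisfies_truth_tables(two_sources, OMEGA)
```

In this sketch each of the three atomic truth functions satisfies all relations on the eight propositions, while the two-source candidate is rejected, in line with the statement that exactly one atomic proposition is assigned the value 1.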
1.3 Probability Theory
Truth functions can be generalized by introducing a probability function \(\mathcal {P}(A)\) from \(\mathcal {B}(\varOmega )\) to the unit interval [0, 1]. One first defines a measure as a function \(\mathcal {M}\) from \(\mathcal {B}(\varOmega )\) to the interval \([0,\infty ]\), which satisfies the linearity condition for countable sets of disjoint subsets \(A^{(i)}\),
\[\mathcal {M}\Big (\bigcup _i A^{(i)}\Big ) = \sum _i \mathcal {M}(A^{(i)}). \tag{A.7}\]
A probability measure or probability function (classically, these two ideas can be identified) is a measure which satisfies the additional condition
\[\mathcal {P}(\varOmega ) = 1. \tag{A.8}\]
(The relation \(\mathcal {P}(\emptyset )=0\) is already implied by Eq. A.7). In the context of probability theory, the set \(\varOmega \) is the ‘sample space’ of the probability measure and the elements \(A^{(i)}\in \mathcal {B}(\varOmega )\) are called ‘events’. It can be shown that for any two events A, B, Eq. A.7 implies the relation
\[\mathcal {P}(A \cup B) = \mathcal {P}(A) + \mathcal {P}(B) - \mathcal {P}(A \cap B). \tag{A.9}\]
The converse is true for a finite \(\varOmega \). We shall sometimes refer to Eqs. A.7 and A.8 as the ‘Kolmogorov conditions’, and to Eq. A.9 as the ‘Kolmogorov overlap equation’.
One notices a similarity between Eqs. A.9 and A.4c. Indeed, in [1, Appendix A] the authors explained in what sense the probability function \(\mathcal {P}\) can be thought of as a ‘distributed truth function’.
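One concrete way to read the phrase ‘distributed truth function’ is as a weighted mixture of atomic truth functions, \(\mathcal {P}(A) = \sum _x w_x \mathcal{T}_x(A)\) with nonnegative weights summing to 1. The sketch below is an illustration under that assumption, not a restatement of the argument in [1]; the weights and names are hypothetical. It checks normalization, additivity on disjoint events, and the overlap relation \(\mathcal {P}(A\cup B) = \mathcal {P}(A)+\mathcal {P}(B)-\mathcal {P}(A\cap B)\) exactly, using rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def powerset(omega):
    """All subsets of omega as frozensets."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def probability(weights):
    """P(A) = sum over x of weights[x] * T_x(A), where T_x(A) = 1 iff x in A."""
    return lambda A: sum((w for x, w in weights.items() if x in A),
                         Fraction(0))

# Hypothetical weights on a three-element sample space; they sum to 1.
WEIGHTS = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
OMEGA = frozenset(WEIGHTS)
P = probability(WEIGHTS)
subsets = powerset(OMEGA)

# Normalization of the probability function.
normalized = P(OMEGA) == 1 and P(frozenset()) == 0

# Additivity on disjoint events.
additive_ok = all(P(A | B) == P(A) + P(B)
                  for A in subsets for B in subsets if not (A & B))

# Overlap relation for arbitrary pairs of events.
overlap_ok = all(P(A | B) == P(A) + P(B) - P(A & B)
                 for A in subsets for B in subsets)
```

Setting one weight to 1 and the others to 0 recovers an atomic truth function, which is the sense in which the truth-table relation for disjunction and the overlap relation share the same form.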
Let us mention one feature of probability functions which is often obscured, namely that (in a finite-dimensional sample space) defining a probability function that assigns a relative likelihood to the members of a sample space does not alter the fact that one and only one of its members is true, while all the others are false.
It should be noted, moreover, that our definitions of probability and truth are formal ones, and they are thus consistent with either a frequentist or a Bayesian approach to probabilities. At this stage we are not inquiring into the relationship of probabilities to the “real world”, which is where such distinctions arise. The connection between truth and probability explored above exists already on the formal level and is therefore independent of any real-world interpretation of probability.
Appendix B: Noncontextual Network Theorem
Suppose we have a noncontextual network \(\mathcal {N}\) of probability functions, i.e. a set of functions \(P_\mathcal{F}\) with sample spaces \(\mathcal {F}\), such that if A belongs both to \(\mathcal{F}_1\) and to \(\mathcal{F}_2\), then
\[P_{\mathcal{F}_1}(A) = P_{\mathcal{F}_2}(A). \tag{B.1}\]
We show here that the probability functions \(P_\mathcal{F}\) must satisfy the Born Rule Eq. 5, for some density matrix \(\rho \).
To prove the theorem we first define a quantum probability measure \({\mathcal M}_\mathcal{N}\). For each \(A \in Q(\mathcal {H})\), let \(\mathcal{F}_A\) be the framework consisting of 1, \(\emptyset \), A, and \(\lnot A\). Then we define
\[{\mathcal M}_\mathcal{N}(A) = P_{\mathcal{F}_A}(A). \tag{B.2}\]
Is \({\mathcal M}_\mathcal{N}\) a quantum probability measure? That depends on whether the additivity condition
is satisfied. So let \(\{A\} = \{A_1, A_2, \ldots \}\) be a finite or countably infinite set of properties that are mutually orthogonal, that is \(A_i \perp A_j\) for \(i\ne j\) as in the condition of Eq. B.3; and let \(\text {Sp}(\{A\})\) be the span of all the members of \(\{A\}\). If we define \(A_0 = \lnot \text {Sp}(\{A\})\), then \(S_{\{A\}} = \{A_0, A_1, A_2,\ldots \}\) is a sample space. This sample space is the basis of a framework which we shall call \(\mathcal{F}_{\{A\}}\). Since \(P_{\mathcal{F}_{\{A\}}}\) is a probability function, Eq. A.7 above tells us that
But since any \(A\in \mathcal{F}_{\{A\}}\) belongs to both \(\mathcal{F}_{\{A\}}\) and \(\mathcal{F}_{A_i}\), we can write Eq. B.4 as
in view of Eq. B.1. Then using Eq. B.2, we have
which is Eq. B.3. Therefore \({\mathcal M}_\mathcal{N}\) is a quantum probability measure. Its normalization \({\mathcal M}_\mathcal{N}(\mathcal{H}) = 1 \) can be deduced from the completeness of the sample space in any framework. Then from Gleason’s theorem we get the Born rule Eq. 5.
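The theorem runs from noncontextuality to the Born rule. The converse direction is elementary and can be checked numerically: probabilities of the form \(Tr(\rho A)\) automatically form a noncontextual, additive network. The sketch below assumes a hypothetical real 3-dimensional Hilbert space, a made-up density matrix `RHO`, and two frameworks that share the property `A1`; none of these specifics come from the text, and exact rational arithmetic is used so that equalities hold exactly.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def madd(a, b):
    """Entrywise sum of two square matrices."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def born(rho, prop):
    """Born-rule value Tr(rho * prop) for a projector prop."""
    return trace(matmul(rho, prop))

h = F(1, 2)
# Hypothetical density matrix: symmetric, eigenvalues 2/3, 1/6, 1/6, trace 1.
RHO = [[F(5, 12), F(1, 4), F(0)],
       [F(1, 4), F(5, 12), F(0)],
       [F(0), F(0), F(1, 6)]]

# Framework F1: projectors onto the standard basis vectors.
A1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
A2 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
A3 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
# Framework F2: shares A1, but uses rotated projectors onto
# (0,1,1)/sqrt(2) and (0,1,-1)/sqrt(2) in the complement of A1.
B2 = [[0, 0, 0], [0, h, h], [0, h, h]]
B3 = [[0, 0, 0], [0, h, -h], [0, -h, h]]

frameworks = {"F1": [A1, A2, A3], "F2": [A1, B2, B3]}
probs = {name: [born(RHO, a) for a in space]
         for name, space in frameworks.items()}

# Each framework carries a normalized probability function.
normalized = all(sum(p) == 1 for p in probs.values())
# Noncontextuality: the shared property A1 has the same value in both.
noncontextual = probs["F1"][0] == probs["F2"][0]
# Additivity: A2 + A3 and B2 + B3 are the same property (the complement of A1).
same_span = madd(A2, A3) == madd(B2, B3)
additive = (born(RHO, madd(A2, A3))
            == probs["F1"][1] + probs["F1"][2]
            == probs["F2"][1] + probs["F2"][2])
```

The nontrivial content of the theorem is the other direction: via Gleason's theorem, no assignment other than one of this trace form can satisfy these conditions.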
Appendix C: Kochen’s Proof of an Important Theorem
In [2, Sect. 8.1], a theorem is stated and proved: that given a state p and a property y such that \(p(y)\ne 0\), a unique new state \(p_y\) exists such that \(p_y(x) = p(x\wedge y)/p(y)\) for any x belonging to some \(\sigma \)-algebra B that contains y; and that the density operator of \(p_y\) is given uniquely by \(w_y = ywy/Tr(ywy)\) where w is the density operator associated with p. The proof is valid but extremely terse. For better comprehension we give here an expanded version of the theorem and the proof.
1.1 Preliminary Definitions
In [2, Sect. 3.1] we read,
“A state of a system with a \(\sigma \)-complex of properties Q is a map \(p: Q \rightarrow [0,1]\) such that the restriction of p to any \(\sigma \)-algebra B in Q is a probability measure on B.”
Looking higher in the same section, we see that a probability measure on B is a function \(p: B \rightarrow [0,1]\) such that
\[p(I) = 1, \tag{C.1}\]
where I is the identity operator, and
\[p\Big (\bigvee _i a_i\Big ) = \sum _i p(a_i) \tag{C.2}\]
for any mutually disjoint set of \(a_i\) all belonging to B.
But if the \(a_i\) are properties, disjoint simply means orthogonal; and for orthogonal properties \(x_i \vee x_j = x_i + x_j\). So the definition of a state can be rewritten
A state of a system with a \(\sigma \)-complex of properties Q is a map \(p: Q \rightarrow [0,1]\) such that Eq. C.1 holds and
\[p(x_1 + x_2 + \cdots ) = p(x_1) + p(x_2) + \cdots \tag{C.3}\]
for any mutually orthogonal \(x_1, x_2, \ldots \). (It is trivial that such a set generates a \(\sigma \)-algebra.)
Now looking lower, we see that if \(Q = Q(\mathcal H)\) where \(\mathcal H\) is a Hilbert space of dimension \(>2\), Gleason’s theorem tells us that for any state p there exists a unique density operator w (nonnegative Hermitian operator of trace 1) on \(\mathcal H\) such that
\[p(x) = \mathrm{Tr}(wx) \tag{C.4}\]
for all \(x\in Q\).
1.2 Statement of Theorem
In [2, Sect. 8.1], Kochen states and proves the following theorem. (We replace some of his notation with our own.)
If p is a state on \(Q(\mathcal H)\) and \(y\in Q(\mathcal H)\) such that \(p(y) \ne 0\), then there exists a unique state \(p_y\) conditioned on y. If w is the density operator corresponding under Eq. C.4 to p, then ywy / Tr(ywy) corresponds to the state \(p_y\).
To understand this statement one must look back a couple of paragraphs to the general \(\sigma \)-complex Q:
“Let p be a state on a \(\sigma \)-complex Q and \(y\in Q\) such that \(p(y)\ne 0\). By a state conditioned on y we mean a state \(p_y\) such that for every \(\sigma \)-algebra B in Q containing y and every \(x\in B\),
\[p_y(x) = \frac{p(x\wedge y)}{p(y)}. \tag{C.5}\]”
This permits us to restate the theorem as follows:
Let it be given that p is a state on \(Q(\mathcal H)\) as defined in the previous section, and that (p, w) satisfy Eq. C.4. Let it also be given that \(y\in Q(\mathcal H)\) and that
\[p(y) \ne 0. \tag{C.6}\]
Then there exists a state \(p_y\) on \(Q(\mathcal H)\) such that Eq. C.5 holds for every \(x\in Q(\mathcal H)\) that belongs to some \(\sigma \)-algebra B that also contains y. Moreover, there exists only one such state, and it is given by the formula
\[p_y(x) = \mathrm{Tr}(w_y x), \tag{C.7}\]
where
\[w_y = \frac{ywy}{\mathrm{Tr}(ywy)}. \tag{C.8}\]
[It is important to note that the assertion that \(p_y\) is a state implies that \(p_y(x)\) is defined for every property x, whether or not it commutes with y. But Eq. C.5 is asserted only for certain properties x. In fact, one might as well say that Eq. C.5 is asserted for all x compatible with y, since if x and y commute one can easily construct a \(\sigma \)-algebra containing them both.
It is also essential that Eq. C.6 be assumed, since otherwise the right side of Eq. C.5 could be 0/0.
Finally, Eq. C.7 is meant to hold for all x, not only those satisfying Eq. C.5.]
1.3 Proof of Theorem (Existence)
The proof of the theorem is in two halves. First, we must show that there exists a state \(p_y\) such that Eq. C.5 holds for all x that commute with y. And second, we must show that any such \(p_y\) must satisfy Eqs. C.7 and C.8. In both halves it is given to begin with that p is a state and that y is a property satisfying Eq. C.6.
To show that a state \(p_y\) exists with the desired feature, we simply exhibit one as defined by Eqs. C.7 and C.8. We must then show that \(p_y\) so defined is a state, and also that it satisfies Eq. C.5 for all x compatible with y. The latter statement is easily proved. Combining Eqs. C.7 and C.8, we have
\[p_y(x) = \frac{\mathrm{Tr}(ywyx)}{\mathrm{Tr}(ywy)} \tag{C.9}\]
for all x. The denominator is equal to \(Tr(wyy) = Tr(wy) = p(y)\), which is nonzero by Eq. C.6; therefore Eq. C.9 is defined. Then if we specialize to x commuting with y, the numerator in Eq. C.9 can be written as \(Tr(ywxy) = Tr(wxyy) = Tr (wxy) = p(xy)\); but for commuting x and y, xy is the same as \(x\wedge y\) and so Eq. C.5 is satisfied.
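It remains to show that \(p_y\) as defined by Eq. C.9 is a state. This means verifying Eqs. C.1 and C.3 with \(p_y\) in place of p. As for Eq. C.1, we have from Eq. C.9
\[p_y(I) = \frac{\mathrm{Tr}(ywy)}{\mathrm{Tr}(ywy)} = 1, \tag{C.10}\]
again appealing to Eq. C.6. As for Eq. C.3, the equation
\[p_y(x_1 + x_2 + \cdots ) = p_y(x_1) + p_y(x_2) + \cdots \tag{C.11}\]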
follows directly from Eq. C.7 since the right side of Eq. C.7 is linear in x. Therefore \(p_y\) is a state.
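The existence argument can be traced on a small example. The following is a minimal sketch under stated assumptions: a hypothetical real 3-dimensional Hilbert space, a made-up density matrix `W`, a rank-2 projector `Y`, and a projector `X` that commutes with `Y`. It builds the candidate operator \(ywy/Tr(ywy)\) and checks that the resulting function is normalized and reproduces \(p(x\wedge y)/p(y)\) for the commuting pair, where \(x\wedge y = xy\).

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def prob(density, x):
    """State value Tr(density * x) for a projector x."""
    return trace(matmul(density, x))

def conditioned_density(w, y):
    """Candidate w_y = y w y / Tr(y w y); needs p(y) = Tr(w y) != 0."""
    ywy = matmul(matmul(y, w), y)
    norm = trace(ywy)
    if norm == 0:
        raise ValueError("p(y) = 0: conditioning on y is undefined")
    return [[entry / norm for entry in row] for row in ywy]

# Hypothetical density matrix: symmetric, eigenvalues 2/3, 1/6, 1/6, trace 1.
W = [[F(5, 12), F(1, 4), F(0)],
     [F(1, 4), F(5, 12), F(0)],
     [F(0), F(0), F(1, 6)]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Y = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # rank-2 property, p(y) = 5/6
X = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # rank-1 property commuting with Y

W_Y = conditioned_density(W, Y)

commute = matmul(X, Y) == matmul(Y, X)
normalized = prob(W_Y, IDENTITY) == 1
lhs = prob(W_Y, X)                          # p_y(x) = Tr(w_y x)
rhs = prob(W, matmul(X, Y)) / prob(W, Y)    # p(x AND y) / p(y)
```

Linearity of the trace then gives additivity of \(p_y\) on orthogonal properties without further computation, which is the remaining requirement for \(p_y\) to be a state.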
1.4 Proof of Theorem (Uniqueness)
In this section we assume as before that p is a state and that y is a property obeying Eq. C.6. We assume also that \(p_y\) is some state for which Eq. C.5 holds whenever x commutes with y. Then from Gleason’s theorem we know that a density operator \(w_y\) exists satisfying Eq. C.7 for all x.
But we do not assume Eq. C.8; we regard the form of \(w_y\) as unknown. We shall then derive an expression for \(p_y(x)\), holding for any 1-dimensional x, from which \(w_y\) has been eliminated. (This is the heart of Kochen’s proof. It shows that the state \(p_y\) is unique, since its values on arbitrary x can be obtained from those on 1-dimensional x by linearity.)
If x is 1-dimensional, it can be written as \(|\phi \rangle \langle \phi |\) for some ket \(|\phi \rangle \) for which \(\langle \phi | \phi \rangle = 1\). Then Eq. C.7 becomes
\[p_y(x) = \langle \phi | w_y | \phi \rangle . \tag{C.12}\]
Since y and its complement \(y^{\perp }\) span \(\mathcal H\), we have
\[| \phi \rangle = y| \phi \rangle + y^{\perp }| \phi \rangle . \tag{C.13}\]
By substituting Eq. C.13 for \(| \phi \rangle \) in Eq. C.12, we obtain \(p_y(x)\) as the sum of four terms, three of which contain the factor \(w_y y^{\perp }| \phi \rangle \) or else \(\langle \phi | y^{\perp } w_y\). These three vanish, for the following reason. Since y and \(y^{\perp }\) commute, we may substitute \(y^{\perp }\) for x in Eq. C.5, obtaining
\[p_y(y^{\perp }) = \frac{p(y^{\perp } \wedge y)}{p(y)} = \frac{p(\emptyset )}{p(y)} = 0, \tag{C.14}\]
where \(\emptyset \) is the empty set. (\(p(\emptyset )\) must vanish, otherwise Eq. C.3 would lead to a contradiction.) On the other hand, since Eq. C.7 applies to all x we can apply it to \(y^{\perp }\), obtaining
\[p_y(y^{\perp }) = \mathrm{Tr}(w_y y^{\perp }), \tag{C.15}\]
and comparing Eqs. C.14 and C.15 we have
\[\mathrm{Tr}(w_y y^{\perp }) = 0. \tag{C.16}\]
Now from Eq. C.16 it follows that \(w_y y^{\perp } |\phi \rangle = 0\). This implication is not trivial, and we shall give the reasoning as a lemma.
Lemma 1
If v is a density operator and z is a projector, then either \(vz = 0\) or \(Tr(vz) > 0\).
Proof
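z can be decomposed into a sum of mutually orthogonal one-dimensional projectors \(z_k\), so that \(vz = \sum _k vz_k\) and \(Tr(vz) = \sum _k Tr(vz_k)\) with every term nonnegative; the lemma will therefore be true of z if it is true of each \(z_k\).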
Therefore without loss of generality we can take z to be one-dimensional: \(z = |\xi \rangle \langle \xi |\) where \(\langle \xi |\xi \rangle = 1.\)
Express v in matrix form so that it is diagonal: \(v_{ab} = n_a \delta _{a,b}\) where all \(n_a \ge 0\) and \(\sum _a n_a = 1\). Then \(|\xi \rangle \) is a column matrix whose a’th element is \(\xi _a\), and \(\sum _a |\xi _a|^2 = 1.\)
We then have \(z_{ab} = \xi _a \xi _b^*\), and \((vz)_{ab} = n_a \xi _a \xi _b^*.\) Hence \(Tr(vz) = \sum _a n_a \xi _a \xi _a^*.\) Each term of the trace is nonnegative, and therefore \(Tr(vz) > 0\) unless, for each a, \(n_a \xi _a \xi _a^* = 0\). But in that case, \(|n_a \xi _a|^2 = 0\), hence \(n_a \xi _a = 0\) and therefore \(n_a \xi _a \xi _b^* = 0\) for all b; therefore \(vz = 0\). The lemma is proved.
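The dichotomy in Lemma 1 depends on the nonnegativity of v, which a small numerical contrast makes visible. The sketch below uses hypothetical 3×3 real matrices chosen for illustration: a density matrix `V` with a zero eigenvalue, two projectors, and a Hermitian trace-1 matrix `H` that is not nonnegative and for which the dichotomy fails.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def is_zero(a):
    return all(entry == 0 for row in a for entry in row)

# Hypothetical density matrix: eigenvalues 4/5, 1/5 and 0, trace 1.
V = [[F(1, 2), F(3, 10), 0],
     [F(3, 10), F(1, 2), 0],
     [0, 0, 0]]
Z_NULL = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]   # projector onto the null direction
Z_LIVE = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # projector overlapping the support

# Case 1: Tr(vz) = 0, and then the whole product vz vanishes.
null_trace = trace(matmul(V, Z_NULL))
null_product_is_zero = is_zero(matmul(V, Z_NULL))

# Case 2: vz != 0, and then Tr(vz) > 0.
live_trace = trace(matmul(V, Z_LIVE))
live_product_is_zero = is_zero(matmul(V, Z_LIVE))

# Contrast: Hermitian with trace 1 but eigenvalue -1, so not a density matrix.
H = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
bad_trace = trace(matmul(H, Z_LIVE))
bad_product_is_zero = is_zero(matmul(H, Z_LIVE))
```

For `H` the trace vanishes while the product does not, so the step from a vanishing trace to a vanishing operator product genuinely uses the fact that a density operator has no negative eigenvalues.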
If \(v = w_y\), \(z = y^{\perp }\), Eq. C.16 implies \(w_y y^{\perp } = 0\). Hence in the expansion of Eq. C.12 the three terms containing \(w_y y^{\perp } |\phi \rangle \) or its dual vanish, and we are left with
\[p_y(x) = \langle \phi | y w_y y | \phi \rangle . \tag{C.17}\]
To eliminate \(w_y\), we shall first prove a second lemma:
Lemma 2
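\[\langle \phi | y w_y y | \phi \rangle = \langle \phi | y | \phi \rangle \, \mathrm{Tr}(w_y u), \tag{C.18}\]
where u is the projector onto \(y|\phi \rangle \). (Thus
\[u = \nu \, y | \phi \rangle \langle \phi | y \tag{C.19}\]
for some nonzero real number \(\nu \).)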
Proof
If \(y|\phi \rangle =0\) then \(u=0\) and both sides of Eq. C.18 vanish.
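Otherwise \(u\ne 0\) and let \(\mu = \langle \phi |y|\phi \rangle = \langle \phi |yy|\phi \rangle \). We have
\[u = u^2 = \nu ^2 \, y | \phi \rangle \langle \phi | y y | \phi \rangle \langle \phi | y = \nu \mu \, u, \tag{C.20}\]
so that \(\nu \mu = 1\) and Eq. C.19 becomes
\[u = \frac{y | \phi \rangle \langle \phi | y}{\mu }. \tag{C.21}\]
Let \(|\chi _0\rangle = y|\phi \rangle /\sqrt{\mu }\) so that \(\langle \chi _0 |\chi _0\rangle = 1.\) Then
\[u = | \chi _0 \rangle \langle \chi _0 |. \tag{C.22}\]
Let \(|\chi _0\rangle , |\chi _1\rangle , |\chi _2\rangle , \ldots \) be a complete orthonormal basis for \(\mathcal H\). Then
\[\mathrm{Tr}(w_y u) = \sum _i \langle \chi _i | w_y u | \chi _i \rangle . \tag{C.23}\]
But by Eq. C.22 we have
\[u | \chi _i \rangle = \delta _{i0} | \chi _0 \rangle , \tag{C.24}\]
and therefore
\[\langle \phi | y w_y y | \phi \rangle = \mu \langle \chi _0 | w_y | \chi _0 \rangle = \mu \, \mathrm{Tr}(w_y u), \tag{C.25}\]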
which is just Eq. C.18. Lemma 2 is proved. \(\square \)
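Therefore Eq. C.17 becomes
\[p_y(x) = \langle \phi | y | \phi \rangle \, \mathrm{Tr}(w_y u). \tag{C.26}\]
Now, comparing the projectors u and y, we have
\[y | \chi _0 \rangle = | \chi _0 \rangle = u | \chi _0 \rangle , \tag{C.27}\]
whereas \(u | \chi _i\rangle = 0\) for any \(i\ne 0\). Therefore \(\langle \chi _i |y - u |\chi _i \rangle \ge 0\) for all i, or simply \(y\ge u\). Hence u and y commute and
\[uy = yu = u. \tag{C.28}\]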
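It follows that (taking B as the \(\sigma \)-algebra generated by u and y) u may be substituted for x in Eq. C.5, yielding
\[p_y(u) = \frac{p(u \wedge y)}{p(y)} = \frac{p(u)}{p(y)}, \tag{C.29}\]
which exists by Eq. C.6. Likewise, Eq. C.7 (postulated to hold for all x) holds for u in place of x:
\[p_y(u) = \mathrm{Tr}(w_y u). \tag{C.30}\]
Comparing Eq. C.30 with Eq. C.29, we obtain
\[\mathrm{Tr}(w_y u) = \frac{p(u)}{p(y)}, \tag{C.31}\]
so that Eq. C.26 becomes
\[p_y(x) = \langle \phi | y | \phi \rangle \, \frac{p(u)}{p(y)}. \tag{C.32}\]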
This is the promised expression for \(p_y(x)\) that does not involve \(w_y\). (We have nowhere invoked Eq. C.8.) However, it holds only for 1-dimensional x (of the form \(| \phi \rangle \langle \phi |\)). To show that Eq. C.32 suffices to define the whole state \(p_y\), we observe that any x is the sum of 1-dimensional projectors and that Eq. C.7 is linear in x no matter what form \(w_y\) takes. Hence Eq. C.32 establishes the uniqueness of \(p_y\) satisfying the conditions at the start of this section. \(\square \)
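The elimination of \(w_y\) can be checked on a concrete case. For a 1-dimensional \(x = |\phi \rangle \langle \phi |\), the argument above amounts to the identity \(\langle \phi |w_y|\phi \rangle = \langle \phi |y|\phi \rangle \, p(u)/p(y)\), where u is the projector onto \(y|\phi \rangle \) and the right side refers only to the original state p. The sketch below verifies this for a hypothetical real 3-dimensional example; the density matrix `W`, the projector `Y` and the unit vector `PHI` are made up for illustration, and real vectors are used so that no complex conjugation is needed.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def inner(a, b):
    """Real inner product <a|b>."""
    return sum(x * y for x, y in zip(a, b))

def prob(density, x):
    return trace(matmul(density, x))

# Hypothetical inputs: density matrix W, rank-2 projector Y, unit vector PHI.
W = [[F(5, 12), F(1, 4), F(0)],
     [F(1, 4), F(5, 12), F(0)],
     [F(0), F(0), F(1, 6)]]
Y = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
PHI = [F(2, 3), F(1, 3), F(2, 3)]          # <phi|phi> = 1

# Conditioned density operator w_y = y w y / Tr(y w y).
ywy = matmul(matmul(Y, W), Y)
W_Y = [[entry / trace(ywy) for entry in row] for row in ywy]

# Projector u onto y|phi>: u = y|phi><phi|y / mu with mu = <phi|y|phi>.
y_phi = apply(Y, PHI)
mu = inner(PHI, y_phi)
U = [[a * b / mu for b in y_phi] for a in y_phi]

u_is_projector = matmul(U, U) == U
u_below_y = matmul(U, Y) == U and matmul(Y, U) == U

lhs = inner(PHI, apply(W_Y, PHI))          # p_y(x) = <phi|w_y|phi>
rhs = mu * prob(W, U) / prob(W, Y)         # <phi|y|phi> p(u) / p(y)
```

Because the right side never mentions \(w_y\), any state conditioned on y must take this value on every 1-dimensional property, which is the content of the uniqueness half of the proof.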
1.5 Unique Form of Density Matrix
In Sect. 1.3 we showed that a legitimate state \(p_y\) exists, satisfying the conditions imposed for conditioning p on y, whose density operator is given by Eq. C.8. In Sect. 1.4 we showed that no other state can satisfy these conditions. But Gleason’s theorem tells us that a given state can have only one density operator. Hence in fact Eq. C.8 does describe the density operator associated with the state \(p_y\) whose uniqueness is shown in Sect. 1.4.
This completes the entire proof of Kochen’s theorem in [2, Sect. 8.1].
Cite this article
Friedberg, R., Hohenberg, P.C. What is Quantum Mechanics? A Minimal Formulation. Found Phys 48, 295–332 (2018). https://doi.org/10.1007/s10701-018-0145-4
DOI: https://doi.org/10.1007/s10701-018-0145-4