What is Quantum Mechanics? A Minimal Formulation

Foundations of Physics

Abstract

This paper presents a minimal formulation of nonrelativistic quantum mechanics, by which is meant a formulation which describes the theory in a succinct, self-contained, clear, unambiguous and of course correct manner. The bulk of the presentation is the so-called “microscopic theory”, applicable to any closed system S of arbitrary size N, using concepts referring to S alone, without resort to external apparatus or external agents. An example of a similar minimal microscopic theory is the standard formulation of classical mechanics, which serves as the template for a minimal quantum theory. The only substantive assumption required is the replacement of the classical Euclidean phase space by Hilbert space in the quantum case, with the attendant all-important phenomenon of quantum incompatibility. Two fundamental theorems of Hilbert space, the Kochen–Specker–Bell theorem and Gleason’s theorem, then lead inevitably to the well-known Born probability rule. For both classical and quantum mechanics, questions of physical implementation and experimental verification of the predictions of the theories are the domain of the macroscopic theory, which is argued to be a special case or application of the more general microscopic theory.

References

  1. Friedberg, R., Hohenberg, P.: Compatible quantum theory. Rep. Prog. Phys. 77, 092001 (2014)

  2. Kochen, S.: A reconstruction of quantum mechanics. Found. Phys. 45, 557 (2015)

  3. Birkhoff, G., von Neumann, J.: The logic of quantum mechanics. Ann. Math. 37, 823 (1936)

  4. Griffiths, R.B.: Consistent Quantum Theory. Cambridge University Press, Cambridge (2002)

  5. Bell, J.S.: On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys. 38, 447 (1966)

  6. Kochen, S., Specker, E.: The problem of hidden variables in quantum mechanics. J. Math. Mech. 17, 59–87 (1967)

  7. Mermin, N.D.: Simple unified form for the major no-hidden-variables theorems. Phys. Rev. Lett. 65, 3373 (1990)

  8. Gleason, A.M.: Measures on the closed subspaces of a Hilbert space. J. Math. Mech. 6, 885 (1957)

  9. von Neumann, J.: Mathematical Foundations of Quantum Mechanics, vol. 2. Princeton University Press, Princeton (1996)

  10. Preskill, J.: Lecture Notes for Physics 219: Quantum Information and Computation (2015). http://www.theory.caltech.edu/people/preskill/index.html

  11. Bub, J., Pitowsky, I.: In: S. Saunders (ed.) Many Worlds?: Everett, Quantum Theory, and Reality, pp. 433–459. Oxford University Press, Oxford (2010)

  12. Schrödinger, E.: In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 31, pp. 555–563. Cambridge University Press (1935)

  13. Schrödinger, E.: In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 32, pp. 446–452. Cambridge University Press, Cambridge (1936)

  14. Hughston, L.P., Jozsa, R., Wootters, W.K.: A complete classification of quantum ensembles having a given density matrix. Phys. Lett. A 183(1), 14 (1993)

  15. Wootters, W.K., Zurek, W.H.: The no-cloning theorem. Phys. Today 62(2), 76 (2009)

  16. Bub, J.: Quantum probabilities as degrees of belief. Stud. Hist. Philos. Sci. Part B 38(2), 232 (2007)

  17. Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics, vol. 3. Addison-Wesley, Reading, MA (2006)

  18. Zeh, H.D.: On the interpretation of measurements in quantum theory. Found. Phys. 1, 69 (1970)

  19. Zurek, W.H.: Decoherence, einselection, and the quantum origins of the classical. Rev. Mod. Phys. 75, 715 (2003)

  20. Gell-Mann, M., Hartle, J.B.: Classical equations for quantum systems. Phys. Rev. D 47, 3345 (1993)

  21. Landau, L., Lifshitz, E.: Course of Theoretical Physics: Vol.: 3: Quantum Mechanics: Non-relativistic Theory. Pergamon Press, Oxford (1965)

  22. Peres, A.: Quantum Theory: Concepts and Methods, vol. 72. Springer, Berlin (1995)

  23. Bell, J.: Against ‘measurement’. Phys. World 3(8), 33 (1990)

  24. Brukner, Č., Zeilinger, A.: Operationally invariant information in quantum measurements. Phys. Rev. Lett. 83(17), 3354 (1999)

  25. Fuchs, C.A.: arXiv:1003.5209 (2010)

  26. Fuchs, C.A., Mermin, N.D., Schack, R.: An introduction to QBism with an application to the locality of quantum mechanics. Am. J. Phys. 82(8), 749 (2014)

  27. Dirac, P.A.M.: The Principles of Quantum Mechanics. Oxford University Press, Oxford (1981)

  28. Everett, H.I.: Relative state formulation of quantum mechanics. Rev. Mod. Phys. 29, 454 (1957)

  29. Griffiths, R.B.: Consistent histories and the interpretation of quantum mechanics. J. Stat. Phys. 36, 219 (1984)

  30. Omnès, R.: Consistent interpretations of quantum mechanics. Rev. Mod. Phys. 64(2), 339 (1992)

  31. de Broglie, L.: The principles of the new undulatory mechanics. J. Phys. Rad. 7, 321 (1926)

  32. Bohm, D.: A suggested interpretation of the quantum theory in terms of “hidden” variables. I. Phys. Rev. 85, 180 (1952)

  33. Ghirardi, G.C., Rimini, A., Weber, T.: Unified dynamics for microscopic and macroscopic systems. Phys. Rev. D 34(2), 470 (1986)

  34. Hartle, J.B.: arXiv preprint arXiv:gr-qc/0508001 (2005)

Author information

Corresponding author

Correspondence to R. Friedberg.

Appendices

Set Theory, Classical Logic and Probability Theory

In this appendix we provide a brief summary of set theory, classical (Aristotelian) logic and classical probability theory, and we show how the three are formally related.

1.1 Set Theory

For simplicity we consider a discrete set \(\varOmega \) of N elements \(x\in \varOmega \). The subsets \(A,B,\ldots \) of \(\varOmega \) form a set \(\mathcal {B}(\varOmega )\) of sets (in set theory, a field of sets) for which the operations of union \(\cup \), intersection \(\cap \) and complement \(\sim \) obey the axioms of set theory:

$$\begin{aligned} A\cup \emptyset = A,&\qquad A\cap \varOmega = A, \end{aligned}$$
(A.1a)
$$\begin{aligned} A\cup \sim A = \varOmega ,&\qquad A\cap \sim A = \emptyset , \end{aligned}$$
(A.1b)
$$\begin{aligned} A\cup B = B\cup A,&\qquad A\cap B = B\cap A, \end{aligned}$$
(A.1c)
$$\begin{aligned} A\cup (B\cup C) = (A\cup B)\cup C,&\qquad A\cap (B\cap C) = (A\cap B)\cap C, \end{aligned}$$
(A.1d)
$$\begin{aligned} A\cup (B\cap C) = (A\cup B)\cap (A\cup C),&\qquad A\cap (B\cup C) = (A\cap B)\cup (A\cap C), \end{aligned}$$
(A.1e)

where \(\emptyset \) is the empty subset of \(\varOmega \).

1.2 Classical Logic

The subsets \(A,B,\ldots \) can also be considered as logical propositions, in which case the operations of set theory become logical operations

$$\begin{aligned} \cup \;\; \longrightarrow \;\; \vee \;\; \text {disjunction}\;\; \text {(or)}, \end{aligned}$$
(A.2a)
$$\begin{aligned} \cap \;\; \longrightarrow \;\; \wedge \;\; \text {conjunction}\;\; \text {(and)}, \end{aligned}$$
(A.2b)
$$\begin{aligned} \sim \;\; \longrightarrow \;\; \lnot \;\; \text {negation}\;\; \text {(not)}, \end{aligned}$$
(A.2c)
$$\begin{aligned} \varOmega \;\; \longrightarrow \;\; \text {T}\;\; (\text {true}), \end{aligned}$$
(A.2d)
$$\begin{aligned} \emptyset \;\; \longrightarrow \;\; \text {F}\;\; (\text {false}). \end{aligned}$$
(A.2e)

Under the replacements of Eqs. A.2, Eqs. A.1 become the usual axioms of propositional calculus

$$\begin{aligned} A \vee \text {F} = A,&\qquad A \wedge \text {T} = A, \end{aligned}$$
(A.3a)
$$\begin{aligned} A \vee \lnot A = \text {T},&\qquad A \wedge \lnot A = \text {F}, \end{aligned}$$
(A.3b)
$$\begin{aligned} A \vee B = B \vee A,&\qquad A \wedge B = B \wedge A, \end{aligned}$$
(A.3c)
$$\begin{aligned} A \vee (B \vee C) = (A \vee B) \vee C,&\qquad A \wedge (B \wedge C) = (A \wedge B) \wedge C, \end{aligned}$$
(A.3d)
$$\begin{aligned} A \vee (B \wedge C) = (A \vee B) \wedge (A \vee C),&\qquad A \wedge (B \vee C) = (A \wedge B)\vee (A \wedge C). \end{aligned}$$
(A.3e)

In particular Eq. A.1e becomes the distributive law Eq. A.3e. The set \({\mathcal {B}(\varOmega )}\) of \(2^N\) propositions forms a Boolean algebra under the logical operations. On this algebra we can define truth functions \(\mathcal {T}(A)\) with values 1 (True) and 0 (False). Such truth functions must agree with the standard truth tables for the logical functions, which imply the algebraic relations

$$\begin{aligned} \mathcal {T}(\lnot A) = 1 - \mathcal {T}(A), \end{aligned}$$
(A.4a)
$$\begin{aligned} \mathcal {T}(A\wedge B) = \mathcal {T}(A) \mathcal {T} (B), \end{aligned}$$
(A.4b)
$$\begin{aligned} \mathcal {T}(A\vee B) = \mathcal {T}(A) + \mathcal {T}(B) - \mathcal {T}(A) \mathcal {T} (B), \end{aligned}$$
(A.4c)

and must also satisfy \(\mathcal {T}(\varOmega )=1\), \(\mathcal {T}(\emptyset )=0\). We shall refer to these equations as ‘truth table relations’.

Let us consider in particular those subsets of \(\varOmega \) containing only one member, that is sets of the form \(\{x\}\) where \(x\in \varOmega \). We may call them atomic sets, and the corresponding logical propositions atomic propositions. Then by applying Eq. A.4 we find that any truth function must assign the value 1 to some atomic proposition and 0 to all the others. We shall denote the truth function that assigns 1 to a particular \(\{x\}\) by \(\mathcal{T}_x\). Then for \(x, y \in \varOmega \)

$$\begin{aligned} \mathcal{T}_x(\{y\}) = 1 \;\;\; \text {if} \;\;\; y=x, \;\;\; \text {otherwise}\;\;\; 0. \end{aligned}$$
(A.5)

Again applying Eq. A.4, we see that for any \(A\in \mathcal {B}(\varOmega )\),

$$\begin{aligned} \mathcal{T}_x(A) = 1 \;\;\; \text {if} \;\;\; x\in A, \;\;\; \text {otherwise}\;\;\; 0. \end{aligned}$$
(A.6)

We may say that x is the “source of truth” for the atomic truth function \(\mathcal{T}_x\).
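The claim that every truth function is an atomic one can be checked by brute force on a small example. The sketch below is an illustration only and is not part of the original argument: it assumes a hypothetical three-element \(\varOmega \), enumerates every assignment of 0 or 1 to the eight subsets, and keeps those satisfying Eqs. A.4 together with \(\mathcal {T}(\varOmega )=1\), \(\mathcal {T}(\emptyset )=0\).

```python
from itertools import combinations, product

omega = frozenset({"a", "b", "c"})  # hypothetical 3-element sample space

# All 2^N subsets of omega, i.e. the Boolean algebra B(omega).
subsets = [frozenset(c) for r in range(len(omega) + 1)
           for c in combinations(sorted(omega), r)]

def is_truth_function(t):
    """Check Eqs. A.4a-c plus T(omega) = 1 and T(empty) = 0 for a dict t."""
    if t[omega] != 1 or t[frozenset()] != 0:
        return False
    for a in subsets:
        if t[omega - a] != 1 - t[a]:                    # Eq. A.4a
            return False
        for b in subsets:
            if t[a & b] != t[a] * t[b]:                 # Eq. A.4b
                return False
            if t[a | b] != t[a] + t[b] - t[a] * t[b]:   # Eq. A.4c
                return False
    return True

# Brute force over all 2^8 assignments of 0/1 to the 8 subsets.
truth_functions = []
for values in product((0, 1), repeat=len(subsets)):
    t = dict(zip(subsets, values))
    if is_truth_function(t):
        truth_functions.append(t)

# Each surviving function is T_x for exactly one x (Eqs. A.5 and A.6).
sources = []
for t in truth_functions:
    atoms = [x for x in omega if t[frozenset({x})] == 1]
    assert len(atoms) == 1
    x = atoms[0]
    assert all(t[a] == (1 if x in a else 0) for a in subsets)
    sources.append(x)

print(sorted(sources))  # ['a', 'b', 'c']
```

Exactly three functions survive, one per element of \(\varOmega \), in agreement with Eq. A.6.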

1.3 Probability Theory

Truth functions can be generalized by introducing a probability function \(\mathcal {P}(A)\) from \(\mathcal {B}(\varOmega )\) to the unit interval [0, 1]. One first defines a measure as a function from \(\mathcal {B}(\varOmega )\) to the interval \([0,\infty ]\) that satisfies the additivity condition for countable sets of disjoint subsets,

$$\begin{aligned}&\mathcal {P}(A^{(1)}\vee A^{(2)}\vee \cdots ) = \mathcal {P}(A^{(1)})+ \mathcal {P}(A^{(2)})+ \cdots ,\nonumber \\&\quad \text {whenever}\;\; A^{(i)}\wedge A^{(j)} = \emptyset \;\;\text {for}\;\;i\ne j. \end{aligned}$$
(A.7)

A probability measure or probability function (classically, these two ideas can be identified) is a measure which satisfies the additional condition

$$\begin{aligned} \mathcal {P}(\varOmega )=1. \end{aligned}$$
(A.8)

(The relation \(\mathcal {P}(\emptyset )=0\) is already implied by Eq. A.7.) In the context of probability theory, the set \(\varOmega \) is the ‘sample space’ of the probability measure and the elements \(A^{(i)}\in \mathcal {B}(\varOmega )\) are called ‘events’. It can be shown that for any two events A, B, Eq. A.7 implies the relation

$$\begin{aligned} \mathcal {P}(A \vee B) = \mathcal {P}(A) + \mathcal {P}(B) - \mathcal {P}(A \wedge B). \end{aligned}$$
(A.9)

The converse is true for a finite \(\varOmega \). We shall sometimes refer to Eqs. A.7 and A.8 as the ‘Kolmogorov conditions’, and to Eq. A.9 as the ‘Kolmogorov overlap equation’.

One notices a similarity between Eqs. A.9 and A.4c. Indeed, in [1, Appendix A] the authors explained in what sense the probability function \(\mathcal {P}\) can be thought of as a ‘distributed truth function’.
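For a finite sample space the Kolmogorov conditions and the overlap equation can be verified directly. The following sketch is an illustration under stated assumptions: the three-element sample space and the weights are hypothetical. It also checks one elementary reading of the 'distributed truth function' idea, namely that \(\mathcal {P}\) is a weighted average of the atomic truth functions \(\mathcal{T}_x\); this reading is ours and is not necessarily the one developed in [1].

```python
from fractions import Fraction
from itertools import combinations

omega = ("a", "b", "c")
# Hypothetical weights on the atomic events; any nonnegative values summing to 1 work.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

events = [frozenset(c) for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

def prob(event):
    """P(A) obtained from Eq. A.7 by summing over the disjoint atoms of A."""
    return sum((weights[x] for x in event), Fraction(0))

def truth(x, event):
    """Atomic truth function T_x of Eq. A.6."""
    return 1 if x in event else 0

assert prob(frozenset(omega)) == 1   # Eq. A.8
assert prob(frozenset()) == 0        # implied by Eq. A.7

for a in events:
    for b in events:
        # Kolmogorov overlap equation, Eq. A.9
        assert prob(a | b) == prob(a) + prob(b) - prob(a & b)
    # P is a weighted average of the atomic truth functions.
    assert prob(a) == sum(weights[x] * truth(x, a) for x in omega)

print(prob(frozenset({"a", "b"})))  # 5/6
```

Exact rational arithmetic is used so that the equalities hold without rounding tolerance.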

Let us mention one feature of probability functions which is often obscured, namely that (in a finite-dimensional sample space) defining a probability function that assigns a relative likelihood to the members of a sample space does not alter the fact that one and only one of its members is true, while all the others are false.

It should be noted, moreover, that our definitions of probability and truth are formal ones, and they are thus consistent with either a frequentist or a Bayesian approach to probabilities. At this stage we are not inquiring into the relationship of probabilities to the “real world”, which is where such distinctions arise. The connection between truth and probability explored above exists already on the formal level and is therefore independent of any real-world interpretation of probability.

Noncontextual Network Theorem

Suppose we have a noncontextual network \(\mathcal {N}\) of probability functions, i.e. a set of functions \(P_\mathcal{F}\) with sample spaces \(\mathcal {F}\), such that if A belongs both to \(\mathcal{F}_1\) and to \(\mathcal{F}_2\), then

$$\begin{aligned} P_{\mathcal{F}_1}(A) = P_{\mathcal{F}_2}(A). \end{aligned}$$
(B.1)

We show here that the probability functions \(P_\mathcal{F}\) must satisfy the Born Rule Eq. 5, for some density matrix \(\rho \).

To prove the theorem we first define a quantum probability measure \({\mathcal M}_\mathcal{N}\). For each \(A \in Q(\mathcal {H})\), let \(\mathcal{F}_A\) be the framework consisting of 1, \(\emptyset \), A, and \(\lnot A\). Then we define

$$\begin{aligned} \mathcal{M}_\mathcal{N}(A) = P_{\mathcal{F}_A}(A). \end{aligned}$$
(B.2)

Is \({\mathcal M}_\mathcal{N}\) a quantum probability measure? That depends on whether the additivity condition

$$\begin{aligned}&{\mathcal M}_\mathcal{N}(A^{(1)}\vee A^{(2)}\vee \cdots ) = {\mathcal M}_\mathcal{N}(A^{(1)}) + {\mathcal M}_\mathcal{N}(A^{(2)})\nonumber \\&+ \cdots , \text {when}\;\; A^{(i)} \perp A^{(j)}\;\; \text {for} \; i \ne j, \end{aligned}$$
(B.3)

is satisfied. So let \(\{A\} = \{A_1, A_2, \ldots \}\) be a finite or countably infinite set of properties that are mutually orthogonal, that is \(A_i \perp A_j\) for \(i\ne j\) as in the condition of Eq. B.3 ; and let \(\text {Sp}(\{A\})\) be the span of all the members of \(\{A\}\). If we define \(A_0 = \lnot \text {Sp}(\{A\})\), then \(S_{\{A\}} = \{A_0, A_1, A_2,\ldots \}\) is a sample space. This sample space is the basis of a framework which we shall call \(\mathcal{F}_{\{A\}}\). Since \(P_{\mathcal{F}_{\{A\}}}\) is a probability function, Eq. A.7 above tells us that

$$\begin{aligned} P_{\mathcal{F}_{\{A\}}}(A_1\vee A_2\vee \cdots ) = P_{\mathcal{F}_{\{A\}}}(A_1) + P_{\mathcal{F}_{\{A\}}}(A_2) + \cdots \,. \end{aligned}$$
(B.4)

But since each \(A_i\) belongs to both \(\mathcal{F}_{\{A\}}\) and \(\mathcal{F}_{A_i}\), we can write Eq. B.4 as

$$\begin{aligned} P_{\mathcal{F}_{\{A\}}}(A_1\vee A_2\vee \cdots ) = P_{\mathcal{F}_{A_1}}(A_1) + P_{\mathcal{F}_{A_2}}(A_2) + \cdots , \end{aligned}$$
(B.5)

in view of Eq. B.1. The disjunction \(A_1\vee A_2\vee \cdots \) likewise belongs both to \(\mathcal{F}_{\{A\}}\) and to its own framework, so Eq. B.1 applies to the left side as well. Then using Eq. B.2, we have

$$\begin{aligned} \mathcal{M}_\mathcal{N}(A_1\vee A_2\vee \cdots ) = \mathcal{M}_\mathcal{N}(A_1) + \mathcal{M}_\mathcal{N}(A_2) + \cdots , \end{aligned}$$
(B.6)
(B.6)

which is Eq. B.3. Therefore \({\mathcal M}_\mathcal{N}\) is a quantum probability measure. Its normalization \({\mathcal M}_\mathcal{N}(\mathcal{H}) = 1 \) can be deduced from the completeness of the sample space in any framework. Then from Gleason’s theorem we get the Born rule Eq. 5.
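The theorem itself rests on Gleason's theorem and cannot be verified numerically, but the easy converse can: a Born-rule assignment \(\text {Tr}(\rho A)\) is automatically noncontextual and additive over orthogonal properties. The sketch below illustrates this in a three-dimensional real Hilbert space. The density matrix and the two orthonormal bases are hypothetical choices made for the example, and the helper functions are ours.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def projector(v):
    """Projector onto the ray through a nonzero real vector v."""
    norm2 = sum(x * x for x in v)
    return [[F(x * y, norm2) for y in v] for x in v]

def measure(rho, a):
    """Born-rule value M(A) = Tr(rho A)."""
    return trace(matmul(rho, a))

# Hypothetical density matrix (symmetric, nonnegative, trace 1).
rho = [[F(1, 3), F(1, 6), F(0)],
       [F(1, 6), F(1, 3), F(0)],
       [F(0),    F(0),    F(1, 3)]]

# Two sample spaces (orthonormal bases) that share the property e3.
e1, e2, e3 = projector([1, 0, 0]), projector([0, 1, 0]), projector([0, 0, 1])
plus, minus = projector([1, 1, 0]), projector([1, -1, 0])

for basis in ([e1, e2, e3], [plus, minus, e3]):
    assert sum(measure(rho, a) for a in basis) == 1   # normalization

# e1 + e2 and plus + minus are the same property (the 1-2 plane), reached
# through two different frameworks; additivity (Eq. B.3) gives one value.
assert add(e1, e2) == add(plus, minus)
assert measure(rho, add(e1, e2)) == measure(rho, e1) + measure(rho, e2)
assert measure(rho, add(plus, minus)) == measure(rho, plus) + measure(rho, minus)

print(measure(rho, plus), measure(rho, minus), measure(rho, e3))  # 1/2 1/6 1/3
```

Because the value assigned to a property depends only on the projector and not on the basis it was drawn from, Eq. B.1 holds by construction in this example.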

Kochen’s Proof of an Important Theorem

In [2, Sect. 8.1], a theorem is stated and proved: that given a state p and a property y such that \(p(y)\ne 0\), a unique new state \(p_y\) exists such that \(p_y(x) = p(x\wedge y)/p(y)\) for any x belonging to some \(\sigma \)-algebra B that contains y; and that the density operator of \(p_y\) is given uniquely by \(w_y = ywy/Tr(ywy)\) where w is the density operator associated with p. The proof is valid but extremely terse. For better comprehension we give here an expanded version of the theorem and the proof.

1.1 Preliminary Definitions

In [2, Sect. 3.1] we read,

“A state of a system with a \(\sigma \)-complex of properties Q is a map \(p: Q \rightarrow [0,1]\) such that the restriction of p to any \(\sigma \)-algebra B in Q is a probability measure on B.”

Looking higher in the same section, we see that a probability measure on B is a function \(p: B \rightarrow [0,1]\) such that

$$\begin{aligned} p(I) = 1 \end{aligned}$$
(C.1)

where I is the identity operator, and

$$\begin{aligned} p(a_1 \vee a_2 \vee \cdots ) = p(a_1) + p(a_2) + \cdots \end{aligned}$$
(C.2)

for any mutually disjoint set of \(a_i\) all belonging to B.

But if the \(a_i\) are properties, disjoint simply means orthogonal; and for orthogonal properties \(x_i \vee x_j = x_i + x_j\). So the definition of a state can be rewritten

A state of a system with a \(\sigma \)-complex of properties Q is a map \(p: Q \rightarrow [0,1]\) such that Eq. C.1 holds and

$$\begin{aligned} p(x_1 + x_2 + \cdots ) = p(x_1) + p(x_2) +\cdots \end{aligned}$$
(C.3)

for any mutually orthogonal \(x_1, x_2, \ldots \). (It is trivial that such a set generates a \(\sigma \)-algebra.)

Now looking lower, we see that if \(Q = Q(\mathcal H)\) where \(\mathcal H\) is a Hilbert space of dimension \(>2\), Gleason’s theorem tells us that for any state p there exists a unique density operator w (nonnegative Hermitian operator of trace 1) on \(\mathcal H\) such that

$$\begin{aligned} p(x) = Tr(w x) \end{aligned}$$
(C.4)

for all \(x\in Q\).

1.2 Statement of Theorem

In [2, Sect. 8.1], Kochen states and proves the following theorem. (We replace some of his notation with our own.)

If p is a state on \(Q(\mathcal H)\) and \(y\in Q(\mathcal H)\) such that \(p(y) \ne 0\), then there exists a unique state \(p_y\) conditioned on y. If w is the density operator corresponding under Eq. C.4 to p, then \(ywy / Tr(ywy)\) corresponds to the state \(p_y\).

To understand this statement one must look back a couple of paragraphs to the general \(\sigma \)-complex Q:

“Let p be a state on a \(\sigma \)-complex Q and \(y\in Q\) such that \(p(y)\ne 0\). By a state conditioned on y we mean a state \(p_y\) such that for every \(\sigma \)-algebra B in Q containing y and every \(x\in B\),

$$\begin{aligned} p_y(x) = p(x\wedge y)/p(y).\text {”} \end{aligned}$$
(C.5)

This permits us to restate the theorem as follows:

Let it be given that p is a state on \(Q(\mathcal H)\) as defined in the previous section, and that \((p, w)\) satisfy Eq. C.4. Let it also be given that \(y\in Q(\mathcal H)\) and that

$$\begin{aligned} p(y) \ne 0. \end{aligned}$$
(C.6)

Then there exists a state \(p_y\) on \(Q(\mathcal H)\) such that Eq. C.5 holds for every \(x\in Q(\mathcal H)\) that belongs to some \(\sigma \)-algebra B that also contains y. Moreover, there exists only one such state, and it is given by the formula

$$\begin{aligned} p_y(x) = Tr(w_y x) \end{aligned}$$
(C.7)

where

$$\begin{aligned} w_y = ywy/Tr(ywy). \end{aligned}$$
(C.8)

[It is important to note that the assertion that \(p_y\) is a state implies that \(p_y(x)\) is defined for every property x, whether or not it commutes with y. But Eq. C.5 is asserted only for certain properties x. In fact, one might as well say that Eq. C.5 is asserted for all x compatible with y, since if x and y commute one can easily construct a \(\sigma \)-algebra containing them both.

It is also essential that Eq. C.6 be assumed, since otherwise the right side of Eq. C.5 could be 0/0.

Finally, Eq. C.7 is meant to hold for all x, not only those satisfying Eq. C.5.]

1.3 Proof of Theorem (Existence)

The proof of the theorem is in two halves. First, we must show that there exists a state \(p_y\) such that Eq. C.5 holds for all x that commute with y. And second, we must show that any such \(p_y\) must satisfy Eqs. C.7 and C.8. In both halves it is given to begin with that p is a state and that y is a property satisfying Eq. C.6.

To show that a state \(p_y\) exists with the desired feature, we simply exhibit one as defined by Eqs. C.7 and C.8. We must then show that \(p_y\) so defined is a state, and also that it satisfies Eq. C.5 for all x compatible with y. The latter statement is easily proved. Combining Eqs. C.7 and C.8, we have

$$\begin{aligned} p_y(x) = Tr(ywyx)/Tr(ywy) \end{aligned}$$
(C.9)

for all x. The denominator is equal to \(Tr(wyy) = Tr(wy) = p(y)\), which is nonzero by Eq. C.6; therefore Eq. C.9 is defined. Then if we specialize to x commuting with y, the numerator in Eq. C.9 can be written as \(Tr(ywxy) = Tr(wxyy) = Tr (wxy) = p(xy)\); but for commuting x and y, xy is the same as \(x\wedge y\) and so Eq. C.5 is satisfied.

It remains to show that \(p_y\) as defined by Eq. C.9 is a state. This means verifying Eqs. C.1 and C.3 with \(p_y\) in place of p. As for Eq. C.1, we have from Eq. C.9

$$\begin{aligned} p_y(I) = Tr(ywyI)/Tr(ywy) = Tr(ywy)/Tr(ywy) = p(y)/p(y) = 1, \end{aligned}$$
(C.10)

again appealing to Eq. C.6. As for Eq. C.3, the equation

$$\begin{aligned} p_y(x_1 + x_2 +\cdots ) = p_y(x_1) + p_y(x_2) +\cdots \end{aligned}$$
(C.11)

follows directly from Eq. C.7 since the right side of Eq. C.7 is linear in x. Therefore \(p_y\) is a state.
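The existence argument can be illustrated numerically. The sketch below is a check on one hypothetical example, not a substitute for the proof: the density operator w, the property y and the test properties x are all choices made for illustration. It builds \(w_y\) from Eq. C.8, confirms Eq. C.10, confirms Eq. C.5 for two properties that commute with y, and evaluates \(p_y\) on a property that does not commute with y, for which Eq. C.5 is not asserted.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def p(w, x):
    """State value p(x) = Tr(w x), Eq. C.4."""
    return trace(matmul(w, x))

# Hypothetical density operator w and property y (projector onto the 1-2 plane).
w = [[F(1, 3), F(1, 6), F(0)],
     [F(1, 6), F(1, 3), F(0)],
     [F(0),    F(0),    F(1, 3)]]
y = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

ywy = matmul(matmul(y, w), y)
norm = trace(ywy)
assert norm == p(w, y) and norm != 0                    # Tr(ywy) = p(y), Eq. C.6
w_y = [[entry / norm for entry in row] for row in ywy]  # Eq. C.8

assert trace(w_y) == 1                                  # Eq. C.10

# Properties commuting with y: here x ^ y is just the product xy.
commuting = [[[1, 0, 0], [0, 0, 0], [0, 0, 0]],
             [[1, 0, 0], [0, 0, 0], [0, 0, 1]]]
for x in commuting:
    assert matmul(x, y) == matmul(y, x)
    assert p(w_y, x) == p(w, matmul(x, y)) / p(w, y)    # Eq. C.5

# A property that does not commute with y: p_y is still defined via Eq. C.7.
x_nc = [[F(1, 2), 0, F(1, 2)], [0, 0, 0], [F(1, 2), 0, F(1, 2)]]
assert matmul(x_nc, y) != matmul(y, x_nc)
print(p(w_y, x_nc))  # 1/4
```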

1.4 Proof of Theorem (Uniqueness)

In this section we assume as before that p is a state and that y is a property obeying Eq. C.6. We assume also that \(p_y\) is some state for which Eq. C.5 holds whenever x commutes with y. Then from Gleason’s theorem we know that a density operator \(w_y\) exists satisfying Eq. C.7 for all x.

But we do not assume Eq. C.8; we regard the form of \(w_y\) as unknown. We shall then derive an expression for \(p_y(x)\), holding for any 1-dimensional x, from which \(w_y\) has been eliminated. (This is the heart of Kochen’s proof. It shows that the state \(p_y\) is unique, since its values on arbitrary x can be obtained from those on 1-dimensional x by linearity.)

If x is 1-dimensional, it can be written as \(|\phi \rangle \langle \phi |\) for some ket \(|\phi \rangle \) for which \(\langle \phi | \phi \rangle = 1\). Then Eq. C.7 becomes

$$\begin{aligned} p_y(x) = \langle \phi | w_y |\phi \rangle . \end{aligned}$$
(C.12)

Since y and its complement \(y^{\perp }\) span \(\mathcal H\), we have

$$\begin{aligned} |\phi \rangle = y |\phi \rangle + \; y^{\perp }|\phi \rangle \end{aligned}$$
(C.13)

By substituting Eq. C.13 for \(| \phi \rangle \) in Eq. C.12, we obtain \(p_y(x)\) as the sum of four terms, three of which contain the factor \(w_y y^{\perp }| \phi \rangle \) or else \(\langle \phi | y^{\perp } w_y\). These three vanish, for the following reason. Since y and \(y^{\perp }\) commute, we may substitute \(y^{\perp }\) for x in Eq. C.5, obtaining

$$\begin{aligned} p_y(y^{\perp }) = p( y^{\perp }\wedge y)/p(y) = p(\emptyset )/p(y) = 0, \end{aligned}$$
(C.14)

where \(\emptyset \) is the empty set. (\(p(\emptyset )\) must vanish, otherwise Eq. C.3 would lead to a contradiction.) On the other hand, since Eq. C.7 applies to all x we can apply it to \(y^{\perp }\), obtaining

$$\begin{aligned} p_y(y^{\perp }) = Tr(w_y y^{\perp }), \end{aligned}$$
(C.15)

and comparing Eqs. C.15 and C.14 we have

$$\begin{aligned} Tr(w_y y^{\perp }) = 0. \end{aligned}$$
(C.16)

Now from Eq. C.16 it follows that \(w_y y^{\perp } |\phi \rangle = 0\). This implication is not trivial, and we shall give the reasoning as a lemma.

Lemma 1

If v is a density operator and z is a projector, then either \(vz = 0\) or \(Tr(vz) > 0\).

Proof

z can be decomposed into a sum of one-dimensional projectors \(z_k\), and the lemma will be true of z if it is true of each \(z_k\).

Therefore without loss of generality we can take z to be one-dimensional: \(z = |\xi \rangle \langle \xi |\) where \(\langle \xi |\xi \rangle = 1.\)

Express v in matrix form so that it is diagonal: \(v_{ab} = n_a \delta _{a,b}\) where all \(n_a \ge 0\) and \(\sum _a n_a = 1\). Then \(|\xi \rangle \) is a column matrix whose \(a\)th element is \(\xi _a\), and \(\sum _a |\xi _a|^2 = 1.\)

We then have \(z_{ab} = \xi _a \xi _b^*\), and \((vz)_{ab} = n_a \xi _a \xi _b^*.\) Hence \(Tr(vz) = \sum _a n_a \xi _a \xi _a^*.\) Each term of the trace is nonnegative, and therefore \(Tr(vz) > 0\) unless, for each a, \(n_a \xi _a \xi _a^* = 0\). But in that case, \(|n_a \xi _a|^2 = 0\), hence \(n_a \xi _a = 0\) and therefore \(n_a \xi _a \xi _b^* = 0\) for all b; therefore \(vz = 0\). The lemma is proved.
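Both alternatives of Lemma 1 can be seen in a small example. The sketch below uses hypothetical matrices chosen for illustration. It also shows, as a side remark of ours, that the nonnegativity of v is essential: for a Hermitian trace-1 operator with a negative eigenvalue the trace can vanish while the product does not.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def projector(v):
    """Projector onto the ray through a nonzero real vector v."""
    norm2 = sum(x * x for x in v)
    return [[F(x * y, norm2) for y in v] for x in v]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

# Hypothetical density operator with one zero eigenvalue.
v = [[F(1, 2), 0, 0], [0, F(1, 2), 0], [0, 0, 0]]

z_kernel = projector([0, 0, 1])   # lies entirely in the kernel of v
z_tilted = projector([1, 0, 1])   # overlaps the support of v

# First alternative of the lemma: vz = 0 (and then Tr(vz) = 0).
assert is_zero(matmul(v, z_kernel)) and trace(matmul(v, z_kernel)) == 0
# Second alternative: Tr(vz) > 0.
assert trace(matmul(v, z_tilted)) == F(1, 4)

# Not a density operator: Hermitian with trace 1 but a negative eigenvalue.
v_bad = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
z = projector([0, 1, 1])
assert trace(matmul(v_bad, z)) == 0 and not is_zero(matmul(v_bad, z))
```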

If \(v = w_y\), \(z = y^{\perp }\), Eq. C.16 implies \(w_y y^{\perp } = 0\). Hence in the expansion of Eq. C.12 the three terms containing \(w_y y^{\perp } |\phi \rangle \) or its dual vanish, and we are left with

$$\begin{aligned} p_y(x) = \langle \phi |y w_y y|\phi \rangle . \end{aligned}$$
(C.17)

To eliminate \(w_y\), we shall first prove a second lemma:

Lemma 2

$$\begin{aligned} \langle \phi |y w_y y|\phi \rangle = \langle \phi |y|\phi \rangle Tr(w_y u), \end{aligned}$$
(C.18)

where u is the projector onto \(y|\phi \rangle \). (Thus

$$\begin{aligned} u = \nu y|\phi \rangle \langle \phi |y \end{aligned}$$
(C.19)

for some nonzero real number \(\nu \).)

Proof

If \(y|\phi \rangle =0\) then \(u=0\) and both sides of Eq. C.18 vanish.

Otherwise \(u\ne 0\) and let \(\mu = \langle \phi |y|\phi \rangle = \langle \phi |yy|\phi \rangle \). We have

$$\begin{aligned} u = uu = \nu ^2 y|\phi \rangle \langle \phi |y\, y|\phi \rangle \langle \phi |y = \nu ^2 \mu \, y|\phi \rangle \langle \phi |y = \nu \mu u \end{aligned}$$
(C.20)

so that \(\nu \mu = 1\) and Eq. C.19 becomes

$$\begin{aligned} u = (1/ \mu ) y|\phi \rangle \langle \phi |y. \end{aligned}$$
(C.21)

Let \(|\chi _0\rangle = y|\phi \rangle /\sqrt{\mu }\) so that \(\langle \chi _0 |\chi _0\rangle = 1.\) Then

$$\begin{aligned} u = |\chi _0\rangle \langle \chi _0 |. \end{aligned}$$
(C.22)

Let \(|\chi _0\rangle , |\chi _1\rangle , |\chi _2\rangle , \ldots \) be a complete orthonormal basis for \(\mathcal H\). Then

$$\begin{aligned} Tr(w_y u) = Tr(w_y uu) = Tr(uw_yu) = \sum _i \langle \chi _i | u w_y u |\chi _i \rangle . \end{aligned}$$
(C.23)

But by Eq. C.22 we have

$$\begin{aligned} \langle \chi _i | u w_y u |\chi _i \rangle = \langle \chi _i | \chi _0\rangle \langle \chi _0 |w_y|\chi _0\rangle \langle \chi _0 |\chi _i\rangle = \delta _{i,0}\langle \chi _0 |w_y|\chi _0\rangle \end{aligned}$$
(C.24)

and therefore

$$\begin{aligned} Tr(w_y u) = \langle \chi _0| w_y | \chi _0\rangle = \langle \phi |y w_y y| \phi \rangle /\mu \end{aligned}$$
(C.25)

which is just Eq. C.18. Lemma 2 is proved. \(\square \)
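Lemma 2 can be checked on a concrete example with exact rational arithmetic. In the sketch below the ket \(|\phi \rangle \), the property y and the operator \(w_y\) are hypothetical choices for illustration. The projector u is built from Eq. C.21, which avoids the square root in \(|\chi _0\rangle \).

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def dot(u, v):
    """Inner product of real vectors."""
    return sum(x * y for x, y in zip(u, v))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical normalized ket, property y, and density operator w_y with w_y y_perp = 0.
phi = [F(1, 3), F(2, 3), F(2, 3)]
assert dot(phi, phi) == 1
y = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
w_y = [[F(1, 2), F(1, 4), 0], [F(1, 4), F(1, 2), 0], [0, 0, 0]]

y_phi = matvec(y, phi)                            # y|phi>
mu = dot(phi, y_phi)                              # mu = <phi|y|phi>
u = [[a * b / mu for b in y_phi] for a in y_phi]  # Eq. C.21

assert matmul(u, u) == u                          # u is a projector, Eq. C.20
lhs = dot(y_phi, matvec(w_y, y_phi))              # <phi|y w_y y|phi>
rhs = mu * trace(matmul(w_y, u))                  # <phi|y|phi> Tr(w_y u)
assert lhs == rhs                                 # Eq. C.18

print(mu, lhs)  # 5/9 7/18
```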

Therefore Eq. C.17 becomes

$$\begin{aligned} p_y(x) = \langle \phi |y|\phi \rangle Tr(w_y u). \end{aligned}$$
(C.26)

Now, comparing the projectors u and y, we have

$$\begin{aligned} y| \chi _0\rangle = y(y|\phi \rangle \!/\sqrt{\mu }) = y|\phi \rangle \!/\sqrt{\mu } = | \chi _0\rangle = u | \chi _0\rangle \end{aligned}$$
(C.27)

whereas \(u | \chi _i\rangle = 0\) for any \(i\ne 0\). Therefore \(\langle \chi _i |y - u |\chi _i \rangle \; \ge \! 0\) for all i, or simply \(y\ge u\). Hence u and y commute and

$$\begin{aligned} u \wedge y = u. \end{aligned}$$
(C.28)

It follows that (taking B as the \(\sigma \)-algebra generated by u and y) u may be substituted for x in Eq. C.5, yielding

$$\begin{aligned} p_y(u) = p(u\wedge y) / p(y) = p(u) / p(y) \end{aligned}$$
(C.29)

which exists by Eq. C.6. Likewise, Eq. C.7, postulated to hold for all x, holds for u in place of x:

$$\begin{aligned} p_y(u) = Tr(w_y u). \end{aligned}$$
(C.30)

Comparing Eq. C.30 with Eq. C.29, we obtain

$$\begin{aligned} Tr(w_y u) = p(u) / p(y) \end{aligned}$$
(C.31)

so that Eq. C.26 becomes

$$\begin{aligned} p_y(x) = \langle \phi |y|\phi \rangle p(u)/p(y). \end{aligned}$$
(C.32)

This is the promised expression for \(p_y(x)\) that does not involve \(w_y\). (We have nowhere invoked Eq. C.8.) However, it holds only for 1-dimensional x (of the form \(| \phi \rangle \langle \phi |\)). To show that Eq. C.32 suffices to define the whole state \(p_y\), we observe that any x is the sum of 1-dimensional projectors and that Eq. C.7 is linear in x no matter what form \(w_y\) takes. Hence Eq. C.32 establishes the uniqueness of \(p_y\) satisfying the conditions at the start of this section. \(\square \)
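As a consistency check, the \(w_y\)-free expression of Eq. C.32 can be compared with the explicit formula of Eq. C.9 for a 1-dimensional x. The sketch below is a single hypothetical example for illustration, not a proof: the density operator w, the property y and the ket \(|\phi \rangle \) are our choices.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical state (density operator w), property y, and normalized ket phi.
w = [[F(1, 3), F(1, 6), F(0)],
     [F(1, 6), F(1, 3), F(0)],
     [F(0),    F(0),    F(1, 3)]]
y = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
phi = [F(1, 3), F(2, 3), F(2, 3)]

def p(x):
    """p(x) = Tr(w x), Eq. C.4."""
    return trace(matmul(w, x))

x = [[a * b for b in phi] for a in phi]           # x = |phi><phi|
y_phi = matvec(y, phi)
mu = dot(phi, y_phi)                              # <phi|y|phi>
u = [[a * b / mu for b in y_phi] for a in y_phi]  # projector onto y|phi>, Eq. C.21

ywy = matmul(matmul(y, w), y)
direct = trace(matmul(ywy, x)) / trace(ywy)       # Eq. C.9
via_c32 = mu * p(u) / p(y)                        # Eq. C.32

assert direct == via_c32
print(direct)  # 7/18
```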

1.5 Unique Form of Density Matrix

In Sect. 1.3 we showed that a legitimate state \(p_y\) exists, satisfying the conditions imposed for conditioning p on y, whose density operator is given by Eq. C.8. In Sect. 1.4 we showed that no other state can satisfy these conditions. But Gleason’s theorem tells us that a given state can have only one density operator. Hence in fact Eq. C.8 does describe the density operator associated with the state \(p_y\) whose uniqueness is shown in Sect. 1.4.

This completes the entire proof of Kochen’s theorem in [2, Sect. 8.1].

Cite this article

Friedberg, R., Hohenberg, P.C. What is Quantum Mechanics? A Minimal Formulation. Found Phys 48, 295–332 (2018). https://doi.org/10.1007/s10701-018-0145-4
