1 Introduction

In classical physics, symmetry, reference frames and the relativity of physical quantities are intimately connected. The position of a material object is defined as relative to a given frame, and the relative position of object to frame is a shift-invariant quantity. Galilean directions/angles, velocities and time of events are all relative, and invariant only once the frame-dependence has been accounted for. The relativity of these quantities is encoded in the Galilei group, and the observable quantities are those which are invariant under its action. Einstein’s theory engendered a deeper relativity—the length of material bodies and time between spatially separated events are also frame-dependent quantities—and observables must be sought in accordance with their invariance under the action of the Poincaré group.

In quantum mechanics the analogues of those quantities mentioned above (e.g., position, angle) must also be understood as being relative to a reference frame. As in the normal presentation of the classical theory, the reference frame-dependence is implicit. However, in the quantum case, there arises an ambiguity regarding the definition of a reference frame: if it is classical, this raises the spectre of the lack of universality of quantum mechanics along with technical difficulties surrounding hybrid classical-quantum systems; if quantum, such a frame is subject to difficulties of definition and interpretation arising from indeterminacy, incompatibility, entanglement, and other quantum properties (see, e.g., [1,2,3] for early discussions of some of the important issues).

In previous work [4, 5], following classical intuition we have posited that observable quantum quantities are invariant under relevant symmetry transformations, and examined the properties of quantum reference frames (viewed as physical systems) which allow for the usual description, in which the reference frame is implicit, to be recovered. We constructed a map \(\yen \) which brings out the relative nature of quantities normally presented in “absolute” form in conventional treatments, which allows for a detailed study of the relativity of states and observables in quantum mechanics and the crucial role played by reference localisation.

The attitude taken throughout that reference frames are to be understood, in the first instance, as physical systems demands further attention and is motivated by the following observations. Firstly, this is in accordance with experimental practice and with the analysis of laboratory experiments, i.e., the position of a material particle is measured relative to available and suitable laboratory objects serving to define spatial coordinates. Given the existence of such objects, the abstraction to non-physical (mathematical) coordinate systems as an idealised description is then justified, and the frame coordinates may be suppressed for calculational convenience. Viewing frames as being defined by physical bodies has historical precedent; Einstein, for instance, writes (in a classical/relativistic context) “Every description of events in space involves the use of a rigid body to which such events have to be referred. The resulting relationship takes for granted that the laws of Euclidean geometry hold for “distances”, the “distance” being represented physically by means of the convention of two marks on a rigid body” [6].

The dual role afforded to reference frames (physical or mathematical) is unproblematic in the classical regime because the mathematical idealisation (abstract coordinate systems) can be made a very good approximation to judiciously chosen reference bodies appearing in the physical world. Whether frames of reference can legitimately be abstracted from physical bodies in quantum mechanics must be scrupulously considered, and is the subject of this paper. To borrow a phrase from [1], such an abstraction “must stop short of actual self-contradiction”; a self-contradiction would arise, for instance, for reference frames designated for the description of quantum phenomena which possess position and momentum localisations violating the uncertainty relation. The natural questions to consider, then, are: how are we to perform the abstraction from physical to mathematical reference frames in quantum theory, and what justification is there for supposing that we may eliminate the reference frame from the description in direct analogy to the classical case? We address these questions in what follows.

The specific objectives for this paper are: (1) to provide a mathematically rigorous and conceptually clear framework with which to discuss quantum reference frames, making precise existing work on the subject (e.g., [7]) and providing proofs of the main claims in [5]; (2) to construct examples, showing how symmetry dictates that the usual textbook formulation of quantum theory describes the relation between a quantum system and an appropriately localised reference system; (3) to provide further conceptual context for the quantitative trade-off relations proven in [4]; (4) to provide an explicit and clear explanation of what it means for states/observables to be defined relative to an external reference frame, and show how such an external description is compatible with quantum mechanics as a universal theory; (5) to introduce the concepts of absolute coherence and mutual coherence, showing the latter to be required for good approximation of relative quantities by absolute ones, and demonstrating it to be the crucial property for interference phenomena to manifest in the presence of symmetry; (6) to address the questions of dynamics and measurement under symmetry, offering an interpretation of the Wigner–Araki–Yanase theorem based on relational quantities; (7) to analyse simplified models similar to those appearing in the literature purporting to produce superpositions typically thought “forbidden” due to superselection rules, and provide a critical analysis of large amplitude limits in this context guided by two interpretational principles due to Earman and Butterfield, leading directly to (8) to provide a historical account of two differing views on the nature of superselection rules ([8, 9] “versus” [7, 10, 11]), their fundamental status in quantum theory and precisely what restrictions arise in the presence of such a rule, showing how our framework brings a unity to the opposing standpoints; (9) to remove ambiguities and inconsistencies appearing in all previous works on the subject of the connection between superselection rules and reference frames; (10) to offer a fresh perspective, based on the concept of mutual coherence, on the nature and reality of quantum optical coherence, settling a long-standing debate on the subject of whether laser light is “truly” coherent. See also [12] for an important contribution on this topic. We provide general arguments and many worked examples to show precisely how the framework presented works in practice, and which simplify a number of models appearing in the literature.

Our paper constitutes further effort in a long line of enquiries (e.g., [7, 13,14,15,16,17,18]) aimed at capturing the relationalism at the heart of the quantum mechanical world view. The fundamental role of symmetry has not impressed itself strongly upon previous consideration of the relative nature of the quantum description, and we view this work (along with [4, 5]) as opening new lines of enquiry in this direction. Our work is inspired by [7] and visits similar themes, and is complementary to recent work on resource theories (e.g., [7, 19,20,21,22]), which focus primarily on practical questions surrounding, for example, high-precision quantum metrology.

We now provide standard mathematical background material, and will work in units where \(\hbar = 1\).

2 Notation and Some Definitions

2.1 Observables and States

Associated to each physical system is a separable complex Hilbert space \(\mathcal {H}\). We let \(\mathcal {L(H)}\) denote the (\(C^*\)/von Neumann) algebra of all bounded linear operators on \(\mathcal {H}\).

Definition 1

Let \((\varOmega , \mathcal {F})\) denote the measurable space consisting of a \(\sigma \)-algebra \(\mathcal {F}\) of subsets of some set \(\varOmega \). A normalised positive operator valued measure (pom) \(\mathsf {E}\) on \((\varOmega , \mathcal {F})\) is a mapping \(\mathsf {E}: \mathcal {F} \rightarrow \mathcal {L(H)}\) for which

  1. \(\mathsf {E}(\varOmega ) = \mathbb {1}\),

  2. \(\mathsf {E}(X) \ge 0 \) for all \(X \in \mathcal {F}\),

  3. \(\mathsf {E}\left( \bigcup _i X_i \right) = \sum _i \mathsf {E}(X_i)\) for disjoint sequences \((X_i) \subset \mathcal {F}\) (the sum converging weakly).

(Here \(\le ,\ge \) denote the standard operator ordering.)

Normalised poms represent observables (subject to extra constraints in the presence of symmetry, discussed below). Throughout this paper, the pair \((\varOmega , \mathcal {F})\) will normally correspond to \(\left( \mathbb {R}^n, \mathcal {B}(\mathbb {R}^n) \right) \) (or possibly subsets, subalgebras) with \(\mathcal {B}(\cdot )\) denoting the (\(\sigma \)-algebra of) Borel sets. The operators \(\mathsf {E}(X)\) are called effects (occasionally also pom elements or effect operators); they satisfy \(\mathbb {O} \le \mathsf {E}(X) \le \mathbb {1}\). The unit operator interval \(\left[ \mathbb {O}, \mathbb {1} \right] \) comprises the set of all effects \(\mathcal {E}(\mathcal {H})\). The set \(\mathcal {E}(\mathcal {H})\) is convex as a subset of the real linear space of self-adjoint operators in \(\mathcal {L(H)}\), and the collection of extremal elements is the set of projections, characterised as the idempotent effects. If all elements of a pom \(\mathsf {E}\) are idempotent, then \(\mathsf {E}\) is called a projection valued measure (pvm), and if \(\mathsf {E}\) is defined on \(\mathbb {R}\), it defines a unique self-adjoint operator \(A:=\int \lambda \, \mathsf {E}(d \lambda )\) with spectral measure \(\mathsf {E}^{A}\equiv \mathsf {E}\). An observable defined by a self-adjoint operator, or equivalently, a pvm, will be called sharp, and all others unsharp.
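To make Definition 1 concrete, the following minimal Python sketch (using NumPy) builds a three-outcome pom on \(\mathbb {C}^2\) and checks normalisation, positivity and unsharpness numerically. The "trine" example and the names `trine_pom`, `is_effect` and `probabilities` are illustrative assumptions of ours and do not appear in the text; the sketch only illustrates the definitions in a finite-dimensional, finite-outcome setting.

```python
import numpy as np

def trine_pom():
    """Return the three effects E_k = (2/3)|psi_k><psi_k| of a qubit trine pom."""
    effects = []
    for k in range(3):
        half_angle = np.pi * k / 3               # Bloch angle 2*pi*k/3, halved
        psi = np.array([np.cos(half_angle), np.sin(half_angle)])
        effects.append((2.0 / 3.0) * np.outer(psi, psi))
    return effects

def is_effect(E, tol=1e-12):
    """Check O <= E <= 1 via the eigenvalues of the self-adjoint matrix E."""
    evals = np.linalg.eigvalsh(E)
    return bool(evals.min() >= -tol and evals.max() <= 1 + tol)

def probabilities(effects, rho):
    """Outcome distribution k -> tr[E_k rho]."""
    return np.array([np.trace(E @ rho).real for E in effects])

effects = trine_pom()
rho = np.array([[0.75, 0.25], [0.25, 0.25]])     # a density matrix (positive, trace one)
probs = probabilities(effects, rho)

print("normalised:", np.allclose(sum(effects), np.eye(2)))
print("all effects:", all(is_effect(E) for E in effects))
print("sharp (idempotent):", [bool(np.allclose(E @ E, E)) for E in effects])
print("probabilities:", probs, "sum =", probs.sum())
```

Since none of the three effects is idempotent, this pom is unsharp in the sense defined above, yet it still yields a genuine probability distribution for every state.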

Definition 2

A positive linear map \(\omega : \mathcal {L(H)}\rightarrow \mathcal {A}\) (where \(\mathcal {A}\) is a von Neumann algebra) is called normal if for any increasing net \((A_{\alpha }) \subset \mathcal {L(H)}\) with \(\sup {\{A_{\alpha }\}}=A\), \(\omega (A) = \sup {\{\omega (A_{\alpha })\}}\).

Normality is equivalent to \(\sigma \)-weak continuity. We will denote the trace class of \(\mathcal {H}\) by \(\mathcal {L}_1(\mathcal {H})\) and the trace functional by \(\text {tr}\left[ \cdot \right] \). Normal states are then obtained by setting \(\mathcal {A}=\mathbb {C}\) in Definition 2; any normal state is of the form \(A \mapsto \text {tr}\left[ \rho A\right] \equiv \langle A\rangle _\rho \), where \(\rho \in \mathcal {L}_1(\mathcal {H})\) is a positive operator and \(\text {tr}\left[ \rho \right] =1\). The set of normal states, denoted \(\mathcal {S}(\mathcal {H})\), is (identified with) a \(\sigma \)-convex subset of the real vector space \(\mathcal {L}_1(\mathcal {H})_{\text {sa}}\) of self-adjoint elements of \(\mathcal {L}_1(\mathcal {H})\). Henceforth all states are assumed to be normal, and we freely move between algebraic (linear functional) and spatial (density operator) notions of states. The extreme points of \(\mathcal {S}(\mathcal {H})\), corresponding to the pure states, are given by the rank one projections, which will be denoted \(P_{\varphi } \equiv |\varphi \rangle \langle \varphi |\), where \(\varphi \in \mathcal {H}\), \(\Vert \varphi \Vert =1\). We will usually identify pure normal states with unit vectors in \(\mathcal {H}\). States generate expectation-valued functionals \(\mathcal {L}(\mathcal {H})_{\text {sa}} \rightarrow \mathbb {R}\) on \(\mathcal {L}(\mathcal {H})_{\text {sa}}\)—the self-adjoint part of \(\mathcal {L(H)}\)—and when restricted to \(\mathcal {E}(\mathcal {H})\) can be viewed as generalised probability measures \(\mathcal {E}(\mathcal {H}) \rightarrow [0,1]\). 
For a given pom \(\mathsf {E}:\mathcal {F} \rightarrow \mathcal {L(H)}\) and \(\rho \in \mathcal {L}_1(\mathcal {H})\) we will write \(X \mapsto p^E_{\rho }(X)\) for the probability measure \(X \mapsto \text {tr}\left[ \mathsf {E}(X)\rho \right] \) and if \(\mathsf {E}= \mathsf {E}^A\) we use the shorthand \(X \mapsto p^A_{\rho }(X)\) to represent the measure \(X \mapsto \text {tr}\left[ \mathsf {E}^A(X)\rho \right] \).

2.2 Covariant poms and Localisability

Covariant poms will feature as reference quantities in the sequel, and their localisation properties will play an important role. We review these basic notions here.

2.2.1 Systems of Covariance, Norm-1 Property

Definition 3

Let U denote a unitary representation of a locally compact group G, and let \(\mathsf {F}: \mathcal {F} \rightarrow \mathcal {L(H)}\) be a pom whose outcome space \(\varOmega \) is a G-space. Then \((U,\mathsf {F},\mathcal {H})\) is a system of covariance for G if

$$\begin{aligned} \mathsf {F}(g.X) = U(g)\mathsf {F}(X)U(g)^* \text { for all } g \in G, X \in \mathcal {F}. \end{aligned}$$
(1)

\(\mathsf {F}\) is called a covariant pom for U. The triple \((U,\mathsf {F},\mathcal {H})\) is called a system of imprimitivity if \(\mathsf {F}\) is projection-valued. We often consider the case \(\varOmega = G\) and G abelian.

Remark 1

Systems of covariance/imprimitivity may also be defined for projective representations.

We give a definition relating to the localisability of poms—the so-called norm-1 property (see, e.g., [23]):

Definition 4

A pom \(\mathsf {E}: \mathcal {B}(G) \rightarrow \mathcal {L(H)}\) is said to satisfy the norm-1 property if \(\left\| \mathsf {E}(X)\right\| =1\) for all X for which \(\mathsf {E}(X) \ne 0\).

The following is an immediate consequence.

Lemma 1

If \(\mathsf {E}\) satisfies the norm-1 property, then for any X for which \(\mathsf {E}(X) \ne 0\), there exists a sequence of unit vectors \((\varphi _k) \subset \mathcal {H}\) for which

$$\begin{aligned} \lim _{k \rightarrow \infty } \left\langle \,\varphi _k\,{|}\,\mathsf {E}(X)\varphi _k\,\right\rangle = 1. \end{aligned}$$

This entails that such a pom gives rise to probability distributions which are (approximately) localisable in every set X for which \(\mathsf {E}(X)\ne 0\). In comparison, for a projection valued measure \(\mathsf {P}\), for any X with \(\mathsf {P}(X) \ne 0\), there is a unit vector \(\varphi \in \mathcal {H}\) for which \(\left\langle \,\varphi \,{|}\, \mathsf {P} (X)\varphi \,\right\rangle = 1\) (any unit vector in the range of \(\mathsf {P} (X)\) will have this property). Hence poms with the norm-1 property have, in a limiting sense, the localisability properties possessed by all pvms.

Remark 2

In the case of a covariant pom, we do not need to check all the subsets X to confirm the norm-1 property, as shown in the following lemma.

Lemma 2

Let \(\mathsf {E}\) be a covariant pom with \(\varOmega = G\). The following are equivalent.

  (i) \(\mathsf {E}\) has the norm-1 property.

  (ii) \(\Vert \mathsf {E}(X) \Vert =1\) for all \(X \in \mathcal {F}\) with \(e\in X\) and \(\mathsf {E}(X)\ne 0\).

  (iii) For all \(X\in \mathcal {F}\) with \(e \in X\) and \(\mathsf {E}(X) \ne 0\), there exists a sequence of unit vectors \((\varphi _k)\) such that \(\lim _k \langle \varphi _k| \mathsf {E}(X) \varphi _k\rangle =1\).

Proof

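Assume (ii). For an arbitrary \(Y \in \mathcal {F}\) with \(\mathsf {E}(Y) \ne 0\) (so that Y is non-empty), there exists \(g\in G\) such that \(e\in g.Y \in \mathcal {F}\). By the covariance condition (1), \(\mathsf {E}(g.Y) = U(g)\mathsf {E}(Y)U(g)^*\) is non-zero and has the same norm as \(\mathsf {E}(Y)\), so (ii) gives \(\Vert \mathsf {E}(Y)\Vert = \Vert \mathsf {E}(g.Y)\Vert = 1\). Thus (i) follows. The other relations are trivial (see Footnote 1). \(\square \)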

2.2.2 Positions, Momenta, and Covariant Phase Space poms

For the case \(G=\mathbb {R}\), the position operator Q, with spectral measure \(\mathsf {E}^Q : \mathcal {B}(\mathbb {R}) \rightarrow \mathcal {L}(L^2(\mathbb {R}))\) acting by multiplication, and the strongly continuous representation \(U(x)=e^{ixP}\) (P momentum) of \(\mathbb {R}\) give rise to a system of imprimitivity \((U, \mathsf {E}^Q, L^2(\mathbb {R}))\) under the covariance

$$\begin{aligned} \mathsf {E}^Q(X-x) = U(x)\mathsf {E}^Q(X)U(x)^*. \end{aligned}$$
(2)

The momentum operator P with spectral measure \(\mathsf {E}^P: \mathcal {B}(\mathbb {R}) \rightarrow \mathcal {L}(L^2(\mathbb {R}))\) satisfies, with \(V(p)=e^{-ipQ}\), the following covariance relation with respect to boosts:

$$\begin{aligned} \mathsf {E}^P(Y-p) = V(p)\mathsf {E}^P(Y)V(p)^*, \end{aligned}$$
(3)

yielding the system of imprimitivity \((V,\mathsf {E}^P, L^2(\mathbb {R}))\).

Unsharp versions of position (for instance, smeared positions (e.g., [24])) are also covariant; indeed it is such a covariance requirement that defines the class of unsharp positions (analogously for unsharp momenta). Let \(\mu \) be a probability (“confidence”) measure on \(\mathbb {R}\). A smeared position observable \(\mathsf {E}^{\mu }\) is defined as

$$\begin{aligned} \mathsf {E}^{\mu } (X) := (\mu * \mathsf {E}^Q)(X) = \int _{\mathbb {R}} \mathsf {E}^Q(X + q) d\mu (q) \end{aligned}$$
(4)

where “\(*\)” denotes convolution of measures. Under the assumption of absolute continuity we can write \(\mu (X) = \int _X e(x) dx\) and \(\mathsf {E}^{\mu }(X) \equiv \mathsf {E}^{e} (X) = (\chi _X * e)(Q)\). Such a quantity is called a smeared position observable with confidence function e. It is covariant, and in the limit that e becomes a delta function, or equivalently, the associated \(\mu \) becomes a point measure, the sharp position is returned.

An example of a necessarily unsharp covariant quantity is provided by a covariant phase-space pom \(M:\mathcal {B}(\mathbb {R}^2) \rightarrow \mathcal {L}(L^2(\mathbb {R}))\) which is both shift and boost covariant, i.e.,

$$\begin{aligned} W(q,p)M(Z)W(q,p)^*=M(Z+(q,p)), \end{aligned}$$
(5)

where \(W(q,p):=e^{(-i/2 )qp} e^{-iqP} e^{ipQ}\) are the Weyl operators. M contains unsharp positions and momenta as marginals.

We now turn to the case of \(G = S^1\), which plays a major role in the rest of the paper.

2.2.3 Covariant Phases

We identify \(S^1\) with \([0,2 \pi ]\) or occasionally \([- \pi , \pi ]\) (identifying also the endpoints of these intervals); \(\theta \mapsto U(\theta )=e^{i N\theta }\) is a strongly continuous unitary representation, for self-adjoint N, of \(S^1\) in \(L^2(S^1)\) and \(\mathsf {F}:\mathcal {B}(S^1) \rightarrow \mathcal {L}(L^2(S^1))\) is called a covariant phase pom if

$$\begin{aligned} e^{i\theta N }\mathsf {F}(X)e^{-i \theta N } =\mathsf {F}(X\dotplus \theta ), \quad \theta \in [0,2\pi ),\ X\in \mathcal {B}([0,2 \pi )) \end{aligned}$$
(6)

(where \(\dotplus \) denotes addition modulo \(2 \pi \)). There is a constraint on the spectrum of the unique self-adjoint generator N. Using the spectral representation of N, \(N=\int x \mathsf {E}^N(dx)\), we have \(\int e^{i2\pi x}\mathsf {E}^N(dx)=\mathbb {1}\) so that the spectrum of N must consist of integers. Recall that the generator N associated with a phase shift group is called a number operator. We consider three typical cases of generators N and covariant phase poms associated with them.

Example 1

Consider the canonical pair of an angular momentum component and the associated angle variable of a particle in three dimensions. In this case, \(N=L_z\) (say), where \(L_z\) generates rotations about the z axis, and as a covariant phase pom one can take the spectral measure of the self-adjoint azimuthal angle operator, \(\mathsf {E}=\mathsf {E}^\varPhi \), where \(\varPhi \psi (r,\theta ,\phi )=\phi \psi (r,\theta ,\phi )\). Note that the spectrum of N is \(\mathbb {Z}\).

Example 2

The second example is motivated by the number operator counting the eigenvalues of the harmonic oscillator Hamiltonian. The associated phase poms are covariant under rotations in phase space. These cannot be pvms due to the fact that N is bounded from below [25, 26]; the preceding example then figures naturally as the minimal Naimark extension of the canonical phase (defined presently). Thus, let \(\{e_n \}\) be an orthonormal basis in \(\mathcal {H}\simeq \ell ^2\) and \(N := \sum _{n=0}^\infty n P[e_n] \equiv \sum _{n=0}^\infty n P_n\) be a number operator. Any covariant phase pom conjugate to N is known to be of the form

$$\begin{aligned} \mathsf {F}(X) = \sum _{n,m=0}^{\infty } c_{nm} \frac{1}{2 \pi } \int _{X} e^{i(n-m) \theta } d \theta |n\rangle \langle m| \end{aligned}$$
(7)

where \((c_{nm})\) is a so-called phase matrix, that is, a positive matrix for which \(c_{nn}=1\) for all \(n \in \mathbb {N}\cup \{0\}\). The canonical phase \(\mathsf {F}^{\text {can}}\) is singled out by the condition \(c_{nm} = 1\) for all \(n,m \in \mathbb {N}\cup \{0\}\). \(\mathsf {F}^{\text {can}}\) is characterised by various optimality properties (see [27]); in particular, it satisfies the norm-1 property.

Example 3

As the third example we consider covariant phase poms in finite dimensional Hilbert spaces; one such instance is the spin phase (e.g., [24]). The norm-1 property and the associated localisability are lost when we move to the finite dimensional setting, as shown in Lemma 3.

Let \(\mathcal {H}\simeq \mathbb {C}^d\) and consider the operator \(N \in \mathcal {L(H)}\) defined by \(N = \sum _{n = 0}^{d-1}nP_n\). An example of a covariant phase pom is given by

$$\begin{aligned} \mathsf {F}(X) = \sum _{n,m = 0} ^{d-1}\frac{1}{2 \pi }\int _X e^{i(n-m) \theta } |n\rangle \langle m| d \theta . \end{aligned}$$
(8)

For a set \(X\in \mathcal {B}([0,2\pi ))\), we denote its Lebesgue measure by |X|.

Lemma 3

(Localisation Lemma) Consider a covariant phase pom \(\mathsf {F}\) in a d-dimensional Hilbert space \(\mathcal {H}\). For any \(X \in \mathcal {B}([0,2\pi ))\) \((X \ne [0,2 \pi ) )\) and for any state \(\rho \), it holds that

$$\begin{aligned} \mathrm{tr}[{\rho \mathsf {F}(X)}]\le d|X|/2\pi . \end{aligned}$$

Proof

The inequality follows immediately from \(\text {tr}\left[ \rho \mathsf {F}(X)\right] \le \text {tr}\left[ \mathsf {F}(X)\right] \) and the fact that due to the covariance condition (6) the phase distribution is uniform in the number states, i.e., for each \(\theta \),

$$\begin{aligned} \left\langle \,n\,{|}\, \mathsf {F}(X)|n\,\right\rangle = \left\langle \,n\,{|}\, \mathsf {F}(X \dotplus \theta )|n\,\right\rangle ; \end{aligned}$$

therefore \(\left\langle \,n\,{|}\, \mathsf {F}(X)|n\,\right\rangle ={|X|}/{(2\pi )}\), and so \(\text {tr}\left[ \mathsf {F}(X)\right] =d|X|/(2\pi )\). \(\square \)

We note that all covariant phase poms are absolutely continuous with respect to the Lebesgue measure. The localisation lemma puts stringent bounds on the magnitude of the localisation probability if |X| is small. Conversely, in order to get high localisation probability (close to 1) for small intervals X in a finite dimensional system, one needs to choose the dimension d to be large. We will use covariant quantities to construct reference frames in the relativisation model, where large (typically infinite-dimensional) Hilbert spaces are required for the reference system to be good, in a sense to be discussed.
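The interplay between the bound of Lemma 3 and the dimension d can be explored numerically. The sketch below (Python with NumPy) assumes the matrix elements \(\langle n | \mathsf {F}(X) | m \rangle = \frac{1}{2\pi }\int _X e^{i(n-m)\theta } d\theta \), i.e., the form (7) with \(c_{nm}=1\) restricted to \(\mathbb {C}^d\), and takes X to be an interval \([a,b)\); the helper names `phase_effect` and `best_localisation` are hypothetical. It computes the largest eigenvalue of \(\mathsf {F}(X)\), which is the supremum of \(\mathrm{tr}[\rho \mathsf {F}(X)]\) over states, and compares it with \(d|X|/2\pi \).

```python
import numpy as np

def phase_effect(d, a, b):
    """Matrix <n|F([a, b))|m> = (1/2pi) * integral_a^b exp(i(n-m)theta) dtheta on C^d."""
    n = np.arange(d)
    k = n[:, None] - n[None, :]                      # n - m
    k_safe = np.where(k == 0, 1, k)                  # avoid 0/0 on the diagonal
    off_diag = (np.exp(1j * k * b) - np.exp(1j * k * a)) / (1j * k_safe)
    return np.where(k == 0, b - a, off_diag) / (2 * np.pi)

def best_localisation(d, a, b):
    """sup over states of tr[rho F([a, b))], i.e. the largest eigenvalue."""
    return float(np.linalg.eigvalsh(phase_effect(d, a, b)).max())

a, b = 0.0, np.pi / 4                                # |X| / 2pi = 1/8
dims = [2, 4, 8, 16, 32]
best = [best_localisation(d, a, b) for d in dims]
bounds = [d * (b - a) / (2 * np.pi) for d in dims]
for d, p, bound in zip(dims, best, bounds):
    print(f"d={d:3d}  sup tr[rho F(X)] = {p:.4f}   d|X|/2pi = {bound:.4f}")
```

In this sketch the best achievable localisation probability never exceeds \(\min \{1, d|X|/2\pi \}\) and does not decrease as d grows, in line with the remark that good localisation in small sets requires large d.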

3 Symmetry

3.1 Observables as Invariant Quantities

A quantum system \(\mathcal {S}\) is constrained to behave in accordance with the symmetries of the spacetime it inhabits and to the concomitant conservation laws that arise. An upshot of such a constraint is that certain quantities require two systems for their definition or, more colloquially, require a reference frame. Henceforth, absolute quantities will be understood as those whose formal representation does not explicitly rely on such a reference, which is therefore viewed as external [7]. Such absolute quantities should not be taken to be observable (see Footnote 2): the absolute position of a system is not meaningful, but both the relative positions of parts of \(\mathcal {S}\) as a compound system and the position of \(\mathcal {S}\) (or, e.g., its centre of mass) relative to some other system \(\mathcal {R}\) are. Relative position is a shift-invariant quantity. We proceed with the hypothesis that what can be measured is invariant with respect to the relevant transformation group, with particular emphasis on the group of phase shifts.

Thus, a (locally compact) symmetry group G acts in the Hilbert space \(\mathcal {H}_{\mathcal {S}}\) of the system \(\mathcal {S}\) via a (strongly continuous, projective) unitary representation U. In non-relativistic quantum mechanics G (the spacetime symmetry group) is the Galilei group, and the stipulation of symmetry is that any pom \(\mathsf {E}\) of \(\mathcal {S}\) to be deemed observable must satisfy \(U(g)\mathsf {E}(X)U(g)^* = \mathsf {E}(X)\) for all \(g \in G\) and \(X\in \mathcal {B}(\varOmega )\) (\(\varOmega \) is any appropriate G-space). In this paper we simplify the problem, treating only unitary representations, and focus on shifts in one dimension (\(G=\mathbb {R}\)), and rotations (\(G=S^1\)). The latter case has a spacetime realisation as rotations about an axis, and an internal realisation as shifts in phase of, say, a laser beam.

3.2 Number and Phase

Consider a (possibly unbounded) number operator \(N_{\mathcal {S}}=\sum _n n P_n\) acting in \(\mathcal {H}_{\mathcal {S}}\), generating a strongly continuous unitary representation \(U_{\mathcal {S}}\) of \(S^1\) in \(\mathcal {H}_{\mathcal {S}}\) via the unitary operators \(U_{\mathcal {S}}(\theta ):=e^{iN_{\mathcal {S}} \theta }\), giving rise by conjugation to an action on \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\), i.e., \(A \mapsto e^{iN_{\mathcal {S}} \theta } A e^{-iN_{\mathcal {S}} \theta }\), and on states \(\rho \mapsto e^{-iN_{\mathcal {S}} \theta } \rho e^{iN_{\mathcal {S}} \theta }\).

Consider the mapping \(\tau _{\mathcal {S}}:\mathcal {L}(\mathcal {H}_{\mathcal {S}})\rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) defined by

$$\begin{aligned} \tau _{\mathcal {S}}(A)=\sum _n P_n A P_n, \end{aligned}$$
(9)

with its predual \(\tau _{\mathcal {S}}{_*} : \mathcal {L}_1(\mathcal {H}_{\mathcal {S}}) \rightarrow \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\) taking the same form:

$$\begin{aligned} \tau _{\mathcal {S}}{_*}(\rho )=\sum _n P_n \rho P_n. \end{aligned}$$
(10)

\(\tau _{\mathcal {S}}{_*}\) is familiar from various contexts. In the quantum theory of measurement, it is the Lüders map arising from a non-selective measurement of \(N_{\mathcal {S}}\); in an optical setting it is called a dephasing channel. It also appears in decoherence theory. \(\tau _{\mathcal {S}}{_*}\) is positive and trace-preserving, and hence bounded and trace-norm continuous. It can be shown that for any pure state \(P[\phi ]\), \(\tau _{\mathcal {S}}{_*}(P[\phi ])\) is the mixture of \(N_{\mathcal {S}}\)-eigenstates which minimises the Hilbert–Schmidt distance from \(P[\phi ]\); we omit the proof.

Proposition 1

Let \(\mu \) denote the (normalised) Haar measure on \(S^1\). For self-adjoint \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) the following are equivalent:

  1. \([A,P_n]=0\) for all n (and thus \([A,N_{\mathcal {S}}]=0\) for bounded \(N_{\mathcal {S}}\)).

  2. \(U(\theta )AU(\theta )^*=A\) for all \(\theta \).

  3. \(\int U(\theta ) A U(\theta )^* d\mu (\theta ) = A\).

  4. \(\tau _{\mathcal {S}}(A) = A\).

For \(\rho \in \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\), the following are also equivalent:

  (5) \([\rho , P_n]=0\) for all n (and thus \([\rho ,N_{\mathcal {S}}]=0\) for bounded \(N_{\mathcal {S}}\)).

  (6) \(U(\theta )\rho U(\theta )^*=\rho \) for all \(\theta \).

  (7) \(\int U(\theta )^* \rho U(\theta ) d\mu (\theta ) = \rho \).

  (8) \(\tau _{\mathcal {S}}{_*}(\rho ) = \rho \).
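The equivalences of Proposition 1 can be checked numerically in finite dimensions. The following Python sketch (NumPy; the names `tau` and `twirl` and the chosen spectrum are illustrative assumptions) compares \(\sum _n P_n A P_n\) with the Haar average of \(U(\theta )AU(\theta )^*\). For an integer spectrum the average over \(S^1\) is evaluated on a uniform grid of K points with K exceeding the largest eigenvalue difference, which reproduces the integral exactly because every non-zero Fourier mode of the integrand then sums to zero.

```python
import numpy as np

def tau(A, spectrum):
    """tau_S(A) = sum_n P_n A P_n for N = diag(spectrum)."""
    spectrum = np.asarray(spectrum)
    out = np.zeros_like(A, dtype=complex)
    for n in np.unique(spectrum):
        P = np.diag((spectrum == n).astype(float))   # spectral projection P_n
        out += P @ A @ P
    return out

def twirl(A, spectrum):
    """Haar average of U(theta) A U(theta)^*, with U(theta) = exp(i N theta)."""
    spectrum = np.asarray(spectrum)
    K = int(spectrum.max() - spectrum.min()) + 1     # grid size making the sum exact
    out = np.zeros_like(A, dtype=complex)
    for theta in 2 * np.pi * np.arange(K) / K:
        U = np.diag(np.exp(1j * spectrum * theta))
        out += U @ A @ U.conj().T
    return out / K

spectrum = [0, 0, 1, 2, 2]                  # degenerate integer spectrum of N
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
A = B + B.conj().T                          # generic self-adjoint operator
A_inv = tau(A, spectrum)

N = np.diag(spectrum)
print("tau equals Haar average:", np.allclose(twirl(A, spectrum), A_inv))
print("tau(A) commutes with N:", np.allclose(A_inv @ N, N @ A_inv))
print("tau is idempotent:", np.allclose(tau(A_inv, spectrum), A_inv))
```

A generic A is changed by `tau`, whereas its image is a fixed point, commutes with N and is invariant under every \(U(\theta )\), as the proposition asserts.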

Proposition 2

The following hold (prime denoting commutant).

  1. For any \(A \in \{P_n\}^{\prime }\), \(p_{\rho }^{A}(X) = p^A_{\tau _{\mathcal {S}}{_*}(\rho )}(X)\).

  2. For any \(\rho \in \{P_n\}^{\prime }\), \(p_{\rho }^{A}(X) = p^{\tau _{\mathcal {S}}(A)}_{\rho }(X)\).

We omit the proof of (1) which is straightforward, and note that (2) follows from the duality

$$\begin{aligned} \text {tr}\left[ \rho \tau _{\mathcal {S}}(\mathsf {E}^A(X))\right] = \text {tr}\left[ \tau _{\mathcal {S}}{_*}(\rho ) \mathsf {E}^A(X)\right] = \text {tr}\left[ \tau _{\mathcal {S}}{_*}(\rho ) \tau _{\mathcal {S}}(\mathsf {E}^A(X))\right] ; \end{aligned}$$
(11)

the final equality is not part of the proof, but shows that demanding states or observables to be invariant is equivalent to demanding the invariance of both. The proposition holds also if appropriately rephrased for unsharp \(\mathsf {E}\) in place of A. This shows that no invariant quantity (i.e., no bona fide observable) of \(\mathcal {S}\) can distinguish between \(\rho \) and its invariant “counterpart”, \(\tau _{\mathcal {S}}{_*}(\rho )\). Operationally, then, the stipulation of invariance of observables partitions the state space into equivalence classes of indistinguishable states under the obvious equivalence relation. In the dual picture, keeping to invariant states means that absolute and invariant quantities cannot be distinguished. Thus, stipulating that either states or observables must be invariant constitutes a restriction of ordinary quantum theory, and only if both states and observables are unrestricted do we have the usual textbook description.
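The duality (11) and the resulting indistinguishability of \(\rho \) and \(\tau _{\mathcal {S}}{_*}(\rho )\) can be illustrated with a small numerical sketch (Python with NumPy). The three-level system with non-degenerate number operator, the uniform superposition state and the helper names `dephase` and `expval` are assumptions made purely for illustration; expectation values are used in place of the full distributions \(p^A_\rho \).

```python
import numpy as np

def dephase(X, projections):
    """sum_n P_n X P_n; the same formula serves for tau_S and for its predual."""
    return sum(P @ X @ P for P in projections)

def expval(state, obs):
    """Expectation value tr[state obs]."""
    return np.trace(state @ obs).real

d = 3
projections = [np.diag(np.eye(d)[n]) for n in range(d)]   # P_n = |n><n|
phi = np.ones(d) / np.sqrt(d)
rho = np.outer(phi, phi)                     # pure state, coherent across number states
rho_inv = dephase(rho, projections)          # its invariant counterpart

rng = np.random.default_rng(1)
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = B + B.conj().T                           # generic (non-invariant) self-adjoint A
A_inv = dephase(A, projections)

# The three expressions appearing in Eq. (11):
print(expval(rho, A_inv), expval(rho_inv, A), expval(rho_inv, A_inv))
# A non-invariant effect (here rho itself) does separate the two states:
print(expval(rho, rho), expval(rho_inv, rho))
```

The first line prints three coinciding numbers, while the second shows that only a non-invariant quantity can tell \(\rho \) apart from its dephased counterpart.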

3.3 Position and Momentum

The shift group on \(\mathbb {R}\) is unitarily implemented in \(L^2(\mathbb {R})\) by the operators \(U(x)=e^{ixP}\), with P the momentum operator in one space dimension. The spectral measure \(\mathsf {E}^Q\) of position Q is singled out (among spectral measures) by the condition \(\mathsf {E}^Q(X-x) = U(x)\mathsf {E}^Q(X)U(x)^*\). Unsharp positions also satisfy such a covariance criterion, and thus, as non-invariant quantities, absolute positions (sharp or unsharp) do not represent observable quantities, reflecting the lack of absolute space.

However, it may be possible to distinguish separate parts of a given quantum system, \(\mathcal {S}\) and \(\mathcal {R}\), and it may be possible to speak of the position of \(\mathcal {S}\) relative to \(\mathcal {R}\), and therefore to measure the shift-invariant quantity \(Q_{\mathcal {S}}\otimes \mathbb {1} - \mathbb {1} \otimes Q_{\mathcal {R}}\) or, indeed, any other shift-invariant quantity of \(\mathcal {S}+ \mathcal {R}\). Similar considerations apply to boosts.

Remark 3

The non-compactness of (the shift group on) \(\mathbb {R}\) rules out the existence of a normalisable Haar measure playing the role of \(\mu \) in Proposition 1 (as also pointed out recently by Smith et al. [28]).

Therefore, the absolute position \(Q_{\mathcal {S}}\) should be understood as representing the relative position \(Q_{\mathcal {S}} - Q_{\mathcal {R}}\), in the situation that the \(Q_{\mathcal {R}}\) system may be suppressed, or “externalised” [7] from the description. We now turn to a general analysis of the possibility of such an externalisation for arbitrary groups and relative quantities, before turning once more to typical examples.

4 Relativisation

In this section we introduce a relativisation mapping \(\yen \), and prove various mathematical properties satisfied by it. We discuss the physical interpretation of \(\yen \) as the making explicit of a reference system, and in the following section show that under high localisation of the reference system with respect to an appropriate covariant quantity used to define \(\yen \), the description of the system alone in terms of absolute, non-invariant quantities can provide a statistically good account of the relative quantities. Conversely, it is shown that in the case of reference system delocalisation, the description afforded by system quantities is necessarily invariant, giving generally poor representation of relative observables. The \(\yen \) map generalises (by considering a pom for the reference and more general groups) and makes mathematically precise (by avoiding improper states, and giving rigorous meaning to the integral) the “\({\$}\)” map of [7]. We also introduce the predual, \(\yen _*\), which de-relativises states, replacing the erroneous use of \({\$}\) also on states in [7].

4.1 Definition and Properties of the Map \(\yen \)

Definition 5

Let \(\mathcal {H}_{\mathcal {T}}=\mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\) with \(\mathcal {H}_{\mathcal {S}},\mathcal {H}_{\mathcal {R}}\) finite or infinite dimensional Hilbert spaces hosting strongly continuous unitary representations \(U_{\mathcal {S}}\) and \(U_{\mathcal {R}}\) respectively of a locally compact metrisable group G and let \(\mathsf {F}\) be a covariant pom acting in \(\mathcal {H}_{\mathcal {R}}\). Then \(\yen :\mathcal {L}(\mathcal {H}_{\mathcal {S}}) \rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) is defined by

$$\begin{aligned} \yen (A)= \int _G U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^* \otimes \mathsf {F}(dg). \end{aligned}$$
(12)

\(\yen \) will be called a relativisation map; various of its properties are given in Proposition 3. \(\yen \) also acts on poms by \((\yen \circ \mathsf {E})(X):=\yen (\mathsf {E}(X))\).

We must first give the definition of this integral. If \(\mathcal {H}_{\mathcal {R}}\) is finite-dimensional and G is compact and metrisable, there exists a unique positive T such that \(\mathsf {F}(X) = \kappa \int _X U_{\mathcal {R}}(g)TU_{\mathcal {R}}(g)^* dg\) where dg is the Haar measure and \(\kappa \) is chosen so that \(\int _G U_{\mathcal {R}}(g)TU_{\mathcal {R}}(g)^* dg = \mathbb {1}\) [29]. Then [4], the integral (12) may be defined by

$$\begin{aligned} \yen (A)= \kappa \int _G U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^* \otimes U_{\mathcal {R}}(g)TU_{\mathcal {R}}(g)^* dg. \end{aligned}$$
(13)
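
As a minimal numerical sketch of (13), one may replace the compact group by the finite cyclic group \(\mathbb {Z}_d\), so that the Haar integral becomes a finite sum. The choices below are illustrative assumptions and are not taken from the text: \(\mathcal {H}_{\mathcal {S}}=\mathcal {H}_{\mathcal {R}}=\mathbb {C}^d\), both carrying the shift representation \(U(g)|x\rangle = |x+g\rangle \), and \(T = |0\rangle \langle 0|\), so that \(\mathsf {F}(\{g\}) = |g\rangle \langle g|\). The helper names `shift` and `yen` are hypothetical.

```python
import numpy as np

d = 4  # order of the cyclic group Z_d (illustrative choice)

def shift(g):
    """Unitary U(g) on C^d with U(g)|x> = |x + g mod d>."""
    return np.roll(np.eye(d), g, axis=0)

def yen(A):
    """Discrete analogue of (13): sum_g U(g) A U(g)^* (x) U(g) T U(g)^*, with T = |0><0|."""
    T = np.zeros((d, d))
    T[0, 0] = 1.0
    out = np.zeros((d * d, d * d), dtype=complex)
    for g in range(d):
        U = shift(g)
        out += np.kron(U @ A @ U.conj().T, U @ T @ U.conj().T)
    return out

rng = np.random.default_rng(0)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# The identity is mapped to the identity on the composite space.
unital = np.allclose(yen(np.eye(d)), np.eye(d * d))
# Adjoints are preserved.
adjoint_ok = np.allclose(yen(A.conj().T), yen(A).conj().T)
# The operator norm does not increase.
norm_ok = np.linalg.norm(yen(A), 2) <= np.linalg.norm(A, 2) + 1e-12
```

In this toy model the normalisation constant \(\kappa \) and the weight of the normalised counting measure cancel, which is why no prefactor appears in the sum.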

For the case that \(\mathcal {H}_{\mathcal {S}}\) and \(\mathcal {H}_{\mathcal {R}}\) are of infinite dimension, more work is required (see also [30, Sect. 5.4]). Let G be a locally compact second countable group. We first construct \(\yen \) for a subset \(\mathcal {A} \subset \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) on which the action \(\alpha _g\) is norm continuous, noting that this subset is weakly dense; see the discussion below.

Fix \(A \in \mathcal {A}\). For any \(\epsilon >0\) there exists a neighbourhood U of the identity \(e \in G\) such that \(\Vert \alpha _g(A) - A\Vert \le \epsilon \Vert A\Vert \) for all \(g \in U\). By translating this U we obtain a covering \(G= \cup _{g \in G} g U\). The Lindelöf property of G allows us to obtain a countable cover \(G= \cup _i U_i\). By taking intersections we obtain a disjoint countable cover (a mesh) \(G= \cup _n V_n\) such that for any \(g, g' \in V_n\) it holds that \(\Vert \alpha _g(A) - \alpha _{g'}(A) \Vert \le \epsilon \Vert A\Vert \). Let \(\{\epsilon _{N}\}\) be a decreasing sequence converging to 0. By employing the above construction we can construct a mesh \(G= \cup _n V^{N}_n\) for each N so that \(V^N_n = \cup _{m \in H^N_M} V^{M}_m\) holds for each \(N\le M\) with some proper \(H^N_M \subset \mathbf {N}\). (That is, the mesh for M refines the mesh for N.) We choose \(g^N_n \in V^N_n\) for each n (and N). Now we assume a covariant pom \(\mathsf {E}(\cdot )\) acting in \(\mathcal {H}_{\mathcal {R}}\) to be projection-valued. This suffices since any pom \(\mathsf {F}\) can be dilated to a pvm by Naimark extension.

We introduce for each N, the mapping

$$\begin{aligned} \yen _N(A) := \sum _{n} \alpha _{g^N_n}(A) \otimes \mathsf {E}(V^N_n). \end{aligned}$$

It is easy to see that this is bounded. In fact, for an arbitrary normalised vector \(|\psi \rangle \in \mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\), it holds that

$$\begin{aligned} \langle \psi | \yen _N(A)^* \yen _N(A)|\psi \rangle = \sum _n \text{ tr }[\rho ^N_n \alpha _{g^N_n}(A^*A) ] p^N_n \le \sum _n p^N_n \Vert A\Vert ^2 = \Vert A\Vert ^2, \end{aligned}$$

where \(p^N_n := \langle \psi | \mathbb {1} \otimes \mathsf {E}(V^N_n) |\psi \rangle \) and \(\rho ^N_n\) is a density operator uniquely determined by \(\text{ tr }[\rho ^N_n X] = \langle \psi |X\otimes \mathsf {E}(V^N_n) |\psi \rangle \).

Now we show that the sequence \(\{\yen _N(A)\}\) is a Cauchy sequence. Since an arbitrary bounded operator A can be decomposed into two self-adjoint operators, it suffices to show the property for self-adjoint A. For \(\epsilon >0\), we show that \(\Vert \yen _N(A) - \yen _M(A) \Vert \le \epsilon \Vert A\Vert \) for \(\epsilon _N, \epsilon _M \le \epsilon \). Let \(M>N\). For an arbitrary normalised \(|\psi \rangle \in \mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\), we have

$$\begin{aligned} |\langle \psi | \yen _M (A) - \yen _N(A) |\psi \rangle |= & {} \left| \sum _n p^N_n \text{ tr }[\rho ^N_n \alpha _{g^N_n}(A)] - \sum _m p^M_m \text{ tr }[\rho ^M_m \alpha _{g^M_m}(A)] \right| \\\le & {} \sum _n \left| p^N_n \text{ tr }[\rho ^N_n \alpha _{g^N_n}(A)] - \sum _{m \in H^N_M} p^M_m \text{ tr }[\rho ^M_m \alpha _{g^M_m}(A)]\right| \\\le & {} \sum _n \sum _{m\in H^N_M} p^M_m \left| \text{ tr }[\rho ^M_m (\alpha _{g^N_n}(A) - \alpha _{g^M_m}(A))] \right| \le \epsilon \Vert A\Vert , \end{aligned}$$

where we used \(\sum _{m\in H^N_M} p^M_m \rho ^M_m = p^N_n \rho ^N_n\). Thus for self-adjoint A, we find \(\Vert \yen _M(A) - \yen _N(A)\Vert \le \epsilon \Vert A\Vert \). Thus we can define \(\yen (A)\) by \(\yen (A):=\lim _N \yen _N(A)\).
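
The Cauchy estimate can be illustrated numerically once the reference part is evaluated in a fixed state, so that \(\mathsf {E}(V^N_n)\) is replaced by the number \(\mu (V^N_n)\); since this evaluation does not increase the norm, the same bound applies. The following sketch rests on assumptions that are not part of the text: \(G=U(1)\), a three-level system with \(U(\theta )=e^{i\theta N}\) and \(N=\mathrm {diag}(0,1,2)\), a reference measure with density \((1+\cos \theta )/2\pi \), dyadic meshes of \(2^k\) equal arcs, and left endpoints as the chosen points \(g^N_n\). For arcs of width h one has \(\Vert \alpha _g(A)-\alpha _{g'}(A)\Vert \le 2h(d-1)\Vert A\Vert \), which plays the role of \(\epsilon _N\Vert A\Vert \).

```python
import numpy as np

d = 3
N_op = np.arange(d)                       # spectrum of the generator N
rng = np.random.default_rng(1)
A = rng.normal(size=(d, d))
A = A + A.T                               # a self-adjoint test operator

def alpha(theta):
    """alpha_theta(A) = U(theta) A U(theta)^* with U(theta) = exp(i theta N)."""
    u = np.exp(1j * theta * N_op)
    return (u[:, None] * A) * u.conj()[None, :]

def cdf(theta):
    """Distribution function of the assumed density (1 + cos theta) / (2 pi)."""
    return (theta + np.sin(theta)) / (2 * np.pi)

def mesh_sum(level):
    """sum_n alpha_{g_n}(A) mu(V_n) over 2**level equal arcs, g_n the left endpoint."""
    edges = np.linspace(0.0, 2 * np.pi, 2**level + 1)
    weights = cdf(edges[1:]) - cdf(edges[:-1])
    return sum(w * alpha(g) for w, g in zip(weights, edges[:-1]))

def op_norm(X):
    return np.linalg.norm(X, 2)

levels = range(2, 9)
# Distance between consecutive mesh sums, and the bound epsilon_N * ||A||.
gaps = [op_norm(mesh_sum(k + 1) - mesh_sum(k)) for k in levels]
bounds = [2 * (2 * np.pi / 2**k) * (d - 1) * op_norm(A) for k in levels]
```

Each entry of `gaps` is dominated by the corresponding entry of `bounds`, which halves with every refinement, so the mesh sums form a Cauchy sequence in this example.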

Note that this definition does not depend on the choice of covers. In fact one can see that for two covers \(\{V^N_n\}\) and \(\{\hat{V}^N_n\}\) for \(\epsilon _N\), their intersections \(\{V^N_n \cap \hat{V}^N_m\}\) also form a cover. It is easy to see that the difference between the \(\yen _N(A)\) constructed with \(\{V^N_n\}\) (\(\{\hat{V}^N_n\}\)) and \(\{V^N_n \cap \hat{V}^N_m\}\) is smaller than \(\epsilon \Vert A\Vert \).

Now we discuss the density of \(\mathcal {A}\) in \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\). We assume that the action is weakly continuous—a natural assumption from a physical point of view. In addition the action is assumed to be implemented by a unitary operator U(g). Then one can see that the representation U(g) is strongly continuous. We can define, for a smooth function f whose support is compact in G, \(A(f) := \int \mu (dg) U(g)AU(g)^* f(g)\). For such A(f) the action \(\alpha _g\) is norm continuous. We may introduce \(\mathcal {A}\) as a subalgebra generated by such elements. If we take f to be localised around the unit of G, A(f) gets close to A with respect to the weak topology. Thus \(\mathcal {A}\) is dense in \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\).

As a final remark, we show that this \(\yen (A)\) can be defined on the whole \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\) for Abelian G implemented by a true unitary representation. As each U(g) commutes with each \(U(g^{\prime })\), their generators can be diagonalised simultaneously. For simplicity we treat here only \(G=\mathbb {R}\) and write its generator as \(K = \int _{\mathbb {R}} k \mathsf {P}(dk)\). Now we introduce \(P_E= \int _{|k|\le E} \mathsf {P}(dk)\). Then one can see that for any \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\), \(P_E A P_E\) is a “smooth” element (i.e., \(\alpha _g(P_E AP_E)\) is norm continuous) and \(\yen (P_E A P_E)\) is defined. In addition its norm is bounded as \(\Vert \yen (P_E A P_E) \Vert \le \Vert A \Vert \). Now for arbitrary vectors \(|\psi \rangle , |\phi \rangle \in \mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\), we define a sesquilinear form \(Q(|\phi \rangle , |\psi \rangle )\) by \(\lim _{E\rightarrow \infty } \langle \phi | \yen (P_E A P_E) |\psi \rangle \). We first confirm that this is well-defined. For any \(\epsilon >0\), there exists \(E_0\) such that \(\Vert (\mathbb {1} - P_{E_0})|\psi \rangle \Vert , \Vert (\mathbb {1} - P_{E_0})|\phi \rangle \Vert \le \epsilon \). Then for any \(E\ge E' \ge E_0\), we have

$$\begin{aligned}&\left| \langle \psi | \yen (P_EA P_E) |\phi \rangle - \langle \psi | \yen (P_{E'} A P_{E'}) | \phi \rangle \right| \\&= \left| \langle \psi | \yen (P_E A P_E) |\phi \rangle - \langle \psi | P_{E'} \yen (P_E A P_E ) P_{E'} | \phi \rangle \right| \le (2\epsilon + \epsilon ^2 )\Vert A \Vert . \end{aligned}$$

Thus this sequence is Cauchy. On the other hand, each quantity is bounded by \(\Vert A\Vert \Vert |\psi \rangle \Vert \Vert |\phi \rangle \Vert \). Thus it converges. It is also easy to see that this sesquilinear form is bounded as \(|Q(|\psi \rangle , |\phi \rangle )| \le \Vert A\Vert \Vert |\psi \rangle \Vert \Vert |\phi \rangle \Vert \). Thus there exists an operator \(\yen (A)\) satisfying \(\langle \psi |\yen (A) |\phi \rangle = Q(|\psi \rangle , |\phi \rangle )\). Moreover, it is easy to see that the operator \(\yen (A)\) so defined is bounded as \(\Vert \yen (A) \Vert \le \Vert A\Vert \).

Proposition 3

\(\yen : \mathcal {L}(\mathcal {H}_{\mathcal {S}})\rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) has the following properties.

  1. 1.

    \(\yen \) is linear, unital (\(\yen (\mathbb {1} _{\mathcal {H}_{\mathcal {S}}}) = \mathbb {1} _{\mathcal {H}_{\mathcal {T}}}\)), and preserves adjoints (\( \yen (A^*) = \yen (A)^*\) for any \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\)).

  2. 2.

    \(\yen \) is positive and hence bounded.

  3. 3.

    \(\yen \) is completely positive.

  4. 4.

    \(\yen \) is the dual of a bounded linear map \(\yen _*: \mathcal {L}_1 (\mathcal {H}_{\mathcal {T}}) \rightarrow \mathcal {L}_1 (\mathcal {H}_{\mathcal {S}})\) defined by \(\mathrm{tr} \left[ R \yen (A) \right] =\mathrm{tr} \left[ \yen _* (R) A \right] \) for all \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}}),~R\in \mathcal {L}_1 (\mathcal {H}_{\mathcal {T}})\). In particular, \(\yen \) is normal.

  5. 5.

    If E is an effect, so is \(\yen (E)\). With \(\mathsf {E}: \mathcal {B}(G) \rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {S}})\), \(\mathsf {E}\mapsto \yen \circ \mathsf {E}\equiv \mathsf {E}^{(\yen )}\) defines a map from \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\)-valued poms to \(\mathcal {L}(\mathcal {H}_{\mathcal {T}})\)-valued poms.

  6. 6.

    If \(\mathsf {F}\) is projection-valued, \(\yen \) is multiplicative, i.e., \(\yen (AB) = \yen (A) \yen (B)\) for all \(A,B \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\)—and is thus an algebraic \(^*\)-homomorphism.

  7. 7.

    With \(\mathcal {U}(g) = U_{\mathcal {S}}(g)\otimes U_{\mathcal {R}}(g)\),

    $$\begin{aligned} \mathcal {U}(g)\yen (A) \mathcal {U}(g)^* = \yen (A) \quad \text {for all }A \in \mathcal {L}(\mathcal {H}_S),\quad g \in G. \end{aligned}$$
    (14)

    If \(U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^*=A\) for all \(g \in G\), then \(\yen (A) = A \otimes \mathbb {1}\).

Proof

  1. 1.

    These properties follow immediately from the definition.

  2. 2.

    To prove positivity, consider \(\left\langle \,\psi \,{|}\,\yen (A) \psi \,\right\rangle \) for \(\psi \in \mathcal {H}_{\mathcal {T}}\) and assume that A is positive, therefore \(A=B^2\) for a (unique) \(B \ge 0\). Let \(\{\varphi _i \otimes \phi _j \}\) be an orthonormal basis in \(\mathcal {H}_{\mathcal {S}} \otimes \mathcal {H}_{\mathcal {R}}\), and \(\psi = \sum _{i,j}c_{ij}\varphi _i \otimes \phi _j\). Let \(\gamma _{g}(A) \equiv U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^*\); we then have

    $$\begin{aligned} \left\langle \,\psi \,{|}\,\yen (A) \psi \,\right\rangle = \sum _{i,j,k,l}\bar{c}_{ij} c_{kl} \int _G \left\langle \,\varphi _i\,{|}\,\gamma _{g}(A) \varphi _k\,\right\rangle \left\langle \,\phi _j\,{|}\, \mathsf {F}(dg) \phi _l\,\right\rangle . \end{aligned}$$
    (15)

    Writing \(\gamma _{g} (A) = U_{\mathcal {S}}(g)B^2 U_{\mathcal {S}}(g)^*\) and \(B^2 = \sum _m B |\varphi _m\rangle \langle \varphi _m|B\), the expression for \(\left\langle \,\psi \,{|}\,\yen (A) \psi \,\right\rangle \) becomes

    $$\begin{aligned} \left\langle \,\psi \,{|}\,\yen (A) \psi \,\right\rangle = \sum _m \int _G \left\langle \,\xi _{m}(g)\,{|}\, \mathsf {F}(dg) \xi _m (g)\,\right\rangle , \end{aligned}$$
    (16)

    where we have defined \(\xi _{m}(g) := \sum _{k,l}c_{kl}\left\langle \,\varphi _m\,{|}\,BU_{\mathcal {S}}(g)^* \varphi _k\,\right\rangle \phi _l\). The right hand side of the expression (16) is manifestly positive. Any positive (linear) map between \(C^*\)-algebras with unit is automatically bounded—see [31, Proposition 33.4]. Now we note that the right hand side of (16) can be written

    $$\begin{aligned} \text {tr}\left[ \int _G \sum _m |\xi _m(g)\rangle \langle \xi _{m}(g)| \mathsf {F}(dg)\right] . \end{aligned}$$
    (17)
  3. 3.

    In order to show that \(\yen \otimes \mathbb {1}_n : \mathcal {L}(\mathcal {H}_{\mathcal {S}})\otimes M_n(\mathbb {C}) \rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {S}})\otimes \mathcal {L}(\mathcal {H}_{\mathcal {R}})\otimes M_n(\mathbb {C})\) is positive we introduce an orthonormal basis \(\{ \eta _k\} \subset \mathbb {C}^n\). Then the proof runs along essentially the same lines as (2) above, and one finds that for \(\varPsi = \sum _{i,j,k}c_{ijk}\varphi _i \otimes \phi _j \otimes \eta _k\), and repeating the argument, letting \(A=B^2 = \sum _pB |\varphi _p\rangle \langle \varphi _p|B\), \(\left\langle \,\varPsi \,{|}\,\yen (A) \otimes \mathbb {1} \varPsi \,\right\rangle \) is given as

    $$\begin{aligned}&\left\langle \,\varPsi \,{|}\,\yen (A) \otimes \mathbb {1} \varPsi \,\right\rangle \nonumber \\&\quad = \sum _{i,j,k,l,m,n}\bar{c}_{ijk} c_{lmn}\int _G \left\langle \,\varphi _i\,{|}\,U(g)B^2U(g)^* \varphi _l\,\right\rangle \left\langle \,\phi _j\,{|}\,\mathsf {F}(dg)\phi _m\,\right\rangle \left\langle \,\eta _k\,{|}\,\eta _n\,\right\rangle \end{aligned}$$
    (18)

    which may be written as

    $$\begin{aligned} \left\langle \,\varPsi \,{|}\,\yen (A) \otimes \mathbb {1} \varPsi \,\right\rangle = \sum _p \int _G \left\langle \,\zeta _p(g)\,{|}\,\mathsf {F}(dg) \zeta _p(g)\,\right\rangle , \end{aligned}$$
    (19)

    where \(\zeta _p(g) := \sum _{l,m,n} c_{lmn}\left\langle \,\varphi _p\,{|}\,BU(g)^*\varphi _l\,\right\rangle \phi _m \otimes \eta _n\). Thus, by the same argument as in (2), \(\yen \) is completely positive.

  4. 4.

    The normality of \(\yen \) follows from \(\yen \) being the dual of the positive linear map \(\yen _* :\mathcal {S}(\mathcal {H}_{\mathcal {T}}) \rightarrow \mathcal {S}(\mathcal {H}_{\mathcal {S}})\) [29, Lemma 2.2.].

  5. 5.

    The effect property \(0 \le \yen (E) \le \mathbb {1}\) follows immediately from 1 and 2. That \(\yen \circ \mathsf {E}\) has the \(\sigma \)-additivity property of a pom follows from the normality of \(\yen \).

  6. 6.

    Let \(\mathsf {F}\) be projection valued. For \(A,B \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\), let \(f_{\varphi , \varphi ^{\prime }}(g, g^{\prime })\) denote the complex bounded function \(\left\langle \,\varphi \,{|}\,\gamma _g(A)\gamma _{g^{\prime }}(B)\varphi ^{\prime }\,\right\rangle \):

    $$\begin{aligned} \left\langle \,\varphi \otimes \phi \,{|}\,\yen (A)\yen (B)\varphi ^{\prime } \otimes \phi ^{\prime }\,\right\rangle =\int \int f_{\varphi , \varphi ^{\prime }}(g,g^{\prime }) \left\langle \,\phi \,{|}\,\mathsf {F}(d g) \mathsf {F}(d g ^{\prime }) \phi ^{\prime }\,\right\rangle . \end{aligned}$$
    (20)

    The product measure defined by \(X\times Y \mapsto \mathsf {F}(X)\mathsf {F}(Y)\) is zero whenever \(X \cap Y = \emptyset \), and hence the right hand side of (20) reduces to

    $$\begin{aligned} \int \left\langle \,\varphi \,{|}\,U_{\mathcal {S}}(g) (AB) U_{\mathcal {S}}(g)^* \varphi ^{\prime }\,\right\rangle \left\langle \,\phi \,{|}\,\mathsf {F}(d g) \phi ^{\prime }\,\right\rangle , \end{aligned}$$
    (21)

    which is the expression for \(\left\langle \,\varphi \otimes \phi \,{|}\,\yen (AB) \varphi ^{\prime }\otimes \phi ^{\prime }\,\right\rangle \).

  7. 7.

    We compute:

    $$\begin{aligned} \mathcal {U}(g)\yen (A)\mathcal {U}(g)^*&= \int _G {U}_{\mathcal {S}}(gg^{\prime })A {U}_{\mathcal {S}}(gg^{\prime })^* \otimes \mathsf {F}(d(gg^{\prime })) = \yen (A). \end{aligned}$$

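    The second statement follows because, for invariant A, the integrand in (12) is constant, so that \(\yen (A) = A \otimes \mathsf {F}(G) = A \otimes \mathbb {1}\). Therefore to each bounded self-adjoint operator of the system the map \(\yen \) assigns a bounded self-adjoint operator (see below) \(\yen (A)\) acting in \(\mathcal {H}_{\mathcal {T}}\) which is invariant under the action of \(\mathcal {U}\).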

\(\square \)
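
Properties 6 and 7 can be checked numerically in a finite toy model. The sketch below assumes, purely for illustration, the cyclic group \(\mathbb {Z}_d\) acting by shifts on \(\mathcal {H}_{\mathcal {S}}=\mathcal {H}_{\mathcal {R}}=\mathbb {C}^d\) and the projection-valued \(\mathsf {F}(\{g\}) = |g\rangle \langle g|\); the function names are hypothetical.

```python
import numpy as np

d = 4

def shift(g):
    """Unitary U(g) on C^d with U(g)|x> = |x + g mod d>."""
    return np.roll(np.eye(d), g, axis=0)

def yen(A):
    """sum_g U(g) A U(g)^* (x) |g><g| for the sharp covariant F on Z_d."""
    out = np.zeros((d * d, d * d), dtype=complex)
    for g in range(d):
        U = shift(g)
        proj = np.zeros((d, d))
        proj[g, g] = 1.0
        out += np.kron(U @ A @ U.conj().T, proj)
    return out

rng = np.random.default_rng(2)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# Property 6: multiplicativity for projection-valued F.
multiplicative = np.allclose(yen(A @ B), yen(A) @ yen(B))

# Property 7, Eq. (14): invariance under U_S(h) (x) U_R(h) for every h.
def conjugate(h, X):
    V = np.kron(shift(h), shift(h))
    return V @ X @ V.conj().T

invariant = all(np.allclose(conjugate(h, yen(A)), yen(A)) for h in range(d))

# Property 7, second part: a shift-invariant (circulant) A gives A (x) 1.
coeffs = rng.normal(size=d)
A_inv = sum(c * shift(g) for g, c in enumerate(coeffs))
trivial_on_invariant = np.allclose(yen(A_inv), np.kron(A_inv, np.eye(d)))
```

Multiplicativity relies on \(\mathsf {F}\) being sharp in this model; replacing the projections \(|g\rangle \langle g|\) by a smeared covariant pom would in general break it while leaving the invariance check intact.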

Remark 4

From (5) we observe that \(\yen \) not only relativises self-adjoint operators, but also their spectral measures, and more generally any pom. An example of the latter, which we will encounter in Sect. 4.2.3, is the relativisation of a covariant phase pom, resulting in a relative phase observable.

4.2 Examples

In this subsection we give examples of familiar relative quantities obtained under \(\yen \) to demonstrate that \(\yen \) functions as expected. Its main utility, however, lies in the fact that it relativises arbitrary quantities.

4.2.1 Position and Momentum

Consider the spectral measure \(\mathsf {E}^{Q_{\mathcal {S}}}\) of the position \(Q_{\mathcal {S}}\), \(\mathsf {E}^{Q_{\mathcal {R}}}\) of \(Q_{\mathcal {R}}\), and unitary shifts \(U_{\mathcal {S}}(x)=e^{ixP_{\mathcal {S}}}\) and \(U_{\mathcal {R}}(x)=e^{ixP_{\mathcal {R}}}\). Then,

$$\begin{aligned} (\yen \circ \mathsf {E}^{Q_{\mathcal {S}}})(X) = \int _{\mathbb {R}}e^{ixP_{\mathcal {S}}}\mathsf {E}^{Q_{\mathcal {S}}}(X)e^{-ixP_{\mathcal {S}}}\otimes \mathsf {E}^{Q_{\mathcal {R}}}(dx), \end{aligned}$$
(22)

which may be written as

$$\begin{aligned} (\yen \circ \mathsf {E}^{Q_{\mathcal {S}}})(X) = \int _{\mathbb {R}} \int _{\mathbb {R}} \chi _X(x^{\prime }-x)\mathsf {E}^{Q_{\mathcal {S}}}(dx^{\prime }) \otimes \mathsf {E}^{Q_{\mathcal {R}}}(dx). \end{aligned}$$
(23)

This is easily recognised as the spectral measure of the relative position

$$\begin{aligned} Q_{\mathcal {S}} - Q_{\mathcal {R}} = \int _{\mathbb {R}} \int _{\mathbb {R}} (x^{\prime }-x)\mathsf {E}^{Q_{\mathcal {S}}}(dx^{\prime }) \otimes \mathsf {E}^{Q_{\mathcal {R}}}(dx). \end{aligned}$$
(24)

Therefore, one may formally write \(\yen (Q_{\mathcal {S}}) = Q_{\mathcal {S}}-Q_{\mathcal {R}}\).

Under the given relativisation, the spectral measure \(\mathsf {E}^{P_{\mathcal {S}}}\) of the momentum \(P_{\mathcal {S}}\) takes the simple form \((\yen \circ \mathsf {E}^{P_{\mathcal {S}}})(X) = \mathsf {E}^{P_{\mathcal {S}}}(X)\otimes \mathbb {1}\), and again we write \(\yen (P_{\mathcal {S}}) = P_{\mathcal {S}} \otimes \mathbb {1}\).
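
A discrete analogue makes the structure of (23) and of the momentum statement explicit. The sketch assumes a periodic lattice \(\mathbb {Z}_d\) in place of \(\mathbb {R}\), with shifts satisfying \(U(g)\mathsf {E}^{Q}(X)U(g)^* = \mathsf {E}^{Q}(X+g)\) and lattice momentum eigenvectors given by the discrete Fourier basis; these choices and all names are illustrative assumptions rather than the construction of the text.

```python
import numpy as np

d = 5
X = {1, 2}   # a set of relative positions (illustrative)
Y = {0, 3}   # a set of lattice momenta (illustrative)

def shift(g):
    """Unitary U(g) with U(g)|x> = |x + g mod d>."""
    return np.roll(np.eye(d), g, axis=0)

def yen(A):
    """sum_g U(g) A U(g)^* (x) |g><g|, relativising with the sharp position of R."""
    out = np.zeros((d * d, d * d), dtype=complex)
    for g in range(d):
        U = shift(g)
        proj = np.zeros((d, d))
        proj[g, g] = 1.0
        out += np.kron(U @ A @ U.conj().T, proj)
    return out

def E_Q(S):
    """Position spectral projection onto the lattice sites in S."""
    return np.diag([1.0 if x in S else 0.0 for x in range(d)])

# Columns of F are shift eigenvectors, i.e. lattice momentum eigenvectors.
F = np.exp(2j * np.pi * np.outer(np.arange(d), np.arange(d)) / d) / np.sqrt(d)

def E_P(S):
    """Momentum spectral projection onto the Fourier modes in S."""
    cols = F[:, sorted(S)]
    return cols @ cols.conj().T

# Relative position (x_S - x_R) mod d of each product basis vector, in np.kron order.
rel = np.array([(xs - xr) % d for xs in range(d) for xr in range(d)])
relative_position_projection = np.diag(np.isin(rel, list(X)).astype(float))

position_ok = np.allclose(yen(E_Q(X)), relative_position_projection)
momentum_ok = np.allclose(yen(E_P(Y)), np.kron(E_P(Y), np.eye(d)))
```

The first check is the lattice counterpart of (23): the relativised position projection selects exactly those pairs of sites whose difference lies in X. The second reflects the shift invariance of momentum.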

Relativisation of momentum under boosts follows an identical argument; we find that (using the same symbol \(\yen \) for relativising with respect to boosts)

$$\begin{aligned} (\yen \circ \mathsf {E}^{P_{\mathcal {S}}})(Y) = \int _{\mathbb {R}} e^{iyQ_{\mathcal {S}}}\mathsf {E}^{P_{\mathcal {S}}}(Y)e^{-iyQ_{\mathcal {S}}}\otimes \mathsf {E}^{P_{\mathcal {R}}}(dy), \end{aligned}$$
(25)

yielding

$$\begin{aligned} (\yen \circ \mathsf {E}^{P_{\mathcal {S}}})(Y)=\int _{\mathbb {R}} \int _{\mathbb {R}} \chi _Y(y^{\prime }-y)\mathsf {E}^{P_{\mathcal {S}}}(dy^{\prime }) \otimes \mathsf {E}^{P_{\mathcal {R}}}(dy), \end{aligned}$$
(26)

which is the spectral measure of \(P_{\mathcal {S}}-P_{\mathcal {R}}\). Under this relativisation, \(Q_{\mathcal {S}} \mapsto Q_{\mathcal {S}} \otimes \mathbb {1}\).

Remark 5

With respect to the boost part of the Galilei group, the momentum relativisation assumed \(\mathcal {S}\) and \(\mathcal {R}\) are of equal mass. For systems with different mass, i.e., \(m_{\mathcal {S}}\) and \(m_{\mathcal {R}}\), \(\yen \) must be appropriately redefined.

We note also the possibility of unsharp relativisations, that is, allowing for one or both of the spectral measures \(\mathsf {E}^{Q_{\mathcal {S}}}\) and \(\mathsf {E}^{Q_{\mathcal {R}}}\) to be replaced by unsharp or smeared (covariant) positions—Sect. 6. The same applies for unsharp momenta.

4.2.2 Angle

Here we consider two systems with \(\varTheta _{\mathcal {S}}, \varTheta _{\mathcal {R}}\) being their angle operators conjugate to the z-components of their angular momenta. For the covariant pvm of \(\mathcal {R}\) we thus choose \(\mathsf {F}=\mathsf {E}^{\varTheta _{\mathcal {R}}}\). Then we obtain:

$$\begin{aligned} \yen (\varTheta _{\mathcal {S}})&= \int _0 ^{2 \pi } U(\theta ^{\prime }) \varTheta _{\mathcal {S}} U(\theta ^{\prime })^* \otimes \mathsf {E}^{\varTheta _{\mathcal {R}}}(d \theta ^{\prime })\end{aligned}$$
(27)
$$\begin{aligned}&= \int _0 ^{2 \pi }U(\theta ^{\prime }) \left[ \int _{0}^{2 \pi } \theta \mathsf {E}^{\varTheta _{\mathcal {S}}}(d \theta )\right] U(\theta ^{\prime })^{*} \otimes \mathsf {E}^{\varTheta _{\mathcal {R}}}(d \theta ^{\prime }). \end{aligned}$$
(28)

Exploiting the covariance of the spectral measure \(\mathsf {E}^{\varTheta _{\mathcal {S}}}\) and performing the substitution \(\theta +\theta ^{\prime } \equiv \theta ^{{\prime } {\prime }}\) we find the above equal to

$$\begin{aligned}&\int _{0}^{2 \pi } \left[ \int _{0} ^{2 \pi } (\theta ^{\prime \prime } - \theta ^{\prime }) \mathsf {E}^{\varTheta _{\mathcal {S}}} (d{\theta ^{\prime \prime }}) \right] \otimes \mathsf {E}^{\varTheta _{\mathcal {R}}}(d{\theta ^{\prime }})\end{aligned}$$
(29)
$$\begin{aligned}&= \int _{0}^{2 \pi } \left[ \varTheta _{\mathcal {S}} - \theta ^{\prime } \mathbb {1} \right] \otimes \mathsf {E}^{\varTheta _{\mathcal {R}}}(d{\theta ^{\prime }})\end{aligned}$$
(30)
$$\begin{aligned}&= \varTheta _{\mathcal {S}} \otimes \mathbb {1} - \mathbb {1} \otimes \varTheta _{\mathcal {R}}. \end{aligned}$$
(31)

Therefore, \(\yen (\varTheta _{\mathcal {S}}) = \varTheta _{\mathcal {S}} - \varTheta _{\mathcal {R}}\).

4.2.3 Phase

We may use \(\yen \) to construct a relative phase observable as given in [32, 33]. Let \(\mathsf {F}\) be a covariant phase pom as defined in (7), and denote by \(\mathsf {F}^{\mathcal {R}}\) a covariant phase for \(\mathcal {R}\). Then \( \yen \circ \mathsf {F}\) is given by:

$$\begin{aligned} \yen \bigl [\mathsf {F}(X)\bigr ] =\int _0 ^{2 \pi } \mathsf {F}(X \dotplus \theta ) \otimes \mathsf {F}^{\mathcal {R}} (d\theta ), \end{aligned}$$
(32)

and

$$\begin{aligned} \yen [\mathsf {F}(X)] = \frac{1}{(2 \pi )^2} \sum _{n,m,k,l} \widetilde{c}_{n,m,k,l} \int _{0}^{2\pi } d \theta \int _{X \dotplus \theta } e^{i(n-m) \theta ^{\prime }} |n\rangle \langle m| \otimes |k\rangle \langle l| e^{i(k-l) \theta } d \theta ^{\prime }, \end{aligned}$$
(33)

where \( \widetilde{c}_{n,m,k,l} \equiv c_{n,m} c^{\prime }_{k,l}\). Writing \( |n,k\rangle \equiv |n\rangle \otimes |k\rangle \), we have

$$\begin{aligned} \yen \bigl [\mathsf {F}(X)\bigr ] = \frac{1}{2 \pi } \sum _{n,m,k,l} \widetilde{c}_{n,m,k,l}\delta _{n-m,k - l} \int _X |n,k\rangle \langle m,l| e^{i(n-m) \theta } d \theta , \end{aligned}$$
(34)

which is a relative phase observable.

5 Restriction

5.1 Basic Properties

Consider now a fixed state \(\omega \) of \(\mathcal {R}\) and the isometric embedding \(\mathcal {V}_{\omega } :\mathcal {L}_1(\mathcal {H}_{\mathcal {S}}) \rightarrow \mathcal {L}_1(\mathcal {H}_{\mathcal {T}})\) defined by \(\rho \mapsto \rho \otimes \omega \). This has a dual (restriction) map \(\varGamma _{\omega }:\mathcal {L}(\mathcal {H}_{\mathcal {T}})\rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {S}})\), which on tensor product operators \(A \otimes B\) takes the form

$$\begin{aligned} \varGamma _{\omega }(A \otimes B) = A \text {tr}\left[ \omega B\right] . \end{aligned}$$

Proposition 4

\(\varGamma _{\omega }\) possesses the following properties.

  1. 1.

    \(\varGamma _{\omega }\) is linear, unital, adjoint-preserving.

  2. 2.

    \(\varGamma _{\omega }\) is completely positive (and therefore positive).

  3. 3.

    \(\varGamma _{\omega }\) is normal.

  4. 4.

    \(\varGamma _{\omega }\) is a (normal) conditional expectation in the sense of von Neumann algebras.

We recall (e.g., [29]) that given \(\mathcal {L(H)}\) and a von Neumann subalgebra \(\mathcal {W} \subset \mathcal {L(H)}\) a normal conditional expectation \(\mathcal {E}:\mathcal {L(H)}\rightarrow \mathcal {W}\) is a positive, adjoint-preserving normal map satisfying the additional properties (i) \(\mathcal {E}(X)=X\) if and only if \(X \in \mathcal {W}\) and (ii) \(\mathcal {E}(X_1YX_2)=X_1\mathcal {E}(Y)X_2\) for any \(X_1,~ X_2 \in \mathcal {W}\) and \(Y \in \mathcal {L(H)}\).

Proof

(1)–(3), [29, Ch. 9]; for the final part we view \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\) as a subalgebra of \(\mathcal {L}(\mathcal {H}_{\mathcal {T}})\) by identifying \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) with \(A \otimes \mathbb {1} \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\). Then,

$$\begin{aligned} \varGamma _{\omega }\left( (X_1\otimes \mathbb {1})(A \otimes B)(X_2 \otimes \mathbb {1}) \right) = X_1AX_2 \omega (B) = X_1 \varGamma _{\omega } (A \otimes B)X_2 \otimes \mathbb {1}. \end{aligned}$$
(35)

We extend by linearity to finite sums \(\sum _{i,j}A_i \otimes B_j \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\), and to infinite sums by continuity (see also [34]).\(\square \)

Thus we will sometimes refer to \(\varGamma _{\omega }\) as a restriction channel. \(\varGamma _{\omega }\) restricts poms of \(\mathcal {S}+ \mathcal {R}\) to those of \(\mathcal {S}\), and is used to translate back from the relative picture to the absolute one, contingent upon the state \(\omega \) of \(\mathcal {R}\). For a pure product state, for example, for a given self-adjoint \(R \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) and fixed unit \(\phi \in \mathcal {H}_{\mathcal {R}}\) the expression \(\left\langle \,\cdot \otimes \phi \,{|}\,R \cdot \otimes \phi \,\right\rangle \) determines a bounded, real valued quadratic form \(\mathcal {H}_{\mathcal {S}}\rightarrow \mathbb {R}\) and therefore a unique bounded self-adjoint operator \(R_{\phi } =\varGamma _{\phi }(R) \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\).

The restriction map is related to the trace as follows. By the duality \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\cong \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})^*\) the map \(\text {tr}\left[ R \cdot \otimes \omega \right] :\mathcal {L}_1(\mathcal {H}_{\mathcal {S}}) \rightarrow \mathbb {C}\) determines a unique bounded \(A_{\omega } \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\), which is self-adjoint when \(R \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) is. Explicitly, \(A_{\omega } = \varGamma _{\omega }(R)\), so that \(\text {tr}\left[ \varGamma _{\omega }(R)\rho \right] = \text {tr}\left[ R \rho \otimes \omega \right] \) for all \(\rho \in \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\).
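
In finite dimensions the restriction map can be written as \(\varGamma _{\omega }(R) = \mathrm {tr}_{\mathcal {H}_{\mathcal {R}}}\left[ (\mathbb {1}\otimes \omega )R\right] \), which gives a direct way to check the product formula, the duality relation and the bimodule property (ii) numerically. The dimensions, the random test operators and the function names below are illustrative assumptions.

```python
import numpy as np

dS, dR = 2, 3
rng = np.random.default_rng(3)

def random_operator(dim):
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

def random_state(dim):
    """A random density matrix (positive, unit trace)."""
    M = random_operator(dim)
    rho = M @ M.conj().T
    return rho / np.trace(rho).real

def restrict(R, omega):
    """Gamma_omega(R) = tr_R[(1 (x) omega) R], an operator on H_S."""
    R4 = R.reshape(dS, dR, dS, dR)            # indices (s, r, s', r')
    return np.einsum('arbs,sr->ab', R4, omega)

omega = random_state(dR)
rho = random_state(dS)
A, B = random_operator(dS), random_operator(dR)
R = random_operator(dS * dR)
X1, X2 = random_operator(dS), random_operator(dS)

# Gamma_omega(A (x) B) = A tr[omega B].
product_ok = np.allclose(restrict(np.kron(A, B), omega), A * np.trace(omega @ B))

# Duality: tr[Gamma_omega(R) rho] = tr[R (rho (x) omega)].
duality_ok = np.isclose(np.trace(restrict(R, omega) @ rho),
                        np.trace(R @ np.kron(rho, omega)))

# Conditional expectation property (ii) with respect to L(H_S) (x) 1.
I_R = np.eye(dR)
bimodule_ok = np.allclose(
    restrict(np.kron(X1, I_R) @ R @ np.kron(X2, I_R), omega),
    X1 @ restrict(R, omega) @ X2,
)
```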

5.2 Further Properties

The restriction maps \(\varGamma _{\omega }\) have further properties of interest, which we collect here. For the purpose of characterising the relationship between the choice of state \(\omega \) and the quantities thus obtained under \(\varGamma _{\omega }\) it is convenient to introduce covariant channels.

Definition 6

Let U and V be unitary representations of G in Hilbert spaces \(\mathcal {H}\) and \(\mathcal {K}\) respectively, and let the channel \(\varLambda : \mathcal {L(H)}\rightarrow \mathcal {L(K)}\). \(\varLambda \) is called covariant if \(\varLambda (U(g)AU(g)^*) = V(g)\varLambda (A)V(g)^*\) for all \(A \in \mathcal {L(H)}\) and \(g \in G\).

Covariance for maps acting on the trace class takes an obvious analogous form. The next lemma demonstrates that the restriction map \(\varGamma _{\omega }\) applied to an invariant quantity is invariant if \(\omega \) is invariant.

Lemma 4

Let \(\varGamma _{\omega }:\mathcal {L}(\mathcal {H}_{\mathcal {T}})\rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) be a restriction channel for some state \(\omega \). \(\varGamma _{\omega }\) is covariant if and only if \(\omega \) is invariant. If \(\varGamma _{\omega }\) is covariant, then \(\varGamma _{\omega }(R)\) is invariant if R is.

Proof

Writing \(\mathcal {U}(g) = U_{\mathcal {S}}(g)\otimes U_{\mathcal {R}}(g)\), the first part follows immediately from the covariance condition

$$\begin{aligned} \sum _{i,j}U_{\mathcal {S}}(g)A_iU_{\mathcal {S}}(g)^*\omega (U_{\mathcal {R}}(g)B_jU_{\mathcal {R}}(g)^*)=\sum _{i,j}U_{\mathcal {S}}(g)A_iU_{\mathcal {S}}(g)^*\omega (B_j), \end{aligned}$$
(36)

to hold for an arbitrary bounded operator \(\sum _{i,j}A_i \otimes B_j \in \mathcal {L(H)}\) (or a limit of such terms) and all \(g \in G\). For the second part, if \(R = \mathcal {U}(g)R\mathcal {U}(g)^*\) then clearly \(U_{\mathcal {S}}(g)\varGamma _{\omega }(R)U_{\mathcal {S}}(g)^* = \varGamma _{\omega }(R)\). \(\square \)

Thus, for invariant (observable) \(R \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\), the only possible way to achieve a non-invariant restriction \(\varGamma _{\omega }(R)\) is by choosing a non-invariant \(\omega \). An invariant \(\omega \) therefore yields, in the case of number/phase, restricted quantities satisfying (1)–(4) of Proposition 1.

A simple calculation shows that the partial trace map \(\mathrm{tr}_{\mathcal {H}_{\mathcal {R}}}:\mathcal {L}_1(\mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}) \rightarrow \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\) is covariant with respect to \(U_{\mathcal {S}}\) and \(\mathcal {U}\) (and analogously for the partial trace over \(\mathcal {H}_{\mathcal {S}}\)). The following is a trivial consequence:

Proposition 5

For an arbitrary state \(\varOmega \in \mathcal {L}_1(\mathcal {H}_{\mathcal {T}})\), \(\mathrm{tr}_{\mathcal {H}_{\mathcal {R}}}(\varOmega )\) is invariant under \(U_{\mathcal {S}}\) if \(\varOmega \) is invariant under \(\mathcal {U}\).

Hence we have the following:

Corollary 1

\(\mathrm{tr}_{\mathcal {H}_{\mathcal {R}}}(\tau _{\mathcal {T}*} (\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}))\) is invariant under \(U_{\mathcal {S}}\).

5.3 Restrictions After \(\yen \)

The restriction map \(\varGamma _{\omega }\) may be composed with the relativisation map \(\yen \) to give, for arbitrary G,

$$\begin{aligned} (\varGamma _{\omega }\circ \yen ) (A) = \int _{G} U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^* d\mu ^{\mathsf {F}}_{\omega }(g), \end{aligned}$$
(37)

where the measure \(\mu ^{\mathsf {F}}_{\omega }:= \omega \circ \mathsf {F}\) (or, for a density operator \(\rho \) corresponding to \(\omega \), \(\mu ^{\mathsf {F}}_{\rho }(X) = \text {tr}\left[ \mathsf {F}(X)\rho \right] \)). As we shall see in the next section, the measure \(\mu ^{\mathsf {F}}_{\omega }\) dictates the proximity of A and \((\varGamma _{\omega }\circ \yen )(A)\).
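
Formula (37) and the role of the measure \(\mu ^{\mathsf {F}}_{\omega }\) can be illustrated in a finite toy model. The sketch assumes the cyclic group \(\mathbb {Z}_d\) acting by shifts on \(\mathcal {H}_{\mathcal {S}}=\mathcal {H}_{\mathcal {R}}=\mathbb {C}^d\) with \(\mathsf {F}(\{g\}) = |g\rangle \langle g|\), so that \(\mu ^{\mathsf {F}}_{\omega }(\{g\}) = \langle g|\omega |g\rangle \); these choices and the function names are hypothetical.

```python
import numpy as np

d = 4
rng = np.random.default_rng(4)

def shift(g):
    """Unitary U(g) with U(g)|x> = |x + g mod d>."""
    return np.roll(np.eye(d), g, axis=0)

def yen(A):
    """sum_g U(g) A U(g)^* (x) |g><g|."""
    out = np.zeros((d * d, d * d), dtype=complex)
    for g in range(d):
        U = shift(g)
        proj = np.zeros((d, d))
        proj[g, g] = 1.0
        out += np.kron(U @ A @ U.conj().T, proj)
    return out

def restrict(R, omega):
    """Gamma_omega(R) = tr_R[(1 (x) omega) R]."""
    return np.einsum('arbs,sr->ab', R.reshape(d, d, d, d), omega)

def averaged(A, mu):
    """Right hand side of (37): sum_g mu(g) U(g) A U(g)^*."""
    return sum(mu[g] * shift(g) @ A @ shift(g).conj().T for g in range(d))

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
omega = M @ M.conj().T
omega = omega / np.trace(omega).real          # a generic reference state

# (37) for a generic state: only the diagonal of omega in the F-basis matters.
formula_ok = np.allclose(restrict(yen(A), omega), averaged(A, np.diag(omega).real))

# Reference perfectly localised at the identity: the absolute quantity is recovered.
localised = np.zeros((d, d))
localised[0, 0] = 1.0
recovers_A = np.allclose(restrict(yen(A), localised), A)

# Shift-invariant reference state: the restriction is shift invariant.
twirled = restrict(yen(A), np.eye(d) / d)
restriction_invariant = np.allclose(shift(1) @ twirled @ shift(1).conj().T, twirled)
```

The two extreme cases anticipate the localisation and delocalisation results: a point measure at the identity returns A exactly, whereas the uniform measure returns its group average.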

6 Localisation and Delocalisation

6.1 High Localisation

Recall (Lemma 1) that if a pom \(\mathsf {F}\) satisfies the norm-1 property then for each X for which \(\mathsf {F}(X) \ne 0\) there exists a sequence of unit vectors \((\phi _n) \subset \mathcal {H}_{\mathcal {R}}\) such that \(\lim _{n \rightarrow \infty } \left\langle \,\phi _n\,{|}\,\mathsf {F}(X) \phi _n\,\right\rangle = 1\). This “localising sequence” \((\phi _n)\) allows for the expression of expectation values of relative observables to be given in terms of those of absolute quantities to arbitrary precision:

Theorem 1

Let \(\mathsf {F}\) have the norm-1 property and let G be either \(S_1\) (which we identify with the interval \([-\pi , \pi ]\)) or \(\mathbb {R}\) written additively with identity 0. If \((\phi _n) \subset \mathcal {H}_{\mathcal {R}}\) is a sequence of unit vectors which becomes well localised at \(g=0\), then for each \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) and all \(\varphi \in \mathcal {H}_{\mathcal {S}}\)

$$\begin{aligned} \lim _{n \rightarrow \infty } \left\langle \,\varphi \otimes \phi _n\,{|}\, \yen (A) \varphi \otimes \phi _n\,\right\rangle =\left\langle \, \varphi \,{|}\, A \varphi \,\right\rangle \end{aligned}$$
(38)

Proof

Assuming without loss of generality that \(\Vert \varphi \Vert =1\), we write

$$\begin{aligned} \left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (A) \varphi \otimes \phi _n\,\right\rangle =\int _G \left\langle \,\varphi \,{|}\,U(g) A U(g)^* \varphi \,\right\rangle \left\langle \,\phi _n\,{|}\,\mathsf {F}(d g) \phi _n\,\right\rangle = \left\langle \,\varphi \,{|}\,A \varphi \,\right\rangle +c_n \end{aligned}$$
(39)

where \(c_n\) is the “error” for each n which we show goes to zero as n becomes large.

$$\begin{aligned} \left| c_n \right| = \left| \int _G \left\langle \,\varphi \,{|}\,\left( U(g) A U(g)^* -A \right) \varphi \,\right\rangle \left\langle \,\phi _n\,{|}\,\mathsf {F}(dg) \phi _n\,\right\rangle \right| . \end{aligned}$$
(40)

Let \(\varDelta _n = (-1/2n,1/2n)\); then

$$\begin{aligned}&\left| c_n \right| \le \left| \int _{\varDelta _n} \left\langle \,\varphi \,{|}\,\left( U(g) A U(g)^*- A \right) \varphi \,\right\rangle \left\langle \,\phi _n\,{|}\,\mathsf {F}(d g) \phi _n\,\right\rangle \right| \end{aligned}$$
(41)
$$\begin{aligned}&+ \left| \int _{G \backslash \varDelta _n} \left\langle \,\varphi \,{|}\,\left( U(g) A U(g)^* -A \right) \varphi \,\right\rangle \left\langle \,\phi _n\,{|}\,\mathsf {F}(d g) \phi _n\,\right\rangle \right| . \end{aligned}$$
(42)

Now the \(\phi _n\) are chosen as follows: for each n, there is a \(\phi _n\) for which \(|\left\langle \,\phi _n\,{|}\,\mathsf {F}(\varDelta _n) \phi _n\,\right\rangle - 1| < 1/n\) and \(|\left\langle \,\phi _n\,{|}\,\mathsf {F}(G \backslash \varDelta _n) \phi _n\,\right\rangle | < 1/n\). Therefore the second term is bounded above by \(2 \left\| A\right\| \int _{G \backslash \varDelta _n} \left\langle \,\phi _n\,{|}\,\mathsf {F}(d g) \phi _n\,\right\rangle \) which vanishes in the limit. For the first term, writing \(f_{\varphi }^A := g \mapsto \left\langle \,\varphi \,{|}\,(U(g)AU(g)^* - A)\varphi \,\right\rangle \) , we estimate

$$\begin{aligned}&\int _{\varDelta _n}|f_{\varphi }^A (g)|\left\langle \,\phi _n\,{|}\,\mathsf {F}(dg)\phi _n\,\right\rangle \le \int _{\varDelta _n} \sup _{g \in \varDelta _n} |f_{\varphi }^A (g)|\left\langle \,\phi _n\,{|}\,\mathsf {F}(dg)\phi _n\,\right\rangle \nonumber \\&\le \sup _{g \in \varDelta _n}|f_{\varphi }^A(g)|\left\langle \,\phi _n\,{|}\,\mathsf {F}(\varDelta _n) \phi _n\,\right\rangle . \end{aligned}$$
(43)

From the continuity for self-adjoint A of the real function \(f_{\varphi }^A\) it follows that \(f_{\varphi }^A(g) \rightarrow 0\) as \(g \rightarrow 0\). Since the intervals \(\varDelta _n\) shrink to \(\{0\}\), we have \(\sup _{g \in \varDelta _n}|f_{\varphi }^A(g)| \rightarrow 0\), and therefore the right hand side of (43) goes to zero in the \(n \rightarrow \infty \) limit. This extends to arbitrary bounded A, as can be seen by decomposing A into real and imaginary parts. \(\square \)

Therefore by choosing a localising sequence \((\phi _n) \subset \mathcal {H}_{\mathcal {R}}\) we can make \(\left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (A) \varphi \otimes \phi _n\,\right\rangle \) as close to \(\left\langle \,\varphi \,{|}\,A\varphi \,\right\rangle \) as we like. This result rests crucially on the assumption that the chosen \(\mathsf {F}\) satisfies the norm-1 property. The main result may thus be rephrased (for the localising sequence \((\phi _n)\)) in terms of \(\yen \) and \(\varGamma \) as follows:

$$\begin{aligned} \lim _{n \rightarrow \infty } (\varGamma _{\phi _n} \circ \yen )(A) = A \end{aligned}$$
(44)

in the weak topology on \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\), that is, in the topology of pointwise convergence of expectation values.

6.1.1 Examples

Example 4

Qubit algebra. Consider the space \(\mathcal {L}(\mathbb {C}^2)\) and a basis of Pauli operators with identity: \(\{ \mathbb {1}, \sigma _1, \sigma _2, \sigma _{3}\}\). Let \(N_S:=\frac{1}{2}(\mathbb {1} +\sigma _3)\) (which has spectrum \(\{ 0,1 \}\) and corresponding eigenvectors denoted \(|0\rangle , |1\rangle \)). We adjoin an infinite dimensional reference system, \(\mathcal {H}_{\mathcal {R}}\), with “number basis” \(\{|n\rangle :\,n\in \mathbb {N}\}\), thus defining \(N_{\mathcal {R}}:=\sum _{n=0}^\infty n|n\rangle \langle n|\). Then we may use \(\mathsf {F}\equiv \mathsf {F}^{\text {can}}\) on \(\mathcal {H}_{\mathcal {R}}\) (see Eq. (7)) to construct \(\yen \), and we find that

$$\begin{aligned} \yen (\mathbb {1})&= \mathbb {1}\otimes \mathbb {1}, \end{aligned}$$
(45)
$$\begin{aligned} \yen (\sigma _{3})&= \sigma _3 \otimes \mathbb {1}, \end{aligned}$$
(46)
$$\begin{aligned} \yen (\sigma _1)&= \sum _{m\ge 0} \left( |0\rangle \langle 1| \otimes |m+1\rangle \langle m| + |1\rangle \langle 0| \otimes |{m}\rangle \langle {m+1}|\right) , \end{aligned}$$
(47)
$$\begin{aligned} \yen (\sigma _2)&= i\sum _{m\ge 0}\left( -|0\rangle \langle 1| \otimes |m+1\rangle \langle m| + |1\rangle \langle 0| \otimes |{m}\rangle \langle {m+1}|\right) . \end{aligned}$$
(48)

The possibility of good phase localisation of states with respect to \(\mathsf {F}^\mathrm{can}\) allows the entire qubit algebra \(\mathcal {L}(\mathbb {C}^2)\) to be recovered in the following way. Let \(A \in \mathcal {L}(\mathbb {C}^2)\) be an arbitrary self-adjoint element, and \(\varphi \in \mathbb {C}^2\) an arbitrary unit vector. Define \(|\phi _n\rangle =\frac{1}{\sqrt{n+1}} \sum _{j= 0}^n |j\rangle \), which represents an approximately localised phase centred at zero.

Remark 6

The property that \(\{|\phi _n\rangle \}\) represents an approximate phase eigenstate at phase value \(\theta =0\) for \(\mathsf {F}^\mathrm{can}\) means the following: for every \(\delta >0\), the probability of localisation in the interval \([-\delta /2,+\delta /2]\) approaches 1 as \(n\rightarrow \infty \). Thus,

$$\begin{aligned} \lim _{n\rightarrow \infty }\left\langle \phi _n\big |\mathsf {F}^\mathrm{can}\bigl (\bigl [-\tfrac{\delta }{2},\tfrac{\delta }{2}\bigr ]\bigr )\phi _n\right\rangle =1\quad \text {for any }\delta \in (0,2\pi ). \end{aligned}$$

We sketch a proof of this property. In fact, we will find that the speed of convergence can be specified more precisely: we can allow \(\delta \) to tend to zero as \(\delta =\delta _n:=(n+1)^{(-1+\epsilon )/2}\) for any \(\epsilon \in (0,1)\). We put \(\varDelta _n:=[-\tfrac{\delta _n}{2},\tfrac{\delta _n}{2}]\) and \(X_n=[-\pi ,\pi ]\setminus \varDelta _n\).

Thus, we show that the probability \(p_n:=\left\langle \phi _n\big |\mathsf {F}^\mathrm{can}\bigl ([-\pi ,\pi ]\setminus \varDelta _n\bigr )\phi _n\right\rangle \rightarrow 0\) as \(n\rightarrow \infty \). We have

$$\begin{aligned} p_n&= \frac{1}{2\pi (n+1)}\int _{X_n}\left( \sum _{k=0}^ne^{ik\theta }\sum _{\ell =0}^ne^{-i\ell \theta }\right) d\theta =\frac{1}{2\pi (n+1)}\int _{X_n}\frac{1-\cos \bigl ((n+1)\theta \bigr )}{1-\cos \theta }d\theta \\&\le \frac{1}{2\pi (n+1)}\frac{1}{1-\cos (\delta _n/2)}\int _{X_n}\bigl [1-\cos \bigl ((n+1)\theta \bigr )\bigr ]d\theta \\&= \frac{1}{n+1}\frac{1}{1-\cos (\delta _n/2)}\left[ 1-\frac{\delta _n}{2\pi }-2\left. \frac{\sin \bigl ((n+1)\theta \bigr )}{2\pi (n+1)}\right| _{\delta _n/2}^\pi \right] \\&= \frac{1}{n+1}\frac{1}{1-\cos (\delta _n/2)}\left[ 1-\frac{\delta _n}{2\pi }+\frac{\sin \bigl ((n+1)\delta _n/2\bigr )}{\pi (n+1)}\right] \\&= \frac{1}{n+1}\frac{1}{1-\cos (\delta _n/2)}\left[ 1-\frac{\delta _n}{2\pi } +\frac{\delta _n}{2\pi }\frac{\sin \,u}{u}\right] \qquad \bigl (u:=(n+1)\delta _n/2\bigr ) \\&\le \frac{1}{n+1}\frac{1}{1-\cos (\delta _n/2)} \\&\le \frac{1}{n+1}\frac{1}{\frac{\delta _n^2}{8}-\frac{\delta _n^4}{24\cdot 16}} =\frac{8(n+1)^{-\epsilon }}{1-{\delta _n^2}/48}\ \rightarrow \ 0\ \text {as }n\rightarrow \infty . \end{aligned}$$

\(\square \)
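The estimate in Remark 6 can be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule is accurate enough for the oscillating integrand; the function names and the step count are illustrative choices, not part of the original argument. It compares the tail probability \(p_n\) with the final bound \(8(n+1)^{-\epsilon }/(1-\delta _n^2/48)\).

```python
import math

def phase_density(theta, n):
    """Density of phi_n w.r.t. the canonical phase:
    |sum_{k=0}^n e^{ik theta}|^2 / (2 pi (n+1)), written in closed form."""
    return (1.0 - math.cos((n + 1) * theta)) / (
        (1.0 - math.cos(theta)) * 2.0 * math.pi * (n + 1))

def delta(n, eps):
    """Window width delta_n = (n+1)^((-1+eps)/2) from Remark 6."""
    return (n + 1) ** ((-1.0 + eps) / 2.0)

def tail_probability(n, eps, steps=100_000):
    """Midpoint-rule estimate of p_n, the probability outside [-delta_n/2, delta_n/2]."""
    a, b = delta(n, eps) / 2.0, math.pi
    h = (b - a) / steps
    half = sum(phase_density(a + (i + 0.5) * h, n) for i in range(steps)) * h
    return 2.0 * half  # the density is even in theta

def remark_bound(n, eps):
    """Final upper bound on p_n derived in Remark 6."""
    return 8.0 * (n + 1) ** (-eps) / (1.0 - delta(n, eps) ** 2 / 48.0)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, tail_probability(n, 0.5), remark_bound(n, 0.5))
```

For small n the bound exceeds 1 and is therefore uninformative; it only becomes useful once \((n+1)^{\epsilon } > 8\).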

With \(\varphi = c_0 |0\rangle + c_1 |1\rangle \) (normalised) and \(A= a_0 {\mathbb {1}} + \mathbf {a}\cdot \varvec{\sigma } = a_0\mathbb {1}+a_1\sigma _1+a_2\sigma _2+a_3\sigma _3\), we have

$$\begin{aligned} \left\langle \,\varphi \,{|}\,A \varphi \,\right\rangle = a_0 + 2a_1 \mathrm{Re} (\bar{c}_0c_1) + 2 a_2 \mathrm{Im} (\bar{c}_0 c_1) + a_3 (\left| c_0 \right| ^2 - \left| c_1 \right| ^2). \end{aligned}$$

Evaluating

$$\begin{aligned} \left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (\mathbb {1})\varphi \otimes \phi _n\,\right\rangle&= 1,\\ \left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (\sigma _1)\varphi \otimes \phi _n\,\right\rangle&= \frac{n}{n+1}2\,\mathrm{Re} (\bar{c}_0c_1)=\frac{n}{n+1}\left\langle \,\varphi \,{|}\,\sigma _1\varphi \,\right\rangle ,\\ \left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (\sigma _2)\varphi \otimes \phi _n\,\right\rangle&= \frac{n}{n+1}2\,\mathrm{Im}(\bar{c}_0 c_1)=\frac{n}{n+1}\left\langle \,\varphi \,{|}\,\sigma _2\varphi \,\right\rangle ,\\ \left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (\sigma _3)\varphi \otimes \phi _n\,\right\rangle&= \left\langle \,\varphi \,{|}\,\sigma _3 \varphi \,\right\rangle , \end{aligned}$$

we see that as n becomes large, we indeed reproduce, for any unit vector \(\varphi \in \mathcal {H}_{\mathcal {S}}\), the expectation values of the basis operators \(\mathbb {1},\sigma _1,\sigma _2,\sigma _3\), and therefore, by linearity, for all \(A \in \mathcal {L}(\mathbb {C}^2)\):

$$\begin{aligned} \lim _{n \rightarrow \infty }\left\langle \,\varphi \otimes \phi _n\,{|}\,\yen (A)\varphi \otimes \phi _n\,\right\rangle = \left\langle \,\varphi \,{|}\,A \varphi \,\right\rangle . \end{aligned}$$
(49)

In conclusion, we see that by making the reference system explicit and taking the limit of a highly phase-localised state, the statistics of any absolute qubit effect offers an accurate representation of the relative qubit effect in \(\mathcal {H}_{\mathcal {S}} \otimes \mathcal {H}_{\mathcal {R}}\).
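The factor \(n/(n+1)\) in Example 4 can be reproduced by summing the terms of Eq. (47) directly. The sketch below is illustrative only: the function names are hypothetical, and the state \(\varphi = c_0|0\rangle + c_1|1\rangle \) is assumed normalised.

```python
def shift_overlap(n):
    """sum_{m>=0} <phi_n|m+1><m|phi_n> for phi_n = (n+1)^(-1/2) sum_{j=0}^n |j>."""
    amp = [(n + 1) ** -0.5] * (n + 1)   # amplitudes <j|phi_n>, j = 0..n
    # Only m = 0..n-1 contribute, since <m+1|phi_n> vanishes for m >= n.
    return sum(amp[m + 1] * amp[m] for m in range(n))

def relative_sigma1(c0, c1, n):
    """<phi (x) phi_n | Yen(sigma_1) | phi (x) phi_n>, term by term from Eq. (47)."""
    s = shift_overlap(n)
    first = complex(c0).conjugate() * c1 * s    # |0><1| (x) |m+1><m| terms
    second = complex(c1).conjugate() * c0 * s   # |1><0| (x) |m><m+1| terms
    return (first + second).real

def absolute_sigma1(c0, c1):
    """<phi|sigma_1 phi> = 2 Re(conj(c0) c1)."""
    return 2.0 * (complex(c0).conjugate() * c1).real

if __name__ == "__main__":
    import cmath
    c0, c1 = 0.6, 0.8 * cmath.exp(0.3j)
    for n in (1, 10, 100, 1000):
        print(n, relative_sigma1(c0, c1, n), absolute_sigma1(c0, c1))
```

The relative expectation value approaches the absolute one from below in magnitude, with relative error \(1/(n+1)\).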

Example 5

Finite cyclic group. We may construct \(\yen \) so that \(\yen (A)\) is invariant with respect to a unitary representation \(U_{\mathcal {S}} \otimes U_{\mathcal {R}}\) of some finite cyclic group G. To this end, let G be a group of cyclic permutations of a finite index set I, which can therefore be identified with G. We consider a Hilbert space \(\mathcal {H}_{\mathcal {R}}\) that allows a direct sum decomposition into subspaces of equal dimension, \(\mathcal {H}_{\mathcal {R}} = \bigoplus _{i\in I} \mathcal {H}_{\mathcal {R},i}\), and define a unitary representation \(U_{\mathcal {R}}:G \rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {R}})\) such that for any \(g\in G\), \(U_\mathcal {R}(g)\) maps a given orthonormal basis of each \(\mathcal {H}_{\mathcal {R},i}\) to a given orthonormal basis of \(\mathcal {H}_{\mathcal {R},g.i}\) (where \(i\mapsto g.i\) denotes an action of G on I). Let \(\{P_i\}\) (or technically the map \(i \mapsto P_i\)) denote the pvm composed of the projections onto \(\mathcal {H}_{\mathcal {R},i}\). Then \(\left( U_{\mathcal {R}}, \{P_i\}, \mathcal {H}_{\mathcal {R}} \right) \) is a system of imprimitivity for G, with the covariance relation \(U_{\mathcal {R}}(g) P_i U_{\mathcal {R}}(g) ^* = P_{g.i}\).

With \(U_{\mathcal {S}}: G \rightarrow \mathcal {L(H_S)}\) any representation of G in \(\mathcal {H_S}\) and with \(\mathcal {H}_{\mathcal {T}}= \mathcal {H_S} \otimes \mathcal {H_R}\), define \(\yen : \mathcal {L(H_S)} \rightarrow \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) by:

$$\begin{aligned} \yen (A) = \sum _{g \in G} U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^* \otimes P_g , \end{aligned}$$
(50)

which, with \(\mathcal {U} := U_{\mathcal {S}} \otimes U_{\mathcal {R}}\), satisfies \(\mathcal {U}(g) \yen (A) \mathcal {U}(g)^*=\yen (A)\). Furthermore \(\yen \) is a \(^*\)-homomorphism (since the covariant pom \(\{P_g\}\) generating \(\yen \) is projection valued). Then there exists a state \(\phi \in \mathcal {H}_{\mathcal {R}}\) for which

$$\begin{aligned} \left\langle \,\varphi \,{|}\,A \varphi \,\right\rangle =\left\langle \,\varphi \otimes \phi \,{|}\,\yen (A) \varphi \otimes \phi \,\right\rangle \end{aligned}$$
(51)

for all \(\varphi \in \mathcal {H_S}\). Indeed, we have

$$\begin{aligned} \left\langle \,\varphi \otimes \phi \,{|}\,\yen (A) \varphi \otimes \phi \,\right\rangle = \sum _{g \in G}\left\langle \,\varphi \,{|}\,U_{\mathcal {S}}(g)AU_{\mathcal {S}}(g)^* \varphi \,\right\rangle \left\langle \,\phi \,{|}\,P_g \phi \,\right\rangle , \end{aligned}$$
(52)

so that \(\phi \) may be chosen to be any unit vector \(\phi \in \mathcal {H}_{\mathcal {R},e}\), where e is the identity element of G: in this case \(P_g \phi = \delta _{g,e} \phi \), and (52) collapses to

$$\begin{aligned} \left\langle \,\varphi \,{|}\,U_{\mathcal {S}}(e)A U_{\mathcal {S}}(e)^* \varphi \,\right\rangle = \left\langle \,\varphi \,{|}\,A \varphi \,\right\rangle \quad \text {for all } \varphi , \end{aligned}$$
(53)

i.e., \(\varGamma _{\phi }(\yen (A))=A\). Therefore, by choosing a state localised at the identity of G, \(\phi \in \mathcal {H}_{\mathcal {R},e}\), all expectation values of any self-adjoint \(A \in \mathcal {L}(\mathcal {H_S})\) are precisely those of the relativised \(\yen (A) \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\).
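Equation (52) is easy to evaluate for a concrete case. The sketch below assumes \(G=\mathbb {Z}_d\) and, purely for illustration, the diagonal representation \(U_{\mathcal {S}}(g)|k\rangle = e^{2\pi i gk/d}|k\rangle \); this choice of \(U_{\mathcal {S}}\), the matrix A and the function names are hypothetical and not taken from the text. The reference state enters only through the weights \(\left\langle \,\phi \,{|}\,P_g\phi \,\right\rangle \).

```python
import cmath

def conjugate_by(A, g, d):
    """U_S(g) A U_S(g)^* for the assumed diagonal representation of Z_d."""
    dim = len(A)
    u = [cmath.exp(2j * cmath.pi * g * k / d) for k in range(dim)]
    return [[u[j] * A[j][k] * u[k].conjugate() for k in range(dim)] for j in range(dim)]

def expectation(phi, A):
    """<phi|A phi> for a vector given as a list of amplitudes."""
    dim = len(A)
    return sum(complex(phi[j]).conjugate() * A[j][k] * phi[k]
               for j in range(dim) for k in range(dim))

def relativised_expectation(phi, A, weights):
    """Right-hand side of Eq. (52); weights[g] = <ref|P_g ref> for g in Z_d."""
    d = len(weights)
    return sum(weights[g] * expectation(phi, conjugate_by(A, g, d)) for g in range(d))

if __name__ == "__main__":
    A = [[1, 2 - 1j, 0.5], [2 + 1j, -1, 1j], [0.5, -1j, 0]]   # self-adjoint
    phi = [0.5, 0.5j, 0.5 ** 0.5]                             # unit vector
    print(expectation(phi, A))
    print(relativised_expectation(phi, A, [1, 0, 0]))         # reference localised at e
    print(relativised_expectation(phi, A, [1 / 3] * 3))       # reference spread evenly
```

With the reference localised at the identity the two expectation values coincide, as in Eq. (53); with evenly spread weights only the diagonal part of A in the eigenbasis of \(U_{\mathcal {S}}\) survives.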

Example 6

Unsharp position. The quality of the reference system, understood as the localisability of the covariant quantity, dictates the quality of approximation of relative quantities by absolute ones (see [4] for a detailed investigation of this phenomenon). An intuitive example of this behaviour is given by fixing a smeared position \(\mathsf {E}^{Q_\mathcal {R}}_{e}\) for the reference, yielding, for sharp \(\mathsf {E}^{Q_{\mathcal {S}}}\) of \(\mathcal {S}\),

$$\begin{aligned} (\yen \circ \mathsf {E}^{Q_{\mathcal {S}}})(X) = \int _{\mathbb {R}} e^{iPx}\mathsf {E}^{Q_{\mathcal {S}}}(X)e^{-iPx} \otimes \mathsf {E}^{Q_\mathcal {R}}_{e}(dx). \end{aligned}$$
(54)

After restriction, we therefore wish to find the pom \(\tilde{\mathsf {E}}^{Q_{\mathcal {S}}}\) defined by

$$\begin{aligned} \left\langle \,\varphi \,{|}\,\tilde{\mathsf {E}}^{Q_{\mathcal {S}}}(X)\varphi \,\right\rangle = \left\langle \,\varphi \,{|}\,(\varGamma _{\phi } \circ \yen )(\mathsf {E}^{Q_{\mathcal {S}}}) (X) \varphi \,\right\rangle \text { for all}~ \varphi , ~X \end{aligned}$$
(55)

and to compare this with \(\mathsf {E}^{Q_{\mathcal {S}}}\). Moving to the position representation the right hand side of this expression may be written:

$$\begin{aligned} \int \int \int \chi _{X + y + z}(x)\left| \varphi (x) \right| ^2e(y)\left| \phi (z) \right| ^2dxdydz, \end{aligned}$$
(56)

which we write as

$$\begin{aligned} \int dx \left| \varphi (x) \right| ^2 F_X(x) = \left\langle \,\varphi \,{|}\,\tilde{\mathsf {E}}^{Q_{\mathcal {S}}}(X)\varphi \,\right\rangle . \end{aligned}$$
(57)

After some manipulations we find that

$$\begin{aligned} F_X(x) = \chi _X * (e * \left| \phi \right| ^2)(x), \end{aligned}$$
(58)

and that therefore

$$\begin{aligned} \tilde{\mathsf {E}}^{Q_{\mathcal {S}}}(X) = \chi _X * (e * \left| \phi \right| ^2)(Q_{\mathcal {S}}). \end{aligned}$$
(59)

The spread of the function \(\tilde{e} = e * \left| \phi \right| ^2\) dictates the (in)accuracy of \(\tilde{\mathsf {E}}^{Q_{\mathcal {S}}}\) as it approximates \({\mathsf {E}}^{Q_{\mathcal {S}}}\). This spread can be quantified in different ways. Using the variance measure, we find that \(\text {Var}(\tilde{e}) = \text {Var}(e) + \text {Var}\bigl (\left| \phi \right| ^2\bigr )\).
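The additivity of variances under convolution can be illustrated on a grid. This is a minimal sketch with two made-up discrete densities standing in for \(e\) and \(|\phi |^2\); the sample values and function names are assumptions chosen for illustration.

```python
def convolve(p, q):
    """Discrete convolution of two probability vectors on the grid 0, 1, 2, ..."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def variance(p):
    """Variance of a probability vector whose i-th entry is the weight of grid point i."""
    mean = sum(i * w for i, w in enumerate(p))
    return sum((i - mean) ** 2 * w for i, w in enumerate(p))

def normalise(w):
    total = float(sum(w))
    return [x / total for x in w]

e = normalise([1, 4, 6, 4, 1])            # hypothetical smearing density
phi_sq = normalise([1, 2, 3, 2, 1, 1])    # hypothetical |phi|^2
e_tilde = convolve(e, phi_sq)

if __name__ == "__main__":
    print(variance(e_tilde), variance(e) + variance(phi_sq))
```

The two printed numbers agree up to rounding, so a sharper reference preparation can never reduce the spread below that of the smearing density e, and vice versa.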

Definition 7

For \(0 \le \epsilon < 1\), the overall width \(W_{\epsilon }(p)\) at confidence level \(1-\epsilon \) of a probability measure p is defined by

$$\begin{aligned} W_{\epsilon }(p):=\inf _{I}\{|I|: p(I) \ge 1-\epsilon \}; \end{aligned}$$
(60)

here the infimum of the lengths, |I|, is taken over all intervals I in \(\mathbb {R}\).

We may also use the overall width \(W_{\epsilon }(\tilde{e})\) applied to the density \(\tilde{e}\) and the fact that the overall width of a convolution is bounded below by the width of the function with the greatest width, i.e., \(W_{\epsilon }(\tilde{e}) \ge \max \{W_{\epsilon }(e), W_{\epsilon }(\left| \phi \right| ^2) \}\).
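A discrete stand-in for Definition 7 makes the convolution bound concrete. In the sketch below each grid cell is treated as an interval of unit length, which is an assumption of this illustration rather than part of the definition; the densities and function names are again hypothetical.

```python
def convolve(p, q):
    """Discrete convolution of two probability vectors on the grid 0, 1, 2, ..."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def normalise(w):
    total = float(sum(w))
    return [x / total for x in w]

def overall_width(p, eps, spacing=1.0):
    """Smallest length of a window of consecutive cells carrying probability >= 1 - eps."""
    n = len(p)
    for k in range(1, n + 1):                 # window size in cells
        window = sum(p[:k])
        best = window
        for start in range(1, n - k + 1):     # slide the window one cell at a time
            window += p[start + k - 1] - p[start - 1]
            best = max(best, window)
        if best >= 1.0 - eps - 1e-12:         # small tolerance for rounding
            return k * spacing
    return n * spacing

e = normalise([1, 4, 6, 4, 1])            # hypothetical smearing density
phi_sq = normalise([1, 2, 3, 2, 1, 1])    # hypothetical |phi|^2
e_tilde = convolve(e, phi_sq)

if __name__ == "__main__":
    eps = 0.25
    print(overall_width(e, eps), overall_width(phi_sq, eps), overall_width(e_tilde, eps))
```

The bound holds in this discrete setting for the same reason as in the continuous one: the mass that a convolution assigns to a window is a weighted average of the masses that either factor assigns to translates of that window.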

Therefore, the quality (localisability of the smeared position) of the reference system dictates the quality of the representing absolute quantity. Here the inaccuracy inherent in the reference system features as a lower bound on the inaccuracy of the absolute quantity. Even with perfect localisation of the reference with respect to the sharp position, there is a residual inaccuracy in the position of the system arising from the unsharpness of the covariant reference position. To get perfect accuracy, we need the preparation to be highly localised at 0, and the smearing distribution to be highly localised around 0.

6.2 Phase Delocalisation

At the other extreme (to that of high localisation) we may also consider very poorly localised reference states, including the worst case scenario of complete delocalisation, possible only for compact groups. For concreteness, we focus on the phase case.

Consider a covariant phase pom \(\mathsf {F}\) and an invariant state \(\omega \). Such a state is completely delocalised with respect to \(\mathsf {F}\), i.e., \(\mu ^{\mathsf {F}}_{\omega }(X) \equiv (\omega \circ \mathsf {F})(X) = \frac{|X|}{2\pi }\), where \(|X|\) denotes the Lebesgue measure of X. This is the normalised Haar measure on \(S^1\) under identification of \(S^1\) with \([0,2\pi ]\). Thus we may formally write \(\mu ^{\mathsf {F}}_{\omega }(d \theta ) = d \theta /2\pi \).

The composition of relativisation and restriction for a delocalised state \(\omega \) then has the following effect on \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\) and \(\mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\) respectively:

$$\begin{aligned} (\varGamma _{\omega } \circ \yen )(A) = \frac{1}{2 \pi }\int _{S^1} U(\theta )AU(\theta )^* d \theta \equiv \tau _{\mathcal {S}}(A); \end{aligned}$$
(61)

with predual/Schrödinger picture

$$\begin{aligned} (\varGamma _{\omega } \circ \yen )_*(\rho )=\frac{1}{2 \pi }\int _{S^1} U(\theta )^*\rho U(\theta ) d \theta \equiv \tau _{\mathcal {S}*}(\rho ). \end{aligned}$$
(62)

The above equations hold if (for example) we choose for \(\omega \) the density operator \(\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})\) for any \(\rho _{\mathcal {R}}\) of \(\mathcal {R}\).
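The effect of the averaging in Eq. (62) can be seen in a few lines. The sketch assumes \(U(\theta )=e^{iN\theta }\) with integer eigenvalues of N and replaces the integral by an average over equally spaced phases, which reproduces the integral exactly once the number of sample points exceeds the largest difference of eigenvalues. The function name and the example state are illustrative.

```python
import cmath

def twirl(rho, numbers, samples=None):
    """Average U(theta)^* rho U(theta) over equally spaced phases, U(theta) = exp(i N theta).

    `numbers` lists the (integer) eigenvalues of N for the basis in which rho is written.
    """
    dim = len(rho)
    if samples is None:
        samples = max(numbers) - min(numbers) + 1   # enough points to cancel every coherence
    out = [[0j] * dim for _ in range(dim)]
    for s in range(samples):
        theta = 2 * cmath.pi * s / samples
        for j in range(dim):
            for k in range(dim):
                # (U^* rho U)_{jk} = exp(-i (n_j - n_k) theta) rho_{jk}
                phase = cmath.exp(-1j * (numbers[j] - numbers[k]) * theta)
                out[j][k] += phase * rho[j][k] / samples
    return out

if __name__ == "__main__":
    c0, c1 = 0.6, 0.8
    rho = [[c0 * c0, c0 * c1], [c1 * c0, c1 * c1]]   # P[c0|0> + c1|1>]
    for row in twirl(rho, [0, 1]):
        print(row)
```

The populations are untouched while the off-diagonal entries cancel, so the output commutes with the number operator.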

Thus, from the perspective of \(\yen \), completely delocalised reference states give rise to restricted quantities/states which are phase-shift invariant/commute with \(N_{\mathcal {S}}\). We note that the above relations (Eqs. (61) and (62)) also generalise, i.e.,

Proposition 6

Let \(\varLambda \) denote a general relative (invariant) self-adjoint operator acting in \(\mathcal {H}_{\mathcal {T}}\). If \(\omega \) is invariant, then \(\varGamma _{\omega }\) is covariant. Then there exists a self-adjoint \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\) for which

$$\begin{aligned} \varGamma _{\omega }(\varLambda ) = \frac{1}{2 \pi }\int _{S^1} U(\theta )AU(\theta )^* d \theta \equiv \tau _{\mathcal {S}}(A). \end{aligned}$$

The proof is a simple corollary of Lemma 4 and Proposition 1. We include it separately to emphasise the general property of invariance of restricted quantities obtained from general invariant quantities and invariant reference states.

6.3 Discussion

We have now seen that in the case of perfect reference phase localisation, absolute quantities of \(\mathcal {S}\), along with non-invariant states, provide an adequate theoretical and empirical account of the statistics produced by invariant quantities of \(\mathcal {S}+ \mathcal {R}\). In this case, the reference can be externalised, and excluded from the description. Though such a localised state is certainly a quantum state, there is a sense in which it may be viewed as classical: if the reference were provided by an abelian algebra (say, \(C_0(G)\), embedded in \(\mathcal {L}(L^2(G))\)), classical pure states would correspond to points in G, which are, of course, localised (moreover, as shown in [4], “good” reference frames must be large, pointing to some form of classicality). These observations go some way towards justifying the informal use of external/classical reference frames in working with non-invariant states of \(\mathcal {S}\), which has become commonplace in the literature.

The \(\tau _{\mathcal {S}}\) mapping (and \(\tau _{\mathcal {S}*}\)) manifests in (at least) two distinct ways. Initially, we observed that the assumption that observables of \(\mathcal {S}\) are invariant implies that \(\rho _{\mathcal {S}}\) and \(\tau _{\mathcal {S}*}(\rho _{\mathcal {S}})\) cannot be distinguished, yielding an equivalence class of indistinguishable states (indeed, the notion of state could be redefined as this class). Now we see that \(\tau _{\mathcal {S}*}\) produces a state description for \(\mathcal {S}\) applicable when \(\mathcal {R}\) is prepared in a completely phase-indefinite state (for example, an eigenstate of \(N_{\mathcal {R}}\)). Indeed, \(\tau _{\mathcal {S}*}\) (or (62)) is the “twirling” operation used in, e.g., [7], to yield a state description in which some observer of \(\mathcal {S}\) “lacks a phase reference”. There, it is argued that this is the description one would use if some experimenter wished to describe the state of \(\mathcal {S}\), but had no knowledge of the value of the (classical) phase reference that the state of \(\mathcal {S}\) implicitly refers to.

In our formulation, this averaging arises as part of the physical description of an experiment in which the reference is completely phase-indefinite. Number states are of this type, and the phase-indefiniteness is a quantum restriction arising from number-phase preparation uncertainty relations. Therefore, we are able to give an alternative interpretation of what the “lack of a phase reference” may be understood to refer to: it is the situation in which a quantum phase reference is completely phase-indeterminate. There is no need for epistemic arguments regarding information possessed by experimenters, nor any need to refer to classical phase references at all.

6.4 The Findings of [4]: General Considerations

We briefly review the findings of [4], which presents a study of “intermediate” situations for the reference frame, in between the very high and very low localisation covered in the present paper. Miyadera et al. [4] provide quantitative and operational size-versus-inaccuracy trade-off relations highlighting the necessity of large apparatus for good agreement between some arbitrary effect A and \(\varGamma _{\omega }(E)\) for invariant effect E.

We now state the main results more precisely. In the following, the Hilbert spaces involved are assumed finite dimensional. The operator norm \(\left\| \cdot \right\| \) on \(\mathcal {L}(\mathcal {H}_{\mathcal {S}})\) induces the metric \(D(A,B):= \left\| A-B\right\| = \sup _{\sigma }|\sigma (A) - \sigma (B)|\), which when restricted to the set of effects \(\mathcal {E}(\mathcal {H}_{\mathcal {S}})\) gives an operational measure of the discrepancy between the effects A and B.

In the following, the quantity \(W^0_{\epsilon }(\mu _{\omega _\mathcal {R}}^\mathsf {F})\) refers to the overall width (cf. Definition 7) of the probability measure \(\mu _{\omega _\mathcal {R}}^\mathsf {F}(X)\equiv \omega _{\mathcal {R}}\circ \mathsf {F}(X)\) around 0, i.e.,

$$\begin{aligned} W^0_{\epsilon }(\mu _{\omega _\mathcal {R}}^\mathsf {F}) := \inf \left\{ w \big |\ \mu _{\omega _{\mathcal {R}}}^{\mathsf {F}}\bigl ( \mathcal {I}(0,w)\bigr )\ge 1-\epsilon \right\} , \end{aligned}$$

where \(\mathcal {I}(\theta ,w)\) denotes the closed interval of width \(w \le 2 \pi \) centred at \(\theta \in [-\pi , \pi )\). Of course, \(W^0_{\epsilon }(\mu _{\omega _\mathcal {R}}^\mathsf {F}) \ge W_{\epsilon }(\mu _{\omega _\mathcal {R}}^\mathsf {F})\). This \(\inf \) can be replaced by \(\min \) (see, e.g., [35, Chapter 12]).

In the case in which relative quantities of \(\mathcal {S}+ \mathcal {R}\) are obtained through \(\yen \), strong localisation of \(\omega \) around \(\theta = 0\) gives good approximation between absolute and relativised effects:

Proposition 7

Let \(\yen \) be a relativisation map and \(\varGamma _{\omega }\) a restriction map. For an arbitrary effect A and \(0\le \epsilon <1\), it holds that

$$\begin{aligned} D\bigl (A, \varGamma _{\omega }(\yen (A))\bigr ) \le \bigl \Vert [N_{\mathcal {S}}, A]\bigr \Vert \left( \tfrac{1}{2}{W^0_{\epsilon }(\mu _{\omega }^{\mathsf {F}})}(1-\epsilon ) + \pi \epsilon \right) . \end{aligned}$$

Bad localisation gives bad approximation:

Proposition 8

For \(A= \frac{1}{2}\bigl (|0\rangle \langle 0| + |1\rangle \langle 1| +|0\rangle \langle 1| + |1\rangle \langle 0|\bigr )\), it holds that

$$\begin{aligned} D\bigl (A, \varGamma _{\omega _{\mathcal {R}}}( \yen (A))\bigr ) \ge \frac{\epsilon }{2}\Bigl ( 1- \cos \left( \tfrac{1}{2}{W^0_{\epsilon }(\mu _{\omega _\mathcal {R}}^{\mathsf {F}})}\right) \Bigr ). \end{aligned}$$

And finally,

Theorem 2

Let A be an effect defined by \(A=\frac{1}{2}(|0\rangle \langle 0 | + |1\rangle \langle 1| + |0\rangle \langle 1 | + |1\rangle \langle 0|)\). For \(\omega _{\mathcal {R}}\) satisfying \(\varDelta _{\omega _{\mathcal {R}}} N_{\mathcal {R}} < \frac{1}{6}\),

$$\begin{aligned} D(A, \varGamma _{\omega _{\mathcal {R}}}(\yen (A))) > \frac{1}{32}. \end{aligned}$$

For \(\omega _{\mathcal {R}}\) satisfying \(\varDelta _{\omega _\mathcal {R}} N_{\mathcal {R}} \ge \frac{1}{6}\), it holds that

$$\begin{aligned} D(A, \varGamma _{\omega _{\mathcal {R}}}(\yen (A))) \ge \frac{1}{32}\left( 1- \cos \left( \frac{\pi }{12\varDelta _{\omega _{\mathcal {R}}}N_{\mathcal {R}}}\right) \right) . \end{aligned}$$

One may go beyond the case of invariant quantities being obtained using \(\yen \), and consider general invariant quantities. The following holds for a general (finite-dimensional, connected) Lie group G, acting via projective unitary representations \(U_{\mathcal {S}}\) and \(U_{\mathcal {R}}\) in \(\mathcal {H}_{\mathcal {S}}\) and \(\mathcal {H}_{\mathcal {R}}\) respectively, with self-adjoint generators \(N_{\mathcal {S}}\) and \(N_{\mathcal {R}}\).

Theorem 3

Recall that \(V(A)=\Vert A-A^2\Vert \), \(D(A,B) = \left\| A-B\right\| \), \(N_{\mathcal {S}}\) and \(N_{\mathcal {R}}\) are number operators on \(\mathcal {H}_{\mathcal {S}}\) and \(\mathcal {H}_{\mathcal {R}}\) respectively, and \(\omega _{\mathcal {R}} \in \mathcal {S}(\mathcal {H}_{\mathcal {R}})\). Then the following inequality holds:

$$\begin{aligned} \bigl \Vert [A, N_\mathcal {S}]\bigr \Vert&\le 2 D\bigl (\varGamma (E), A\bigr ) \Vert N_\mathcal {S}\Vert +2 \left( \omega _{\mathcal {R}}(N_{\mathcal {R}}^2)-\omega _{\mathcal {R}}(N_{\mathcal {R}})^2\right) ^{1/2}\\&\quad \times \,\bigl (2D(\varGamma (E),A)+V(A)\bigr )^{1/2}. \end{aligned}$$

Therefore, for good approximation between arbitrary absolute effects (of \(\mathcal {S}\)) and relative effects (of \(\mathcal {S}+ \mathcal {R}\)), a large spread in the reference’s number operator is required, and good phase localisation is sufficient for this.
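As a rough numerical illustration of this trade-off (not taken from [4]), the sketch below evaluates both sides of the second inequality of Theorem 2 for the reference states \(\phi _n\) of Example 4. It rests on several assumptions: that \(\varDelta _{\omega _{\mathcal {R}}}N_{\mathcal {R}}\) denotes the standard deviation of \(N_{\mathcal {R}}\); that \(\varGamma _{\phi _n}(\yen (\sigma _1)) = \frac{n}{n+1}\sigma _1\), as read off from the expectation values in Example 4; and that the finite-dimensional statement may be applied to \(\phi _n\), which lives in an infinite-dimensional reference space. The function names are hypothetical.

```python
import math

def discrepancy(n):
    """D(A, Gamma(Yen(A))) for A = (1 + sigma_1)/2 and reference state phi_n.

    Under the stated assumption, A - Gamma(Yen(A)) = sigma_1 / (2 (n+1)),
    whose operator norm is 1 / (2 (n+1)).
    """
    return 1.0 / (2.0 * (n + 1))

def number_spread(n):
    """Standard deviation of N_R in phi_n (uniform distribution on 0..n)."""
    return math.sqrt(n * (n + 2) / 12.0)

def theorem2_lower_bound(spread):
    """Right-hand side of the second inequality in Theorem 2 (for spread >= 1/6)."""
    return (1.0 - math.cos(math.pi / (12.0 * spread))) / 32.0

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        s = number_spread(n)
        print(n, s, discrepancy(n), theorem2_lower_bound(s))
```

In this family the discrepancy decays like \(1/n\) while the number spread grows like \(n\), which is consistent with the statement that small inaccuracy requires a large spread in \(N_{\mathcal {R}}\).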

6.5 Absolute Coherence

The stipulation that observable quantities are invariant under the given symmetry action bears strongly upon the possibility of operationally discerning between coherent superpositions (of eigenstates of the generator, in our case, number) and incoherent mixtures. It will therefore be useful to have a simple working definition and quantification of the absolute coherence of states with respect to a number operator (see [36] for other measures and observables).

Definition 8

Let N be a number operator acting in (generic Hilbert space) \(\mathcal {H}\) and \(\rho \) a density matrix. Then \(\rho \) is absolutely coherent with respect to N if \(\tau _*(\rho ) \ne \rho \), and (absolutely) incoherent otherwise.

This suggests a measure of absolute coherence:

Definition 9

The absolute coherence \(\mathcal {C}(\rho ) :=\frac{1}{2}\left\| \rho - \tau _*(\rho )\right\| _1\).

Clearly, then, a state \(\rho \) is absolutely coherent if and only if it is not invariant under \(\rho \mapsto e^{iN\theta }\rho e^{-iN\theta }\) for all \(\theta \). Let \(\{|n\rangle \}\) be a (possibly infinite) orthonormal basis of \(\mathcal {H}\) consisting of eigenvectors of N. For an arbitrary state \(\varphi = \sum _n c_n |n\rangle \), no self-adjoint operator A commuting with N, i.e., no observable quantity, can distinguish between \(P[\varphi ]\) and \(\tau _{*}(P[\varphi ]) = \sum _n {\left| c_n \right| ^2}|n\rangle \langle n|\), i.e., between a state with absolute coherence and a state without it (Proposition 2).
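For a single qubit, Definition 9 can be evaluated in closed form. The sketch below assumes a pure state \(c_0|0\rangle + c_1|1\rangle \) written in the eigenbasis of a non-degenerate N; the function name is illustrative.

```python
def absolute_coherence_qubit(c0, c1):
    """C(rho) = (1/2) ||rho - tau_*(rho)||_1 for rho = P[c0|0> + c1|1>].

    rho - tau_*(rho) keeps only the off-diagonal entries c0*conj(c1) and its
    conjugate: a traceless Hermitian 2x2 matrix with eigenvalues +/- |c0*c1|.
    """
    norm_sq = abs(c0) ** 2 + abs(c1) ** 2            # allow unnormalised input
    off_diagonal = c0 * complex(c1).conjugate() / norm_sq
    eigenvalues = (abs(off_diagonal), -abs(off_diagonal))
    return 0.5 * sum(abs(x) for x in eigenvalues)    # half the trace norm

if __name__ == "__main__":
    print(absolute_coherence_qubit(1, 0))      # number eigenstate
    print(absolute_coherence_qubit(1, 1))      # equal superposition
    print(absolute_coherence_qubit(0.6, 0.8j))
```

Number eigenstates have no absolute coherence, and the equal superposition attains the largest value available to a qubit, namely \(|c_0c_1| = 1/2\).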

Since localised states with respect to phase conjugate to N are not invariant, they are necessarily absolutely coherent. We now show that highly localised states have large absolute coherence, for the case of \(S^1\) and finite dimensional Hilbert spaces.

Proposition 9

Consider a covariant pom \(\mathsf {E}\) of \(S^1\) on a finite dimensional Hilbert space \(\mathcal {H}_{\mathcal {S}}\). For \(\rho \) with overall width \(W_{\epsilon }(\mu ^{\mathsf {E}}_{\rho })\) and for an arbitrary \(\epsilon \), it holds that

$$\begin{aligned} \mathcal {C}(\rho ) \ge 1 -2\epsilon - \frac{3W_{\epsilon }(\mu ^{\mathsf {E}}_{\rho })}{2\pi } (1-2 \epsilon ). \end{aligned}$$

Proof

For simplicity we assume \(\varDelta = [-W/2, W/2]\) satisfies \(\text{ tr }(\rho \mathsf {E}(\varDelta ))= 1-\epsilon \), where \(W= W_{\epsilon }(\mu ^{\mathsf {E}}_{\rho })\). Since the claim is trivial for \(3W/2 \ge \pi \), we assume \(3W/2 <\pi \). Since \(\mathcal {C}(\rho )= \frac{1}{2}\left\| \rho - \tau _*(\rho )\right\| _1 = \sup _{E: O\le E \le \mathbb {1}} | \text{ tr }(\rho E) - \text{ tr }(\rho \tau (E))|\) and \( O\le \mathsf {E}(\varDelta ) \le \mathbb {1}\) hold, we have

$$\begin{aligned} \mathcal {C}(\rho ) \ge \text{ tr }( \rho \mathsf {E}(\varDelta ))- \frac{1}{2\pi } \int d\theta \text{ tr }( \rho \mathsf {E}(\varDelta + \theta )). \end{aligned}$$

The first term on the right-hand side is \(1-\epsilon \). The second term on the right-hand side is now estimated. For any \(\theta \), by the definition of the overall width, \(\text{ tr }(\rho \mathsf {E}(\varDelta +\theta )) \le 1- \epsilon \) holds. For \(\theta \in S^1 \setminus [-3W/2,3W/2]\), since \((\varDelta +\theta ) \cap \varDelta = \emptyset \) holds, we have \(\text{ tr }(\rho \mathsf {E}(\varDelta +\theta ) )\le \epsilon \). Thus we obtain

$$\begin{aligned} \frac{1}{2\pi } \int d\theta \text{ tr }( \rho \mathsf {E}(\varDelta + \theta ))&= \frac{1}{2\pi } \int ^{3W/2}_{-3W/2} d\theta \text{ tr }( \rho \mathsf {E}(\varDelta + \theta ))\\&\quad + \frac{1}{2\pi } \int _{S^1 \setminus [-3W/2, 3W/2]} d\theta \text{ tr }( \rho \mathsf {E}(\varDelta + \theta )) \\&\le \frac{3W}{2\pi }(1-\epsilon ) + \frac{2\pi -3W}{2\pi } \epsilon = \frac{3W}{2\pi }(1-2\epsilon ) +\epsilon . \end{aligned}$$
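For completeness, the final step can be spelled out. Inserting this upper bound, together with the value \(1-\epsilon \) of the first term, into the lower bound for \(\mathcal {C}(\rho )\) gives the claimed inequality:

$$\begin{aligned} \mathcal {C}(\rho ) \ge (1-\epsilon ) - \left[ \frac{3W}{2\pi }(1-2\epsilon ) + \epsilon \right] = 1 - 2\epsilon - \frac{3W}{2\pi }(1-2\epsilon ). \end{aligned}$$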

\(\square \)

6.6 Summary and Analysis of a Potential Objection

Returning to the situation in which we identify a system \(\mathcal {S}\) and a reference \(\mathcal {R}\) with number observables \(N_{\mathcal {S}}\) and \(N_{\mathcal {R}}\) respectively, an apparent circularity arises. Briefly summarising the story so far, we have argued that observable quantities are (defined as) those which are invariant under given symmetries. Given a quantum object, this entails that, in the phase-shift-invariance case, the states \(\rho \) and \(\tau _{\mathcal {T}_*}(\rho )\) are observationally equivalent and occupy the same equivalence class. From this point of view, absolute coherence is not a necessary feature of any description of the quantum object.

However, we have argued that in certain circumstances the given object may be separated into two parts: system \(\mathcal {S}\) and reference \(\mathcal {R}\), and that invariant quantities of \(\mathcal {S}+ \mathcal {R}\) can, in the case of \(\mathcal {R}\) having a phase quantity possessing the norm-1 property, be arbitrarily well approximated by absolute quantities of \(\mathcal {S}\), given a highly localised state of \(\mathcal {R}\). Absolute quantities are, in particular, sensitive to the difference between an absolutely coherent state \(\rho \) and its invariant, absolutely incoherent counterpart \(\tau _{{\mathcal {S}}_*}(\rho )\).

Therefore, a description of \(\mathcal {S}+ \mathcal {R}\) in terms of only restricted/absolute quantities of \(\mathcal {S}\) (not commuting with \(N_{\mathcal {S}}\)), along with states with absolute coherence is possible, given states localised with respect to the absolute phase of \(\mathcal {R}\), which requires that such states have absolute coherence with respect to \(N_{\mathcal {R}}\). This poses a difficulty, since it appears that we claim the description in terms of \(\mathcal {S}\) alone is a relational one, depending implicitly on \(\mathcal {R}\), but have offered no such account for \(\mathcal {R}\). Moreover, the appearance of absolute coherence (of states of) \(\mathcal {S}\) appears to depend on the actuality of absolute coherence of (states of) \(\mathcal {R}\).

What we therefore seek to develop in the next section is a “fully relational” picture in which \(\mathcal {S}\) and \(\mathcal {R}\) are treated on an equal footing. What emerges is that coherence is a truly relational notion in quantum mechanics, requiring two systems for its definition. From this, through development of the new concept of mutual coherence, we are able to give an analysis of interference experiments in terms of mutual coherence, and provide novel perspectives on the “reality” of optical coherence and the subtle issue of superselection rules and their relationship with quantum reference frames.

7 Fully Relational Picture

In order to obviate the objection raised in the previous section, we now rephrase our findings in what we describe as a fully relational picture, that is, in presenting our main results without taking recourse to absolute coherence, absolute localisation, absolute quantities, etc. In short, we may present the main theorem of the paper so far—Theorem 1—in a fully invariant manner for \(\mathcal {S}+ \mathcal {R}\). This does not change the mathematical content of the theorem, but highlights that only invariant states of \(\mathcal {S}+ \mathcal {R}\) are required for good approximation of relational quantities by absolute ones, from which we may conclude that non-invariant states of \(\mathcal {S}\) are representative of invariant ones of \(\mathcal {S}+ \mathcal {R}\), in direct analogy to the case of observables. It also motivates the concept of mutual coherence, to be presented in the next section.

7.1 States

We return once again to the situation wherein \(\mathcal {R}\) has phase quantity satisfying the norm-1 property. First, with \((\phi _i) \subset \mathcal {H}_{\mathcal {R}}\) a localising sequence (around 0), we may write Eq. (44) as

$$\begin{aligned} \lim _{i \rightarrow \infty }\text {tr}\left[ \rho \otimes P[\phi _i] \yen (A)\right] = \text {tr}\left[ \rho A\right] , \end{aligned}$$
(63)

holding for all \(\rho \in \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\) and \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\). Thus, since \(\yen (A)\) is invariant we find that

$$\begin{aligned} \lim _{i \rightarrow \infty }\text {tr}\left[ \tau _{\mathcal {T}_*}(\rho \otimes P[\phi _i]) \yen (A)\right] = \text {tr}\left[ \rho A\right] \end{aligned}$$
(64)

for all \(\rho \in \mathcal {L}_1(\mathcal {H}_{\mathcal {S}})\) and \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {S}})\). Hence, the limit and the resulting approximation may be carried out using only invariant/absolutely incoherent states of \(\mathcal {S}+ \mathcal {R}\).
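The step from (63) to (64) rests only on the invariance of \(\yen (A)\). As a short sketch, write \(\varTheta \) for \(\rho \otimes P[\phi _i]\) and assume the usual duality \(\text {tr}\left[ \tau _{\mathcal {T}_*}(\varTheta ) B\right] = \text {tr}\left[ \varTheta \, \tau _{\mathcal {T}}(B)\right] \) between \(\tau _{\mathcal {T}}\) and its predual map:

$$\begin{aligned} \text {tr}\left[ \tau _{\mathcal {T}_*}(\varTheta ) \yen (A)\right] = \text {tr}\left[ \varTheta \, \tau _{\mathcal {T}}(\yen (A))\right] = \text {tr}\left[ \varTheta \yen (A)\right] . \end{aligned}$$

Each term of the sequence in (63) is therefore unchanged when \(\rho \otimes P[\phi _i]\) is replaced by its invariant counterpart.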

Just as absolute quantities of \(\mathcal {S}\) may be used to represent invariant ones of \(\mathcal {S}+ \mathcal {R}\), with good approximation coming with good localisation, a state \(\rho \) of \(\mathcal {S}\) with absolute coherence may be used to represent invariant/absolutely incoherent states of the form \(\tau _{\mathcal {T}_*}({\rho \otimes P[\phi ]})\) of \(\mathcal {S}+ \mathcal {R}\), again with good approximation coming with high phase localisation of \(\phi \).

However, we recall that there are difficulties with ascribing physical significance to the absolutely localised state \(\phi \) with respect to the absolute phase pom appearing in the definition of \(\yen \), namely, absolute properties such as localisation and coherence, and absolute quantities such as phase should be understood as relative to a reference, with the reference being only implicit. Therefore, we should seek a consistent formulation in which absolute properties of \(\mathcal {R}\) are not required—the description should be entirely relational. The above discussion is a step in this direction: the states \(\{\tau _{\mathcal {T}_*}({\rho \otimes P[\phi ]})\}\) for some localised \(\phi \) no longer “contain” the localised \(\phi \) in the sense that the partial trace over system or reference yields invariant/delocalised/absolutely incoherent states, and therefore a localised state cannot be attributed to \(\mathcal {R}\).
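The claim that the partial traces of \(\tau _{\mathcal {T}_*}(\rho \otimes P[\phi ])\) are invariant can be illustrated numerically. The following is a minimal finite-dimensional sketch and not part of the original analysis. The dimensions, the truncated number bases and the uniform superposition states are illustrative assumptions, and the helper names are hypothetical.

```python
import numpy as np

def total_number_twirl(theta, d_s, d_r):
    """tau_T(theta) = sum_N P_N theta P_N for N_T = N_S + N_R.

    In the product number basis |n>|m> the projections P_N are diagonal,
    so the twirl keeps exactly the matrix entries whose row and column
    carry the same total number n + m.
    """
    totals = (np.arange(d_s)[:, None] + np.arange(d_r)[None, :]).reshape(-1)
    return theta * (totals[:, None] == totals[None, :])

def partial_traces(theta, d_s, d_r):
    """Return (tr_R theta, tr_S theta) for theta on H_S (x) H_R."""
    t = theta.reshape(d_s, d_r, d_s, d_r)      # t[n, m, n', m']
    return np.einsum('imjm->ij', t), np.einsum('nknl->kl', t)

d_s, d_r = 3, 4                                 # assumed (truncated) dimensions
varphi = np.ones(d_s) / np.sqrt(d_s)            # coherent system state (assumption)
phi = np.ones(d_r) / np.sqrt(d_r)               # stand-in for a localised reference state
theta = np.kron(np.outer(varphi, varphi), np.outer(phi, phi))

twirled = total_number_twirl(theta, d_s, d_r)
rho_s, rho_r = partial_traces(twirled, d_s, d_r)

# Both reduced states are diagonal in the number basis (invariant),
# although the twirled joint state itself still has off-diagonal entries.
reduced_are_invariant = (np.allclose(rho_s, np.diag(np.diag(rho_s)))
                         and np.allclose(rho_r, np.diag(np.diag(rho_r))))
joint_has_coherence = not np.allclose(twirled, np.diag(np.diag(twirled)))
```

In this sketch the coherence survives only in the joint state, within the blocks of fixed total number, which is the sense in which a localised state can no longer be attributed to \(\mathcal {R}\) alone.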

We now introduce the concept of mutual coherence, which we view as the fully relational version of ordinary coherence.

8 Coherence Revisited: Mutual Coherence

The need for a relational understanding of coherence has been clearly enunciated in the literature (e.g., [7, 10, 12, 37]). However, little formalism is provided to deal precisely with such a relational notion, and there is no framework capable of making sense of an external classical frame (which appears in [7, 12, 37]). We will analyse this in more detail in Sect. 11; here we introduce the concept of mutual coherence, which we view as the relational counterpart (and a generalisation) of absolute coherence (usually referred to simply as coherence in standard treatments).

We treat the number/phase case, recalling that we view poms which are invariant under relevant symmetry transformations as the truly observable quantities, and use the term “invariant quantity” for such objects.

Lemma 5

It holds that

$$\begin{aligned} (\tau _{\mathcal {S}}\otimes id)\circ \tau _{\mathcal {T}} = (id \otimes \tau _{\mathcal {R}}) \circ \tau _{\mathcal {T}} =(\tau _{\mathcal {S}}\otimes \tau _{\mathcal {R}}) \circ \tau _{\mathcal {T}}. \end{aligned}$$
(65)

Proof

We denote eigenvalue decompositions by \(N_{\mathcal {S}}= \sum _n n P^{\mathcal {S}}_n\), \(N_{\mathcal {R}}= \sum _m m P^{\mathcal {R}}_m\) and \(N_{\mathcal {T}}= N_{\mathcal {S}}+N_{\mathcal {R}}= \sum _N N P_N\). Then, \(P_N = \sum _{n+m= N} P^{\mathcal {S}}_n \otimes P^{\mathcal {R}}_m\) and

$$\begin{aligned} \tau _{\mathcal {T}}(A)= & {} \sum _N P_N A P_N \\= & {} \sum _N \sum _{n_1+m_1=N} \sum _{n_2+ m_2= N} (P^{\mathcal {S}}_{n_1}\otimes P^{\mathcal {R}}_{m_1})A (P^{\mathcal {S}}_{n_2} \otimes P^{\mathcal {R}}_{m_2}). \end{aligned}$$

Since \(\tau _{\mathcal {S}}(B) = \sum _n P^{\mathcal {S}}_n B P^{\mathcal {S}}_n\) and \(\tau _{\mathcal {R}}(C) = \sum _m P^{\mathcal {R}}_m C P^{\mathcal {R}}_m\), a simple calculation shows that

$$\begin{aligned} (\tau _{\mathcal {S}}\otimes id)\circ \tau _{\mathcal {T}}(A)= & {} \sum _N \sum _{n+m= N} (P^{\mathcal {S}}_n \otimes P^{\mathcal {R}}_m) A (P^{\mathcal {S}}_n \otimes P^{\mathcal {R}}_m) \\= & {} (id \otimes \tau _{\mathcal {R}})\circ \tau _{\mathcal {T}}(A) =(\tau _{\mathcal {S}} \otimes \tau _{\mathcal {R}}) \circ \tau _{\mathcal {T}}(A). \end{aligned}$$

\(\square \)
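Lemma 5 can be checked numerically in finite dimensions. The sketch below is an illustration only: the dimensions and the random test operator are assumptions, and `twirl` is a hypothetical helper that implements \(\sum _k P_k A P_k\) for projections that are diagonal in the product number basis.

```python
import numpy as np

def twirl(a, labels):
    """sum_k P_k a P_k, with P_k projecting onto basis vectors labelled k."""
    labels = np.asarray(labels)
    return a * (labels[:, None] == labels[None, :])

d_s, d_r = 3, 4                               # assumed dimensions
n_s = np.repeat(np.arange(d_s), d_r)          # N_S eigenvalue of |n>|m>
n_r = np.tile(np.arange(d_r), d_s)            # N_R eigenvalue of |n>|m>
n_t = n_s + n_r                               # N_T = N_S + N_R

rng = np.random.default_rng(0)
dim = d_s * d_r
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

tau_t_a = twirl(a, n_t)
left = twirl(tau_t_a, n_s)                    # (tau_S (x) id) o tau_T
middle = twirl(tau_t_a, n_r)                  # (id (x) tau_R) o tau_T
right = twirl(twirl(tau_t_a, n_r), n_s)       # (tau_S (x) tau_R) o tau_T

lemma_5_holds = np.allclose(left, middle) and np.allclose(middle, right)
```

The agreement reflects the observation in the proof: once the total number is fixed, fixing n determines m and vice versa.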

Corollary 2

The following two conditions are equivalent.

  (i) There exists an invariant quantity \(\mathsf {E}\) (thus \(\tau _{\mathcal {T}}(\mathsf {E}(X)) = \mathsf {E}(X)\) for all X) and an X such that

    $$\begin{aligned} \text{ tr }[(\tau _{\mathcal {S}*}(\rho _{\mathcal {S}}) \otimes \rho _{\mathcal {R}})\mathsf {E}(X)] \ne \text{ tr }[(\rho _{\mathcal {S}}\otimes \rho _{\mathcal {R}}) \mathsf {E}(X)]. \end{aligned}$$

  (ii) There exists an invariant quantity \(\mathsf {E}\) (\(\tau _{\mathcal {T}}(\mathsf {E}(X)) = \mathsf {E}(X)\) for all X) and an X such that

    $$\begin{aligned} \text{ tr }[(\rho _{\mathcal {S}} \otimes \tau _{\mathcal {R}*}(\rho _{\mathcal {R}}))\mathsf {E}(X)] \ne \text{ tr }[(\rho _{\mathcal {S}}\otimes \rho _{\mathcal {R}}) \mathsf {E}(X)]. \end{aligned}$$

Proof

Assume (i) holds. Then for \(\mathsf {E}(X)\) satisfying the condition (i),

$$\begin{aligned} \text{ tr }[(\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}) \mathsf {E}(X)]\ne & {} \text{ tr }[(\tau _{\mathcal {S}* }(\rho _{\mathcal {S}}) \otimes \rho _{\mathcal {R}})\mathsf {E}(X)] \qquad \\= & {} \text{ tr }[(\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}})(\tau _{\mathcal {S}} \otimes id) \circ \tau _{\mathcal {T}}(\mathsf {E}(X))]\\= & {} \text{ tr }[(\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}) (id \otimes \tau _{\mathcal {R}}) \circ \tau _{\mathcal {T}}(\mathsf {E}(X))]\\= & {} \text{ tr }[(\rho _{\mathcal {S}} \otimes \tau _{\mathcal {R}*}(\rho _{\mathcal {R}})) \mathsf {E}(X)]. \end{aligned}$$

Thus (ii) follows and vice versa. \(\square \)

Moreover, one can observe that for condition (i) to hold both \(\tau _{\mathcal {S}*}(\rho _{\mathcal {S}}) \ne \rho _{\mathcal {S}}\) and \(\tau _{\mathcal {R}*} (\rho _{\mathcal {R}}) \ne \rho _{\mathcal {R}}\) must be satisfied.

Therefore, since \((\tau _{\mathcal {S}} \otimes id) \circ \tau _{\mathcal {T}} = (id \otimes \tau _{\mathcal {R}}) \circ \tau _{\mathcal {T}}\), one can conclude that a system state \(\rho _{\mathcal {S}}\) is coherent relative to the reference state \(\rho _{\mathcal {R}}\) if and only if \(\rho _{\mathcal {R}}\) is coherent relative to \(\rho _{\mathcal {S}}\). Thus, coherence has a truly relational character. This motivates the following definition:

Definition 10

A pair of states \((\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) is called mutually coherent if either of the conditions (i) or (ii) of Corollary 2 holds.

This may be generalised to an arbitrary (possibly non-separable) state \(\varTheta \in \mathcal {S}(\mathcal {H}_{\mathcal {T}})\).

Definition 11

A state \(\varTheta \) of \(\mathcal {S}+ \mathcal {R}\) is said to be mutually coherent (with respect to \(\mathcal {S}\) and \(\mathcal {R}\)) if

  (i)’ there exists an invariant observable \(\mathsf {E}\) and an X such that

    $$\begin{aligned} \text{ tr }[(\tau _{\mathcal {S}*}\otimes id)(\varTheta )\mathsf {E}(X)] \ne \text{ tr }[\varTheta \mathsf {E}(X)] \end{aligned}$$

    or (equivalently)

  (ii)’ there exists an invariant observable \(\mathsf {E}\) and an X such that

    $$\begin{aligned} \text{ tr }[(id\otimes \tau _{\mathcal {R}*})(\varTheta )\mathsf {E}(X)] \ne \text{ tr }[\varTheta \mathsf {E}(X)]. \end{aligned}$$

A quantitative measure \(\mathcal {M}(\varTheta )\) of mutual coherence of \(\varTheta \) may be provided by the quantity (where the supremum is taken over invariant effects)

$$\begin{aligned} \mathcal {M}(\varTheta ):= & {} \sup _{E} \bigl | \text{ tr }\bigl [((\tau {_{\mathcal {S}*}}\otimes id)(\varTheta ) - \varTheta )E\bigr ]\bigr |\\= & {} \sup _{E} \bigl | \text{ tr }\bigl [ ((id\otimes \tau {_{\mathcal {R}*}})(\varTheta ) - \varTheta )E\bigr ]\bigr |\\= & {} \sup _{E} \bigl |\text{ tr }\bigl [((\tau {_{\mathcal {S}*}}\otimes \tau {_{\mathcal {R}*}})(\varTheta ) - \varTheta )E\bigr ]\bigr |. \end{aligned}$$

The above equalities follow easily from Lemma 5. For \(\varTheta = \rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\) we may write this quantitative measure as \(\mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R})\).
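In finite dimensions the measure \(\mathcal {M}\) can be evaluated directly. The sketch below assumes the following reduction, which is not stated in the text: since \(\tau _{\mathcal {T}}\) maps the effects onto the invariant effects and \(\varDelta = (\tau _{\mathcal {S}*}\otimes id)(\varTheta ) - \varTheta \) is traceless and self-adjoint, the supremum over invariant effects equals half the trace norm of \(\tau _{\mathcal {T}}(\varDelta )\). The function names and the qubit example are hypothetical illustrations.

```python
import numpy as np

def twirl(a, labels):
    """sum_k P_k a P_k, with P_k projecting onto basis vectors labelled k."""
    labels = np.asarray(labels)
    return a * (labels[:, None] == labels[None, :])

def mutual_coherence(theta, d_s, d_r):
    """M(theta) = (1/2) || tau_T( (tau_S (x) id)(theta) - theta ) ||_1.

    Assumes finite dimensions and the product number basis |n>|m>.
    """
    n_s = np.repeat(np.arange(d_s), d_r)
    n_t = n_s + np.tile(np.arange(d_r), d_s)
    delta = twirl(theta, n_s) - theta
    eigenvalues = np.linalg.eigvalsh(twirl(delta, n_t))
    return 0.5 * np.abs(eigenvalues).sum()

plus = np.full((2, 2), 0.5)      # |+><+| with |+> = (|0> + |1>)/sqrt(2)
vacuum = np.diag([1.0, 0.0])     # |0><0|, an invariant state

m_coherent_pair = mutual_coherence(np.kron(plus, plus), 2, 2)
m_invariant_reference = mutual_coherence(np.kron(plus, vacuum), 2, 2)
```

For these two qubit examples the sketch gives a mutual coherence of 0.25 for the pair of coherent states and 0 when the reference state is invariant.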

We note that this measure of mutual coherence is invariant with respect to the unitary representations \(U_{\mathcal {S}} \otimes id\) and \(id \otimes U_{\mathcal {R}}\) (and therefore also under \(U_{\mathcal {S}} \otimes U_{\mathcal {R}}\)). The following propositions show that if either (system or reference) state is invariant (absolutely incoherent), the mutual coherence vanishes, and at the other extreme (high reference localisation), the mutual coherence is well approximated by the absolute coherence (of the system state).

Proposition 10

$$\begin{aligned} \mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R}) \le \min \{ \mathcal {C}(\rho _\mathcal {S}), \mathcal {C}(\rho _\mathcal {R})\} \end{aligned}$$

Proof

$$\begin{aligned} \mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R}) \le \frac{1}{2} \Vert \tau _{\mathcal {S}*}(\rho _\mathcal {S}) \otimes \rho _{\mathcal {R}}-\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\Vert _{1} =\frac{1}{2}\Vert \tau _{\mathcal {S}*}(\rho _{\mathcal {S}}) - \rho _{\mathcal {S}}\Vert _1. \end{aligned}$$

\(\square \)

In particular, for an invariant state \(\rho _\mathcal {R}\), the mutual coherence \(\mathcal {M}(\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) vanishes.
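The displayed estimate in the proof gives the bound by \(\mathcal {C}(\rho _\mathcal {S})\). The bound by \(\mathcal {C}(\rho _\mathcal {R})\) follows in the same way from the second expression for \(\mathcal {M}\), the one involving \(id \otimes \tau _{\mathcal {R}*}\). As a sketch, assuming (as the proof above implicitly does) that \(\mathcal {C}(\rho ) = \frac{1}{2}\Vert \tau _{*}(\rho ) - \rho \Vert _1\):

$$\begin{aligned} \mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R}) \le \frac{1}{2} \Vert \rho _{\mathcal {S}} \otimes \tau _{\mathcal {R}*}(\rho _\mathcal {R}) -\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\Vert _{1} =\frac{1}{2}\Vert \tau _{\mathcal {R}*}(\rho _{\mathcal {R}}) - \rho _{\mathcal {R}}\Vert _1. \end{aligned}$$

Taking the smaller of the two bounds gives the minimum, and the right-hand side vanishes when \(\tau _{\mathcal {R}*}(\rho _{\mathcal {R}}) = \rho _{\mathcal {R}}\).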

Proposition 11

For a highly phase-localised state \(\rho _{\mathcal {R}}\) of \(\mathcal {R}\), \(\mathcal {M}(\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) is approximately \(\mathcal {C}(\rho _{\mathcal {S}})\) (the absolute coherence of \(\rho _{\mathcal {S}}\)).

Proof

(We give a proof for finite dimensional Hilbert spaces.) For highly localised \(\rho _\mathcal {R}\), we have shown (Proposition 7) that for an arbitrary effect E of \(\mathcal {S}\), \(\varGamma _{\rho _{\mathcal {R}}}(\yen (E))\) well approximates E, as

$$\begin{aligned} D(E, \varGamma _{\rho _{\mathcal {R}}}(\yen (E))) \le \Vert [N_{\mathcal {S}}, E]\Vert \left( \frac{1}{2}W^0_{\epsilon }(\mu ^{\mathsf {F}}_{\rho _\mathcal {R}})(1-\epsilon ) + \pi \epsilon \right) . \end{aligned}$$
(66)

From the definition of \(\mathcal {M}\) we observe that

$$\begin{aligned} \mathcal {M}(\rho _{\mathcal {S}}, \rho _{\mathcal {R}})&\ge \sup _{E} | \text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}})\otimes \rho _{\mathcal {R}} - \rho _{\mathcal {S}}\otimes \rho _{\mathcal {R}})\yen (E)) |\\&= \sup _{E} | \text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}}) - \rho _{\mathcal {S}})\varGamma _{\rho _{\mathcal {R}}}(\yen (E)))|. \end{aligned}$$

For a fixed effect E, we have, using (66):

$$\begin{aligned}&|\text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}}) -\rho _{\mathcal {S}})\varGamma _{\rho _{\mathcal {R}}}(\yen (E)))|\\&\ge |\text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}}) -\rho _{\mathcal {S}})E)| - |\text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}})-\rho _{\mathcal {S}}) (\varGamma _{\rho _{\mathcal {R}}}(\yen (E))- E))| \\&\ge |\text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}}) -\rho _{\mathcal {S}})E)|- 2 D(E, \varGamma _{\rho _{\mathcal {R}}} (\yen (E))) \\&\ge |\text{ tr }((\tau _{\mathcal {S}_*}(\rho _{\mathcal {S}}) -\rho _{\mathcal {S}})E)| -2\Vert N_{\mathcal {S}}\Vert \left( \frac{1}{2}W^0_{\epsilon }(\mu ^{\mathsf {F}}_{\rho _\mathcal {R}})(1-\epsilon ) + \pi \epsilon \right) . \end{aligned}$$

Since E is arbitrary, we have

$$\begin{aligned} \mathcal {M}(\rho _{\mathcal {S}}, \rho _{\mathcal {R}}) \ge \mathcal {C}(\rho _{\mathcal {S}}) - 2\Vert N_{\mathcal {S}}\Vert \left( \frac{1}{2}W^0_{\epsilon }(\mu ^{\mathsf {F}}_{\rho _\mathcal {R}})(1-\epsilon ) + \pi \epsilon \right) . \end{aligned}$$
(67)

We recall Proposition 10, which states that \(\mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R}) \le \min \{ \mathcal {C}(\rho _\mathcal {S}), \mathcal {C}(\rho _\mathcal {R})\}\). In the high localisation regime for \(\rho _{\mathcal {R}}\), the second expression on the right side of (67) becomes small and we may assume that \(\mathcal {C}(\rho _\mathcal {S}) \le \mathcal {C}(\rho _\mathcal {R})\). Therefore, we have the approximate equality \(\mathcal {M}(\rho _\mathcal {S}, \rho _\mathcal {R}) \approx \mathcal {C}(\rho _\mathcal {S})\), with the quality of approximation becoming arbitrarily good as \(\rho _{\mathcal {R}}\) becomes highly localised. \(\square \)

In other words, the mutual coherence takes on the appearance of absolute coherence in the high reference localisation limit.
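The last inequality in the chain leading to (67) replaces the factor \(\Vert [N_{\mathcal {S}}, E]\Vert \) of (66) by \(\Vert N_{\mathcal {S}}\Vert \). One way to justify this, sketched here using only \(0 \le E \le \mathbb {1}\) (so that \(\Vert E - \frac{1}{2}\mathbb {1}\Vert \le \frac{1}{2}\)), is

$$\begin{aligned} \Vert [N_{\mathcal {S}}, E]\Vert = \Vert [N_{\mathcal {S}}, E - \tfrac{1}{2}\mathbb {1}]\Vert \le 2 \Vert N_{\mathcal {S}}\Vert \, \Vert E - \tfrac{1}{2}\mathbb {1}\Vert \le \Vert N_{\mathcal {S}}\Vert . \end{aligned}$$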

We shall soon discuss the role of mutual coherence in interference phenomena and superselection rules. First, we note the following observations relating to approximation of relational observables by absolute quantities for some \(\yen (A)\) constructed using a phase pom possessing the norm-1 property.

Suppose that we have some non-invariant state \(\rho _{\mathcal {S}} \ne \tau _{\mathcal {S}*}(\rho _{\mathcal {S}})\). Then for arbitrary A, \(\text {tr}\left[ \rho _{\mathcal {S}}A\right] \) and \(\text {tr}\left[ \rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\yen (A)\right] \) can be made equal only if \((\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) is mutually coherent. The reason is clear: Suppose that \((\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) is not mutually coherent. Then, by Definition 10, for any invariant \(R \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) it must be that \(\text {tr}\left[ \rho _{\mathcal {S}} \otimes \tau _{\mathcal {R}*}(\rho _{\mathcal {R}})R\right] = \text {tr}\left[ \rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}} R\right] \). Then, \(\text {tr}\left[ \rho _{\mathcal {S}} \otimes \tau _{\mathcal {R}*}(\rho _{\mathcal {R}})R\right] = \text {tr}\left[ \rho _{\mathcal {S}} \varGamma _{\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})}(R)\right] \) (see Sect. 5.2) and (by Lemma 4) \(\varGamma _{\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})}(R)\) is invariant. But, due to the invariance of \(\varGamma _{\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})}(R)\), \(\text {tr}\left[ \rho _{\mathcal {S}}\varGamma _{\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})}(R)\right] = \text {tr}\left[ \tau _{\mathcal {S}*}(\rho _{\mathcal {S}})\varGamma _{\tau _{\mathcal {R}*}(\rho _{\mathcal {R}})}(R)\right] \). This latter quantity can never equal \(\text {tr}\left[ \rho _{\mathcal {S}}A\right] \) for non-invariant A and non-invariant \(\rho _{\mathcal {S}}\).

In fact, this also establishes the more general result that for any invariant \(R\in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) and non-invariant \(\rho _{\mathcal {S}}\), \(\text {tr}\left[ \rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}R\right] = \text {tr}\left[ \rho _{\mathcal {S}}A\right] \) for arbitrary A only if \((\rho _{\mathcal {S}}, \rho _{\mathcal {R}})\) is mutually coherent. Theorem 2 demonstrates for a specific non-invariant effect \(A \in \mathcal {E}(\mathcal {H}_{\mathcal {S}})\) that for a \(\rho _{\mathcal {R}}\) with poor localisation (\(\varDelta _{\rho _{\mathcal {R}}}N_{\mathcal {R}} < 1/6\)), the discrepancy \(D(A,\varGamma _{\rho _{\mathcal {R}}}(\yen (A))) > 1/32\), and therefore that \(\text {tr}\left[ \rho _{\mathcal {S}}A\right] \) and \(\text {tr}\left[ \rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\yen (A)\right] \) cannot even be close in this case.

9 Measurement

The enquiry thus far has been of a kinematical nature. We now consider the important role played by dynamical evolution of states and ensuing measurements, considered in light of the relational perspective presented. The main theorem regarding the role of symmetry in quantum measurements is the Wigner–Araki–Yanase (WAY) theorem [38,39,40], which addresses measurements in the presence of additive conserved quantities of system-plus-reference. After presenting the essentials of the quantum theory of measurement required for our analysis, we present a “strong” form of the WAY theorem, assuming the system on its own has a conserved quantity, followed by two readings of the WAY theorem: the orthodox reading, as presented in [41], and the relational viewpoint.

9.1 Measurement Theory: Brief Overview

We briefly describe the quantum theory of measurement of relevance to this work. For simplicity we present these concepts without the impositions of symmetry.

Let \(\mathcal {H_S}\) be the Hilbert space representing a quantum system \(\mathcal {S}\) under investigation, \(\mathcal {H}_{\mathcal {A}}\) that representing a measuring apparatus, with the combined system then given by \(\mathcal {H}=\mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {A}}\). A unitary mapping \(U: \mathcal {H} \rightarrow \mathcal {H}\) models a measurement interaction, serving to correlate the states of the system to those of the apparatus during an interaction period T. The specification of a self-adjoint “pointer observable” Z on \(\mathcal {H_A}\), a fixed state \(\phi \in \mathcal {H_A}\) (which for convenience is assumed to be pure) and the scaling function f (which maps the values of the pointer to those of the measured observable) then fix the measurement scheme \(\mathcal {M} \equiv \langle \mathcal {H_A},U,\phi ,Z,f\rangle \) for observable \(\mathsf {E}\) of \(\mathcal {S}\). With \(\varPsi _{T }=U(\varphi \otimes \phi )\in \mathcal {H}\), \(\mathcal {M}\) must satisfy the probability reproducibility condition:

$$\begin{aligned} \left\langle \varPsi _{T }|\mathbb {1} \otimes \mathsf {E}^{Z}\bigl (f^{-1}(X)\bigr )\varPsi _{T} \right\rangle \equiv \left\langle \varphi |\mathsf {E}(X)\varphi \right\rangle , \end{aligned}$$
(68)

where \( \mathsf {E}^{Z}\bigl (f^{-1}(X)\bigr )\) are spectral projections of Z, and (68) holds for all \(\varphi \) and X. In words, (68) stipulates that the outcome distribution for \(\mathsf {E}\) in any state \(\varphi \) may be recovered from the pointer statistics in the final state \(\varPsi _{T}\). Conversely, given a measurement scheme as described above, this relation determines the measured observable \(\mathsf {E}\).

A measurement (scheme) is said to be repeatable if, upon immediate repetition of the measurement, the same outcome is achieved with certainty. This may be written:

$$\begin{aligned} \left\langle \varPsi _{T} |\mathsf {E}(X)\otimes \mathsf {E}^{Z}\bigl (f^{-1}(X)\bigr )\varPsi _{T} \right\rangle =\left\langle \varphi |\mathsf {E}(X)\varphi \right\rangle . \end{aligned}$$
(69)

We note that (68) does not entail (69), and therefore the question of repeatability must be treated independently of that of probability reproducibility.
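The independence of (68) and (69) can be seen in a two-qubit toy model. This is a sketch and not taken from the text: the choice of a qubit system and apparatus, the couplings CNOT and SWAP, the apparatus state \(|0\rangle \) and the identity scaling function are all illustrative assumptions.

```python
import numpy as np

# Spectral projections shared by the measured qubit observable and the pointer Z.
P = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Two candidate couplings on H_S (x) H_A (system factor first).
CNOT = np.kron(P[0], I2) + np.kron(P[1], X)        # system controls the apparatus
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)

def statistics(U, varphi, phi):
    """Pointer statistics, joint statistics and target statistics."""
    psi = U @ np.kron(varphi, phi)
    pointer = [np.vdot(psi, np.kron(I2, Pk) @ psi).real for Pk in P]   # lhs of (68)
    joint = [np.vdot(psi, np.kron(Pk, Pk) @ psi).real for Pk in P]     # lhs of (69)
    target = [np.vdot(varphi, Pk @ varphi).real for Pk in P]           # rhs of both
    return pointer, joint, target

varphi = np.array([0.6, 0.8])      # arbitrary system state
phi = np.array([1.0, 0.0])         # fixed apparatus state

cnot_pointer, cnot_joint, target = statistics(CNOT, varphi, phi)
swap_pointer, swap_joint, _ = statistics(SWAP, varphi, phi)
```

Both couplings reproduce the target distribution in the pointer, so (68) holds for each. Only the CNOT coupling also satisfies (69). The SWAP coupling leaves the system in the fixed apparatus state, so a repetition need not return the same outcome.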

9.2 Conservation Laws: Strong and Weak WAY Theorems

We present here the standard version and interpretation of the theorem of Wigner, Araki and Yanase (WAY), as presented in [41], giving both the no-go part, prohibiting sharp, repeatable measurements of an observable (in the ordinary sense) which does not commute with (the system part of) an additive conserved quantity, and the positive part demonstrating conditions under which good approximation can be achieved. We begin with a version of the WAY theorem subject to a stronger assumption than is typical—that subsystem quantities are conserved—which we therefore refer to as the strong WAY theorem.

9.2.1 Strong Conservation

The stipulation that observability entails invariance follows as a theorem in the quantum theory of measurement from a constraint on \(\mathcal {M}\), namely the conservation of some quantity of \(\mathcal {H}_{\mathcal {S}}\) (see also [35], Ch. 21).

Consider the strongly continuous unitary group described by the operators \(U_{\mathcal {S}}(t) \equiv e^{itL_{\mathcal {S}}}\), with \(t \in \mathbb {R}\) or \(t \in [0, 2 \pi ]\) and \(L_{\mathcal {S}}\) a self-adjoint operator acting in \(\mathcal {H}_{\mathcal {S}}\). Then the following holds.\(^{3}\)

Proposition 12

Suppose that for any measurement scheme \(\mathcal {M}\) for \(\mathsf {E}\), \(L_{\mathcal {S}}\) is conserved, i.e., \([U,U_{\mathcal {S}}(t) \otimes \mathbb {1}]=0\). Then

$$\begin{aligned} U_{\mathcal {S}}(t) \mathsf {E}(X) U_{\mathcal {S}}(t)^* = \mathsf {E}(X) \end{aligned}$$
(70)

for all value sets \(X \in \mathcal {F}\) and all t.

Proof

It is sufficient to consider measurement schemes for which the pointer function f in (68) is the identity map. Equation (68) gives

$$\begin{aligned} \left\langle \,U\varphi \otimes \phi \,{|}\,\mathbb {1}\otimes {\mathsf E}^Z(X)U\varphi \otimes \phi \,\right\rangle =\left\langle \,\varphi \,{|}\,\mathsf {E}(X)\varphi \,\right\rangle \end{aligned}$$

for all object initial states \(\varphi \) and all X. Replacing \(\varphi \) with \(U_{\mathcal {S}}(t)\varphi \) does not change the left hand side, and therefore the right hand side is also unchanged, immediately giving (70). \(\square \)
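The invariance of the left-hand side used in the proof can be made explicit. The sketch below uses only the conservation assumption \([U,U_{\mathcal {S}}(t) \otimes \mathbb {1}]=0\) and the fact that \(U_{\mathcal {S}}(t) \otimes \mathbb {1}\) commutes with \(\mathbb {1}\otimes \mathsf {E}^Z(X)\):

$$\begin{aligned}&\left\langle \,U (U_{\mathcal {S}}(t)\varphi \otimes \phi )\,{|}\,\mathbb {1}\otimes \mathsf {E}^Z(X)\, U (U_{\mathcal {S}}(t)\varphi \otimes \phi )\,\right\rangle \\&\quad = \left\langle \,(U_{\mathcal {S}}(t)\otimes \mathbb {1})U (\varphi \otimes \phi )\,{|}\,\mathbb {1}\otimes \mathsf {E}^Z(X)\, (U_{\mathcal {S}}(t)\otimes \mathbb {1}) U (\varphi \otimes \phi )\,\right\rangle \\&\quad = \left\langle \,U (\varphi \otimes \phi )\,{|}\,\mathbb {1}\otimes \mathsf {E}^Z(X)\, U (\varphi \otimes \phi )\,\right\rangle . \end{aligned}$$

Hence \(\left\langle \,U_{\mathcal {S}}(t)\varphi \,{|}\,\mathsf {E}(X)U_{\mathcal {S}}(t)\varphi \,\right\rangle = \left\langle \,\varphi \,{|}\,\mathsf {E}(X)\varphi \,\right\rangle \) for all \(\varphi \) and all t, which is equivalent to (70).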

Remark 7

If \(\mathcal {S}\) is considered to be an elementary system, the operators \(U_{\mathcal {S}}(t)\) comprise an irreducible representation, in which case the only effects satisfying (70) are those of the form \(\mathsf {E}(X) = c_X \mathbb {1}\) (\(0 \le c_X \le 1\)), i.e., the trivial effects. If, however, \(\mathcal {S}\) is more complex, comprising several elementary systems for instance, \(\mathcal {S}\) can be separated into an “object” system \(\mathcal {S}_{\mathcal {O}}\) and a “reference” system \(\mathcal {S}_{\mathcal {R}}\). Absolute quantities of \(\mathcal {S}_{\mathcal {O}}\) then function only as representatives of observables of \(\mathcal {S}\) as a whole, and depending on the composition of \(\mathcal {S}_{\mathcal {R}}\) may or may not accurately represent observables; as we have seen, \(\mathcal {S}_{\mathcal {R}}\) under certain localisation requirements allows for the absolute quantities of \(\mathcal {S}_{\mathcal {O}}\) to be good representations of observables. In particular, \(\mathcal {S}_{\mathcal {R}}\) may be viewed as a measuring apparatus, as is the case in the WAY theorem, which we initially present in its conventional form, and subsequently reinterpret in a relative vein.

9.2.2 Weak Conservation: Wigner–Araki–Yanase Theorem

In this instance a conservation law is applied to the system-apparatus combination, but is not assumed to hold ‘locally’, i.e., for the system under investigation and apparatus separately. We present the traditional reading of the WAY theorem.

Theorem 4

(Wigner–Araki–Yanase) Let \(\mathcal {M} := \left\langle \mathcal {H}_A, U, \phi , Z, f \right\rangle \) be a measurement of a discrete-spectrum self-adjoint operator A on \(\mathcal {H_S}\), and let \(L_{\mathcal {S}}\) and \(L_{\mathcal {A}}\) be bounded self-adjoint operators on \(\mathcal {H_S}\) and \(\mathcal {H_A}\), respectively, such that \([U, L_{\mathcal {S}} +L_{\mathcal {A}}] = 0\). Assume that \(\mathcal {M}\) is repeatable or \([Z,L_{\mathcal {A}}]=0\). Then \([A,L_{\mathcal {S}}]=0\).

We refer to [41] for a proof. Following Ozawa [43], we refer to the condition that the pointer observable Z commutes with the apparatus part \(L_{\mathcal {A}}\) of the conserved quantity as the Yanase condition [44]. In the case that \([A,L_{\mathcal {S}}]\ne 0\), there is a positive counterpart to the impossibility result: approximate measurements of A, with approximate repeatability properties, are feasible, and the approximation improves as the variance \(\left( \varDelta _{\phi }L_{\mathcal {A}}\right) ^2\) grows (see [41], where more general measures of spread are also considered); indeed, such a large “spread” is necessary for good measurements of A.

In its usual reading, the WAY theorem therefore does not prohibit accurate measurements of unsharp observables which do not commute with \(L_{\mathcal {S}}\), leaving room for a positive rephrasing of the theorem in which a smeared, approximate version of A can be measured accurately. We now address this point, arguing that, just as in the discussion following the strong version of the theorem, one should actually conclude that the measured observable in the WAY theorem must be understood as a representative of a relative observable of system and apparatus combined.

The standard interpretation of the WAY theorem states that any sharp A not commuting with (the object part \(L_1 \equiv L_{\mathcal {S}}\) of) an additive conserved quantity \(L = L_1 + L_2\) for which the Yanase condition is satisfied (\([Z,L_2]=0\), with \(L_2 \equiv L_{\mathcal {A}}\) the apparatus part) cannot be measured precisely. Moreover, good approximation can occur if there is large uncertainty with respect to the apparatus part \(L_2\) of the conserved quantity in the initial state \(\phi \) of the apparatus, i.e., if \(\left( \varDelta _{\phi }L_2 \right) ^2\) is large.

In light of the theme of this paper, namely understanding the consequences of the principle that observables are invariant, we may reconsider the message of the WAY theorem. We recall that, for fixed \(\phi \in \mathcal {H}_{\mathcal {R}}\), the equation

$$\begin{aligned} \left\langle \,\varphi \,{|}\,\mathsf {E}(X)\varphi \,\right\rangle =\left\langle \,\varphi \otimes \phi \,{|}\,U^* \mathbb {1} \otimes \mathsf {E}^Z (X) U \varphi \otimes \phi \,\right\rangle , \end{aligned}$$
(71)

when stipulated to hold for all \(X, \varphi \) determines the pom \(\mathsf {E}\). In other words, \(\mathsf {E}(X) = \varGamma _{\phi }(U^*\mathbb {1} \otimes \mathsf {E}^Z(X)U)\). Given the Yanase condition (\([Z,L_2]=0\)) and the conservation law (\([U,L]=0\)), it follows that \([U^*ZU,L]=0\). Writing \(U^*ZU \equiv Z(\tau )\), it is therefore evident that \(Z(\tau )\) is invariant under the symmetry generated by L and that, furthermore, in the limit that \(\left( \varDelta _{\phi }L_2 \right) ^2\) becomes large, A (which is not necessarily observable) can become a good approximation of the observable \(Z(\tau )\).
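The invariance of \(Z(\tau )\) follows in two lines. As a sketch, write Z for \(\mathbb {1}\otimes Z\) and \(L = L_1\otimes \mathbb {1} + \mathbb {1}\otimes L_2\), and use \(ULU^* = L\), which is the conservation law:

$$\begin{aligned} {[}Z, L]&= [\mathbb {1}\otimes Z, L_1\otimes \mathbb {1}] + \mathbb {1}\otimes [Z, L_2] = 0, \\ {[}U^*ZU, L]&= U^*[Z, ULU^*]U = U^*[Z, L]U = 0. \end{aligned}$$

The first line uses the Yanase condition and the second uses the conservation law.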

If \(L_2\) is the shift-generator in a conjugate quantity (e.g., a number operator generating phase shifts), then large \(L_2\) spread in \(\phi \) corresponds to high localisation of \(\phi \) with respect to the conjugate quantity, completely in line with the view that for A to be a good representative of an invariant observable, the reference system must be highly localised with respect to a phase-like quantity, à la \(\yen \). This also sheds light on the reason that \(L_2\) must have large spread in the initial state \(\phi \) of the apparatus. This view of the ordinary WAY theorem then arises when the strong WAY theorem is applied to system-plus-apparatus together, viewed as an isolated system.

Example 7

Ozawa model of an unsharp position measurement: relative versus absolute position.

The relational view just discussed may be exemplified in a position measurement model of Ozawa, introduced in [45] and analysed further in [46, 47]. We consider the momentum-conserving position measurement scheme in which \(\mathcal {S}+\mathcal {R}\) interacts with two apparatus systems \(\mathcal {A}+\mathcal {B}\). This scheme measures the absolute position Q with the pointer observable \(P_\mathcal {B}-P_\mathcal {A}\), a relativised momentum. Contrary to the claim in [45], a WAY-type limitation is exhibited for this model. However, we show that the same scheme may be used to measure the relative position observable, \(Q \otimes \mathbb {1} - \mathbb {1} \otimes Q_{\mathcal {R}}\equiv Q-Q_\mathcal {R}\); in this case there is no localisation requirement at all for good measurements, as would be expected since \(Q-Q_{\mathcal {R}}\) is already shift-invariant. Moreover, we demonstrate that the absolute position Q well represents \(Q-Q_{\mathcal {R}}\) precisely when the state of \(\mathcal {R}\) is highly localised with respect to \(Q_{\mathcal {R}}\), corresponding to a large momentum spread in the reference system \(\mathcal {R}\) (cf. Example 6).

The unitary measurement coupling is given by \(U=e^{i \frac{\lambda }{2}(Q-Q_{\mathcal {R}})(Q_\mathcal {A}- Q_\mathcal {B})}\), which commutes with the total momentum \(P +P_{\mathcal {R}}+P_{\mathcal {A}}+P_{\mathcal {B}}\) (notice also that \(P +P_{\mathcal {R}}\) is separately conserved and therefore falls under the remit of Proposition 12). Subsequently the pointer Z, given by the difference of momentum operators \(P_\mathcal {B}-P_\mathcal {A}\), is measured. We consider the initial state \(\varPsi _0(x,y,u,v)=\varphi (x)\phi (y)\xi _a(u)\xi _b(v)\), where x, y, u and v are spectral values of \(Q, Q_{\mathcal {R}},P_{\mathcal {B}} - P_{\mathcal {A}}, P_{\mathcal {A}}+P_{\mathcal {B}}\), respectively. The unique measured pom \(\widetilde{\mathsf {E}}: \mathcal {B} (\mathbb {R}) \rightarrow \mathcal {L}\bigl (\mathcal {H}\otimes \mathcal {H}_{\mathcal {R}}\bigr ) \equiv \mathcal {L}(L^{2} (\mathbb {R}^2)) \) is extracted from the condition

$$\begin{aligned} \left\langle \varPsi _{\tau }|\mathbb {1}\otimes \mathbb {1}\otimes \mathsf {E}^{Z}(f^{-1}(X))\otimes \mathbb {1}\varPsi _{\tau }\right\rangle =\left\langle \varphi \otimes \phi | \widetilde{\mathsf {E}}(X)\varphi \otimes \phi \right\rangle , \end{aligned}$$
(72)

required to hold for all \(\varphi ,~\phi \). It then follows that

$$\begin{aligned} \widetilde{\mathsf {E}}(X)=\chi _{X}*\widetilde{e}^{(\lambda )}(Q-Q_{\mathcal {R}}), \end{aligned}$$
(73)

where the right-hand side is the convolution of the set indicator function \(\chi _X\) with the probability distribution \(\widetilde{e}^{(\lambda )}(x)=\bigl \vert \xi _a^{(\lambda )}(x)\bigr \vert ^{2}\) with \(\xi _a ^{(\lambda )}(s) = \sqrt{\lambda }\xi _a(\lambda s)\).

We see that \(\widetilde{\mathsf {E}}\) is a smeared version of \(\mathsf {E}^{Q-Q_{\mathcal {R}}}\). As such the former can be considered an approximation of the latter, and we may quantify the inaccuracy or error of that approximation by the variance of the distribution function \(\widetilde{e}^{(\lambda )}\) (other measures such as overall width can also be used: see [47]). The variance of \(\widetilde{e}^{(\lambda )}\) is \({\text {Var}}(\widetilde{e}^{(\lambda )})=\frac{4}{\lambda ^{2}}{\text {Var}}\left| \xi _a\right| ^{2}\). Therefore by tuning \(\lambda \) to be large, arbitrarily accurate measurements of \(Q-Q_{\mathcal {R}}\) can be achieved with no localisation requirement on the reference system \(\mathcal {R}\).

The absolute position Q then acts as an approximation of the observable \(Q-Q_{\mathcal {R}}\), the approximation becoming good with good \(Q_{\mathcal {R}}\) localisation. By fixing \(\phi \) in the initial state \(\varPsi _0\), the measurement scheme can be viewed as “measuring” a pom \(\mathsf {E}\) for \(\mathcal {S}\):

$$\begin{aligned} \left\langle \,\varphi \otimes \phi \,{|}\,\widetilde{\mathsf {E}}(X)\varphi \otimes \phi \,\right\rangle =: \left\langle \,\varphi \,{|}\,\mathsf {E}(X) \varphi \,\right\rangle . \end{aligned}$$

This is of the form \(\mathsf {E}(X) = \chi _X * e^{(\lambda )}(Q)\) with \(e^{(\lambda )}\) given by \(e^{(\lambda )}(x) = \bigl |{\phi }\bigr |^2*\bigl |{\xi _a ^{(\lambda )}}\bigr |^2(x)\). The probability distribution for the relative position has thereby been re-expressed in terms of a smeared distribution for the absolute position by considering a fixed reference state \(\phi \). The approximation error of \(\mathsf {E}\) relative to \(\mathsf {E}^Q\) is given by \(\text {Var}(e^{(\lambda )}) = \text {Var}\left| \phi \right| ^2 + \frac{4}{\lambda ^2} \text {Var}\left| \xi _a \right| ^2\). The probability distributions corresponding to the relative coordinate in the states \(\varphi \otimes \phi \) become indistinguishable from those of the absolute coordinate Q in the limit that the localisation of the state \(\phi \) with respect to \(Q_{\mathcal {R}}\) is arbitrarily good (provided also that \(\lambda \) is tuned to be large).
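The two contributions to the approximation error can be tabulated with a short sketch. It simply evaluates the error formula quoted above; the function name `approximation_error` and the numerical values are hypothetical and chosen only for illustration.

```python
def approximation_error(var_phi, var_xi_a, lam):
    """Var(e^(lambda)) = Var|phi|^2 + (4 / lambda^2) Var|xi_a|^2, as quoted in the text."""
    return var_phi + 4.0 * var_xi_a / lam ** 2

# Hypothetical spreads: a poorly and a well localised reference state phi.
var_xi_a = 1.0
for var_phi in (1.0, 0.01):
    errors = [approximation_error(var_phi, var_xi_a, lam) for lam in (1.0, 10.0, 100.0)]
    print(var_phi, errors)
```

Increasing the coupling suppresses only the probe contribution; the remaining error is bounded below by the spread of the reference state, in line with the statement that both good reference localisation and a large coupling are needed.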

This model, therefore, highlights how relative observables such as \(Q-Q_{\mathcal {R}}\) may be measured whilst preserving overall symmetry imposed by the conservation of total momentum, with a pointer observable that also respects symmetry. In this case, the apparatus is not required to function as a reference system, which is internal to the measurement device and whose localisation controls the quality of the approximation by the absolute quantity.

We have therefore seen that the picture of observables as relative quantities may be well maintained in the presence of dynamics. It was shown that the WAY theorem has a relative interpretation, and the model of Ozawa provided a measurement scheme for the relative position observable, which could be re-expressed as an accurate measurement of an absolute position precisely when the reference system was well position-localised. Absolute quantities were seen to be good representatives of observables again in the high reference localisation limit, the interpretation therefore not differing from the “static” case. We now consider the impact of this enquiry on the status of superpositions, interference and superselection rules.

10 Interference Phenomena

We begin this section with a typical analysis of interference phenomena. Absolute coherence is the usual requirement for interference effects to manifest. We show that, from our relational perspective, mutual coherence replaces absolute coherence in regard to interference. We provide several models which serve to illustrate the problem of observing coherence, and then turn to the role played by high phase localisation of the reference.

The stipulation that observables are phase-shift invariant implies that relative phase factors (see Footnote 4) between states in a superposition of number eigenstates (with differing eigenvalue) ought to be unobservable. We have seen, for example, that the state \(P[\varphi ]\) of \(\mathcal {S}\) with \(\varphi = \bigl ( |0\rangle + e^{i \theta } |1\rangle \bigr )/{\sqrt{2}}\) cannot be distinguished from \(\tau _{\mathcal {S}_*}(P[\varphi ]) = \frac{1}{2}(|0\rangle \langle 0| + |1\rangle \langle 1|)\) (which has no dependence on \(\theta \)) by any quantity commuting with \(N_{\mathcal {S}}\).

The usual reading of \(\theta \)-dependent expectation values (appearing in states akin to that discussed above) arising in measurements is that a coherent (in our language, absolutely coherent) superposition has been prepared and measured (to be coherent). Since this is equivalent to measuring an absolute quantity, this conclusion warrants further scrutiny. There is a history of debate and controversy surrounding the meaning of \(\theta \)-dependent expectation values in superpositions of number states (also understood as charge eigenstates) in the subject of superselection rules [8,9,10], and of photon number states in the so-called optical coherence controversy [12, 48], regarding the reality of coherent states in describing the output field of a laser.

In the forthcoming subsections we motivate the question of relative phase-factor sensitivity more formally and in a dynamical context, by first discussing a generic interference experiment, followed by three model considerations. We finish with a discussion of the role of high phase localisation and the accompanying interpretation of the measurement statistics. We then use our findings to analyse points of agreement and points of friction between our interpretation and those appearing in the literature.

10.1 Interferometry

Ramsey interferometry exemplifies the typical form of interference experiments (see, e.g., [37]). Here, an atom enters a cavity in its ground state \(|g\rangle \), interacts with the cavity, and exits in a superposition of ground and excited states. At the level of the atom the following sequence (or similar) of (unitary) state evolutions is often given:

$$\begin{aligned} \psi _i \equiv |g\rangle&\rightarrow \frac{1}{\sqrt{2}}(|g\rangle - i |e\rangle ) \rightarrow \frac{1}{\sqrt{2}}(|g\rangle - i e^{-i \theta } |e\rangle )\end{aligned}$$
(74)
$$\begin{aligned}&\rightarrow \sin \left( \frac{\theta }{2} \right) |g\rangle - \cos \left( \frac{\theta }{2} \right) |e\rangle \equiv \psi _f, \end{aligned}$$
(75)

where \(|e\rangle \) represents an excited state of the atom. If the observable \(P_g \equiv |g\rangle \langle g|\) is measured in the final state, we see \(\left\langle \,\psi _f\,{|}\,P_g \psi _f\,\right\rangle = \sin ^2 \left( {\theta }/{2} \right) \). The orthodox reading of such a measurement is that this \(\theta \)-dependent probability distribution for the observable \(P_g\) in the state \(\psi _f\) validates the coherence of the superposition state \(\frac{1}{\sqrt{2}}(|g\rangle - i e^{-i \theta } |e\rangle )\).
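The sequence (74)-(75) can be reproduced numerically under one assumption that the text does not spell out: that both cavity interactions act as the same pulse \(R = \frac{1}{\sqrt{2}}\bigl (\begin{smallmatrix} 1 &amp; -i \\ -i &amp; 1 \end{smallmatrix}\bigr )\) in the ordered basis \((|g\rangle , |e\rangle )\). With this (hypothetical) choice the final state agrees with \(\psi _f\) up to an overall phase, and the sketch below checks the resulting ground-state probability.

```python
import cmath
import math

def half_pulse(state):
    """Assumed pi/2 pulse R = (1/sqrt 2) [[1, -i], [-i, 1]] in the (g, e) basis."""
    g, e = state
    s = 1 / math.sqrt(2)
    return (s * (g - 1j * e), s * (-1j * g + e))

def ramsey_ground_probability(theta):
    """Probability of P_g after pulse, phase accumulation, pulse."""
    g, e = half_pulse((1.0, 0.0))      # |g> -> (|g> - i|e>)/sqrt 2
    e = e * cmath.exp(-1j * theta)     # relative phase e^{-i theta} on |e>
    g, e = half_pulse((g, e))          # second pulse (assumed identical to the first)
    return abs(g) ** 2

for theta in (0.0, 1.0, math.pi):
    print(theta, ramsey_ground_probability(theta), math.sin(theta / 2) ** 2)
```

Note that this reduced description acts on the atom alone, which is exactly the feature questioned in the paragraph that follows.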

However, the Hamiltonian generating such an evolution certainly does not commute with \(P_g,~P_e \) (i.e., \(N_{\mathcal {S}}\)) and is, itself, therefore not phase-shift invariant (and thus not (an) observable). Equations (74) and (75) must, if applicable at all, therefore be viewed as approximate, reduced descriptions of the true, energy-conserving dynamics of system-plus-cavity.

We will obtain a consistent description of measurements which at first sight appear sensitive to relative phase factors between number superpositions. Keeping in mind that observables are invariant and that states are class representatives, we may obtain statistics which look as if absolute quantities have been measured or, alternatively (and equivalently), that relative phase factors across number eigenspaces have been observed. Again, this is a reduced, approximate description and not a true representation of the state of affairs. The models to be presented have strong formal similarities to the case of observability of phase factors between states of different charge and different baryon number, allowing for comparison to the issue of whether superselection rules may be obviated in practice (cf. [7, 10, 37]). We show that all such attempts may be phrased purely in terms of measurements of relative quantities (i.e., observables), highlighting the fact that absolute quantities are never measured.

10.2 Model 1: Two-Level System

We first consider a model in Hilbert space dimension 4 to show how to dynamically introduce a relative phase factor between number states (of the same total number eigenvalue), whilst respecting symmetry. The restriction to low dimensions highlights the relational nature of the relative phase factor. The generic structure of this model can then be applied to the scenario where the reference system’s Hilbert space has infinite dimension, which resembles the situation for which there have been claims purporting to “lift” [7] or evade superselection rules. However, we argue that there is no reason for the interpretation of measurement statistics in the infinite dimensional setting to be different from the model discussed below, except for the observation that with infinite dimensional reference systems, expectation values of absolute quantities (can be made to) agree arbitrarily well with those of the relative ones (contingent on a choice of reference state).

Let \(N_{\mathcal {S}} \in {\mathcal {L}(\mathcal {H}_{\mathcal {S}})} \equiv \mathcal {L}(\mathbb {C}^2)\) be a number operator so that \(N_{\mathcal {S}}|0\rangle =0,~N_{\mathcal {S}} |1\rangle =|1\rangle \), and let \(N_{\mathcal {R}} \in \mathcal {L}(\mathcal {H}_{\mathcal {R}})\) have the same definition. Any self-adjoint operator \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) must commute with \(N :=N_{\mathcal {S}} \otimes \mathbb {1}+ \mathbb {1}\otimes N_{\mathcal {R}}\) if it is to be deemed observable.

We introduce two unitary operators \(U_1\) and \(U_2\) which represent two stages of time evolution, defined as

$$\begin{aligned} |0\rangle |0\rangle&\overset{U_1}{\longrightarrow } |0\rangle |0\rangle \overset{U_2}{\longrightarrow } |0\rangle |0\rangle ;\\ |0\rangle |1\rangle&\overset{U_1}{\longrightarrow }\frac{e^{-i \frac{\theta }{2}}}{\sqrt{2}} \left( |0\rangle |1\rangle + e^{i \theta } |1\rangle |0\rangle \right) \\&\overset{U_2}{\longrightarrow }\left( \cos \left( \frac{\theta }{2} \right) |0\rangle |1\rangle -i \sin \left( \frac{\theta }{2} \right) |1\rangle |0\rangle \right) ;\\ |1\rangle |0\rangle&\overset{U_1}{\longrightarrow }\frac{e^{-i \frac{\theta }{2}}}{\sqrt{2}} \left( |0\rangle |1\rangle - e^{i \theta }|1\rangle |0\rangle \right) \\&\overset{U_2}{\longrightarrow }\left( - i \sin \left( \frac{\theta }{2} \right) |0\rangle |1\rangle + \cos \left( \frac{\theta }{2} \right) |1\rangle |0\rangle \right) ;\\ |1\rangle |1\rangle&\overset{U_1}{\longrightarrow }|1\rangle |1\rangle \overset{U_2}{\longrightarrow }|1\rangle |1\rangle ; \end{aligned}$$

and it can be seen that \([U_1,N]=[U_2,N]=0\). Furthermore, it is important to note that \(U_2\) does not depend on \(\theta \), which can be seen by the action of \(U_2\) on the initial product states given by \(U_2 |0\rangle |1\rangle = \frac{1}{\sqrt{2}} \bigl (|0\rangle |1\rangle + |1\rangle |0\rangle \bigr )\) and \(U_2|1\rangle |0\rangle = \frac{1}{\sqrt{2}} (|0\rangle |1\rangle - |1\rangle |0\rangle )\). The purpose of applying \(U_2\) is to allow a measurement of an invariant quantity of \(\mathcal {S}\), which gives rise to \(\theta \)-sensitive measurement statistics.
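The stated properties of \(U_1\) and \(U_2\) can be checked directly with a minimal sketch. The matrices below are read off from the displayed action on the product basis, ordered as \(|00\rangle , |01\rangle , |10\rangle , |11\rangle \) (system label first); the helper names `matmul`, `close`, `u1`, `u2` and `prob_system_in_0` are ours.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Total number operator N = N_S (x) 1 + 1 (x) N_R in the basis |00>, |01>, |10>, |11>.
N = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 2]]

def u1(theta):
    p = cmath.exp(-0.5j * theta) / math.sqrt(2)
    e = cmath.exp(1j * theta)
    return [[1, 0, 0, 0], [0, p, p, 0], [0, p * e, -p * e, 0], [0, 0, 0, 1]]

def u2():
    s = 1 / math.sqrt(2)   # no theta appears here
    return [[1, 0, 0, 0], [0, s, s, 0], [0, s, -s, 0], [0, 0, 0, 1]]

def prob_system_in_0(theta):
    """Probability of P_0 (x) 1 in the state U_2 U_1 |0>|1>."""
    u = matmul(u2(), u1(theta))
    column = [u[i][1] for i in range(4)]   # image of |01>
    return abs(column[0]) ** 2 + abs(column[1]) ** 2

theta = 0.8
for u in (u1(theta), u2()):
    assert close(matmul(u, N), matmul(N, u))   # [U, N] = 0
print(prob_system_in_0(theta), math.cos(theta / 2) ** 2)
```

Both unitaries act non-trivially only inside the \(N=1\) eigenspace, which is why the commutators vanish for every value of \(\theta \).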

In other words, \(U_1\) introduces the factor \(\theta \), \(U_2\) redistributes the \(\theta \)-dependence, so that the measurement of an invariant \(\mathcal {S}\)-quantity depends on \(\theta \), which then validates the superposition present after \(U_1\).

Writing \(P_0 :=|0\rangle \langle 0|\), \(\psi = |0\rangle |1\rangle \), and noting that \(\tau {_{\mathcal {S}}} (P_0) =P_0\), we compute post-\(U_2\) statistics:

$$\begin{aligned} \mathrm{tr} \bigl [ P_0 \otimes \mathbb {1} \tau _{\mathcal {T}*}( P_{U_2U_1 \psi }) \bigr ] =\mathrm{tr} \bigl [ P_0 \mathrm{tr} _{\mathcal {K}} P_{U_2U_1\psi } \bigr ] . \end{aligned}$$
(76)

This yields the probability \(p_{U_2U_1\psi }^{P_0}(0) = \cos ^2 \left( \frac{\theta }{2} \right) \), which depends explicitly on the phase \(\theta \). Applying \(\tau {_{\mathcal {T}_*}}\) at every stage does not alter the probabilities; we have, for example

$$\begin{aligned} \tau {_{\mathcal {T}_*}} (P_{\psi })&\rightarrow U_1 \tau {_{\mathcal {T}_*}} (P_{\psi })U_1 ^* = \tau {_{\mathcal {T}_*}} (U_1 P_{\psi } U_1 ^*)\\ \nonumber&\rightarrow U_2\bigl (\tau {_{\mathcal {T}_*}} (U_1 P_{\psi } U_1 ^*)\bigr ) U_2 ^* = \tau {_{\mathcal {T}_*}} (U_2 U_1 P_{\psi } U_1 ^* U_2 ^*) = \tau {_{\mathcal {T}_*}} (P_{U_2U_1 \psi }). \end{aligned}$$
(77)

Then \(\mathrm{tr} \bigl [P_0 \otimes \mathbb {1}\tau _{\mathcal {T}_*}(P_{U_2U_1 \psi })\bigr ]\) coincides with the expression in (76). The unitary maps \(U_1\) followed by \(U_2\) mimic what might occur in a realistic interference experiment in which the reference system is confined to a low dimensional Hilbert space. The interference fringes dictated by \(\theta \) may be observed through the measurement of an invariant system-apparatus quantity. This does not require absolute coherence, i.e., does not imply the coherence of superpositions across N-eigenspaces. It does, however, require mutual coherence.

Considering the states arising after application of \(U_1\), it is immediately clear that the reduced states \(\mathrm{tr}_{\mathcal {H}_R} [P_{U_1 |i\rangle |j\rangle }] \) and \(\mathrm{tr}_{\mathcal {H}_{\mathcal {S}}}{P_{U_1|i\rangle |j\rangle }}\) have no dependence on \(\theta \), indicating that \(\theta \) relates to both \(\mathcal {S}\) and \(\mathcal {R}\). Since the post-\(U_1\) states are entangled, the only means by which we may identify a system and a reference is via the partial trace.
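The \(\theta \)-independence of the reduced states can be confirmed with a small partial-trace computation. This is a sketch for the two non-trivial post-\(U_1\) states only; the function names and the amplitude ordering \(|00\rangle , |01\rangle , |10\rangle , |11\rangle \) are our conventions.

```python
import cmath
import math

def post_u1_state(theta, sign=+1):
    """Amplitudes of U_1|0>|1> (sign=+1) or U_1|1>|0> (sign=-1)."""
    p = cmath.exp(-0.5j * theta) / math.sqrt(2)
    return [0, p, sign * p * cmath.exp(1j * theta), 0]

def reduced_state(psi, keep):
    """Partial trace of |psi><psi|, keeping the system ('S') or the reference ('R')."""
    def amp(s, r):
        return psi[2 * s + r]
    rho = [[0j, 0j], [0j, 0j]]
    for a in range(2):
        for b in range(2):
            for t in range(2):
                if keep == 'S':
                    rho[a][b] += amp(a, t) * amp(b, t).conjugate()
                else:
                    rho[a][b] += amp(t, a) * amp(t, b).conjugate()
    return rho

for theta in (0.0, 0.9, 2.0):
    print(reduced_state(post_u1_state(theta), 'S'), reduced_state(post_u1_state(theta), 'R'))
```

For every \(\theta \) both reduced states come out as \(\frac{1}{2}\mathbb {1}\), so the phase is carried by neither subsystem alone.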

We may define a restriction map \(\varGamma _{\rho _{\mathcal {R}}}\) with \(\rho _{\mathcal {R}}=\mathrm{tr}_{\mathcal {H}_{\mathcal {S}}}{P_{U_1|i\rangle |j\rangle }}\), which, since \(\rho _{\mathcal {R}}\) is invariant, yields only invariant restricted quantities for \(\mathcal {S}\) (if assumed invariant for \(\mathcal {S}+ \mathcal {R}\)). The conclusion, then, is that \(\theta \)-dependent expectation values do not correspond to the observation of states with absolute coherence.

We now analyse a variety of infinite dimensional examples, and argue in Sect. 10.6 that we must draw the same conclusion: relative phase factor sensitive measurement statistics can be achieved by measuring observables (and only observables), i.e., the relevant relative phase factors occur within an N-eigenspace. Only in the regime of high reference phase localisation do these appear as though they pertain to the system alone.

10.3 Model 2: Angular Momentum and Angle

We now adapt the previous model, replacing the space \(\mathbb {C}^2\) of the reference system with an infinite dimensional space, and construct a new unitary mapping (still calling it \(U_1\) and restricting to the subspace spanned by \(\{|0\rangle ,|1\rangle \}\) for the first system):

$$\begin{aligned} |0\rangle |n\rangle&\overset{U_1}{\longrightarrow }e^{-i \frac{\theta }{2}}\frac{1}{\sqrt{2}} \left( |0\rangle |n\rangle + e^{i \theta }|1\rangle |n-1\rangle \right) ,\end{aligned}$$
(78)
$$\begin{aligned} |1\rangle |n-1\rangle&\overset{U_1}{\longrightarrow }e^{-i \frac{\theta }{2}}\frac{1}{\sqrt{2}} \left( |0\rangle |n\rangle - e^{i \theta }|1\rangle |n-1\rangle \right) . \end{aligned}$$
(79)

Here the basis vectors are the eigenvectors of \(N_i = \sum _{n=- \infty } ^{\infty }n ^{(i)} P_n ^{(i)}\). We observe that the partial trace over system or reference yields reduced states which do not depend on \(\theta \). Linearity and continuity entail

$$\begin{aligned}&U_1: \varPsi _0 \equiv |0\rangle |\xi \rangle \equiv |0\rangle \sum _{n=-\infty }^\infty c_n |n\rangle \longrightarrow e^{-i \frac{\theta }{2}}\frac{1}{\sqrt{2}}\nonumber \\&\quad \sum _{n=-\infty }^\infty c_n \left( |0\rangle |n\rangle + e^{i \theta } |1\rangle |n-1\rangle \right) \equiv \varPsi _f. \end{aligned}$$
(80)

The initial state \(\varPsi _0\) under \(\tau {_{\mathcal {T}_*}}\) takes the form (sums taken for n running from \(- \infty \) to \(\infty \))

$$\begin{aligned} \tau {_{\mathcal {T}_*}} (P_{\varPsi _0}) = \sum _n P_n P_{\varPsi _0} P_n = |0\rangle \langle 0| \sum _n \left| c_n \right| ^2|n\rangle \langle n|, \end{aligned}$$

where the \(P_n\) are the infinite-rank projectors onto the eigenspaces of \(N = N_1 + N_2\) given as

$$\begin{aligned} P_n = \sum _{l+m=n}P^{(1)}_l \otimes P^{(2)}_m = \sum _{l} P^{(1)}_l \otimes P^{(2)}_{n-l}. \end{aligned}$$
(81)

We consider what observation may reveal about \(\theta \) in the state \(\tau _{\mathcal {T}*} (P_{\varPsi _f})\) for \(\varPsi _f\) as given in (80). We have

$$\begin{aligned} \tau {_{\mathcal {T}_*}} (P_{\varPsi _f})&=\sum _n \left| c_n \right| ^2 \frac{1}{2}\Bigl \{ |0,n\rangle \langle 0,n| + |1,n-1\rangle \langle 1,n-1| \Bigr . \nonumber \\&\qquad \qquad +\Bigl . |0,n\rangle \langle 1,n-1| e^{-i \theta } + |1,n-1\rangle \langle 0,n| e^{i \theta } \Bigr \} \nonumber \\&=\sum _n\left| c_n \right| ^2\,P_{\frac{1}{\sqrt{2}}\left( |0,n\rangle +e^{i\theta }|1,n-1\rangle \right) }. \end{aligned}$$
(82)

Here it is manifest that \(\tau {_{\mathcal {T}_*}} (P_{\varPsi _f})\) is a mixture of states of different N-eigenvalues, and within each eigenspace labelled by n there is a relative phase factor between the states of the same N-eigenvalue. There exists an invariant quantity of \(\mathcal {S}+ \mathcal {R}\) which is sensitive to \(\theta \) in the state \(\tau _{\mathcal {T}*} (P_{\varPsi _f})\). For example, we may choose \(A = |0,n\rangle \langle 1,n-1| + \mathrm{h.c.}\) and invoke relation (11) (replacing the spectral projections with the self-adjoint operators they define).
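The \(\theta \)-sensitivity of this invariant quantity can be made explicit by working inside a single \(N\)-eigenspace. The sketch below evaluates the expectation of \(A = |0,n\rangle \langle 1,n-1| + \mathrm{h.c.}\) in the mixture (82); since \(A\) is supported on the span of \(|0,n\rangle \) and \(|1,n-1\rangle \), only the component with weight \(|c_n|^2\) contributes. The function name `expectation_in_block` is ours.

```python
import cmath
import math

def expectation_in_block(theta, c_n):
    """tr[A tau(P_Psi_f)] for A = |0,n><1,n-1| + h.c., using the 2x2 block (|0,n>, |1,n-1>)."""
    s = 1 / math.sqrt(2)
    v = (s, s * cmath.exp(1j * theta))      # (|0,n> + e^{i theta}|1,n-1>)/sqrt 2
    a_v = (v[1], v[0])                      # A swaps the two basis vectors of the block
    inner = v[0].conjugate() * a_v[0] + v[1].conjugate() * a_v[1]
    return abs(c_n) ** 2 * inner.real

for theta in (0.0, math.pi / 3, math.pi):
    print(theta, expectation_in_block(theta, 0.5))
```

The result is \(|c_n|^2 \cos \theta \): the phase is visible to an invariant quantity of \(\mathcal {S}+ \mathcal {R}\), although it sits entirely within one eigenspace of \(N\).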

We may extend the analysis and, in the spirit of the finite dimensional example, introduce a second unitary \(U_2\) (which is independent of \(\theta \)), which with \(U\equiv U_2 U_1\) yields on the number basis states

$$\begin{aligned} |0\rangle |n\rangle&\overset{U}{\longrightarrow }\cos \left( \frac{\theta }{2}\right) |0\rangle |n\rangle - i \sin \left( \frac{\theta }{2} \right) |1\rangle |n-1\rangle , \end{aligned}$$
(83)
$$\begin{aligned} |1\rangle |n-1\rangle&\overset{U}{\longrightarrow }-i \sin \left( \frac{\theta }{2} \right) |0\rangle |n\rangle + \cos \left( \frac{\theta }{2} \right) |1\rangle |n-1\rangle . \end{aligned}$$
(84)

In analogy to the \(2\times 2\) case, we see that for example \(\mathrm{tr} \bigl [|0\rangle \langle 0| \mathrm{tr} _{\mathcal {K}}[P_{U|0\rangle |n\rangle }] \bigr ] = \cos ^2 \left( \frac{\theta }{2} \right) \), and again, since we have measured an observable (i.e., \(|0\rangle \langle 0| \otimes \mathbb {1}\)), applying \(\tau {_{\mathcal {T}_*}}\) at all stages does not alter the result. The purpose of \(U_2\) is then to bring the final states into a form wherein the measured observable is non-trivial only for \(\mathcal {S}\) and measurement of an invariant quantity for \(\mathcal {S}\) gives \(\theta \)-dependent expectation values. This validates the presence of mutual coherence, and does not indicate the existence of absolute coherence at any stage. As shall be shown, this becomes crucial in the question of whether superselection rules can be effectively overcome through a judicious choice of unitary mappings and measurements.

10.4 Model 3: Number and Phase

We now consider the number-phase case. Here, we have \(N_{\mathcal {S}} \otimes \mathbb {1}\) and \(\mathbb {1} \otimes N_{\mathcal {R}}\) (each with spectrum given by \(\mathbb {N}\cup \{0\}\)) acting on \(\mathcal {H}_S \otimes \mathcal {H}_R\) and \(N=N_{\mathcal {S}} \otimes \mathbb {1}+\mathbb {1}\otimes N_{\mathcal {R}} = \sum _{n=0}^\infty n P_n\) with \(P_n=\sum _{i+j=n}P_i ^{(1)} \otimes P_j ^{(2)}\). A simple N-preserving unitary mapping is given by:

$$\begin{aligned}&U_1: |0\rangle |n\rangle \rightarrow \left\{ \begin{array}{cl} e^{-i \frac{\theta }{2}}\frac{1}{\sqrt{2}} \left( |0\rangle |n\rangle + e^{i \theta }|1\rangle |n-1\rangle \right) &{} n> 0 \\ |0\rangle |0\rangle &{} n=0 \end{array} \right. \nonumber \\&U_1:|1\rangle |n-1\rangle \rightarrow e^{-i \frac{\theta }{2}}\tfrac{1}{\sqrt{2}} ( -e^{-i \theta }|0\rangle |n\rangle + |1\rangle |n-1\rangle )~n>0. \end{aligned}$$
(85)

Following the now familiar approach, we introduce a second unitary map \(U_2\), under which \(U \equiv U_2 U_1\) implements

$$\begin{aligned}&U: |0\rangle |n\rangle \rightarrow \left\{ \begin{array}{cl} \cos \left( \frac{\theta }{2}\right) |0\rangle |n\rangle - i \sin \left( \frac{\theta }{2}\right) |1\rangle |n-1\rangle &{} n > 0 \\ |0\rangle |0\rangle &{} n=0 \end{array} \right. \end{aligned}$$
(86)
$$\begin{aligned}&U:|1\rangle |n-1\rangle \rightarrow -i \sin \left( \tfrac{\theta }{2}\right) |0\rangle |n\rangle + \cos \left( \tfrac{\theta }{2}\right) |1\rangle |n-1\rangle ~n>0. \end{aligned}$$

Then \(\mathrm{tr} \left[ |0\rangle \langle 0| \mathrm{tr} _{\mathcal {K}} P_{U |0\rangle |n\rangle } \right] = \cos ^2 \left( \frac{\theta }{2} \right) \) and once again we have a \(\theta \)-dependent probability distribution for an observable in the state \(U|0\rangle |n\rangle \). Moreover, this does not differ from the distribution in the state \(\tau (P_{U|0\rangle |n\rangle })\). Of course, the \(\theta \)-dependence only corroborates the “reality” of the relative phase factor, within the eigenspace of N with eigenvalue n in the state on the top line of Eq. (85).

10.5 Coherence and Mutual Coherence: Brief Discussion

In each of the models we have discussed, the crucial component for witnessing interference effects, in the form of \(\theta \)-dependent expectation values, is the presence of non-zero mutual coherence for states of \(\mathcal {S}+ \mathcal {R}\) (which is possible even in the absence of absolute coherence for \(\mathcal {S}+ \mathcal {R}\)). Mutual coherence allows for (and is necessary for) the appearance of absolute coherence, even without a limit being taken.

For instance, \(\varphi = \alpha |0\rangle + \beta |1\rangle \) gives, for \(A = |0\rangle \langle 1| + |1\rangle \langle 0|\), the expectation value \(2\, \mathrm{Re} (\alpha \bar{\beta })\). One can define the invariant (entangled) state \(\tilde{\varphi } = \alpha |01\rangle + \beta |10\rangle \) and \(\tilde{A} = |01\rangle \langle 10| + |10\rangle \langle 01|\) (which does not commute with \(N_{\mathcal {T}}\)) so that \(\left\langle \,\tilde{\varphi }\,{|}\,\tilde{A}\tilde{\varphi }\,\right\rangle = \left\langle \,\varphi \,{|}\, A \varphi \,\right\rangle = 2\, \mathrm{Re}(\alpha \bar{\beta })\). This can also be done with an invariant \(\tilde{A^{\prime }} = \tau _{\mathcal {T}}(\tilde{A})\). Thus, in this case, asymmetric statistics of \(\mathcal {S}\) can be given by symmetric ones of \(\mathcal {S}+ \mathcal {R}\) without the need for localisation. However, the physical interpretation is unclear due to the non-separability of \(\varPsi \). The important observation is that mutual coherence of \(\tilde{\varphi }\) is required for the possibility of the appearance of absolute coherence of states of \(\mathcal {S}\).
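The equality of the two expectation values is easily verified numerically. The following sketch uses hypothetical amplitudes \(\alpha , \beta \) and represents the composite vectors by their amplitudes in the order \(|00\rangle , |01\rangle , |10\rangle , |11\rangle \); the function names are ours.

```python
import cmath

def expectation_system(alpha, beta):
    """<phi | A phi> for phi = alpha|0> + beta|1> and A = |0><1| + |1><0|."""
    phi = (alpha, beta)
    a_phi = (beta, alpha)
    return sum(x.conjugate() * y for x, y in zip(phi, a_phi))

def expectation_composite(alpha, beta):
    """<phi~ | A~ phi~> for phi~ = alpha|01> + beta|10> and A~ = |01><10| + |10><01|."""
    phi_t = (0, alpha, beta, 0)
    a_phi_t = (0, beta, alpha, 0)   # A~ swaps |01> and |10> and annihilates |00>, |11>
    return sum(x.conjugate() * y for x, y in zip(phi_t, a_phi_t))

alpha, beta = 0.6, 0.8 * cmath.exp(0.5j)   # hypothetical normalised amplitudes
print(expectation_system(alpha, beta), expectation_composite(alpha, beta))
print(2 * (alpha * beta.conjugate()).real)
```

All three printed values coincide, so the system-level statistics are reproduced by a state of \(\mathcal {S}+ \mathcal {R}\) supported on a single eigenspace of the total number.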

Next we examine the role of high reference phase localisation in the interpretation of the measurement statistics.

10.6 High Phase Localisation

We turn now to the behaviour of model 2 in the regime that the initial state of the reference system is highly phase-localised. Let \(c_n =\frac{ e^{in \theta '}}{\sqrt{2j+1}}\) for \(-j \le n \le j\) and 0 otherwise, and let \(|\theta ' _j\rangle = \sum _{n=-j}^{j} c_n |n\rangle \). This state is approximately localised around the value \(\theta '\), i.e., is an approximate eigenstate of the self-adjoint angle \(\varTheta _\mathcal {R}\) with eigenvalue \(\theta '\), with the quality of approximation becoming increasingly good as j becomes large. Indeed, the sequence \((|\theta '_j\rangle )\) is an approximate eigenstate of \(\varTheta _{\mathcal {R}}\), in the sense that \(\bigl \langle {\theta '_j}\big |{\varTheta _{\mathcal {R}} \theta '_j}\bigr \rangle = \theta '\) and \(\text {Var}(\varTheta _\mathcal {R})_{\theta ' _j} \rightarrow 0\) as \(j \rightarrow \infty \). (The sequence also describes an approximately localised state in terms of concentration of probabilities, as described for the similar sequence \((\phi _n)\) of Example 4.) Using the form of \(U_1\) from Sect. 10.3, we find

$$\begin{aligned} \varPsi _f \equiv U_1 |0\rangle |\theta ' _j\rangle = e^{-i\frac{\theta }{2}}\frac{1}{\sqrt{2}}\left( |0\rangle + e^{i(\theta + \theta ')} |1\rangle \right) |\theta ^{\prime } _j\rangle + |\text {error}\rangle _j \end{aligned}$$
(87)

where the state

$$\begin{aligned} |\text {error}\rangle _j = e^{i\frac{\theta }{2}}e^{i \theta '}\frac{1}{\sqrt{2 (2j+1)}}\left( - e^{ij\theta '}|1\rangle |j\rangle +e^{-i(j+1)\theta '}|1\rangle |-j-1\rangle \right) . \end{aligned}$$
(88)

Clearly \(\bigl \Vert {|\text {error}\rangle _j}\bigr \Vert {^2}=(2j+1)^{-1}\) and therefore \(\bigl \Vert {|\text {error}\rangle _j}\bigr \Vert \rightarrow 0\) as \(j \rightarrow \infty \). As this error term becomes arbitrarily small, \(\varPsi _f\) is arbitrarily norm-close (modulo an overall phase) to the product state \(\bigl (|0\rangle + e^{i(\theta + \theta ')} |1\rangle \bigr )|\theta ' _j\rangle /\sqrt{2}\).
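The norm of the error term can be checked without using its closed form. The sketch below applies \(U_1\) of (78) to \(|0\rangle |\theta '_j\rangle \) term by term, subtracts the product-state part of (87), and sums the squared moduli of what remains. Amplitudes are stored in a dictionary keyed by (system label, reference label); the function name `error_norm_squared` and the argument name `theta_ref` (standing for \(\theta '\)) are ours.

```python
import cmath
import math

def error_norm_squared(j, theta, theta_ref):
    """Squared norm of U_1|0>|theta'_j> minus the product-state term of (87)."""
    c = {n: cmath.exp(1j * n * theta_ref) / math.sqrt(2 * j + 1) for n in range(-j, j + 1)}
    pref = cmath.exp(-0.5j * theta) / math.sqrt(2)

    exact = {}   # U_1 acting as in (78): |0>|n> -> pref (|0>|n> + e^{i theta}|1>|n-1>)
    for n, cn in c.items():
        exact[(0, n)] = exact.get((0, n), 0) + pref * cn
        exact[(1, n - 1)] = exact.get((1, n - 1), 0) + pref * cmath.exp(1j * theta) * cn

    product = {}   # pref (|0> + e^{i(theta + theta')}|1>) |theta'_j>
    for n, cn in c.items():
        product[(0, n)] = pref * cn
        product[(1, n)] = pref * cmath.exp(1j * (theta + theta_ref)) * cn

    keys = set(exact) | set(product)
    return sum(abs(exact.get(k, 0) - product.get(k, 0)) ** 2 for k in keys)

for j in (1, 10, 100):
    print(j, error_norm_squared(j, 0.7, 0.3), 1 / (2 * j + 1))
```

The remainder is small but never zero at finite \(j\), a point that becomes important in the interpretation below.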

Let \(R \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) be invariant and self-adjoint. By continuity of \(R\), \(\lim _{j \rightarrow \infty }\left\| R |\text {error}\rangle _j\right\| = 0\). Suppose we fix \(\theta ^{\prime }=0\) and \(R = \yen (A)\) for some self-adjoint A which does not commute with \(N_{\mathcal {S}}\). Then,

$$\begin{aligned} \lim _{j \rightarrow \infty } \left\langle \,\varPsi _f \,{|}\,\yen (A) \varPsi _f\,\right\rangle = \left\langle \,\varphi \,{|}\, A \varphi \,\right\rangle , \end{aligned}$$
(89)

with \(\varphi :=\bigl (|0\rangle + e^{i\theta } |1\rangle \bigr )/\sqrt{2}\). Therefore, in this model, the expectation of the absolute quantity A in the absolutely coherent state \(\varphi \) approximates arbitrarily well the relational \(\yen (A)\) in the invariant (absolutely incoherent) state \(\tau _{\mathcal {T}*}(P[\varPsi _f])\).

We may also consider the action of \(U_2\), leading to an overall evolution U:

$$\begin{aligned} U |0\rangle |\theta ^{\prime } _j\rangle = \left( \cos \left( \frac{\theta }{2} \right) |0\rangle -e^{i \theta ^{\prime }}i \sin \left( \frac{\theta }{2} \right) |1\rangle \right) |\theta _j ^{\prime }\rangle + |\mathrm{error}\rangle _j, \end{aligned}$$
(90)

with

$$\begin{aligned} \lim _{j \rightarrow \infty } \bigl \Vert |\mathrm{error}\rangle _j \bigr \Vert = \lim _{j \rightarrow \infty }\bigl \Vert \frac{1}{\sqrt{{2}(2j+1)}} \left( e^{i(j+1)\theta ^{\prime }} |1\rangle |j\rangle -e^{ij \theta ^{\prime }} |1\rangle |-j-1\rangle \right) \bigr \Vert =0, \end{aligned}$$
(91)

leading to the evolution up to a term of arbitrarily small norm of

$$\begin{aligned} U |0\rangle |\theta ^{\prime } _j\rangle \approx \left( \cos \left( \frac{\theta }{2} \right) |0\rangle -ie^{i \theta ^\prime } \sin \left( \frac{\theta }{2} \right) |1\rangle \right) |\theta _j ^{\prime }\rangle . \end{aligned}$$
(92)

Therefore, if the error term \(|\mathrm{error}\rangle _j\), which can be made to have arbitrarily small norm by choosing j large enough, is ignored, the state of the system alone is given by the first factor in the tensor product, achieved by partial tracing over \(\mathcal {R}\). For simplicity we set \(\theta ^{\prime } = 0\). Then measurement sensitivity of the observable \(|0\rangle \langle 0|\) to \(\theta \) (which is still present after operating with \(\tau _{\mathcal {T}*}\)) seems to validate the existence and measurability of (the relative phase factor \(\theta \) in) the superposition \(\frac{1}{\sqrt{2}}\left( |0\rangle + e^{i\theta } |1\rangle \right) \), since the latter state is given as the state of the system again by ignoring the error term in Eq. (87). It looks as though coherence across \(N_1\) eigenspaces has been prepared and confirmed. We now critically analyse this conclusion.

10.7 Interpretation

Analysis of the post-\(U_1\) and post-U states in the above high reference localisation regime highlights several key points, variants of which will reappear throughout the rest of this paper under various guises. We first recapitulate:

  1. Any reasonable measure of entanglement capable of capturing this situation would show that the state \(\varPsi _f\) becomes arbitrarily close to an unentangled state for suitably large j.

  2. Continuity (of R) dictates that the statistics of absolute A in absolutely coherent \(\varphi \) can be approximated arbitrarily well, for suitably large j, by \(\yen (A)\) in the state \(\tau _{\mathcal {T}*}(P[\varPsi _f])\). In particular, \(\theta \)-dependent expectation values are present before the limit is taken.

The limit \(j \rightarrow \infty \) itself must be treated with extreme caution—the rigorous existence of such limits must be questioned, and the meaning of physical conclusions drawn from the limit may not be clear. The main dangers of taking the large amplitude limits in the example we have discussed are summarised below.

  1. The limit \(j \rightarrow \infty \) in the state \(|\theta ^{\prime }_j\rangle \) does not yield a normalisable Hilbert space vector.

  2. \(N_2\) (and thus N) is not a bounded/continuous operator and therefore \(\left\| N |\mathrm {error}\rangle _j\right\| \) need not vanish even as \(\left\| |\mathrm {error}\rangle _j\right\| \) does in the large j limit.

  3. If the error term is ignored, the dynamics no longer conserve number (this is due to the unboundedness). This is most acutely observed by noting that in (85), \(\theta \) may take any real value. Choosing \(\theta = \pi \) and ignoring the error term, the evolution takes the form \(U|0\rangle |\theta ^{\prime }\rangle _j = |1\rangle |\theta ^{\prime }\rangle _j\). It appears as though the state \(|1\rangle \) has been manufactured from \(|0\rangle \) with no energy cost.

  4. Ignoring the error term leads to a “reduced” unitary \(U_\mathrm{eff} = U_{\mathcal {S}}\otimes \mathbb {1}\) and it is clear that \([U_\mathrm{eff},N] \ne 0\) and \([U_{\mathcal {S}},N_1] \ne 0\). Therefore, no matter how small \(\left\| |\mathrm{error}\rangle _j\right\| \) may become, in order to properly account for energy/N conservation, it must not be taken to be zero.

  5. The partial trace \(\mathrm{tr}_{\mathcal {R}}[\tau _{\mathcal {T}*}(P[\varPsi _f])]\), for any finite j, yields an invariant/absolutely incoherent state of \(\mathcal {S}\) (by Proposition 5). Only in the limit does absolute coherence for \(\mathcal {S}\) appear.
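Points 3 and 4 can be illustrated with the reduced unitary alone. In the sketch below, \(U_{\mathcal {S}}\) is read off from (92) with \(\theta ' = 0\) for its action on \(|0\rangle \), and its action on \(|1\rangle \) is taken by analogy with (84); this identification, and the function names, are our assumptions.

```python
import math

N_S = [0, 1]   # diagonal of the system number operator

def u_s(theta):
    """Assumed reduced unitary obtained by dropping the error term (theta' = 0)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def commutator_with_number(theta):
    """[U_S, N_S]; entry (i, j) equals U_S[i][j] * (N_S[j] - N_S[i])."""
    u = u_s(theta)
    return [[u[i][j] * (N_S[j] - N_S[i]) for j in range(2)] for i in range(2)]

def number_expectation_after(theta):
    """Expectation of N_S in U_S|0>; it is 0 before the evolution."""
    u = u_s(theta)
    out = (u[0][0], u[1][0])
    return sum(N_S[k] * abs(out[k]) ** 2 for k in range(2))

print(commutator_with_number(math.pi))
print(number_expectation_after(math.pi))
```

For \(\theta = \pi \) the number expectation of the system rises from 0 to 1 with no compensating change in the reference, which is the apparent creation of \(|1\rangle \) from \(|0\rangle \) described in point 3.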

We now discuss the large amplitude limit in more detail.

10.7.1 Meaning of the Limit

In analysing the physical interpretation of the high amplitude limit, we will be guided by two principles, referred to by Landsman [49] as Earman’s principle [50] and Butterfield’s principle [51]. Earman’s principle states that

While idealisations are useful and, perhaps, even essential to progress in physics, a sound principle of interpretation would seem to be that no effect can be counted as a genuine physical effect if it disappears when the idealisations are removed. [50, p. 191].

Butterfield’s principle then addresses the question of idealisations given by infinite limits, and describes in more detail the type of behaviour that must exist prior to a limit being taken:

There is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to the limit, i.e. for finite N. And it is this weaker behaviour which is physically real. [51, p. 1065].

Here, N refers to particle number, but this principle is readily adapted to our situation. Taking this all into account, the following appears to be a consistent interpretation.

The overall (\(\mathcal {S}+ \mathcal {R}\)) dynamics are number-conserving, and the observables which may be measured are invariant under phase shifts generated by N (hence, commute with N). There is a reduced description, applicable to \(\mathcal {S}\) on its own in which, in direct analogy to the discussion of high localisation in the kinematical case, absolute quantities, absolute coherence, and non-N-conserving dynamical maps approximately capture the observed statistics.

This reduced description is suitable in its convenience and usefulness in certain situations, and provides an adequate tool for computing, to arbitrary approximation, empirically verifiable measurement statistics. For instance, the descriptions afforded by A and \(\yen (A)\) may be observed to be arbitrarily close given arbitrarily high reference phase localisation, with the limit then featuring as an idealisation in which A and \(\yen (A)\) (or more correctly, \((\varGamma _{\phi } \circ \yen )(A)\)) may be taken to be equal.

However, in addressing fundamental issues, the use of the idealisation (high localisation limit) betrays the essence of the phenomena under investigation. Guided by Earman’s and Butterfield’s principles, we may therefore discard as being artefacts of the idealisation those phenomena which are present in the limit but which disappear prior to the high localisation limit actually being taken. The attribution to \(\mathcal {S}\) of a state which is a superposition of eigenstates of \(N_{\mathcal {S}}\) with different eigenvalues (and which is physically different from its corresponding mixture) is such an example: taking the partial trace over \(\mathcal {R}\) in (87) (with \(\theta ^{\prime } = 0\)) with the error term included (finite j) yields the state \(\rho _{\mathcal {S}} = \frac{1}{2}(|0\rangle \langle 0|+ |1\rangle \langle 1|)\), whereas ignoring the error term (infinite j) we find the state \(\mathrm{tr}_{\mathcal {R}}\left[ {P_{\varPsi _f}}\right] = P_{\frac{1}{\sqrt{2}} (|0\rangle +e^{i\theta } |1\rangle )}\), i.e., the projection onto the unit vector \(\varphi _{\mathcal {S}}=\frac{1}{\sqrt{2}} (|0\rangle +e^{i\theta } |1\rangle )\).Footnote 5

Another issue is the violation of energy conservation. At any finite j, energy is manifestly conserved, whereas in the limit, with the error term ignored, energy conservation is violated. These two instances must therefore be viewed as effects that are not physically real in the sense of Earman. The physically real effects are the statistics arising from the measurement of invariant quantities. Approximating these statistics in a convenient manner by non-invariant quantities is legitimate, but attributing the measurement statistics to such quantities as observables is not. The measurement statistics containing \(\theta \)-dependent terms, close in approximation to the absolutely coherent superposition, are physically real, but the state description of \(\mathcal {S}\) as absolutely coherent is not.

Working with the idealised limit is legitimate when it comes to computing certain expectation values which may arise in experiments. For example, using A rather than \(\yen (A)\) is unproblematic, provided the reference frame is prepared in a highly localised state. However, given the nature of our enterprise, that is, to understand the fundamental role played by symmetry upon the definability and measurability of quantum mechanical quantities, it is illegitimate to move to an idealisation in which the symmetry is no longer manifest, a fortiori when the symmetry is present at every finite value of j prior to the limit being taken.

In other words, since we are interested in symmetry, we should not have recourse to a theoretical description in which, even though valid insofar as certain calculations are concerned, the symmetry in question is no longer present. Thinking of the description of a ball bouncing against a wall (cf. [7]), there is no problem, as far as the modelling of the ball is concerned, in taking the wall to be of infinite mass. But if one is performing an investigation of the limitations on dynamics imposed by momentum conservation, then taking the large mass limit of the wall—the limit in which momentum conservation is violated—cannot be viewed as fundamentally valid and completely obscures the issue at hand, namely the role played by symmetry and conservation.

We now address controversies surrounding superselection rules and the reality of optical coherence, by critically analysing a number of models in the literature aimed at, in essence, obviating superselection rules. We will observe the use of dynamics and limits very similar to those discussed above, with identical interpretation.

11 Controversies

The final part of this paper addresses a number of controversies which have appeared in the literature over the last 65 years. The first relates to the fundamental status of superselection rules and the role played by reference frames there, and the second, appearing much later but strongly connected to the superselection rule debate, concerns the question of the reality of optical coherence of laser beams.

We critique two opposing standpoints on the meaning and validity of superselection rules. Wick, Wightman and Wigner’s (WWW’s) seminal 1952 paper [8] was met with objection from Aharonov and Susskind (AS) 15 years later [10], which was then obliquely criticised again by WWW. Subsequent efforts have been devoted on the one hand to rigorous work on superselection rules in quantum field theory (see, e.g., [52]), whilst on the other towards more practical questions on the role of superselection in information and communication theoretic tasks (e.g., [7, 53]).

After briefly introducing Wick, Wightman and Wigner’s original argument, we focus on Aharonov and Susskind’s contribution, highlighting points of agreement and disagreement between our perspective and theirs. For instance, the meaning of coherence/superpositions as requiring a relational understanding [10] we view as ground-breaking, and this point of view has inspired much of the work in this paper. However, we do not support their conclusion (e.g., in the abstract of [10]) that “contrary to a widespread belief, interference may be possible between states with different charges”; nor do we agree that this conclusion follows from their argument. The paper suffers from mathematical flaws and a lack of conceptual clarity; what is at stake is nothing more than the appearance of measurability of absolute quantities/coherence in the presence of symmetry, and therefore the explicitly relational framework presented is well-suited to bringing a consistent and clear explanation of the issue of whether superselection rule “compatible” states can be superposed to give a physically different state from its corresponding mixture.

We also critique more recent contributions [7, 37] along similar lines, focussing on the latter. The former [7] suffers from serious mathematical defects, some of which we have already pointed out and some of which are irreparable, which severely limit the conclusions that can be drawn from the work. The scope of [7] is also geared heavily towards the role of reference frames in information-theoretic tasks and agent-based scenarios, e.g., entanglement theory, quantum key distribution, communication tasks, all when the given agents have no knowledge of each other’s reference frame. The ensuing practical limitations gradually morph through the paper into fundamental ones, with far-reaching conclusions that we contend are not warranted. We again give points of agreement (e.g., that “all observable quantities ought to be relational”) and disagreement (“superselection rules cannot provide any fundamental restrictions on quantum theory”), and again clear up dubious arguments by consistently applying the principle that observable quantities are invariant. This also applies to [37], which shares many mathematical problems with [7]. We find the language vague and occasionally conceptually unclear, and we will critique this work in detail, drawing upon ideas thus far presented.

11.1 Brief Overview

The notion of a superselection rule was introduced by Wick, Wightman and Wigner [8], who proposed that superpositions of states of bosons and fermions should be considered as equivalent to the associated mixture (i.e., that relative phase factors in superpositions of bosons and fermions are unobservable in principle), and a similar position was advocated for states of differing electric charge. Aharonov and Susskind [10] disagreed with the latter claim and offered a concrete experimental arrangement, very similar to those we have considered in this paper (for the express purpose of critique), to demonstrate the possibility of preparing and observing coherent superpositions of states of different electric charge, via a formal analogy to the case of angular momentum. WWW then replied [9] with a theorem demonstrating that coherence is required in the initial state of one system in order to observe it in another, pointing to a circularity in Aharonov and Susskind’s argument and similar to the objection raised here in Sect. 6.6.

Subsequently the issue of the “reality” of quantum optical coherence was raised by Mølmer [48], who suggested that if the gain medium of the laser is properly accounted for, the actual laser field is described by a mixture of number states, and that therefore the coherence is merely “convenient fiction”.

Bartlett et al. [7] (BRS), also in collaboration with Dowling [37] (DBRS), have shed light on aspects of the superselection rule debate, particularly in clarifying the position of Aharonov and Susskind [10], and on the “optical coherence controversy” [12], highlighting the relative nature of states (and also, therefore, of coherence) and the accompanying role of reference frames.

We now present the form that these controversies take from the perspective of the relational formalism presented here. We believe that the framework we have developed for dealing with relative quantities clarifies the seemingly opposing viewpoints of AS and WWW, and in a certain sense unifies them. We will see that the attempts to overcome or “lift” superselection rules (as they arise through the lack of a reference frame—see [7]) correspond to model considerations that take the same form as the dynamical models already considered (many of which are modelled on [10] and [37]). The framework afforded by observables-as-invariants allows for a circumvention of the “relative” and “global” decompositions of the system-apparatus Hilbert space described in [7, 37] (see also [47]) which have mathematical flaws, and allows for a direct assessment of the status of claims to, in essence, obviate superselection rules.

11.2 The Exchange Between Aharonov-Susskind and Wick-Wightman-Wigner

11.2.1 Wick, Wightman, Wigner: The First Superselection Rule

In 1952, Wick et al. [8] made a simple argument to demonstrate the existence of a dichotomy between the assumption that all self-adjoint operators represent observables on one hand (a working assumption since von Neumann’s book [54, p. 313]), and relativistic invariance on the other. Since double time reversal, \(T^2:\mathcal {H}_{\mathcal {S}}\rightarrow {\mathcal {H}_{\mathcal {S}}}\) (with \(\mathcal {H}_{\mathcal {S}}\equiv \mathcal {H}_{\mathcal {S}}{_b} \oplus \mathcal {H}_{\mathcal {S}}{_f}\), the decomposition into bosonic and fermionic subspaces defining the projections \(P_b\) and \(P_f\) respectively), they argue, cannot be observed, and since \(T^2\) has the effect of leaving bosonic states invariant and introducing a minus sign on fermionic states:

$$\begin{aligned} \varPsi _+ \equiv \frac{1}{\sqrt{2}}\left( \varphi _b + \varphi _f \right) \overset{T^2}{\longrightarrow }\frac{1}{\sqrt{2}}\left( \varphi _{b} - \varphi _f \right) \equiv \varPsi _- , \end{aligned}$$
(94)

it must be that any observable leaves the bosonic and fermionic sectors invariant, with the sign difference then unobservable. This follows since, for the consequences of a double time reversal to be unobservable, any self-adjoint A representing an observable must satisfy \(\left\langle \,\varPsi _+\,{|}\,A \varPsi _+\,\right\rangle =\left\langle \,\varPsi _-\,{|}\,A \varPsi _-\,\right\rangle \), from which it follows that any observable A must commute with \(P_b\) and \(P_f\), and thus with any observable W that has \(P_b\) and \(P_f\) as its spectral projections. W is then a superselection observable.

We observe that the stipulation that observables of \(\mathcal {S}\) commute with W leads to the equivalence of states \(\rho \) and \(\tau _{\mathcal {S}*} (\rho )\), with \(\tau _{\mathcal {S}*}(\rho ) := P_b\rho P_b + P_f \rho P_f\) in this case. WWW also conjectured (subsequently proven in quantum field theory by Strocchi and Wightman [55]) that the relative phase factors in superpositions of states of different electric charge have the same status, namely cannot be determined by experiment, even in principle, and therefore that the states \(\rho \) and \(\sum P_n \rho P_n \equiv \tau _{\mathcal {S}*}(\rho )\) are equivalent, with the sum running over all possible values of electric charge. Any observable must commute with charge, and must thus be invariant under shifts in phase/angle conjugate to charge.
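The equivalence of \(\rho \) and \(\tau _{\mathcal {S}*}(\rho )\) can be illustrated numerically. The following Python sketch is ours and is only a toy: it assumes a two-dimensional space with one bosonic basis vector (index 0) and one fermionic basis vector (index 1), and the names `twirl`, `projector` and `expectation` are hypothetical helpers rather than anything defined in the paper. It checks that \(\varPsi _+\) and \(\varPsi _-\) have the same image under \(\tau _{\mathcal {S}*}\) and the same expectation for an operator commuting with \(P_b\) and \(P_f\), whereas an operator that does not commute with them would distinguish the two.

```python
import math

def twirl(rho):
    """Apply rho -> P_b rho P_b + P_f rho P_f on a 2-level toy space.

    Index 0 is taken as the bosonic vector, index 1 as the fermionic one
    (an illustrative assumption), so the map just deletes off-diagonals.
    """
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]

def projector(psi):
    """Rank-one projection |psi><psi| for a 2-vector."""
    return [[psi[i] * complex(psi[j]).conjugate() for j in range(2)] for i in range(2)]

def expectation(op, psi):
    """<psi | op psi> for a 2x2 operator."""
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)
psi_plus = [s, s]     # (phi_b + phi_f)/sqrt(2)
psi_minus = [s, -s]   # image under T^2: the fermionic component changes sign

# An operator commuting with P_b and P_f is diagonal in this basis.
invariant_op = [[0.3, 0.0], [0.0, 1.7]]
# A self-adjoint operator that does not commute with P_b and P_f.
non_invariant_op = [[0.0, 1.0], [1.0, 0.0]]

same_twirl = twirl(projector(psi_plus)) == twirl(projector(psi_minus))
gap_invariant = abs(expectation(invariant_op, psi_plus)
                    - expectation(invariant_op, psi_minus))
gap_non_invariant = abs(expectation(non_invariant_op, psi_plus)
                        - expectation(non_invariant_op, psi_minus))
```

In this toy case the invariant operator gives a vanishing gap, while the non-invariant one separates \(\varPsi _+\) from \(\varPsi _-\); excluding such operators from the observables is exactly what the superselection rule stipulates.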

The stipulation of such a superselection rule is formally identical to the limitation imposed by the a priori assumption that observables are invariant under symmetry (phase shifts in the charge case). Therefore, it must be understood whether the statement of a (say, electric charge) superselection rule amounts to anything more than the restriction thus far discussed. First, we discuss the reply of AS to the WWW paper, along with another model (due to Dowling et al. [37]) purporting to prepare and measure (absolutely) coherent superpositions of atoms and molecules (against baryon number superselection).

As has been shown, the requirement that observables be phase shift invariant allows for the relative phase of system and reference to be observed, with the absolute phase representing the relative phase given high reference phase localisation. The reply by Aharonov and Susskind to the WWW paper, advocating the possibility of measuring relative phase factors in charge superpositions, makes explicit use of such phase references. We now review their reply and hope to clarify their position by employing the methods and language introduced in this paper.

11.2.2 Reply of Aharonov and Susskind: Proton–Neutron Superpositions

In favour of observability of relative phase factors between superselection rule “forbidden” superpositions, we sketch two thought experiments; the first is due to Aharonov and Susskind [10], conceived so as to demonstrate a realistic scenario in which coherent superpositions of states of different electric charge can be prepared and measured. The second, due to Dowling et al. [37], is similar in spirit, and purports to prove that atoms and (diatomic) molecules can be (absolutely) coherently superposed.Footnote 6

It will be shown that in both of these examples there is an implicit relativisation of the operators to be measured, thereby constructing an invariant operator (observable) not unlike the ones we have discussed. Furthermore, in both cases a crucial role is played by the limit of high localisation of a reference state (in both cases provided by a coherent state) with respect to a covariant phase-like operator conjugate to the symmetry generator, in direct analogy to the models and general results that have been presented. To our knowledge, it has not been explicitly stated anywhere that such localisation is the key property.

The Hilbert space \(\mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}_1} \otimes \mathcal {H}_{\mathcal {R}_2}\) of Aharonov and Susskind’s thought experiment is to correspond to a proton–neutron system \(\mathcal {S}\) and two cavities (\(\mathcal {R}_1\), \(\mathcal {R}_2\)) capable of containing any integer number of negatively charged mesons. Aharonov and Susskind imagine preparing \(\mathcal {R}_1\) and \(\mathcal {R}_2\) in charge-coherent states (we include normalisation factors that were omitted in the original treatment)

$$\begin{aligned} |q_1, \theta \rangle = e^{-q_1/2}\sum _n \frac{q_1 ^{n/2}}{\sqrt{n!}}\exp {(i n \theta )} |n\rangle \equiv \sum c_n (\theta ) |n\rangle \end{aligned}$$
(95)

and

$$\begin{aligned} |q_2, \theta ^{\prime } \rangle = e^{-q_2/2}\sum _n \frac{q_2 ^{n/2}}{\sqrt{n!}}\exp {(i n \theta ^{\prime })}|n \rangle \equiv \sum c'_n(\theta ^{\prime }) |n\rangle \end{aligned}$$
(96)

respectively, where \(|n\rangle \) denotes a charge eigenstate corresponding to n negatively charged mesons. The parameters \(q_1\) and \(q_2\) represent the respective mean charge values in the coherent states, corresponding to the observables \(Q_1, \ Q_2\), which are structurally identical to the number operators we have encountered thus far, except that n takes (only) non-positive values.

The initial state of the nucleon is a proton \(|P\rangle \), and we will use \(|N\rangle \) to represent a neutron. \(|P\rangle \) and \(|N\rangle \) are thus eigenstates of the charge observable \(Q_{\mathcal {S}}\) of \(\mathcal {S}\) with eigenvalues 1, and 0, respectively. The dynamics, which take place in two stages, are governed by a Jaynes-Cummings-type Hamiltonian (which commutes with charge) \(H=g(t)(\sigma ^+ a^- + \sigma ^- a^+)\) where \(\sigma ^{+} = |N \rangle \langle P| \), \(\sigma ^{-} = |P \rangle \langle N|\) (sometimes referred to as the isospin operators), and \(a^{\pm }\) are meson creation and annihilation operators which act on the states of the cavities. The function g(t) describes the interaction strength and fixes the duration of the interaction (given physically by the passage time of the nucleon travelling through the cavity). Explicitly, the dynamics are governed by \(H_1 = g_1(t)(\sigma ^+ \otimes a^- \otimes \mathbb {1} + \sigma ^- \otimes a^+ \otimes \mathbb {1} )\) with \(g_1(t)=g\chi _{[0,T]}(t)\), followed by \(H_2 = g_2(t)(\sigma ^+ \otimes \mathbb {1} \otimes a^- + \sigma ^- \otimes \mathbb {1} \otimes a^+)\) with \(g_2(t)=g\chi _{[T,2T]}(t)\). The unitary \(U_1\) effects the following transitions on charge eigenstates (omitting the second cavity):

$$\begin{aligned} |N\rangle |n\rangle&\longrightarrow i \sin {\left( Tg\sqrt{n}\right) }|P\rangle |n - 1 \rangle + \cos {\left( Tg\sqrt{n}\right) }|N\rangle |n\rangle ,\end{aligned}$$
(97)
$$\begin{aligned} |P\rangle |n\rangle&\longrightarrow \cos {\left( Tg\sqrt{n+1}\right) } |P \rangle |n\rangle + i \sin {\left( Tg\sqrt{n+1}\right) }|N \rangle |n+1 \rangle . \end{aligned}$$
(98)

Referring back to equation (86), these are of an almost identical form. Analogous to what we saw there, we find here that we may measure the observable \(|P\rangle \langle P|\otimes \mathbb {1}\) in the state \(U_1 |P\rangle |n\rangle \) to find the proton probability \(\cos ^2{\left( Tg\sqrt{n+1}\right) }\).

Starting with the initial state \(\varPsi _0=|P\rangle |q_1,\theta \rangle |q_2,\theta '\rangle \), the state after the first cavity is

$$\begin{aligned} U_1 \varPsi _0 =\sum _n c_n\left[ \cos {\left( Tg\sqrt{n+1}\right) }|P\rangle |n\rangle + i \sin {\left( Tg\sqrt{n+1}\right) }|N\rangle |n+1 \rangle \right] \,|q_2,\theta '\rangle . \end{aligned}$$
(99)

One must then consider the limit of large \(q_1\), which yields

$$\begin{aligned} U_1 \varPsi _0 \approx \left( i e^{i \theta }\sin {\left( gT\sqrt{q_1}\right) }|N\rangle + \cos {\left( gT\sqrt{q_1}\right) }|P\rangle \right) \, |q_1, \theta \rangle \, |q_2, \theta ^{\prime }\rangle . \end{aligned}$$
(100)
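The quality of the approximation leading from (99) to (100) can be probed numerically. The Python sketch below is ours and rests on a stated assumption: the interaction time is scaled as \(gT = \alpha /\sqrt{q_1}\), so that the angle \(gT\sqrt{q_1} = \alpha \) stays fixed while \(q_1\) grows (the symbol \(\alpha \) and the function names are hypothetical, introduced only for this illustration). It compares the exact proton probability \(\sum _n |c_n|^2 \cos ^2(Tg\sqrt{n+1})\) obtained from (99) with the value \(\cos ^2(gT\sqrt{q_1})\) read off from (100).

```python
import math

def poisson_weight(n, q):
    """|c_n|^2 = exp(-q) q^n / n!, computed in log space to avoid overflow."""
    return math.exp(n * math.log(q) - q - math.lgamma(n + 1))

def proton_probability(q, angle):
    """Exact proton probability after the first cavity, from Eq. (99).

    Assumption (ours): the coupling time is scaled as g*T = angle / sqrt(q),
    so that g*T*sqrt(q) stays fixed while q grows.
    """
    gT = angle / math.sqrt(q)
    n_max = int(q + 12 * math.sqrt(q) + 50)   # Poisson tail beyond this is negligible
    return sum(poisson_weight(n, q) * math.cos(gT * math.sqrt(n + 1)) ** 2
               for n in range(n_max + 1))

angle = math.pi / 4
target = math.cos(angle) ** 2   # value read off from the product form (100)
deviations = {q: abs(proton_probability(q, angle) - target) for q in (25, 400, 10000)}
```

Under this scaling the deviation is small but non-zero at every finite \(q_1\), and shrinks as \(q_1\) increases, which is the sense in which (100) is an approximation rather than an identity.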

The nucleon is then approximately “separated” from the cavities; it enters the second cavity and exits, this time in the large \(q_2\) limit, as

$$\begin{aligned}&\Big [ \left( \cos {(gT\sqrt{q_1})}\cos {(gT\sqrt{q_2})}- e^{i(\theta - \theta ^{\prime })} \sin {(gT\sqrt{q_1})}\sin {(gT\sqrt{q_2})} \right) |P\rangle \nonumber \\&+ i\left( e^{i \theta ^{\prime }} \cos {(gT\sqrt{q_1})}\sin {(gT\sqrt{q_2})} + e^{i \theta }\sin {(gT\sqrt{q_1})} \cos {(gT\sqrt{q_2})} \right) |N\rangle \Big ] \, |q_1, \theta \rangle \, |q_2, \theta ^{\prime }\rangle . \end{aligned}$$
(101)

As observed, the proton probability (i.e., the expectation of \(|P\rangle \langle P|\otimes \mathbb {1}\otimes \mathbb {1}\) in the state \(U \varPsi _0\)) now depends on \(\theta - \theta ^{\prime }\), the relative phase between \(\mathcal {R}_1\) and \(\mathcal {R}_2\).
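The dependence on the phase difference alone can be checked directly from the coefficients in (101). The following Python sketch is ours: it takes the approximate product form (101) at face value, writes \(a = gT\sqrt{q_1}\) and \(b = gT\sqrt{q_2}\) (shorthand introduced here, not in the original), and evaluates the squared moduli of the \(|P\rangle \) and \(|N\rangle \) coefficients.

```python
import cmath
import math

def proton_probability(a, b, theta, theta_prime):
    """Squared modulus of the |P> coefficient in Eq. (101)."""
    amp = (math.cos(a) * math.cos(b)
           - cmath.exp(1j * (theta - theta_prime)) * math.sin(a) * math.sin(b))
    return abs(amp) ** 2

def neutron_probability(a, b, theta, theta_prime):
    """Squared modulus of the |N> coefficient in Eq. (101)."""
    amp = 1j * (cmath.exp(1j * theta_prime) * math.cos(a) * math.sin(b)
                + cmath.exp(1j * theta) * math.sin(a) * math.cos(b))
    return abs(amp) ** 2

a = b = math.pi / 4
p_reference = proton_probability(a, b, 0.9, 0.2)
p_shifted = proton_probability(a, b, 0.9 + 1.3, 0.2 + 1.3)   # common phase shift
p_equal_phases = proton_probability(a, b, 0.5, 0.5)          # theta - theta' = 0
p_opposite = proton_probability(a, b, math.pi, 0.0)          # theta - theta' = pi
total = p_reference + neutron_probability(a, b, 0.9, 0.2)
```

A common shift of \(\theta \) and \(\theta ^{\prime }\) leaves the proton probability unchanged, and for \(a=b=\pi /4\) it ranges from 0 (equal phases) to 1 (phases differing by \(\pi \)), so the statistics are sensitive only to the relative phase of the two cavities.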

Therefore, as argued by Aharonov and Susskind, the nucleon is in a coherent superposition of proton and neutron with relative phase \((\theta - \theta ^{\prime })\) “when referred to the frame provided by \(\mathcal {R}_2\)”. The idea is that the absolutely coherent superposition is created by the first cavity (cf. (99)) and then confirmed by measuring an invariant quantity of \(\mathcal {S}\) after passage through the second cavity. However, the model presented by AS suffers from the same kind of difficulties as discussed in Sect. 10.7.

From the perspective developed in the present paper, we would instead say that in the limit of high reference system localisation, we are faced with the appearance of measuring an absolute quantity (namely a phase-like quantity sensitive to relative phase in nucleon superpositions) in an absolutely coherent state, but this is appropriately understood as pertaining to a relative phase-like observable between the nucleon and the cavities (and a mutually coherent state). The analogy to the angular momentum/angle case, as employed by Aharonov and Susskind to compel one to believe in the observability of proton–neutron superpositions, is indeed a good one. However, we argue for the opposite conclusion: it is not that since absolute coherence of states of different angular momentum is observable, therefore so is the relative phase factor in superpositions of charge states, but rather, absolute coherence for angular momentum is not possible, and nor is it in the charge case.

Indeed, it is stated quite explicitly in [10], that “the coherence of states of different angular momentum is measured relative to a frame of reference”. Thus coherence itself is viewed as a relative feature; from this point of view, there is no absolute coherence of states of angular momentum. Once again, we see the importance of the mutual coherence concept.

The Aharonov-Susskind paper was understood by many as proving the possibility of coherent superpositions of states of different electric charge—a situation conjectured impossible by Wick, Wightman and Wigner (WWW) [8] 15 years previously. Three years after Aharonov and Susskind’s contribution, WWW demonstrated the necessity of using superpositions of states of different charge (i.e., the absolutely coherent cavity states) in order to demonstrate their existence (i.e., for the nucleon); see Sect. 11.2.3. Aharonov and Susskind were alert to such a circularity and tried to avoid any possible objection in the final part of their paper, where they attempted to construct a charge eigenstate out of the two charge coherent states, providing a manifestly (phase-shift) invariant state. This takes the form of an integral ([10], final page)

$$\begin{aligned} |i\rangle = \int |q \theta _1\rangle |q ^{\prime } \theta _2\rangle \delta \left( \theta _1 - \theta _2 -(\theta ^{\prime } - \theta )\right) e^{-i(q + q^{\prime }) \theta _1}d \theta _1 d \theta _2, \end{aligned}$$
(102)

where the initial state \(|i\rangle \) is then an eigenstate of charge \(q+q'\) and fairly well-defined phase \(\theta ^{\prime } - \theta \).Footnote 7 They claim that the proton probability distribution is unchanged even when the cavities are prepared in a charge eigenstate. The following calculation demonstrates that their proposal is flawed: if the two-cavity system is prepared in a charge eigenstate, \(|i\rangle \), under charge-conserving evolution the approximation they give can never be valid. Suppose under evolution U which conserves total charge we have

$$\begin{aligned} |P\rangle \otimes |i\rangle {\mathop {\longrightarrow }\limits ^{U}}\phi _{i+1} \end{aligned}$$

where \(Q^c\phi _{i+1}=(i+1) \phi _{i+1}\). Then for arbitrary \(\psi = \alpha |P\rangle + \gamma |N\rangle \), \(\left| \alpha \right| ^2+\left| \gamma \right| ^2 = 1\), we note the following trivial observation:

$$\begin{aligned} \left\| \phi _{i+1} - \psi \otimes |j\rangle \right\| = 0\quad \text {if and only if }\gamma = 0\text { and }i=j. \end{aligned}$$

By contrast, in the example where \(\alpha = \gamma = 1/ \sqrt{2}\), we have \(\left\| \phi _{i+1} - \psi \otimes |i\rangle \right\| ^2 \ge 2-\sqrt{2}\). Thus the resulting state is a finite (norm) distance from an eigenstate, independent of the “size” of the reference system. This “fix” by Aharonov and Susskind is therefore untenable, and their conclusion that interference effects may be observed between states of different electric charge, given the restriction of not assuming its possibility from the outset, does not follow from their argument.
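The bound \(2-\sqrt{2}\) can be checked by brute force in a simplified setting. The Python sketch below is ours and makes a labelled simplification: the total-charge \((i+1)\) sector is represented by the two vectors \(|P\rangle |i\rangle \) and \(|N\rangle |i+1\rangle \) only, so that \(\phi _{i+1} = a|P\rangle |i\rangle + b|N\rangle |i+1\rangle \) with \(|a|^2+|b|^2=1\) (the coefficients a, b and the function names are hypothetical). Further components of \(\phi _{i+1}\) orthogonal to \(\psi \otimes |i\rangle \) would not reduce the distance.

```python
import cmath
import math
import random

def squared_distance(a, b):
    """|| phi - psi (x) |i> ||^2 for phi = a|P>|i> + b|N>|i+1>.

    phi lies in the total-charge (i+1) sector; psi = (|P> + |N>)/sqrt(2).
    The three relevant basis vectors are |P>|i>, |N>|i>, |N>|i+1>.
    """
    s = 1 / math.sqrt(2)
    diff = (a - s, 0 - s, b - 0)   # components of phi minus psi (x) |i>
    return sum(abs(c) ** 2 for c in diff)

def random_sector_state(rng):
    """Random normalised pair (a, b) with |a|^2 + |b|^2 = 1."""
    mix = rng.uniform(0.0, math.pi / 2)
    a = math.cos(mix) * cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
    b = math.sin(mix) * cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
    return a, b

rng = random.Random(0)
bound = 2 - math.sqrt(2)
smallest = min(squared_distance(*random_sector_state(rng)) for _ in range(10000))
at_optimum = squared_distance(1.0, 0.0)   # phi = |P>|i>, i.e. no transition at all
```

In this setting the squared distance equals \(2-\sqrt{2}\,\mathrm{Re}\,a\), so the bound is attained only when the evolved state is still \(|P\rangle |i\rangle \); no label i appears in the bound, in line with its independence of the “size” of the reference.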

The approximation based on high amplitude coherent states is mathematically valid and results in states close to a product state containing proton–neutron superpositions in the system Hilbert space. High amplitude coherent states, however, already exhibit absolute coherence if the observables are not restricted to invariants (such a constraint on observables is barely mentioned in AS’s paper). The above result demonstrates that if the coherent states are replaced with a charge eigenstate, no such approximation can occur. WWW responded to the AS paper, also implicitly criticising the error, which we now discuss.

11.2.3 Response of Wick, Wightman, Wigner

Wick et al. [9] responded to Aharonov and Susskind’s challenge to the superselection rule for charge, making three key points which we now summarise. We note that WWW’s argument was not to offer a proof of charge superselection, but rather to argue that superpositions of states of different charge cannot arise from (composition of) invariant states, charge-conserving dynamics, and subsystem separation. This therefore takes place in the Schrödinger picture.

We assume that charge (\(Q_{\mathcal {S}}\) for \(\mathcal {S}\) and \(Q_{\mathcal {R}}\) for \(\mathcal {R}\)) may take positive and negative values, and recall that \(\tau _{\mathcal {T}*}(\rho ) = \sum _{-\infty }^{\infty }P_n \rho P_n\) (with appropriate indices for subsystems, \(\mathcal {S}\) and \(\mathcal {R}\)).

  1. The composition \(\rho _{\mathcal {S}} \otimes \rho _{\mathcal {R}}\) for \(\rho _{\mathcal {S}}\) and \(\rho _{\mathcal {R}}\) invariant yields a state which commutes with total charge (i.e., is invariant) and no interference of states of different charge of \(\mathcal {S}+ \mathcal {R}\) is possible.

  2. The time evolution \(U:\mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\rightarrow \mathcal {H}_{\mathcal {S}}\otimes \mathcal {H}_{\mathcal {R}}\), which commutes with \(Q=Q_{\mathcal {S}} + Q_{\mathcal {R}}\), gives \([Q,U\rho U^*]=0\) for any \(\rho \) for which \([\rho ,Q]=0\). Equivalently, with \(\mathcal {U}(\cdot ) = U (\cdot ) U^*\), \(\mathcal {U}\circ \tau = \tau \circ \mathcal {U}\).

  3. Given \(\tau _{\mathcal {T}*} (\rho )\), the reduced states of \(\mathcal {S}\) and \(\mathcal {R}\) commute with \(Q_{\mathcal {S}}\) and \(Q_{\mathcal {R}}\), respectively. Equivalently, \(\tau _{\mathcal {S}*}(\mathrm{tr}_{\mathcal {R}}\left[ \tau _{\mathcal {T}*} (\rho )\right] ) = \mathrm{tr}_{\mathcal {R}}\left[ \tau _{\mathcal {T}*} (\rho )\right] \).

The three steps outlined above correspond to composition, evolution and separation, respectively. Regarding observing absolutely coherent superpositions of states of different charge of \(\mathcal {S}\), WWW showed that it “takes one to know one”; specifically, a coherent superposition of states of different charge for \(\mathcal {R}\) is required in order to observe one at the level of \(\mathcal {S}\), showing that AS’s argument, as presented, is circular (in using coherent states for the cavities) or flawed (in using a charge eigenstate for the combined cavities).
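The three steps can be followed in a minimal toy model. The Python sketch below is ours and is not taken from [9]: it assumes that \(\mathcal {S}\) and \(\mathcal {R}\) are both two-level systems with charge values 0 and 1, uses a hypothetical charge-conserving unitary that rotates within the total-charge-1 sector, and compares an invariant reference state with a coherent one. All function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def kron(a, b):
    """Tensor product of two 2x2 matrices; basis index is 2*s + r."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)] for i in range(4)]

def reduced_system_state(rho):
    """Partial trace over R of a 4x4 state on S (x) R."""
    return [[sum(rho[2 * s + r][2 * t + r] for r in range(2)) for t in range(2)]
            for s in range(2)]

def conserving_unitary(angle):
    """Rotation in the total-charge-1 sector span{|0,1>, |1,0>}, identity elsewhere."""
    c, off = math.cos(angle), -1j * math.sin(angle)
    return [[1, 0, 0, 0],
            [0, c, off, 0],
            [0, off, c, 0],
            [0, 0, 0, 1]]

def system_coherence(rho_r, angle=math.pi / 4):
    """|<0| rho_S |1>| after composition, evolution and separation."""
    rho_s = [[1, 0], [0, 0]]                      # S starts in a charge eigenstate
    u = conserving_unitary(angle)
    evolved = matmul(matmul(u, kron(rho_s, rho_r)), dagger(u))
    return abs(reduced_system_state(evolved)[0][1])

invariant_reference = [[0.5, 0], [0, 0.5]]        # commutes with Q_R
coherent_reference = [[0.5, 0.5], [0.5, 0.5]]     # projection onto (|0> + |1>)/sqrt(2)

coherence_from_invariant = system_coherence(invariant_reference)
coherence_from_coherent = system_coherence(coherent_reference)
```

In this toy model the reduced state of \(\mathcal {S}\) acquires an off-diagonal element only when the reference already carries one, which is the “takes one to know one” point in miniature.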

Dowling et al. presented, in 2006 [37], an argument in favour of superpositions of states of different baryon number, correcting some flaws in Aharonov-Susskind’s argument. We now present this model, before comparing the viewpoints of the two “camps” (those who believe superselection can be obviated in practice, and those who don’t), and discussing the wider context of superselection rules and their obviation.

11.3 Atom-Molecule Superpositions According to Dowling et al.

In the spirit of the 1967 contribution by Aharonov and Susskind, Dowling et al. [37] attempt to model the observation of a coherent superposition of an atom and a (diatomic) molecule, as a possible demonstration of coherent superpositions of states of differing baryon number. In order to avoid the error of Aharonov and Susskind in preparing the cavities in an eigenstate of the conserved quantity, they instead utilise the coherent state, but acknowledge that appropriate “sectorising” (i.e., application of the \(\tau _{\mathcal {T}*}\)/twirling map) is necessary in order to respect the symmetry for the composite system.

The reference system is provided by a Bose–Einstein condensate (BEC), coherent states of which are written \(|\beta \rangle = \sum _{n=0} ^{\infty } c_n |n\rangle \) (\(|n\rangle \) representing a state of n atoms) with \(c_n = \exp {(-\left| \beta \right| ^2/2)} \beta ^n / \sqrt{n!}\). We write \(\beta = \sqrt{m} e^{i \theta }\), and have that \(\langle N \rangle _{\beta } = \left| \beta \right| ^2 = m\) and \((\varDelta N )_{\beta } = \sqrt{m}\), and as m becomes large, coherent states become arbitrarily highly localised in phase. Therefore the coherent state looks increasingly like a phase “eigenstate”. It is also useful to note that \(\tau _{\mathcal {R}*} (P_{\beta }) = \sum _{n=0}^{\infty } P_n |\beta \rangle \langle \beta |P_n = \sum _{n=0} ^{\infty } \left| c_n \right| ^2 |n\rangle \langle n|\).
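The number statistics quoted here are those of the diagonal state \(\tau _{\mathcal {R}*} (P_{\beta })\), whose weights \(|c_n|^2\) form a Poisson distribution. As a quick numerical check (a sketch of ours, with hypothetical function names and a truncation of the sum that is an implementation choice rather than part of the model), the mean and spread of N can be computed directly from these weights.

```python
import math

def number_distribution(m, n_max):
    """Diagonal of the twirled coherent state: |c_n|^2 = exp(-m) m^n / n!."""
    return [math.exp(n * math.log(m) - m - math.lgamma(n + 1)) for n in range(n_max + 1)]

def mean_and_spread(m):
    """Mean and standard deviation of N in the coherent state with |beta|^2 = m."""
    n_max = int(m + 12 * math.sqrt(m) + 50)   # truncation; the discarded tail is negligible
    weights = number_distribution(m, n_max)
    mean = sum(n * w for n, w in enumerate(weights))
    second = sum(n * n * w for n, w in enumerate(weights))
    return mean, math.sqrt(second - mean ** 2)

mean, spread = mean_and_spread(400.0)
relative_spread = spread / mean   # close to 1/sqrt(m), shrinking as m grows
```

The code addresses only the number distribution, which is all that survives the twirling map; the phase \(\theta \) does not enter the weights \(|c_n|^2\).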

Dowling et al. describe an experiment, again with a multistage unitary along the lines of the models we have outlined, which goes as follows: The initial state is \(P_{|A\rangle \otimes |\beta \rangle }\) (\(\sim |A\rangle \langle A| \otimes \tau _{\mathcal {R}*} (P_{\beta })\)), where the state \(|A\rangle \) is to represent an atom; accordingly molecule states are written \(|M\rangle \) (both of these are to be understood as shorthand: \(|A\rangle \equiv |0\rangle _M|1\rangle _A\) and \(|M\rangle \equiv |1\rangle _M|0\rangle _A\)). Defining the cavity states

$$\begin{aligned} | \beta _{A}^1 \rangle = \sum _{n=0}^\infty c_n\cos {\left( \frac{\pi }{4}\sqrt{\frac{n}{m}}\right) } |n\rangle = \sum _{n=0}^{\infty } \frac{e^{-m/2}m^{n/2}}{\sqrt{n!}}e^{in \theta }\cos {\left( \frac{\pi }{4}\sqrt{\frac{n}{m}}\right) } |n \rangle \end{aligned}$$
(103)

and

$$\begin{aligned} |\beta _{M} ^1\rangle = -i \sum _{n=0}^\infty c_n \sin \left( \frac{\pi }{4} \sqrt{\frac{n}{m}} \right) |n-1\rangle , \end{aligned}$$
(104)

they give the following sequence of unitary maps (for details on the specific form of the Hamiltonians, see [37]):

$$\begin{aligned} \varPsi ^{\prime } \equiv U_1 |A \rangle \otimes |\beta \rangle = |A\rangle \otimes |\beta _{A}^1\rangle + |M\rangle \otimes |\beta _{M}^1\rangle \end{aligned}$$
(105)

followed by free evolution under a Hamiltonian of the form \(K |M\rangle \langle M|\) (with K a constant)

$$\begin{aligned} \varPsi ^{\prime } \rightarrow \varPsi ^{\prime \prime } \equiv U_2 \varPsi ^{\prime } = |A\rangle \otimes |\beta _{A}^1\rangle + e^{i \phi }|M\rangle \otimes |\beta _{M}^1\rangle , \end{aligned}$$
(106)

where \(\phi = T K\) and T is the duration of free evolution. Thus \(U_2\) explicitly depends on \(\phi \). Finally,

$$\begin{aligned} U_3 \varPsi ^{\prime \prime } = |A\rangle \otimes |\beta _{A}^3\rangle + |M\rangle \otimes |\beta _{M}^3\rangle , \end{aligned}$$
(107)

with

$$\begin{aligned} |\beta _{A}^3\rangle = \sin \left( \frac{\phi }{2}\right) |\beta \rangle - i \cos \left( \frac{\phi }{2}\right) \sum c_n \cos \left( \frac{\pi }{2}\sqrt{\frac{n}{m}}\right) |n\rangle \end{aligned}$$

and

$$\begin{aligned} |\beta _{M}^3\rangle =- \cos \left( \frac{\phi }{2}\right) \sum c_n \sin \left( \frac{\pi }{2}\sqrt{\frac{n}{m}}\right) |n-1\rangle \end{aligned}$$

again representing cavity states. The purpose of \(U_2U_1\) is to introduce the relative phase factor \(\phi \); \(U_3\) then allows the measurement of a quantity that is convenient for realistic experiments (i.e. \(|M\rangle \langle M|, |A\rangle \langle A|\)) and that is also an invariant quantity of \(\mathcal {S}\). For the purposes of discussing relative phase factor observability it is sufficient to consider the state following the application of \(U_1\) or \(U_2\), along with the asymptotic behaviour outlined in [37].

Since discussions pertaining to the type of convergence thus far encountered here have been somewhat informal in the existing work, we provide a proof in the appendix that, for example,

$$\begin{aligned} \bigl \Vert |\beta _{A}^1\rangle - \frac{1}{\sqrt{2}} \left| \beta \right\rangle \bigr \Vert \rightarrow 0~ \text {as}~ m \rightarrow \infty . \end{aligned}$$
(108)

We may write

$$\begin{aligned} U_1 |A \rangle \otimes |\beta \rangle = \left( \tfrac{1}{\sqrt{2}}|A\rangle - ie^{i \theta }\tfrac{1}{\sqrt{2}} |M\rangle \right) \otimes |\beta \rangle + |\text {error}\rangle _m \end{aligned}$$
(109)

where

$$\begin{aligned} |\text {error}\rangle _m = |A\rangle \otimes \left( |\beta _{A}^1\rangle - \tfrac{1}{\sqrt{2}} | \beta \rangle \right) + |M\rangle \otimes \left( i e^{i \theta }\tfrac{1}{\sqrt{2}} |\beta \rangle + |\beta _{M}^1\rangle \right) \end{aligned}$$
(110)

with \(\theta \equiv \arg {\beta }\). It is clear that \(\left\| |\text {error}\rangle _m \right\| \rightarrow 0\) as \(m\rightarrow \infty \) if and only if

$$\begin{aligned} \left\| \tfrac{1}{\sqrt{2}} | \beta \rangle - |\beta _{A}^1\rangle \right\| \rightarrow 0 \quad \text {and}\quad \left\| i e^{i \theta }\tfrac{1}{\sqrt{2}} |\beta \rangle + |\beta _{M}^1\rangle \right\| \rightarrow 0 \end{aligned}$$

individually, using the fact that \(\left\langle \,A\,{|}\,M\,\right\rangle =0\) and \(\bigl \Vert |A\rangle \bigr \Vert = \bigl \Vert |M\rangle \bigr \Vert = 1\).
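As a numerical illustration of the second of these norms, the following minimal sketch evaluates \(\left\| i e^{i \theta }\tfrac{1}{\sqrt{2}} |\beta \rangle + |\beta _{M}^1\rangle \right\| \) with \(|\beta _{M}^1\rangle \) taken from Eq. (104). It rests on an assumption not stated in this section, namely that the \(c_n\) are coherent-state coefficients with \(|\beta |^2 = m\). The function name and the truncation window are illustrative only.

```python
import math

def molecule_error_norm(m, width=12.0):
    """Norm of i e^{i theta}|beta>/sqrt(2) + |beta_M^1>, with |beta_M^1> as in Eq. (104).

    Assumption (hypothetical choice): c_n = e^{-m/2} beta^n / sqrt(n!), |beta|^2 = m.
    Since c_{k+1} = c_k e^{i theta} sqrt(m/(k+1)), the |k> component has modulus
    |c_k| * |1/sqrt(2) - sin((pi/4) x) / x|  with  x = sqrt((k+1)/m),
    so theta drops out of the norm.
    """
    lo = max(0, int(m - width * math.sqrt(m)))
    hi = int(m + width * math.sqrt(m)) + 10
    total = 0.0
    for k in range(lo, hi):
        # Poisson weight |c_k|^2, evaluated in log space to avoid overflow
        log_weight = -m + k * math.log(m) - math.lgamma(k + 1)
        x = math.sqrt((k + 1) / m)
        diff = 1 / math.sqrt(2) - math.sin(math.pi / 4 * x) / x
        total += math.exp(log_weight) * diff ** 2
    return math.sqrt(total)

for m in (1, 100, 10000):
    print(m, molecule_error_norm(m))
```

Under these assumptions the printed norm shrinks as \(m\) grows, in line with the convergence claimed above; the sketch does not replace the proof in the appendix.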

However, one can also consider the post-\(U_3\) state; again, asymptotically and ignoring the error term we have (as given in [37])

$$\begin{aligned} U_3U_2U_1 |A\rangle \otimes |\beta \rangle \cong \left[ \sin \left( \frac{\phi }{2} \right) |A\rangle - e^{i \theta } \cos \left( \frac{\phi }{2}\right) |M\rangle \right] \otimes |\beta \rangle . \end{aligned}$$
(111)

The interpretation given in [37] is that since one can apply \(\tau _{\mathcal {T}*}\) at every stage (under the approximation) and still achieve atom/molecule probabilities of \(\sin ^2 (\phi /2)\) and \(\cos ^2 (\phi /2)\) respectively, a coherent superposition of an atom and a molecule has been observed (Footnote 8).
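The quoted probabilities can be reproduced on the system level alone by a short sketch of the pulse, phase, pulse sequence. The 2x2 pulse matrix below is an assumed effective form, chosen only so that \(|A\rangle \) is sent to the leading term of Eq. (109); it is not taken from [37]. Only the probabilities are compared with Eq. (111), not the signs of the amplitudes.

```python
import cmath
import math

def ramsey_probabilities(phi, theta=0.0):
    """Atom/molecule probabilities after pulse, phase, pulse on an effective two-level system."""
    s = 1 / math.sqrt(2)
    # Assumed effective pi/2 pulse in the basis (|A>, |M>), chosen so that
    # |A> -> (|A> - i e^{i theta}|M>)/sqrt(2), the leading term of Eq. (109).
    pulse = [[s, -1j * cmath.exp(-1j * theta) * s],
             [-1j * cmath.exp(1j * theta) * s, s]]

    def apply(u, v):
        return [u[0][0] * v[0] + u[0][1] * v[1],
                u[1][0] * v[0] + u[1][1] * v[1]]

    state = apply(pulse, [1.0, 0.0])                     # first pulse acting on |A>
    state = [state[0], cmath.exp(1j * phi) * state[1]]   # free evolution: phase on |M>
    state = apply(pulse, state)                          # second pulse (assumed equal to the first)
    return abs(state[0]) ** 2, abs(state[1]) ** 2

p_atom, p_molecule = ramsey_probabilities(phi=0.7, theta=1.3)
```

With this choice the atom and molecule probabilities come out as \(\sin ^2(\phi /2)\) and \(\cos ^2(\phi /2)\), independently of \(\theta \).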

In view of the work we have presented, along with the argument of WWW [9], we do not agree with this view. Given the problems with taking the limit (violation of the conservation law, non-existence of limit for states, unphysical nature of such a limit), we believe that the limit should not be taken in considering the fundamental status of these experiments. As such, the analysis of WWW holds, and absolute coherence cannot be observed for atom-molecule “superpositions”. What is instead observed is mutual coherence, and the observability of the interference effects as given by (for example) \(\sin ^2 (\phi /2)\) only demonstrates the feasibility of measuring relative phase factors within a sector, and the phase \(\phi /2\) should be viewed as precisely this. The large reference system, which provides high reference phase localisation, again provides the appearance of a relative phase factor at the level of the system only.

Therefore, we return once more to the main point: absolute quantities are not measurable, but represent measurable, relative quantities, with good approximation coming with good localisation (suitably, relationally, interpreted). We conclude this section with a final analysis of the two views concerning the observability of “forbidden” superpositions.

11.3.1 Analysis of the Opposing Standpoints

Following, for example, the prescription given in Sect. 10.3, it is possible to follow WWW’s three-step sequence to the letter:

  1. Compose: \(|\varPsi _0\rangle \langle \varPsi _0| = |0\rangle \langle 0|\otimes \sum |c_n|^2 |n\rangle \langle n|\);

  2. Evolve: \(|\varPsi _0\rangle \langle \varPsi _0|\) evolves according to the charge-conserving unitary defined in (78) and (79) yielding \(\tau _{\mathcal {T}*}(|\varPsi _f\rangle \langle \varPsi _f|) =\sum _n\left| c_n \right| ^2\,P_{\frac{1}{\sqrt{2}}\left( |0,n\rangle +e^{i\theta }|1,n-1\rangle \right) }\) (Eq. (82));

  3. Separate: (on the two-dimensional subspace spanned by \(\{|0\rangle , |1\rangle \}\)).

On this basis, it is clear that there can never be interference observed between \(|0\rangle \) and \(|1\rangle \) under the processes outlined by WWW.
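The absence of interference in the three steps above can be illustrated with a small sketch. It assumes the final state has the form \(\varPsi _f = \sum _n c_n \left( |0,n\rangle + e^{i\theta }|1,n-1\rangle \right) /\sqrt{2}\), read off from Eq. (82), and it models \(\tau _{\mathcal {T}*}\) as the removal of all coherences between different total charge. The function names and the choice of \(c_n\) are hypothetical.

```python
import cmath
import math

def final_state(c, theta):
    """Psi_f = sum_n c_n (|0,n> + e^{i theta}|1,n-1>)/sqrt(2), keyed by (system, reference)."""
    psi = {}
    for n, cn in c.items():
        psi[(0, n)] = cn / math.sqrt(2)
        psi[(1, n - 1)] = cn * cmath.exp(1j * theta) / math.sqrt(2)
    return psi

def twirl(psi):
    """Model of tau_{T*}: keep only matrix elements within a fixed total charge s + n."""
    return {(a, b): psi[a] * psi[b].conjugate()
            for a in psi for b in psi if sum(a) == sum(b)}

def reduce_to_system(rho):
    """Partial trace over the reference; returns the 2x2 system matrix."""
    out = [[0j, 0j], [0j, 0j]]
    for ((s, n), (t, k)), value in rho.items():
        if n == k:
            out[s][t] += value
    return out

c = {n: 1 / math.sqrt(3) for n in (1, 2, 3)}   # illustrative coefficients
for theta in (0.0, 1.0, 2.5):
    reduced = reduce_to_system(twirl(final_state(c, theta)))
    print(theta, reduced)
```

In this sketch the off-diagonal entries of the reduced state vanish for every \(\theta \), because a coherence between \(|0\rangle \) and \(|1\rangle \) within a fixed charge sector always pairs different reference number states.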

On the other hand, as described in Sect. 10.6, we may prepare the state \(|0\rangle \langle 0|\otimes \tau _{\mathcal {R}*}P[\varPsi _0]\), with \(\varPsi _0 = \sum _{n}c_n |n\rangle \), choosing \(c_n = \frac{e^{in\theta ^{\prime }}}{\sqrt{2j+1}}\) for \(|n|\le j\) and 0 otherwise. Then, for finite j, there exists an invariant \(A \in \mathcal {L}(\mathcal {H}_{\mathcal {T}})\) so that \(\text {tr}\left[ A \tau _{\mathcal {T}*}P[\varPsi _f]\right] \) depends on \(\theta \). This \(\varPsi _f\), as j becomes arbitrarily large, becomes arbitrarily close to the product state

$$\begin{aligned} \frac{1}{\sqrt{2}}\bigl (|0\rangle + e^{i(\theta + \theta ')} |1\rangle \bigr ) |\theta ' _j\rangle . \end{aligned}$$
(112)

Then employing relation (11), the statistics of an invariant quantity in \(\tau _{\mathcal {T}*}(P_{\varPsi _f})\) are identical to the statistics in \(\varPsi _f\). One finds that, for example, \(\left\langle \,\varPsi _f\,{|}\,(\varTheta - \varTheta _{\mathcal {R}}) \varPsi _f\,\right\rangle \) gives rise to statistics which are sensitive to the relative phase \(e^{i(\theta + \theta ^{\prime })}\). With \(\theta ^{\prime } = 0\), one finds that \(\left\langle \,\varPsi _f\,{|}\,(\varTheta - \varTheta _{\mathcal {R}}) \varPsi _f\,\right\rangle =\left\langle \,\varphi _{\ell }\,{|}\,\varTheta \varphi _{\ell }\,\right\rangle \) with \(\varphi _{\ell }:=\bigl (|0\rangle + e^{i \theta }|1\rangle \bigr )/\sqrt{2}\). Thus it appears as though one has measured an absolute observable in a superposition state.
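The approach of \(\varPsi _f\) to the product state (112) can be checked numerically. The sketch below assumes \(\varPsi _f = \sum _n c_n \left( |0,n\rangle + e^{i\theta }|1,n-1\rangle \right) /\sqrt{2}\) (the form suggested by Eq. (82)) and takes \(|\theta '_j\rangle = \sum _{|n|\le j} c_n |n\rangle \) with the \(c_n\) given above; both identifications are assumptions made for illustration.

```python
import cmath
import math

def overlap_with_product_state(j, theta, theta_prime):
    """|<product state (112) | Psi_f>| for c_n = e^{i n theta'} / sqrt(2j+1), |n| <= j."""
    norm = math.sqrt(2 * j + 1)
    c = {n: cmath.exp(1j * n * theta_prime) / norm for n in range(-j, j + 1)}
    # Assumed final state: Psi_f = sum_n c_n (|0,n> + e^{i theta}|1,n-1>)/sqrt(2)
    psi_f = {}
    for n, cn in c.items():
        psi_f[(0, n)] = cn / math.sqrt(2)
        psi_f[(1, n - 1)] = cn * cmath.exp(1j * theta) / math.sqrt(2)
    # Product state (|0> + e^{i(theta + theta')}|1>)/sqrt(2) (x) |theta'_j>
    product = {}
    for n, cn in c.items():
        product[(0, n)] = cn / math.sqrt(2)
        product[(1, n)] = cn * cmath.exp(1j * (theta + theta_prime)) / math.sqrt(2)
    inner = sum(product[key].conjugate() * amp
                for key, amp in psi_f.items() if key in product)
    return abs(inner)

for j in (1, 10, 100):
    print(j, overlap_with_product_state(j, theta=0.9, theta_prime=0.3))
```

Under these assumptions the overlap equals \(1 - 1/(4j+2)\), so it tends to 1 as j grows but is strictly less than 1 for every finite j.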

In order to attempt to avoid the appearance of measuring an absolute quantity, the second unitary (e.g., that introduced in (86)) allows, on the system level and “once the limit has been taken”, for something like this to occur:

$$\begin{aligned} \frac{1}{\sqrt{2}}\bigl (|0\rangle + e^{i\theta }|1\rangle \bigr ) \mapsto \cos {\left( \frac{\theta }{2}\right) }|0\rangle -i\sin {\left( \frac{\theta }{2}\right) }|1\rangle . \end{aligned}$$
(113)

Then the observable (e.g.) \(|0\rangle \langle 0|\) can be measured and a \(\theta \)-dependent probability distribution achieved.

The upshot is that both WWW and AS/DBRS make arguments which hold up (once the errors have been remedied). The former show, quite correctly, that strictly speaking, only (absolute) coherence begets (absolute) coherence, and if you don’t have it, you’ll never get it, as one would expect. The latter “camp”, in their attempt to show the positive possibility of creating absolute coherence from states without it, actually show the possibility of well-approximating absolute quantities and states with absolute coherence by relative quantities and states without absolute coherence. The crucial ingredients for such an approximation are mutual coherence and high localisation.

11.4 Further Analysis: Superselection Reconsidered

Bartlett et al. [7] argue that a superselection rule may be “lifted”, that is (we think), the following holds. A superselection rule applies to some system \(\mathcal {S}\). A reference frame \(\mathcal {R}\) may be included and the superselection rule applied to \(\mathcal {S}+ \mathcal {R}\), whose statistics then exactly give those of \(\mathcal {S}\) as if there weren’t a superselection rule for \(\mathcal {S}\). This is taken as proof that “superselection rules cannot provide any fundamental restrictions on quantum theory” since, they argue, an SSR is simply a lack of an appropriate frame, which can always be introduced (Footnote 9).

We do not endorse this view. First, the analysis preceding the above quote in [7] is mathematically flawed. Second, the reason given for the (e.g., photon number) superselection rule is a practical one: agents may not share a classical phase reference. Finally, as we have noted, if the analysis is done rigorously, one sees that the “superselection-violating statistics” of \(\mathcal {S}\) can be achieved only when there is a localised/absolutely coherent state for \(\mathcal {R}\), which just shifts the problem of absolute coherence from \(\mathcal {S}\) to \(\mathcal {R}\). Only through the mutual coherence concept can this circularity be avoided. The question, then, is whether mutually coherent states exist in all given situations, i.e., for all phase-like quantities.

In more concrete terms, we have seen that, through the \(\yen \) construction, absolute quantities and absolutely coherent states can arbitrarily well approximate the statistics of a relational quantity in an invariant state, contingent on a highly localised reference state. We view this statistical equivalence not as “lifting” a superselection rule in order to show that it can be violated for \(\mathcal {S}\), but rather as an expression of the fact that the ordinary usage of quantum mechanics, with its absolute quantities and absolutely coherent states, captures to a very good degree the true, physical situation represented by invariant quantities of system plus reference, in line with fundamental symmetry requirements.

The situation of “lacking a phase reference”, in our conception, pertains not to the lack of shared knowledge of physicists, but to the physical scenario in which the physical system being used as a reference is completely delocalised with respect to phase, for instance, if it is a number state. This gives rise to a “reduced” description in which the structure of a superselection rule must be enforced. Whether such a reduced description afforded by absolute quantities and absolutely coherent states does yield what is observed in any given situation is an empirical question. It seems, to us, that there may be situations in which they do not, in which case a “superselection rule” stronger than that mooted for photon number could be in force. For example, there may be physical situations in which it is impossible for mutually coherent states to arise from unitary evolution of absolutely incoherent product states, making the approximation of relative quantities by absolute ones impossible. A “strong” conservation law, as presented in Sect. 9.2 for instance, would have this effect.

Finally, superselection rules, as they arise in quantum field theory, correspond to inequivalent representations of the algebra of observables (possible only for systems with infinitely many degrees of freedom, themselves also suspicious according to Earman and Butterfield) and are entirely different in nature, it would seem, from the kind of constraint arising from the non-observability of absolute quantities. The connection of these with the superselection rules we have discussed in this manuscript remains a task for the future.

We conclude this section with a note of caution about the possibility of “lifting” a superselection rule arising from the indistinguishability of quantum particles.

11.4.1 A Cautionary Note

In order to urge a degree of circumspection regarding the idea that reference frames can be used to overcome superselection rules in general, we discuss now an example based on the indistinguishable particle superselection rule in which the physical meaning of a reference frame is unclear.

Consider a tensor product space \(L^2(\mathbb {R})\otimes L^2(\mathbb {R})\) with the action of \(\mathbb {Z}_2\) which exchanges particle numbering, i.e., \(U(a)\varPsi (x_1,x_2)=\varPsi (x_2,x_1)\) (a is the non-identity element). Indistinguishability requires that any observable A satisfies \([A,U(a)]=0\) (cf. [56]). Adjoin another Hilbert space \(\mathbb {C}^2\) with projectors \(P{ \left( {\begin{matrix}1\\ 0\end{matrix}}\right) }\) and \(P{ \left( {\begin{matrix}0\\ 1\end{matrix}}\right) }\), and with \(\mathbb {Z}_2\) action \(U^{\prime }(a)P{ \left( {\begin{matrix}1\\ 0\end{matrix}}\right) } = P{ \left( {\begin{matrix}0\\ 1\end{matrix}}\right) }\).

Then by demanding invariance of observables only at the level of \(\mathcal {H}_1 \otimes \mathcal {H}_2 \otimes \mathbb {C}^2\) one can take an arbitrary \(A \otimes B \in \mathcal {L}(\mathcal {H}_1 \otimes \mathcal {H}_2)\) and see that

$$\begin{aligned} A \otimes B \otimes P{ \left( {\begin{matrix}1\\ 0\end{matrix}}\right) }+ B \otimes A \otimes P{ \left( {\begin{matrix}0\\ 1\end{matrix}}\right) } \end{aligned}$$
(114)

defines an invariant quantity (observable). Indeed, this is \(\yen (A \otimes B)\) for this (finite) group. Then,

$$\begin{aligned} \left\langle \,\varphi \otimes \phi \,{|}\,\left( A \otimes B \otimes P{ \left( {\begin{matrix}1\\ 0\end{matrix}}\right) }+B \otimes A \otimes P{ \left( {\begin{matrix}0\\ 1\end{matrix}}\right) } \right) \varphi \otimes \phi \,\right\rangle = \left\langle \,\varphi \,{|}\,A \otimes B \varphi \,\right\rangle \end{aligned}$$
(115)

for all \(\varphi \), with \(\phi \) the ‘phase-localised’ state \(\phi = \left( {\begin{matrix}1\\ 0\end{matrix}}\right) \). Therefore one can introduce a reference system in order to “measure” particle labelling. In the BRS language, the corresponding SSR has been “lifted”. However, such a “reference frame” provided by the \(\mathbb {C}^2\) system appears highly artificial and there is a question of whether it makes any physical sense.
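Equations (114) and (115) can be verified in a finite-dimensional toy model. The sketch below replaces each copy of \(L^2(\mathbb {R})\) by \(\mathbb {C}^2\) (an assumption made purely so that the operators become small matrices), draws random Hermitian A and B, and checks both the invariance of (114) under the combined exchange and the equality of expectation values in (115). All helper names are hypothetical.

```python
import random

def kron(x, y):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in rx for b in ry] for rx in x for ry in y]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def max_abs_diff(x, y):
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))

def random_hermitian(d, rng):
    m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)] for _ in range(d)]
    return [[(m[i][j] + m[j][i].conjugate()) / 2 for j in range(d)] for i in range(d)]

def expectation(op, vec):
    """<vec | op vec> for a normalised vector."""
    size = len(vec)
    return sum(vec[i].conjugate() * op[i][j] * vec[j]
               for i in range(size) for j in range(size))

d = 2                                  # finite-dimensional stand-in for L^2(R)
rng = random.Random(0)
A, B = random_hermitian(d, rng), random_hermitian(d, rng)
P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]
flip = [[0, 1], [1, 0]]                # U'(a) on C^2: exchanges the two projectors
swap = [[1 if (i // d, i % d) == (j % d, j // d) else 0 for j in range(d * d)]
        for i in range(d * d)]         # U(a): exchange of the two particle labels

relational = add(kron(kron(A, B), P0), kron(kron(B, A), P1))    # Eq. (114)
U = kron(swap, flip)                   # combined Z_2 action (real, symmetric, self-inverse)
invariance_defect = max_abs_diff(matmul(matmul(U, relational), U), relational)

phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d * d)]
norm = sum(abs(a) ** 2 for a in phi) ** 0.5
phi = [a / norm for a in phi]
ref = [1, 0]                           # the 'localised' reference state
joint = [a * b for a in phi for b in ref]
lhs = expectation(relational, joint)   # left-hand side of Eq. (115)
rhs = expectation(kron(A, B), phi)     # right-hand side of Eq. (115)
```

In this toy model `invariance_defect` vanishes and `lhs` agrees with `rhs` up to rounding, which shows that the formal construction goes through; it says nothing about whether the \(\mathbb {C}^2\) system has physical meaning.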

11.5 Reality of Optical Coherence

In [48], Mølmer claimed that the representation of laser light using coherent states, i.e., states of the form

$$\begin{aligned} |\beta \rangle := e^{\frac{-\left| \beta \right| ^2}{2}}\sum _{n=0}^{\infty } \frac{\beta ^n}{\sqrt{n!}}|n\rangle , \end{aligned}$$
(116)

while being legitimate for the purposes of calculation, does not reflect the true state of affairs. Rather, he claimed, an analysis of the internal workings of laser light production in a physical system shows that the “actual” state is (in our notation) \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\), and that (the coherence of) \(|\beta \rangle \) is nothing more than a ‘convenient fiction’.

The ensuing controversy is well described in [12] (see also references therein), where a fictional dialogue is presented between hypothetical physicists representing two groups with contrasting views: those who believe in the “fact” of optical coherence, and those who view it as fictional. Given the nature of the problem (of the reality of laser coherence), we may re-visit the controversy and provide a perspective based on the formal framework developed here (see also [5]).

The issue is whether \(|\beta \rangle \) and \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\) of some laser system \(\mathcal {S}\) can be empirically distinguished, given that no invariant quantity of \(\mathcal {S}\) can tell \(|\beta \rangle \) from \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\). As we have seen, however, non-invariant quantities of \(\mathcal {S}\) can be used to represent invariant quantities of \(\mathcal {S}+ \mathcal {R}\), contingent on a suitable state of \(\mathcal {R}\). The question then is whether there is a feasible physical experiment in which \(|\beta \rangle \) and \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\), in their role as representing invariant states of \(\mathcal {S}+ \mathcal {R}\), give rise to differing physical predictions.
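The indistinguishability of \(|\beta \rangle \) and \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\) by invariant quantities of \(\mathcal {S}\) can be illustrated in a truncated Fock space. The sketch below assumes that \(\tau {_{\mathcal {S}}}_*\) is the average over the phase shifts generated by the number operator, so that it retains only the number-diagonal part of a state; the cutoff, the value of \(\beta \) and the helper names are illustrative choices.

```python
import cmath
import math

def coherent_amplitudes(beta, cutoff):
    """Number-basis amplitudes e^{-|beta|^2/2} beta^n / sqrt(n!) for n < cutoff."""
    amps, term = [], cmath.exp(-abs(beta) ** 2 / 2)
    for n in range(cutoff):
        amps.append(term)
        term = term * beta / math.sqrt(n + 1)
    return amps

def density(amps):
    return [[a * b.conjugate() for b in amps] for a in amps]

def phase_twirl(rho):
    """Assumed form of tau_{S*}: only the number-diagonal part survives the phase average."""
    size = len(rho)
    return [[rho[i][j] if i == j else 0 for j in range(size)] for i in range(size)]

def expect(op, rho):
    """tr[op rho]."""
    size = len(rho)
    return sum(op[i][j] * rho[j][i] for i in range(size) for j in range(size))

cutoff = 40
rho = density(coherent_amplitudes(1.5 * cmath.exp(0.4j), cutoff))
rho_twirled = phase_twirl(rho)

# Invariant quantity: the number operator (diagonal in the number basis)
number = [[n if n == k else 0 for k in range(cutoff)] for n in range(cutoff)]
# Non-invariant quantity: a + a^dagger, with <n|a|n+1> = sqrt(n+1)
quadrature = [[math.sqrt(max(n, k)) if abs(n - k) == 1 else 0
               for k in range(cutoff)] for n in range(cutoff)]

print(expect(number, rho), expect(number, rho_twirled))
print(expect(quadrature, rho), expect(quadrature, rho_twirled))
```

The number operator gives the same expectation value in both states, whereas the non-invariant quadrature separates them; this is the sense in which a distinction requires either a non-invariant quantity of \(\mathcal {S}\) or an invariant one of \(\mathcal {S}+ \mathcal {R}\).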

An absolute phase observable \(\mathsf {F}^{\mathcal {S}}\) of \(\mathcal {S}\) (in particular, the canonical phase) is mathematically suitable for separating \(|\beta \rangle \) from \(\tau {_{\mathcal {S}}}_*(P[|\beta \rangle ])\). We may choose also a canonical phase for \(\mathcal {R}\), and use \(\yen \) to construct the relative phase observable \(\mathsf {F}^{\mathcal {T}}=\yen \circ \mathsf {F}^{\mathcal {S}}\). Fixing a sequence \((\beta ^{\mathcal {R}}_i) \subset \mathcal {H}_{\mathcal {R}}\) of coherent states with the property of becoming increasingly well localised at 0 as i becomes large, we then find that

$$\begin{aligned} \left\langle \,\beta \,{|}\,\mathsf {F}^{\mathcal {S}}(X)\beta \,\right\rangle&= \lim _{i \rightarrow \infty }\left\langle \,\beta \otimes \beta ^{\mathcal {R}}_i\,{|}\,(\yen \circ \mathsf {F}^{\mathcal {S}})(X)\beta \otimes \beta ^{\mathcal {R}}_i\,\right\rangle \\&\nonumber = \lim _{i \rightarrow \infty } \left\langle \,\beta \,{|}\, \varGamma _{\beta ^{\mathcal {R}}_i}\circ \yen \circ \mathsf {F}^{\mathcal {S}}(X) \beta \,\right\rangle \\&\nonumber = \lim _{i \rightarrow \infty }\text {tr}\left[ \mathsf {F}^{\mathcal {T}}(X)\tau _{\mathcal {T}*}(P[\beta \otimes \beta ^{\mathcal {R}}_i])\right] \end{aligned}$$
(117)

for each \(X \in \mathcal {B}(S^1)\).

From an absolute point of view, absolute coherence (of \(\beta ^{\mathcal {R}}_i\) for large i) is required to witness absolute coherence of \(|\beta \rangle \). From a relational point of view, all that is required (for good approximation of the right hand side by the left) is mutual coherence of the pair \((|\beta \rangle , |\beta _i^{\mathcal {R}}\rangle )\). The final line of Eq. (117) shows that the limit can be taken using only invariant states of \(\mathcal {S}+ \mathcal {R}\), and that an absolute phase with an absolutely coherent (coherent) state captures the statistics to arbitrarily good approximation.

Given that absolute phase observables \(\mathsf {F}^{\mathcal {S}}\) can be reconstructed in homodyne detection experiments (e.g. [57]), with the reference state/local oscillator given as a high-amplitude coherent state, we conclude that laser light is mutually coherent. In the high amplitude limit, the mutual coherence takes on the appearance of absolute coherence for \(|\beta \rangle \). We therefore have a resolution of the puzzle of optical coherence through the application of the ‘observables are invariants’ principle and the concept of mutual coherence.

12 Summary and Conclusion

The thesis of this paper is that observable quantities are invariant under symmetry and that, in quantum mechanical laboratory experiments, the measured statistics pertain not to some absolute quantity, but rather to an observable, relative quantity, corresponding to the system and apparatus combined, along with the appropriate high localisation limit on the side of the apparatus. This is quite general, and not specific to any particular absolute quantity, though in this paper special attention has been given to phase, angle and position.

Through our relativisation procedure, we have shown that absolute quantities with absolutely coherent states provide a good account of the observable, relative quantities (with absolutely incoherent states) under high reference localisation. In this sense, the incorporation of a reference frame into the physical description makes it look “as though” symmetry-violating statistics exist for a subsystem. However, since we argue that the description afforded by subsystem quantities is theoretical shorthand for the relative description, we do not believe it is consistent to argue that symmetry may be violated by the introduction of a reference frame. Indeed, it is the introduction of such a frame that makes symmetry explicit; some quantities simply require two systems for definition, and one of these may be called a reference frame.

Therefore, we agree with prominent physicists (Aharonov/Susskind, Bartlett/Spekkens/Rudolph) that quantum states refer not only to systems to which they symbolically refer (i.e., the system under investigation), but also to external physical objects which are not explicitly part of the theoretical description. We have shown that complete reference phase delocalisation gives rise to a reduced description formally identical to one in which a superselection rule is present, giving a new interpretation of the phrase “lack of a phase reference implies a photon number superselection rule”. The idea that such a rule may be “lifted” [7], as we understand it, corresponds to the observation that a superselection rule may be applied to system-plus-reference, in which case, under reference localisation, it appears as though a superselection rule is not applicable to the system. We believe that, since the “reduced” description is not a full account of the state of affairs, it is not correct to conclude that superselection-rule-“violating” superpositions can be produced or measured; such a conclusion would imply that absolute quantities can be measured.

An important question, however, is whether, in all mooted instances of superselection rules, a reference frame may exist which makes it look like the superselection rule can be lifted or overcome. It is empirically the case that for photon number, such a frame does exist. Mutually coherent pairs of systems exist in this case, making absolute phases and coherent states a suitable shorthand description for the true, relative description, with the associated relative phase observable. On the other hand, a reference frame for lifting a superselection rule corresponding to indistinguishability appears highly suspect. As far as we know, it has yet to be settled in a laboratory whether absolute phases conjugate to charge provide an empirically adequate account.