Abstract
A novel framework for the analysis of observation statistics on time discrete linear evolutions in Banach space is presented. The model differs from traditional models for stochastic processes and, in particular, clearly distinguishes between the deterministic evolution of a system and the stochastic nature of observations on the evolving system. General Markov chains are defined in this context and it is shown how typical traditional models of classical or quantum random walks and Markov processes fit into the framework and how a theory of quantum statistics (sensu Barndorff-Nielsen, Gill and Jupp) may be developed from it. The framework permits a general theory of joint observability of two or more observation variables which may be viewed as an extension of the Heisenberg uncertainty principle and, in particular, offers a novel mathematical perspective on the violation of Bell’s inequalities in quantum models. Main results include a general sampling theorem relative to Riesz evolution operators in the spirit of von Neumann’s mean ergodic theorem for normal operators in Hilbert space.
References
Aharonov D, Ambainis A, Kempe J, Vazirani U (2001) Quantum walks on graphs. In: Proc. 33rd STOC, ACM, New York, pp 60–69
Aspect A, Dalibard J, Roger G (1982) Experimental tests of Bell’s inequalities using time-varying analyzers. Phys Rev Lett 49:1804
Barndorff-Nielsen OE, Gill RD, Jupp PE (2003) On quantum statistical inference. J R Stat Soc B 65:775–816
Bell JS (1964) On the Einstein Podolsky Rosen paradox. Physics 1:195–200
Bell JS (1966) On the problem of hidden variables in quantum mechanics. Rev Mod Phys 38:447–452
Choi SPM, Yeung DY, Zhang NL (2000) Hidden-Markov decision processes for nonstationary sequential decision making. In: Sun R, Giles CL (eds) Sequence learning. Lecture notes in artificial intelligence, vol 1828. Springer, Berlin, pp 264–287
Conway JB (1990) A course in functional analysis. Graduate texts in mathematics, vol 96, 2nd edn. Springer, New York
Dharmadhikari SW (1965) A characterization of a class of functions of finite Markov chains. Ann Math Stat 36:524–528
Dowson HR (1978) Spectral theory of linear operators. Academic Press, London
Elliott RJ, Aggoun L, Moore JB (1995) Hidden Markov models. Springer, Berlin
Faigle U, Grabisch M (2012) Values for Markovian coalition processes. Econ Theory 51:505–538
Faigle U, Kern W (1991) Note on the convergence of simulated annealing algorithms. SIAM J Control Optim 29:153–159
Faigle U, Schönhuth A (2007) Asymptotic mean stationarity of sources with finite evolution dimension. IEEE Trans Inf Theory 53:2342–2348
Faigle U, Schönhuth A (2011) Efficient tests for equivalence of hidden Markov processes and quantum random walks. IEEE Trans Inf Theory 57:1746–1753
Feller W (1971) An introduction to probability theory and its applications II. Wiley, New York
Grabisch M (2016) Set functions, games and capacities in decision making. Springer, Berlin (ISBN 978-3-319-30690-2)
Gilbert EJ (1959) On the identifiability problem for functions of finite Markov chains. Ann Math Stat 30:688–697
Gudder S (2008) Quantum Markov chains. J Math Phys 49:072105
Heller A (1965) On stochastic processes derived from Markov chains. Ann Math Stat 36:1286–1291
Hernández-Lerma O, Lasserre JB (2003) Markov chains and invariant probabilities theory. Birkhäuser, Basel
Ito H, Amari S-I, Kobayashi K (1992) Identifiability of hidden Markov information sources and their minimum degrees of freedom. IEEE Trans Inf Theory 38:324–333
Jaeger H (2000) Observable operator models for discrete stochastic time series. Neural Comput 12:1371–1398
Kempe J (2003) Quantum random walks: an introductory overview. Contemp Phys 44:307–327
Kirkpatrick S, Gelatt CD Jr, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
Markoff AA (1912) Wahrscheinlichkeitsrechnung (translation of the 2nd Russian edition). B. G. Teubner, Leipzig
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087
Nielsen M, Chuang I (2000) Quantum computation and quantum information. Cambridge University Press, Cambridge
Portugal R, Santos RAM, Fernandes TD, Goncalves DN (2015) The staggered quantum walk model. Quant Inf Process. arXiv:1505.04761
Szegedy M (2004) Quantum speed-up of Markov chain based algorithms. In: Proceedings 45th Symposium on Foundations of Computer Science, pp 32–41
Temme K, Osborne TJ, Vollbrecht KG, Verstraete F (2011) Quantum metropolis sampling. Nature 471:87–90
Vidyasagar M (2011) The complete realization problem for hidden Markov models: a survey and some new results. Math Control Signals Syst 23:1–65
Acknowledgements
The authors are grateful to the reviewers for their careful reading of the manuscript and their comments, which have improved the presentation.
Appendix: Proofs
For fundamental notions and facts on linear operators, we refer to standard texts (see footnote 6).
1.1 Proof of Theorem 2
Recall the spectral representation for a continuous normal operator in its multiplication form:
Theorem 5
Let \(\psi\) be a continuous operator on a complex Hilbert space \({{\mathcal {H}}}\) such that \(\psi ^{*}\psi =\psi \psi ^{*}\). Then there exists a measure space \((\Omega ,\Sigma ,\mu )\), an essentially bounded measurable function \(g:\Omega \rightarrow {{\mathbb {C}}}\) and a unitary operator \(U:{{\mathcal {H}}}\rightarrow {{\mathcal {L}}}_{\mu } ^{2}(\Omega )\) such that
\[\psi = U^{-1} M_{g}\, U,\]
where M is the multiplication operator \(M_{g}f=f\cdot g\). Moreover,
By Theorem 5, we can assume w.l.o.g. that \({{\mathcal {H}}}={{\mathcal {L}}}_{\mu }^{2}(\Omega )\) and that \(\psi\) is given as multiplication by a bounded measurable function g.
The stability of the evolution \(\Psi =(\psi ,s)\) implies that there is a constant M so that \(|g^n(\omega )f(\omega )| \le M\) holds almost everywhere for all positive integers n. It follows that a.e., \(|g(\omega )| \le 1\) or \(f(\omega ) = 0\). Hence there is a measurable function \(g_1\) so that a.e., \(|g_1(\omega )|\le 1\) and \(g^n(\omega )f(\omega ) = g_1^n(\omega )f(\omega ).\) Setting
we therefore conclude
The sequence \((\overline{\psi }_{t}(f)(\omega ))_{t \ge 0}\) converges to
If \(g_1(\omega )=1\), then \(\overline{\psi }_{t}(f)(\omega )=f(\omega )=\pi (f)\left( \omega \right)\), and so
On the set \(\{\omega \in \Omega \mid g(\omega )\ne 1\}\), the functions \(|\overline{\psi }_{t}(f)(\omega )|^{2}\) converge pointwise to 0 and are bounded by the integrable function \(|f(\omega )|^{2}\). The theorem of dominated convergence thus yields
Clearly, \(\pi (f)\) is the orthogonal projection of f onto the eigenspace of \(\lambda =1\). So stability is sufficient for mean ergodicity.
To see that stability is necessary for mean ergodicity, assume that
We would like to show that \(\left| g\left( x\right) \right| \le 1\) holds a.e. on the set \(\left\{ x :f\left( x\right) \ne 0\right\}\).
Assume that the set
has positive measure. Then for some integer \(r>1\) the set
has positive measure and for any \(x \in M_{r}\), we have
Observing
we thus conclude
and hence
Since \(\int _{M_{r}}\left| f\left( x\right) \right| d\mu>0\) and \(\lim _{n \rightarrow \infty }\frac{\left( 1 +\frac{1}{r}\right) ^{n}-1}{n\left( 1 +r\right) } =\infty ,\) it follows that \(\left\| \frac{1}{n}\sum _{k =0}^{n -1}T^{k}\left( f\right) \right\| _{2}\) is unbounded. So the sequence \(\left( \frac{1}{n}\sum _{k =0}^{n -1}T^{k}\left( f\right) \right) _{n>0}\) cannot converge, which contradicts the assumption. Consequently, \(\left| g\left( x\right) ^{n}f\left( x\right) \right| \le \left| f\left( x\right) \right|\) holds a.e. and hence \(\left\| T^{n}\left( f\right) \right\| _{2} \le \left\| f\right\| _{2}\) holds for all n. \(\square\)
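The pointwise behaviour used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it replaces the measure space by four sample points, each with its own multiplier value, and compares the averages \(\frac{1}{n}\sum _{k=0}^{n-1}g^{k}f\) with the projection onto the set where \(g=1\). The names `cesaro_average`, `projected`, `multipliers` and `f_values`, as well as the chosen numbers, are illustrative assumptions.

```python
import cmath


def cesaro_average(g: complex, f: complex, n: int) -> complex:
    """Return (1/n) * sum_{k=0}^{n-1} g**k * f for a single multiplier value g."""
    total = 0j
    power = 1 + 0j  # g**0
    for _ in range(n):
        total += power * f
        power *= g
    return total / n


def projected(g: complex, f: complex) -> complex:
    """Pointwise limit for |g| <= 1: keep f where g equals 1, and 0 elsewhere."""
    return f if g == 1 else 0j


# Hypothetical discrete stand-in for a multiplication operator:
# one multiplier value and one function value per sample point.
multipliers = [1 + 0j, cmath.exp(0.7j), -1 + 0j, 0.5 + 0j]
f_values = [2.0 + 0j, 1.0 + 0j, 3.0 + 0j, -1.0 + 0j]

averages = [cesaro_average(g, f, 20000) for g, f in zip(multipliers, f_values)]
limits = [projected(g, f) for g, f in zip(multipliers, f_values)]

# A multiplier of modulus larger than 1 makes the averages blow up.
unstable = abs(cesaro_average(1.5 + 0j, 1.0 + 0j, 200))
```

For the multipliers of modulus at most 1, the averages approach the listed limits, while the value with \(|g|>1\) grows geometrically, in line with the necessity part of the proof.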
1.2 Proof of Theorems 1 and 3
Throughout this section, let U be a fixed Banach space with a fixed element \(s \in U\). We further fix an operator \(T : U \rightarrow U\) that is bounded on \(U_s = \text{ lin }\{T^n s : n\ge 0 \}\) and denote by \(\hat{T}\) its (bounded) extension to \(\hat{U}_s\). Without loss of generality, we can therefore assume \(\hat{U}_s = U\) and \(\hat{T} = T\). \(\sigma (T)\) denotes the spectrum of T. The spectral radius of T is
\[r(T) = \sup \{|\lambda | : \lambda \in \sigma (T)\} = \lim _{n\rightarrow \infty }\Vert T^{n}\Vert ^{1/n}.\]
Lemma 3
If T is Riesz, then \(\sigma _1(T) =\{\lambda \in \sigma (T)\mid |\lambda | \ge 1\}\) is a finite set.
Proof
\(\sigma _{\varepsilon ,\delta }(T) = \{\lambda \in \sigma (T) \mid \varepsilon \le |\lambda |\le \delta \}\) is a bounded subset of \({\mathbb C}\) for any \(\varepsilon ,\delta \ge 0\). If T is Riesz, 0 is the only possible accumulation point of \(\sigma (T)\). Since \(\sigma (T)\) is closed, an infinite set \(\sigma _{\varepsilon ,\delta }(T)\) with \(\varepsilon>0\) would have an accumulation point of modulus at least \(\varepsilon\) in \(\sigma (T)\), which is impossible. So \(\sigma _{\varepsilon ,\delta }(T)\) must be finite for every \(\varepsilon>0\). Since T is bounded, \(\sigma _1(T) = \sigma _{\varepsilon ,\delta }(T)\) holds for \(\varepsilon = 1\) and \(\delta = \Vert T\Vert\) and the claim of the Lemma follows. \(\square\)
For any eigenvalue \(\lambda\) of T with finite algebraic multiplicity \(n_\lambda\), the Riesz decomposition of T with respect to \(\lambda\) guarantees (see footnote 7):
(R) U admits the direct sum decomposition \(U = N_\lambda \oplus R_\lambda\), where
1. \(N_{\lambda }=\{x\in U\mid (T-\lambda )^{n_{\lambda }}x=0\}\) is T-invariant with \(\dim N_\lambda <\infty\);
2. \(R_{\lambda }=(T-\lambda )^{n_{\lambda }}U\) is T-invariant.
If \(\sigma _1(T)\) is a finite set of eigenvalues, we apply the Riesz decomposition (R) first to some \(\lambda \in \sigma _1(T)\), then to the restriction \(T:R_{\lambda }\rightarrow R_{\lambda }\) and the next eigenvalue in \(\sigma _{1}(T)\), and so on, until all eigenvalues in \(\sigma _{1}(T)\) are exhausted. This yields
Lemma 4
(Riesz decomposition) If \(\sigma _1(T)\) is a finite set of eigenvalues of T with finite algebraic multiplicities, then U admits a direct sum decomposition \(U=N\oplus W\) into T-invariant subspaces N and W, where
\[N = \bigoplus _{\lambda \in \sigma _1(T)} N_{\lambda } \quad \text{and hence}\quad \dim N<\infty .\]
Moreover, \(|\lambda | < 1\) holds for all eigenvalues \(\lambda\) of the restriction of T to W. \(\square\)
The decomposition (R) implies that Riesz evolutions are finitary.
Proposition 2
Assume that T is a Riesz operator with decomposition \(U = N\oplus W\) into T-invariant subspaces N and W such that \(\dim N<\infty\) and the restriction of T to W has no eigenvalue in \(\sigma _1(T)\). Then the Riesz evolution (T, s) is equivalent to the finite-dimensional evolution \((T, s_N)\), where \(s_N\in N\) is such that \(s=s_N +s_W\) holds for some \(s_W\in W\).
Proof
In view of \(T^m s = T^m s_N + T^m s_W\) for all \(m\ge 0\), it suffices to establish the claim
\[\lim _{m\rightarrow \infty }\Vert T^{m}s_W\Vert = 0.\]
Since \(\sigma _{\varepsilon , 1}(T)\) is a finite set for any \(\varepsilon>0\), the spectral radius \(r_W(T)\) of T on W must satisfy \(r_W(T)< 1\). For clarity of notation, let \(T_W\) be the restriction of T to W and choose \(r\) with \(r_W(T)<r<1\) and \(n_{0}\) so large that \(\Vert T_W^{n}\Vert ^{1/n}\le r\) holds for all \(n\ge n_{0}\). Then one has \(\Vert T_W^{n}\Vert \le r^{n}\) and thus concludes
\[\Vert T^{n}s_W\Vert \le \Vert T_W^{n}\Vert \, \Vert s_W\Vert \le r^{n}\Vert s_W\Vert \rightarrow 0 \quad (n\rightarrow \infty ).\]
\(\square\)
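The mechanism behind Proposition 2 can be illustrated with a small matrix. The following is a minimal sketch under assumptions of our own choosing: a \(3\times 3\) matrix `T` whose first coordinate plays the role of N (eigenvalue 1) and whose remaining \(2\times 2\) block plays the role of W (a non-normal block with spectral radius 0.9 but norm larger than 1). The helper names `mat_vec`, `norm` and `gap_norms` are hypothetical.

```python
from math import sqrt


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) with a vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean norm of a vector."""
    return sqrt(sum(abs(x) ** 2 for x in vector))


# Block-diagonal example: N = span(e_1) with eigenvalue 1,
# W = span(e_2, e_3) carrying a Jordan-type block with eigenvalue 0.9.
T = [[1.0, 0.0, 0.0],
     [0.0, 0.9, 1.0],
     [0.0, 0.0, 0.9]]
s = [1.0, 1.0, 1.0]    # s = s_N + s_W
s_N = [1.0, 0.0, 0.0]  # component of s in N


def gap_norms(steps):
    """Return the norms ||T^m s - T^m s_N|| for m = 0, ..., steps."""
    x, y = list(s), list(s_N)
    gaps = []
    for _ in range(steps + 1):
        gaps.append(norm([a - b for a, b in zip(x, y)]))
        x, y = mat_vec(T, x), mat_vec(T, y)
    return gaps


gaps = gap_norms(400)
```

In this example the gap first grows, because the norm of the W-block exceeds 1, and then decays geometrically. This is why the proof works with \(\Vert T_W^{n}\Vert ^{1/n}\) for large n rather than with \(\Vert T_W\Vert\).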
The proof of Theorem 1 is now immediate: The Riesz evolution \((\psi ,s)\) is equivalent to the finite-dimensional evolution \((\psi ,s_N)\). The ergodic properties stated in Theorem 1 are directly obtained by applying Proposition 1 to \((\psi ,s_N)\). \(\square\)
For the proof of the sampling theorem (Theorem 3), let (Q, x) be a finite-dimensional evolution that is equivalent to (T, s). So we have \(\Vert T^ns-Q^nx\Vert \rightarrow 0\) and hence
\[\Vert f(T^ns)-f(Q^nx)\Vert \rightarrow 0 \quad (n\rightarrow \infty ),\]
since the sampling function f is continuous. This implies
i.e., the f-sample averages converge on (T, s) exactly when they converge on (Q, x). It is furthermore clear that (T, s) is stable exactly when (Q, x) is stable. For the proof, we can therefore assume without loss of generality that already (T, s) is finite-dimensional and hence the sample space \({\mathcal F}= f(U_s)\) is finite-dimensional.
Passing to coordinates, we may thus assume: \(U_s={\mathbb C}^n\) and \({\mathcal F}={\mathbb C}^k\). Since \(f:{\mathbb C}^n\rightarrow {\mathbb C}^k\) is bounded on (T, s) if and only if each component functional \(f_j\) of f is bounded on (T, s), it suffices to consider the 1-dimensional case \(k=1\).
With respect to the chosen coordinatization, T is an \(n\times n\) matrix, s a column vector and f a row vector of dimension n. Assume first that the sequence \((fT^ts)\) is bounded. Then
It follows that the sequence \((fT^t)\) of n-dimensional row vectors constitutes a bounded evolution. (The choice of u as the unit vector \(e_i\) in \({\mathbb C}^n\) shows that the ith coordinate of the evolution is bounded.) In view of Proposition 1 (Section 2.1), this evolution is mean-ergodic. Consequently, the boundedness of \((fT^ts)\) implies the existence of
To prove the converse implication, assume that \(\overline{f}_\infty\) exists. Then
It follows that the evolution \((fT^t)\) of row vectors is mean ergodic and hence, again by Proposition 1, stable, i.e., there is some constant \(c\in {\mathbb R}\) such that
which establishes the claim of the sampling theorem. \(\square\)
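The finite-dimensional statement can be explored numerically. The following is a minimal sketch, assuming sample averages of the form \(\frac{1}{t}\sum _{m=0}^{t-1}fT^{m}s\) (the index range is our assumption) and using two hypothetical \(2\times 2\) examples: a rotation, on which every sample sequence is bounded, and a shear, on which the sample sequence is bounded for one choice of f and unbounded for another. The names `sample_average`, `rotation` and `shear` are illustrative only.

```python
from math import cos, sin


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) with a vector (list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def sample_average(T, s, f, t):
    """Return (1/t) * sum_{m=0}^{t-1} f T^m s for a row vector f."""
    x = list(s)
    total = 0.0
    for _ in range(t):
        total += sum(a * b for a, b in zip(f, x))
        x = mat_vec(T, x)
    return total / t


rotation = [[cos(1.0), -sin(1.0)],
            [sin(1.0), cos(1.0)]]
shear = [[1.0, 1.0],
         [0.0, 1.0]]

# Bounded samples on a stable evolution: the averages settle down.
avg_rotation = sample_average(rotation, [1.0, 0.0], [1.0, 0.0], 10000)

# The shear evolution T^m s = (m, 1) is unbounded, yet f = (0, 1) only sees
# the constant coordinate, so the samples are bounded and the averages converge.
avg_shear_bounded = sample_average(shear, [0.0, 1.0], [0.0, 1.0], 10000)

# With f = (1, 0) the samples f T^m s = m are unbounded and the averages diverge.
avg_shear_unbounded = [
    sample_average(shear, [0.0, 1.0], [1.0, 0.0], t) for t in (10, 100, 1000)
]
```

The shear example is chosen to show that what matters is the boundedness of the observed sequence \((fT^ts)\), not of the underlying evolution \((T^ts)\).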
Cite this article
Faigle, U., Gierz, G. Markovian statistics on evolving systems. Evolving Systems 9, 213–225 (2018). https://doi.org/10.1007/s12530-017-9186-8
DOI: https://doi.org/10.1007/s12530-017-9186-8