Abstract
Quantum tomography is a general term for estimation methods used to completely identify quantum states or processes using independent experimental designs, and has become a standard measurement technique in quantum physics. It is especially important in the field of quantum information as it is used for the confirmation of successful experimental implementation of quantum protocols. For example, it can be used to confirm that the quantum states produced in a quantum information protocol are sufficiently close to their theoretical targets. In spite of this importance, however, finite sample analysis in quantum tomography has not been well studied. In this chapter, we explain our results regarding finite sample analysis of quantum tomography. In Sect. 5.1, we explain the estimation setting. In Sect. 5.2, we analyze expected losses with finite data, particularly for three estimators: extended linear, extended norm-minimization, and maximum-likelihood. In Sect. 5.3, we derive upper bounds on error probability with finite data for those same estimators.
Notes
- 1.
These calculations involve numerical errors, and strictly speaking the calculated estimate is different from the exact \(\ell _{2}\)-eNM estimate. We analyze the effect of numerical (and systematic) errors on error thresholds and upper bounds on error probability and give a solution to this problem in Sects. B.2 and B.3.
References
D.T. Smithey, M. Beck, M.G. Raymer, A. Faridani, Phys. Rev. Lett. 70, 1244 (1993). doi:10.1103/PhysRevLett.70.1244
Z. Hradil, Phys. Rev. A 55, R1561 (1997). doi:10.1103/PhysRevA.55.R1561
K. Banaszek, G.M. D’Ariano, M.G.A. Paris, M.F. Sacchi, Phys. Rev. A 61, 010304(R) (1999). doi:10.1103/PhysRevA.61.010304
J.F. Poyatos, J.I. Cirac, P. Zoller, Phys. Rev. Lett. 78, 390 (1997). doi:10.1103/PhysRevLett.78.390
I.L. Chuang, M.A. Nielsen, J. Mod. Phys. 44, 2455 (1997). doi:10.1080/09500349708231894
V. Buzek, Phys. Rev. A 58, 1723 (1998). doi:10.1103/PhysRevA.58.1723
J. Fiurasek, Z. Hradil, Phys. Rev. A 63, 020101(R) (2001). doi:10.1103/PhysRevA.63.020101
M.F. Sacchi, Phys. Rev. A 63, 054104 (2001). doi:10.1103/PhysRevA.63.054104
A. Luis, L.L. Sánchez-Soto, Phys. Rev. Lett. 83, 3573 (1999). doi:10.1103/PhysRevLett.83.3573
J. Fiurasek, Phys. Rev. A 64, 024102 (2001). doi:10.1103/PhysRevA.64.024102
G.M. D’Ariano, P.L. Presti, Phys. Rev. Lett. 86, 4195 (2001). doi:10.1103/PhysRevLett.86.4195
F. Bloch, Phys. Rev. 70, 460 (1946). doi:10.1103/PhysRev.70.460
E. Bagan, M. Baig, R. Muñoz-Tapia, A. Rodriguez, Phys. Rev. A 69, 010304(R) (2004). doi:10.1103/PhysRevA.69.010304
G. Kimura, Phys. Lett. A 314, 339 (2003). doi:10.1016/S0375-9601(03)00941-1
M.S. Byrd, N. Khaneja, Phys. Rev. A 68, 062322 (2003). doi:10.1103/PhysRevA.68.062322
U. Fano, Rev. Mod. Phys. 29, 74 (1957). doi:10.1103/RevModPhys.29.74
R. Schack, T.A. Brun, C.M. Caves, Phys. Rev. A 64, 014305 (2001). doi:10.1103/PhysRevA.64.014305
C.A. Fuchs, R. Schack, P.F. Scudo, Phys. Rev. A 69, 062305 (2004). doi:10.1103/PhysRevA.69.062305
V. Bužek, G. Drobný, J. Mod. Opt. 47, 2823 (2000). doi:10.1080/09500340008232199
S.T. Flammia, D. Gross, Y.K. Liu, J. Eisert, New J. Phys. 14, 095022 (2012). doi:10.1088/1367-2630/14/9/095022
R. Blume-Kohout, arXiv:1202.5270 [quant-ph] (2012).
M. Christandl, R. Renner, Phys. Rev. Lett. 109, 120403 (2012). doi:10.1103/PhysRevLett.109.120403
A.J. Scott, J. Phys. A: Math. Gen. 39, 13507 (2006). doi:10.1088/0305-4470/39/43/009
H. Zhu, B.G. Englert, Phys. Rev. A 84, 022327 (2011). doi:10.1103/PhysRevA.84.022327
M.D. de Burgh, N.K. Langford, A.C. Doherty, A. Gilchrist, Phys. Rev. A 78, 052122 (2008). doi:10.1103/PhysRevA.78.052122
T. Sugiyama, P.S. Turner, M. Murao, Phys. Rev. A 85, 052107 (2012). doi:10.1103/PhysRevA.85.052107
H. Chernoff, Ann. Math. Stat. 25, 573 (1954). doi:10.1214/aoms/1177728725
S.G. Self, K.Y. Liang, J. Am. Stat. Assoc. 82, 605 (1987). doi:10.1080/01621459.1987.10478472
M. Abramowitz, I.A. Stegun (eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (Dover, New York, 1972)
J.A. Smolin, J.M. Gambetta, G. Smith, Phys. Rev. Lett. 108, 070502 (2012). doi:10.1103/PhysRevLett.108.070502
S.M. Tan, J. Mod. Opt. 44, 2233 (1997). doi:10.1080/09500349708231881
C.W. Helstrom, Quantum Detection and Estimation Theory (Academic, New York, 1976)
A.S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (North-Holland, New York, 1982)
M. Hayashi (ed.), Asymptotic Theory of Quantum Statistical Inference: Selected Papers (World Scientific, Singapore, 2005)
K. Vogel, H. Risken, Phys. Rev. A 40, 2847 (1989). doi:10.1103/PhysRevA.40.2847
T.M. Buzug, Computed Tomography: From Photon Statistics to Modern Cone-Beam CT (Springer, Berlin, 2008)
M. Paris, J. Řeháček (eds.), Quantum State Estimation. Lecture Notes in Physics (Springer, Berlin, 2004)
A. Ling, K.P. Soh, A. Lamas-Linares, C. Kurtsiefer, Phys. Rev. A 74, 022309 (2006). doi:10.1103/PhysRevA.74.022309
H. Kosaka, T. Inagaki, Y. Rikitake, H. Imamura, Y. Mitsumori, K. Edamatsu, Nature 457, 702 (2009). doi:10.1038/nature07729
M. Steffen, M. Ansmann, R. McDermott, N. Katz, R.C. Bialczak, E. Lucero, Phys. Rev. Lett. 97, 050502 (2006). doi:10.1103/PhysRevLett.97.050502
M. Neeley, M. Ansmann, R.C. Bialczak, M. Hofheinz, N. Katz, E. Lucero, A. O’Connell, H. Wang, A.N. Cleland, J.M. Martinis, Nature Phys. 4, 523 (2008). doi:10.1038/nphys972
M. Hofheinz, H. Wang, M. Ansmann, R.C. Bialczak, E. Lucero, M. Neeley, A.D. O’Connell, D. Sank, J. Wenner, J.M. Martinis, A.N. Cleland, Nature 459, 546 (2009). doi:10.1038/nature08005
D. Leibfried, D.M. Meekhof, B.E. King, C. Monroe, W.M. Itano, D.J. Wineland, Phys. Rev. Lett. 77, 4281 (1996). doi:10.1103/PhysRevLett.77.4281
S. Olmschenk, D.N. Matsukevich, P. Maunz, D. Hayes, L.M. Duan, C. Monroe, Science 323, 486 (2009). doi:10.1126/science.1167209
H. Tanji, S. Ghosh, J. Simon, B. Bloom, V. Vuletic, Phys. Rev. Lett. 103, 043601 (2009). doi:10.1103/PhysRevLett.103.043601
P. Neumann, N. Mizuochi, F. Rempp, P. Hemmer, H. Watanabe, S. Yamasaki, V. Jacques, T. Gaebel, F. Jelezko, J. Wrachtrup, Science 320, 1326 (2008). doi:10.1126/science.1157233
T.J. Dunn, I.A. Walmsley, S. Mukamel, Phys. Rev. Lett. 74, 884 (1995). doi:10.1103/PhysRevLett.74.884
M.A. Nielsen, E. Knill, R. Laflamme, Nature 396, 52 (1998). doi:10.1038/23891
E. Skovsen, H. Stapelfeldt, S. Juhl, K. Molmer, Phys. Rev. Lett. 91, 090406 (2003). doi:10.1103/PhysRevLett.91.090406
R.A. Horn, C.R. Johnson, Matrix Analysis (Cambridge University Press, New York, 1985)
T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. Wiley Series in Telecommunications and Signal Processing (Wiley-Interscience, New York, 2006)
I. Bengtsson, K. Życzkowski, Geometry of Quantum States (Cambridge University Press, Cambridge, 2006)
J.M. Borwein, A.S. Lewis, Convex Analysis and Nonlinear Optimization. CMS Books in Mathematics (Springer, New York, 2006)
N.J. Higham, Accuracy and Stability of Numerical Algorithms (SIAM, Philadelphia, 2002)
T. Sugiyama, P.S. Turner, M. Murao, Phys. Rev. Lett. 111, 160406 (2013). doi:10.1103/PhysRevLett.111.160406
T. Sugiyama, P.S. Turner, M. Murao, New J. Phys. 14, 085005 (2012). doi:10.1088/1367-2630/14/8/085005
Appendices
Appendix A
Linear Algebra and Convex Analysis
In this appendix, we give mathematical supplements. In Sect. A.1, we explain terminology from vector and matrix theory and known inequalities between different norms. In Sect. A.2, we explain a known property of projections in convex analysis.
A.1 Vector and Matrix
A.1.1 Class of Matrices
Let \(M_{m,n}\) denote the set of all complex matrices with \(m\) rows and \(n\) columns. When \(m=n\), the elements of \(M_{m,m}\) are called square matrices.
-
1.
Square matrix
Suppose that a matrix \(A\) is square. Let \(A^{\dagger }\) denote the Hermitian conjugate of \(A\). When the matrix \(A\) satisfies \(A^{\dagger }A = A A^{\dagger }\), it is called normal. Normality is the necessary and sufficient condition for diagonalizability by a unitary matrix. When \(A\) is normal, it can be represented as
$$\begin{aligned} A = \sum _{i=1}^{m} a_{i} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }, \end{aligned}$$(5.150)where \(\varvec{v}_{i}\) are normalized vectors in \(\mathbb {C}^{m}\) and orthogonal to each other. These vectors are called the eigenvectors of \(A\), and the complex values \(a_{i}\) are called the eigenvalues of \(A\). When a square matrix \(A\) satisfies \(A = A^{\dagger }\), it is called Hermitian. Hermitian matrices are normal, and their eigenvalues are real. When all eigenvalues of \(A\) are positive, it is called a positive matrix. When the eigenvalues are nonnegative, it is called a positive semidefinite matrix.
Let \(f\) denote a function from \(\mathbb {C}\) to \(\mathbb {C}\). Let \(A\) denote a normal matrix and \(A=\sum _{i=1}^{m}a_{i}\varvec{v}_{i}\varvec{v}_{i}^{\dagger }\) be the diagonalized form. We define the action of \(f\) on \(A\) by
$$\begin{aligned} f(A):= \sum _{i=1}^{m} f(a_{i}) \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$(5.151)We use the following notation.
$$\begin{aligned} |A|:= \sum _{i=1}^{m} |a_{i}| \varvec{v}_{i} \varvec{v}_{i}^{\dagger }, \end{aligned}$$(5.152)$$\begin{aligned} \sqrt{A}:= \sum _{i=1}^{m} \sqrt{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$(5.153)When all eigenvalues of \(A\) are nonzero, we define the inverse matrix of \(A\) by
$$\begin{aligned} A^{-1}:= \sum _{i=1}^{m} \frac{1}{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$(5.154)When the inverse matrix of \(A\) exists, \(A\) is called invertible or regular. The inverse matrix satisfies \(A A^{-1} = A^{-1}A=I_{m}\), where \(I_{m}\) is the identity matrix on \(\mathbb {C}^{m}\).
When \(A\) has zero eigenvalues, we cannot define the inverse matrix and \(A\) is called irregular. In this case, we define a matrix \(A^{-}\) by
$$\begin{aligned} A^{-} := \sum _{i; a_{i}\ne 0} \frac{1}{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$(5.155)This \(A^{-}\) is a Moore-Penrose generalized inverse for normal matrices.
-
2.
Non-square matrix
Next, we consider the case of \(m\ne n\). For a given non-square matrix \(A \in M_{m,n}\), we define the left-inverse matrix \(A^{-1}_\mathrm{{left}}\in M_{n,m}\) and right-inverse matrix \(A^{-1}_\mathrm{{right}}\in M_{n,m}\) by
$$\begin{aligned} A^{-1}_\mathrm{{left}} A = I_{n}, \end{aligned}$$(5.156)$$\begin{aligned} A A^{-1}_\mathrm{{right}} = I_{m}. \end{aligned}$$(5.157)We define the rank of \(A\) as the number of linearly independent row (equivalently, column) vectors of \(A\) and use the notation \(\mathrm{{rank}}(A)\). When \(\mathrm{{rank}}(A) = \mathrm{{min}} \{m,n\}\) holds, \(A\) is called full-rank, and otherwise it is called rank-deficient. Suppose that \(A\) is full-rank. When \(m>n\), the left-inverse matrix \(A^{-1}_\mathrm{{left}}\) exists, and when \(m<n\), the right-inverse matrix \(A^{-1}_\mathrm{{right}}\) exists. When \(A\) is rank-deficient, there are no left- or right-inverse matrices of \(A\).
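As a numerical illustration (a NumPy sketch, not part of the original text; function and variable names are ours), the generalized inverse of Eq. (5.155) and a left inverse of a full-rank tall matrix, \((A^{T}A)^{-1}A^{T}\), can be computed and checked against NumPy's built-in pseudoinverse:

```python
import numpy as np

def normal_pseudoinverse(A, tol=1e-12):
    # Eq. (5.155): diagonalize the normal matrix, A = V diag(a) V^dagger,
    # and invert only the nonzero eigenvalues. For a Hermitian matrix with
    # distinct eigenvalues, the computed eigenvectors are orthonormal.
    a, V = np.linalg.eig(A)
    inv = np.array([1.0 / x if abs(x) > tol else 0.0 for x in a])
    return V @ np.diag(inv) @ V.conj().T

# A singular (irregular) Hermitian matrix with eigenvalues 2 and 0.
A_sq = np.array([[1.0, 1.0],
                 [1.0, 1.0]])
A_sq_minus = normal_pseudoinverse(A_sq)

# A full-rank "tall" matrix (m = 3 rows > n = 2 columns): one
# left-inverse is (A^T A)^{-1} A^T.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
A_left = np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T
```

For full-column-rank matrices the left inverse above coincides with the Moore-Penrose pseudoinverse, which is why the check against `np.linalg.pinv` succeeds.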
A.1.2 Inhomogeneous Equation
Let \(A\) be a real \((m,n)\)-matrix and \(\varvec{v}\in \mathbb {R}^{n}\), \(\varvec{w}\in \mathbb {R}^{m}\). When \(\varvec{w}\) is not \(\varvec{0}\), an equation
$$\begin{aligned} A \varvec{v} = \varvec{w} \end{aligned}$$(5.158)is called an inhomogeneous equation. We consider an inverse problem such that for given \(A\) and \(\varvec{w}\), we find \(\varvec{v}\) satisfying Eq. (5.158). We define the image of \(A\) by
$$\begin{aligned} {\mathrm{{Im}}} (A) := \{ A \varvec{v} \, | \, \varvec{v} \in \mathbb {R}^{n} \}. \end{aligned}$$(5.159)When \(\varvec{w}\in {\mathrm{{Im}}} (A)\), the solutions of Eq. (5.158) exist, and otherwise they do not exist. We define an augmented matrix of \(A\) with \(\varvec{w}\) by
$$\begin{aligned}{}[A | \varvec{w}] \in M_{m, n+1}. \end{aligned}$$(5.160)Then the following theorem holds.
Theorem 5.12
The following statements are equivalent.
-
(i)
For a given \(A\) and \(\varvec{w}\), Eq. (5.158) has solutions.
-
(ii)
\(\varvec{w} \in {\mathrm{{Im}}} (A)\).
-
(iii)
The ranks of \(A\) and the augmented matrix are the same, i.e.,
$$\begin{aligned} \mathrm{{rank}}(A) = \mathrm{{rank}}([A|\varvec{w}]). \end{aligned}$$(5.161)
When the solutions exist, the solutions have \(\bigl (n - \mathrm{{rank}}(A)\bigr )\) degrees of freedom.
We summarize the contents of this subsection. Suppose that \(A\in M_{m,n}\) and \(\varvec{w}\in \mathbb {R}^{m}\) are given.
-
If and only if statement (ii) (equivalently, statement (iii)) is not satisfied, Eq. (5.158) has no solutions.
-
If and only if statement (ii) (equivalently, statement (iii)) is satisfied, Eq. (5.158) has solutions.
-
If Eq. (5.158) has solutions, they have \(\bigl (n - \mathrm{{rank}}(A)\bigr )\) degrees of freedom.
-
Suppose that Eq. (5.158) has a solution and \(n > \mathrm{{rank}}(A)\) holds. Then the solution is not unique.
-
Suppose that Eq. (5.158) has a solution and \(n = \mathrm{{rank}}(A)\) holds. Then the solution is unique.
-
When \(n=m\), \(A\) is full-rank and invertible, and the solution is given as \(A^{-1}\varvec{w}\).
-
When \(m>n\), \(A\) is full-rank and the left-inverse matrix exists. The solution is given as \(A^{-1}_\mathrm{{left}}\varvec{w}\).
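The rank condition (iii) of Theorem 5.12 can be checked numerically. The following sketch (the function name is ours) tests solvability of \(A\varvec{v}=\varvec{w}\) for vectors inside and outside the image of \(A\):

```python
import numpy as np

def has_solution(A, w, tol=1e-10):
    # Theorem 5.12 (iii): A v = w is solvable iff rank(A) = rank([A | w]).
    aug = np.hstack([A, w.reshape(-1, 1)])
    return np.linalg.matrix_rank(A, tol) == np.linalg.matrix_rank(aug, tol)

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
w_in = A @ np.array([2.0, -1.0])    # lies in Im(A): solvable
w_out = np.array([1.0, 0.0, 0.0])   # not in Im(A): unsolvable
```

Here rank(\(A\)) = 2 and \(n = 2\), so when a solution exists it is unique, consistent with the \(n - \mathrm{rank}(A) = 0\) degrees of freedom stated above.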
A.1.3 Norms and Loss Functions
We obey the definitions of norms in [50].
-
Vector norms
A function \(\Vert \cdot \Vert : \mathbb {C}^{k} \rightarrow \mathbb {R}\) is a vector norm if for all \(\varvec{v},\ \varvec{w}\in \mathbb {C}^{k}\),
-
(i)
(Non-negativity) \(\Vert \varvec{v} \Vert \ge 0\).
-
(ii)
(Positivity) \(\Vert \varvec{v} \Vert =0\) if and only if \(\varvec{v}=\varvec{0}\).
-
(iii)
(Homogeneity) \(\Vert c \varvec{v} \Vert = |c|\cdot \Vert \varvec{v} \Vert \) for all scalars \(c\in \mathbb {C}\).
-
(iv)
(Triangle inequality) \(\Vert \varvec{v} + \varvec{w}\Vert \le \Vert \varvec{v} \Vert + \Vert \varvec{w} \Vert \).
We introduce three representative vector norms:
-
1.
\(\ell _{1}\)-norm (the sum norm) \(\Vert \varvec{v} \Vert _{1} := \sum _{i=1}^{k} |v_{i}|\).
-
2.
\(\ell _{2}\)-norm (the Euclidean norm) \(\Vert \varvec{v} \Vert _{2} := \sqrt{\sum _{i=1}^{k} |v_{i}|^{2}}\).
-
3.
\(\ell _{\infty }\)-norm (the max norm) \(\Vert \varvec{v} \Vert _{\infty } := \max _{i=1, \ldots , k} |v_{i}|\).
These vector norms satisfy the following inequalities:
$$\begin{aligned} \Vert \varvec{v} \Vert _{2} \le \Vert \varvec{v} \Vert _{1} \le \sqrt{k} \Vert \varvec{v} \Vert _{2}, \end{aligned}$$(5.162)$$\begin{aligned} \Vert \varvec{v} \Vert _{\infty } \le \Vert \varvec{v} \Vert _{1} \le k \Vert \varvec{v} \Vert _{\infty }, \end{aligned}$$(5.163)$$\begin{aligned} \Vert \varvec{v} \Vert _{\infty } \le \Vert \varvec{v} \Vert _{2} \le \sqrt{k} \Vert \varvec{v} \Vert _{\infty }. \end{aligned}$$(5.164)For any probability distributions \(\varvec{p}\) and \(\varvec{q}\), the \(\ell _{1}\)-norm and the Kullback-Leibler divergence satisfy the following inequality [51]:
$$\begin{aligned} ( \Vert \varvec{p} - \varvec{q} \Vert _{1} )^{2} \le K ( \varvec{p} \Vert \varvec{q}). \end{aligned}$$(5.165)Equation (5.165) is called Pinsker’s inequality.
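The norm inequalities (5.162)-(5.164) can be verified numerically for a random complex vector; a NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(8) + 1j * rng.standard_normal(8)
k = v.size

# The three representative vector norms defined above.
l1 = np.sum(np.abs(v))        # ell_1 (sum) norm
l2 = np.linalg.norm(v)        # ell_2 (Euclidean) norm
linf = np.max(np.abs(v))      # ell_infinity (max) norm
```

The assertions below mirror Eqs. (5.162)-(5.164) term by term (with a small tolerance for floating-point rounding).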
-
Matrix norms
A function \(|||\cdot |||: M_{k,k} \rightarrow \mathbb {R}\) is a matrix norm if for all \(A,\ B \in M_{k,k}\),
-
(i)
(Non-negativity) \(|||A |||\ge 0\).
-
(ii)
(Positivity) \(|||A |||=0\) if and only if \(A=O\).
-
(iii)
(Homogeneity) \(|||c A |||= |c| \cdot |||A |||\) for all scalars \(c\in \mathbb {C}\).
-
(iv)
(Triangle inequality) \(|||A + B |||\le |||A |||+ |||B |||\).
-
(v)
(Submultiplicativity) \(|||AB |||\le |||A |||\cdot |||B |||\).
We introduce two representative matrix norms:
-
1.
Trace norm \(|||A |||_\mathrm{{tr}} := \mathrm{tr}[ |A| ]\).
-
2.
Hilbert-Schmidt norm (the Frobenius norm) \(|||A |||_\mathrm{{HS}} := \mathrm{tr}[ A^{\dagger } A ]^{1/2}\).
These matrix norms satisfy the following inequalities:
$$\begin{aligned} |||A |||_\mathrm{{HS}} \le |||A |||_\mathrm{{tr}} \le \sqrt{\mathrm{{rank}}(A)} |||A |||_\mathrm{{HS}}. \end{aligned}$$(5.166)The Hilbert-Schmidt distance between two density matrices is defined by
$$\begin{aligned} \varDelta ^\mathrm{{HS}} (\hat{\rho } , \hat{\rho }^{\prime }) := \frac{1}{\sqrt{2}} |||\hat{\rho } - \hat{\rho }^{\prime } |||_\mathrm{{HS}}. \end{aligned}$$(5.167)The normalization factor makes the maximal value one. In the generalized Bloch parametrization, we have
$$\begin{aligned} \varDelta ^\mathrm{{HS}} (\hat{\rho }(\varvec{s}), \hat{\rho }(\varvec{s}^{\prime })) = \frac{1}{2} \Vert \varvec{s} - \varvec{s}^{\prime } \Vert _{2}. \end{aligned}$$(5.168)
-
Fidelity, infidelity, Bures distance
In quantum information science, one of the most popular evaluation functions is fidelity. For two density matrices \(\hat{\rho }\) and \(\hat{\rho }^{\prime }\), the fidelity is defined by
$$\begin{aligned} f (\hat{\rho } , \hat{\rho }^{\prime }) := \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ]^{2}. \end{aligned}$$(5.169)Fidelity satisfies \(f(\hat{\rho }, \hat{\rho })=1\) and it is not a loss function. The square-root \(\sqrt{f(\hat{\rho } , \hat{\rho }^{\prime })}\) is called the root fidelity. The infidelity is defined by
$$\begin{aligned} \varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })&:= 1 - \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ]^{2} \end{aligned}$$(5.170)$$\begin{aligned}&= 1 - f(\hat{\rho } , \hat{\rho }^{\prime }). \end{aligned}$$(5.171)Infidelity is a loss function but it is not a distance. The Bures distance is defined by
$$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime })&:= \sqrt{1 - \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ] }\end{aligned}$$(5.172)$$\begin{aligned}&= \sqrt{1 - \sqrt{f (\hat{\rho } , \hat{\rho }^{\prime })} }. \end{aligned}$$(5.173)For the trace distance, Bures distance, and infidelity, the following inequalities hold [52]:
$$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime }) \le \sqrt{\varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })} \le \sqrt{2} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime }),\end{aligned}$$(5.174)$$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime })^{2} \le \varDelta ^\mathrm{{T}} (\hat{\rho } , \hat{\rho }^{\prime }) \le \sqrt{\varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })}. \end{aligned}$$(5.175)
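The fidelity, infidelity, and Bures distance of Eqs. (5.169)-(5.173), and the inequalities (5.174)-(5.175), can be checked numerically. The following NumPy sketch (helper names are ours; it assumes the common convention that the trace distance is half the trace norm of the difference) computes the matrix square root via the eigendecomposition of Eq. (5.153):

```python
import numpy as np

def psd_sqrt(rho):
    # Square root of a positive semidefinite Hermitian matrix via its
    # eigendecomposition, as in Eq. (5.153); clip guards tiny negatives.
    a, V = np.linalg.eigh(rho)
    return V @ np.diag(np.sqrt(np.clip(a, 0.0, None))) @ V.conj().T

def fidelity(rho, sigma):
    # Eq. (5.169): f = ( Tr sqrt( sqrt(rho) sigma sqrt(rho) ) )^2.
    s = psd_sqrt(rho)
    return float(np.real(np.trace(psd_sqrt(s @ sigma @ s))) ** 2)

def bures(rho, sigma):
    # Eqs. (5.172)-(5.173): Bures distance from the root fidelity.
    return np.sqrt(1.0 - np.sqrt(fidelity(rho, sigma)))

def trace_distance(rho, sigma):
    # Assumed convention: half the trace norm of the difference.
    return 0.5 * np.sum(np.abs(np.linalg.eigvalsh(rho - sigma)))

# Two single-qubit density matrices.
rho = np.array([[0.8, 0.1], [0.1, 0.2]])
sigma = np.array([[0.5, 0.0], [0.0, 0.5]])

infid = 1.0 - fidelity(rho, sigma)   # Eq. (5.171)
d_b = bures(rho, sigma)
d_t = trace_distance(rho, sigma)
```

The assertions below check \(f(\hat\rho,\hat\rho)=1\) and the chains (5.174) and (5.175) for this pair of states.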
A.2 Projection in Convex Analysis
A subset \(R \subseteq \mathbb {R}^{k}\) is called convex if for all \(\varvec{r},\ \varvec{r}^{\prime } \in R\) and all \(p\in [0,1]\), \(p\varvec{r} + (1-p)\varvec{r}^{\prime }\) is included in \(R\). Let \(R\) be a non-empty, closed, and convex set in \(\mathbb {R}^{k}\) and \(\varvec{t}\) be a vector in \(\mathbb {R}^{k}\). The vector \(\varvec{t}\) is not necessarily included in \(R\). We define the projection of \(\varvec{t}\) onto \(R\) by
$$\begin{aligned} \varvec{P}_{R} (\varvec{t}) := \mathop {\mathrm {argmin}}_{\varvec{r} \in R} \Vert \varvec{t} - \varvec{r} \Vert _{2}. \end{aligned}$$(5.176)
Obviously, \(\varvec{P}_{R} (\varvec{t})=\varvec{t}\) holds for any \(\varvec{t} \in R\).
Theorem 5.13
(Non-expandability of projections [53])
Suppose that \(R\) is a non-empty, closed, and convex set in \(\mathbb {R}^{k}\). Then for any \(\varvec{t}, \varvec{t}^{\prime } \in \mathbb {R}^{k}\),
$$\begin{aligned} \Vert \varvec{P}_{R} (\varvec{t}) - \varvec{P}_{R} (\varvec{t}^{\prime }) \Vert _{2} \le \Vert \varvec{t} - \varvec{t}^{\prime } \Vert _{2} \end{aligned}$$(5.177)holds.
Theorem 5.13 indicates that projections do not expand the distance between any two vectors in the Euclidean space. For \(\varvec{t}^{\prime }\in R\), we have
$$\begin{aligned} \Vert \varvec{P}_{R} (\varvec{t}) - \varvec{t}^{\prime } \Vert _{2} \le \Vert \varvec{t} - \varvec{t}^{\prime } \Vert _{2}. \end{aligned}$$(5.178)
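Theorem 5.13 can be illustrated numerically with the Euclidean projection onto the closed unit ball, a simple non-empty, closed, convex choice of \(R\); a NumPy sketch:

```python
import numpy as np

def project_ball(t):
    # Euclidean projection onto the closed unit ball: vectors outside
    # the ball are scaled back onto its surface, interior points are fixed.
    n = np.linalg.norm(t)
    return t if n <= 1.0 else t / n

rng = np.random.default_rng(1)
nonexpansive = True
for _ in range(1000):
    t, tp = rng.standard_normal(3), rng.standard_normal(3)
    # Theorem 5.13: the projection must not expand pairwise distances.
    lhs = np.linalg.norm(project_ball(t) - project_ball(tp))
    rhs = np.linalg.norm(t - tp)
    if lhs > rhs + 1e-12:
        nonexpansive = False
```

This is the same property exploited in Sect. B.4, where the constrained least squares estimator is obtained by projecting onto a convex set.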
Appendix B
Supplement for Sect. 5.3.2
In this section, we give supplements for Sect. 5.3.2. In Sect. B.1, we derive the upper bound on error probability for \(k\)-qubit state tomography using Pauli measurements with detection losses. In Sects. B.2 and B.3, we explain a way of evaluating the effect of systematic and numerical errors on the error threshold and upper bound. In Sect. B.4, we derive the upper bound on error probability for the constrained least squares estimator, which was introduced in Sect. 5.3.2, and compare the performance to that of the \(\ell _{2}\)-eNM estimator.
B.1 Proof of Eq. (5.130)
Suppose that we prepare \(N\) identical copies of \(\hat{\rho } \in \fancyscript{S} ( (\mathbb {C}^{2})^{\otimes k})\) and make the three Pauli measurements with detection efficiency \(\eta \) on each qubit. The POVMs describing the ideal Pauli measurements on each qubit, \(\varvec{\varPi }^{(i)} = \{ \hat{\varPi }^{(i)}_{+1}, \hat{\varPi }^{(i)}_{-1} \}\), are given as
$$\begin{aligned} \hat{\varPi }^{(i)}_{\pm 1} = \frac{1}{2} \bigl ( \hat{\mathbb {1}} \pm \hat{\sigma }_{i} \bigr ), \end{aligned}$$where \(i=1,2,3\),
$$\begin{aligned} \hat{\sigma }_{1} := \begin{pmatrix} 0 &{} 1 \\ 1 &{} 0 \end{pmatrix}, \quad \hat{\sigma }_{2} := \begin{pmatrix} 0 &{} -i \\ i &{} 0 \end{pmatrix}, \quad \hat{\sigma }_{3} := \begin{pmatrix} 1 &{} 0 \\ 0 &{} -1 \end{pmatrix}, \end{aligned}$$and
$$\begin{aligned} \hat{\mathbb {1}} := \begin{pmatrix} 1 &{} 0 \\ 0 &{} 1 \end{pmatrix}. \end{aligned}$$
When the measurements have detection loss, the corresponding POVMs, \(\varvec{\varPi }^{\eta , (i)} = \{ \hat{\varPi }^{\eta , (i)}_{+1}, \hat{\varPi }^{\eta , (i)}_{-1}, \hat{\varPi }^{\eta , (i)}_{0} \}\), are given as
$$\begin{aligned} \hat{\varPi }^{\eta , (i)}_{\pm 1} = \eta \, \hat{\varPi }^{(i)}_{\pm 1}, \quad \hat{\varPi }^{\eta , (i)}_{0} = (1 - \eta ) \hat{\mathbb {1}}, \end{aligned}$$where \(\eta \) is the detection efficiency and takes values from \(0\) to \(1\). The outcome “\(0\)” means no detection at the measurement trial. When we perform the imperfect Pauli measurements on each qubit, the POVM on \(k\) qubits is given as
$$\begin{aligned} \hat{\varPi }^{\eta , (\varvec{i})}_{\varvec{m}} = \bigotimes _{q=1}^{k} \hat{\varPi }^{\eta , (i_{q})}_{m_{q}}, \end{aligned}$$
where \(\varvec{i} = \{i_{q} \}_{q=1}^{k}\) and \(i_{q} = 1, 2, 3\). The label of the different POVMs, \(\varvec{i}\), corresponds to \(j\) in this chapter. Suppose that we perform each measurement described by \(\varvec{\varPi }^{\eta , (\varvec{i})}\) equally \(n:= N / 3^{k}\) times.
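As an illustration (a NumPy sketch; it assumes lossy single-qubit POVM elements \(\eta (\hat{\mathbb {1}} \pm \hat{\sigma }_{i})/2\) together with the no-detection element \((1-\eta )\hat{\mathbb {1}}\), and the function name is ours), the single-qubit outcome probabilities are nonnegative and sum to one:

```python
import numpy as np

# Pauli matrices and the identity.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
eye = np.eye(2, dtype=complex)

def lossy_pauli_probs(rho, i, eta):
    # Outcome probabilities of the lossy Pauli-i measurement, assuming
    # POVM elements eta*(1 +/- sigma_i)/2 and (1 - eta)*1 ("no click").
    p_plus = eta * np.real(np.trace(rho @ (eye + sigma[i]) / 2))
    p_minus = eta * np.real(np.trace(rho @ (eye - sigma[i]) / 2))
    p_none = (1.0 - eta) * np.real(np.trace(rho))
    return p_plus, p_minus, p_none

# A valid one-qubit density matrix, measured along sigma_1 with eta = 0.9.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
probs = lossy_pauli_probs(rho, 0, 0.9)
```

Because the three elements sum to the identity, the probabilities form a valid distribution for any state; the no-detection probability is exactly \(1-\eta \).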
Let us choose \(\varvec{\lambda }\) to be the set of tensor products of Pauli and identity matrices with the normalization factor \(1 / \sqrt{2^{k-1}}\), i.e.,
$$\begin{aligned} \hat{\lambda }_{\varvec{\beta }} := \frac{1}{\sqrt{2^{k-1}}} \bigotimes _{q=1}^{k} \hat{\sigma }_{\beta _{q}}, \quad \hat{\sigma }_{0} := \hat{\mathbb {1}}, \end{aligned}$$
where \(\varvec{\beta } := \{ \beta _{q} \}_{q=1}^{k}\) and \(\beta _{q} = 0, 1, 2, 3\). We eliminate from \(\varvec{\beta }\) the case that all \(\beta _{q}\) are \(0\). The label of the matrices, \(\varvec{\beta }\), corresponds to \(\alpha \) in this chapter. Using this \(\varvec{\lambda }\), any density matrix is represented as
$$\begin{aligned} \hat{\rho } = \frac{1}{2^{k}} \hat{\mathbb {1}}^{\otimes k} + \frac{1}{2} \sum _{\varvec{\beta }} s_{\varvec{\beta }} \hat{\lambda }_{\varvec{\beta }}, \end{aligned}$$(5.187)
where
$$\begin{aligned} s_{\varvec{\beta }} = \mathrm{Tr}[ \hat{\lambda }_{\varvec{\beta }} \hat{\rho } ] = \frac{1}{\sqrt{2^{k-1}}} \mathrm{Tr}\Bigl [ \Bigl ( \bigotimes _{q=1}^{k} \hat{\sigma }_{\beta _{q}} \Bigr ) \hat{\rho } \Bigr ]. \end{aligned}$$(5.188)
Equation (5.188) indicates that the parameter \(s_{\varvec{\beta }}\) is the expectation of a tensor product of ideal Pauli and identity matrices.
In \(k\)-qubit state tomography with \(k\ge 2\), we need to be careful about the treatment of multiple uses of the same data. For example, in order to estimate the expectation of \(\hat{\sigma }_{1} \otimes \hat{\mathbb {1}}\) in the \(2\)-qubit case, we use the data of three types of measurements: \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{1}\), \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{2}\), and \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{3}\). Therefore the estimation of each parameter can be dependent even for \(\hat{\rho }^\mathrm{{LLS}}\).
We try to estimate these parameters from a data set of the imperfect Pauli measurements \(\breve{\varvec{\varPi }} := \{ \varvec{\varPi }^{\eta , (\varvec{i})} \}_{\varvec{i}}\). In order to calculate \(c_{\alpha }\), we need to derive a matrix \(B\) satisfying
$$\begin{aligned} s_{\varvec{\beta }} = \sum _{\varvec{i}, \varvec{m}} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} \, p^{\eta , (\varvec{i})}_{\varvec{m}}. \end{aligned}$$
This matrix \(B\) corresponds to \(\varLambda ^{-1}_\mathrm{{left}}\) in this chapter. Let \(l\) denote the number of \(\hat{\mathbb {1}}\) appearing in \(\hat{\lambda }_{\varvec{\beta }}\). The number of \(\hat{\lambda }_{\varvec{\beta }}\) including \(l\) identities is \(3^{k-l} \times \frac{k !}{l! (k-l)!}\). \(\hat{\lambda }_{\varvec{\beta }} = \hat{\mathbb {1}}^{\otimes l} \otimes \left( \otimes _{q=l+1}^{k} \hat{\sigma }_{i_{q}} \right) / \sqrt{2^{k-1}}\) is an example of such \(\hat{\lambda }_{\varvec{\beta }}\). In this case, Eq. (5.188) is rewritten by the probability distributions of the imperfect Pauli measurement as
Therefore we have
if \(i_{q} = 1, 2, 3\) and \(m_{q} = \pm 1, 0\) for \(q= 1, \ldots , l\) and \(i_{q} = \beta _{q}\) and \(m_{q} = \pm 1\) for \(q = l+1, \ldots , k\). Otherwise \(B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} = 0\). Then for each \(\varvec{\beta }\) and \(\varvec{i}\),
holds, and we obtain
From the above discussion, we can see that \(c_{\varvec{\beta }}\) takes the same value for different \(\hat{\lambda }_{\varvec{\beta }}\) with the same \(l\). The upper bound on error probability is calculated as
where
When we choose the trace distance as the loss function, we have
and
In one-qubit (\(k=1\)) and two-qubit (\(k=2\)) cases, we have
As in the above discussion, when the directions of the Pauli measurements are perfectly orthogonal, it is easy to derive \(c_{\varvec{\beta }}\). When the directions are not orthogonal, we need to calculate \(\varLambda ^{-1}_\mathrm{{left}} = ( \varLambda ^{T} \varLambda )^{-1} \varLambda ^{T}\). Then it becomes more difficult to analyze \(c_{\varvec{\beta }}\), and we would need to calculate them numerically.
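The left inverse \(\varLambda ^{-1}_\mathrm{{left}} = (\varLambda ^{T}\varLambda )^{-1}\varLambda ^{T}\) for non-orthogonal directions can be computed numerically. The following sketch is a hypothetical one-qubit example (the tilted directions, variable names, and the affine map \(p_{j} = (1 + \varvec{n}_{j}\cdot \varvec{s})/2\) for the "+1" outcomes are our illustrative assumptions) that recovers a Bloch vector from outcome probabilities:

```python
import numpy as np

# Three non-orthogonal measurement directions on the Bloch sphere,
# slightly tilted away from the x, y, z axes (hypothetical example).
dirs = np.array([[1.0, 0.1, 0.0],
                 [0.0, 1.0, 0.1],
                 [0.1, 0.0, 1.0]])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

# "+1" outcome probabilities are affine in the Bloch vector s:
# p_j = (1 + n_j . s) / 2, so the linear part is Lambda = dirs / 2.
Lam = dirs / 2.0

# Left inverse as in the text: (Lambda^T Lambda)^{-1} Lambda^T.
Lam_left = np.linalg.inv(Lam.T @ Lam) @ Lam.T

s_true = np.array([0.3, -0.2, 0.5])
p = (1.0 + dirs @ s_true) / 2.0
s_rec = Lam_left @ (p - 0.5)   # invert the affine map to recover s
```

For this square, full-rank \(\varLambda \) the left inverse equals the ordinary inverse; the same formula applies unchanged to tall \(\varLambda \) arising from redundant measurement settings.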
B.2 Effect of Systematic Errors
Theorem 5.9 is valid for any informationally complete POVMs and is applicable to cases in which a systematic error exists. However, we must know the mathematical representation of the systematic error exactly in order to strictly verify a value of the confidence level. This assumption can be unrealistic in some experiments. In this section, we weaken the assumption to a more realistic condition and give a formula for \(\text {P}_{u}^{\varDelta }\) in such a case.
Let \(\breve{\varvec{\varPi }}\) denote a set of POVMs exactly describing the measurement used, and let \(\breve{\varvec{\varPi }}^{\prime } (\ne \breve{\varvec{\varPi }})\) denote a set of POVMs that we mistake as the correct set of POVMs. We assume that \(\breve{\varvec{\varPi }}\) and \(\breve{\varvec{\varPi }}^{\prime }\) are both informationally complete. Suppose that we do not know \(\breve{\varvec{\varPi }}\), but we know that \(\breve{\varvec{\varPi }}\) is in a known set \(\fancyscript{M}\). For example, consider the case where an experimentalist wants to perform a projective measurement of \(\hat{\sigma }_{1}\). If they can guarantee that their actual measurement direction is within \(0.5\) degrees of the \(x\)-axis and that their detection efficiency is \(0.9\), then \(\fancyscript{M}\) is the set of all POVMs whose measurement direction is within \(0.5\) degrees of the \(x\)-axis and whose detection efficiency is \(0.9\).
For given relative frequencies \(\varvec{\nu }_{N}\), the correct and mistaken eL estimates are
Then the actual and mistaken \(\ell _{2}\)-eNM estimates are
Let \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) and \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}\) denote the corresponding density matrix estimates. Let us define the size of the systematic error as
This is a function of \(\varDelta \), \(\varvec{\nu }_{N}\), \(\breve{\varvec{\varPi }}^{\prime }\), and \(\fancyscript{M}\). Then for any \(\hat{\rho } \in \fancyscript{S}(\fancyscript{H})\) and \(\breve{\varvec{\varPi }} \in \fancyscript{M}\),
holds with probability at least
Using Eqs. (5.209) and (5.210), we can evaluate the precision of state preparation, \(\varDelta (\hat{\rho }_{*}, \hat{\rho } )\), without knowing the true state \(\hat{\rho }\) and the true set of POVMs \(\breve{\varvec{\varPi }}\).
B.3 Effect of Numerical Errors
In this section, we analyze the effect of numerical errors and explain a method for evaluating the precision of the state preparation in cases where numerical errors exist.
The \(\ell _{2}\)-eNM estimator \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) requires a nonlinear minimization, which requires the use of a numerical algorithm. Suppose that we choose an algorithm for the minimization and obtain a result \(\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) for a given data set. In practice, there exists a numerical error on the result, and \(\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) differs from the exact solution \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\). We cannot obtain the exact solution, but we can guarantee the accuracy of the numerical result with accuracy-guaranteed algorithms [54]. Suppose that we use an algorithm for which \(\varDelta (\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}) \le \zeta \) is guaranteed. Then
holds with probability at least \(1 - \text {P}_{u}^{\varDelta } \). The error threshold is changed from \(\delta \) to \(\zeta + \delta \).
Usually systematic and numerical errors both exist. In such a case, by combining Eqs. (5.209) and (5.211), we can prove that the inequality
holds with the probability in Eq. (5.210), where \(\zeta \) is a numerical error threshold for \(\breve{\varvec{\varPi }}^{\prime }\). Therefore Theorem 5.9, with a modification, can be applied to cases in which systematic and numerical errors exist.
B.4 Error Probability for Constrained Least Squares Estimator
From Eq. (5.56), the probability distribution of \(\hat{\rho }^\mathrm{{eL}}_{N}\) is the projection of \(\varvec{\nu }_{N}\) on the probability space of trace-one Hermitian matrices (\(\{ \varvec{p}(\hat{\sigma } ) | \hat{\sigma } = \hat{\sigma }^{\dagger }, \mathrm{Tr}[\hat{\sigma } ] = 1 \}\)), and we have
Therefore, Eq. (5.131) is rewritten as
and \(\hat{\rho }^\mathrm{{CLS}}_{N}\) is the projection of \(\hat{\rho }^\mathrm{{eL}}_{N}\) on \(\fancyscript{S}(\fancyscript{H})\) with respect to the \(2\)-norm on the probability space. We can see from Eqs. (5.68) and (5.214) that \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) and \(\hat{\rho }^\mathrm{{CLS}}\) are the projections of \(\hat{\rho }^\mathrm{{eL}}_{N}\) with respect to different spaces (or different norms).
Using Theorem 5.13, we obtain
where \(\varvec{s}^\mathrm{{CLS}}_{N}\) is the Bloch vector corresponding to \(\hat{\rho }^\mathrm{{CLS}}_{N}\). Let us define \(\Vert \varLambda \Vert _{2, \max }\) and \(\Vert \varLambda \Vert _{2, \min }\) as
When \(\breve{\varvec{\varPi }}\) is informationally complete, \(\varLambda \) is full-rank and \(\Vert \varLambda \Vert _{2, \min } > 0\). We have
We obtain
From the same logic as in the proof of Theorem 5.9, we obtain the following theorem:
Theorem 5.14
(Error probability, \(\hat{\rho }^\mathrm{{CLS}}\mathbf{,}\, \varDelta ^\mathrm{{HS}}\mathbf{,}\, \varDelta ^\mathrm{{T}}\mathbf{,}\, \varDelta ^\mathrm{{IF}}\) )
When we choose the Hilbert-Schmidt distance, trace distance, or infidelity as the loss function for the density matrix, we have the following upper bounds on the error probabilities for the constrained least squares estimator.
for any true density matrix \(\hat{\rho }\).
Compared to Theorem 5.9, there is an additional factor \(\left( \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }} \right) ^{2} (\le 1)\) in the rate of exponential decrease in Theorem 5.14. When \(\Vert \varLambda \Vert _{2, \max } = \Vert \varLambda \Vert _{2, \min }\) holds, the upper bounds for \(\hat{\rho }^\mathrm{{CLS}}\) coincide with those for \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\). Roughly speaking, the condition \(\Vert \varLambda \Vert _{2, \max } = \Vert \varLambda \Vert _{2, \min }\) implies that we perform measurements extracting information about each Bloch vector element with equal weight. When \(\Vert \varLambda \Vert _{2, \max } > \Vert \varLambda \Vert _{2, \min }\), the upper bounds for \(\hat{\rho }^\mathrm{{CLS}}\) are larger than those for \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\). This does not mean we can immediately conclude that \(\hat{\rho }^\mathrm{{CLS}}\) is less precise than \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\), because their upper bounds are probably not optimal. However, we can say that \(\hat{\rho }^\mathrm{{CLS}}\) is less precise than \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) insofar as Theorems 5.9 and 5.14 give the only upper bounds known for point estimators in quantum tomography to date. Additionally, the computational cost of \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) can be smaller than that of \(\hat{\rho }^\mathrm{{CLS}}\), as explained in Sect. 5.3.2. Therefore, we believe that the \(\ell _{2}\)-eNM estimator performs better than the CLS estimator and is at present our best choice.
Copyright information
© 2014 Springer Japan
Cite this chapter
Sugiyama, T. (2014). Evaluation of Estimation Precision in Quantum Tomography. In: Finite Sample Analysis in Quantum Estimation. Springer Theses. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54777-8_5
Print ISBN: 978-4-431-54776-1
Online ISBN: 978-4-431-54777-8