
Evaluation of Estimation Precision in Quantum Tomography

Finite Sample Analysis in Quantum Estimation

Part of the book series: Springer Theses


Abstract

Quantum tomography is a general term for estimation methods used to completely identify quantum states or processes using independent experimental designs, and has become a standard measurement technique in quantum physics. It is especially important in the field of quantum information as it is used for the confirmation of successful experimental implementation of quantum protocols. For example, it can be used to confirm that the quantum states produced in a quantum information protocol are sufficiently close to their theoretical targets. In spite of this importance, however, finite sample analysis in quantum tomography has not been well studied. In this chapter, we explain our results regarding finite sample analysis of quantum tomography. In Sect. 5.1, we explain the estimation setting. In Sect. 5.2, we analyze expected losses with finite data, particularly for three estimators: extended linear, extended norm-minimization, and maximum-likelihood. In Sect. 5.3, we derive upper bounds on error probability with finite data for those same estimators.


Notes

  1.

    These calculations involve numerical errors, and strictly speaking the calculated estimate is different from the exact \(\ell _{2}\)-eNM estimate. We analyze the effect of numerical (and systematic) errors on error thresholds and upper bounds on error probability and give a solution to this problem in Sects. B.2 and B.3.

References

  1. D.T. Smithey, M. Beck, M.G. Raymer, A. Faridani, Phys. Rev. Lett. 70, 1244 (1993). doi:10.1103/PhysRevLett.70.1244


  2. Z. Hradil, Phys. Rev. A 55, R1561 (1997). doi:10.1103/PhysRevA.55.R1561

  3. K. Banaszek, G.M. D’Ariano, M.G.A. Paris, M.F. Sacchi, Phys. Rev. A 61, 010304(R) (1999). doi:10.1103/PhysRevA.61.010304

  4. J.F. Poyatos, J.I. Cirac, P. Zoller, Phys. Rev. Lett. 78, 390 (1997). doi:10.1103/PhysRevLett.78.390


  5. I.L. Chuang, M.A. Nielsen, J. Mod. Opt. 44, 2455 (1997). doi:10.1080/09500349708231894

  6. V. Buzek, Phys. Rev. A 58, 1723 (1998). doi:10.1103/PhysRevA.58.1723

  7. J. Fiurasek, Z. Hradil, Phys. Rev. A 63, 020101(R) (2001). doi:10.1103/PhysRevA.63.020101

  8. M.F. Sacchi, Phys. Rev. A 63, 054104 (2001). doi:10.1103/PhysRevA.63.054104

  9. A. Luis, L.L. Sánchez-Soto, Phys. Rev. Lett. 83, 3573 (1999). doi:10.1103/PhysRevLett.83.3573


  10. J. Fiurasek, Phys. Rev. A 64, 024102 (2001). doi:10.1103/PhysRevA.64.024102

  11. G.M. D’Ariano, P.L. Presti, Phys. Rev. Lett. 86, 4195 (2001). doi:10.1103/PhysRevLett.86.4195


  12. F. Bloch, Phys. Rev. 70, 460 (1946). doi:10.1103/PhysRev.70.460

  13. E. Bagan, M. Baig, R. Muñoz-Tapia, A. Rodriguez, Phys. Rev. A 69, 010304(R) (2004). doi:10.1103/PhysRevA.69.010304

  14. G. Kimura, Phys. Lett. A 314, 339 (2003). doi:10.1016/S0375-9601(03)00941-1

  15. M.S. Byrd, N. Khaneja, Phys. Rev. A 68, 062322 (2003). doi:10.1103/PhysRevA.68.062322

  16. U. Fano, Rev. Mod. Phys. 29, 74 (1957). doi:10.1103/RevModPhys.29.74

  17. R. Schack, T.A. Brun, C.M. Caves, Phys. Rev. A 64, 014305 (2001). doi:10.1103/PhysRevA.64.014305

  18. C.A. Fuchs, R. Schack, P.F. Scudo, Phys. Rev. A 69, 062305 (2004). doi:10.1103/PhysRevA.69.062305

  19. V. Bužek, G. Drobný, J. Mod. Opt. 47, 2823 (2000). doi:10.1080/09500340008232199


  20. S.T. Flammia, D. Gross, Y.K. Liu, J. Eisert, New J. Phys. 14, 095022 (2012). doi:10.1088/1367-2630/14/9/095022


  21. R. Blume-Kohout, arXiv:1202.5270 [quant-ph] (2012).


  22. M. Christandl, R. Renner, Phys. Rev. Lett. 109, 120403 (2012). doi:10.1103/PhysRevLett.109.120403


  23. A.J. Scott, J. Phys. A: Math. Gen. 39, 13507 (2006). doi:10.1088/0305-4470/39/43/009

  24. H. Zhu, B.G. Englert, Phys. Rev. A 84, 022327 (2011). doi:10.1103/PhysRevA.84.022327

  25. M.D. de Burgh, N.K. Langford, A.C. Doherty, A. Gilchrist, Phys. Rev. A 78, 052122 (2008). doi:10.1103/PhysRevA.78.052122

  26. T. Sugiyama, P.S. Turner, M. Murao, Phys. Rev. A 85, 052107 (2012). doi:10.1103/PhysRevA.85.052107

  27. H. Chernoff, Ann. Math. Stat. 25, 573 (1954). doi:10.1214/aoms/1177728725

  28. S.G. Self, K.Y. Liang, J. Am. Stat. Assoc. 82, 605 (1987). doi:10.1080/01621459.1987.10478472


  29. M. Abramowitz, I.A. Stegun (eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (Wiley, New York, 1972)


  30. J.A. Smolin, J.M. Gambetta, G. Smith, Phys. Rev. Lett. 108, 070502 (2012). doi:10.1103/PhysRevLett.108.070502


  31. S.M. Tan, J. Mod. Opt. 44, 2233 (1997). doi:10.1080/09500349708231881

  32. C.W. Helstrom, Quantum Detection and Estimation Theory (Academic, New York, 1976)


  33. A.S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (North-Holland, New York, 1982)


  34. M. Hayashi (ed.), Asymptotic Theory of Quantum Statistical Inference: Selected Papers (World Scientific, Singapore, 2005)


  35. K. Vogel, H. Risken, Phys. Rev. A 40, 2847 (1989). doi:10.1103/PhysRevA.40.2847

  36. T.M. Buzug, Computed Tomography: From Photon Statistics to Modern Cone-Beam CT (Springer, Berlin, 2008)


  37. M. Paris, J. Řeháček (eds.), Quantum State Estimation. Lecture Notes in Physics (Springer, Berlin, 2004)


  38. A. Ling, K.P. Soh, A. Lamas-Linares, C. Kurtsiefer, Phys. Rev. A 74, 022309 (2006). doi:10.1103/PhysRevA.74.022309

  39. H. Kosaka, T. Inagaki, Y. Rikitake, H. Imamura, Y. Mitsumori, K. Edamatsu, Nature 457, 702 (2009). doi:10.1038/nature07729

  40. M. Steffen, M. Ansmann, R. McDermott, N. Katz, R.C. Bialczak, E. Lucero, Phys. Rev. Lett. 97, 050502 (2006). doi:10.1103/PhysRevLett.97.050502


  41. M. Neeley, M. Ansmann, R.C. Bialczak, M. Hofheinz, N. Katz, E. Lucero, A. O’Connell, H. Wang, A.N. Cleland, J.M. Martinis, Nature Phys. 4, 523 (2008). doi:10.1038/nphys972

  42. M. Hofheinz, H. Wang, M. Ansmann, R.C. Bialczak, E. Lucero, M. Neeley, A.D. O’Connell, D. Sank, J. Wenner, J.M. Martinis, A.N. Cleland, Nature 459, 546 (2009). doi:10.1038/nature08005

  43. D. Leibfried, D.M. Meekhof, B.E. King, C. Monroe, W.M. Itano, D.J. Wineland, Phys. Rev. Lett. 77, 4281 (1996). doi:10.1103/PhysRevLett.77.4281


  44. S. Olmschenk, D.N. Matsukevich, P. Maunz, D. Hayes, L.M. Duan, C. Monroe, Science 323, 486 (2009). doi:10.1126/science.1167209

  45. H. Tanji, S. Ghosh, J. Simon, B. Bloom, V. Vuletic, Phys. Rev. Lett. 103, 043601 (2009). doi:10.1103/PhysRevLett.103.043601


  46. P. Neumann, N. Mizuochi, F. Rempp, P. Hemmer, H. Watanabe, S. Yamasaki, V. Jacques, T. Gaebel, F. Jelezko, J. Wrachtrup, Science 320, 1326 (2008). doi:10.1126/science.1157233

  47. T.J. Dunn, I.A. Walmsley, S. Mukamel, Phys. Rev. Lett. 74, 884 (1995). doi:10.1103/PhysRevLett.74.884


  48. M.A. Nielsen, E. Knill, R. Laflamme, Nature 396, 52 (1998). doi:10.1038/23891

  49. E. Skovsen, H. Stapelfeldt, S. Juhl, K. Molmer, Phys. Rev. Lett. 91, 090406 (2003). doi:10.1103/PhysRevLett.91.090406


  50. R.A. Horn, C.R. Johnson, Matrix Analysis (Cambridge University Press, New York, 1985)


  51. T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. Wiley Series in Telecommunications and Signal Processing (Wiley-Interscience, New York, 2006)


  52. I. Bengtsson, K. Życzkowski, Geometry of Quantum States (Cambridge University Press, Cambridge, 2006)


  53. J.M. Borwein, A.S. Lewis, Convex Analysis and Nonlinear Optimization. CMS Books in Mathematics (Springer, New York, 2006)


  54. N.J. Higham, Accuracy and Stability of Numerical Algorithms (SIAM, Philadelphia, 2002)


  55. T. Sugiyama, P.S. Turner, M. Murao, Phys. Rev. Lett. 111, 160406 (2013). doi:10.1103/PhysRevLett.111.160406


  56. T. Sugiyama, P.S. Turner, M. Murao, New J. Phys. 14, 085005 (2012). doi:10.1088/1367-2630/14/8/085005



Author information

Correspondence to Takanori Sugiyama.

Appendices

Appendix A: Linear Algebra and Convex Analysis

In this appendix, we give mathematical supplements. In Sect. A.1, we review terminology from vector and matrix theory and known inequalities between different norms. In Sect. A.2, we review a known property of projections in convex analysis.

A.1 Vectors and Matrices

A.1.1 Classes of Matrices

Let \(M_{m,n}\) denote the set of all complex matrices with \(m\) rows and \(n\) columns. When \(m=n\), the elements of \(M_{m,m}\) are called square matrices.

  1.

    Square matrix

    Suppose that a matrix \(A\) is square. Let \(A^{\dagger }\) denote the Hermitian conjugate of \(A\). When the matrix \(A\) satisfies \(A^{\dagger }A = A A^{\dagger }\), it is called normal. Normality is the necessary and sufficient condition for \(A\) to be diagonalizable by a unitary matrix. When \(A\) is normal, it can be represented as

    $$\begin{aligned} A = \sum _{i=1}^{m} a_{i} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }, \end{aligned}$$
    (5.150)

    where \(\varvec{v}_{i}\) are normalized vectors in \(\mathbb {C}^{m}\) that are orthogonal to each other. These vectors are called the eigenvectors of \(A\), and the complex values \(a_{i}\) are called the eigenvalues of \(A\). When a square matrix \(A\) satisfies \(A = A^{\dagger }\), it is called Hermitian. Hermitian matrices are normal, and their eigenvalues are real. When all eigenvalues of \(A\) are positive, it is called a positive matrix. When the eigenvalues are nonnegative, it is called a positive semidefinite matrix.

    Let \(f\) denote a function from \(\mathbb {C}\) to \(\mathbb {C}\). Let \(A\) denote a normal matrix and \(A=\sum _{i=1}^{m}a_{i}\varvec{v}_{i}\varvec{v}_{i}^{\dagger }\) be the diagonalized form. We define the action of \(f\) on \(A\) by

    $$\begin{aligned} f(A):= \sum _{i=1}^{m} f(a_{i}) \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$
    (5.151)

    We use the following notation.

    $$\begin{aligned} |A|:= \sum _{i=1}^{m} |a_{i}| \varvec{v}_{i} \varvec{v}_{i}^{\dagger }, \end{aligned}$$
    (5.152)
    $$\begin{aligned} \sqrt{A}:= \sum _{i=1}^{m} \sqrt{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$
    (5.153)

    When all eigenvalues of \(A\) are nonzero, we define the inverse matrix of \(A\) by

    $$\begin{aligned} A^{-1}:= \sum _{i=1}^{m} \frac{1}{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$
    (5.154)

    When the inverse matrix of \(A\) exists, \(A\) is called invertible or regular. The inverse matrix satisfies \(A A^{-1} = A^{-1}A=I_{m}\), where \(I_{m}\) is the identity matrix on \(\mathbb {C}^{m}\).

    When \(A\) has zero eigenvalues, we cannot define the inverse matrix and \(A\) is called irregular. In this case, we define a matrix \(A^{-}\) by

    $$\begin{aligned} A^{-} := \sum _{i; a_{i}\ne 0} \frac{1}{a_{i}} \varvec{v}_{i} \varvec{v}_{i}^{\dagger }. \end{aligned}$$
    (5.155)

    This \(A^{-}\) is the Moore-Penrose generalized inverse for normal matrices. (A numerical sketch of these constructions is given at the end of Sect. A.1.1.)

  2.

    Non-square matrix

    Next, we consider the case of \(m\ne n\). For a given non-square matrix \(A \in M_{m,n}\), we define the left-inverse matrix \(A^{-1}_\mathrm{{left}}\in M_{n,m}\) and right-inverse matrix \(A^{-1}_\mathrm{{right}}\in M_{n,m}\) by

    $$\begin{aligned} A^{-1}_\mathrm{{left}} A = I_{n}, \end{aligned}$$
    (5.156)
    $$\begin{aligned} A A^{-1}_\mathrm{{right}} = I_{m}. \end{aligned}$$
    (5.157)

    We define the rank of \(A\) as the number of linearly independent row vectors of \(A\) and use the notation \(\mathrm{{rank}}(A)\). When \(\mathrm{{rank}}(A) = \mathrm{{min}} \{m,n\}\) holds, \(A\) is called full-rank; otherwise it is called rank-deficient. Suppose that \(A\) is full-rank. When \(m>n\), the left-inverse matrix \(A^{-1}_\mathrm{{left}}\) exists, and when \(m<n\), the right-inverse matrix \(A^{-1}_\mathrm{{right}}\) exists. When \(A\) is rank-deficient, neither a left-inverse nor a right-inverse matrix of \(A\) exists.
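
As a concrete illustration of the constructions in this subsection, the following is a minimal numerical sketch, assuming NumPy; the function names are ours. It builds matrix functions and the Moore-Penrose inverse of a Hermitian matrix from the eigendecomposition, Eqs. (5.151)–(5.155), and computes the left inverse of a full-rank tall matrix via \((A^{T}A)^{-1}A^{T}\), the same construction used for \(\varLambda ^{-1}_\mathrm{{left}}\) in Sect. B.1.

```python
import numpy as np

def matrix_function(A, f):
    """Apply a scalar function f to a Hermitian matrix A, Eq. (5.151)."""
    a, V = np.linalg.eigh(A)               # real eigenvalues, orthonormal eigenvectors
    return (V * f(a)) @ V.conj().T         # sum_i f(a_i) v_i v_i^dagger

def pseudo_inverse(A, tol=1e-12):
    """Moore-Penrose inverse, Eq. (5.155): invert only the nonzero eigenvalues."""
    a, V = np.linalg.eigh(A)
    inv_a = np.where(np.abs(a) > tol, 1.0 / np.where(np.abs(a) > tol, a, 1.0), 0.0)
    return (V * inv_a) @ V.conj().T

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])                 # Hermitian, eigenvalues 2 and 0 (irregular)
absA = matrix_function(A, np.abs)          # |A|,     Eq. (5.152)
sqrtA = matrix_function(A, np.sqrt)        # sqrt(A), Eq. (5.153)
Aminus = pseudo_inverse(A)                 # A^-,     Eq. (5.155)
print(np.allclose(A @ Aminus @ A, A))      # True: defining property of A^-

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))            # tall (m > n), generically full-rank
B_left = np.linalg.inv(B.T @ B) @ B.T      # left inverse, Eq. (5.156)
print(np.allclose(B_left @ B, np.eye(3)))  # True
```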

A.1.2 Inhomogeneous Equations

Let \(A\) be a real \((m,n)\)-matrix and \(\varvec{v}\in \mathbb {R}^{n}\), \(\varvec{w}\in \mathbb {R}^{m}\). When \(\varvec{w}\) is not \(\varvec{0}\), an equation

$$\begin{aligned} A \varvec{v} = \varvec{w} \end{aligned}$$
(5.158)

is called an inhomogeneous equation. We consider the inverse problem of finding, for given \(A\) and \(\varvec{w}\), a vector \(\varvec{v}\) satisfying Eq. (5.158). We define the image of \(A\) by

$$\begin{aligned} {\mathrm{{Im}}} (A):= \{A \varvec{v}^{\prime } | \varvec{v}^{\prime } \in \mathbb {R}^{n} \} \subset \mathbb {R}^{m}. \end{aligned}$$
(5.159)

When \(\varvec{w}\in {\mathrm{{Im}}} (A)\), solutions of Eq. (5.158) exist; otherwise they do not. We define an augmented matrix of \(A\) with \(\varvec{w}\) by

$$\begin{aligned}{}[A|\varvec{w}] := \left( \begin{array}{ccc|c} A_{1,1} &{} \cdots &{} A_{1,n} &{} w_{1} \\ \vdots &{} \ddots &{} \vdots &{} \vdots \\ A_{m,1} &{} \cdots &{} A_{m,n} &{} w_{m} \end{array} \right) . \end{aligned}$$
(5.160)

Then the following theorem holds.

Theorem 5.12

The following statements are equivalent.

  1. (i)

    For a given \(A\) and \(\varvec{w}\), Eq. (5.158) has solutions.

  2. (ii)

    \(\varvec{w} \in {\mathrm{{Im}}} (A)\).

  3. (iii)

    The ranks of \(A\) and the augmented matrix are the same, i.e.,

    $$\begin{aligned} \mathrm{{rank}}(A) = \mathrm{{rank}}([A|\varvec{w}]). \end{aligned}$$
    (5.161)

When solutions exist, they have \(\bigl (n - \mathrm{{rank}}(A)\bigr )\) degrees of freedom.

We summarize the contents of this subsection; a numerical check of the solvability criterion is given after the list. Suppose that \(A\in M_{m,n}\) and \(\varvec{w}\in \mathbb {R}^{m}\) are given.

  • If and only if statement (ii) (equivalently, (iii)) is not satisfied, Eq. (5.158) has no solutions.

  • If and only if statement (ii) (equivalently, (iii)) is satisfied, Eq. (5.158) has solutions.

  • If Eq. (5.158) has solutions, they have \(\bigl (n - \mathrm{{rank}}(A)\bigr )\) degrees of freedom.

  • Suppose that Eq. (5.158) has a solution and \(n > \mathrm{{rank}}(A)\) holds. Then the solution is not unique.

  • Suppose that Eq. (5.158) has a solution and \(n = \mathrm{{rank}}(A)\) holds. Then the solution is unique.

    • When \(n=m\), \(A\) is full-rank and invertible, and the solution is given as \(A^{-1}\varvec{w}\).

    • When \(m>n\), \(A\) is full-rank and the left-inverse matrix exists. The solution is given as \(A^{-1}_\mathrm{{left}}\varvec{w}\).
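
The following small sketch, assuming NumPy (the example matrix and vectors are ours), checks the solvability criterion of Theorem 5.12 numerically.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])                 # rank 2, m = 3 > n = 2
w_in = A @ np.array([1.0, -1.0])           # w in Im(A): solvable
w_out = np.array([1.0, 0.0, 0.0])          # w not in Im(A): unsolvable

for w in (w_in, w_out):
    rank_A = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, w]))
    print(rank_A == rank_aug)              # True, then False (Theorem 5.12 (iii))

# Here rank(A) = n, so the solution, when it exists, is unique and is
# recovered by the left inverse: (A^T A)^{-1} A^T w_in gives (1, -1).
```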

A.1.3 Norms and Loss Functions

We follow the definitions of norms in [50]. A numerical check of the inequalities below is given at the end of this subsection.

  • Vector norms

    A function \(\Vert \cdot \Vert : \mathbb {C}^{k} \rightarrow \mathbb {R}\) is a vector norm if for all \(\varvec{v},\ \varvec{w}\in \mathbb {C}^{k}\),

    1. (i)

      (Non-negativity) \(\Vert \varvec{v} \Vert \ge 0\).

    2. (ii)

      (Positivity) \(\Vert \varvec{v} \Vert =0\) if and only if \(\varvec{v}=\varvec{0}\).

    3. (iii)

      (Homogeneity) \(\Vert c \varvec{v} \Vert = |c|\cdot \Vert \varvec{v} \Vert \) for all scalars \(c\in \mathbb {C}\).

    4. (iv)

      (Triangle inequality) \(\Vert \varvec{v} + \varvec{w}\Vert \le \Vert \varvec{v} \Vert + \Vert \varvec{w} \Vert \).

    We introduce three representative vector norms:

    1.

      \(\ell _{1}\)-norm (the sum norm) \(\Vert \varvec{v} \Vert _{1} := \sum _{i=1}^{k} |v_{i}|\).

    2.

      \(\ell _{2}\)-norm (the Euclidean norm) \(\Vert \varvec{v} \Vert _{2} := \sqrt{\sum _{i=1}^{k} |v_{i}|^{2}}\).

    3.

      \(\ell _{\infty }\)-norm (the max norm) \(\Vert \varvec{v} \Vert _{\infty } := \max _{i=1, \ldots , k} |v_{i}|\).

    These vector norms satisfy the following inequalities:

    $$\begin{aligned} \Vert \varvec{v} \Vert _{2} \le \Vert \varvec{v} \Vert _{1} \le \sqrt{k} \Vert \varvec{v} \Vert _{2}, \end{aligned}$$
    (5.162)
    $$\begin{aligned} \Vert \varvec{v} \Vert _{\infty } \le \Vert \varvec{v} \Vert _{1} \le k \Vert \varvec{v} \Vert _{\infty }, \end{aligned}$$
    (5.163)
    $$\begin{aligned} \Vert \varvec{v} \Vert _{\infty } \le \Vert \varvec{v} \Vert _{2} \le \sqrt{k} \Vert \varvec{v} \Vert _{\infty }. \end{aligned}$$
    (5.164)

    For any probability distributions \(\varvec{p}\) and \(\varvec{q}\), the \(\ell _{1}\)-norm and the Kullback-Leibler divergence \(K(\varvec{p} \Vert \varvec{q})\) (in nats) satisfy the following inequality [51]:

    $$\begin{aligned} ( \Vert \varvec{p} - \varvec{q} \Vert _{1} )^{2} \le 2 K ( \varvec{p} \Vert \varvec{q}). \end{aligned}$$
    (5.165)

    Equation (5.165) is called Pinsker’s inequality.

  • Matrix norms

    A function \(|||\cdot |||: M_{k,k} \rightarrow \mathbb {R}\) is a matrix norm if for all \(A,\ B \in M_{k,k}\),

    1. (i)

      (Non-negativity) \(|||A |||\ge 0\).

    2. (ii)

      (Positivity) \(|||A |||=0\) if and only if \(A=O\).

    3. (iii)

      (Homogeneity) \(|||c A |||= |c| \cdot |||A |||\) for all scalars \(c\in \mathbb {C}\).

    4. (iv)

      (Triangle inequality) \(|||A + B |||\le |||A |||+ |||B |||\).

    5. (v)

      (Submultiplicativity) \(|||AB |||\le |||A |||\cdot |||B |||\).

    We introduce two representative matrix norms:

    1.

      Trace norm \(|||A |||_\mathrm{{tr}} := \mathrm{tr}[ |A| ]\).

    2.

      Hilbert-Schmidt norm (the Frobenius norm) \(|||A |||_\mathrm{{HS}} := \mathrm{tr}[ A^{\dagger } A ]^{1/2}\).

    These matrix norms satisfy the following inequalities:

    $$\begin{aligned} |||A |||_\mathrm{{HS}} \le |||A |||_\mathrm{{tr}} \le \sqrt{\mathrm{{rank}}(A)} |||A |||_\mathrm{{HS}}. \end{aligned}$$
    (5.166)

    The Hilbert-Schmidt distance between two density matrices is defined by

    $$\begin{aligned} \varDelta ^\mathrm{{HS}} (\hat{\rho } , \hat{\rho }^{\prime }) := \frac{1}{\sqrt{2}} |||\hat{\rho } - \hat{\rho }^{\prime } |||_\mathrm{{HS}}. \end{aligned}$$
    (5.167)

    The normalization factor makes the maximal value one. In the generalized Bloch parametrization, we have

    $$\begin{aligned} \varDelta ^\mathrm{{HS}} (\hat{\rho }(\varvec{s}), \hat{\rho }(\varvec{s}^{\prime })) = \frac{1}{2} \Vert \varvec{s} - \varvec{s}^{\prime } \Vert _{2}. \end{aligned}$$
    (5.168)
  • Fidelity, infidelity, Bures distance

    In quantum information science, one of the most popular evaluation functions is fidelity. For two density matrices \(\hat{\rho }\) and \(\hat{\rho }^{\prime }\), the fidelity is defined by

    $$\begin{aligned} f (\hat{\rho } , \hat{\rho }^{\prime }) := \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ]^{2}. \end{aligned}$$
    (5.169)

    Fidelity satisfies \(f(\hat{\rho }, \hat{\rho })=1\), so it is not a loss function. The square-root \(\sqrt{f(\hat{\rho } , \hat{\rho }^{\prime })}\) is called the root fidelity. The infidelity is defined by

    $$\begin{aligned} \varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })&:= 1 - \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ]^{2} \end{aligned}$$
    (5.170)
    $$\begin{aligned}&= 1 - f(\hat{\rho } , \hat{\rho }^{\prime }). \end{aligned}$$
    (5.171)

    Infidelity is a loss function but it is not a distance. The Bures distance is defined by

    $$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime })&:= \sqrt{1 - \mathrm{Tr}\Bigl [ \sqrt{ \sqrt{\hat{\rho }} \hat{\rho }^{\prime } \sqrt{\hat{\rho }}} \Bigr ] }\end{aligned}$$
    (5.172)
    $$\begin{aligned}&= \sqrt{1 - \sqrt{f (\hat{\rho } , \hat{\rho }^{\prime })} }. \end{aligned}$$
    (5.173)

    For the trace distance, Bures distance, and infidelity, the following inequalities hold [52]:

    $$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime }) \le \sqrt{\varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })} \le \sqrt{2} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime }),\end{aligned}$$
    (5.174)
    $$\begin{aligned} \varDelta ^\mathrm{{B}} (\hat{\rho } , \hat{\rho }^{\prime })^{2} \le \varDelta ^\mathrm{{T}} (\hat{\rho } , \hat{\rho }^{\prime }) \le \sqrt{\varDelta ^\mathrm{{IF}} (\hat{\rho } , \hat{\rho }^{\prime })}. \end{aligned}$$
    (5.175)
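
The following is a small numerical check, assuming NumPy; the helper functions and example states are ours, and the trace distance is taken as half the trace norm. It verifies the vector-norm inequalities (5.162)–(5.164), Pinsker's inequality (5.165), the matrix-norm inequality (5.166), and the inequalities (5.174)–(5.175) on concrete instances.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a Hermitian positive semidefinite matrix, Eq. (5.153)."""
    a, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(a, 0.0, None))) @ V.conj().T

def trace_norm(A):                          # for Hermitian A
    return np.sum(np.abs(np.linalg.eigvalsh(A)))

def hs_norm(A):
    return np.sqrt(np.trace(A.conj().T @ A).real)

def fidelity(r1, r2):                       # Eq. (5.169)
    s = sqrtm_psd(r1)
    return np.trace(sqrtm_psd(s @ r2 @ s)).real ** 2

rng = np.random.default_rng(1)
k = 4
v = rng.standard_normal(k) + 1j * rng.standard_normal(k)
n1, n2, ninf = np.sum(np.abs(v)), np.linalg.norm(v), np.max(np.abs(v))
assert n2 <= n1 <= np.sqrt(k) * n2          # Eq. (5.162)
assert ninf <= n1 <= k * ninf               # Eq. (5.163)
assert ninf <= n2 <= np.sqrt(k) * ninf      # Eq. (5.164)

p = rng.random(k); p /= p.sum()             # two probability distributions
q = rng.random(k); q /= q.sum()
assert np.sum(np.abs(p - q)) ** 2 <= 2 * np.sum(p * np.log(p / q))  # Eq. (5.165)

rho1 = np.array([[0.9, 0.1], [0.1, 0.1]], dtype=complex)  # one-qubit states
rho2 = 0.5 * np.eye(2, dtype=complex)
D = rho1 - rho2
assert hs_norm(D) <= trace_norm(D) <= np.sqrt(2) * hs_norm(D)       # Eq. (5.166)
f = fidelity(rho1, rho2)
dT = 0.5 * trace_norm(D)                    # trace distance
dB = np.sqrt(1.0 - np.sqrt(f))              # Bures distance, Eq. (5.172)
dI = 1.0 - f                                # infidelity,     Eq. (5.170)
assert dB <= np.sqrt(dI) <= np.sqrt(2) * dB # Eq. (5.174)
assert dB ** 2 <= dT <= np.sqrt(dI)         # Eq. (5.175)
print("all inequalities hold")
```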

A.2 Projection in Convex Analysis

A subset \(R \subseteq \mathbb {R}^{k}\) is called convex if for all \(\varvec{r},\ \varvec{r}^{\prime } \in R\) and all \(p\in [0,1]\), \(p\varvec{r} + (1-p)\varvec{r}^{\prime }\) is included in \(R\). Let \(R\) be a non-empty, closed, and convex set in \(\mathbb {R}^{k}\) and \(\varvec{t}\) be a vector in \(\mathbb {R}^{k}\). The vector \(\varvec{t}\) is not necessarily included in \(R\). We define the projection of \(\varvec{t}\) onto \(R\) by

$$\begin{aligned} \varvec{P}_{R} (\varvec{t}) := \mathop {\mathrm{argmin}}\limits _{\varvec{r} \in R} \Vert \varvec{t} - \varvec{r} \Vert _{2}. \end{aligned}$$
(5.176)

Obviously, \(\varvec{P}_{R} (\varvec{t})=\varvec{t}\) holds for any \(\varvec{t} \in R\).

Theorem 5.13

(Non-expandability of projections [53])

Suppose that \(R\) is a non-empty, closed, and convex set in \(\mathbb {R}^{k}\). Then for any \(\varvec{t}, \varvec{t}^{\prime } \in \mathbb {R}^{k}\),

$$\begin{aligned} \Vert \varvec{P}_{R}(\varvec{t}) - \varvec{P}_{R}(\varvec{t}^{\prime }) \Vert _{2} \le \Vert \varvec{t} - \varvec{t}^{\prime } \Vert _{2} \end{aligned}$$
(5.177)

holds.

Theorem 5.13 indicates that projection onto a convex set never expands the distance between two vectors in Euclidean space. For \(\varvec{t}^{\prime }\in R\), we have

$$\begin{aligned} \Vert \varvec{P}_{R}(\varvec{t}) - \varvec{t}^{\prime } \Vert _{2} \le \Vert \varvec{t} - \varvec{t}^{\prime } \Vert _{2}. \end{aligned}$$
(5.178)
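
For the simplest convex set appearing in this thesis, the one-qubit Bloch ball \(B_{2}\), i.e., the Euclidean unit ball, the projection of Eq. (5.176) has the closed form \(\varvec{t}/\max (1, \Vert \varvec{t}\Vert _{2})\), and Theorem 5.13 can be checked directly. A minimal sketch, assuming NumPy:

```python
import numpy as np

def project_ball(t):
    """Euclidean projection onto the unit ball: Eq. (5.176) with R = B_2."""
    n = np.linalg.norm(t)
    return t if n <= 1.0 else t / n

rng = np.random.default_rng(2)
t1 = 2.0 * rng.standard_normal(3)
t2 = 2.0 * rng.standard_normal(3)
lhs = np.linalg.norm(project_ball(t1) - project_ball(t2))
rhs = np.linalg.norm(t1 - t2)
print(lhs <= rhs + 1e-12)   # True: non-expandability, Eq. (5.177)
```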

Appendix B: Supplement for Sect. 5.3.2

In this appendix, we give supplements for Sect. 5.3.2. In Sect. B.1, we derive the upper bound on error probability for \(k\)-qubit state tomography using Pauli measurements with detection losses. In Sects. B.2 and B.3, we explain a way of evaluating the effect of systematic and numerical errors on the error threshold and upper bound. In Sect. B.4, we derive the upper bound on error probability for the constrained least squares estimator, which was introduced in Sect. 5.3.2, and compare its performance to that of the \(\ell _{2}\)-eNM estimator.

B.1 Proof of Eq. (5.130)

Suppose that we prepare \(N\) identical copies of \(\hat{\rho } \in \fancyscript{S} ( (\mathbb {C}^{2})^{\otimes k})\) and make the three Pauli measurements with detection efficiency \(\eta \) on each qubit. The POVMs describing the ideal Pauli measurements on each qubit, \(\varvec{\varPi }^{(i)} = \{ \hat{\varPi }^{(i)}_{+1}, \hat{\varPi }^{(i)}_{-1} \}\), are given as

$$\begin{aligned} \hat{\varPi }^{(i)}_{\pm 1}&:= \frac{1}{2} \left( \hat{\mathbb {1}} \pm \varvec{e}_{i} \cdot \varvec{\sigma } \right) , \end{aligned}$$
(5.179)

where \(i=1,2,3\),

$$\begin{aligned} \varvec{e}_{1} := \left( \begin{array}{c} 1 \\ 0 \\ 0 \\ \end{array} \right) , \ \varvec{e}_{2} := \left( \begin{array}{c} 0 \\ 1 \\ 0 \\ \end{array} \right) , \ \varvec{e}_{3} := \left( \begin{array}{c} 0 \\ 0 \\ 1 \\ \end{array} \right) , \end{aligned}$$
(5.180)

and

$$\begin{aligned} \hat{\sigma }_{1} := \left( \begin{array}{cc} 0 &{} 1 \\ 1 &{} 0 \end{array} \right) , \ \hat{\sigma }_{2} := \left( \begin{array}{cc} 0 &{} -i \\ i &{} 0 \end{array} \right) , \ \hat{\sigma }_{3} := \left( \begin{array}{cc} 1 &{} 0\\ 0 &{} -1 \end{array} \right) . \end{aligned}$$
(5.181)

When the measurements have detection loss, the corresponding POVMs, \(\varvec{\varPi }^{\eta , (i)} = \{ \hat{\varPi }^{\eta , (i)}_{+1}, \hat{\varPi }^{\eta , (i)}_{-1}, \hat{\varPi }^{\eta , (i)}_{0} \}\), are given as

$$\begin{aligned} \hat{\varPi }^{\eta , (i)}_{\pm 1}&:= \frac{\eta }{2} \left( \hat{\mathbb {1}} \pm \varvec{e}_{i} \cdot \varvec{\sigma } \right) \end{aligned}$$
(5.182)
$$\begin{aligned} \hat{\varPi }^{\eta , (i)}_{0}&:= (1 - \eta ) \hat{\mathbb {1}}, \end{aligned}$$
(5.183)

where \(\eta \) is the detection efficiency, taking values between \(0\) and \(1\). The outcome “\(0\)” corresponds to no detection in that measurement trial. When we perform the imperfect Pauli measurements on each qubit, the POVM on \(k\) qubits is given as

$$\begin{aligned} \varvec{\varPi }^{\eta , (\varvec{i})} := \otimes _{q=1}^{k}\varvec{\varPi }^{\eta , (i_{q})}, \end{aligned}$$
(5.184)

where \(\varvec{i} = \{i_{q} \}_{q=1}^{k}\) and \(i_{q} = 1, 2, 3\). The label of the different POVMs, \(\varvec{i}\), corresponds to \(j\) in this chapter. Suppose that we perform each measurement described by \(\varvec{\varPi }^{\eta , (\varvec{i})}\) equally \(n:= N / 3^{k}\) times.
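
A minimal sketch of these POVMs, assuming NumPy (the helper names are ours): the single-qubit elements of Eqs. (5.182)–(5.183) are combined into the tensor-product POVM of Eq. (5.184), and completeness is checked.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
paulis = [s1, s2, s3]

def lossy_pauli_povm(i, eta):
    """POVM {Pi_+1, Pi_-1, Pi_0} for direction i = 1, 2, 3, Eqs. (5.182)-(5.183)."""
    s = paulis[i - 1]
    return [eta / 2 * (I2 + s), eta / 2 * (I2 - s), (1 - eta) * I2]

def kqubit_povm(directions, eta):
    """Tensor-product POVM of Eq. (5.184) for directions (i_1, ..., i_k)."""
    povm = [np.eye(1, dtype=complex)]
    for i in directions:
        povm = [np.kron(E, F) for E in povm for F in lossy_pauli_povm(i, eta)]
    return povm

elems = kqubit_povm((1, 3), eta=0.9)          # a 2-qubit example with 9 elements
print(np.allclose(sum(elems), np.eye(4)))     # True: the POVM is complete
```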

Let us choose \(\varvec{\lambda }\) to be the set of tensor products of Pauli and identity matrices with the normalization factor \(1 / \sqrt{2^{k-1}}\), i.e.,

$$\begin{aligned} \hat{\lambda }_{\varvec{\beta }} := \frac{1}{\sqrt{2^{k-1}}}\otimes _{q=1}^{k} \hat{\sigma }_{\beta _{q}}, \end{aligned}$$
(5.185)

where \(\varvec{\beta } := \{ \beta _{q} \}_{q=1}^{k}\) and \(\beta _{q} = 0, 1, 2, 3\). We exclude from \(\varvec{\beta }\) the case in which all \(\beta _{q}\) are \(0\). The label of the matrices, \(\varvec{\beta }\), corresponds to \(\alpha \) in this chapter. Using this \(\varvec{\lambda }\), any density matrix is represented as

$$\begin{aligned} \hat{\rho } = \frac{1}{2^{k}} \hat{\mathbb {1}} + \frac{1}{2} \varvec{\lambda } \cdot \varvec{s}, \end{aligned}$$
(5.186)

where

$$\begin{aligned} s_{\varvec{\beta }}&= \mathrm{Tr}\left[ \hat{\rho } \hat{\lambda }_{\varvec{\beta }} \right] \end{aligned}$$
(5.187)
$$\begin{aligned}&= \frac{1}{\sqrt{2^{k-1}}}\mathrm{Tr}\left[ \hat{\rho } \left( \otimes _{q=1}^{k} \hat{\sigma }_{\beta _{q}} \right) \right] . \end{aligned}$$
(5.188)

Equation (5.188) indicates that the parameter \(s_{\varvec{\beta }}\) is the expectation of a tensor product of ideal Pauli and identity matrices.
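
The following sketch, assuming NumPy (function names ours), computes the generalized Bloch vector of Eqs. (5.185)–(5.188) for a small example.

```python
import numpy as np
from itertools import product

s0 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigmas = [s0, s1, s2, s3]

def bloch_vector(rho, k):
    """s_beta = Tr[rho lambda_beta] for all beta != (0, ..., 0), Eq. (5.187)."""
    norm = 1.0 / np.sqrt(2 ** (k - 1))
    s = {}
    for beta in product(range(4), repeat=k):
        if all(b == 0 for b in beta):
            continue                           # the all-identity label is excluded
        lam = np.eye(1, dtype=complex)
        for b in beta:
            lam = np.kron(lam, sigmas[b])
        s[beta] = norm * np.trace(rho @ lam).real
    return s

rho = np.diag([1.0, 0.0, 0.0, 0.0]).astype(complex)  # |00><00|, k = 2
s = bloch_vector(rho, 2)
print(s[(3, 0)], s[(0, 3)], s[(3, 3)])               # each equals 1/sqrt(2)
```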

In \(k\)-qubit state tomography with \(k\ge 2\), we need to be careful about the treatment of multiple uses of the same data. For example, in order to estimate the expectation of \(\hat{\sigma }_{1} \otimes \hat{\mathbb {1}}\) in the \(2\)-qubit case, we use the data of three types of measurements: \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{1}\), \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{2}\), and \(\hat{\sigma }_{1} \otimes \hat{\sigma }_{3}\). Therefore the estimates of the parameters can be dependent even for \(\hat{\rho }^\mathrm{{LLS}}\).

We try to estimate these parameters from a data set of the imperfect Pauli measurements \(\breve{\varvec{\varPi }} := \{ \varvec{\varPi }^{\eta , (\varvec{i})} \}_{\varvec{i}}\). In order to calculate \(c_{\alpha }\), we need to derive a matrix \(B\) satisfying

$$\begin{aligned} \varvec{s} = B ( \varvec{p} - \varvec{a}_{0}). \end{aligned}$$
(5.189)

This matrix \(B\) corresponds to \(\varLambda ^{-1}_\mathrm{{left}}\) in this chapter. Let \(l\) denote the number of \(\hat{\mathbb {1}}\) factors appearing in \(\hat{\lambda }_{\varvec{\beta }}\). The number of \(\hat{\lambda }_{\varvec{\beta }}\) including \(l\) identities is \(3^{k-l} \times \frac{k !}{l! (k-l)!}\). \(\hat{\lambda }_{\varvec{\beta }} = \hat{\mathbb {1}}^{\otimes l} \otimes \left( \otimes _{q=l+1}^{k} \hat{\sigma }_{i_{q}} \right) / \sqrt{2^{k-1}}\) is an example of such \(\hat{\lambda }_{\varvec{\beta }}\). In this case, Eq. (5.188) is rewritten in terms of the probability distributions of the imperfect Pauli measurements as

$$\begin{aligned} s_{\varvec{\beta }}&= \frac{1}{\sqrt{2^{k-1}}} \sum _{m_{l+1}, \ldots , m_{k}; \atop m_{q} \,=\, \pm 1} \left( \prod _{q\,=\,l+1}^{k} m_{q} \right) p( m_{l+1}, \ldots , m_{k} | I^{\otimes l} \nonumber \\&\otimes (\otimes _{q\,=\,l+1}^{k} \varvec{\varPi }^{(i_{q})}), \hat{\rho }) \end{aligned}$$
(5.190)
$$\begin{aligned}&= \frac{1}{\sqrt{2^{k-1}}} \sum _{m_{l+1}, \ldots , m_{k}; \atop m_{q} \,=\, \pm 1} \left( \prod _{q\,=\,l+1}^{k} m_{q} \right) \frac{1}{\eta ^{k-l}}p( m_{l+1}, \ldots , m_{k} | I^{\otimes l} \nonumber \\&\otimes (\otimes _{q\,=\,l+1}^{k} \varvec{\varPi }^{\eta , (i_{q})}), \hat{\rho }) \end{aligned}$$
(5.191)
$$\begin{aligned}&= \frac{1}{\sqrt{2^{k-1}}} \sum _{i_{1}, \ldots , i_{l}; \atop i_{q} \,=\, 1, 2, 3} \sum _{m_{1}, \ldots , m_{l}; \atop m_{q} \,=\, \pm 1, 0} \sum _{m_{l+1}, \ldots , m_{k}; \atop m_{q} \,=\, \pm 1} \left( \prod _{q=l+1}^{k} m_{q} \right) \ \nonumber \\&\frac{1}{\eta ^{k-l}} \frac{1}{3^{l}} p( m_{1}, \ldots , m_{k} | \varvec{\varPi }^{\eta , (\varvec{i})}, \hat{\rho }). \end{aligned}$$
(5.192)

Therefore we have

$$\begin{aligned} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} = \pm \frac{1}{\sqrt{2^{k-1}}} \frac{1}{\eta ^{k-l}} \frac{1}{3^{l}}, \end{aligned}$$
(5.193)

if \(i_{q} = 1, 2, 3\) and \(m_{q} = \pm 1, 0\) for \(q= 1, \ldots , l\), and \(i_{q} = \beta _{q}\) and \(m_{q} = \pm 1\) for \(q = l+1, \ldots , k\). Otherwise \(B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} = 0\). Then for each \(\varvec{\beta }\) and \(\varvec{i}\),

$$\begin{aligned}&\qquad \max \limits _{\varvec{m}} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} - \min \limits _{\varvec{m}} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} \nonumber \\&= \left\{ \begin{array}{ll} \frac{2}{\sqrt{2^{k-1}} \eta ^{k-l} 3^{l}} &{} \text{ if }\ i_{q} = \beta _{q}\ \text{ for }\ q = l+1, \ldots , k\\ 0 &{} \text{ otherwise } \end{array} \right. \end{aligned}$$
(5.194)

holds, and we obtain

$$\begin{aligned} c_{\varvec{\beta }}&= \sum _{\varvec{i}} 3^{k} \left\{ \max _{\varvec{m}} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} - \min _{\varvec{m}} B_{\varvec{\beta }, (\varvec{i}, \varvec{m})} \right\} ^{2} \end{aligned}$$
(5.195)
$$\begin{aligned}&= \sum _{i_{1}, \ldots , i_{l}; \atop i_{q} = 1, 2, 3} 3^{k} \left\{ \frac{2}{\sqrt{2^{k-1}} \eta ^{k-l} 3^{l}} \right\} ^{2} \end{aligned}$$
(5.196)
$$\begin{aligned}&= \frac{3^{k-l}}{2^{k-3} \eta ^{2(k-l)}}. \end{aligned}$$
(5.197)

From the above discussion, we can see that \(c_{\varvec{\beta }}\) takes the same value for different \(\hat{\lambda }_{\varvec{\beta }}\) with the same \(l\). The upper bound on the error probability is calculated as

$$\begin{aligned} \text {P}_{u}^{\varDelta } (\delta , N, \breve{\varvec{\varPi }}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}})&= 2 \sum _{\varvec{\beta }} \exp \left[ - \frac{b}{c_{\varvec{\beta }}} \delta ^{2} N \right] \nonumber \\&= 2 \sum _{l=0}^{k-1} 3^{k-l} \left( {\begin{array}{c}k\\ l\end{array}}\right) \exp \left[ - b \frac{2^{k-3} \eta ^{2(k-l)}}{3^{k-l}} \delta ^{2} N \right] , \end{aligned}$$
(5.198)

where

$$\begin{aligned} b&:= \left\{ \begin{array}{ll} 8 / (d^{2} - 1) &{} \text{ if }\ \varDelta = \varDelta ^\mathrm{{HS}} \\ 16 / d (d^{2} - 1) &{} \text{ if }\ \varDelta = \varDelta ^\mathrm{{T}}\\ 4 / d (d^{2} - 1) &{} \text{ if }\ \varDelta = \varDelta ^\mathrm{{IF}} \end{array} \right. . \end{aligned}$$
(5.199)

When we choose the trace distance as the loss function, we have

$$\begin{aligned} b = \frac{16}{d (d^{2} -1 )} = \frac{1}{2^{k-4} \cdot (2^{2k} - 1)}, \end{aligned}$$
(5.200)

and

$$\begin{aligned} \text {P}_{u}^\mathrm{{T}} (\delta , N, \breve{\varvec{\varPi }}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}) = 2 \sum _{l=0}^{k-1} 3^{k-l} \left( {\begin{array}{c}k\\ l\end{array}}\right) \exp \left[ - \frac{2}{2^{2k} - 1} \frac{\eta ^{2(k-l)}}{3^{k-l}} \delta ^{2} N \right] .\nonumber \\ \end{aligned}$$
(5.201)

In one-qubit (\(k=1\)) and two-qubit (\(k=2\)) cases, we have

$$\begin{aligned} \text {P}_{u}^\mathrm{{T}} (k=1)&= 6 \exp \left[ - \frac{2}{9} \eta ^{2} \delta ^{2} N \right] , \end{aligned}$$
(5.202)
$$\begin{aligned} \text {P}_{u}^\mathrm{{T}} (k=2)&= 18 \exp \left[ - \frac{2}{135} \eta ^{4} \delta ^{2} N \right] + 12 \exp \left[ - \frac{2}{45} \eta ^{2} \delta ^{2} N \right] . \end{aligned}$$
(5.203)
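
As a usage example (our own, assuming NumPy), the sketch below evaluates the bound of Eq. (5.201) and searches for a sample size \(N\) at which it falls below a target error probability.

```python
import numpy as np
from math import comb

def P_u_T(delta, N, k, eta):
    """Upper bound of Eq. (5.201) for k-qubit lossy Pauli tomography."""
    total = 0.0
    for l in range(k):                       # l = number of identity factors
        rate = 2.0 / (2 ** (2 * k) - 1) * eta ** (2 * (k - l)) / 3 ** (k - l)
        total += 2 * 3 ** (k - l) * comb(k, l) * np.exp(-rate * delta ** 2 * N)
    return total

delta, eta, eps = 0.05, 0.9, 0.01
for k in (1, 2):
    N = 1
    while P_u_T(delta, N, k, eta) > eps:     # coarse doubling search
        N *= 2
    print(k, N)                              # N guaranteeing error prob. <= 1%
```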

As shown in the above discussion, when the directions of the Pauli measurements are perfectly orthogonal, it is easy to derive \(c_{\varvec{\beta }}\). When the directions are not orthogonal, we need to calculate \(\varLambda ^{-1}_\mathrm{{left}} = ( \varLambda ^{T} \varLambda )^{-1} \varLambda ^{T}\). Then it becomes more difficult to analyze \(c_{\varvec{\beta }}\) analytically, and we would need to calculate them numerically.

B.2 Effect of Systematic Errors

Theorem 5.9 is valid for any informationally complete POVM and is applicable to cases in which a systematic error exists. However, we must know the mathematical representation of the systematic error exactly in order to rigorously verify a value of the confidence level. This assumption can be unrealistic in some experiments. In this section, we weaken the assumption to a more realistic condition and give a formula for \(\text {P}_{u}^{\varDelta }\) in such a case.

Let \(\breve{\varvec{\varPi }}\) denote the set of POVMs exactly describing the measurements used, and let \(\breve{\varvec{\varPi }}^{\prime } (\ne \breve{\varvec{\varPi }})\) denote a set of POVMs that we mistake for the correct set. We assume that \(\breve{\varvec{\varPi }}\) and \(\breve{\varvec{\varPi }}^{\prime }\) are both informationally complete. Suppose that we do not know \(\breve{\varvec{\varPi }}\), but we know that \(\breve{\varvec{\varPi }}\) is in a known set \(\fancyscript{M}\). For example, consider the case where an experimentalist wants to perform a projective measurement of \(\hat{\sigma }_{1}\). If they can guarantee that their actual measurement direction is prepared within \(0.5\) degrees of the \(x\)-axis and that their detection efficiency is \(0.9\), then \(\fancyscript{M}\) is the set of all POVMs whose measurement direction is within \(0.5\) degrees of the \(x\)-axis and whose detection efficiency is \(0.9\).

For given relative frequencies \(\varvec{\nu }_{N}\), the correct and mistaken eL estimates are

$$\begin{aligned} \varvec{s}^\mathrm{{eL}}_{N}&= \varLambda ^{-1}_\mathrm{{left}} (\breve{\varvec{\varPi }}) \left\{ \varvec{\nu }_{N} - \varvec{a}_{0} \right\} ,\end{aligned}$$
(5.204)
$$\begin{aligned} \varvec{s}^\mathrm{{eL} \prime }_{N}&= \varLambda ^{-1}_\mathrm{{left}} (\breve{\varvec{\varPi }}^{\prime }) \left\{ \varvec{\nu }_{N} - \varvec{a}_{0}^{\prime } \right\} . \end{aligned}$$
(5.205)

Then the actual and mistaken \(\ell _{2}\)-eNM estimates are

$$\begin{aligned} \varvec{s}^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}&= \mathop {\mathrm{argmin}}\limits _{\varvec{s}^{\prime } \in B_{d}} \Vert \varvec{s}^{\prime } - \varvec{s}^\mathrm{{eL}}_{N} \Vert _{2}, \end{aligned}$$
(5.206)
$$\begin{aligned} \varvec{s}^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}&= \mathop {\mathrm{argmin}}\limits _{\varvec{s}^{\prime } \in B_{d}} \Vert \varvec{s}^{\prime } - \varvec{s}^\mathrm{{eL}\prime }_{N} \Vert _{2}. \end{aligned}$$
(5.207)

Let \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) and \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}\) denote the corresponding density matrix estimates. Let us define the size of the systematic error as

$$\begin{aligned} \xi := \max _{\breve{\varvec{\varPi }} \in \fancyscript{M}} \varDelta (\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N} ). \end{aligned}$$
(5.208)

This is a function of \(\varDelta \), \(\varvec{\nu }_{N}\), \(\breve{\varvec{\varPi }}^{\prime }\), and \(\fancyscript{M}\). Then for any \(\hat{\rho } \in \fancyscript{S}(\fancyscript{H})\) and \(\breve{\varvec{\varPi }} \in \fancyscript{M}\),

$$\begin{aligned} \varDelta (\hat{\rho }_{*}, \hat{\rho })&\le \varDelta (\hat{\rho }_{*}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}) + \varDelta (\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N} ) + \varDelta (\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}, \hat{\rho } ) \nonumber \\&\le \varDelta (\hat{\rho }_{*}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}\prime }_{N}) + \xi + \delta \end{aligned}$$
(5.209)

holds with probability at least

$$\begin{aligned} 1 - \max _{\breve{\varvec{\varPi }} \in \fancyscript{M}} \text {P}_{u}^{\varDelta } = 1 - 2 \max _{\breve{\varvec{\varPi }} \in \fancyscript{M}} \sum _{\alpha = 1}^{d^{2} -1} \exp \left[ -\frac{b}{c_{\alpha }} \delta ^{2} N \right] . \end{aligned}$$
(5.210)

Using Eqs. (5.209) and (5.210), we can evaluate the precision of the state preparation, \(\varDelta (\hat{\rho }_{*}, \hat{\rho } )\), without knowing the true state \(\hat{\rho }\) or the true set of POVMs \(\breve{\varvec{\varPi }}\).

B.3 Effect of Numerical Errors

In this section, we analyze the effect of numerical errors and explain a method for evaluating the precision of the state preparation in cases where numerical errors exist.

The \(\ell _{2}\)-eNM estimator \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) requires a nonlinear minimization, which in turn requires the use of a numerical algorithm. Suppose that we choose an algorithm for the minimization and obtain a result \(\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) for a given data set. In practice, there exists a numerical error in the result, and \(\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\) differs from the exact solution \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}\). We cannot obtain the exact solution, but we can guarantee the accuracy of the numerical result with accuracy-guaranteed algorithms [54]. Suppose that we use an algorithm for which \(\varDelta (\hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}) \le \zeta \) is guaranteed. Then

$$\begin{aligned} \varDelta (\hat{\rho }_{*}, \hat{\rho } )&\le \varDelta ( \hat{\rho }_{*}, \hat{\sigma }_{N}^{\ell _{2}\mathrm - \mathrm {eNM}}) + \varDelta ( \hat{\sigma }_{N}^{\ell _{2}\mathrm - \mathrm {eNM}}, \hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}) + \varDelta ( \hat{\rho }_{N}^{\ell _{2}\mathrm - \mathrm {eNM}}, \hat{\rho }) \nonumber \\&\le \varDelta ( \hat{\rho }_{*}, \hat{\sigma }_{N}^{\ell _{2}\mathrm - \mathrm {eNM}}) + \zeta + \delta \end{aligned}$$
(5.211)

holds with probability at least \(1 - \text {P}_{u}^{\varDelta } \). The error threshold is changed from \(\delta \) to \(\zeta + \delta \).

Usually systematic and numerical errors both exist. In such a case, by combining Eqs. (5.209) and (5.211), we can prove that the inequality

$$\begin{aligned} \varDelta (\hat{\rho }_{*}, \hat{\rho }) \le \varDelta (\hat{\rho }_{*}, \hat{\sigma }^{\ell _{2}\mathrm - \mathrm {eNM}}_{N}) + \zeta + \xi + \delta \end{aligned}$$
(5.212)

holds with the probability in Eq. (5.210), where \(\zeta \) is a numerical error threshold for \(\breve{\varvec{\varPi }}^{\prime }\). Therefore, with this modification, Theorem 5.9 applies to cases in which both systematic and numerical errors exist.
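
Combining these statements into a single certified bound is simple arithmetic. A minimal sketch of Eq. (5.212) (the function and its argument names are our own, hypothetical conveniences):

```python
def certified_bound(measured_loss, zeta, xi, delta, P_u):
    """Bound on Delta(rho_*, rho) of Eq. (5.212) and its confidence level."""
    return measured_loss + zeta + xi + delta, 1.0 - P_u

# Example numbers (illustrative only): measured loss 0.02, numerical error
# threshold zeta, systematic error xi, statistical threshold delta.
bound, conf = certified_bound(measured_loss=0.02, zeta=1e-6, xi=0.005,
                              delta=0.05, P_u=0.01)
print(f"Delta(rho_*, rho) <= {bound:.4f} with probability >= {conf:.2f}")
```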

B.4 Error Probability for the Constrained Least Squares Estimator

From Eq. (5.56), the probability distribution of \(\hat{\rho }^\mathrm{{eL}}_{N}\) is the projection of \(\varvec{\nu }_{N}\) onto the set of probability distributions of trace-one Hermitian matrices (\(\{ \varvec{p}(\hat{\sigma } ) | \hat{\sigma } = \hat{\sigma }^{\dagger }, \mathrm{Tr}[\hat{\sigma } ] = 1 \}\)), and we have

$$\begin{aligned} \Vert \varvec{p} (\hat{\rho }^{\prime }) - \varvec{\nu }_{N} \Vert _{2}^{\ 2} = \Vert \varvec{p} (\hat{\rho }^{\prime }) - \varvec{p} (\hat{\rho }^\mathrm{{eL}}_{N}) \Vert _{2}^{\ 2} + \Vert \varvec{p} (\hat{\rho }^\mathrm{{eL}}_{N}) - \varvec{\nu }_{N} \Vert _{2}^{\ 2}, \ \forall \hat{\rho }^{\prime } \in \fancyscript{S}(\fancyscript{H}). \end{aligned}$$
(5.213)

Therefore, Eq. (5.131) is rewritten as

$$\begin{aligned} \hat{\rho }^\mathrm{{CLS}}_{N} = \mathop {\mathrm{argmin}}\limits _{\hat{\rho }^{\prime } \in \fancyscript{S}(\fancyscript{H})} \Vert \varvec{p} (\hat{\rho }^{\prime }) - \varvec{p} (\hat{\rho }^\mathrm{{eL}}_{N}) \Vert _{2}, \end{aligned}$$
(5.214)

and \(\hat{\rho }^\mathrm{{CLS}}_{N}\) is the projection of \(\hat{\rho }^\mathrm{{eL}}_{N}\) on \(\fancyscript{S}(\fancyscript{H})\) with respect to the \(2\)-norm on the probability space. We can see from Eqs. (5.68) and (5.214) that \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) and \(\hat{\rho }^\mathrm{{CLS}}\) are the projections of \(\hat{\rho }^\mathrm{{eL}}_{N}\) with respect to different spaces (or different norms).
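
To make the difference concrete, here is a sketch assuming NumPy, with a one-qubit toy \(\varLambda \) of our own choosing: the \(\ell _{2}\)-eNM step projects \(\varvec{s}^\mathrm{{eL}}_{N}\) onto the Bloch ball in the plain \(\ell _{2}\)-norm, while the CLS step projects in the norm weighted by \(G = \varLambda ^{T}\varLambda \). The weighted projection is computed by bisection on the KKT multiplier \(\mu \) in \((G+\mu I)\varvec{s} = G\varvec{t}\).

```python
import numpy as np

def project_enm(t):
    """Plain Euclidean projection onto the Bloch ball (one qubit)."""
    n = np.linalg.norm(t)
    return t if n <= 1.0 else t / n

def project_cls(t, G, iters=100):
    """Projection onto the Bloch ball in the G-weighted norm (one qubit)."""
    if np.linalg.norm(t) <= 1.0:
        return t
    lo, hi = 0.0, 1e6
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        s = np.linalg.solve(G + mu * np.eye(3), G @ t)
        lo, hi = (mu, hi) if np.linalg.norm(s) > 1.0 else (lo, mu)
    return s

Lam = np.diag([3.0, 1.0, 1.0])     # toy Lambda weighting the x-axis heavily
G = Lam.T @ Lam
t = np.array([1.2, 1.2, 0.0])      # an eL Bloch vector outside the ball
print(project_enm(t))              # shrinks both nonzero components equally
print(project_cls(t, G))           # keeps more of the heavily weighted x
```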

Using Theorem 5.13, we obtain

$$\begin{aligned} \Vert \varvec{p} (\hat{\rho }^\mathrm{{CLS}}_{N}) - \varvec{p} (\hat{\rho }) \Vert _{2}&\le \Vert \varvec{p} (\hat{\rho }^\mathrm{{eL}}_{N}) - \varvec{p} (\hat{\rho }) \Vert _{2}, \ \forall \hat{\rho } \in \fancyscript{S} (\fancyscript{H}), \end{aligned}$$
(5.215)
$$\begin{aligned} \Vert A ( \varvec{s}^\mathrm{{CLS}}_{N} - \varvec{s}) \Vert _{2}&\le \Vert A( \varvec{s}^\mathrm{{eL}}_{N} - \varvec{s}) \Vert _{2}, \ \forall \varvec{s} \in B_{d}, \end{aligned}$$
(5.216)

where \(\varvec{s}^\mathrm{{CLS}}_{N}\) is the Bloch vector corresponding to \(\hat{\rho }^\mathrm{{CLS}}_{N}\). Let us define \(\Vert \varLambda \Vert _{2, \max }\) and \(\Vert \varLambda \Vert _{2, \min }\) as

$$\begin{aligned} \Vert \varLambda \Vert _{2, \max } := \max _{\varvec{v} \ne \varvec{0}} \frac{\Vert \varLambda \varvec{v} \Vert _{2}}{\Vert \varvec{v} \Vert _{2}}, \end{aligned}$$
(5.217)
$$\begin{aligned} \Vert \varLambda \Vert _{2, \min } := \min _{\varvec{v} \ne \varvec{0}} \frac{\Vert \varLambda \varvec{v} \Vert _{2}}{\Vert \varvec{v} \Vert _{2}}. \end{aligned}$$
(5.218)

When \(\breve{\varvec{\varPi }}\) is informationally complete, \(\varLambda \) is full-rank and \(\Vert \varLambda \Vert _{2, \min } > 0\). We have

$$\begin{aligned} \Vert \varLambda \Vert _{2, \min } \cdot \Vert \varvec{s}^\mathrm{{CLS}}_{N} - \varvec{s} \Vert _{2}&\le \Vert \varLambda (\varvec{s}^\mathrm{{CLS}}_{N} - \varvec{s} ) \Vert _{2} \end{aligned}$$
(5.219)
$$\begin{aligned}&\le \Vert \varLambda (\varvec{s}^\mathrm{{eL}}_{N} - \varvec{s} ) \Vert _{2} \end{aligned}$$
(5.220)
$$\begin{aligned}&\le \Vert \varLambda \Vert _{2, \max } \cdot \Vert \varvec{s}^\mathrm{{eL}}_{N} - \varvec{s} \Vert _{2}. \end{aligned}$$
(5.221)

We obtain

$$\begin{aligned} \mathbb {P}\left[ \Vert \varvec{s}^\mathrm{{CLS}}_{N} - \varvec{s} \Vert _{2} > \delta \right] \le \mathbb {P}\left[ \Vert \varvec{s}^\mathrm{{eL}}_{N} - \varvec{s} \Vert _{2} > \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }} \delta \right] . \end{aligned}$$
(5.222)

By the same logic as in the proof of Theorem 5.9, we obtain the following theorem:

Theorem 5.14

(Error probability, \(\hat{\rho }^\mathrm{{CLS}}\mathbf{,}\, \varDelta ^\mathrm{{HS}}\mathbf{,}\, \varDelta ^\mathrm{{T}}\mathbf{,}\, \varDelta ^\mathrm{{IF}}\) )

When we choose the Hilbert-Schmidt distance, trace distance, or infidelity as the loss function for the density matrix, we have the following upper bounds on the error probability for the constrained least squares estimator.

$$\begin{aligned} \text {P}_{\delta , N}^\mathrm{{HS}} (\breve{\varvec{\varPi }}, \hat{\rho }^\mathrm{{CLS}} | \hat{\rho })&\le 2\sum _{\alpha =1}^{d^{2}-1} \exp \left[ - \left( \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }}\right) ^{2}\frac{8}{d^{2} - 1} \frac{\delta ^{2}}{c_{\alpha }}N \right] , \end{aligned}$$
(5.223)
$$\begin{aligned} \text {P}_{\delta , N}^\mathrm{{T}} (\breve{\varvec{\varPi }}, \hat{\rho }^\mathrm{{CLS}} | \hat{\rho })&\le 2\sum _{\alpha =1}^{d^{2}-1} \exp \left[ - \left( \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }}\right) ^{2}\frac{16}{d(d^{2} - 1)} \frac{\delta ^{2}}{c_{\alpha }}N \right] , \end{aligned}$$
(5.224)
$$\begin{aligned} \text {P}_{\delta , N}^\mathrm{{IF}} (\breve{\varvec{\varPi }}, \hat{\rho }^\mathrm{{CLS}} | \hat{\rho })&\le 2\sum _{\alpha =1}^{d^{2}-1} \exp \left[ - \left( \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }}\right) ^{2}\frac{4}{d(d^{2} - 1)} \frac{\delta ^{2}}{c_{\alpha }}N \right] , \end{aligned}$$
(5.225)

for any true density matrix \(\hat{\rho }\).

Compared to Theorem 5.9, there is an additional factor \(\left( \frac{\Vert \varLambda \Vert _{2, \min }}{\Vert \varLambda \Vert _{2, \max }} \right) ^{2} (\le 1)\) in the rate of exponential decrease in Theorem 5.14. When \(\Vert \varLambda \Vert _{2, \max } = \Vert \varLambda \Vert _{2, \min }\) holds, the upper bounds for \(\hat{\rho }^\mathrm{{CLS}}\) coincide with those for \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\). Roughly speaking, the condition \(\Vert \varLambda \Vert _{2, \max } = \Vert \varLambda \Vert _{2, \min }\) implies that we perform measurements extracting information about each Bloch vector element with equal weight. When \(\Vert \varLambda \Vert _{2, \max } > \Vert \varLambda \Vert _{2, \min }\), the upper bounds for \(\hat{\rho }^\mathrm{{CLS}}\) are larger than those for \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\). This does not mean we can immediately conclude that \(\hat{\rho }^\mathrm{{CLS}}\) is less precise than \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\), because these upper bounds are probably not optimal. However, we can say that \(\hat{\rho }^\mathrm{{CLS}}\) is less precise than \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) insofar as Theorems 5.9 and 5.14 give the only upper bounds known for point estimators in quantum tomography to date. Additionally, the computational cost of \(\hat{\rho }^{\ell _{2}\mathrm - \mathrm {eNM}}\) can be smaller than that of \(\hat{\rho }^\mathrm{{CLS}}\), as explained in Sect. 5.3.2. Therefore, we believe that the \(\ell _{2}\)-eNM estimator performs better than the CLS estimator and is at present our best choice.
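
The factor in Theorem 5.14 is easy to evaluate numerically: for a full-rank \(\varLambda \), \(\Vert \varLambda \Vert _{2, \max }\) and \(\Vert \varLambda \Vert _{2, \min }\) of Eqs. (5.217)–(5.218) are the largest and smallest singular values. A minimal sketch, assuming NumPy and a random stand-in for \(\varLambda \):

```python
import numpy as np

rng = np.random.default_rng(3)
Lam = rng.standard_normal((9, 3))       # a stand-in for Lambda (tall, full-rank)
sv = np.linalg.svd(Lam, compute_uv=False)
factor = (sv[-1] / sv[0]) ** 2          # = 1 iff all singular values coincide
print(factor)                           # <= 1: rate loss of the CLS bound
```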


Copyright information

© 2014 Springer Japan

Cite this chapter

Sugiyama, T. (2014). Evaluation of Estimation Precision in Quantum Tomography. In: Finite Sample Analysis in Quantum Estimation. Springer Theses. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54777-8_5
