
Geometry on Positive Definite Matrices Deformed by V-Potentials and Its Submanifold Structure

Geometric Theory of Information

Part of the book series: Signals and Communication Technology (SCT)

Abstract

In this paper we investigate the dually flat structure of the space of positive definite matrices induced by a class of convex functions called V-potentials, from the viewpoint of information geometry. It is proved that the geometry is invariant under special linear group actions and naturally introduces a foliated structure. Each leaf is proved to be a homogeneous statistical manifold with negative constant curvature and to enjoy a special decomposition property of the canonically defined divergence. As an application to statistics, we finally give the correspondence between the obtained geometry on the space and the one on elliptical distributions induced from a certain Bregman divergence.


Notes

  1. Reference [32] is a conference version of this chapter that omits the proofs and the whole of Sect. 2.5.

References

  1. Amari, S.: Differential-Geometrical Methods in Statistics. Lecture Notes in Statistics, vol. 28. Springer, New York (1985)

  2. Amari, S., Nagaoka, H.: Methods of Information Geometry. AMS & OUP, Oxford (2000)

  3. Dawid, A.P.: The geometry of proper scoring rules. Ann. Inst. Stat. Math. 59, 77–93 (2007)

  4. Eguchi, S.: Information geometry and statistical pattern recognition. Sugaku Expositions Amer. Math. Soc. 19, 197–216 (2006) (originally Sūgaku, 56, 380–399 (2004) in Japanese)

  5. Eguchi, S.: Information divergence geometry and the application to statistical machine learning. In: Emmert-Streib, F., Dehmer, M. (eds.) Information Theory and Statistical Learning, pp. 309–332. Springer, New York (2008)

  6. Eguchi, S., Copas, J.: A class of logistic-type discriminant functions. Biometrika 89(1), 1–22 (2002)

  7. Eguchi, S., Komori, O., Kato, S.: Projective power entropy and maximum Tsallis entropy distributions. Entropy 13, 1746–1764 (2011)

  8. Fang, K.T., Kotz, S., Ng, K.W.: Symmetric Multivariate and Related Distributions. Chapman and Hall, London (1990)

  9. Faraut, J., Korányi, A.: Analysis on Symmetric Cones. Oxford University Press, New York (1994)

  10. Grünwald, P.D., Dawid, A.P.: Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory. Ann. Stat. 32, 1367–1433 (2004)

  11. Hao, J.H., Shima, H.: Level surfaces of nondegenerate functions in \(\mathbf{R}^{n+1}\). Geom. Dedicata 50(2), 193–204 (1994)

  12. Helgason, S.: Differential Geometry and Symmetric Spaces. Academic Press, New York (1962)

  13. Higuchi, I., Eguchi, S.: Robust principal component analysis with adaptive selection for tuning parameters. J. Mach. Learn. Res. 5, 453–471 (2004)

  14. Kanamori, T., Ohara, A.: A Bregman extension of quasi-Newton updates I: an information geometrical framework. Optim. Methods Softw. 28(1), 96–123 (2013)

  15. Kakihara, S., Ohara, A., Tsuchiya, T.: Information geometry and interior-point algorithms in semidefinite programs and symmetric cone programs. J. Optim. Theory Appl. 157(3), 749–780 (2013)

  16. Kass, R.E., Vos, P.W.: Geometrical Foundations of Asymptotic Inference. Wiley, New York (1997)

  17. Koecher, M.: The Minnesota Notes on Jordan Algebras and their Applications. Springer, Berlin (1999)

  18. Kullback, S.: Information Theory and Statistics. Wiley, New York (1959)

  19. Kurose, T.: Dual connections and affine geometry. Math. Z. 203(1), 115–121 (1990)

  20. Kurose, T.: On the divergences of 1-conformally flat statistical manifolds. Tohoku Math. J. 46(3), 427–433 (1994)

  21. Lauritzen, S.: Statistical manifolds. In: Amari, S.-I., et al. (eds.) Differential Geometry in Statistical Inference. Institute of Mathematical Statistics, Hayward (1987)

  22. Minami, M., Eguchi, S.: Robust blind source separation by beta-divergence. Neural Comput. 14, 1859–1886 (2002)

  23. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley, New York (1982)

  24. Murata, N., Takenouchi, T., Kanamori, T., Eguchi, S.: Information geometry of U-Boost and Bregman divergence. Neural Comput. 16, 1437–1481 (2004)

  25. Murray, M.K., Rice, J.W.: Differential Geometry and Statistics. Chapman & Hall, London (1993)

  26. Naudts, J.: Continuity of a class of entropies and relative entropies. Rev. Math. Phys. 16, 809–822 (2004)

  27. Naudts, J.: Estimators, escort probabilities, and \(\phi \)-exponential families in statistical physics. J. Ineq. Pure Appl. Math. 5, 102 (2004)

  28. Nesterov, Y.E., Todd, M.J.: Primal-dual interior-point methods for self-scaled cones. SIAM J. Optim. 8, 324–364 (1998)

  29. Nomizu, K., Sasaki, T.: Affine Differential Geometry. Cambridge University Press, Cambridge (1994)

  30. Ohara, A.: Geodesics for dual connections and means on symmetric cones. Integr. Eqn. Oper. Theory 50, 537–548 (2004)

  31. Ohara, A., Amari, S.: Differential geometric structures of stable state feedback systems with dual connections. Kybernetika 30(4), 369–386 (1994)

  32. Ohara, A., Eguchi, S.: Geometry on positive definite matrices induced from V-potential function. In: Nielsen, F., Barbaresco, F. (eds.) Geometric Science of Information. Lecture Notes in Computer Science, vol. 8085, pp. 621–629. Springer, Berlin (2013)

  33. Ohara, A., Eguchi, S.: Group invariance of information geometry on \(q\)-Gaussian distributions induced by beta-divergence. Entropy 15, 4732–4747 (2013)

  34. Ohara, A., Suda, N., Amari, S.: Dualistic differential geometry of positive definite matrices and its applications to related problems. Linear Algebra Appl. 247, 31–53 (1996)

  35. Ohara, A., Wada, T.: Information geometry of \(q\)-Gaussian densities and behaviors of solutions to related diffusion equations. J. Phys. A: Math. Theor. 43, 035002 (18pp.) (2010)

  36. Ollila, E., Tyler, D., Koivunen, V., Poor, V.: Complex elliptically symmetric distributions: survey, new results and applications. IEEE Trans. Signal Process. 60(11), 5597–5623 (2012)

  37. Rothaus, O.S.: Domains of positivity. Abh. Math. Sem. Univ. Hamburg 24, 189–235 (1960)

  38. Sasaki, T.: Hyperbolic affine hyperspheres. Nagoya Math. J. 77, 107–123 (1980)

  39. Scott, D.W.: Parametric statistical modeling by minimum integrated square error. Technometrics 43, 274–285 (2001)

  40. Shima, H.: The Geometry of Hessian Structures. World Scientific, Singapore (2007)

  41. Takenouchi, T., Eguchi, S.: Robustifying AdaBoost by adding the naive error rate. Neural Comput. 16(4), 767–787 (2004)

  42. Tsallis, C.: Introduction to Nonextensive Statistical Mechanics. Springer, New York (2009)

  43. Uohashi, K., Ohara, A., Fujii, T.: 1-conformally flat statistical submanifolds. Osaka J. Math. 37(2), 501–507 (2000)

  44. Uohashi, K., Ohara, A., Fujii, T.: Foliations and divergences of flat statistical manifolds. Hiroshima Math. J. 30(3), 403–414 (2000)

  45. Vinberg, E.B.: The theory of convex homogeneous cones. Trans. Moscow Math. Soc. 12, 340–430 (1963)

  46. Wolkowicz, H., et al. (eds.): Handbook of Semidefinite Programming. Kluwer Academic Publishers, Boston (2000)

Acknowledgments

We thank the anonymous referees for their constructive comments and careful checks of the original manuscript.

Author information

Corresponding author

Correspondence to Atsumi Ohara.

Appendices

2.1.1 A Proof of Theorem 1

It is observed that \(-\nu _1(\det P) \not = 0\) on \(PD(n,\mathbf{R})\) is necessary because the second term is not positive definite. Hence, the Hessian can be represented as

$$\begin{aligned} g^{(V)}_P (X,Y)&= -\nu _1(\det P) \{ \text {tr}(P^{-1}XP^{-1}Y) -\beta ^{(V)}(\det P) \text {tr}(P^{-1}X) \text {tr}(P^{-1}Y)\}\nonumber \\&= -\nu _1(\det P) \mathrm{vec}^T(\tilde{X}) \left( I_{n^2}-\beta ^{(V)}(\det P) \mathrm{vec}(I_n)\mathrm{vec}^T(I_n) \right) \mathrm{vec}(\tilde{Y}). \end{aligned}$$

Here \(\tilde{X}=P^{-1/2}XP^{-1/2}\), \(\tilde{Y}=P^{-1/2}YP^{-1/2}\), \(\mathrm{vec}(\bullet )\) is the operator that maps \(A=(a_{ij}) \in \mathbf{R}^{n \times n}\) to \([a_{11},\cdots ,a_{n1},a_{12},\cdots ,a_{n2},\cdots ,a_{1n},\cdots ,a_{nn} ]^T \in \mathbf{R}^{n^2}\), and \(I_n\) and \(I_{n^2}\) denote the identity matrices of order \(n\) and \(n^2\), respectively. By congruently transforming the matrix \(I_{n^2}-\beta ^{(V)}(\det P) \mathrm{vec}(I_n)\mathrm{vec}^T(I_n)\) with a suitable permutation matrix, we see that the positive definiteness of \(g^{(V)}\) is equivalent to \(-\nu _1(\det P) > 0\) and

$$ I_n-\beta ^{(V)}(\det P)\mathbf{1}\mathbf{1}^T>0 ,\quad \text {where } \mathbf{1}=[1,1, \cdots , 1]^T \in \mathbf{R}^n. $$

Let \(W\) be an orthogonal matrix that has \(\mathbf{1}/\sqrt{n}\) as the first column vector. Since the following eigen-decomposition

$$ I_n-\beta ^{(V)}(\det P)\mathbf{1}\mathbf{1}^T =W\left( \begin{array}{cccc} 1-n\beta ^{(V)}(\det P) &{} 0 &{} \cdots &{} 0 \\ 0 &{} 1 &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{} \ddots &{} 0 \\ 0 &{} \cdots &{} 0 &{} 1 \end{array} \right) W^T $$

holds, the conditions (2.5) are necessary and sufficient for positive definiteness of \(g^{(V)}\). Thus, the statement follows.\(\square \)
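The eigenvalue argument above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original proof: the helper names `deformed_identity`, `matvec` and `is_positive_definite` are hypothetical, and the scalar `beta` stands in for \(\beta ^{(V)}(\det P)\). It confirms that \(\mathbf{1}\) is an eigenvector of \(I_n-\beta \mathbf{1}\mathbf{1}^T\) with eigenvalue \(1-n\beta \), and that a Cholesky attempt succeeds for a value of \(\beta \) below \(1/n\) and fails for one above it.

```python
import math

def deformed_identity(n, beta):
    """Return I_n - beta * 1 1^T as a list of rows."""
    return [[(1.0 if i == j else 0.0) - beta for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def is_positive_definite(m):
    """Attempt a Cholesky factorisation of a symmetric matrix.

    Up to rounding, the factorisation succeeds exactly when m is
    positive definite; a non-positive pivot signals failure.
    """
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True

n = 4
m = deformed_identity(n, 0.2)                            # beta < 1/n
print(matvec(m, [1.0] * n))                              # entries close to 1 - n*beta
print(matvec(m, [1.0, -1.0, 0.0, 0.0]))                  # vector orthogonal to 1 is unchanged
print(is_positive_definite(m))                           # True
print(is_positive_definite(deformed_identity(n, 0.3)))   # False, since beta > 1/n
```

The boundary case \(\beta = 1/n\) is deliberately left out, because the matrix is then singular and a floating-point test is not reliable there.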

2.1.2 B Proof of Theorem 2

Since the components of \(P^*\) form an affine coordinate system for the connection \(^* \nabla ^{(V)}\), the parallel shift \(\pi _t(Y)\) along the curve \(\gamma \) satisfies

$$ (\mathrm{grad}\varphi ^{(V)})_*(Y) = (\mathrm{grad}\varphi ^{(V)})_*(\pi _t(Y)) $$

for any \(t\).

From Lemma 1, this implies

$$\begin{aligned} \frac{d }{dt} \Big [ \nu _2( \det P_t) \text {tr}\{ P_t^{-1} \pi _t(Y)\}P_t^{-1} -\nu _1(\det P_t) P_t^{-1} \pi _t(Y) P_t^{-1} \Big ] =0 \end{aligned}$$

for any \(t \; (-\epsilon <t<\epsilon )\).

By calculating the left-hand side, we get

$$\begin{aligned}&\nu _3(s_t) \text {tr}\left( P_t^{-1}\frac{d P_t}{dt}\right) \text {tr}(P_t^{-1} \pi _t(Y)) P_t^{-1} - \nu _2(s_t) \text {tr}\left( P_t^{-1}\frac{d P_t}{dt}P_t^{-1} \pi _t(Y) \right) P_t^{-1} \\&+ \nu _2(s_t) \text {tr}\left( P_t^{-1} \frac{d \pi _t(Y)}{dt}\right) P_t^{-1} -\nu _2(s_t) \text {tr}(P_t^{-1}\pi _t(Y))P_t^{-1}\frac{d P_t}{dt}P_t^{-1} \\&- \nu _2(s_t) \text {tr}\left( P_t^{-1}\frac{dP_t}{dt}\right) P_t^{-1}\pi _t(Y)P_t^{-1} + \nu _1(s_t) P_t^{-1}\frac{dP_t}{dt}P_t^{-1}\pi _t(Y)P_t^{-1} \\&- \nu _1(s_t) P_t^{-1} \frac{d \pi _t(Y)}{dt}P_t^{-1} + \nu _1(s_t) P_t^{-1}\pi _t(Y)P_t^{-1}\frac{dP_t}{dt}P_t^{-1}=0, \end{aligned}$$

where \(s_t=\det P_t\). If \(t=0\), then this equation implies that

$$\begin{aligned}&\quad \nu _3(s) \text {tr}(P^{-1}X) \text {tr}(P^{-1} Y) P^{-1} - \nu _2(s) \text {tr}(P^{-1}XP^{-1} Y )P^{-1} \\&\quad + \nu _2(s) \text {tr}\left\{ P^{-1} \left( \frac{d \pi _t(Y)}{dt} \right) _{t=0}\right\} P^{-1} -\nu _2(s) \text {tr}(P^{-1}Y)P^{-1}XP^{-1} \\&\quad - \nu _2(s) \text {tr}\left( P^{-1}X \right) P^{-1}YP^{-1} + \nu _1(s) P^{-1}XP^{-1}YP^{-1} \\&\quad - \nu _1(s) P^{-1} \left( \frac{d \pi _t(Y)}{dt} \right) _{t=0} P^{-1} + \nu _1(s) P^{-1}YP^{-1}XP^{-1}=0, \end{aligned}$$

where \(s=\det P\). Hence we observe that

$$\begin{aligned}&\nu _1(s) P^{-1}\left( \frac{d \pi _t(Y)}{d t}\right) _{t=0} \\&= \nu _2(s) \text {tr}\left\{ P^{-1}\left( \frac{d \pi _t(Y)}{d t}\right) _{t=0}\right\} I \nonumber \\&\quad + \nu _1(s) (P^{-1}X P^{-1} Y +P^{-1} Y P^{-1}X) \nonumber \\&\quad -\nu _2(s) \left\{ \text {tr}\left( P^{-1}Y \right) P^{-1}X+\mathrm{tr}\left( P^{-1}X \right) P^{-1}Y \right\} \nonumber \\&\quad + \nu _3(s) \text {tr}\left( P^{-1}X \right) \text {tr}\left( P^{-1}Y \right) I - \nu _2(s)\text {tr}\left( P^{-1}YP^{-1}X \right) I. \nonumber \end{aligned}$$
(2.25)

Taking the trace for both sides of (2.25), we get

$$\begin{aligned}&(\nu _1(s)- n \nu _2(s))\mathrm{tr} \left\{ P^{-1}\left( \frac{d \pi _t(Y)}{d t}\right) _{t=0} \right\} \\ =&(2\nu _1(s)- n\nu _2(s))\mathrm{tr}\left( P^{-1}X P^{-1}Y \right) +(n\nu _3(s)-2\nu _2(s))\mathrm{tr}\left( P^{-1}X \right) \mathrm{tr}\left( P^{-1}Y \right) . \nonumber \end{aligned}$$
(2.26)

From (2.25) and (2.26) it follows that

$$\begin{aligned}&P^{-1} \left( \frac{d \pi _t(Y)}{d t}\right) _{t=0} \\&= P^{-1}X P^{-1} Y + P^{-1}Y P^{-1}X \\&\quad - \frac{\nu _2(s)}{\nu _1(s)} \left\{ \mathrm{tr} \left( P^{-1}X \right) P^{-1}Y + \mathrm{tr} \left( P^{-1}Y \right) P^{-1}X \right\} \\&\quad + \frac{(\nu _3(s)\nu _1(s)-2\nu _2(s)^2)\mathrm{tr}\left( P^{-1}X\right) \mathrm{tr}\left( P^{-1}Y \right) +\nu _2(s)\nu _1(s)\mathrm{tr}\left( P^{-1}X P^{-1}Y \right) }{\nu _1(s)(\nu _1(s)-n\nu _2(s))}I. \end{aligned}$$

This completes the proof.\(\square \)
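The elimination step between (2.25) and (2.26) can be spelled out as follows. The abbreviations \(a\), \(b\) and \(T\) are introduced only for this sketch and do not appear in the original text: \(a=\mathrm{tr}(P^{-1}XP^{-1}Y)\), \(b=\mathrm{tr}(P^{-1}X)\,\mathrm{tr}(P^{-1}Y)\) and \(T=\mathrm{tr}\{P^{-1}(d\pi _t(Y)/dt)_{t=0}\}\), with the argument \(s\) of \(\nu _i\) suppressed. Assuming \(\nu _1-n\nu _2\not =0\), (2.26) gives \(T=\{(2\nu _1-n\nu _2)a+(n\nu _3-2\nu _2)b\}/(\nu _1-n\nu _2)\), and the coefficient of \(I\) on the right-hand side of (2.25) becomes

$$\begin{aligned} \nu _2 T+\nu _3 b-\nu _2 a&= \frac{\nu _2\{(2\nu _1-n\nu _2)a+(n\nu _3-2\nu _2)b\}+(\nu _3 b-\nu _2 a)(\nu _1-n\nu _2)}{\nu _1-n\nu _2} \\&= \frac{\nu _1\nu _2\, a+(\nu _1\nu _3-2\nu _2^2)\, b}{\nu _1-n\nu _2}. \end{aligned}$$

Dividing (2.25) by \(\nu _1\) then yields the coefficient of \(I\) in the final expression.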

2.1.3 C Proof of Proposition 4

Since the geometric structure \((\mathcal {L}_s,g^{(V)})\) is also invariant under the transformation \(\tau _G\) with \(G \in SL(n,\mathbf{R})\), it suffices to consider the point \(\lambda I \in \mathcal {L}_s\), where \(\lambda = s^{1/n}\).

Let \(\tilde{X} \in \mathcal {X}(\mathcal {L}_s)\) be a vector field defined at each \(P \in \mathcal {L}_s\) by

$$ \tilde{X}=\sum _i \tilde{X}^i(P)\frac{\partial }{\partial x^i} =P^{1/2}XP^{1/2}, \quad X \in T_I \mathcal {L}_1=\{X| \text {tr}(X)=0, \; X=X^T \}, $$

where \(\tilde{X}^i\) are certain smooth functions on \(\mathcal {L}_s\). Consider the curve \(P_t= \lambda \exp (Xt) \in \mathcal {L}_s\) starting from \(\lambda I\) at \(t=0\) and a vector field \(\tilde{Y}\) along \(P_t\) defined by

$$ \tilde{Y}_{P_t}=P_t^{1/2}Y_tP_t^{1/2} =\sum _i \tilde{Y}^i(P_t)\frac{\partial }{\partial x^i}, $$

where \(Y_t\) is an arbitrary smooth curve in \(T_I \mathcal {L}_1\) with \(Y_0=Y\) and \(\tilde{Y}^i\) are smooth functions along \(P_t\). We show that the \((T_{\lambda I} \mathcal {L}_s)^\perp \)-component of \(\left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}\), i.e., the component of the covariant derivative at \(\lambda I\) orthogonal to \(T_{\lambda I} \mathcal {L}_s\), vanishes for any \(X\) and \(Y \in T_I \mathcal {L}_1\) if and only if \(\nu _2(s)=0\).

We see

$$ \left( P_t \right) _{t=0}=\lambda I, \quad \left( \frac{dP_t}{dt} \right) _{t=0}= \lambda X, \quad \tilde{Y}_{\lambda I}= \lambda Y $$

hold. Note that

$$ \frac{d}{dt} \tilde{Y} _{P_t}= \frac{1}{2}(X \tilde{Y}_{P_t}+ \tilde{Y}_{P_t}X) + P_t^{1/2}\frac{dY_t}{dt}P_t^{1/2}. $$

Then, using (2.13) and Corollary 1, we obtain

$$\begin{aligned} \left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}&= \left( \frac{d}{dt} \tilde{Y} _{P_t} \right) _{t=0} +\left( \sum _{i,j} \tilde{X}^i \tilde{Y}^j {\hat{\nabla }}^{(V)} _{\frac{\partial }{\partial x^i}} \frac{\partial }{\partial x^j} \right) _{\lambda I} \nonumber \\&= \frac{\lambda }{2}(XY+YX) + \lambda \left( \frac{d}{dt} Y _t \right) _{t=0}\\&\quad -\frac{1}{2} \left\{ \lambda (XY+YX)+\varPhi (\lambda X,\lambda Y,\lambda I)+\varPhi ^{\perp }(\lambda X,\lambda Y,\lambda I) \right\} \nonumber \\&= \lambda \left( \frac{d}{dt} Y _t \right) _{t=0} - \frac{1}{2} \varPhi ^{\perp }(\lambda X,\lambda Y, \lambda I). \end{aligned}$$

For the third equality we have used that \(\varPhi (\lambda X,\lambda Y,\lambda I)=0\) for any \(X\) and \(Y \in T_I \mathcal {L}_1\).
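The derivative formula for \(\tilde{Y}_{P_t}\) used above follows from the product rule. As a short check, assuming only \(P_t=\lambda \exp (Xt)\): the square root \(P_t^{1/2}=\lambda ^{1/2}\exp (Xt/2)\) commutes with \(X\), so that

$$ \frac{d}{dt}P_t^{1/2}=\frac{1}{2}XP_t^{1/2}=\frac{1}{2}P_t^{1/2}X, $$

and therefore

$$ \frac{d}{dt}\left( P_t^{1/2}Y_tP_t^{1/2}\right) =\frac{1}{2}XP_t^{1/2}Y_tP_t^{1/2}+P_t^{1/2}\frac{dY_t}{dt}P_t^{1/2}+\frac{1}{2}P_t^{1/2}Y_tP_t^{1/2}X. $$

Evaluating at \(t=0\), where \(P_0=\lambda I\) and \(Y_0=Y\), gives \(\frac{\lambda }{2}(XY+YX)+\lambda (dY_t/dt)_{t=0}\), which are the first two terms in the second line of the display above.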

Since it holds that

$$ g_{\lambda I}^{(V)}\left( \left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}, I \right) = \lambda ^{-2}(-\nu _1(s)+\nu _2(s)n) \text {tr}\left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I} $$

and \(-\nu _1(s)+\nu _2(s)n \not =0\) by (2.5), the \((T_{\lambda I} \mathcal {L}_s)^\perp \)-component of \(\left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}\) vanishes for any \(X\) and \(Y \in T_I \mathcal {L}_1\) if and only if

$$ \text {tr}\left( \hat{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I} = -\frac{1}{2} \text {tr}\varPhi ^{\perp }(\lambda X,\lambda Y, \lambda I) =0. $$

Here, we have used \(\text {tr}((dY_t/dt)_{t=0})=0\). The above equality is equivalent to \(\nu _2(s)=0\). Hence, we conclude that the statement holds.\(\square \)

2.1.4 D Proof of Proposition 6

The statements (i) and (ii) follow from direct calculations. Since \(\left( ^* \tilde{\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _P\) is the orthogonal projection of \(\left( ^*\nabla ^{(V)} _{\tilde{X}} \tilde{Y} \right) _P\) to \(T_P \mathcal {L}_s\) with respect to \(g_P^{(V)}\), it can be represented by

$$ \left( {^* \tilde{\nabla }^{(V)}_{\tilde{X}} \tilde{Y}} \right) _P = \left( {^*\nabla ^{(V)}_{\tilde{X}} \tilde{Y}} \right) _P - \delta P, \quad \delta \in \mathbf{R}, $$

where \(\delta \) is determined from the orthogonality condition

$$ g_P^{(V)}\left( \left( {^* \tilde{\nabla }}^{(V)}_{\tilde{X}} \tilde{Y} \right) _P, P \right) =0. $$

As in the proof of Proposition 4, with \(\lambda =s^{1/n}\), we see that

$$\begin{aligned} \left( {^* \nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}&= \left( \frac{d}{dt} \tilde{Y}_{P_t} \right) _{t=0} +\left( \sum _{i,j} \tilde{X}^i \tilde{Y}^j {^* \nabla }^{(V)} _{\frac{\partial }{\partial x^i}} \frac{\partial }{\partial x^j} \right) _{\lambda I} \nonumber \\&= \frac{\lambda }{2}(XY+YX) + \lambda \left( \frac{d}{dt} Y _t \right) _{t=0} \nonumber \\&- \left\{ \lambda (XY+YX)+\varPhi (\lambda X,\lambda Y,\lambda I)+\varPhi ^{\perp }(\lambda X,\lambda Y,\lambda I) \right\} \nonumber \\&= \lambda \left( \frac{d}{dt} Y _t \right) _{t=0} -\frac{\lambda }{2}(XY+YX) -\varPhi ^{\perp }(\lambda X,\lambda Y, \lambda I). \end{aligned}$$

Since \(\varPhi ^{\perp }(\lambda X,\lambda Y, \lambda I) \in (T_{\lambda I} \mathcal {L}_s)^\perp \) and \((dY_t/dt)_{t=0} \in T_{\lambda I} \mathcal {L}_s\), the orthogonal projection of \(\left( {^*\nabla }^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I}\) to \(T_{\lambda I} \mathcal {L}_s\) is that of \(\lambda (dY_t/dt)_{t=0}-\lambda (YX+XY)/2\). Thus, from the orthogonality condition we have

$$ \left( {^* \tilde{\nabla }}^{(V)}_{\tilde{X}} \tilde{Y} \right) _{\lambda I} = \lambda \left( \frac{d}{dt} Y _t \right) _{t=0}-\frac{\lambda }{2}(XY+YX)+\frac{\lambda }{n}\mathrm{tr}(XY)I, $$

which is independent of \(V(s)\).\(\square \)
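The last step can be made explicit. This sketch uses only the relation from the proof of Proposition 4 that \(g^{(V)}_{\lambda I}(Z,I)\) is a nonzero multiple of \(\mathrm{tr}(Z)\), so orthogonality to \(P=\lambda I\) is equivalent to \(\mathrm{tr}(Z)=0\); the symbol \(Z\) is introduced here for illustration. Setting \(Z=\lambda (dY_t/dt)_{t=0}-\frac{\lambda }{2}(XY+YX)-\delta \lambda I\) and using \(\mathrm{tr}((dY_t/dt)_{t=0})=0\) and \(\mathrm{tr}(XY+YX)=2\,\mathrm{tr}(XY)\), we get

$$ \mathrm{tr}(Z)=-\lambda \,\mathrm{tr}(XY)-n\lambda \delta =0 \quad \Longleftrightarrow \quad \delta =-\frac{1}{n}\mathrm{tr}(XY), $$

so that \(-\delta \lambda I=\frac{\lambda }{n}\mathrm{tr}(XY)I\), which is the last term of the expression above.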

2.1.5 E Proof of Theorem 4

For \(P\) and \(Q\) in \(PD(n,\mathbf{R})\), we abbreviate the two density functions in a \(U\)-model as \(f_P(x)=f(x,P)\) and \( f_Q(x)=f(x,Q)\).

It suffices to show that the dual canonical divergence \({^*D^{(V)}}(P,Q)=D^{(V)}(Q,P)\) of \((PD(n,\mathbf{R}), \nabla , g^{(V)})\) given by (2.10) coincides with \(D_U(f_P,f_Q)\). Note that exchanging the order of the two arguments of a divergence only exchanges the definitions of the primal and dual affine connections in (2.3) and does not affect the whole dualistic structure of the induced geometry.

Recalling (2.6), we have

$$ \mathrm{grad} \varphi ^{(V)}( P)= \Big (V ' (\det P) \det P \Big ) P^{-1}=\nu _1(\det P)P^{-1}, $$

where \(V'\) denotes the derivative of \(V\) with respect to \(s\). On the other hand, we can directly differentiate \(\varphi ^{(V)}(P)\) defined via (2.24):

$$\begin{aligned}&\mathrm{grad} \varphi ^{(V)} (P) \\&= \mathrm{grad} \left\{ \int U \left( -\frac{1}{2}x^T P x -c_U(\det P) \right) dx + c_U (\det P) \right\} \\&= \int f_P(x) \left\{ -\frac{1}{2}xx^T-\Big (c'_U(\det P ) \det P \Big ) P^{-1} \right\} dx + \Big (c'_U(\det P ) \det P \Big ) P^{-1} \\&= - \frac{1}{2} \int f_P(x) xx^T dx = - \frac{1}{2} \mathrm{E}_P(xx^T), \end{aligned}$$

where \(\mathrm{E}_P\) is the expectation operator with respect to \(f_P(x)\). Thus, we have

$$\begin{aligned} \nu _1(\det P) P^{-1} =- \frac{1}{2} \mathrm{E}_P(xx^T). \end{aligned}$$
(2.27)

Note that

$$ \xi (f_P)= -\frac{1}{2}x^TPx-c_U(\det P), \quad \xi (f_Q)= -\frac{1}{2}x^TQx-c_U(\det Q) $$

because \(\xi (u)\) is the identity. By definition, the \(U\)-divergence is

$$\begin{aligned} D_U(f_P,f_Q)&= \int \Big [ U \left( -\frac{1}{2}x^T Q x -c_U(\det Q) \right) - U \left( -\frac{1}{2}x^T P x -c_U(\det P) \right) \\&\quad -f_P(x) \left\{ -\frac{1}{2}x^T Q x -c_U(\det Q) +\frac{1}{2}x^T P x +c_U(\det P)\right\} \Big ] dx \\&= \varphi ^{(V)}(Q) -\varphi ^{(V)}(P) +\frac{1}{2}\mathrm{E}_P \left( x^TQx - x^TPx \right) . \end{aligned}$$

Using (2.27), the third term is expressed by

$$\begin{aligned} \frac{1}{2}\mathrm{E}_P \left( x^TQx - x^TPx \right)&= \frac{1}{2}\mathrm{tr}\{\mathrm{{E}}_P(xx^T)(Q-P)\} \\&= \nu _1(\det P) \mathrm{tr}(P^{-1}(P-Q)). \end{aligned}$$

Hence, \(D_U(f_P,f_Q)={^*D^{(V)}}(P,Q)=D^{(V)}(Q,P)\).\(\square \)
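For a concrete instance, consider the Gaussian case with \(n=1\). This choice is an illustrative assumption and is not made in the proof above: take \(U=\exp \), so that \(f_P\) is the zero-mean normal density with precision \(p\), \(c_U(p)=\frac{1}{2}\log (2\pi )-\frac{1}{2}\log p\), \(\varphi ^{(V)}(p)=-\frac{1}{2}\log p\) up to an additive constant, \(\nu _1=-\frac{1}{2}\), and \(D_U\) reduces to the Kullback-Leibler divergence. The sketch below, with hypothetical function names, compares the closed form \(\varphi ^{(V)}(q)-\varphi ^{(V)}(p)+\nu _1(1-q/p)\) with a numerical integration of \(\int f_p\log (f_p/f_q)\,dx\).

```python
import math

def gauss_density(x, precision):
    """Zero-mean 1-D Gaussian density with the given precision (inverse variance)."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(-0.5 * precision * x * x)

def kl_numeric(p, q, half_width=12.0, steps=20_000):
    """Trapezoidal estimate of the integral of f_p * log(f_p / f_q) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        # log(f_p / f_q) in closed form, which avoids log(0) in the tails
        log_ratio = 0.5 * math.log(p / q) - 0.5 * (p - q) * x * x
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * gauss_density(x, p) * log_ratio
    return total * h

def divergence_closed_form(p, q):
    """phi(q) - phi(p) + nu_1 * (1 - q/p), with phi(s) = -0.5*log(s) and nu_1 = -0.5.

    The additive constant of phi cancels in the difference and is omitted.
    """
    phi = lambda s: -0.5 * math.log(s)
    nu1 = -0.5
    return phi(q) - phi(p) + nu1 * (1.0 - q / p)

p, q = 2.0, 0.7
print(kl_numeric(p, q), divergence_closed_form(p, q))  # the two values should agree closely
```

The integration window and step count are chosen for precisions of order one; much smaller precisions would need a wider window.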

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Ohara, A., Eguchi, S. (2014). Geometry on Positive Definite Matrices Deformed by V-Potentials and Its Submanifold Structure. In: Nielsen, F. (eds) Geometric Theory of Information. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-05317-2_2

  • DOI: https://doi.org/10.1007/978-3-319-05317-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05316-5

  • Online ISBN: 978-3-319-05317-2

  • eBook Packages: Engineering, Engineering (R0)
