On the expectation of a persistence diagram by the persistence weighted kernel

  • Original Paper
Japan Journal of Industrial and Applied Mathematics

Abstract

In topological data analysis, persistent homology characterizes robust topological features in data, and it has a summary representation called a persistence diagram. Statistical research on persistence diagrams has been actively developed, and the persistence weighted kernel shows several advantages over other statistical methods for persistence diagrams. If data are drawn from some probability distribution, the corresponding persistence diagram is random. Then, the expectation of the persistence diagram by the persistence weighted kernel is well-defined. In this paper, we study relationships between a probability distribution and the persistence weighted kernel from the viewpoint of (1) the strong law of large numbers and the central limit theorem, (2) a confidence interval to estimate the expectation of the persistence weighted kernel numerically, and (3) the stability theorem to ensure the continuity of the map from a probability distribution to the expectation. In numerical experiments, we demonstrate that our method gives an interesting counterexample to a common view in topological data analysis.

Notes

  1. This was originally called the persistence weighted Gaussian kernel in [26, 27] because we mainly focused on the Gaussian kernel \(k(x,y)=\exp (- \left\| x-y\right\| ^{2}/2\sigma ^{2}) ~ (\sigma >0)\) as the positive definite kernel, but the framework can be generalized to other positive definite kernels. Hence, we drop the word “Gaussian” here.

  2. A subset \(\mathbf {J}\subset \mathbb {R}\) is said to be an interval if, for any \(a,c \in \mathbf {J}\), every \(b \in \mathbb {R}\) satisfying \(a< b < c\) is in \(\mathbf {J}\).

  3. A multiset is a set with multiplicity of each point. Note that the collection of birth-death pairs should be a multiset because an interval decomposition of \(\mathbb {U}\) can contain several intervals with the same birth-death pairs.

  4. Precisely speaking, the definition of the birth and death times can contain \(-\infty \) and \(\infty \). However, in practice, we can assume that all birth and death times take neither \(\infty \) nor \(-\infty \) (for more details, please see Section 2.1.2 in [27]).

  5. By considering infinite multiplicity of the diagonal set \({\varDelta }\), there always exists a multi-bijection from \(D \cup {\varDelta }\) to \(E \cup {\varDelta }\). The bottleneck distance is also called \(\infty \)-Wasserstein distance.

  6. This is also shown in Corollary 4 of [2].

  7. \(B^{*}\) is the set of all continuous linear real-valued functions \(f:B \rightarrow \mathbb {R}\).

  8. We call \(V:\varOmega \rightarrow B\) Radon if, for any \({\varepsilon }>0\), there exists a compact set K in the Borel \(\sigma \)-algebra of B such that \({\mathrm {Pr}}(V \in K) \ge 1-{\varepsilon }\). For the proof of \(\left\| \mathbb {E}[V]\right\| _{B} \le \mathbb {E}[\Vert V \Vert _{B}]\) and other details, please see Section 2.1 in [29].

  9. We do not define the concept of type 2 in this paper because a Hilbert space is of type 2 and a Banach space which will be used in this paper is a reproducing kernel Hilbert space. For more details, please see Section 9.2 in [29].

  10. \(\mathcal {N}(\mu ,\sigma ^{2})\) denotes the normal distribution with mean \(\mu \in \mathbb {R}\) and variance \(\sigma ^{2}>0\).

  11. Without loss of generality, we can make \(\log (T \mathop {\mathrm {Lip}}(k)\mathop {\mathrm {Bdd}}(k)^{-1/2})>0\) by taking a larger T or \(\mathop {\mathrm {Lip}}(k)\) if necessary. Note that \(- \log {\varepsilon }\ge 0\) for \({\varepsilon }\in (0,1]\) and \( \int _{0}^{1} \sqrt{- \log {\varepsilon }}d{\varepsilon }< \infty \).

  12. A probability measure \(\pi \) on \(M \times M\) is called a coupling between \(\mu \) and \(\nu \) if, for a natural projection \(p_{i}(x_{1},x_{2})=x_{i} ~ (i=1,2, ~ x_{i} \in M)\), their induced measures satisfy \((p_{1})_{*}\pi =\mu \) and \((p_{2})_{*}\pi =\nu \).

  13. http://www.wpi-aimr.tohoku.ac.jp/hiraoka_labo/homcloud/index.en.html.

  14. In experiments, the quantile is estimated by resampling from both \(\varvec{D}_{n}\) and \(\varvec{E}_{n}\) randomly, called the aggregated data, because the current hypothesis is \(P=Q\). For more details, please see Appendix A.

  15. For comparison in machine learning tasks among the PWK vector, a persistence landscape, and the persistence scale-space kernel, please see [27].

  16. For the reason why the parameters are fixed in this way, please also see [27].

  17. \(\alpha \) is the significance level. q is the dimension of persistence diagrams. m is the number of sampling to compute the type I error empirically. N is the number of trials to compute the type I error.

  18. Note that \(\phi _{*}\mathcal {P}f^{k,w}_{z}\) is the value at z of the expectation of the PWK vector, that is, \(\phi _{*}\mathcal {P}f^{k,w}_{z}=\mathbb {E}_{X \sim \mathcal {P}}[V^{k,w}(D_{q}(\mathbb {B}(X)))](z)\).

References

  1. Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Hanson, E., Motta, F., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent homology. J. Mach. Learn. Res. 18(8), 1–35 (2017)

  2. Berlinet, A., Thomas-Agnan, C.: Reproducing kernel Hilbert spaces in probability and statistics. Springer, Berlin (2011)

  3. Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(1), 77–102 (2015)

  4. Bubenik, P., Scott, J.A.: Categorification of persistent homology. Discret. Comput. Geom. 51(3), 600–627 (2014). https://doi.org/10.1007/s00454-014-9573-x

  5. Cang, Z., Mu, L., Wu, K., Opron, K., Xia, K., Wei, G.W.: A topological approach for protein classification. Comput. Math. Biophys. 3(1), 140–162 (2015)

  6. Carlsson, G., de Silva, V.: Zigzag persistence. Found. Comput. Math. 10(4), 367–405 (2010)

  7. Carrière, M., Cuturi, M., Oudot, S.: Sliced Wasserstein kernel for persistence diagrams. In: D. Precup, Y.W. Teh (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 70, pp. 664–673. PMLR, International Convention Centre, Sydney, Australia (2017). http://proceedings.mlr.press/v70/carriere17a.html

  8. Chazal, F., Fasy, B., Lecci, F., Michel, B., Rinaldo, A., Wasserman, L.: Subsampling methods for persistent homology. In: F. Bach, D. Blei (eds.) Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 37, pp. 2143–2151. PMLR, Lille, France (2015). http://proceedings.mlr.press/v37/chazal15.html

  9. Chazal, F., Fasy, B.T., Lecci, F., Rinaldo, A., Singh, A., Wasserman, L.: On the bootstrap for persistence diagrams and landscapes. Model. Anal. Inf. Syst. 20(6), 111–120 (2013)

  10. Chazal, F., Fasy, B.T., Lecci, F., Rinaldo, A., Wasserman, L.: Stochastic convergence of persistence landscapes and silhouettes. In: Proceedings of the Thirtieth Annual Symposium on Computational Geometry, SOCG’14, pp. 474–483. ACM, New York, NY, USA (2014). https://doi.org/10.1145/2582112.2582128

  11. Chazal, F., de Silva, V., Oudot, S.: Persistence stability for geometric complexes. Geom. Dedic. 173(1), 193–214 (2014). https://doi.org/10.1007/s10711-013-9937-z

  12. Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discret. Comput. Geom. 37(1), 103–120 (2007). https://doi.org/10.1007/s00454-006-1276-5

  13. Crawley-Boevey, W.: Decomposition of pointwise finite-dimensional persistence modules. J. Algebra Appl. 14(05), 1550066 (2015)

  14. Donatini, P., Frosini, P., Lovato, A.: Size functions for signature recognition. Proceedings of SPIE - The International Society for Optical Engineering 3454 (1998)

  15. Durrett, R.: Probability: Theory and Examples. Cambridge University Press, Cambridge (2010)

  16. Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discret. Comput. Geom. 28(4), 511–533 (2002). https://doi.org/10.1007/s00454-002-2885-2

  17. Gameiro, M., Hiraoka, Y., Izumi, S., Kramar, M., Mischaikow, K., Nanda, V.: A topological measurement of protein compressibility. Jpn J. Ind. Appl. Math. 32(1), 1–17 (2015)

  18. Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.J.: A kernel method for the two-sample-problem. In: B. Schölkopf, J.C. Platt, T. Hoffman (eds.) Advances in Neural Information Processing Systems 19, pp. 513–520. MIT Press (2007). http://papers.nips.cc/paper/3110-a-kernel-method-for-the-two-sample-problem.pdf

  19. Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. J. Mach. Learn. Res. 13(Mar), 723–773 (2012)

  20. Gretton, A., Fukumizu, K., Harchaoui, Z., Sriperumbudur, B.K.: A fast, consistent kernel two-sample test. In: Y. Bengio, D. Schuurmans, J.D. Lafferty, C.K.I. Williams, A. Culotta (eds.) Advances in Neural Information Processing Systems 22, pp. 673–681. Curran Associates, Inc. (2009). http://papers.nips.cc/paper/3738-a-fast-consistent-kernel-two-sample-test.pdf

  21. Hatcher, A.: Algebraic Topology. Cambridge University Press, Cambridge (2002)

  22. Hiraoka, Y., Kusano, G.: Relative interleavings and applications to sensor networks. Jpn. J. Ind. Appl. Math. 33, 1–22 (2016)

  23. Hiraoka, Y., Nakamura, T., Hirata, A., Escolar, E.G., Matsue, K., Nishiura, Y.: Hierarchical structures of amorphous solids characterized by persistent homology. Proc. Natl. Acad. Sci. 113(26), 7035–7040 (2016). https://doi.org/10.1073/pnas.1520877113

  24. Kosorok, M.R.: Introduction to Empirical Processes and Semiparametric Inference. Springer, New York (2008). https://doi.org/10.1007/978-0-387-74978-5

  25. Kramár, M., Levanger, R., Tithof, J., Suri, B., Xu, M., Paul, M., Schatz, M.F., Mischaikow, K.: Analysis of Kolmogorov flow and Rayleigh–Bénard convection using persistent homology. Phys. D Nonlinear Phenom. 334, 82–98 (2016)

  26. Kusano, G., Hiraoka, Y., Fukumizu, K.: Persistence weighted Gaussian kernel for topological data analysis. In: M.F. Balcan, K.Q. Weinberger (eds.) Proceedings of The 33rd International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 48, pp. 2004–2013. PMLR, New York, New York, USA (2016). http://proceedings.mlr.press/v48/kusano16.html

  27. Kusano, G., Fukumizu, K., Hiraoka, Y.: Kernel method for persistence diagrams via kernel embedding and weight factor. J. Mach. Learn. Res. 18(189), 1–41 (2018)

  28. Kwitt, R., Huber, S., Niethammer, M., Lin, W., Bauer, U.: Statistical topological data analysis - a kernel perspective. In: C. Cortes, N.D. Lawrence, D.D. Lee, M. Sugiyama, R. Garnett (eds.) Advances in Neural Information Processing Systems 28, pp. 3070–3078. Curran Associates, Inc. (2015). http://papers.nips.cc/paper/5887-statistical-topological-data-analysis-a-kernel-perspective.pdf

  29. Ledoux, M., Talagrand, M.: Probability in Banach Spaces: Isoperimetry and Processes, vol. 23. Springer, New York (2013). https://doi.org/10.1007/978-3-642-20212-4

  30. Matérn, B.: Spatial variation: stochastic models and their application to some problems in forest surveys and other sampling investigations. Meddelanden Fran Statens Skogsforskningsinstitut 49(5), 1–144 (1960)

  31. Nakamura, T., Hiraoka, Y., Hirata, A., Escolar, E.G., Nishiura, Y.: Persistent homology and many-body atomic structure for medium-range order in the glass. Nanotechnology 26, 304001 (2015)

  32. Paulsen, V.I., Raghupathi, M.: An Introduction to the Theory of Reproducing Kernel Hilbert Spaces, vol. 152. Cambridge University Press, Cambridge (2016)

  33. Reininghaus, J., Huber, S., Bauer, U., Kwitt, R.: A stable multi-scale kernel for topological machine learning. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4741–4748 (2015). https://doi.org/10.1109/CVPR.2015.7299106

  34. Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Phys. D Nonlinear Phenom. 334, 99–117 (2016)

  35. Saadatfar, M., Takeuchi, H., Robins, V., Francois, N., Hiraoka, Y.: Pore configuration landscape of granular crystallization. Nat. Commun. 8, 15082 (2017). https://doi.org/10.1038/ncomms15082

  36. de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebraic Geom. Topol. 7(1), 339–358 (2007)

  37. Skraba, P., Ovsjanikov, M., Chazal, F., Guibas, L.: Persistence-based segmentation of deformable shapes. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, pp. 45–52 (2010). https://doi.org/10.1109/CVPRW.2010.5543285

  38. Van der Vaart, A.: Asymptotic Statistics, vol. 3. Cambridge University Press, Cambridge (1998). https://doi.org/10.1017/CBO9780511802256

  39. Zomorodian, A., Carlsson, G.: Computing persistent homology. Discret. Comput. Geom. 33(2), 249–274 (2005). https://doi.org/10.1007/s00454-004-1146-y

Acknowledgements

The author wishes to express sincere gratitude to Yasuaki Hiraoka, Tomoyuki Shirai, and Emerson Escolar for valuable discussions and comments on this paper. This work is supported by JSPS Research Fellow (17J02401).

Author information

Corresponding author

Correspondence to Genki Kusano.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Kernel two sample test

In this section, we briefly review the kernel two sample test, following [18, 19].

1.1 General framework of the two sample test

Let \((\mathcal {X}, \mathcal {B}_{\mathcal {X}})\) be a topological space, P and Q be probability distributions on \(\mathcal {X}\), \(X_{1},\ldots ,X_{m}, {\mathrm {i.i.d.}}\sim P\), \(Y_{1},\ldots , Y_{n}, {\mathrm {i.i.d.}}\sim Q\), and \(\theta (\varvec{X}_{m}, \varvec{Y}_{n})\) be a statistic. The two sample test examines the null hypothesis \(H_{0}: P=Q\). If \(H_{0}\) is true, the statistic \(\theta (\varvec{X}_{m}, \varvec{Y}_{n})\) depends only on P because \(Y_{1},\ldots , Y_{n}, {\mathrm {i.i.d.}}\sim Q=P\). Here, we assume that the upper \(\alpha \)-quantile \({\hat{\xi }}_{m,n,\alpha }\), which satisfies \({\mathrm {Pr}}(\theta (\varvec{X}_{m}, \varvec{Y}_{n}) \le {\hat{\xi }}_{m,n,\alpha })=1-\alpha \), is computable when \(H_{0}\) is true. When \(\alpha \) is set to a small value, if a pair of realizations \((\varvec{x}_{m}, \varvec{y}_{n})\) of \((\varvec{X}_{m},\varvec{Y}_{n})\) satisfies \({\hat{\xi }}_{m,n, 1-\alpha /2} \le \theta (\varvec{x}_{m}, \varvec{y}_{n}) \le {\hat{\xi }}_{m,n,\alpha /2}\), we accept the hypothesis \(H_{0}\). Otherwise, we reject \(H_{0}\). The threshold \(\alpha \) is called the significance level, and \(\alpha =0.01\) or 0.05 is often used.
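The acceptance rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quantiles are replaced by empirical quantiles of a list `null_samples`, which is assumed to hold simulated values of the statistic under \(H_{0}\), and the function names `empirical_quantile` and `accept_null` are hypothetical, not taken from the paper.

```python
import math


def empirical_quantile(samples, prob):
    """Smallest order statistic whose empirical CDF value is at least prob."""
    ordered = sorted(samples)
    index = max(0, math.ceil(prob * len(ordered)) - 1)
    return ordered[index]


def accept_null(observed, null_samples, alpha=0.05):
    """Two-sided rule: accept H0 when the observed statistic lies between
    the upper (1 - alpha/2)-quantile and the upper (alpha/2)-quantile.

    null_samples is an assumed list of values of the statistic simulated
    under H0; it stands in for the exact null distribution.
    """
    # Upper (1 - alpha/2)-quantile: Pr(theta <= xi) = alpha / 2.
    lower = empirical_quantile(null_samples, alpha / 2)
    # Upper (alpha/2)-quantile: Pr(theta <= xi) = 1 - alpha / 2.
    upper = empirical_quantile(null_samples, 1 - alpha / 2)
    return lower <= observed <= upper
```

With more simulated values the empirical quantiles approach the exact ones, so the length of `null_samples` controls how closely the rule matches the nominal significance level.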

1.2 Kernel method for the two sample problem

Let k be a measurable positive definite kernel on \(\mathcal {X}\) satisfying \(\int _{\mathcal {X}}\int _{\mathcal {X}} k(x,y)^{2}dP(x)dQ(y) < \infty \). In [18, 19], the statistic

$$\begin{aligned}&\mathrm {MMD}_{u}(\varvec{X}_{m},\varvec{Y}_{n};k)^{2} \nonumber \\&\quad := \frac{1}{m(m-1)}\sum _{i =1}^{m} \sum _{j \ne i}^{m} k(X_{i},X_{j}) + \frac{1}{n(n-1)}\sum _{a=1}^{n}\sum _{b \ne a}^{n} k(Y_{a},Y_{b}) \nonumber \\&\qquad - \frac{2}{mn}\sum _{i=1}^{m} \sum _{a=1}^{n}k(X_{i},Y_{a}) \end{aligned}$$
(20)

is used for the two sample test, and its asymptotic distribution under the null hypothesis is given as follows:

Theorem 12

(Theorem 8 in [18], Theorem 12 in [19]) Under the null hypothesis \(H_{0}\), \(n\mathrm {MMD}_{u}(\varvec{X}_{n},\varvec{Y}_{n};k)^{2} \rightarrow _{\mathrm {d}}\sum _{i=1}^{\infty }\lambda _{i}(z_{i}^{2}-2)\), where \(z_{1}, z_{2}, \ldots , {\mathrm {i.i.d.}}\sim \mathcal {N}(0,2)\) and \(\{\lambda _{i}\}_{i=1}^{\infty }\) are the solutions to the eigenvalue equation

$$\begin{aligned} \int _{\mathcal {X}} {\tilde{k}}(x, x')\psi _{i}(x)dP(x) = \lambda _{i} \psi _{i}(x'), \end{aligned}$$
(21)

and \({\tilde{k}}(x_{i},x_{j})=k(x_{i},x_{j})-\int _{\mathcal {X}}k(x_{i},x)dP(x)-\int _{\mathcal {X}}k(x,x_{j})dP(x)+\int _{\mathcal {X}}\int _{\mathcal {X}}k(x,x')dP(x)dP(x')\).
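As an illustration of Eq. (20), the unbiased statistic can be computed directly from its three sums. The sketch below assumes, purely for the example, that the samples are tuples of real numbers and that k is the Gaussian kernel; in the setting of this paper the samples would be persistence diagrams and k the kernel induced by the persistence weighted kernel. The function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)) on tuples of floats."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def mmd_u_squared(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate MMD_u^2 of Eq. (20) for samples xs and ys."""
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two points")
    # Within-sample sums skip the diagonal terms i == j and a == b.
    xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    yy = sum(kernel(ys[a], ys[b]) for a in range(n) for b in range(n) if a != b)
    # The cross term runs over all pairs (X_i, Y_a).
    xy = sum(kernel(x, y) for x in xs for y in ys)
    return xx / (m * (m - 1)) + yy / (n * (n - 1)) - 2.0 * xy / (m * n)
```

Because the diagonal terms are excluded, the estimate can be negative for small samples: for two identical samples consisting of the points 0 and 1 with \(\sigma =1\), it equals \(\exp (-1/2)-1\).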

In order to obtain the distribution of \(\sum _{i=1}^{\infty }\lambda _{i}(z_{i}^{2}-2)\) numerically, we approximate the eigenvalues \(\{\lambda _{i}\}_{i=1}^{\infty }\) in Eq. (21). Let \(\tilde{\varvec{k}}\) denote the centered Gram matrix of \(\{x_{1},\ldots ,x_{n}\}\) whose (i, j) component is given by \((\tilde{\varvec{k}})_{i,j}=k(x_{i},x_{j})- n^{-1} \sum _{b=1}^{n}k(x_{i},x_{b})-n^{-1} \sum _{a=1}^{n}k(x_{a},x_{j}) + n^{-2} \sum _{a,b=1}^{n}k(x_{a},x_{b})\), and let \(\{{\hat{\mu }}_{i}\}_{i=1}^{n}\) be the set of the eigenvalues of \(\tilde{\varvec{k}}\). Then, it is shown from Theorem 1 in [20] that \(\sum _{i=1}^{n}{\hat{\lambda }}_{i}(z_{i}^{2}-2) \rightarrow _{\mathrm {d}}\sum _{i=1}^{\infty }\lambda _{i}(z_{i}^{2}-2)\), where \({\hat{\lambda }}_{i}=n^{-1}{\hat{\mu }}_{i}\). Therefore, the upper \(\alpha \)-quantile of \(n\mathrm {MMD}_{u}(\varvec{X}_{n}, \varvec{Y}_{n})^{2}\) is numerically obtained from the histogram of \(\sum _{i=1}^{n}{\hat{\lambda }}_{i}(z_{i}^{2}-2)\). Since the current null hypothesis is \(P=Q\), we have \(Y_{i} \sim P\), and we can approximate the eigenvalues on the aggregated data, that is, the eigenvalues are approximated by those of the centered Gram matrix of \(\{x_{1},\ldots ,x_{n},y_{1},\ldots ,y_{n}\}\). We estimate the quantile of \(\sum _{i=1}^{2n}{\hat{\lambda }}_{i}(z_{i}^{2}-2)\) by the (standard) bootstrap method. To sum up, the algorithm for the kernel two sample test is given as follows:

[Algorithm 3: kernel two sample test (figure not reproduced)]

Since the output \({\hat{p}}\) of Algorithm 3 is the acceptance ratio of \(H_{0}\), \(1-{\hat{p}}\) is the empirical type I error when \(H_{0}\) is true.
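The eigenvalue approximation described in this section can be sketched with NumPy as follows. This is not a reproduction of Algorithm 3: it only centers a given Gram matrix, rescales its eigenvalues, and draws Monte Carlo samples of the weighted sum, and it omits the bootstrap step. The function names, the number of draws, and the seed are assumptions made for the example.

```python
import numpy as np


def centered_gram(gram):
    """Centered Gram matrix: subtract row and column means, add the grand mean."""
    row_mean = gram.mean(axis=1, keepdims=True)
    col_mean = gram.mean(axis=0, keepdims=True)
    return gram - row_mean - col_mean + gram.mean()


def simulate_null(gram, num_draws=1000, seed=0):
    """Draw samples of sum_i lambda_hat_i (z_i^2 - 2) with z_i ~ N(0, 2).

    gram is assumed to be the symmetric Gram matrix of the (aggregated) data.
    """
    n = gram.shape[0]
    mu_hat = np.linalg.eigvalsh(centered_gram(gram))  # eigenvalues of k-tilde
    lambda_hat = mu_hat / n                           # lambda_hat_i = mu_hat_i / n
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, np.sqrt(2.0), size=(num_draws, n))  # variance 2
    return (z ** 2 - 2.0) @ lambda_hat


def upper_quantile(draws, alpha):
    """Upper alpha-quantile xi with Pr(draw <= xi) approximately 1 - alpha."""
    return float(np.quantile(draws, 1.0 - alpha))
```

The value returned by `upper_quantile` would then serve as the threshold for the rescaled statistic; more draws give a smoother histogram at a proportional cost.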

About this article

Cite this article

Kusano, G. On the expectation of a persistence diagram by the persistence weighted kernel. Japan J. Indust. Appl. Math. 36, 861–892 (2019). https://doi.org/10.1007/s13160-019-00374-2
