Abstract
The idea of using geometry in learning and inference has a long history, going back to canonical ideas such as Fisher information, discriminant analysis, and principal component analysis. The related area of Topological Data Analysis (TDA) has been developing over the last decade. The idea is to extract robust topological features from data and use these summaries for modeling the data. A topological summary generates a coordinate-free, deformation-invariant, and highly compressed description of the geometry of an arbitrary data set. Topological techniques are well suited to extending our understanding of Big Data. These tools do not supplant existing techniques, but rather provide a complementary viewpoint. The qualitative nature of topological features does not give particular importance to individual samples, and the coordinate-free nature of topology generates algorithms and viewpoints well suited to highly complex datasets. With the introduction of persistence and other geometric-topological ideas, we can find and quantify local-to-global properties as well as quantify qualitative changes in data.
Notes
- 1. Technically, these are path-connected components. However, this distinction is a mathematical formality, as the two are indistinguishable in any form of sampled data.
- 2.
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Vejdemo-Johansson, M., Skraba, P. (2016). Topology, Big Data and Optimization. In: Emrouznejad, A. (eds) Big Data Optimization: Recent Developments and Challenges. Studies in Big Data, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-319-30265-2_7
DOI: https://doi.org/10.1007/978-3-319-30265-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30263-8
Online ISBN: 978-3-319-30265-2
eBook Packages: Engineering, Engineering (R0)