Local Topological Data Analysis to Uncover the Global Structure of Data Approaching Graph-Structured Topologies
Gene expression data of differentiating cells, galaxies distributed in space, and earthquake locations all share a common property: they lie close to a graph-structured topology in their respective spaces [1, 4, 9, 10, 20]. In mathematics, such topologies are referred to as one-dimensional stratified spaces. Uncovering them often offers great insight into these data sets. However, methods for dimensionality reduction are inappropriate for this purpose, and existing methods from the relatively new field of Topological Data Analysis (TDA) are also inappropriate, due to noise sensitivity, computational complexity, or other limitations. In this paper we introduce a new method, termed Local TDA (LTDA), which resolves the issues of pre-existing methods by unveiling (global) graph-structured topologies in data through robust and computationally cheap local analyses. Our method rests on a simple graph-theoretic result that enables one to identify isolated, end-, edge- and multifurcation points in the topology underlying the data. It then uses this information to piece together a graph that is homeomorphic to the unknown one-dimensional stratified space underlying the point cloud data. We evaluate our method on a number of artificial and real-life data sets, demonstrating its superior effectiveness, robustness against noise, and scalability. Code related to this paper is available at: https://bitbucket.org/ghentdatascience/gltda-public.
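To make the idea of classifying points through local analyses concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible reading of the local step: link points closer than a threshold `eps`, take the points in a Euclidean shell around each point, and count the connected components of the linked graph restricted to that shell. Zero components suggest an isolated point, one an end point, two an edge point, and three or more a multifurcation point. The function names, the shell radii `r_inner` and `r_outer`, and the T-shaped toy data are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def proximity_graph(points, eps):
    """Adjacency lists of the graph linking points at distance <= eps."""
    adj = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def count_components(nodes, adj):
    """Number of connected components of the subgraph induced on `nodes`."""
    remaining = set(nodes)
    components = 0
    while remaining:
        components += 1
        queue = deque([remaining.pop()])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in remaining:
                    remaining.remove(v)
                    queue.append(v)
    return components


# Assumed mapping from shell component count to point type.
LABELS = {0: "isolated", 1: "end", 2: "edge"}


def classify_points(points, eps, r_inner, r_outer):
    """Label each point by the component count of its surrounding shell."""
    adj = proximity_graph(points, eps)
    labels = {}
    for p in points:
        # Shell: indices of points at distance in (r_inner, r_outer] from p.
        shell = [
            j for j, q in enumerate(points)
            if r_inner < math.dist(p, q) <= r_outer
        ]
        k = count_components(shell, adj)
        labels[p] = LABELS.get(k, "multifurcation")
    return labels


def demo_points():
    """Toy T-shaped point cloud on an integer grid plus one far-away point."""
    bar = [(x, 0) for x in range(-10, 11)]
    stem = [(0, y) for y in range(1, 11)]
    return bar + stem + [(100, 100)]
```

Calling `classify_points(demo_points(), eps=1.1, r_inner=2.5, r_outer=4.5)` labels the junction `(0, 0)` as a multifurcation point, the tip `(10, 0)` as an end point, the mid-arm point `(5, 0)` as an edge point, and `(100, 100)` as isolated. Points within a few steps of the junction are also labeled as multifurcation points, and points within a few steps of a tip as end points, so a full method would still need to group such labels and connect the groups into a graph.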
Keywords: Topological Data Analysis, Persistent homology, Metric spaces, Graph theory, Stratified spaces
This work was funded by the ERC under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant Agreement no. 615517, and the FWO (G091017N, G0F9816N).
- 6. Carlsson, G.: Topological pattern recognition for point cloud data (2013)
- 8. Chazal, F., Cohen-Steiner, D., Mérigot, Q.: Geometric inference for measures based on distance functions (2009)
- 11. Fasy, B.T., Wang, B.: Exploring persistent local homology in topological data analysis. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6430–6434 (2016)
- 17. Medina, P., Doerge, R.: Statistical methods in topological data analysis for complex, high-dimensional data. In: Annual Conference on Applied Statistics in Agriculture (2015)
- 18. Nanda, V., Sazdanović, R.: Simplicial models and topological inference in biological systems. In: Jonoska, N., Saito, M. (eds.) Discrete and Topological Models in Molecular Biology. NCS, pp. 109–141. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-40193-0_6
- 22. Wang, K.: The basic theory of persistent homology (2012)