Inferring Temporal Phenotypes with Topological Data Analysis and Pseudo Time-Series

  • Arianna Dagliati
  • Nophar Geifman
  • Niels Peek
  • John H. Holmes
  • Lucia Sacchi
  • Seyed Erfan Sajjadi
  • Allan Tucker
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11526)


Temporal phenotyping enables clinicians to better understand observable characteristics of a disease as it progresses. Modelling disease progression that captures interactions between phenotypes is inherently challenging. Temporal models that capture change in disease over time can identify the key features that characterize disease subtypes that underpin these trajectories. These models will enable clinicians to identify early warning signs of progression in specific subtypes and therefore to make informed decisions tailored to individual patients. In this paper, we explore two approaches to building temporal phenotypes based on the topology of data: topological data analysis and pseudo time-series. Using type 2 diabetes data, we show that the topological data analysis approach is able to identify trajectories representing different temporal phenotypes and that pseudo time-series can infer a state space model characterized by transitions between hidden states that represent distinct temporal phenotypes. Both approaches highlight lipid profiles as key factors in distinguishing the phenotypes.


Type 2 diabetes · Unsupervised machine learning · Longitudinal studies · Electronic phenotyping



This work was co-funded by the Medical Research Council and the Engineering and Physical Sciences Research Council grant MR/N00583X/1 “Manchester Molecular Pathology Innovation Centre (MMPathIC): bridging the gap between biomarker discovery and health and wealth” and the NIHR Manchester Biomedical Research Centre.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Arianna Dagliati (1, 2), corresponding author
  • Nophar Geifman (1)
  • Niels Peek (2, 3)
  • John H. Holmes (4)
  • Lucia Sacchi (5)
  • Seyed Erfan Sajjadi (6)
  • Allan Tucker (6)

  1. Centre for Health Informatics, University of Manchester, Manchester, UK
  2. Manchester Molecular Pathology Innovation Centre, University of Manchester, Manchester, UK
  3. NIHR Manchester Biomedical Research Centre, University of Manchester, Manchester, UK
  4. Department of Biostatistics, Epidemiology, and Informatics, Penn Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
  5. Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
  6. Department of Computer Science, Brunel University London, London, UK
