Topological classifier for detecting the emergence of epileptic seizures
Abstract
Objective
An innovative method based on topological data analysis is introduced for classifying EEG recordings of patients affected by epilepsy. We construct a topological space from a collection of EEG signals using persistent homology; we then analyse the space by means of persistent entropy, a global topological feature, in order to classify healthy and epileptic signals.
Results
The performance of the resulting one-feature-based linear topological classifier is tested on the PhysioNet dataset. The quality of classification is evaluated in terms of the Area Under Curve (AUC) of the receiver operating characteristic curve. The linear topological classifier achieves an AUC of \(97.2\%\), while a classifier based on Sample Entropy achieves an AUC of \(62.0\%\).
Keywords
Complex systems, Brain, Epilepsy, Topological data analysis, Persistent entropy, Time series
Abbreviations
 AUC
Area Under Curve
 EEG
electroencephalogram
 H
persistent entropy
 PCD
Point Cloud Data
 ROC
receiver operating characteristic
 TDA
topological data analysis
 LTC
linear topological classifier
Introduction
Epilepsy is a chronic brain disorder characterised by recurrent seizures of varying severity and with different manifestations. Seizures are caused by sudden excessive electrical discharges in a group of neurons [1] and are defined as spontaneous hypersynchronous activity of clusters of neurons [2].
The human brain can be considered a complex self-adaptive system composed of billions of non-identical neurons, entangled in loops of nonlinear interactions that determine the brain's behaviours [3]. Epilepsy is one example of such behaviours: identifying the onset of a neural hypersynchronisation is similar to discovering patterns of information expressed by a network of interactions in the space of neurons.
The electroencephalogram (EEG) is the standard technique used to record the electrical activity of the brain. Direct observation of EEG signals helps neurologists diagnose epilepsy, whereas automatic methods are still not used in practice, even though several have been proposed in the literature over the last decades [4, 5, 6, 7, 8]. The intrinsic nonlinearity and non-stationarity of EEG signals require methods capable of extracting global information that characterises the processes described by the signals.
Topological data analysis (TDA) is able to extract such information [9, 10, 11, 12, 13, 14]; it has recently been applied to the analysis of EEG signals [15] within the TOPDRIM project [16]. The key concept in TDA is persistent homology: a procedure for counting, through a process called filtration, the higher-dimensional persistent holes of topological spaces. Its output can be visualised as persistence barcodes or as persistence diagrams.
In this paper we describe the realisation of a persistent entropy-based classifier that discriminates epileptic EEG signals from non-epileptic ones. The proposed method defines an automatic classifier of signals and is a preliminary step towards the automatic detection of epileptic seizures. Afterwards, we use the Vietoris–Rips filtration to understand how the regions of the brain are involved in the spreading of epileptic signals.
Main text
Material and methods
Dataset
TDA: a new method for data analysis
Consider a set of points G, i.e. our data, embedded in a d-dimensional space \(\mathbb {D}^d\) and assume that those data were sampled from an unknown k-dimensional space \(\mathbb {D}^k\) with \(k \le d\). Our task is to reconstruct the space \(\mathbb {D}^k\) from the dataset G.
In TDA, the elements of G are equipped with a notion of proximity that characterises a coordinate-free metric. Those points are converted into topological spaces called simplicial complexes. Simplicial complexes are made up of building blocks called simplices: points are 0-simplices, line segments are 1-simplices, filled triangles are 2-simplices, filled tetrahedra are 3-simplices and so on (see Fig. 1d).
A filtration is a collection of nested simplicial complexes. Building a filtration can be seen as wearing lenses for examining the dataset: different lenses allow different kinds of information to be extracted from the topological space. In this paper we use the Piecewise filtration and the Vietoris–Rips filtration. Choosing a filtration is a crucial step: different filtrations give rise to different conversions of the data points G into simplicial complexes [18, 19, 20].
Piecewise filtration
Piecewise filtration, recently introduced by Rucco et al. [21], is used for studying signals. The procedure is based on the well-known concept of a piecewise linear (PL) function, \(PL:\mathbb {R}\rightarrow \mathbb {R}\), shown in Fig. 2a, b.
Vietoris–Rips filtration
Vietoris–Rips filtration is used for studying Point Cloud Data (PCD). It creates a sequence of simplicial complexes, built on a metric space, that adds topological structure to an otherwise disconnected set of points [22, Chapter III]. Figure 2c, d, e shows a graphical representation of this approach.
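As an illustration of the idea, the following minimal Python sketch builds a Vietoris–Rips complex at a single scale. It assumes the common convention that a set of points spans a simplex when all pairwise Euclidean distances are at most `eps`; the function name `vietoris_rips`, the parameter `max_dim` and the brute-force enumeration are choices made for this example, not the implementation used in the study (which relied on JavaPlex).

```python
import itertools
import math


def vietoris_rips(points, eps, max_dim=2):
    """Return the simplices (as index tuples) of the Vietoris-Rips complex.

    A k-simplex is included when every pair of its vertices lies within
    distance eps (assumed convention). Brute force: small inputs only.
    """
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= eps

    # 0-simplices: every point is a vertex of the complex.
    simplices = [(i,) for i in range(n)]
    # Higher simplices: vertex subsets whose members are pairwise close.
    for size in range(2, max_dim + 2):
        for combo in itertools.combinations(range(n), size):
            if all(close(i, j) for i, j in itertools.combinations(combo, 2)):
                simplices.append(combo)
    return simplices


# Growing eps yields nested complexes, i.e. a filtration.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
small = vietoris_rips(cloud, eps=1.0)
large = vietoris_rips(cloud, eps=1.5)
```

Every simplex of `small` also appears in `large`, which is the nesting property that a filtration requires.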
Persistent homology
Persistent homology is the combinatorial counterpart of homology, an algebraic object that counts the number of n-dimensional holes in a topological space, the so-called Betti numbers. The filtration process is necessary for the computation of persistent homology. The set of Betti numbers is composed of \(\beta _0\), the number of connected components in a generic topological space K; \(\beta _1\), the number of holes in K; \(\beta _2\), the number of voids in K; and so on. Along the filtration, persistent homology calculates k-dimensional Betti intervals: in a k-dimensional Betti interval \([t_{start}, t_{end}]\), \(t_{start}\) is the time at which a k-dimensional hole appears in the simplicial complex and \(t_{end}\) is the time at which it disappears. The holes that are still present at \(t_{end}= t_{max}\) correspond to persistent topological features [23]. A graphical representation of those intervals in K is called a persistence barcode and is associated with a filtration. An equivalent representation is the persistence diagram [24]. Additional information returned by the computation of persistent homology is the list of generators, i.e. the simplices involved in the holes. Experimentally, the generators play a crucial role in the description of the data under analysis [25, 26].
Persistent entropy
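For reference, persistent entropy is commonly defined (see e.g. [25]) as the Shannon entropy of the normalised barcode lengths: given intervals \([t_{start}^i, t_{end}^i]\), \(i=1,\dots ,n\), with lengths \(\ell _i = t_{end}^i - t_{start}^i\) and \(L=\sum _i \ell _i\), one sets \(p_i=\ell _i/L\) and \(\mathcal {H} = -\sum _{i=1}^{n} p_i \log p_i\). The sketch below assumes that definition, the natural logarithm, and finite intervals (intervals that persist to the end of the filtration are assumed to have been truncated at a chosen maximum value beforehand); the function name is hypothetical.

```python
import math


def persistent_entropy(intervals):
    """Shannon entropy of the normalised lengths of a persistence barcode.

    intervals: iterable of (t_start, t_end) pairs with finite t_end.
    Zero-length intervals contribute nothing and are skipped.
    """
    lengths = [end - start for start, end in intervals if end > start]
    total = sum(lengths)
    if total <= 0:
        raise ValueError("barcode has no interval of positive length")
    probabilities = [length / total for length in lengths]
    return -sum(p * math.log(p) for p in probabilities)


# Two equally long bars give the maximum entropy for n = 2, i.e. log(2).
h_equal = persistent_entropy([(0.0, 1.0), (2.0, 3.0)])
# One dominant bar and one short bar give a lower value.
h_skewed = persistent_entropy([(0.0, 9.0), (0.0, 1.0)])
```

Under this definition the entropy is largest when all bars have the same length and decreases as a few bars dominate the barcode.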
A new topological classifier for epilepsy

Step I: preprocessing of the input.
Step II: computation of H using the Piecewise filtration and derivation of a linear topological classifier (LTC).
Step III: identification of regions involved in the spreading of the epileptic signals using Vietoris–Rips filtration.
Step I
1. Filtering the EEG reduces noise by means of a band-pass filter between 1 and 70 Hz, and removes power-line interference by means of a notch filter between 48 and 52 Hz [28, 29, 30].
2. Downsampling the EEG reduces the time needed for the computation of the topological features during the subsequent steps. The worst-case complexity of computing persistent homology using the JavaPlex tool [31] is cubic in the number of simplices. This number is linear with respect to the number of points in the case of piecewise complexes. Downsampling should be used only if it preserves the main geometrical characteristics of the original signals, that is, their shape. In MATLAB we used the command “decimate” [32].
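The two preprocessing operations above can be sketched in Python with SciPy (the study itself used MATLAB). Several values here are assumptions made for illustration only: the sampling rate `fs = 256` Hz, the fourth-order Butterworth design, a notch centred at 50 Hz with quality factor 12.5 (roughly a 4 Hz stop band), and zero-phase filtering. The function name `preprocess` is hypothetical.

```python
import numpy as np
from scipy import signal


def preprocess(eeg, fs=256.0, df=10):
    """Band-pass, notch-filter and decimate a single EEG channel.

    eeg: 1-D array of samples; fs: sampling rate in Hz (assumed);
    df: decimation factor.
    """
    # Band-pass between 1 and 70 Hz (Butterworth design is an assumption).
    sos = signal.butter(4, [1.0, 70.0], btype="bandpass", fs=fs, output="sos")
    filtered = signal.sosfiltfilt(sos, eeg)
    # Notch at the assumed power-line frequency; bandwidth = 50 / Q = 4 Hz.
    b, a = signal.iirnotch(50.0, Q=12.5, fs=fs)
    filtered = signal.filtfilt(b, a, filtered)
    # Decimate: low-pass anti-aliasing filter followed by downsampling.
    return signal.decimate(filtered, df, zero_phase=True)


# Ten seconds of synthetic noise stand in for a real recording.
rng = np.random.default_rng(0)
raw = rng.standard_normal(2560)
reduced = preprocess(raw)
```

With `df = 10` the output has one tenth of the input samples; whether the shape of the signal is preserved has to be checked on the actual recordings, as noted above.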
Step II
After performing the Piecewise filtration, we computed \(\mathcal {H}\) for each \(\mathbb {\tilde{S}}^j\), thus obtaining a vector of 23 values of \(\mathcal {H}\). Then, we calculated the average value of this vector, \(\widehat{\mathcal {H}^j}\). \(\widehat{\mathcal {H}^j}\) is our 1-dimensional feature, able to differentiate signals by looking at their shapes [21].
We repeated the procedure using Sample Entropy, a well-established technique in time series analysis [33, 34], on the same dataset. Finally, we trained an \(\mathcal {H}\)-based supervised classifier and a Sample Entropy-based supervised LTC. We randomly divided the dataset into a training (\(70\%\)) and a testing (\(30\%\)) subset. We applied a 10-fold cross validation.
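The evaluation protocol can be sketched as follows. How the 70/30 split and the 10-fold cross validation were combined is not specified here, so this example makes one plausible assumption: the folds are drawn from the training subset only. The function names, the fixed seed and the number of signals used in the usage lines are hypothetical.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly split range(n) into training and testing index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


def k_folds(indices, k=10):
    """Partition indices into k disjoint folds of nearly equal size."""
    return [indices[i::k] for i in range(k)]


train, test = split_indices(100)
folds = k_folds(train, k=10)
# In each round one fold is held out for validation and the rest are used
# for fitting; the test subset is touched only once, at the very end.
rounds = [
    (sorted(i for f in folds[:r] + folds[r + 1:] for i in f), sorted(folds[r]))
    for r in range(len(folds))
]
```

Fixing the seed makes the split reproducible, which helps when comparing the \(\mathcal {H}\)-based and Sample Entropy-based features on identical partitions.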
Step III
Results
We tested our method on the non-downsampled signals and with two values of the decimation factor (\(df = 10\) and \(df=100\)). We report the results for \(df =10\), which produced 92160 samples per signal (N = 92160), because \(\mathcal {H}\) did not change significantly for \(df= 100\). Fig. 3c, d reports the frequency of the values of \(\mathcal {H}_i^j\). The class of epileptic patients is characterised by a peak of 313 elements in the range [0.942, 0.967] of \(\mathcal {H}\) values, with centre value 0.955. The class of healthy patients is characterised by a peak containing 247 elements with \(\mathcal {H}\) values in [0.930, 0.948], with centre value 0.939. The strong separation between the two populations is evident in Fig. 3a, where \(\widehat{\mathcal {H}^j}\) is plotted. The Wilcoxon test (p-value = 1.8346e−36 and confidence interval [1.6942, 1.9675]), used because the classes are not normally distributed, confirmed the separation. Sample Entropy failed to separate the two classes (see Fig. 3b).
The receiver operating characteristic (ROC) curves of the two classifiers are shown in Fig. 3e, f. The Area Under Curve (AUC) for the \(\widehat{\mathcal {H}}\)-based LTC is \(97.2\%\), while the AUC for the Sample Entropy-based classifier is \(62\%\). The ROC curve of the \(\widehat{\mathcal {H}}\)-based classifier suggests that the best threshold for the separation of the two classes is \(\theta =0.8754\).
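To make the evaluation concrete, the sketch below computes the AUC of a one-feature classifier through its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. It also shows a threshold rule of the kind an LTC applies. Treating the epileptic class as positive and as the class with the larger feature value is an assumption of this example, as are the function names and the toy scores.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


def classify(feature_value, theta):
    """Threshold rule: True means 'epileptic' (assumed direction)."""
    return feature_value > theta


# Toy feature values, not data from the study.
epileptic = [0.955, 0.960, 0.948]
healthy = [0.939, 0.935, 0.950]
area = auc(epileptic, healthy)
```

The pairwise form is quadratic in the number of samples, which is acceptable for small datasets; a rank-based computation gives the same value more efficiently.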
For each patient we extracted the values of the Betti numbers: although there are fewer epileptic than healthy signals with \(\beta _0\) (3 vs. 12), this difference is not significant (p-value = 0.6946, Wilcoxon test). In Fig. 3h, i the generators of all the i-dimensional holes found were grouped in a frequency histogram. The epileptic patients are characterised by 3 sensors (IDs 1, 2 and 5), while the healthy patients are characterised by the sensors with IDs 1, 2, 3, 7, 10, 13 and 14. Those histograms are to be interpreted quantitatively: the sensors involved in the spread of epilepsy are few compared with those involved in normal brain activity.
Limitations
The results for the classifier are very promising, although we are aware that the small number of samples calls for further investigation of the effectiveness of the method. Moreover, the role of the generators should be investigated in more depth. Nevertheless, we believe the present methodology provides a useful example of the use of TDA, especially in time series analysis.
Notes
Authors’ contributions
MP, MR and EM conceived the project. MP and MR performed the computations. MP wrote the paper. MP, MR, LT and EM contributed to the interpretation of the results. EM and LT critically revised the manuscript. All authors read and approved the final manuscript.
Acknowledgements
The authors thank Giovanna Viticchi and Riccardo Ricciuti, from AOU Ospedali Riuniti di Ancona (Italy), for valuable discussions on medical aspects of epilepsy, and Alessandra Renieri for discussions on algebraic issues.
Competing interests
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential competing interest.
Availability of data and materials
The datasets analysed during the current study are available at the PhysioBank repository, https://physionet.org/physiobank/database/chbmit/, https://doi.org/10.13026/C2K01R. PhysioBank databases are made available under the ODC Public Domain Dedication and License v1.0 [17].
Consent to publish
Not applicable.
Ethics approval and consent to participate
Not applicable.
Funding
Financial support for this work was provided by the Future and Emerging Technologies (FET) program within the Seventh Framework Programme (FP7) for Research of the European Commission, under the FET Proactive grant agreement TOPDRIM, number FP7-ICT-318121. The funding sources had no role in the design of the study, in the analysis and interpretation of the data, or in writing the manuscript.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. World Health Organization. Epilepsy key facts. http://www.who.int/en/newsroom/factsheets/detail/epilepsy. Accessed 29 May 2018.
2. Majumdar K, Prasad PD, Verma S. Synchronization implies seizure or seizure implies synchronization? Brain Topogr. 2014;27(1):112–22.
3. Telesford QK, Simpson SL, Burdette JH, Hayasaka S, Laurienti PJ. The brain as a complex system: using network science as a tool for understanding the brain. Brain Connectivity. 2011;1(4):295–308.
4. McSharry PE, Smith L, Tarassenko L. Comparison of predictability of epileptic seizures by a linear and a nonlinear method. IEEE Trans Biomed Eng. 2003;50(5):628–33.
5. Iasemidis LD, Sackellares JC. Review: Chaos theory and epilepsy. Neurosci. 1996;2:118–26.
6. Santaniello S, Burns SP, Golby AJ, Singer JM, Anderson WS, Sarma SV. Quickest detection of drug-resistant seizures: an optimal control approach. Epilepsy Behav. 2011;22:S49–60.
7. Iasemidis LD, Pardalos P, Sackellares JC, Shiau DS. Quadratic binary programming and dynamical system approach to determine the predictability of epileptic seizures. J Comb Optim. 2001;5:9–26.
8. Merelli E, Piangerelli M. Rnn-based model for self-adaptive systems - the emergence of epilepsy in the human brain. In: NCTA 2014 - proceedings of the international conference on neural computation theory and applications, part of IJCCI 2014, Rome; 22–24 October 2014. p. 356–61. https://doi.org/10.5220/0005165003560361.
9. Perea JA, Harer J. Sliding windows and persistence: an application of topological methods to signal analysis. Found Comput Math. 2015;15(3):799–838.
10. Rucco M, Concettoni E, Cristalli C, Ferrante A, Merelli E. Topological classification of small dc motors. In: 2015 IEEE 1st international forum on research and technologies for society and industry leveraging a better tomorrow (RTSI). IEEE; 2015. p. 192–7.
11. de Silva V, Ghrist R. Coverage in sensor networks via persistent homology. Algebraic Geom Topol. 2007;7:339–58.
12. Chan JM, Carlsson G, Rabadan R. Topology of viral evolution. Proc Natl Acad Sci. 2013;110(46):18566–71.
13. Ibekwe AM, Ma J, Crowley DE, Yang CH, Johnson AM, Petrossian TC, Lum PY. Topological data analysis of Escherichia coli O157:H7 and non-O157 survival in soils. Front Cell Infect Microbiol. 2014;4:122.
14. Taylor D, Klimm F, Harrington HA, Kramár M, Mischaikow K, Porter MA, Mucha PJ. Topological data analysis of contagion maps for examining spreading processes on networks. Nat Commun. 2015;6:7723.
15. Merelli E, Rucco M, Piangerelli M, Toller D. A topological approach for multivariate time series characterization: the epilepsy case study. In: Proceedings of the 9th EAI conference on bio-inspired information and communications technologies (BICT 2015). 2015.
16. TOPDRIM Website. http://www.topdrim.eu. Accessed 29 May 2018.
17. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000;101(23):215–20. https://doi.org/10.1161/01.CIR.101.23.e215.
18. Edelsbrunner H, Harer J. Persistent homology - a survey. Contemp Math. 2008;453:257–82.
19. Carlsson G, Zomorodian A, Collins A, Guibas L. Persistence barcodes for shapes. In: Proceedings of the 2004 Eurographics/ACM SIGGRAPH symposium on geometry processing. ACM; 2004. p. 124–35.
20. Binchi J, Merelli E, Rucco M, Petri G, Vaccarino F. jholes: a tool for understanding biological complex networks via clique weight rank persistent homology. Electronic Notes Theor Comput Sci. 2014;306:5–18.
21. Rucco M, Gonzalez-Diaz R, Jimenez MJ, Atienza N, Concettoni E, Cristalli C, Ferrante A, Merelli E. A new topological entropy-based approach for measuring similarities among piecewise linear functions. Submitted. http://arxiv.org/abs/1512.07613. (2016)
22. Edelsbrunner H, Harer J. Computational topology: an introduction. Providence, R.I.: American Mathematical Society; 2010.
23. Adams H, Tausz A. Javaplex tutorial. Stanford: Stanford University; 2011.
24. Zomorodian A, Carlsson G. Computing persistent homology. Discret Comput Geom. 2005;33(2):249–74.
25. Merelli E, Rucco M, Sloot P, Tesei L. Topological characterization of complex systems: using persistent entropy. Entropy. 2015;17(10):6872–92.
26. Lockwood S, Krishnamoorthy B. Topological features in cancer gene expression data. 2014. arXiv preprint arXiv:1410.3198.
27. Rucco M, Castiglione F, Merelli E, Pettini M. Characterisation of the idiotypic immune network through persistent entropy. In: Proceedings of 11th European conference on complex systems (ECCS 2014). Berlin: Springer; 2015. p. 117–28.
28. Schmidt H, Petkov G, Richardson MP, Terry JR. Dynamics on networks: the role of local dynamics and global networks on the emergence of hypersynchronous neural activity. PLoS Comput Biol. 2014;10(11):1003947.
29. Mateo J, Sánchez-Morla E, Santos J. A new method for removal of powerline interference in ecg and eeg recordings. Comput Electrical Eng. 2015;45:235–48.
30. Keshtkaran MR, Yang Z. A fast, robust algorithm for power line interference cancellation in neural recording. J Neural Eng. 2014;11(2):026017.
31. Tausz A, Vejdemo-Johansson M, Adams H. Javaplex: a research platform for persistent homology. Book of abstracts: minisymposium on publicly available geometric/topological software, June 17th & 19th, 2012, Chapel Hill, NC, USA. p. 7–12.
32. MathWorks: Decimate documentation. http://it.mathworks.com/help/signal/ref/decimate.html. Accessed 29 May 2018.
33. Song Y, Liò P. A new approach for epileptic seizure detection: sample entropy based feature extraction and extreme learning machine. J Biomed Sci Eng. 2010;3(06):556.
34. Richman JS, Moorman JR. Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol Heart Circ Physiol. 2000;278(6):2039–49.
35. Mershon B. Vietoris–Rips complex block. http://bl.ocks.org/bmershon/41bc67cfedf95f7d196d. Accessed 29 May 2018.
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.