Towards Prediction of Heart Arrhythmia Onset Using Machine Learning
The current study aims to predict the onset of malignant cardiac arrhythmia in patients with Implantable Cardioverter-Defibrillators (ICDs) using Machine Learning algorithms. The input data consisted of 184 signals of RR intervals from 29 patients with ICDs, recorded both during normal heartbeat and arrhythmia. For every signal we generated 47 descriptors with different signal analysis methods. We then performed feature selection using several methods and used the selected features to build predictive models with the Random Forest algorithm. The entire modelling procedure was performed within a 5-fold cross-validation procedure that was repeated 10 times. Results were stable and repeatable. The results obtained (AUC = 0.82, MCC = 0.45) are statistically significant and show that RR intervals carry information about arrhythmia onset. The sample size used in this study was too small to build useful medical predictive models; hence, larger data sets should be explored to construct models of sufficient quality to be of direct utility in medical practice.
Keywords: Arrhythmia, Implantable Cardioverter-Defibrillators, Artificial intelligence, Machine Learning, Random Forest
Some types of cardiac arrhythmia, such as VF (ventricular fibrillation) or VT (ventricular tachycardia), are life-threatening. Therefore, prediction, detection, and classification of arrhythmia are very important issues in clinical cardiology, both for diagnosis and treatment. Recent research has concentrated on the latter two problems, namely detection and classification of arrhythmia, which is now a mature field. The resulting algorithms are implemented in Implantable Cardioverter-Defibrillators (ICDs), which are used routinely to treat cardiac arrhythmia. However, the related problem of predicting arrhythmia events remains challenging.
In recent years we have observed an increased interest in the application of Machine Learning (ML) and artificial intelligence methods to the analysis of biomedical data, in the hope of introducing new diagnostic or predictive tools. A recent article by Shakibfar et al. describes the prediction of electrical storm (i.e. arrhythmic syndrome) with the help of Random Forest, using daily summaries of ICD monitoring. The authors generated 37 predictive variables from the daily ICD summaries of 19935 patients and applied ML algorithms to construct predictive models. They concluded that Machine Learning methods can predict the short-term risk of electrical storm, but that the models should be combined with clinical data to improve their accuracy.
In the current study ML algorithms are used to predict the onset of malignant cardiac arrhythmia from RR intervals. This is an important problem, since the standard methods of prediction aim at stratification of patients into high- and low-risk groups using various sources of clinical data. The patients from the high-risk group then undergo surgical implantation of an ICD, which monitors the heart rate. The algorithms for identification of arrhythmia events implemented in these devices recognise the event and apply an electric signal that restores proper functioning of the heart. Despite technological progress, inappropriate ICD interventions are still a very serious side-effect of this kind of therapy. About 10–\(30\%\) of therapies delivered by ICDs have been estimated to be inappropriate. These are usually caused by supraventricular tachyarrhythmias, T-wave oversensing, noise or non-sustained ventricular arrhythmias.
The goal of the current study is to examine whether one may predict an incoming arrhythmia event using only the signal available to these devices. If such predictions are possible with high enough accuracy, they might be communicated by ICDs to warn patients of an incoming event, helping to minimise adverse effects or possibly even avoid them completely. One of the first studies considering this problem was a Master of Science thesis by P. Iranitalab. In that study the author applied time and frequency domain analysis of the QRS complex as well as RR interval variability analysis to data from only 18 patients, and concluded that none of these methods proved to be an effective predictor that could be applied successfully to a large patient population. The analysis was performed on normal (sinus) and pre-arrhythmia EGM (ventricular electrogram) data. The most recent article considering prediction of ventricular tachycardia (VT) and ventricular fibrillation (VF) was published in September 2019 by Taye et al. The authors extracted features from HRV and ECG signals and used artificial neural network (ANN) classifiers to predict the VF onset 30 s before its occurrence. The prediction accuracy was \(72\%\) using HRV features and \(98.6\%\) using QRS complex shape features from ECG signals, but only 27 recordings were used in that study.
Other studies that seem to be related [7, 8, 9] in fact consider different issues. In [7] the authors investigate high-risk patients with an ICD and evaluate QT dispersion, which may be a significant predictor of cardiovascular mortality. They report that QT dispersion at rest did not predict the occurrence and/or reoccurrence of ventricular arrhythmias. In [8] the authors propose a new atrial fibrillation (AF) prediction algorithm that explores the prelude of AF by classifying the ECG before AF into normal and abnormal states. The ECG was transformed into a spectrogram using the short-time Fourier transform, and a model was then trained on the spectrograms. Paper [9] appears to address detection or classification rather than prediction of the onset of arrhythmia: the authors used a clustering approach and a regression methodology to predict the type of cardiac arrhythmia.
Machine Learning algorithms are powerful tools, but they should be used with caution. Loring et al. mention in their paper the possible difficulties in applying methods of this kind (e.g. the need for critical evaluation of methodology, methodological errors that are difficult to detect, and challenging clinical interpretation). We planned our research taking this into account.
2 Materials and Methods
2.1 Data Set
The input data consisted of 184 tachograms (signals of RR-intervals i.e. beat-to-beat intervals, observed in ECG) from 29 patients with single chamber ICD implanted in the years 1995–2000 due to previous myocardial infarction. Only data from patients with devices compatible with the PDM 2000 (Biotronik) and STDWIN (Medtronic) programs were analysed in the study. Patients who had a predominantly paced rhythm were excluded from the study. The VF zone was active in all patients with the lower threshold from 277 ms to 300 ms. The VT zone was switched on in all patients. Antitachycardia pacing (ATP) was the first therapy in the VT zone. Ventricular pacing rate was 40–60 beats/min (bpm).
Table 1. Patients' clinical characteristics (\(n=29\)). ACEI – Angiotensin Converting Enzyme Inhibitors, ARB – Angiotensin Receptor Blockers, CABG – Coronary Artery Bypass Grafting, PCI – Percutaneous Coronary Intervention, SCD – Sudden Cardiac Death.
Male gender (%)
Left ventricular ejection fraction %
PCI n (%)
CABG n (%)
Indications for ICD implantation n (%)
Primary prophylaxis of SCD
Secondary prophylaxis of SCD
Pharmacological treatment n (%)
ACEI and (or) ARB
ICD manufacturer n (%)
2.2 Data Preprocessing
Data preprocessing was performed with the help of the RHRV package for analysis of heart rate variability of ECG records, implemented in R. We followed the basic procedure proposed by the authors of this package. First, the heart beat positions were used to build an instantaneous heart rate series. Then, the basic filter was applied in order to eliminate spurious data points. Finally, an interpolated version of the data series with equally spaced values was generated and used in frequency analysis. The default parameters were used for the analysis, with the exception of the width of the window for further analysis, as described later. For every signal we generated descriptors from a basic analysis in the time domain and in the frequency domain, and we also calculated parameters related to selected nonlinear methods.
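The three preprocessing steps can be illustrated with a minimal Python sketch. It is not the RHRV implementation: the function names, the acceptance range of 25–200 bpm, the 13% jump threshold and the 4 Hz interpolation rate are assumptions chosen for illustration, and the filter is deliberately simpler than an adaptive one.

```python
def instantaneous_hr(beat_times):
    """Turn beat positions (seconds) into an instantaneous heart rate series (bpm)."""
    times, bpm = [], []
    for prev, cur in zip(beat_times, beat_times[1:]):
        rr = cur - prev
        if rr > 0:
            times.append(cur)
            bpm.append(60.0 / rr)
    return times, bpm


def filter_spurious(times, bpm, min_bpm=25.0, max_bpm=200.0, max_jump=0.13):
    """Drop values outside [min_bpm, max_bpm] or jumping too far from the last kept value.

    All three thresholds are assumed values, not taken from the paper.
    """
    kept_t, kept_hr = [], []
    for t, hr in zip(times, bpm):
        if not (min_bpm <= hr <= max_bpm):
            continue
        if kept_hr and abs(hr - kept_hr[-1]) / kept_hr[-1] > max_jump:
            continue
        kept_t.append(t)
        kept_hr.append(hr)
    return kept_t, kept_hr


def interpolate_uniform(times, values, freq_hz=4.0):
    """Linearly interpolate an unevenly sampled series onto a grid of step 1/freq_hz."""
    step = 1.0 / freq_hz
    n = int((times[-1] - times[0]) / step) + 1
    out, j = [], 0
    for i in range(n):
        t = times[0] + i * step
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j + 1 < len(times) - 1 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out
```

A steady rhythm with one premature beat shows the effect: the implausible 300 bpm value is rejected and the resampled series stays at 60 bpm.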
The preprocessed data series was then used to generate 47 descriptors using the following approaches: statistical analysis in the time domain, analysis in the frequency (Fourier analysis) and time-frequency (wavelet analysis) domains, and nonlinear analysis (Poincaré maps, detrended fluctuation analysis, and recurrence quantification analysis). A detailed description of the parameters is presented below.
SDNN – standard deviation of the RR intervals,
SDANN – standard deviation of the average RR intervals calculated over short periods (50 s),
SDNNIDX – mean of the standard deviations calculated over the windowed RR intervals,
pNN50 – proportion of differences between successive RR intervals greater than 50 ms,
SDSD – standard deviation of successive differences,
r-MSSD – root mean square of successive differences,
IRRR – length of the interval determined by the first and third quartile of the \(\varDelta \)RR time series,
MADRR – median of the absolute values of the \(\varDelta \)RR time series,
TINN – triangular interpolation of the RR interval histogram,
HRV index – St. George’s index.
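Several of the time-domain descriptors listed above follow directly from their definitions. The sketch below is an illustration in Python rather than the RHRV code used in the study; the function name and the choice of the sample standard deviation are assumptions, and the windowed descriptors (SDANN, SDNNIDX) and histogram-based ones (TINN, HRV index) are omitted.

```python
import math
import statistics


def time_domain_descriptors(rr_ms):
    """Compute a subset of time-domain HRV descriptors from RR intervals in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]  # the delta-RR series
    abs_diffs = [abs(d) for d in diffs]
    q1, _, q3 = statistics.quantiles(diffs, n=4)       # quartiles of delta-RR
    return {
        "SDNN": statistics.stdev(rr_ms),
        "SDSD": statistics.stdev(diffs),
        "rMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Percentage of successive differences larger than 50 ms in magnitude.
        "pNN50": 100.0 * sum(d > 50 for d in abs_diffs) / len(diffs),
        "IRRR": q3 - q1,
        "MADRR": statistics.median(abs_diffs),
    }
```

For an alternating series such as `[800, 860, 800, 860, 800]` every successive difference is 60 ms in magnitude, so r-MSSD and MADRR both equal 60 and pNN50 is 100%.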
Spectral analysis is based on the application of the Fourier transform in order to decompose signals into sinusoidal components with fixed frequencies. The power spectrum yields information about the frequencies occurring in the signals. In particular, we used the RHRV package and applied the STFT (short-time Fourier transform) with a Hamming window (in our computations with parameters size = 50 and shift = 5, which, after interpolation, gives 262–376 windows, depending on the signal).
Wavelet analysis allows simultaneous analysis of the time and frequency contents of signals. This is achieved by fixing a function called the mother wavelet and decomposing the signal into shifted and scaled versions of this function, which makes it possible to distinguish local characteristics of signals precisely. By computing the wavelet power spectrum one can obtain information about the frequencies occurring in the signal as well as about when these frequencies occur. In this study we used Daubechies wavelets.
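The decomposition into shifted and scaled wavelets can be sketched with one level of the discrete wavelet transform. The example below uses the four-coefficient Daubechies filter with periodic boundary handling; the paper does not state which member of the Daubechies family or which boundary treatment was used, so both are assumptions, as are the function names.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies scaling (low-pass) filter with four coefficients.
LOW = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
# Matching wavelet (high-pass) filter: g[k] = (-1)^k * h[3 - k].
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_step(signal):
    """One DWT level with periodic wrap-around; len(signal) must be even."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


def wavelet_power_by_level(signal, levels):
    """Mean squared detail coefficient at each level (finest scale first)."""
    powers, current = [], list(signal)
    for _ in range(levels):
        current, detail = dwt_step(current)
        powers.append(sum(d * d for d in detail) / len(detail))
    return powers
```

Each level halves the sampling rate, so the detail coefficients of successive levels cover successively lower frequency bands, while their position in the list retains the time at which that band was active.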
We obtained mean values and standard deviations of the power spectrum (using Fourier and wavelet transforms) for four frequency bands: ULF (ultra low frequency, 0–0.003 Hz), VLF (very low frequency, 0.003–0.03 Hz), LF (low frequency, 0.03–0.15 Hz) and HF (high frequency, 0.15–0.4 Hz). We also computed mean values and standard deviations of the LF/HF ratio, using Fourier and wavelet transforms.
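The band powers for a single analysis window can be sketched as follows, using the band limits given above. This is a plain discrete Fourier transform of one Hamming-windowed, mean-removed segment written for readability, not the RHRV routine; the function name, the mean removal and the normalisation by the segment length are assumptions.

```python
import cmath
import math

# Band limits in Hz, as listed in the text.
BANDS = {"ULF": (0.0, 0.003), "VLF": (0.003, 0.03), "LF": (0.03, 0.15), "HF": (0.15, 0.4)}


def band_powers(segment, fs_hz):
    """Sum the periodogram of one Hamming-windowed segment over each band."""
    n = len(segment)
    mean = sum(segment) / n
    windowed = [
        (x - mean) * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(segment)
    ]
    powers = dict.fromkeys(BANDS, 0.0)
    for k in range(1, n // 2 + 1):
        freq = k * fs_hz / n
        coeff = sum(w * cmath.exp(-2j * math.pi * k * i / n) for i, w in enumerate(windowed))
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += abs(coeff) ** 2 / n
    return powers
```

Sliding this over the interpolated series in steps of `shift` and taking the mean and standard deviation of each band across windows would give descriptors of the kind described. Note that with a 50 s window the frequency resolution is 0.02 Hz, so no bin falls inside the ULF band in this sketch.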
Parameters from Nonlinear Methods
Poincaré Maps. We used standard parameters derived from Poincaré maps. These are return maps, in which each measurement is plotted as a function of the previous one. The shape of the plot describes the evolution of the system and allows us to visualise the variability of the time series (here RR intervals). The standard descriptors used to quantify Poincaré plot geometry are SD1 and SD2 [16, 17], which are obtained by fitting an ellipse to the Poincaré map. We also computed the SD1/SD2 ratio.
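SD1 and SD2 can be computed without an explicit ellipse fit, because they are the standard deviations of the point cloud perpendicular to and along the identity line. A minimal sketch, assuming the usual variance-based formulas and a hypothetical function name:

```python
import math
import statistics


def poincare_sd(rr):
    """Return (SD1, SD2) of the Poincaré plot of consecutive RR intervals."""
    x, y = rr[:-1], rr[1:]
    # SD1: spread perpendicular to the identity line (short-term variability).
    sd1 = math.sqrt(statistics.variance([b - a for a, b in zip(x, y)]) / 2.0)
    # SD2: spread along the identity line (long-term variability).
    sd2 = math.sqrt(statistics.variance([a + b for a, b in zip(x, y)]) / 2.0)
    return sd1, sd2
```

A strictly alternating series has all its variability across the identity line (SD2 = 0), whereas a steadily drifting series has all of it along the line (SD1 = 0).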
DFA Method. Detrended Fluctuation Analysis (DFA) quantifies fractal-like autocorrelation properties of the signals [18, 19]. This method is a modified RMS (root mean square) for the random walk. Mean square distance of the signal from the local trend line is analysed as a function of scale parameter. There is usually a power-law dependence and an interesting parameter is the exponent. We obtained 2 parameters: short-range scaling exponent (fast parameter f.DFA) and long-range scaling exponent (slow parameter s.DFA) for time scales.
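The procedure described above (integrate the signal, remove a local linear trend in boxes of a given size, and read the exponent from the slope of the log-log plot) can be sketched in Python. The function names and the choice of scales are assumptions; the scale ranges that separate the short-range and long-range exponents in the study are not given in the text.

```python
import math


def _residual_sum_of_squares(y):
    """Sum of squared residuals after removing a least-squares line from y."""
    n = len(y)
    xm = (n - 1) / 2.0
    ym = sum(y) / n
    sxx = sum((i - xm) ** 2 for i in range(n))
    sxy = sum((i - xm) * (v - ym) for i, v in enumerate(y))
    slope = sxy / sxx
    return sum((v - ym - slope * (i - xm)) ** 2 for i, v in enumerate(y))


def dfa_exponent(signal, scales):
    """Estimate the DFA scaling exponent over the given box sizes."""
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for v in signal:                      # integrated (random-walk) profile
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        boxes = len(profile) // n
        ss = sum(_residual_sum_of_squares(profile[b * n:(b + 1) * n]) for b in range(boxes))
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(ss / (boxes * n)))   # log of RMS fluctuation
    mx, my = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    return sum((a - mx) * (b - my) for a, b in zip(log_n, log_f)) / sum(
        (a - mx) ** 2 for a in log_n
    )
```

Calling the function once with small box sizes and once with large ones would yield a fast and a slow exponent in the spirit of f.DFA and s.DFA. Uncorrelated noise gives an exponent near 0.5, and its running sum gives a value close to 1.5.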
REC – recurrence, percentage of recurrence points in a recurrence plot,
DET – determinism, percentage of recurrence points that form diagonal lines,
RATIO – ratio between DET and REC, the density of recurrence points in a recurrence plot,
Lmax – length of the longest diagonal line,
DIV – inverse of Lmax,
Lmean – mean length of the diagonal lines; Lmean takes into account the main diagonal,
LmeanWithoutMain – mean length of the diagonal lines; the main diagonal is not taken into account,
ENTR – Shannon entropy of the diagonal line lengths distribution,
TREND – trend of the number of recurrent points depending on the distance to the main diagonal,
LAM – percentage of recurrent points that form vertical lines,
Vmax – longest vertical line,
Vmean – average length of the vertical lines.
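A few of the recurrence quantification measures listed above can be sketched directly from a recurrence matrix. The example assumes an embedding dimension of 1, a fixed radius `eps`, a minimum line length of 2, and exclusion of the main diagonal; none of these settings is stated in the text, and the function name is hypothetical.

```python
def recurrence_measures(x, eps, lmin=2):
    """Return REC, DET (both in percent) and Lmax for a scalar series x."""
    n = len(x)
    # Points i and j recur when their values lie within eps of each other.
    rec = [[abs(x[i] - x[j]) <= eps for j in range(n)] for i in range(n)]
    total = sum(rec[i][j] for i in range(n) for j in range(n) if i != j)
    in_lines, lmax = 0, 0
    for offset in range(1, n):            # upper diagonals; lower ones mirror them
        run = 0
        for i in range(n - offset + 1):   # the extra step flushes the last run
            hit = i < n - offset and rec[i][i + offset]
            if hit:
                run += 1
            else:
                if run >= lmin:
                    in_lines += 2 * run   # count the mirrored diagonal as well
                lmax = max(lmax, run)
                run = 0
    return {
        "REC": 100.0 * total / (n * (n - 1)),
        "DET": 100.0 * in_lines / total if total else 0.0,
        "Lmax": lmax,
    }
```

For a perfectly periodic series every recurrence point lies on a diagonal line, so DET reaches 100%, and DIV can then be obtained as the inverse of Lmax.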
2.4 Identification of Informative Variables
We used several methods to identify the descriptors generated from the signal that are related to the occurrence of arrhythmia, namely the straightforward t-test, the importance measure from the Random Forest, the relevant variables returned by the Boruta algorithm for all-relevant feature selection, and the relevant variables returned by the MDFS (Multi-Dimensional Feature Selection) algorithm [23, 24]. Boruta is a wrapper around the Random Forest algorithm, whereas MDFS is a filter that relies on multi-dimensional information entropy and can therefore take into account non-linear relationships and synergistic interactions between multiple descriptors and the decision variable. We applied MDFS in one- and two-dimensional mode, using default parameters. All computations were performed in R, using R packages.
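The simplest of these filters, the t-test, can be sketched as a ranking of descriptors by the magnitude of a two-sample statistic. The code below uses the Welch form of the statistic, which is an assumption since the text does not say which variant was applied; the function names and the toy descriptor values are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Welch two-sample t statistic (unequal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def rank_features(rows, labels):
    """Rank descriptor names by |t| between the two classes, strongest first.

    rows: list of dicts mapping descriptor name to value; labels: 0 or 1.
    """
    scores = {}
    for name in rows[0]:
        pos = [r[name] for r, y in zip(rows, labels) if y == 1]
        neg = [r[name] for r, y in zip(rows, labels) if y == 0]
        scores[name] = abs(welch_t(pos, neg))
    return sorted(scores, key=scores.get, reverse=True)
```

A univariate filter of this kind cannot detect descriptors that are informative only in combination, which is the motivation given above for also using Boruta and the two-dimensional mode of MDFS.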
2.5 Predictive Models
The Random Forest classifier is an ensemble of decision trees, each of which is built as follows:
let the number of training objects be \(N\), and the number of features in the feature vector be \(M\),
the training set for each tree is built by choosing \(N\) times with replacement from all \(N\) available training objects,
a number \(m \ll M\) of features is considered when making the decision at each node; these features are chosen randomly for each node,
each tree is built to the largest extent possible; there is no pruning.
Repetition of this algorithm yields a forest of trees, which all have been trained on bootstrap samples from training set. Thus, for a given tree, certain elements of training set will have been left out during training. The randomForest function was called with default parameters, with one modification – 1000 trees were used instead of 500.
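The two sources of randomness described above, bootstrap sampling of objects and random choice of candidate features at each node, can be sketched as follows. This illustrates the sampling only and does not grow trees; the function names are hypothetical, and the value of 6 candidate features used in the note below assumes the randomForest default for classification (the integer part of the square root of the number of features) applied to all 47 descriptors.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n object indices with replacement: the training set of one tree."""
    return [rng.randrange(n) for _ in range(n)]


def out_of_bag(n, sample):
    """Indices never drawn for this tree, i.e. left out during its training."""
    chosen = set(sample)
    return [i for i in range(n) if i not in chosen]


def feature_subset(n_features, m, rng):
    """Pick m distinct candidate features for the decision at one node."""
    return rng.sample(range(n_features), m)
```

For large N roughly \(1/e \approx 37\%\) of the objects are left out of each bootstrap sample, which is why every tree has its own set of unseen elements. With 47 descriptors, `feature_subset(47, 6, rng)` would mimic one node under the assumption stated above.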
Measuring Quality of Models and Validation of Modelling Procedure. Three metrics were used to assess the quality of models: AUC (area under the ROC curve) and MCC (Matthews Correlation Coefficient), in addition to the ordinary error level. The former two measures are more robust, in particular for imbalanced data sets.
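Both measures have short closed-form definitions, sketched below in Python with hypothetical function names. The MCC is computed from the confusion matrix, and the AUC from the fraction of positive-negative pairs that the scores rank correctly, which is equivalent to the area under the ROC curve.

```python
import math


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {0, 1}."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

The robustness to imbalance is easy to see: a classifier that always predicts the majority class on a 90/10 data set has an error of only 10%, yet its MCC is 0.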
It is well known that variable selection can introduce significant over-fitting, especially when the parameters selected within cross-validation are not highly informative. To deal with this problem and to estimate the robustness of the models, the entire modelling procedure was performed within a five-fold cross-validation scheme. The procedure was then repeated ten times and the results were averaged to remove the dependence on a particular split of the data set into folds. This protocol is computationally very demanding, since the entire modelling procedure is performed 50 times. In particular, the most time-consuming part of the protocol, namely the identification of informative variables, is also performed 50 times. Nevertheless, these computations are essential for a robust estimate of the performance of the machine learning models.
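The structure of this protocol can be sketched as a loop in which everything that looks at the labels, including feature selection, happens inside each fold. In the Python sketch below `run_fold` is a hypothetical callback that would select features and fit a model on the training indices only, then return a score on the held-out indices; how the folds were actually drawn in the study is not specified, so the random split is an assumption.

```python
import random


def repeated_cv(n_samples, run_fold, k=5, repeats=10, seed=0):
    """Run k-fold cross-validation `repeats` times and average the fold scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                       # a new split for every repeat
        folds = [order[i::k] for i in range(k)]
        for held_out in folds:
            test_idx = set(held_out)
            train = [i for i in order if i not in test_idx]
            # Feature selection and model fitting must use `train` only.
            scores.append(run_fold(train, held_out))
    return sum(scores) / len(scores), scores
```

With the default arguments the callback is invoked 5 × 10 = 50 times, which matches the count of complete modelling runs quoted above.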
3 Results and Discussion
3.1 Feature Selection
Table 2. Number of occurrences of parameters in the cross-validation loop for different feature selection methods.
For the Random Forest feature importance, results are shown for the best 10 features. The most frequently appearing parameters, SD1/SD2 and SD2, are obtained from the Poincaré maps. The s.DFA parameter arises from the Detrended Fluctuation Analysis. The HRVi, IRRR, r-MSSD, SDNNIDX, TINN, MADRR, SDSD and pNN50 variables are the statistical parameters in the time domain. The mean.fULF, sd.fHF and sd.wHF arise in the wavelet analysis. Interestingly, all methods agree that the variables arising from nonlinear analysis are the most important. Beyond that, the relative importance of variables diverges among methods. Most methods agree that statistical variables in the time domain are important, but there are significant differences between methods with respect to which of them are most relevant. The largest disagreement concerns variables arising from spectral analysis: most methods consider them irrelevant in general, but some methods consider individual spectral variables very important.
3.2 Predictive Models
Table 3. Testing the possibility of arrhythmia prediction (true labels versus random labels).
Table 4. Results of prediction on different feature sets with selected parameters (mean value ± standard deviation of the mean).
We focused on the Random Forest model. Prediction was evaluated by 5-fold cross-validation. Tests were carried out in two ways. First, we performed 1000 iterations with true labels (Table 3, row labelled true). The result was poor: the median and mean error were both about 0.3. Nevertheless, it shows that prediction is possible. We then repeated the same procedure with random labels: before each iteration a new set of labels was randomised, and prediction was then performed using Random Forest in 5-fold cross-validation. The results are given in Table 3 (row labelled random). The mean and median prediction error were 0.5. The comparison of the results in Table 3 shows that there is a significant difference between prediction based on real and on random labels.
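The logic of this check can be sketched on synthetic data. To keep the example short and self-contained, a nearest-class-mean classifier on a single feature stands in for the Random Forest, and the data are simulated; both are substitutions made for illustration, not the study's setup, and all names are hypothetical.

```python
import random
import statistics


def cv_error(values, labels, k=5):
    """k-fold cross-validated error of a nearest-class-mean classifier."""
    idx = list(range(len(values)))
    errors = 0
    for f in range(k):
        test = idx[f::k]
        train = [i for i in idx if i % k != f]
        means = {}
        for c in (0, 1):
            members = [values[i] for i in train if labels[i] == c]
            means[c] = statistics.mean(members) if members else float("inf")
        for i in test:
            pred = min((0, 1), key=lambda c: abs(values[i] - means[c]))
            errors += pred != labels[i]
    return errors / len(values)


def label_permutation_test(values, labels, n_iter=200, seed=0):
    """Compare the error on true labels with the mean error on shuffled labels."""
    rng = random.Random(seed)
    true_error = cv_error(values, labels)
    shuffled_errors = []
    for _ in range(n_iter):
        shuffled = labels[:]
        rng.shuffle(shuffled)             # a fresh random labelling each time
        shuffled_errors.append(cv_error(values, shuffled))
    return true_error, statistics.mean(shuffled_errors)
```

When the feature carries information, the error on the true labels falls clearly below the level of about 0.5 obtained with shuffled labels, which is the pattern reported for the true and random rows.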
The prediction results in the cross-validation loop for the different feature selection algorithms, measured by AUC and MCC, are presented in Table 4. The best results were obtained for the classifier that used the 10 most relevant variables from the Random Forest. Results are stable and repeatable.
Each model was built using all variables that were deemed relevant by a feature selection algorithm in a given iteration of the cross-validation. Usually the number of relevant variables was close to 10, depending on the feature selection method applied.
Based on the obtained results we conclude that it is possible to find information about arrhythmia in RR intervals, but that it is too weak to build useful medical predictive models using currently available methods. The subject requires further research to find algorithms better suited to the problem. In particular, a substantial increase in the size of the experimental sample, for instance by two or three orders of magnitude, should improve the quality of the models, as has been shown in numerous applications of Machine Learning tools to different problems. Additionally, it is likely that building individual models for each patient could yield better results.
This work was supported by the Polish Ministry of Science and Higher Education under subsidy for maintaining the research potential of the Institute of Informatics, University of Białystok (grant BST-144).
Conflict of Interest
A. P. was a consultant for Biotronik and receives lecture fees from Medtronic, Biotronik and Abbott. He also has a proctoring contract with Medtronic.
- 5. Iranitalab, I.: Prediction of arrythmia through analysis of the ventricular electrogram. A thesis presented to The Faculty of the Department of Chemical and Materials Engineering. San Jose State University (2009)
- 7. Blužaitė, I., Rickli, H., et al.: Assessment of QT dispersion in prediction of life-threatening ventricular arrythmias in recipients of implantable cardioverter defibrillator. Elek. Elektrotech. 75(3), 73–76 (2007)
- 9. Cp, P., Suresh, A., Suresh, G.: Prediction of cardiac arrhythmia type using clustering and regression approach (P-CA-CRA). In: 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 51–54. IEEE (2017)
- 11. Przybylski, A., Baranowski, R., et al.: Verification of implantable cardioverter defibrillator (ICD) interventions by nonlinear analysis of heart rate variability - preliminary results. Eur. Eur. Pacing Arrhythm. Card. Electrophysiol. J. Work. Groups Card. Pacing Arrhythm. Card. Cell. Electrophysiol. Eur. Soc. Cardiol. 6, 617–624 (2004). https://doi.org/10.1016/j.eupc.2004.08.001
- 13. R Development Core Team, R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2008)
- 24. Mnich, K., Rudnicki, W.R.: All-relevant feature selection using multidimensional filters with exhaustive search. Inf. Sci. (2020, in Press). https://doi.org/10.1016/j.ins.2020.03.024
- 27. Liaw, A., Wiener, M.: Classification and Regression by randomForest. R News. 2, 18–22 (2002)