An improved fast shapelet selection algorithm and its application to pervasive EEG

Abstract

With the rapid development of pervasive devices, various sensors generate large volumes of time series, and many time series classification (TSC) algorithms have been proposed to handle these data. Among them, shapelet-based algorithms have attracted great attention due to their high accuracy and strong interpretability. However, the time complexity of shapelet-based algorithms is high. In this paper, we propose an improved Fast Shapelet Selection algorithm based on Clustering (FSSoC), which greatly reduces the time needed for shapelet selection. First, time series are clustered into several groups with an improved k-means, and some time series are then sampled from each cluster with a strategy based on Euclidean distance sorting. Second, the Important Data Points (IDPs) of the sampled time series are identified, and only the subsequences between two nonadjacent IDPs are added to the shapelet candidates. The number of shapelet candidates is therefore greatly reduced, which leads to an obvious reduction in time consumption. Third, FSSoC is applied to the shapelet transformation algorithm to test classification accuracy and running time; the experiments demonstrate that FSSoC is clearly faster than existing shapelet selection algorithms while maintaining high accuracy. Finally, a case study on EEG time series is presented, which verifies the feasibility of applying FSSoC to automatically discover representative EEG features.
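The candidate-reduction step can be illustrated with a minimal sketch. The abstract does not define how IDPs are identified, so the code below assumes, purely for illustration, that IDPs are the two endpoints plus the strict local extrema of a series; the function names `important_points`, `idp_candidates`, and `all_candidates` are hypothetical and are not taken from the paper. The sketch only compares how many candidates remain when subsequences are restricted to those spanning two nonadjacent IDPs.

```python
from itertools import combinations


def important_points(series):
    """Return indices of stand-in IDPs: both endpoints plus strict local extrema.

    Assumption: the abstract does not define IDPs, so local extrema are used
    here only as an illustrative substitute.
    """
    idx = [0]
    for i in range(1, len(series) - 1):
        left, mid, right = series[i - 1], series[i], series[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            idx.append(i)
    idx.append(len(series) - 1)
    return idx


def idp_candidates(series):
    """Subsequences between two nonadjacent IDPs (at least one IDP in between)."""
    idps = important_points(series)
    return [
        series[idps[a]: idps[b] + 1]
        for a, b in combinations(range(len(idps)), 2)
        if b - a >= 2
    ]


def all_candidates(series, min_len=3):
    """Brute-force baseline: every subsequence of length >= min_len."""
    n = len(series)
    return [
        series[start:start + length]
        for length in range(min_len, n + 1)
        for start in range(n - length + 1)
    ]


series = [0.0, 1.0, 3.0, 2.0, 0.5, 1.5, 4.0, 3.5, 1.0, 0.0]
print("stand-in IDP indices:", important_points(series))
print(len(idp_candidates(series)), "IDP-based candidates vs",
      len(all_candidates(series)), "brute-force candidates")
```

On this ten-point toy series the restriction leaves 6 candidates instead of 36. The gap widens on longer series, because the brute-force count grows quadratically with series length while the IDP-based count grows quadratically only with the number of IDPs.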

Funding

We are grateful for the support of the Natural Science Foundation of Shandong Province, China (No. ZR2019MF071) and the National Natural Science Foundation of China (Nos. 61373149 and 61672329).

Author information

Corresponding authors

Correspondence to Xiangwei Zheng or Cun Ji.

About this article

Cite this article

Zou, X., Zheng, X., Ji, C. et al. An improved fast shapelet selection algorithm and its application to pervasive EEG. Pers Ubiquit Comput (2021). https://doi.org/10.1007/s00779-020-01501-4

Keywords

  • Time series classification
  • Shapelet selection
  • k-means
  • EEG features