Data Mining and Knowledge Discovery, Volume 33, Issue 1, pp 96–130

Domain agnostic online semantic segmentation for multi-dimensional time series

  • Shaghayegh Gharghabi
  • Chin-Chia Michael Yeh
  • Yifei Ding
  • Wei Ding
  • Paul Hibbing
  • Samuel LaMunion
  • Andrew Kaplan
  • Scott E. Crouter
  • Eamonn Keogh

Abstract

Unsupervised semantic segmentation in the time series domain is a much-studied problem due to its potential to detect unexpected regularities and regimes in poorly understood data. However, current techniques have four primary shortcomings that have limited the adoption of time series semantic segmentation beyond academic settings. First, most methods require setting or learning many parameters and thus may have problems generalizing to novel situations. Second, most methods implicitly assume that all the data is segmentable and have difficulty when that assumption is unwarranted. Third, many algorithms are only defined for the single-dimensional case, despite the ubiquity of multi-dimensional data. Finally, most research efforts have been confined to the batch case, but online segmentation is clearly more useful and actionable. To address these issues, we present a multi-dimensional algorithm that is domain agnostic, has only one easily determined parameter, and can handle data streaming at a high rate. We test the algorithm on the largest and most diverse collection of time series datasets ever considered for this task and demonstrate the algorithm’s superiority over current solutions.

Keywords

Time series, Semantic segmentation, Online algorithms

Notes

Acknowledgements

We gratefully acknowledge NIH R01HD083431 and NSF awards 1510741 and 1544969. We also acknowledge the many donors of datasets.

Copyright information

© The Author(s) 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, University of California, Riverside, USA
  2. Department of Computer Science, University of Massachusetts Boston, Boston, USA
  3. Department of Kinesiology, Recreation, and Sport Studies, The University of Tennessee Knoxville, Knoxville, USA
