An Efficient Method for Estimating Time Series Motif Length Using Sequitur Algorithm

  • Nguyen Ngoc Phien
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 251)

Abstract

Motifs are approximately repeated subsequences found within a long time series. Several popular and effective algorithms exist for finding motifs in time series. However, these algorithms share one major weakness: the user must select an appropriate motif length, which is unknown in advance. In this paper, we propose a novel method to estimate the length of the 1-motif in a time series. The method is based on GrammarViz, a variable-length motif detection approach that has Sequitur at its core. Sequitur is a grammar compression algorithm that identifies not only common subsequences but also the hierarchical structure in data. Like GrammarViz, our method relies on the Sequitur algorithm, but uses it for a different purpose: as a preprocessing step for finding motifs in time series. The experimental results show that our method can very quickly estimate the length of the 1-motif for some TSMD algorithms, such as Random Projection.
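The general idea of using grammar inference to suggest a motif length can be illustrated with a minimal sketch. This is not the method of the paper: for brevity it swaps Sequitur for a simpler Re-Pair-style pair replacement, and it assumes the time series has already been discretized into a string of single-character symbols (for example by a SAX-like step). The function names and the `points_per_symbol` parameter are hypothetical and exist only for this illustration.

```python
from collections import Counter


def repair_grammar(symbols):
    """Re-Pair-style grammar induction (a stand-in for Sequitur).

    Repeatedly replaces the most frequent adjacent pair with a new rule
    until no adjacent pair occurs at least twice.
    Terminals are single-character strings; rule ids are integers.
    """
    seq = list(symbols)
    rules = {}  # rule id -> (left symbol, right symbol)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        rule_id = len(rules)
        rules[rule_id] = pair
        # Rewrite the sequence left to right, replacing each occurrence.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(rule_id)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(sym, rules):
    """Expand a symbol or rule id back into its terminal string."""
    if sym not in rules:
        return sym
    left, right = rules[sym]
    return expand(left, rules) + expand(right, rules)


def estimate_motif_length(symbols, points_per_symbol=1):
    """Return (estimated length in data points, repeated symbol pattern).

    Hypothetical heuristic: take the longest rule expansion that occurs
    at least twice (non-overlapping) in the symbol string.
    """
    _, rules = repair_grammar(symbols)
    best = ""
    for rule_id in rules:
        text = expand(rule_id, rules)
        if symbols.count(text) >= 2 and len(text) > len(best):
            best = text
    return len(best) * points_per_symbol, best


# Example: the pattern "abcd" repeats three times; with an assumed
# 8 data points per symbol the suggested motif length is 32 points.
length, pattern = estimate_motif_length("abcdxabcdyabcdz", points_per_symbol=8)
print(length, pattern)
```

The suggested length could then be passed as the motif-length parameter to a fixed-length motif discovery algorithm; choosing the longest repeated rule is only one possible heuristic.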

Keywords

Motif detection; Sequitur algorithm; Grammar inference; Motif length; Time series

Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018

Authors and Affiliations

  1. Center for Applied Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
  2. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
