Time Series Discretization Based on the Approximation of the Local Slope Information
In the last decade, symbolic representation approaches such as the Symbolic Aggregate Approximation (SAX) have proven effective for knowledge discovery in time series. However, SAX does not preserve the local slope information of the time series because it uses only the mean values of the segments. The Extended SAX (ESAX) modification was proposed to address this problem by increasing the dimensionality of the representation. In this paper, we present a symbolic representation method that preserves the behavior of local slope characteristics in the symbolic representations of the time series. The proposed method was evaluated with three different discretization approaches and compared with the SAX and ESAX algorithms. The experimental evaluation, using artificial and real datasets with 1-nearest-neighbor classification, demonstrates the method's effectiveness in reducing the error rates of time series classification and in keeping the local slope information in the symbolic representations.
Keywords: time series, knowledge discovery, symbolic representation, classification, dimensionality reduction
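The limitation of SAX noted in the abstract can be illustrated with a minimal sketch. This is not the method proposed in the paper; it only shows, under simple assumptions, how a mean-based symbolization assigns the same symbol to a rising and a falling segment. The function names (`znorm`, `segments`, `sax_word`, `segment_slopes`), the three-letter alphabet, and the toy series are hypothetical choices made for illustration, and the series length is assumed to be divisible by the number of segments.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev


def znorm(series):
    """Z-normalize a series (zero mean, unit standard deviation)."""
    mu, sigma = fmean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def segments(series, w):
    """Split a series into w equal-length segments (length assumed divisible by w)."""
    n = len(series) // w
    return [series[i * n:(i + 1) * n] for i in range(w)]


def sax_word(series, w, alphabet="abc"):
    """Mean-based symbolization in the style of SAX: one symbol per segment mean."""
    a = len(alphabet)
    # Breakpoints split the standard normal distribution into a equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    means = [fmean(seg) for seg in segments(znorm(series), w)]
    return "".join(alphabet[bisect_left(breakpoints, m)] for m in means)


def segment_slopes(series, w):
    """Least-squares slope of each segment of the z-normalized series."""
    slopes = []
    for seg in segments(znorm(series), w):
        t_mean = (len(seg) - 1) / 2
        y_mean = fmean(seg)
        num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg))
        den = sum((t - t_mean) ** 2 for t in range(len(seg)))
        slopes.append(num / den)
    return slopes


# Toy series: a rising segment followed by a falling one with the same mean.
series = [0, 1, 2, 3, 3, 2, 1, 0]
word = sax_word(series, w=2)
slopes = segment_slopes(series, w=2)
print(word, slopes)
```

Both segments have the same mean, so the mean-based word repeats a single symbol, while the per-segment slopes have opposite signs. That sign is the kind of local information a slope-preserving representation aims to retain.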