Time Series Discretization Based on the Approximation of the Local Slope Information

  • Willian Zalewski
  • Fabiano Silva
  • Huei Diana Lee
  • Andre Gustavo Maletzke
  • Feng Chung Wu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7637)

Abstract

In the last decade, symbolic representation approaches such as the Symbolic Aggregate Approximation (SAX) have proven effective for knowledge discovery in time series. However, SAX does not preserve the local slope information of the time series because it uses only the mean values of the segments. The Extended SAX (ESAX) modification was proposed to address this problem by increasing the dimensionality of the representation. In this paper, we present a symbolic representation method that preserves local slope characteristics in the symbolic representation of a time series. The proposed method was evaluated with three different discretization approaches and compared with the SAX and ESAX algorithms. The experimental evaluation, using artificial and real datasets with 1-nearest-neighbor classification, demonstrates the effectiveness of the method in reducing the error rates of time series classification and in keeping the local slope information in the symbolic representations.
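The limitation of SAX mentioned above can be illustrated with a minimal sketch of standard SAX (z-normalization, segment means, equiprobable Gaussian breakpoints). This is not the method proposed in the paper. The function names, the three-letter alphabet, and the toy series are assumptions chosen only for illustration, and the series length is assumed to be divisible by the number of segments.

```python
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def segment_means(series, n_segments):
    """Mean of each equal-length segment (assumes len(series) % n_segments == 0)."""
    size = len(series) // n_segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


def sax(series, n_segments, alphabet="abc"):
    """Map each segment mean to a symbol using equiprobable N(0, 1) breakpoints."""
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    word = []
    for m in segment_means(znormalize(series), n_segments):
        # The symbol index is the number of breakpoints lying below the segment mean.
        word.append(alphabet[sum(m > b for b in breakpoints)])
    return "".join(word)


# Illustrative toy data: same segment means, opposite local slopes.
rising = [0, 1, 2, 3, 4, 5, 6, 7]
falling = [3, 2, 1, 0, 7, 6, 5, 4]

print(sax(rising, 2), sax(falling, 2))
```

Both toy series map to the same SAX word because each pair of corresponding segments has the same mean, even though one series rises within each segment and the other falls. This is the loss of slope information that the abstract describes.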

Keywords

time series, knowledge discovery, symbolic representation, classification, dimensionality reduction

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Willian Zalewski (1, 2)
  • Fabiano Silva (1)
  • Huei Diana Lee (2)
  • Andre Gustavo Maletzke (2)
  • Feng Chung Wu (2)

  1. Formal Methods and Artificial Intelligence Laboratory – LIAMF, Federal University of Parana – UFPR, Curitiba, Brazil
  2. Bioinformatics Laboratory – LABI, State University of West Parana – UNIOESTE, Foz do Iguassu, Brazil
