Mining Time Series Data: A Selective Survey

  • Marcella Corduas
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Time series prediction and control may involve the study of massive data archives and therefore require data mining techniques. To make the comparison of time series meaningful, one must first decide what similarity means and which features have to be extracted from a time series. This question leads to a fundamental dichotomy: (a) similarity can be based solely on time series shape; (b) similarity can be measured by looking at time series structure. This article discusses the main dissimilarity indices proposed in the literature for time series data mining.
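The dichotomy between shape and structure can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the z-normalised Euclidean distance used for "shape", and the Euclidean distance between least-squares AR coefficients used for "structure" are all simplifying assumptions chosen for illustration, and neither is claimed to be one of the indices the survey discusses.

```python
import numpy as np

def shape_distance(x, y):
    """Shape-based: Euclidean distance between z-normalised series (illustrative choice)."""
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return float(np.sqrt(np.sum((zx - zy) ** 2)))

def fit_ar(x, order):
    """Least-squares estimate of AR(order) coefficients for a mean-centred series."""
    x = x - x.mean()
    # Column k holds the series lagged by k, aligned with the targets x[order:].
    lags = np.column_stack(
        [x[order - k : len(x) - k] for k in range(1, order + 1)]
    )
    coef, *_ = np.linalg.lstsq(lags, x[order:], rcond=None)
    return coef

def structure_distance(x, y, order=2):
    """Structure-based: distance between fitted AR coefficients (simplified stand-in)."""
    return float(np.linalg.norm(fit_ar(x, order) - fit_ar(y, order)))

def simulate_ar1(phi, n, rng):
    """Simulate x[t] = phi * x[t-1] + e[t] with standard normal noise (hypothetical helper)."""
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

rng = np.random.default_rng(0)
a = simulate_ar1(0.8, 2000, rng)   # same generating process as b
b = simulate_ar1(0.8, 2000, rng)   # independent realisation
c = simulate_ar1(-0.5, 2000, rng)  # different generating process

print("shape     a-b:", shape_distance(a, b))
print("structure a-b:", structure_distance(a, b))
print("structure a-c:", structure_distance(a, c))
```

Under these assumptions, a and b are independent realisations of the same process, so they differ point by point (large shape distance) while their fitted coefficients nearly coincide (small structure distance); c is separated from a mainly by the structure-based measure.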

Keywords

Dynamic Time Warping; Compare Time Series; Time Series Cluster; Time Series Feature; ARIMA Process
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This research was supported by Dipartimento di Scienze Statistiche, University of Naples Federico II, and CFEPSR (Portici).

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. Dipartimento di Scienze Statistiche, Università di Napoli Federico II, Napoli, Italy
