Journal of Classification, Volume 35, Issue 1, pp 71–99

The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

  • Carolina Euán
  • Hernando Ombao
  • Joaquín Ortega

Abstract

We present a new method for time series clustering, which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The similarity between a pair of time series is measured by the total variation distance between their estimated spectral densities. Each time two clusters merge, a new spectral density is estimated from all the information in both clusters, so that it is representative of every series in the new cluster. The method is implemented in the R package HSMClust. We present two applications of the HSM method, one to wave-height measurements from oceanography and the other to electroencephalogram (EEG) data.
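The merging idea described in the abstract can be illustrated with a minimal Python sketch. This is not the HSMClust implementation and does not reproduce its interface. It assumes equal-length series, a moving-average smoothed periodogram as the spectral estimate, and pooling by averaging the members' normalized periodograms when two clusters merge. The function names (`normalized_periodogram`, `smooth`, `tv_distance`, `hsm_sketch`) and the `width` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalized_periodogram(x):
    """Raw periodogram on the positive Fourier frequencies, scaled to sum to 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    power = power[1:]  # drop the zero frequency
    return power / power.sum()

def smooth(p, width=5):
    """Moving-average smoother (an assumed estimator), renormalized to sum to 1."""
    kernel = np.ones(width) / width
    s = np.convolve(p, kernel, mode="same")
    return s / s.sum()

def tv_distance(p, q):
    """Total variation distance between two discrete probability vectors."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

def hsm_sketch(series, n_clusters=1, width=5):
    """Agglomerative merging driven by the TV distance between cluster spectra.

    Returns the final clusters (lists of series indices) and the merge
    history as (members_a, members_b, distance) tuples.
    """
    raw = [normalized_periodogram(x) for x in series]
    spectra = [smooth(r, width) for r in raw]
    clusters = [[i] for i in range(len(series))]
    history = []
    while len(clusters) > n_clusters:
        # find the pair of clusters whose representative spectra are closest
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = tv_distance(spectra[a], spectra[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((list(clusters[a]), list(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        # re-estimate the spectrum from every series in the merged cluster
        # (assumption: average the members' normalized periodograms)
        pooled = np.mean([raw[i] for i in merged], axis=0)
        del clusters[b], spectra[b]  # b > a, so index a is unaffected
        clusters[a] = merged
        spectra[a] = smooth(pooled, width)
    return clusters, history
```

Calling `hsm_sketch(series, n_clusters=2)` on a list of equal-length arrays returns the two remaining groups of indices together with the distance recorded at each merge. Those recorded distances could be inspected to judge how many clusters the data support.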

Keywords

Hierarchical spectral merger clustering; Time series clustering; Hierarchical clustering; Total variation distance; Time series; Spectral analysis

Copyright information

© Classification Society of North America 2018

Authors and Affiliations

  • Carolina Euán (1, 2, 3)
  • Hernando Ombao (2, 3, 4)
  • Joaquín Ortega (1)
  1. Centro de Investigación en Matemáticas, Guanajuato, México
  2. King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  3. UC Irvine Space-Time Modeling Group, University of California, Irvine, USA
  4. Department of Statistics, University of California, Irvine, Irvine, USA
