In this study, we develop a clustering method for multivariate time series data. In practice, such problems arise in finance, economics, control theory, and health science. First, we propose a simulation-based approximation to the test statistic and develop a method to test whether two multivariate time series come from the same VAR process. Then, the testing method is extended to a group of multivariate time series objects. Next, a new clustering algorithm is developed using the testing method. The proposed algorithm does not require a predetermined number of clusters and finds the best possible clustering from the data. Empirical studies are provided to establish the accuracy of the algorithm. Finally, as a practical example, the algorithm is applied to identify the activities of different persons from a real-life data set obtained from single chest-mounted accelerometers worn by different individuals.
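The abstract does not give the test statistic or the simulation scheme, so the following is only a minimal sketch of the general idea of a simulation-based test of whether two series share a VAR process, not the paper's method. Everything in it is an assumption made for illustration: a VAR(1) model without intercept, a squared Frobenius distance between the two coefficient estimates as the statistic, Gaussian noise, and a parametric bootstrap from a pooled fit. The function names `fit_var1`, `simulate_var1`, and `same_var_pvalue` are hypothetical.

```python
import numpy as np


def fit_var1(past, future):
    """Least-squares fit of x_t = A x_{t-1} + e_t (assumed: no intercept).

    Returns the coefficient matrix A and the residual covariance."""
    B, *_ = np.linalg.lstsq(past, future, rcond=None)  # past @ B ~ future
    resid = future - past @ B
    sigma = resid.T @ resid / len(resid)
    return B.T, sigma


def simulate_var1(A, chol, n, rng, burn=50):
    """Simulate n observations of a VAR(1) with Gaussian noise chol @ z."""
    d = A.shape[0]
    x = np.zeros(d)
    out = np.empty((n, d))
    for t in range(n + burn):
        x = A @ x + chol @ rng.standard_normal(d)
        if t >= burn:  # discard the burn-in period
            out[t - burn] = x
    return out


def coef_distance(X, Y):
    """Illustrative statistic: squared Frobenius distance between fits."""
    A_x, _ = fit_var1(X[:-1], X[1:])
    A_y, _ = fit_var1(Y[:-1], Y[1:])
    return np.linalg.norm(A_x - A_y) ** 2


def same_var_pvalue(X, Y, n_sim=200, seed=0):
    """Simulated p-value for H0: X and Y come from the same VAR(1)."""
    rng = np.random.default_rng(seed)
    observed = coef_distance(X, Y)
    # Pooled fit under H0: stack the lagged regressions of both series.
    past = np.vstack([X[:-1], Y[:-1]])
    future = np.vstack([X[1:], Y[1:]])
    A0, sigma0 = fit_var1(past, future)
    chol = np.linalg.cholesky(sigma0)
    # Approximate the null distribution by simulating from the pooled model.
    null = np.empty(n_sim)
    for i in range(n_sim):
        Xs = simulate_var1(A0, chol, len(X), rng)
        Ys = simulate_var1(A0, chol, len(Y), rng)
        null[i] = coef_distance(Xs, Ys)
    return (1 + np.sum(null >= observed)) / (1 + n_sim)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    eye = np.eye(2)
    X = simulate_var1(0.8 * eye, eye, 300, rng)
    Y = simulate_var1(-0.8 * eye, eye, 300, rng)
    print("p-value, different processes:", same_var_pvalue(X, Y))
```

A small p-value suggests the two series do not share a VAR(1) process. A pairwise test of this kind could then serve as the merge criterion in a clustering procedure, which is the role the abstract describes for the testing method.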
Additional information
Proceedings of the XXXIV International Seminar on Stability Problems for Stochastic Models, Debrecen, Hungary.
Cite this article
Deb, S. VAR Model Based Clustering Method for Multivariate Time Series Data. J Math Sci 237, 754–765 (2019). https://doi.org/10.1007/s10958-019-04201-4
DOI: https://doi.org/10.1007/s10958-019-04201-4