Using Lowly Correlated Time Series to Recover Missing Values in Time Series: A Comparison Between SVD and CD
The Singular Value Decomposition (SVD) is a matrix decomposition technique that has been successfully applied to the recovery of blocks of missing values in time series. To recover a block accurately, SVD requires highly correlated time series. However, lowly correlated time series that exhibit shape and/or trend similarities could also increase the recovery accuracy, and could therefore be exploited by including them in the recovery process.
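As a rough illustration of SVD-based block recovery, the sketch below fills a missing block by repeatedly replacing the missing entries with their values in a truncated SVD of the matrix. It is a minimal sketch under stated assumptions, not necessarily the procedure evaluated in the paper: the function name `svd_recover`, the column-mean initialisation, the rank of 1, the stopping tolerance, and the synthetic sine series are all chosen here for illustration.

```python
import numpy as np

def svd_recover(X, missing, rank=1, tol=1e-8, max_iter=200):
    """Fill the entries flagged in `missing` using an iterated truncated SVD.

    X       : (n, m) array, one time series per column (NaN allowed where missing)
    missing : (n, m) boolean array, True where a value is missing
    """
    X = np.array(X, dtype=float)
    missing = np.asarray(missing, dtype=bool)

    # Assumption: start from the mean of the observed values of each column.
    for j in range(X.shape[1]):
        col_missing = missing[:, j]
        if col_missing.any():
            X[col_missing, j] = X[~col_missing, j].mean()

    for _ in range(max_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Keep only the `rank` largest singular values.
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        change = np.linalg.norm(approx[missing] - X[missing])
        # Only the missing entries are overwritten; observed values stay fixed.
        X[missing] = approx[missing]
        if change < tol:
            break
    return X

# Synthetic example: two perfectly correlated series, with a block of
# 20 values removed from the first one.
t = np.linspace(0.0, 4.0 * np.pi, 100)
truth = np.column_stack([np.sin(t), 2.0 * np.sin(t)])
missing = np.zeros_like(truth, dtype=bool)
missing[40:60, 0] = True
observed = truth.copy()
observed[missing] = np.nan

recovered = svd_recover(observed, missing, rank=1)
print("max abs error:", np.max(np.abs(recovered - truth)))
```

Because the two synthetic series are exact multiples of each other, a rank-1 approximation is enough here. With lowly correlated reference series, the choice of rank and the achievable accuracy would differ.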
In this paper, we compare the accuracy of the Centroid Decomposition (CD) against that of SVD for the recovery of blocks of missing values using both highly and lowly correlated time series. We show that the CD technique better exploits trend and shape similarities with lowly correlated time series and yields a higher recovery accuracy. We run experiments on real-world hydrological time series and on synthetic time series to validate our results.
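To make the CD side of the comparison more concrete, the following sketch decomposes a matrix X into a loading matrix L and a relevance matrix R with X = L Rᵀ, where each column of R is a normalised centroid vector Xᵀz for a sign vector z. It is a hedged illustration only: the names `sign_vector` and `centroid_decomposition` are hypothetical, and the greedy single-flip search used to pick z is a simple stand-in, not the sign-vector algorithm used by the authors.

```python
import numpy as np

def sign_vector(X):
    """Greedy local search for z in {-1, +1}^n that makes ||X^T z|| large.

    Assumption: a simple heuristic that flips one sign at a time while the
    norm increases; it may stop at a local optimum.
    """
    n = X.shape[0]
    z = np.ones(n)
    c = X.T @ z
    improved = True
    while improved:
        improved = False
        for i in range(n):
            # Flipping z[i] changes the centroid vector c to c - 2 * z[i] * X[i].
            c_new = c - 2.0 * z[i] * X[i]
            if c_new @ c_new > c @ c + 1e-12:
                z[i] = -z[i]
                c = c_new
                improved = True
    return z

def centroid_decomposition(X, rank=None):
    """Return (L, R) such that X is approximated by L @ R.T."""
    residual = np.array(X, dtype=float)
    n, m = residual.shape
    k = m if rank is None else rank
    L = np.zeros((n, k))
    R = np.zeros((m, k))
    for j in range(k):
        z = sign_vector(residual)
        c = residual.T @ z
        norm = np.linalg.norm(c)
        if norm < 1e-12:
            break  # residual is numerically zero, nothing left to decompose
        R[:, j] = c / norm              # relevance vector (unit length)
        L[:, j] = residual @ R[:, j]    # loading vector
        residual -= np.outer(L[:, j], R[:, j])
    return L, R

# Small random matrix: 8 time points, 3 series.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(8, 3))
L_demo, R_demo = centroid_decomposition(X_demo)
print("reconstruction error:", np.linalg.norm(L_demo @ R_demo.T - X_demo))
```

Passing a smaller `rank` keeps only the first columns of L and R. The resulting truncated product L Rᵀ could then take the place of the truncated SVD in an iterative recovery loop.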
Keywords: Time Series, Mean Square Error, Singular Value Decomposition, Input Matrix, Stochastic Gradient Descent