Abstract
This paper presents a new method that uses orthogonalized features for time series clustering and classification. Time series can be clustered or classified either from the original data or from features extracted from the data, which are then passed to a clustering or classification algorithm. Our method uses feature extraction to represent each time series as a fixed-dimensional vector whose components are statistical metrics. Each metric is a specific feature based on the global structure of the given time series. However, if the feature metrics are correlated, clustering may take place in a distorted space. To address this, we propose orthogonalizing the space of metrics using linear correlation information, which reduces the impact that correlations between clustering inputs have on the result. We demonstrate orthogonal feature learning with two popular clustering algorithms, k-means and hierarchical clustering, on two benchmark data sets. The empirical results show that the proposed orthogonal feature learning method gives better clustering accuracy than all other approaches compared: exhaustive feature search, clustering without feature optimization or selection, and clustering without feature extraction. We expect our method to enhance the feature extraction process, which also serves as an improved approach to dimension reduction for time series clustering and classification.
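The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which metrics are extracted or how the orthogonalization is computed, so everything below is an assumption made for illustration: the five global statistics in `extract_features`, the use of PCA whitening on the feature correlation matrix in `orthogonalize` (a standard decorrelation technique, not necessarily the authors' procedure), the plain Lloyd's k-means in `kmeans`, and the synthetic data. All function names are hypothetical.

```python
import numpy as np

def extract_features(series):
    """Map one series to a fixed-length vector of global statistics.
    The choice of statistics is illustrative, not taken from the paper."""
    x = np.asarray(series, dtype=float)
    d = x - x.mean()
    std = x.std()
    skew = np.mean(d**3) / std**3
    kurt = np.mean(d**4) / std**4
    acf1 = np.sum(d[1:] * d[:-1]) / np.sum(d**2)  # lag-1 autocorrelation
    return np.array([x.mean(), std, skew, kurt, acf1])

def orthogonalize(features, eps=1e-12):
    """Decorrelate features using their linear correlation matrix.
    Assumption: PCA whitening (rotate onto eigenvectors, then rescale).
    A rotation alone would leave Euclidean distances unchanged."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    eigvals = np.maximum(eigvals, eps)  # guard against near-zero eigenvalues
    return (z @ eigvecs) / np.sqrt(eigvals)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Synthetic example: 20 white-noise series and 20 random walks.
rng = np.random.default_rng(0)
noise = [rng.normal(size=200) for _ in range(20)]
walks = [np.cumsum(rng.normal(size=200)) for _ in range(20)]

features = np.array([extract_features(s) for s in noise + walks])
whitened = orthogonalize(features)
labels = kmeans(whitened, k=2)
```

After whitening, the feature columns are mutually uncorrelated with unit variance, so Euclidean distance in the new space no longer counts a shared source of variation more than once. Whether this improves accuracy on a given data set is an empirical question; the sketch makes no claim about reproducing the reported results.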
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, X., Lopes, L. (2011). Orthogonal Feature Learning for Time Series Clustering. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_23
DOI: https://doi.org/10.1007/978-3-642-21090-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21089-1
Online ISBN: 978-3-642-21090-7
eBook Packages: Computer Science (R0)