A BIRCH-Based Clustering Method for Large Time Series Databases

  • Vo Le Quy Nhon
  • Duong Tuan Anh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7104)

Abstract

This paper presents a novel approach to time series clustering based on the BIRCH algorithm. Our approach clusters time series data using a multi-resolution transform as the feature extraction technique. It hinges on the cluster feature (CF) tree, which helps resolve the dilemma of choosing initial centers and significantly improves execution time and clustering quality. The approach not only takes full advantage of BIRCH's capacity for handling large databases but can also be viewed as a flexible clustering framework in which any selected clustering algorithm can be applied in Phase 3. Experimental results show that our approach outperforms k-Means in both clustering quality and running time, and outperforms I-k-Means in clustering quality with nearly the same running time.
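The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the standard BIRCH definition of a cluster feature as the triple (N, LS, SS) and uses pairwise Haar averaging as a stand-in for the multi-resolution transform; the abstract does not state which wavelet the authors use, and the names `haar_approximation` and `ClusterFeature` are hypothetical, not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List


def haar_approximation(series: List[float], levels: int) -> List[float]:
    """Reduce a series by averaging adjacent pairs `levels` times.

    This yields unnormalized Haar approximation coefficients (an assumption;
    the paper's exact transform is not given in the abstract).
    """
    out = list(series)
    for _ in range(levels):
        if len(out) < 2 or len(out) % 2 != 0:
            raise ValueError("series length must be divisible by 2**levels")
        out = [(out[i] + out[i + 1]) / 2.0 for i in range(0, len(out), 2)]
    return out


@dataclass
class ClusterFeature:
    """BIRCH cluster feature: point count, linear sum, and sum of squared norms."""
    n: int
    ls: List[float]
    ss: float

    @classmethod
    def from_point(cls, point: List[float]) -> "ClusterFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusterFeature") -> "ClusterFeature":
        # CFs are additive, so two subclusters merge without revisiting raw data.
        return ClusterFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )

    def centroid(self) -> List[float]:
        return [x / self.n for x in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance from the member points to the centroid.
        c = self.centroid()
        variance = self.ss / self.n - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))


# Usage: extract low-resolution features, then summarize them in one CF.
features_a = haar_approximation([1, 3, 5, 7, 2, 4, 6, 8], levels=2)
features_b = haar_approximation([2, 2, 6, 6, 4, 4, 8, 8], levels=2)
summary = ClusterFeature.from_point(features_a).merge(
    ClusterFeature.from_point(features_b)
)
```

Because a CF is additive, a tree of such summaries can absorb new series in a single pass, which is what makes the structure suitable for large databases.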

Keywords

time series clustering, cluster feature, DWT, BIRCH-based

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Vo Le Quy Nhon (1)
  • Duong Tuan Anh (1)
  1. Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, Vietnam
