Abstract
Indexing large time series databases is crucial for answering time series queries efficiently. In this paper, we propose a novel indexing scheme, RQI (Range Query based on Index), which combines three filtering methods: first-k filtering, indexed lower and upper bounding, and triangle inequality pruning. The basic idea is to compute the Haar wavelet transform of each time series and use its first k coefficients to form an MBR (minimal bounding rectangle), which is then used for point filtering. At the same time, the lower bounding and upper bounding features of each time series are calculated in advance and stored in the index structure. Finally, triangle inequality pruning is applied using distances between time series that are calculated beforehand. We then introduce a novel lower bounding distance function, SLBS (Symmetrical Lower Bounding based on Segment), and a novel clustering algorithm, CSA (Clustering based on Segment Approximation), which further improve the search efficiency of point filtering by preserving good clustering properties in the index structure. Extensive experiments over both synthetic and real datasets show that our techniques provide strong pruning power and can achieve an order-of-magnitude performance improvement for time series queries over traditional naive evaluation techniques.
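The first-k filtering idea can be illustrated with a minimal Python sketch. It is not the RQI implementation: it assumes an orthonormal Haar transform, series whose length is a power of two, Euclidean distance, and a linear scan in place of the MBR-based index. All function names (`haar_transform`, `build_features`, `range_query`) are hypothetical. Because the transform is orthonormal, the distance over the first k coefficients never exceeds the true distance, so a candidate whose truncated distance already exceeds the query radius can be discarded without a false dismissal.

```python
import math


def haar_transform(series):
    """Orthonormal Haar transform; the length must be a power of two.

    Coefficients are ordered coarse to fine, so the first k entries
    carry the overall level and the coarsest details.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(x) for x in series]
    scratch = coeffs[:]
    length = n
    while length > 1:
        half = length // 2
        for i in range(half):
            a, b = coeffs[2 * i], coeffs[2 * i + 1]
            scratch[i] = (a + b) / math.sqrt(2)         # scaled average
            scratch[half + i] = (a - b) / math.sqrt(2)  # scaled difference
        coeffs[:length] = scratch[:length]
        length = half
    return coeffs


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_features(database, k):
    """Precompute the first k Haar coefficients of every series."""
    return [haar_transform(s)[:k] for s in database]


def range_query(database, features, query, epsilon, k):
    """Return indices of series within epsilon of the query.

    Step 1 (filter): prune with the first-k lower bound.
    Step 2 (refine): verify survivors with the true distance.
    """
    q_feat = haar_transform(query)[:k]
    hits = []
    for idx, series in enumerate(database):
        if euclidean(q_feat, features[idx]) > epsilon:
            continue  # lower bound already exceeds the radius
        if euclidean(query, series) <= epsilon:
            hits.append(idx)
    return hits
```

In a real index the precomputed features would be grouped into bounding rectangles so that whole groups can be pruned at once; the linear scan above only shows why the filter step is safe.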
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feng, Y., Jiang, T., Zhou, Y., Li, J. (2008). An Efficient Similarity Searching Algorithm Based on Clustering for Time Series. In: Perner, P. (ed.) Advances in Data Mining. Medical Applications, E-Commerce, Marketing, and Theoretical Aspects. ICDM 2008. Lecture Notes in Computer Science, vol 5077. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70720-2_28
DOI: https://doi.org/10.1007/978-3-540-70720-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70717-2
Online ISBN: 978-3-540-70720-2
eBook Packages: Computer Science, Computer Science (R0)