Abstract
We consider the similarity problem in time series databases: given a set of sequences, we are interested in finding those sequences whose behaviors are similar. Several methods have been proposed to solve this problem. Among them, [AFS93] mapped a time sequence to the frequency domain using the Discrete Fourier Transform (DFT) and kept only the first few coefficients; the similarity of two sequences is then determined by the Euclidean distance between their coefficients. [ALS+95] (hereafter referred to as the ALS approach) introduced a new model in which two time series are said to be similar if they have enough non-overlapping, time-ordered pairs of similar subsequences. Unlike the ALS approach, [DGM97] used another method to find the longest common subsequence: given a set of transformation functions, it tries to find a linear function y=ax+b such that two subsequences X and Y are ε-similar. However, to our knowledge, no paper has so far compared these algorithms on the same benchmark. This paper discusses and compares these three popular methods. We then propose a new algorithm that overcomes the drawbacks of these algorithms. Experiments are conducted on data from the Nasdaq-100 market.
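As an illustration of the DFT-based idea attributed to [AFS93] above, the following is a minimal sketch, not the authors' implementation. The function names (dft_prefix, feature_distance, euclidean), the default k=3, and the restriction to equal-length sequences are assumptions made for this example. It uses an orthonormal DFT, so by Parseval's theorem the distance over the first k coefficients never exceeds the Euclidean distance between the full sequences.

```python
import cmath
import math


def dft_prefix(seq, k):
    """First k coefficients of the orthonormal DFT of seq (naive O(n*k) sum)."""
    n = len(seq)
    coeffs = []
    for f in range(k):
        total = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(seq))
        coeffs.append(total / math.sqrt(n))  # 1/sqrt(n) makes the transform orthonormal
    return coeffs


def feature_distance(a, b, k=3):
    """Euclidean distance between the first k DFT coefficients of a and b.

    Hypothetical helper: assumes len(a) == len(b) and k <= len(a).
    """
    fa, fb = dft_prefix(a, k), dft_prefix(b, k)
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(fa, fb)))


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences in the time domain."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Because the truncated distance is a lower bound on the time-domain distance, filtering candidates with feature_distance and a threshold cannot discard a truly similar pair; it can only admit false positives, which a final check with euclidean would remove.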
References
Rakesh Agrawal, Christos Faloutsos and Arun N. Swami. Efficient Similarity Search in Sequence Databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms (FODO’93), Chicago, Illinois, USA, October 13–15, 1993, pp: 69–84.
Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney and Kyuseok Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proceedings of the 21st International Conference on Very Large Data Bases, September 11–15, 1995, Zurich, Switzerland, pp: 490–501.
Béla Bollobás, Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Time-Series Similarity Problems and Well-Separated Geometric Sets. In Proceedings of the Thirteenth Annual Symposium on Computational Geometry, June 4–6, 1997, Nice, France, pp: 454–456.
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider and Bernhard Seeger. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, May 23–25, 1990, pp: 322–331.
Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Finding Similar Time Series. Principles of Data Mining and Knowledge Discovery, First European Symposium, PKDD ’97, Trondheim, Norway, June 24–27, 1997, pp: 88–100.
Gautam Das, King-Ip Lin, Heikki Mannila, Gopal Renganathan and Padhraic Smyth. Rule Discovery from Time Series. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD98), August 27–31, 1998, New York City, New York, USA, pp: 16–22.
R. D. Edwards and J. Magee. Technical Analysis of Stock Trends. John Magee, Springfield, Massachusetts, 1969.
Christos Faloutsos, H. V. Jagadish, Alberto O. Mendelzon and Tova Milo. A Signature Technique for Similarity-Based Queries. SEQUENCES 97, Positano-Salerno, Italy, June 11–13, 1997.
Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, Minneapolis, Minnesota, May 24–27, 1994, pp: 419–429.
Dina Q. Goldin and Paris C. Kanellakis. On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. International Conference, CP’95, Cassis, France, September 19–22, 1995, pp: 137–153.
Jiawei Han, Guozhu Dong and Yiwen Yin. Efficient Mining of Partial Periodic Patterns in Time Series Database. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 106–115.
Flip Korn, H. V. Jagadish and Christos Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In Proceedings ACM SIGMOD International Conference on Management of Data, May 13–15, 1997, Tucson, Arizona, USA, pp: 289–300.
Heikki Mannila. Methods and Problems in Data Mining. Database Theory - ICDT ’97, 6th International Conference, Delphi, Greece, January 8–10, 1997, pp: 41–55.
T. Sellis, N. Roussopoulos and C. Faloutsos. The R+-tree: A Dynamic Index for Multidimensional Objects. In Proceedings of the 13th International Conference on VLDB, England, 1987, pp: 507–518.
Davood Rafiei. On Similarity-Based Queries for Time Series Data. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 410–417.
Byoung-Kee Yi, H. V. Jagadish and Christos Faloutsos. Efficient Retrieval of Similar Time-Sequences under Time Warping. In Proceedings of the Fourteenth International Conference on Data Engineering, February 23–27, 1998, Orlando, Florida, USA, pp: 201–208.
Copyright information
© 2002 Springer Japan
About this paper
Cite this paper
Wu, F., Plihon, V., Gardarin, G. (2002). Similarity-Based Queries for Time Series Databases. In: Jin, Q., Li, J., Zhang, N., Cheng, J., Yu, C., Noguchi, S. (eds) Enabling Society with Information Technology. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66979-1_2
DOI: https://doi.org/10.1007/978-4-431-66979-1_2
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-66981-4
Online ISBN: 978-4-431-66979-1
eBook Packages: Springer Book Archive