Similarity-Based Queries for Time Series Databases

Wu, Fei; Plihon, Véronique; Gardarin, Georges

doi:10.1007/978-4-431-66979-1_2

Abstract

We consider the similarity problem in time series databases: given a set of sequences, find those whose behaviors are similar. Several methods have been proposed to solve this problem. Among them, [AFS93] maps a time sequence to the frequency domain using the Discrete Fourier Transform (DFT) and keeps only the first few coefficients; the similarity of two sequences is then determined by the Euclidean distance between their coefficients. [ALS+95] (hereafter referred to as the ALS approach) introduced a new model in which two time series are said to be similar if they have enough non-overlapping, time-ordered pairs of similar subsequences. Unlike the ALS approach, [DGM97] uses another method to find the longest common subsequence: given a set of transformation functions, it tries to find a linear function y=ax+b such that two subsequences X and Y are e-similar. To our knowledge, however, no paper has so far compared these algorithms on the same benchmark. This paper discusses and compares these three popular methods and proposes a new algorithm that overcomes their drawbacks. Experiments are conducted on data from the Nasdaq-100 market.
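
As a rough illustration of two ideas mentioned in the abstract, the sketch below computes a distance on the first few DFT coefficients and checks whether a linear function y=ax+b maps one subsequence close to another. It is not the implementation from any of the cited papers. The function names, the orthonormal 1/sqrt(n) scaling, the equal-length inputs, and the reading of "e-similar" as a pointwise tolerance `eps` are all assumptions made for this example.

```python
import cmath
import math


def dft_prefix(seq, k):
    """First k coefficients of the orthonormal DFT of seq (naive O(n*k) sum)."""
    n = len(seq)
    # Assumed 1/sqrt(n) scaling: the full transform then preserves
    # Euclidean distance (Parseval's relation).
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(seq))
        for f in range(k)
    ]


def feature_distance(a, b, k):
    """Euclidean distance between the first k DFT coefficients of a and b."""
    fa, fb = dft_prefix(a, k), dft_prefix(b, k)
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(fa, fb)))


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linearly_similar(x, y, a, b, eps):
    """Hypothetical check: every y[i] lies within eps of a * x[i] + b."""
    return all(abs(a * xi + b - yi) <= eps for xi, yi in zip(x, y))


if __name__ == "__main__":
    s1 = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
    s2 = [1.5, 2.5, 3.0, 4.5, 3.0, 2.5, 1.0, 2.0]
    print("distance on 3 coefficients:", feature_distance(s1, s2, 3))
    print("distance on full sequences:", euclidean(s1, s2))
    print("linear fit within eps:",
          linearly_similar([1.0, 2.0, 3.0], [3.1, 4.9, 7.0], 2.0, 1.0, 0.2))
```

With this scaling, the distance on a truncated set of coefficients never exceeds the distance on the full sequences, so a threshold applied to the truncated distance may admit false candidates but does not discard true matches. Finding suitable values of a and b, and handling subsequences of different lengths, is left out of this sketch.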

References

Rakesh Agrawal, Christos Faloutsos and Arun N. Swami. Efficient Similarity Search in Sequence Databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms (FODO’93), Chicago, Illinois, USA, October 13–15, 1993, pp: 69–84.
Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney and Kyuseok Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proceedings of the 21st International Conference on Very Large Data Bases, September 11–15, 1995, Zurich, Switzerland, pp: 490–501.
Béla Bollobás, Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Time-Series Similarity Problems and Well-Separated Geometric Sets. In Proceedings of the Thirteenth Annual Symposium on Computational Geometry, June 4–6, 1997, Nice, France, pp: 454–456.
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider and Bernhard Seeger. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, May 23–25, 1990, pp: 322–331.
Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Finding Similar Time Series. Principles of Data Mining and Knowledge Discovery, First European Symposium, PKDD’97, Trondheim, Norway, June 24–27, 1997, pp: 88–100.
Gautam Das, King-Ip Lin, Heikki Mannila, Gopal Renganathan and Padhraic Smyth. Rule Discovery from Time Series. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD98), August 27–31, 1998, New York City, New York, USA, pp: 16–22.
R. D. Edwards and J. Magee. Technical Analysis of Stock Trends. John Magee, Springfield, Massachusetts, 1969.
Christos Faloutsos, H. V. Jagadish, Alberto O. Mendelzon and Tova Milo. A Signature Technique for Similarity-Based Queries. SEQUENCES 97, Positano-Salerno, Italy, June 11–13, 1997.
Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, Minneapolis, Minnesota, May 24–27, 1994, pp: 419–429.
Dina Q. Goldin and Paris C. Kanellakis. On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. International Conference, CP’95, Cassis, France, September 19–22, 1995, pp: 137–153.
Jiawei Han, Guozhu Dong and Yiwen Yin. Efficient Mining of Partial Periodic Patterns in Time Series Database. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 106–115.
Flip Korn, H. V. Jagadish and Christos Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In Proceedings ACM SIGMOD International Conference on Management of Data, May 13–15, 1997, Tucson, Arizona, USA, pp: 289–300.
Heikki Mannila. Methods and Problems in Data Mining. Database Theory - ICDT’97, 6th International Conference, Delphi, Greece, January 8–10, 1997, pp: 41–55.
T. Sellis, N. Roussopoulos and C. Faloutsos. The R+-tree: A Dynamic Index for Multidimensional Objects. In Proceedings of the 13th International Conference on VLDB, England, 1987, pp: 507–518.
Davood Rafiei. On Similarity-Based Queries for Time Series Data. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 410–417.
Byoung-Kee Yi, H. V. Jagadish and Christos Faloutsos. Efficient Retrieval of Similar Time-Sequences under Time Warping. In Proceedings of the Fourteenth International Conference on Data Engineering, February 23–27, 1998, Orlando, Florida, USA, pp: 201–208.

Author information

Authors and Affiliations

PRiSM Laboratory, University of Versailles, 45 Avenue des Etats-Unis, 78035, Versailles Cedex, France
Fei Wu, Véronique Plihon & Georges Gardarin

Editor information

Editors and Affiliations

The University of Aizu, 965-8580, Aizu-Wakamatsu, Fukushima, Japan
Qun Jin
University of Tsukuba, 305-8573, Tsukuba, Ibaraki, Japan
Jie Li
Hiroshima Shudo University, 731-3195, Hiroshima, Japan
Nan Zhang
Saitama University, 338-8570, Saitama, Japan
Jingde Cheng
University of Illinois at Chicago, 60612, Chicago, IL, USA
Clement Yu
Sendai Foundation for Applied Information Sciences, 983-0852, Sendai, Miyagi, Japan
Shoichi Noguchi

About this paper

Cite this paper

Wu, F., Plihon, V., Gardarin, G. (2002). Similarity-Based Queries for Time Series Databases. In: Jin, Q., Li, J., Zhang, N., Cheng, J., Yu, C., Noguchi, S. (eds) Enabling Society with Information Technology. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66979-1_2

DOI: https://doi.org/10.1007/978-4-431-66979-1_2
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-66981-4
Online ISBN: 978-4-431-66979-1
eBook Packages: Springer Book Archive
