Similarity-Based Queries for Time Series Databases

  • Conference paper
Enabling Society with Information Technology

Abstract

We consider the similarity problem in time series databases: given a set of sequences, we are interested in finding those sequences whose behaviors are similar. Several methods have been proposed to solve this problem. Among them, [AFS93] mapped a time sequence to the frequency domain using the Discrete Fourier Transform (DFT) and kept only the first few coefficients; the similarity of two sequences is then determined by the Euclidean distance between their coefficients. [ALS+95] (hereafter referred to as the ALS approach) introduced a new model in which two time series are said to be similar if they have enough non-overlapping, time-ordered pairs of similar subsequences. In contrast to the ALS approach, [DGM97] used another method to find the longest common subsequence: given a set of transformation functions, the idea is to find a linear function y=ax+b such that two subsequences X and Y are e-similar. However, to our knowledge, no paper has yet compared these algorithms on the same benchmark. This paper discusses and compares these three popular methods. A new algorithm is then proposed that overcomes the drawbacks of these algorithms. Experiments are conducted on the Nasdaq-100 market.
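As an illustration of the DFT-based idea attributed to [AFS93] above, the following is a minimal sketch, not the implementation evaluated in the paper. The function names `dft_prefix` and `feature_distance`, the parameter `k`, and the sample sequences are all assumptions made for this example. It assumes equal-length sequences and an orthonormal DFT scaling, under which Parseval's theorem guarantees that the distance over the first k coefficients never exceeds the Euclidean distance between the raw sequences.

```python
import cmath
import math


def dft_prefix(seq, k):
    """First k DFT coefficients of seq, with orthonormal 1/sqrt(n) scaling.

    Naive O(n*k) evaluation; an FFT would be used in practice.
    """
    n = len(seq)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(seq))
        for f in range(k)
    ]


def feature_distance(a, b, k=3):
    """Euclidean distance between the first k DFT coefficients of a and b."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    fa, fb = dft_prefix(a, k), dft_prefix(b, k)
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(fa, fb)))


def euclidean(a, b):
    """Euclidean distance between the raw sequences, for comparison."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical sample sequences, not taken from the paper's experiments.
    a = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
    b = [1.5, 2.5, 3.0, 3.5, 3.0, 2.0, 1.5, 2.0]
    print("distance on first 3 coefficients:", feature_distance(a, b, k=3))
    print("distance on raw sequences:       ", euclidean(a, b))
```

Because the truncated distance is a lower bound on the true distance, filtering candidates with it can return false positives but no false dismissals; candidates that pass the filter would still be checked against the full sequences.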

References

  1. Rakesh Agrawal, Christos Faloutsos and Arun N. Swami. Efficient Similarity Search in Sequence Databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms (FODO’93), Chicago, Illinois, USA, October 13–15, 1993, pp: 69–84.

  2. Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney and Kyuseok Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proceedings of the 21st International Conference on Very Large Data Bases, September 11–15, 1995, Zurich, Switzerland, pp: 490–501.

  3. Béla Bollobás, Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Time-Series Similarity Problems and Well-Separated Geometric Sets. In Proceedings of the Thirteenth Annual Symposium on Computational Geometry, June 4–6, 1997, Nice, France, pp: 454–456.

  4. Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider and Bernhard Seeger. The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, Atlantic City, NJ, May 23–25, 1990, pp: 322–331.

  5. Gautam Das, Dimitrios Gunopulos and Heikki Mannila. Finding Similar Time Series. Principles of Data Mining and Knowledge Discovery, First European Symposium, PKDD ’97, Trondheim, Norway, June 24–27, 1997, pp: 88–100.

  6. Gautam Das, King-Ip Lin, Heikki Mannila, Gopal Renganathan and Padhraic Smyth. Rule Discovery from Time Series. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD98), August 27–31, 1998, New York City, New York, USA, pp: 16–22.

  7. R. D. Edwards and J. Magee. Technical Analysis of Stock Trends. John Magee, Springfield, Massachusetts, 1969.

  8. Christos Faloutsos, H. V. Jagadish, Alberto O. Mendelzon and Tova Milo. A Signature Technique for Similarity-Based Queries. SEQUENCES 97, Positano-Salerno, Italy, June 11–13, 1997.

  9. Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, Minneapolis, Minnesota, May 24–27, 1994, pp: 419–429.

  10. Dina Q. Goldin and Paris C. Kanellakis. On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. International Conference, CP’95, Cassis, France, September 19–22, 1995, pp 137–153.

  11. Jiawei Han, Guozhu Dong and Yiwen Yin. Efficient Mining of Partial Periodic Patterns in Time Series Database. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 106–115.

  12. Flip Korn, H. V. Jagadish and Christos Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In Proceedings of the ACM SIGMOD International Conference on Management of Data, May 13–15, 1997, Tucson, Arizona, USA, pp: 289–300.

  13. Heikki Mannila. Methods and Problems in Data Mining. Database Theory - ICDT ’97, 6th International Conference, Delphi, Greece, January 8–10, 1997, pp: 41–55.

  14. T. Sellis, N. Roussopoulos and C. Faloutsos. The R+-tree: A Dynamic Index for Multidimensional Objects. In Proceedings of the 13th International Conference on VLDB, England, 1987, pp: 507–518.

  15. Davood Rafiei. On Similarity-Based Queries for Time Series Data. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pp: 410–417.

  16. Byoung-Kee Yi, H. V. Jagadish and Christos Faloutsos. Efficient Retrieval of Similar Time-Sequences under Time Warping. In Proceedings of the Fourteenth International Conference on Data Engineering, February 23–27, 1998, Orlando, Florida, USA, pp: 201–208.

Copyright information

© 2002 Springer Japan

About this paper

Cite this paper

Wu, F., Plihon, V., Gardarin, G. (2002). Similarity-Based Queries for Time Series Databases. In: Jin, Q., Li, J., Zhang, N., Cheng, J., Yu, C., Noguchi, S. (eds) Enabling Society with Information Technology. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66979-1_2

  • DOI: https://doi.org/10.1007/978-4-431-66979-1_2

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-66981-4

  • Online ISBN: 978-4-431-66979-1

  • eBook Packages: Springer Book Archive
