
Data Streams, pp 237–259

Indexing and Querying Data Streams


Part of the book series: Advances in Database Systems (ADBS, volume 31)

Abstract

Online monitoring of data streams poses a challenge in many data-centric applications, including network traffic management, trend analysis, web-click streams, intrusion detection, and sensor networks. Indexing techniques used in these applications have to be time and space efficient while providing high-quality answers to two kinds of user queries: (1) queries that monitor aggregates, such as finding surprising levels (the “volatility” of a data stream) and detecting bursts, and (2) queries that monitor trends, such as detecting correlations and finding similar patterns. Data stream indexing becomes even more challenging when we take into account the dynamic nature of the underlying raw data. For example, bursts of events can occur at variable temporal scales, from hours to days to weeks. We focus on a multi-resolution indexing architecture. The architecture enables the discovery of “interesting” behavior online, provides flexibility in user query definitions, and interconnects registered queries for real-time and in-depth analysis.
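To make the idea of monitoring aggregates at several temporal resolutions concrete, the following is a minimal Python sketch, not the architecture described in the chapter. It keeps sliding-window sums for several window lengths at once and reports the windows whose sum exceeds a threshold, which is one simple way to flag a burst at more than one time scale. The class name `MultiResolutionMonitor`, the parameters `window_sizes` and `thresholds`, and the choice of a plain sum as the aggregate are all assumptions made for illustration.

```python
from collections import deque


class MultiResolutionMonitor:
    """Hypothetical sketch: sliding-window sums at several window lengths."""

    def __init__(self, window_sizes, thresholds):
        # One (assumed) burst threshold per window length.
        if len(window_sizes) != len(thresholds):
            raise ValueError("need exactly one threshold per window size")
        self.window_sizes = list(window_sizes)
        self.thresholds = list(thresholds)
        # Keep only as many raw values as the largest window needs.
        self.buffer = deque(maxlen=max(self.window_sizes))
        self.sums = [0.0] * len(self.window_sizes)

    def update(self, value):
        """Consume one stream value; return the window sizes that show a burst."""
        n = len(self.buffer)
        for i, w in enumerate(self.window_sizes):
            if n >= w:
                # The value that is w positions back leaves this window.
                self.sums[i] -= self.buffer[n - w]
            self.sums[i] += value
        self.buffer.append(value)  # evicts the oldest value once full
        return [
            w
            for i, w in enumerate(self.window_sizes)
            if self.sums[i] > self.thresholds[i]
        ]


if __name__ == "__main__":
    monitor = MultiResolutionMonitor(window_sizes=[2, 4], thresholds=[10, 12])
    for v in [1, 1, 1, 1, 9, 9, 1, 1]:
        print(v, monitor.update(v))
```

In this sketch a short spike first trips both windows and then remains visible only in the longer one, which illustrates why a single fixed window length can miss or misjudge bursts that occur at different time scales.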




Copyright information

© 2007 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Bulut, A., Singh, A.K. (2007). Indexing and Querying Data Streams. In: Aggarwal, C.C. (ed.) Data Streams. Advances in Database Systems, vol 31. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-47534-9_11


  • DOI: https://doi.org/10.1007/978-0-387-47534-9_11

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-28759-1

  • Online ISBN: 978-0-387-47534-9

  • eBook Packages: Computer Science, Computer Science (R0)
