Finding Frequent Items in a Turnstile Data Stream

  • Regant Y. S. Hung
  • Kwok Fai Lai
  • Hing Fung Ting
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5092)

Abstract

Because of important applications such as denial-of-service attack detection, the problem of finding frequent items in data streams has been studied extensively under different models. The turnstile model is the most challenging of these, because the stream may contain both insertions and deletions of items. In this paper, we propose a deterministic algorithm and a randomized algorithm for finding frequent items in a turnstile data stream. Empirical results show that our randomized algorithm gives better results than existing randomized algorithms for the problem, while using much less space, answering queries faster, and offering similar update time.
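To make the problem setting concrete, the following is a minimal sketch of a naive exact baseline for a turnstile stream. It is not the deterministic or randomized algorithm proposed in the paper: it keeps one counter per distinct item, which is the linear space cost that small-space stream algorithms aim to avoid. The class name `ExactTurnstileCounter`, the `(item, delta)` update format, and the threshold parameter `phi` are assumptions introduced here for illustration only.

```python
from collections import defaultdict


class ExactTurnstileCounter:
    """Naive exact baseline (hypothetical helper): one counter per distinct item."""

    def __init__(self):
        self.counts = defaultdict(int)  # net frequency of each item
        self.total = 0                  # net number of items currently in the stream

    def update(self, item, delta):
        """Apply an insertion (delta > 0) or a deletion (delta < 0)."""
        self.counts[item] += delta
        self.total += delta
        if self.counts[item] == 0:
            del self.counts[item]       # drop items that were fully deleted

    def frequent_items(self, phi):
        """Return the items whose net frequency exceeds phi * total (phi is assumed)."""
        threshold = phi * self.total
        return sorted(x for x, c in self.counts.items() if c > threshold)


# A small turnstile stream: positive deltas insert, negative deltas delete.
stream = [("a", 1), ("b", 1), ("a", 1), ("c", 1),
          ("a", 1), ("a", -2), ("b", 1), ("b", 1)]

counter = ExactTurnstileCounter()
for item, delta in stream:
    counter.update(item, delta)

result = counter.frequent_items(0.5)
print(result)
```

In this stream, item `a` is inserted three times but two of those insertions are later deleted, so an approach that only counts arrivals would overestimate it. Handling such deletions is what distinguishes the turnstile model from insert-only streams.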

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Regant Y. S. Hung (1)
  • Kwok Fai Lai (1)
  • Hing Fung Ting (1)
  1. Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
