Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Frequent Items on Streams

  • Ahmed Metwally
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_169

Synonyms

Frequent elements; Heavy hitters; Hot items

Definition

Frequent items are the items that best characterize a stream: those that occur more often than a user-specified threshold. Formally, given a stream \(S\) of size \(N\) over an alphabet \(A\), a frequent item \(E_i \in A\) is an item whose frequency (number of occurrences) \(F_i\) exceeds a user-specified support \(\phi N\), where \(0 \le \phi \le 1\). There cannot be more than \(\lfloor {1\over \phi } \rfloor - 1\)
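As an illustration of the definition only, the following minimal Python sketch counts every item exactly and reports those whose frequency exceeds φN. It is an exact baseline that stores one counter per distinct item, not one of the bounded-memory streaming algorithms from the literature; the function name `frequent_items` and the sample stream are hypothetical choices made for this example.

```python
from collections import Counter


def frequent_items(stream, phi):
    """Return {item: count} for every item whose count exceeds phi * N.

    Exact baseline: keeps one counter per distinct item, so memory grows
    with the alphabet size. Hypothetical helper for illustration only.
    """
    if not 0 <= phi <= 1:
        raise ValueError("phi must lie in [0, 1]")
    counts = Counter(stream)       # F_i for each distinct item E_i
    n = sum(counts.values())       # stream size N
    threshold = phi * n            # user support, phi * N
    # Strict inequality: a frequent item must *exceed* the support.
    return {item: c for item, c in counts.items() if c > threshold}


# Example: N = 10, phi = 0.25, so an item must occur more than 2.5 times.
stream = list("abacabadab")
print(frequent_items(stream, 0.25))  # {'a': 5, 'b': 3}
```

Here `a` occurs 5 times and `b` 3 times, both above the support of 2.5, while `c` and `d` occur once each and are not reported.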


Recommended Reading

  1. Agarwal P, Cormode G, Huang Z, Phillips J, Wei Z, Yi K. Mergeable summaries. In: Proceedings of the 31st ACM PODS Symposium on Principles of Database Systems; 2012. p. 23–34. An extended version appeared in ACM Trans Database Syst. 2013;38(4):26:1–28.
  2. Arasu A, Manku G. Approximate counts and quantiles over sliding windows. In: Proceedings of the 23rd ACM PODS Symposium on Principles of Database Systems; 2004. p. 286–96.
  3. Bandi N, Metwally A, Agrawal D, El Abbadi A. Fast data stream algorithms using associative memories. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2007. p. 247–56.
  4. Boyer R, Moore J. A fast majority vote algorithm. Technical report 1981–32. Austin: Institute for Computing Science, University of Texas; 1981.
  5. Chakrabarti A, Cormode G, McGregor A. A near-optimal algorithm for computing the entropy of a stream. In: Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms; 2007. p. 328–35.
  6. Cormode G, Hadjieleftheriou M. Finding frequent items in data streams. Proc VLDB Endowment. 2008;1(2):1530–41.
  7. Cormode G, Hadjieleftheriou M. Methods for finding frequent items in data streams. VLDB J. 2010;19:3–20.
  8. Cormode G, Korn F, Muthukrishnan S, Srivastava D. Diamond in the rough: finding hierarchical heavy hitters in multi-dimensional data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2004. p. 155–66. An extended version appeared in ACM Trans Knowl Discov Data. 2008;1(4):1–48.
  9. Cormode G, Korn F, Tirthapura S. Exponentially decayed aggregates on data streams. In: Proceedings of the IEEE 24th ICDE International Conference on Data Engineering; 2008. p. 1379–81.
  10. Cormode G, Muthukrishnan S. What’s hot and what’s not: tracking most frequent items dynamically. In: Proceedings of the 22nd ACM PODS Symposium on Principles of Database Systems; 2003. p. 296–306. An extended version appeared in ACM Trans Database Syst. 2005;30(1):249–78.
  11. Demaine E, López-Ortiz A, Munro J. Frequency estimation of internet packet streams with limited space. In: Proceedings of the 10th Annual European Symposium on Algorithms; 2002. p. 348–60.
  12. Estan C, Varghese G. New directions in traffic measurement and accounting: focusing on the elephants, ignoring the mice. ACM Trans Comput Syst. 2003;21(3):270–313.
  13. Fischer M, Salzberg S. Finding a majority among N votes: solution to problem 81–5. J Algorithms. 1982;3(4):376–9.
  14. Jin C, Qian W, Sha C, Yu J, Zhou A. Dynamically maintaining frequent items over a data stream. In: Proceedings of the 12th International Conference on Information and Knowledge Management; 2003. p. 287–94.
  15. Karp R, Shenker S, Papadimitriou C. A simple algorithm for finding frequent elements in streams and bags. ACM Trans Database Syst. 2003;28(1):51–5.
  16. Lee L, Ting H. A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In: Proceedings of the 25th ACM PODS Symposium on Principles of Database Systems; 2006. p. 290–7.
  17. Liu H, Lin Y, Han J. Methods for mining frequent items in data streams: an overview. Knowl Inf Syst. 2011;26:1–30.
  18. Manku G, Motwani R. Approximate frequency counts over data streams. In: Proceedings of the 28th International Conference on Very Large Data Bases; 2002. p. 346–57.
  19. Metwally A, Agrawal D, El Abbadi A. Efficient computation of frequent and top-k elements in data streams. In: Proceedings of the 10th International Conference on Database Theory; 2005. p. 398–412. An extended version appeared in ACM Trans Database Syst. 2006;31(3):1095–133.
  20. Misra J, Gries D. Finding repeated elements. Sci Comput Program. 1982;2:143–52.
  21. Zhang L, Guan Y. Frequency estimation over sliding windows. In: Proceedings of the 24th International Conference on Data Engineering; 2008. p. 1385–7.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. LinkedIn Corp., Mountain View, USA