Hot Spot Tracking by Time-Decaying Bloom Filters and Reservoir Sampling

  • Kai Cheng
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 926)

Abstract

In most networking applications, such as the Internet of Things (IoT), data are generated at such a high rate that the cost of long-term storage outweighs its benefits. Such data streams are stored only temporarily and must be mined quickly before they are lost forever. In previous work, we presented Time-decaying Bloom Filters (TBF) for maintaining time-varying frequency statistics over data streams. TBF extends the standard Bloom filter (used for approximate membership queries) by replacing the bit vector with an array of small counters whose values decay periodically over time. In this paper, we consider the hot spot tracking problem for data streams. To address this problem, we propose a novel scheme that integrates TBF with online sampling. Data streams are sampled online using a reservoir, and newly sampled items are passed to a TBF, where frequency statistics are maintained for hot spot reporting.
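
As a rough illustration of the scheme outlined in the abstract, the following Python sketch combines a reservoir sampler with a counter-based filter whose values decay periodically. It is not the paper's implementation. The class and function names, the salted SHA-256 hashing, the multiplicative decay factor, the decay period measured in items, and the use of the classic "Algorithm R" reservoir are all assumptions made for illustration.

```python
import hashlib
import random


class TimeDecayingBloomFilter:
    """Bloom-filter-like array of counters whose values decay over time."""

    def __init__(self, num_counters=1024, num_hashes=3, decay_factor=0.5):
        self.counters = [0.0] * num_counters
        self.num_hashes = num_hashes
        self.decay_factor = decay_factor  # assumed: multiplicative decay

    def _positions(self, item):
        # Assumed hashing scheme: k salted SHA-256 digests, deduplicated.
        positions = set()
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % len(self.counters))
        return positions

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1.0

    def estimate(self, item):
        # The minimum counter bounds the item's decayed count from above.
        return min(self.counters[pos] for pos in self._positions(item))

    def decay(self):
        self.counters = [c * self.decay_factor for c in self.counters]


class ReservoirSampler:
    """Fixed-size uniform sample of a stream (Algorithm R)."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.reservoir = []
        self.seen = 0
        self.rng = rng or random.Random()

    def offer(self, item):
        """Return True if the item was admitted into the reservoir."""
        self.seen += 1
        if len(self.reservoir) < self.capacity:
            self.reservoir.append(item)
            return True
        slot = self.rng.randrange(self.seen)
        if slot < self.capacity:
            self.reservoir[slot] = item
            return True
        return False


def track_hot_spots(stream, threshold, capacity=50, decay_every=100, seed=0):
    """Feed newly sampled items to the filter; report items above threshold."""
    sampler = ReservoirSampler(capacity, random.Random(seed))
    tbf = TimeDecayingBloomFilter()
    for i, item in enumerate(stream, start=1):
        if sampler.offer(item):      # only newly sampled items reach the TBF
            tbf.add(item)
        if i % decay_every == 0:     # assumed: decay once per fixed item count
            tbf.decay()
    return {x for x in set(sampler.reservoir) if tbf.estimate(x) >= threshold}
```

For example, `track_hot_spots(["hot"] * 50, threshold=10, capacity=100, decay_every=1000)` admits every item and never triggers a decay, so it reports `{"hot"}`. In this sketch the admission probability of Algorithm R shrinks as the stream grows, so the counters reflect sampled occurrences rather than raw frequencies. A real deployment would need to choose the threshold, decay period, and reservoir size with that in mind.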

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Information Science, Kyushu Sangyo University, Fukuoka, Japan
