Efficient Mining of Frequent Itemsets from Data Streams

  • Conference paper
Sharing Data, Information and Knowledge (BNCOD 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5071)

Abstract

As technology advances, floods of data can be produced and shared in many applications such as wireless sensor networks or Web click streams. This calls for efficient mining techniques for extracting useful information and knowledge from streams of data. In this paper, we propose a novel algorithm for stream mining of frequent itemsets in a limited memory environment. This algorithm uses a compact tree structure to capture important contents from streams of data. By exploiting its nice properties, such a tree structure can be easily maintained and can be used for mining frequent itemsets, as well as other patterns like constrained itemsets, even when the available memory space is small.
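The abstract does not spell out the tree structure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a sliding window of batches and a prefix tree whose nodes keep one count per batch, so that the oldest batch can be discarded cheaply. The names `Node`, `StreamTree`, `add_batch`, and `frequent_itemsets` are hypothetical, and the mining step is a brute-force enumeration chosen for brevity rather than efficiency.

```python
from collections import deque
from itertools import combinations


class Node:
    """One item on a tree path, with one count per batch in the window."""

    def __init__(self, item, window_size):
        self.item = item
        # The bounded deque drops the oldest batch count automatically.
        self.counts = deque([0] * window_size, maxlen=window_size)
        self.children = {}


class StreamTree:
    """Hypothetical sliding-window prefix tree (illustration only)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.root = Node(None, window_size)

    def _slide(self, node):
        # Open a zero slot for the new batch and prune nodes that no
        # longer carry any count inside the window.
        for item in list(node.children):
            child = node.children[item]
            child.counts.append(0)
            self._slide(child)
            if sum(child.counts) == 0:
                del node.children[item]

    def add_batch(self, transactions):
        self._slide(self.root)
        for transaction in transactions:
            node = self.root
            # A fixed (canonical) item order lets transactions share prefixes.
            for item in sorted(set(transaction)):
                child = node.children.get(item)
                if child is None:
                    child = Node(item, self.window_size)
                    node.children[item] = child
                child.counts[-1] += 1
                node = child

    def frequent_itemsets(self, min_support):
        # Each node contributes its window total to every itemset that
        # ends at its item and draws the remaining items from its path.
        support = {}

        def visit(node, prefix):
            for child in node.children.values():
                total = sum(child.counts)
                for k in range(len(prefix) + 1):
                    for subset in combinations(prefix, k):
                        key = subset + (child.item,)
                        support[key] = support.get(key, 0) + total
                visit(child, prefix + (child.item,))

        visit(self.root, ())
        return {k: v for k, v in support.items() if v >= min_support}


tree = StreamTree(window_size=2)
tree.add_batch([["a", "b"], ["a", "c"]])
tree.add_batch([["a", "b", "c"], ["b"]])
result = tree.frequent_itemsets(min_support=2)
```

Calling `add_batch` a third time in this sketch would push the first batch out of the window, so its transactions would stop contributing to the supports.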

Editor information

Alex Gray, Keith Jeffery, Jianhua Shao

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Leung, C.K.-S., Brajczuk, D.A. (2008). Efficient Mining of Frequent Itemsets from Data Streams. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_2

  • DOI: https://doi.org/10.1007/978-3-540-70504-8_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70503-1

  • Online ISBN: 978-3-540-70504-8

  • eBook Packages: Computer Science, Computer Science (R0)
