Efficient Mining of Frequent Itemsets from Data Streams

Leung, Carson Kai-Sang; Brajczuk, Dale A.

doi:10.1007/978-3-540-70504-8_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5071)

Included in the following conference series:

British National Conference on Databases

Abstract

As technology advances, floods of data can be produced and shared in many applications such as wireless sensor networks or Web click streams. This calls for efficient mining techniques for extracting useful information and knowledge from streams of data. In this paper, we propose a novel algorithm for stream mining of frequent itemsets in a limited memory environment. This algorithm uses a compact tree structure to capture important contents from streams of data. By exploiting its nice properties, such a tree structure can be easily maintained and can be used for mining frequent itemsets, as well as other patterns like constrained itemsets, even when the available memory space is small.
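
The abstract does not spell out the tree structure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a sliding window of fixed-size batches, a prefix tree whose nodes keep one count per batch, and a brute-force mining step; the names `Node`, `StreamTree`, `add_batch`, and `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations


class Node:
    """Prefix-tree node; counts[i] is the support contributed by the i-th batch in the window."""

    def __init__(self, window_size):
        self.counts = [0] * window_size
        self.children = {}


class StreamTree:
    """Assumed structure: a canonical-order prefix tree over a sliding window of batches."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.root = Node(window_size)

    def add_batch(self, transactions):
        """Slide the window by one batch, then insert the new batch's transactions."""
        self._shift(self.root)
        for transaction in transactions:
            node = self.root
            # A fixed (here: sorted) item order lets equal prefixes share nodes.
            for item in sorted(set(transaction)):
                node = node.children.setdefault(item, Node(self.window_size))
                node.counts[-1] += 1

    def _shift(self, node):
        """Drop the oldest batch's counts and prune subtrees whose counts are all zero."""
        for item, child in list(node.children.items()):
            child.counts = child.counts[1:] + [0]
            self._shift(child)
            if sum(child.counts) == 0:
                del node.children[item]

    def transactions(self):
        """Yield (itemset, count) pairs for the transactions currently in the window."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            passing = sum(node.counts)
            below = sum(sum(c.counts) for c in node.children.values())
            if prefix and passing > below:
                yield prefix, passing - below
            for item, child in node.children.items():
                stack.append((prefix + (item,), child))

    def frequent_itemsets(self, minsup):
        """Brute-force subset counting; exponential in transaction length, for illustration only."""
        support = Counter()
        for items, count in self.transactions():
            for k in range(1, len(items) + 1):
                for subset in combinations(items, k):
                    support[subset] += count
        return {s: c for s, c in support.items() if c >= minsup}
```

Calling `add_batch` once per incoming batch keeps memory bounded by the window contents, since expired batches are pruned before new transactions are inserted. A real miner would replace the brute-force step with a pattern-growth traversal of the tree.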

Author information

Authors and Affiliations

The University of Manitoba, Winnipeg, MB, Canada
Carson Kai-Sang Leung & Dale A. Brajczuk

Editor information

Alex Gray, Keith Jeffery, Jianhua Shao

About this paper

Cite this paper

Leung, C.K.-S., Brajczuk, D.A. (2008). Efficient Mining of Frequent Itemsets from Data Streams. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_2

DOI: https://doi.org/10.1007/978-3-540-70504-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70503-1
Online ISBN: 978-3-540-70504-8
eBook Packages: Computer Science, Computer Science (R0)
