Incrementally Mining Recently Repeating Patterns over Data Streams

Koh, Jia-Ling; Chou, Pei-Min

doi:10.1007/978-3-642-00399-8_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5433)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Repeating patterns represent temporal relations among data items and can be used for data summarization and prediction. A growing share of application data is generated as data streams. Because recent data matters most in a stream, mining repeating patterns from the entire history of the stream does not capture the current trend of its patterns. Traditional strategies for mining repeating patterns on static databases are therefore not applicable to data streams. This paper proposes an algorithm, named the appearing-bit-sequence-based incremental mining algorithm, for efficiently discovering recently repeating patterns over a data stream. Appearing bit sequences are used to monitor the occurrences of patterns within a sliding window. Two versions of the algorithm are proposed, which maintain the appearing bit sequences of maximum repeating patterns and closed repeating patterns, respectively. Accordingly, the cost of re-mining repeating patterns over a sliding window is reduced to that of monitoring frequency changes of the maintained patterns. Experimental results show that the incremental mining methods perform much better than the re-mining approach.
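
As a rough illustration of monitoring pattern occurrences with bit sequences over a sliding window, the sketch below keeps one bit sequence per item and derives the occurrences of a pattern by shifting and AND-ing those sequences. It is not the algorithm from the paper, whose details are not given in the abstract: the class `SlidingBitIndex`, its methods, the bit layout (bit k marks the element k positions before the newest one), and the treatment of a pattern as a run of consecutive items are all assumptions made for this example.

```python
from collections import defaultdict


class SlidingBitIndex:
    """Hypothetical helper: one bit sequence per item over the last `width` elements."""

    def __init__(self, width):
        self.width = width
        self.mask = (1 << width) - 1
        self.bits = defaultdict(int)  # item -> int used as a bit sequence

    def push(self, item):
        # Age every sequence by one position; bits that leave the window are masked off.
        for key in list(self.bits):
            shifted = (self.bits[key] << 1) & self.mask
            if shifted:
                self.bits[key] = shifted
            else:
                del self.bits[key]
        self.bits[item] |= 1  # bit 0 marks the newest position

    def pattern_bits(self, pattern):
        # Bit k of the result is set when an occurrence of `pattern`
        # ends k positions before the newest element (assumed layout).
        if not pattern:
            return 0
        result = self.mask
        last = len(pattern) - 1
        for j, item in enumerate(pattern):
            result &= self.bits.get(item, 0) >> (last - j)
        return result

    def frequency(self, pattern):
        # Number of occurrences of `pattern` inside the current window.
        return bin(self.pattern_bits(pattern)).count("1")


index = SlidingBitIndex(width=6)
for symbol in "ABCABCAB":
    index.push(symbol)

# The window now holds the last six symbols, "CABCAB".
print(index.frequency("AB"))   # 2
print(index.frequency("ABC"))  # 1
```

Under these assumptions, sliding the window costs one shift per tracked item, and the frequency of a pattern is read off with a population count rather than by rescanning the window.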

This work was partially supported by the R.O.C. N.S.C. under Contract No. 96-2221-E-003-018 and 96-2524-S-003-001.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei, 106, Taiwan, R.O.C.
Jia-Ling Koh & Pei-Min Chou


Editor information

Editors and Affiliations

School of Information Technologies, University of Sydney, NSW, Australia
Sanjay Chawla
The Institute of Scientific and Industrial Research, Osaka University, 8-1, Mihogaoka, Ibaraki, 567, Osaka, Japan
Takashi Washio
Division of Computing Science, Hokkaido University, 060-0814, Sapporo, Japan
Shin-ichi Minato
School of Medicine, Department of Medical Informatics, Shimane University, 89-1 Enya-cho, Izumo, 693-8501, Shimane, Japan
Shusaku Tsumoto
Central Research Institute of Electric Power Industry, 2-11-1 Iwado-kita, Komae-shi, 201-8511, Tokyo, Japan
Takashi Onoda
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Tokyo, Japan
Seiji Yamada
The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, 567, Osaka, Japan
Akihiro Inokuchi


About this paper

Cite this paper

Koh, JL., Chou, PM. (2009). Incrementally Mining Recently Repeating Patterns over Data Streams. In: Chawla, S., et al. New Frontiers in Applied Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00399-8_3

DOI: https://doi.org/10.1007/978-3-642-00399-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00398-1
Online ISBN: 978-3-642-00399-8
eBook Packages: Computer Science (R0)
