Abstract
Research in Weighted Association Rule Mining (WARM) has largely concentrated on mining traditional static transactional datasets. Whilst there have been a few attempts at researching WARM in a data stream environment, none have addressed the problem of assigning and adapting weights in the presence of concept drift, which often occurs in data streams. In this research we experiment with two methods of adapting weights: first, a simplistic method that recomputes the entire set of weights at fixed intervals; and second, a method that uses a distance function to assess the extent of change in the stream and updates only those items whose patterns of interaction have changed significantly. We show that the latter method maintains good accuracy whilst being several times faster than the former.
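The contrast between the two strategies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the weighting scheme, the distance function, or the threshold, so `placeholder_weight`, `pattern_distance`, and `threshold=0.3` are hypothetical stand-ins rather than the authors' definitions.

```python
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence(transactions):
    """Map each item to a Counter of the items it appears with."""
    neighbours = defaultdict(Counter)
    for transaction in transactions:
        items = set(transaction)
        for item in items:
            neighbours[item]  # register items that occur alone as well
        for a, b in permutations(items, 2):
            neighbours[a][b] += 1
    return neighbours


def placeholder_weight(counter):
    """Hypothetical weight: number of distinct partner items (not the paper's scheme)."""
    return float(len(counter))


def pattern_distance(old, new):
    """Hypothetical distance: total variation between normalised partner counts."""
    old_total = sum(old.values())
    new_total = sum(new.values())
    if old_total == 0 and new_total == 0:
        return 0.0
    if old_total == 0 or new_total == 0:
        return 1.0
    keys = set(old) | set(new)
    return 0.5 * sum(abs(old[k] / old_total - new[k] / new_total) for k in keys)


def recompute_all(window):
    """Strategy 1: rebuild every weight from the current window."""
    return {item: placeholder_weight(c) for item, c in cooccurrence(window).items()}


def update_changed(weights, previous, window, threshold=0.3):
    """Strategy 2: refresh only items whose interaction pattern moved past the threshold."""
    current = cooccurrence(window)
    updated = dict(weights)
    changed = set()
    for item, counter in current.items():
        if item not in previous or pattern_distance(previous[item], counter) > threshold:
            updated[item] = placeholder_weight(counter)
            changed.add(item)
    return updated, changed


# Two consecutive windows of a toy stream; only item "c" changes its partner.
window_1 = [["a", "b"], ["a", "b"], ["c", "d"]]
window_2 = [["a", "b"], ["a", "b"], ["c", "e"]]
weights = recompute_all(window_1)
weights, changed = update_changed(weights, cooccurrence(window_1), window_2)
```

In this toy stream only the item whose partner changed and the newly seen item are refreshed, while the stable items keep their earlier weights. The sketch still rebuilds the co-occurrence counts for every item, so it shows the selection logic only and says nothing about the speed-up reported in the abstract.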
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koh, Y.S., Pears, R., Dobbie, G. (2011). Automatic Assignment of Item Weights for Pattern Mining on Data Streams. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_32
DOI: https://doi.org/10.1007/978-3-642-20841-6_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20840-9
Online ISBN: 978-3-642-20841-6
eBook Packages: Computer Science (R0)